Preface


The history of Automatic Control is both ancient and modern. If we adopt 
the broad view that an automatic control system is any mechanism by which 
an input action and output action are dynamically coupled, then the origins 
of this encyclopedia’s subject matter may be traced back more than 2,000 
years to the era of primitive time-keeping and the clepsydra water clock 
perfected by Ctesibius of Alexandria. In more recent history, frequently cited 
examples of feedback control include the automatically refilling reservoirs of 
flush toilets (perfected in the late nineteenth century) and the celebrated fly- 
ball steam-flow governor described in J.C. Maxwell’s 1868 Royal Society of 
London paper—“On Governors.” 

Although it is useful to keep the technologies of antiquity in mind, the 
history of systems and control as covered in the pages of this encyclopedia 
begins in the twentieth century. The history was profoundly influenced by 
the work of Nyquist, Black, Bode, and others who were developing amplifier
theory in response to the need to transmit wireline signals over long distances. 
This research provided major conceptual advances in feedback and stability 
that proved to be of interest in the theory of servomechanisms that was being 
developed at the same time. Driven by the need for fast and accurate control of 
weapons systems during World War II, automatic control developed quickly 
as a recognizable discipline. 

While the developments of the first half of the twentieth century are an 
important backdrop for the Encyclopedia of Systems and Control, most of the
topics directly treat developments from 1948 to the present. The year 1948 
was auspicious for systems and control—and indeed for all the information 
sciences. Norbert Wiener’s book Cybernetics was published by Wiley, the 
transistor was invented (and given its name), and Shannon’s seminal paper 
“A Mathematical Theory of Communication” was published in the Bell 
System Technical Journal. In the years that followed, important ideas of 
Shannon, Wiener, Von Neumann, Turing, and many others changed the way 
people thought about the basic concepts of control systems. The theoretical 
advances have propelled industrial and societal impact as well (and vice 
versa). Today, advanced control is a crucial enabling technology in domains 
as numerous and diverse as aerospace, automotive, and marine vehicles; the 
process industries and manufacturing; electric power systems; homes and 
buildings; robotics; communication networks; economics and finance; and 
biology and biomedical devices. 

It is this incredible broadening of the scope of the field that has motivated 
the editors to assemble the entries that follow. This encyclopedia aims to help 
students, researchers, and practitioners learn the basic elements of a vast array 
of topics that are now considered part of systems and control. The goal is to 
provide entry-level access to subject matter together with cross-references to 
related topics and pointers to original research and source material. 

Entries in the encyclopedia are organized alphabetically by title, and 
extensive links to related entries are included to facilitate topical reading— 
these links are listed in “Cross-References” sections within entries. All cross- 
referenced entries are indicated by a preceding symbol: ►. In the electronic 
version of the encyclopedia these entries are hyperlinked for ease of access. 

The creation of the Encyclopedia of Systems and Control has been a major 
undertaking that has unfolded over a 3-year period. We owe an enormous debt 
to major intellectual leaders in the field who agreed to serve as topical section 
editors. They have ensured the value of the opus by recruiting leading experts 
in each of the covered topics and carefully reviewing drafts. It has been a 
pleasure also to work with Oliver Jackson and Andrew Spencer of Springer, 
who have been unfailingly accommodating and responsive over this time. 

As we reflect back over the course of this project, we are reminded of 
how it began. Gary Balas, one of the world’s experts in robust control and 
aerospace applications, came to one of us after a meeting with Oliver at the 
Springer booth at a conference and suggested this encyclopedia—but was 
adamant that he wasn’t the right person to lead it. The two of us took the 
initiative (ultimately getting Gary to agree to be the section editor for the 
aerospace control entries). Gary died last year after a courageous fight with 
cancer. Our sense of accomplishment is infused with sadness at the loss of a 
close friend and colleague. 

We hope readers find this encyclopedia a useful and valuable compendium 
and we welcome your feedback. 


Boston, USA 
Minneapolis, USA 
May 2015 


John Baillieul 
Tariq Samad 
Active Power Control of Wind Power
Plants for Grid Integration 

Lucy Y. Pao 

University of Colorado, Boulder, CO, USA 

Abstract 

Increasing penetrations of intermittent renewable 
energy sources, such as wind, on the utility grid 
have led to concerns over the reliability of the 
grid. One approach for improving grid reliability 
with increasing wind penetrations is to actively 
control the real power output of wind turbines 
and wind power plants. Providing a full range 
of responses requires derating wind power plants 
so that there is headroom to both increase and 
decrease power to provide grid balancing services 
and stabilizing responses. Initial results indicate 
that wind turbines may be able to provide primary
frequency control and frequency regulation
services more rapidly than conventional power 
plants. 

Keywords 

Active power control; Automatic generation 
control; Frequency regulation; Grid balancing; 
Grid integration; Primary frequency control; 
Wind energy 


Balancing Electrical Generation 
and Load on the Grid 

Wind penetration levels across the world have
increased dramatically, with installed capacity
growing at a mean annual rate of 25 % over
the last decade (Gsänger and Pitteloud 2013).
Some nations in Western Europe, particularly
Denmark, Portugal, Spain, and Germany, have
seen wind provide more than 16 % of their annual
electrical energy needs (Wiser and Bolinger
2013). To maintain grid frequency at its nominal
value, the electrical generation must equal the
electrical load on the grid. This balancing has
historically been left to conventional utilities
with synchronous generators, which can vary
their active power output by simply varying their
fuel input. Grid frequency control is performed
across a number of regimes and time scales, with
both manual and automatic control commands.
Further details can be found in Rebours et al.
(2007) and Ela et al. (2011).

Wind turbines and wind power plants are 
now being recognized as having the potential to 
meet demanding grid stabilizing requirements 
set by transmission system operators (Aho et al. 
2013a,b; Buckspan et al. 2012; Ela et al. 2011; 
Miller et al. 2011). Recent grid code requirements
have spurred the development of wind turbine 
active power control (APC) systems, which allow 
wind turbines to participate in grid frequency 
regulation and provide stabilizing responses to 
sudden changes in grid frequency. The ability
of wind turbines to provide APC services also 
allows them to follow forecast-based power 
production schedules. 

For a wind turbine to fully participate in grid
frequency control, it must be derated (to $P_{\mathrm{derated}}$)
with respect to the maximum power ($P_{\max}$) that
can be generated given the available wind, allowing
for both increases and decreases in power, if
necessary. Wind turbines can derate their power
output by pitching their blades to shed aerodynamic
power or reducing their generator torque
in order to operate at higher-than-optimal rotor
speeds. Wind turbines can then respond at different
time scales to provide more or less power
through pitch control (which can provide a power
response within seconds) and generator torque
control (which can provide a power response
within milliseconds).

Wind Turbine Inertial and Primary 
Frequency Control 

Inertial and primary frequency control is generally
considered to cover the first 5–10 s after a
frequency event occurs. In this regime, the governors
of capable utilities actuate, allowing for
a temporary increase or decrease in the utilities’
power outputs. The primary frequency control
(PFC) response provided by conventional synchronous
generators can be characterized by a
droop curve, which relates fluctuations in grid
frequency to a change in power from the utility.
For example, a 3 % droop curve means that a 3 %
change in grid frequency yields a 100 % change
in commanded power.
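
The droop relationship described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the optional deadband argument, and the clamping to ±100 % of rated power are hypothetical choices, not part of any particular grid code or turbine controller.

```python
def droop_power_change(freq_hz, f_nom_hz=60.0, droop=0.03, deadband_hz=0.0):
    """Commanded change in power as a fraction of rated power (static droop).

    Hypothetical helper: a droop of 0.03 means that a 3 % frequency
    deviation maps to a 100 % change in commanded power.
    """
    delta_f = freq_hz - f_nom_hz
    # Inside the (assumed) deadband, no response is commanded.
    if abs(delta_f) <= deadband_hz:
        return 0.0
    # Measure the deviation from the edge of the deadband.
    delta_f -= deadband_hz if delta_f > 0 else -deadband_hz
    # A frequency drop calls for more power, hence the minus sign.
    delta_p = -(delta_f / f_nom_hz) / droop
    # Clamp to +/-100 % of rated power (an assumption of this sketch).
    return max(-1.0, min(1.0, delta_p))


# A 0.1 Hz dip on a 60 Hz grid with a 3 % droop curve.
print(droop_power_change(59.9))
```

With a 3 % droop on a 60 Hz system, a 1.8 Hz deviation (3 % of nominal) corresponds to the full 100 % change in commanded power, which is why a derated turbine needs headroom to respond to underfrequency events.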

Although modern wind turbines do not inherently
provide inertial or primary frequency
control responses because their power electronics
impart a buffer between their generators and the
grid, such responses can be produced through
careful design of the wind turbine control systems.
While the physical properties of a conventional
synchronous generator yield a static
droop characteristic, a wind turbine can be controlled
to provide a primary frequency response
via either a static or time-varying droop curve.
A time-varying droop curve can be designed to be
more aggressive when the magnitude of the rate
of change of frequency of the grid is larger.

Active Power Control of Wind Power Plants for Grid
Integration, Fig. 1 Simulation results showing the
capability of wind power plants to provide APC services
on a small-scale grid model. The total grid size is 3 GW,
and a frequency event is induced due to the sudden
active power imbalance when 5 % of generation is taken
offline at time = 200 s. Each wind power plant is derated
to 90 % of its rated capacity. The system response with
all conventional generation (no wind) is compared to the
cases when there are wind power plants on the grid at
10 % penetration (i) with a baseline control system (wind
baseline) where wind does not provide APC services and
(ii) with an APC system (wind APC) that uses a 3 % droop
curve where either 50 % or 100 % of the wind power plants
provide PFC

Figure 1 shows a simulation of a grid response
under different scenarios when 5 % of
the generating capacity suddenly goes offline.
When the wind power plant (10 % of the generation
on the grid) is operating with its normal
baseline control system that does not provide
APC services, the frequency response is worse
than the no-wind scenario, due to the reduced
amount of conventional generation in the
wind-baseline scenario that can provide power control
services. However, compared to both the no-wind
and wind-baseline cases, using PFC with a droop
curve results in the frequency decline being arrested
at a minimum (nadir) frequency $f_{\mathrm{nadir}}$ that
is closer to the nominal $f_{\mathrm{nom}} = 60$ Hz frequency
level; further, the steady-state frequency $f_{\mathrm{ss}}$ after
the PFC response is also closer to $f_{\mathrm{nom}}$. It is
important to prevent the difference $f_{\mathrm{nom}} - f_{\mathrm{nadir}}$
from exceeding a threshold that can lead to
underfrequency load shedding (UFLS) or rolling
blackouts. The particular threshold varies across
utility grids, but the largest such threshold in
North America is 1.5 Hz.

Stability issues arising from the altered control
algorithms must be analyzed (Buckspan et al.
2013). The trade-offs between aggressive primary
frequency control and resulting structural
loads also need to be evaluated carefully. Initial
research shows that potential grid support
can be achieved while not causing any increases
in structural loading and hence fatigue damage
and operations and maintenance costs (Buckspan
et al. 2012).

Wind Turbine Automatic Generation 
Control 

Secondary frequency control, also known as automatic
generation control (AGC), occurs on a
slower time scale than PFC. AGC commands can
be generated from highly damped
proportional-integral (PI) controllers or logic controllers to
regulate grid frequency and are used to control
the power output of participating power plants. In
many geographical regions, frequency regulation
services are compensated through a competitive
market, where power plants that provide faster
and more accurate AGC command tracking are
paid more.
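
As a rough illustration of how a PI controller can turn a frequency error into an AGC power command, consider the discrete-time sketch below. The class name, gains, units, and update period are hypothetical values chosen for readability; actual AGC implementations are operator-specific and typically include filtering, limits, and area interchange terms that are omitted here.

```python
class PIRegulator:
    """Minimal discrete-time PI regulator (illustrative sketch only)."""

    def __init__(self, kp, ki, dt):
        self.kp = kp            # proportional gain (assumed units: MW/Hz)
        self.ki = ki            # integral gain (assumed units: MW/(Hz*s))
        self.dt = dt            # update period in seconds
        self.integral = 0.0     # accumulated frequency error (Hz*s)

    def update(self, freq_hz, f_nom_hz=60.0):
        """Return the power command adjustment for one control step."""
        error = f_nom_hz - freq_hz          # positive when frequency is low
        self.integral += error * self.dt    # rectangular integration
        return self.kp * error + self.ki * self.integral


# Hypothetical gains and update period; a sustained 0.05 Hz
# underfrequency makes the command ramp upward through the integral term.
agc = PIRegulator(kp=50.0, ki=10.0, dt=4.0)
first = agc.update(59.95)
second = agc.update(59.95)
```

The integral term is what removes the steady-state frequency offset left after the primary (droop) response; choosing small gains relative to the plant dynamics is one way to obtain the heavily damped behavior mentioned above.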

An active power control system that combines
both primary and secondary/AGC frequency control
capabilities has recently been detailed in Aho
et al. (2013a). Figure 2 presents initial experimental
field test results of this active power
controller, in response to prerecorded frequency
events, showing how responsive wind turbines
can be both to manual derating commands and
to rapidly changing automatic primary frequency
control commands generated via a droop curve.
Overall, results indicate that wind turbines can
respond more rapidly than conventional power
plants. However, increasing the power control
and regulation performance of a wind turbine
should be carefully considered due to a number
of complicating factors, including coupling with
existing control loops, a desire to limit actuator
usage and structural loading, and wind variability.

Active Power Control of Wind Power 
Plants 

A wind power plant, often referred to as a wind
farm, consists of many wind turbines. In wind
power plants, wake effects can reduce generation
in downstream turbines to less than 60 % of the
lead turbine (Barthelmie et al. 2009; Porté-Agel
et al. 2013). There are many emerging areas
of active research, including the modeling of
wakes and wake effects and how these models
can then be used to coordinate the control of
individual turbines so that the overall wind power
plant can reliably track the desired power reference
command. A wind farm controller can
be interconnected with the utility grid, transmission
system operator (TSO), and individual
turbines as shown in Fig. 3. By properly accounting
for the wakes, wind farm controllers can
allocate appropriate power reference commands
to the individual wind turbines. Individual turbine
generator torque and blade pitch controllers,
as discussed earlier, can be designed so that
each turbine follows the power reference command
issued by the wind farm controller. Methods
for intelligent, distributed control of entire
wind farms to rapidly respond to grid frequency
disturbances could significantly reduce frequency
deviations and speed up recovery from such
disturbances.
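
One simple way a wind farm controller might split a plant-level power reference among turbines is in proportion to the power each turbine can currently produce, so that waked downstream turbines are asked for less. The sketch below assumes that rule purely for illustration; the function name and the proportional allocation are hypothetical and are not the method of the cited works.

```python
def allocate_power(plant_reference_kw, available_kw):
    """Split a plant power reference in proportion to available power.

    available_kw: estimated available power of each turbine (kW), which
    in practice would come from a wake model or from measurements.
    """
    total = sum(available_kw)
    if total <= 0.0:
        return [0.0 for _ in available_kw]
    # The plant cannot deliver more than the wind makes available.
    target = min(plant_reference_kw, total)
    # Each turbine receives a share proportional to its available power.
    return [target * p / total for p in available_kw]


# Lead turbine at 500 kW; waked turbines at 60 % and 40 % of that.
references = allocate_power(900.0, [500.0, 300.0, 200.0])
```

Because every turbine is derated by the same fraction under this rule, each one keeps proportional headroom for upward regulation. Other allocation rules (for example, ones that weigh structural loading) are equally plausible.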


Combining Techniques with Other 
Approaches for Balancing the Grid 

Ultimately, active power control of wind turbines
and wind power plants should be combined
with both demand-side management and storage
to provide a more comprehensive solution
that enables balancing electrical generation
and electrical load with large penetrations
of wind energy on the grid. Demand-side
management (Callaway and Hiskens 2011; Kowli
and Meyn 2011; Palensky and Dietrich 2011)
aims to alter the demand in order to mitigate peak
electrical loads and hence to maintain sufficient
control authority among generating units. As
more effective and economical energy storage
solutions (Pickard and Abbott 2012) at the power
plant scale are developed, wind (and solar) energy
can then be stored when wind (and solar) energy
availability is not well matched with electrical
demand. Advances in wind forecasting (Giebel
et al. 2011) will also improve wind power
forecasts to facilitate more accurate scheduling
of larger amounts of wind power on the grid.

Active Power Control of Wind Power Plants for Grid
Integration, Fig. 2 The frequency data input and power
that is commanded and generated during a field test
with a 550 kW research wind turbine at the US National
Renewable Energy Laboratory (NREL). The frequency
data was recorded on the Electric Reliability Council of
Texas (ERCOT) interconnection (data courtesy of Vahan
Gevorgian, NREL). The upper plot shows the grid frequency,
which is passed through a 5 % droop curve with
a deadband to generate a power command. The
high-frequency fluctuations in the generated power would be
smoothed when aggregating the power output of an entire
wind power plant

Active Power Control of Wind Power Plants for Grid
Integration, Fig. 3 Schematic showing the communication
and coupling between the wind farm control system,
individual wind turbines, utility grid, and the grid operator.
The wind farm controller uses measurements of the utility
grid frequency and automatic generation control power
command signals from the grid operator to determine a
power reference for each turbine in the wind farm

Cross-References 

► Control of Fluids and Fluid-Structure Interactions

► Control Structure Selection

► Coordination of Distributed Energy Resources
for Provision of Ancillary Services: Architectures
and Algorithms

► Electric Energy Transfer and Control via Power
Electronics

► Networked Control Systems: Architecture and
Stability Issues

► Power System Voltage Stability

► Small Signal Stability in Electric Power Systems


Recommended Reading 

A recent comprehensive report on active power 
control that covers topics ranging from control 
design to power system engineering to economics 
can be found in Ela et al. (2014) and the references
therein.
Adaptive Control of Linear Time-Invariant Systems

Petros A. Ioannou

University of Southern California, Los Angeles, 
CA, USA 


Abstract 

Adaptive control of linear time-invariant (LTI) 
systems deals with the control of LTI systems 
whose parameters are constant but otherwise 
completely unknown. In some cases, large norm 
bounds as to where the unknown parameters 
are located in the parameter space are also 
assumed to be known. In general, adaptive 
control deals with LTI plants which cannot 
be controlled with fixed gain controllers, 
i.e., nonadaptive control methods, and their 
parameters even though assumed constant for 
design and analysis purposes may change 
over time in an unpredictable manner. Most 
of the adaptive control approaches for LTI 
systems use the so-called certainty equivalence 
principle where a control law motivated from 
the known parameter case is combined with 
an adaptive law for estimating online the
unknown parameters. The control law could 
be associated with different control objectives 
and the adaptive law with different parameter 
estimation techniques. These combinations give 
rise to a wide class of adaptive control schemes. 
The two popular control objectives that led to a 
wide range of adaptive control schemes include 
model reference adaptive control (MRAC) and 
adaptive pole placement control (APPC). In 
MRAC, the control objective is for the plant 
output to track the output of a reference model, 
designed to represent the desired properties 
of the plant, for any reference input signal. 
APPC is more general and is based on control 
laws whose objective is to set the poles of 
the closed loop at desired locations chosen 
based on performance requirements. Another 
class of adaptive controllers for LTI systems 
that involves ideas from MRAC and APPC 


is based on multiple models, search methods, 
and switching logic. In this class of schemes, 
the unknown parameter space is partitioned into
smaller subsets. For each subset, a parameter 
estimator or a stabilizing controller is designed 
or a combination of the two. The problem then 
is to identify which subset in the parameter 
space the unknown plant model belongs to and/or 
which controller is a stabilizing one and meets the 
control objective. A switching logic is designed 
based on different considerations to identify 
the most appropriate plant model or controller 
from the list of candidate plant models and/or 
controllers. In this entry, we briefly describe the 
above approaches to adaptive control for LTI 
systems. 


Keywords 

Adaptive pole placement control; Direct MRAC; 
Indirect MRAC; LTI systems; Model reference 
adaptive control; Robust adaptive control 


Model Reference Adaptive Control 

In model reference control (MRC), the desired
plant behavior is described by a reference model,
which is simply an LTI system with a transfer
function $W_m(s)$ driven by a reference
input. The controller transfer function $C(s, \theta^*)$,
where $\theta^*$ is a vector with the coefficients of
$C(s)$, is then developed so that the closed-loop
plant has a transfer function equal to $W_m(s)$. This
transfer function matching guarantees that the
plant will match the reference model response for
any reference input signal. In this case the plant
transfer function $G_p(s, \theta_p^*)$, where $\theta_p^*$ is a vector
with all the coefficients of $G_p(s)$, together with
the controller transfer function $C(s, \theta^*)$ should
lead to a closed-loop transfer function from the
reference input $r$ to the plant output $y_p$ that is
equal to $W_m(s)$, i.e.,

$$\frac{y_p(s)}{r(s)} = W_m(s) = \frac{y_m(s)}{r(s)}, \qquad (1)$$
where $y_m$ is the output of the reference model.
For this transfer matching to be possible, $G_p(s)$
and $W_m(s)$ have to satisfy certain assumptions.
These assumptions enable the calculation of the
controller parameter vector $\theta^*$ as

$$\theta^* = F(\theta_p^*), \qquad (2)$$

where $F$ is a function of the plant parameters
$\theta_p^*$, to satisfy the matching equation (1). The
function in (2) has a special form in the case
of MRC that allows the design of both direct
and indirect MRAC. For more general classes
of controller structures, this is not possible in
general as the function $F$ is nonlinear. This transfer
function matching guarantees that the tracking
error $e_1 = y_p - y_m$ converges to zero for
any given reference input signal $r$. If the plant
parameter vector $\theta_p^*$ is known, then the controller
parameters $\theta^*$ can be calculated using (2), and
the controller $C(s, \theta^*)$ can be implemented. We
are considering the case where $\theta_p^*$ is unknown.
In this case, the use of the certainty equivalence
(CE) approach (Åström and Wittenmark 1995;
Egardt 1979; Ioannou and Fidan 2006; Ioannou
and Kokotovic 1983; Ioannou and Sun 1996;
Landau 1979; Landau et al. 1998; Morse 1996;
Narendra and Annaswamy 1989; Narendra and
Balakrishnan 1997; Sastry and Bodson 1989;
Stefanovic and Safonov 2011; Tao 2003), where
the unknown parameters are replaced with their
estimates, leads to the adaptive control scheme
referred to as indirect MRAC, shown in Fig. 1a.

The unknown plant parameter vector $\theta_p^*$ is estimated
at each time $t$, with the estimate denoted by $\theta_p(t)$, using an
online parameter estimator referred to as the adaptive
law. The plant parameter estimate $\theta_p(t)$ at each
time $t$ is then used to calculate the controller
parameter vector $\theta(t) = F(\theta_p(t))$ used in the
controller $C(s, \theta)$. This class of MRAC is called
indirect MRAC, because the controller parameters
are not updated directly, but calculated at
each time $t$ using the estimated plant parameters.
Another way of designing MRAC schemes is to
parameterize the plant transfer function in terms
of the desired controller parameter vector $\theta^*$.
This is possible in the MRC case, because the
structure of the MRC law is such that we can use
(2) to write

$$\theta_p^* = F^{-1}(\theta^*), \qquad (3)$$

where $F^{-1}$ is the inverse of the mapping
$F(\cdot)$, and then express $G_p(s, \theta_p^*) =
G_p(s, F^{-1}(\theta^*)) = \bar{G}_p(s, \theta^*)$. The adaptive law
for estimating $\theta^*$ online can now be developed by
using $y_p = \bar{G}_p(s, \theta^*)u_p$ to obtain a parametric
model that is appropriate for estimating the
controller vector $\theta^*$ as the unknown parameter
vector. The MRAC can then be developed using
the CE approach as shown in Fig. 1b. In this case,
the controller parameter $\theta(t)$ is updated directly
without any intermediate calculations, and for
this reason, the scheme is called direct MRAC.

The division of MRAC into indirect and direct
is, in general, unique to MRC structures. It is
possible because the maps in (2) and (3) exist,
which is a direct consequence
of the control objective and the assumptions the
plant and reference model are required to satisfy
for the control law to exist. These assumptions
are summarized below:

Plant Assumptions: $G_p(s)$ is minimum phase,
i.e., has stable zeros; its relative degree, $n^*$ =
number of poles − number of zeros, is known;
and an upper bound $n$ on its order is also
known. In addition, the sign of its high-frequency
gain is known, although this assumption can be
relaxed at the cost of additional complexity.

Reference Model Assumptions: $W_m(s)$ has stable
poles and zeros, its relative degree is equal to
$n^*$, that of the plant, and its order is less than or
equal to the one assumed for the plant, i.e., $n$.

The above assumptions are also used to meet
the control objective in the case of known parameters,
and therefore the minimum phase and
relative degree assumptions are characteristics of
the control objective and do not arise because
of adaptive control considerations. The relative
degree matching is used to avoid the need to
differentiate signals in the control law. The minimum
phase assumption comes from the fact that
the only way for the control law to force the
closed-loop plant transfer function to be equal
to that of the reference model is to cancel the
zeros of the plant using feedback and replace
them with those of the reference model using a
feedforward term. Such zero-pole cancelations
are possible if the zeros are stable, i.e., the plant
is minimum phase; otherwise stability cannot be
guaranteed for nonzero initial conditions and/or
inexact cancelations.

Adaptive Control of Linear Time-Invariant Systems, Fig. 1 Structure of (a) indirect MRAC, (b) direct MRAC

The design of MRAC in Fig. 1 has additional
variations depending on how the adaptive law
is designed. If the reference model is chosen to
be strictly positive real (SPR), which limits its
transfer function and that of the plant to have
relative degree 1, the derivation of the adaptive law
and the stability analysis are fairly straightforward,
and for this reason, this class of MRAC schemes
attracted a lot of interest. As the relative degree
changes to 2, the design becomes more complex
because, in order to use the SPR property, the CE
control law has to be modified by adding an extra
nonlinear term. The stability analysis remains
simple, as a single Lyapunov function can be
used to establish stability. As the relative degree
increases further, the design complexity increases
by requiring the addition of more nonlinear terms
in the CE control law (Ioannou and Fidan 2006;
Ioannou and Sun 1996). The simplicity of using
a single Lyapunov function analysis for stability
remains, however. This approach covers both direct
and indirect MRAC and leads to adaptive laws
which contain no normalization signals (Ioannou
and Fidan 2006; Ioannou and Sun 1996). A more
straightforward design approach is based on the
CE principle, which separates the control design
from the parameter estimation part and leads to a
much wider class of MRAC which can be direct
or indirect. In this case, the adaptive laws need
to be normalized for stability, and the analysis is
far more complicated than the approach based on
SPR with no normalization. An example of such a
direct MRAC scheme, for the case of known sign
of the high-frequency gain, which is assumed to
be positive for both plant and reference model, is
listed below:

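Control law:

$$u_p = \theta_1^T(t)\frac{\alpha(s)}{\Lambda(s)}u_p + \theta_2^T(t)\frac{\alpha(s)}{\Lambda(s)}y_p + \theta_3(t)y_p + c_0(t)r = \theta^T(t)\omega, \qquad (4)$$

where $\alpha \triangleq \alpha_{n-2}(s) = [s^{n-2}, s^{n-3}, \ldots, s, 1]^T$
for $n \ge 2$, and $\alpha(s) \triangleq 0$ for $n = 1$, and $\Lambda(s)$
is a monic polynomial with stable roots and
degree $n - 1$ having the numerator of $W_m(s)$ as a
factor.

Adaptive law:

$$\dot{\theta} = \Gamma\varepsilon\phi, \qquad (5)$$

where $\Gamma$ is a positive definite matrix referred
to as the adaptive gain and $\dot{\rho} = \gamma\varepsilon\xi$,
$\varepsilon = \frac{e_1 - \rho\xi}{m_s^2}$,
$m_s^2 = 1 + \phi^T\phi + u_f^2$, $\xi = \theta^T\phi + u_f$,
$\phi = -W_m(s)\omega$, and $u_f = W_m(s)u_p$.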

The stability properties of the above direct
MRAC scheme, which are typical for all classes
of MRAC, are the following (Ioannou and Fidan
2006; Ioannou and Sun 1996): (i) all signals
in the closed-loop plant are bounded, and the
tracking error $e_1$ converges to zero asymptotically,
and (ii) if the plant transfer function contains no
zero-pole cancelations and $r$ is sufficiently rich
of order $2n$, i.e., it contains at least $n$ distinct
frequencies, then the parameter error $|\tilde{\theta}| =
|\theta - \theta^*|$ and the tracking error $e_1$ converge to
zero exponentially fast.
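
To make the direct MRAC idea concrete, the following sketch simulates the simplest relative-degree-1 case: a scalar plant $\dot{y}_p = a y_p + b u_p$ with $a$ and $b$ unknown to the controller and only the sign of $b$ assumed known. Note that this is not the normalized scheme (4)-(5) above; it swaps in the simpler unnormalized, Lyapunov-based adaptive law that applies when the reference model is SPR. All numerical values, the sinusoidal reference input, and the forward-Euler integration are illustrative assumptions.

```python
import math


def simulate_mrac(a=1.0, b=2.0, a_m=2.0, b_m=2.0, gamma=5.0,
                  dt=1e-3, t_end=50.0):
    """Direct MRAC for dy/dt = a*y + b*u tracking dy_m/dt = -a_m*y_m + b_m*r.

    The controller never uses a or b, only the assumption b > 0.
    Returns the tracking-error history and the final controller gains.
    """
    y = y_m = 0.0            # plant and reference-model outputs
    k_y = k_r = 0.0          # controller parameter estimates theta(t)
    history = []             # (time, tracking error) samples
    for step in range(round(t_end / dt)):
        t = step * dt
        r = math.sin(t)                    # reference input (assumed)
        u = k_y * y + k_r * r              # certainty-equivalence control law
        e1 = y - y_m                       # tracking error e1 = y_p - y_m
        history.append((t, e1))
        # Right-hand sides of plant, reference model, and adaptive law.
        dy = a * y + b * u
        dy_m = -a_m * y_m + b_m * r
        dk_y = -gamma * e1 * y             # sign choice relies on b > 0
        dk_r = -gamma * e1 * r
        # Forward-Euler integration step.
        y += dt * dy
        y_m += dt * dy_m
        k_y += dt * dk_y
        k_r += dt * dk_r
    return history, (k_y, k_r)


history, gains = simulate_mrac()
```

For these values the matching gains that the estimates are adapting toward are $k_y^* = -(a_m + a)/b = -1.5$ and $k_r^* = b_m/b = 1$. The open-loop plant is unstable ($a > 0$), so a run of this sketch would be expected to show the tracking error shrinking over time as the gains adapt, in line with property (i).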
Adaptive Pole Placement Control

Let us consider the SISO LTI plant:

$$y_p = G_p(s)u_p, \qquad G_p(s) = \frac{Z_p(s)}{R_p(s)}, \qquad (6)$$

where $G_p(s)$ is proper and $R_p(s)$ is a monic
polynomial. The control objective is to choose
the plant input $u_p$ so that the closed-loop poles
are assigned to those of a given monic Hurwitz
polynomial $A^*(s)$, and $y_p$ is required to follow
a certain class of reference signals $y_m$ assumed
to satisfy $Q_m(s)y_m = 0$, where $Q_m(s)$ is known
as the internal model of $y_m$ and is designed to
have all roots in $\mathrm{Re}\{s\} \le 0$ with no repeated
roots on the $j\omega$-axis. The polynomial $A^*(s)$, referred
to as the desired closed-loop characteristic
polynomial, is chosen based on the closed-loop
performance requirements. To meet the control
objective, we make the following assumptions
about the plant:

P1. $G_p(s)$ is strictly proper with known degree,
and $R_p(s)$ is a monic polynomial whose
degree $n$ is known, and $Q_m(s)Z_p(s)$ and $R_p(s)$
are coprime.

Assumption P1 allows $Z_p$ and $R_p$ to be
non-Hurwitz, in contrast to the MRAC case where $Z_p$
is required to be Hurwitz.

The design of the APPC scheme is based
on the CE principle. The plant parameters are
estimated at each time $t$ and used to calculate the
controller parameters that meet the control objective
for the estimated plant as follows: Using
(6) the plant equation can be expressed in a
form convenient for parameter estimation via the
model (Goodwin and Sin 1984; Ioannou and
Fidan 2006; Ioannou and Sun 1996):

$$z = \theta_p^{*T}\phi,$$

where $z = \frac{s^n}{\Lambda_p(s)}y_p$,
$\theta_p^* = [\theta_b^{*T}, \theta_a^{*T}]^T$,
$\phi = \left[\frac{\alpha_{n-1}^T(s)}{\Lambda_p(s)}u_p, -\frac{\alpha_{n-1}^T(s)}{\Lambda_p(s)}y_p\right]^T$,
$\alpha_{n-1}(s) = [s^{n-1}, \ldots, s, 1]^T$,
$\theta_a^* = [a_{n-1}, \ldots, a_0]^T$,
$\theta_b^* = [b_{n-1}, \ldots, b_0]^T$, and $\Lambda_p(s)$ is a Hurwitz monic
design polynomial. As an example of a parameter
estimation algorithm, we consider the gradient
algorithm

$$\dot{\theta}_p = \Gamma\varepsilon\phi, \qquad \varepsilon = \frac{z - \theta_p^T\phi}{m_s^2}, \qquad m_s^2 = 1 + \phi^T\phi, \qquad (7)$$

where $\Gamma = \Gamma^T > 0$ is the adaptive gain
and $\theta_p = [\hat{b}_{n-1}, \ldots, \hat{b}_0, \hat{a}_{n-1}, \ldots, \hat{a}_0]^T$ are the
estimated plant parameters, which can be used to
form the estimated plant polynomials
$\hat{R}_p(s,t) = s^n + \hat{a}_{n-1}(t)s^{n-1} + \ldots + \hat{a}_1(t)s + \hat{a}_0(t)$ and
$\hat{Z}_p(s,t) = \hat{b}_{n-1}(t)s^{n-1} + \ldots + \hat{b}_1(t)s + \hat{b}_0(t)$
of $R_p(s)$ and $Z_p(s)$, respectively, at each time $t$.
The adaptive control law is given as


$$u_p = \left(\Lambda(s) - \hat{L}(s,t)Q_m(s)\right)\frac{1}{\Lambda(s)}u_p - \hat{P}(s,t)\frac{1}{\Lambda(s)}(y_p - y_m), \qquad (8)$$

where $\hat{L}(s,t)$ and $\hat{P}(s,t)$ are obtained by solving
the polynomial equation
$\hat{L}(s,t) \cdot Q_m(s) \cdot \hat{R}_p(s,t) + \hat{P}(s,t) \cdot \hat{Z}_p(s,t) = A^*(s)$ at each
time $t$. The operation $X(s,t) \cdot Y(s,t)$ denotes a
multiplication of polynomials where $s$ is simply
treated as a variable. The existence and uniqueness
of $\hat{L}(s,t)$ and $\hat{P}(s,t)$ is guaranteed provided
$\hat{R}_p(s,t) \cdot Q_m(s)$ and $\hat{Z}_p(s,t)$ are coprime
at each frozen time $t$. The adaptive laws that
generate the coefficients of $\hat{R}_p(s,t)$ and $\hat{Z}_p(s,t)$
cannot guarantee this property, which means that
at certain points in time, the solution $\hat{L}(s,t)$,
$\hat{P}(s,t)$ may not exist. This problem is known
as the stabilizability problem in indirect APPC 
and further modifications are needed in order to 
handle it (Goodwin and Sin 1984; Ioannou and 
Fidan 2006; Ioannou and Sun 1996). Assuming 
that the stabilizability condition holds at each 
time t, it can be shown (Goodwin and Sin 1984; 
Ioannou and Fidan 2006; Ioannou and Sun 1996) 
that all signals are bounded and the tracking 
error converges to zero with time. Other indirect
adaptive pole placement control schemes
include adaptive linear quadratic (Ioannou and 
Fidan 2006; Ioannou and Sun 1996). In principle 
any nonadaptive control scheme can be made 
adaptive by replacing the unknown parameters 
with their estimates in the calculation of the 
controller parameters. The design of direct APPC 
schemes is not possible in general as the map
between the plant and controller parameters is
nonlinear, and the plant parameters cannot be
expressed as a convenient function of the controller
parameters. This prevents parametrization
of the plant transfer function with respect to the
controller parameters as done in the case of MRC.
In special cases where such parametrization is
possible, such as in MRAC, which can be viewed
as a special case of APPC, the design of direct
APPC is possible. Chapters on ► Adaptive Control,
Overview, ► Robust Adaptive Control, and
► History of Adaptive Control provide additional
information regarding MRAC and APPC.
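
The frozen-time pole placement step described above amounts to a linear solve. The following Python sketch is an illustration under stated assumptions, not the implementation used in the cited references: it builds the Sylvester-type matrix of the polynomial equation L·Q_m·R_p + P·Z_p = A* for the current estimates and solves it exactly with rational arithmetic. The function names (poly_mul, solve_linear, pole_placement) and the convention of listing coefficients from the highest power down are hypothetical choices made here. A singular matrix indicates that the estimated polynomials are not coprime, which is the stabilizability problem mentioned in the text.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (highest power first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += Fraction(ai) * Fraction(bj)
    return out


def solve_linear(M, rhs):
    """Gauss-Jordan elimination over the rationals; ValueError if M is singular."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("estimated polynomials are not coprime")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def pole_placement(R_hat, Z_hat, Q_m, A_star):
    """Solve L*Q_m*R_hat + P*Z_hat = A_star for L (degree n-1) and P (degree n+q-1)."""
    n = len(R_hat) - 1                       # plant order
    D = poly_mul(Q_m, R_hat)                 # Q_m * R_hat, degree d = n + q
    d = len(D) - 1
    # Pad the numerator estimate with leading zeros up to degree n-1.
    Z = [Fraction(0)] * (n - len(Z_hat)) + [Fraction(z) for z in Z_hat]
    size = n + d                             # number of unknowns and of equations
    if len(A_star) != size:
        raise ValueError("A_star must have degree 2n + q - 1")
    M = [[Fraction(0)] * size for _ in range(size)]
    for j in range(n):                       # columns for the coefficients of L
        for i, Di in enumerate(D):
            M[i + j][j] = Di
    for k in range(d):                       # columns for the coefficients of P
        for i, Zi in enumerate(Z):
            M[i + k + 1][n + k] = Zi
    x = solve_linear(M, [Fraction(a) for a in A_star])
    return x[:n], x[n:]
```

As a usage example, for the unstable first-order estimate R_p = s - 1, Z_p = 1 with Q_m = s and A* = (s + 1)^2, the call pole_placement([1, -1], [1], [1, 0], [1, 2, 1]) returns L = 1 and P = 3s + 1. In an adaptive loop this solve would be repeated at each time t with the current estimates, and the ValueError branch is where a modification for handling loss of stabilizability would have to act.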

Search Methods, Multiple Models, 
and Switching Schemes 

One of the drawbacks of APPC is the stabilizability
condition, which requires the estimated plant
at each time t to satisfy the detectability and
stabilizability conditions that are necessary for the
controller parameters to exist. Since the adaptive
law cannot guarantee such a property, an approach
emerged that involves the pre-calculation
of a set of controllers based on the partitioning
of the plant parameter space. The problem
then becomes one of identifying which one of
the controllers is the most appropriate one. The
switching to the “best” possible controller could
be based on some logic that is driven by some
cost index, multiple estimation models, and other
techniques (Fekri et al. 2007; Hespanha et al.
2003; Kuipers and Ioannou 2010; Morse 1996;
Narendra and Balakrishnan 1997; Stefanovic and
Safonov 2011). One of the drawbacks of this approach
is that it is difficult, if possible at all, to find
a finite set of stabilizing controllers that covers
the whole unknown parameter space, especially
for high-order plants. Even if such a set is found,
it may be so large that it is impractical. Another
drawback that is present in all adaptive schemes
is that in the absence of persistently exciting
signals, which guarantee that the input/output data
have sufficient information about the unknown
plant parameters, there is no guarantee that the
controller the scheme converged to is indeed a
stabilizing one. In other words, if switching is
disengaged or the adaptive law is switched off,
there is no guarantee that a small disturbance
will not drive the corresponding LTI scheme
unstable. Nevertheless these techniques allow the
incorporation of well-established robust control
techniques in designing a priori the set of controller
candidates. The problem is that if the
plant parameters change in a way not accounted
for a priori, no controller from the set may be
stabilizing, leading to an unstable system.
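
A switching logic driven by a cost index can be sketched as follows. This is a minimal illustration in the spirit of hysteresis-based supervisory switching, not the specific logic of any of the cited schemes. It assumes that each candidate controller is paired with an estimation model whose output error is available, that the cost index is the integrated squared error, and that a switch occurs only when another candidate is better by a relative margin h. The function names and the numerical values are hypothetical.

```python
def update_costs(costs, errors, dt, forgetting=0.0):
    """Euler step of the cost index: integrated squared estimation error per model."""
    return [c + dt * (e * e - forgetting * c) for c, e in zip(costs, errors)]


def hysteresis_switch(active, costs, h=0.1):
    """Switch to the lowest-cost candidate only if it beats the active one by margin h."""
    best = min(range(len(costs)), key=lambda i: costs[i])
    if (1.0 + h) * costs[best] < costs[active]:
        return best
    return active


# Usage: three candidate models with constant (assumed) estimation errors.
costs = [0.0, 0.0, 0.0]
active = 0
for _ in range(10):
    costs = update_costs(costs, [1.0, 0.1, 0.5], dt=0.1)
    active = hysteresis_switch(active, costs)
```

The hysteresis margin is a design choice that prevents chattering between candidates with nearly equal costs. It does not address the drawbacks noted above: if no candidate in the set is stabilizing for the actual plant, no switching rule of this kind can help.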

Robust Adaptive Control 

The MRAC and APPC schemes presented above 
are designed for LTI systems. Due to the adaptive 
law, the closed-loop system is no longer LTI but 
nonlinear and time varying. It has been shown 
using simple examples that the pure integral action
of the adaptive law could cause parameter
drift in the presence of small disturbances and/or 
unmodeled dynamics (Ioannou and Fidan 2006; 
Ioannou and Kokotovic 1983; Ioannou and Sun 
1996) which could then excite the unmodeled 
dynamics and lead to instability. Modifications 
to counteract these possible instabilities led to 
the field of robust adaptive control whose focus 
was to modify the adaptive law in order to guarantee
robustness with respect to disturbances,
unmodeled dynamics, time-varying parameters, 
classes of nonlinearities, etc., by using techniques 
such as normalizing signals, projection, fixed and 
switching sigma modification, etc. 
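
The effect of two of these modifications can be illustrated with a scalar example. The sketch below is an assumption-laden illustration, not a scheme taken from the cited references: it uses an Euler discretization of a gradient-type law with a scalar gain gamma, a fixed sigma modification of the form theta_dot = gamma * (eps * phi - sigma * theta), and a crude projection that rescales the estimate onto a ball of radius bound. The constant product eps * phi = 0.1 stands in for a small disturbance-induced bias in the estimation error.

```python
import math


def adaptive_step(theta, eps, phi, gamma, dt, sigma=0.0, bound=None):
    """One Euler step of theta_dot = gamma * (eps * phi - sigma * theta)."""
    new = [th + dt * gamma * (eps * ph - sigma * th) for th, ph in zip(theta, phi)]
    if bound is not None:
        norm = math.sqrt(sum(v * v for v in new))
        if norm > bound:                      # crude projection: rescale onto the ball
            new = [v * bound / norm for v in new]
    return new


def run(steps, **kwargs):
    """Iterate the law from theta = 0 under a constant bias eps * phi = 0.1."""
    theta = [0.0]
    for _ in range(steps):
        theta = adaptive_step(theta, eps=0.1, phi=[1.0], gamma=1.0, dt=0.01, **kwargs)
    return theta[0]


drift = run(5000)                 # pure integral action: keeps growing with time
leaky = run(5000, sigma=1.0)      # sigma modification: settles near 0.1
clipped = run(5000, bound=0.5)    # projection: held at the assumed bound 0.5
```

With pure integral action the estimate grows linearly without bound, which is the parameter drift described above. The leakage term bounds it at the cost of a bias, and projection bounds it using prior knowledge of where the true parameters lie.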

Cross-References 

► Adaptive Control, Overview 

► History of Adaptive Control 

► Model Reference Adaptive Control 

► Robust Adaptive Control 

► Switching Adaptive Control

Adaptive Control, Overview

Callaghan, NSW, Australia


Abstract 

Adaptive control describes a range of techniques
for altering control behavior using measured signals
to achieve high control performance under
uncertainty. The theory and practice of adaptive
control have matured in many areas. This entry
gives an overview of adaptive control with pointers
to more detailed specific topics.

Keywords 

Adaptive control; Estimation 

Introduction 

What Is Adaptive Control 

Feedback control has a long history of using sensing,
decision, and actuation elements to achieve
an overall goal. The general structure of a control
system may be illustrated in Fig. 1. It has long
been known that high fidelity control relies on
knowledge of the system to be controlled. For
example, in most cases, knowledge of the plant
gain and/or time constants (represented by $\theta_p$ in
Fig. 1) is important in feedback control design.
In addition, disturbance characteristics (e.g., frequency
of a sinusoidal disturbance), $\theta_d$ in Fig. 1,
are important in feedback compensator design.

Many control design and synthesis techniques
are model based, using prior knowledge of both
model structure and parameters. In other cases,
a fixed controller structure is used, and the controller
parameters, $\theta_c$ in Fig. 1, are tuned empirically
during control system commissioning.



Adaptive Control, Overview, Fig. 1 General control
and adaptive control diagram

However, if the plant parameters vary widely with
time or have large uncertainties, these approaches 
may be inadequate for high-performance control. 

There are two main ways of approaching high- 
performance control with unknown plant and 
disturbance characteristics: 

1. Robust control (► Optimization Based Robust 
Control), wherein a controller is designed to 
perform adequately despite the uncertainties. 
Variable structure control may have very high
levels of robustness in some cases and therefore
is a special class of robust nonlinear
control.

2. Adaptive control, where the controller learns
and adjusts its strategy based on measured
data. This frequently takes the form where the
controller parameters, $\theta_c$, are time-varying
functions that depend on the available data
($y(t)$, $u(t)$, and $r(t)$). Adaptive control has
close links to intelligent control (including
neural control (► Neural Control and Approximate
Dynamic Programming), where specific
types of learning are considered) and also
to stochastic adaptive control (► Stochastic
Adaptive Control).

Robust control is most useful when there are
large unmodeled dynamics (i.e., structural uncertainties),
relatively high levels of noise, or
rapid and unpredictable parameter changes. Conversely,
for slow or largely predictable parameter
variations, with relatively well-known model
structure and limited noise levels, adaptive control
may provide a very useful tool for high-performance
control (Astrom and Wittenmark
2008).

Varieties of Adaptive Control 

One practical variant of adaptive control is controller
auto-tuning (► Autotuning). Auto-tuning
is particularly useful for PID and similar controllers
and involves a specific phase of signal
injection, followed by analysis, PID gain computation,
and implementation. These techniques are
an important aid to commissioning and maintenance
of distributed control systems.


There are also large classes of adaptive
controllers that continuously monitor the
plant input-output signals to adjust the strategy.
These adjustments are often parametrized by a
relatively small number of coefficients, $\theta_c$. These
include schemes where the controller parameters
are directly adjusted using measurable data
(also referred to as “implicit,” since there
is no explicit plant model generated). Early
examples of this often included model reference
adaptive control (► Model Reference Adaptive
Control). Other schemes (Middleton et al. 1988)
explicitly estimate a plant model $\theta_p$ and then
perform online control design; the adaptation
of the controller parameters $\theta_c$ is therefore
indirect. This then led on to a range of
other adaptive control techniques applicable to
linear systems (► Adaptive Control of Linear
Time-Invariant Systems).

There have been significant questions concerning
the sensitivity of some adaptive control
algorithms to unmodeled dynamics, time-varying 
systems, and noise (Ioannou and Kokotovic 
1984; Rohrs et al. 1985). This prompted a very 
active period of research to analyze and redesign 
adaptive control to provide suitable robustness 
(►Robust Adaptive Control) (e.g., Anderson 
et al. 1986; Ioannou and Sun 2012) and parameter 
tracking for time-varying systems (e.g., Kreisselmeier
1986; Middleton and Goodwin 1988).

Work in this area further spread to nonparametric
methods, such as switching, or supervisory
adaptive control (► Switching Adaptive
Control) (e.g., Fu and Barmish 1986; Morse et al.
1992). In addition, there has been a great deal of
work on the more difficult problem of adaptive
control for nonlinear systems (► Nonlinear Adaptive
Control).

A further adaptive control technique is extremum
seeking control (► Extremum Seeking
Control). In extremum seeking (or self-optimizing)
control, the desired reference for the system
is unknown; instead, we wish to maximize (or
minimize) some variable in the system (Ariyur
and Krstic 2003). These techniques have quite
distinct modes of operation that have proven
important in a range of applications.

A final control algorithm that has nonparametric
features is iterative learning control
(►Iterative Learning Control) (Amann et al. 
1996; Moore 1993). This control scheme 
considers a system with a highly structured, 
namely, repetitive finite run, control problem. 
In this case, by taking a nonparametric approach 
of utilizing information from previous run(s), in 
many cases, near-perfect asymptotic tracking can 
be achieved. 
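
The run-to-run idea can be sketched with a simple update law. The example below assumes a P-type update, u_{k+1}(t) = u_k(t) + gain * e_k(t+1), which is one common choice and is not prescribed by the text, applied to a hypothetical first-order discrete-time plant y(t+1) = a*y(t) + b*u(t) that is reset to y(0) = 0 before every run. The plant parameters, the gain, and the function names are illustrative assumptions.

```python
def run_trial(u, a=0.5, b=1.0):
    """Simulate y(t+1) = a*y(t) + b*u(t) from y(0) = 0 over one finite run."""
    y = [0.0]
    for ut in u:
        y.append(a * y[-1] + b * ut)
    return y


def ilc(reference, trials, gain=1.0):
    """P-type learning over repeated runs; returns the peak tracking error of each run."""
    n = len(reference) - 1
    u = [0.0] * n
    history = []
    for _ in range(trials):
        y = run_trial(u)
        e = [r - yt for r, yt in zip(reference, y)]
        history.append(max(abs(v) for v in e[1:]))
        # Correct the input at time t with the error it caused one step later.
        u = [ut + gain * e[t + 1] for t, ut in enumerate(u)]
    return history


# Usage: track a unit step over five samples for seven repeated runs.
errors = ilc([0.0, 1.0, 1.0, 1.0, 1.0, 1.0], trials=7)
```

No plant model is identified here: only the stored input and error of the previous run are used. For this particular plant and gain (gain * b = 1) the peak error shrinks from run to run and vanishes after as many runs as there are samples.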

Adaptive control has a rich history (► History 
of Adaptive Control) and has been established 
as an important tool for some classes of control 
problems.

► History of Adaptive Control

Abstract

This chapter discusses advanced cruise control 
automotive technologies, including adaptive 
cruise control (ACC) in which spacing control, 
speed control, and a number of transitional 
maneuvers must be performed. The ACC system 
must satisfy difficult performance requirements 
of vehicle stability and string stability. The 
technical challenges involved and the control 
design techniques utilized in ACC system design 
are presented.

Collision avoidance; String stability; Traffic stability;
Vehicle following

Introduction

Adaptive cruise control (ACC) is an extension of 
cruise control. An ACC vehicle includes a radar, 
a lidar, or other sensor that measures the distance 
to any preceding vehicle in the same lane on the 
highway. In the absence of preceding vehicles, 
the speed of the car is controlled to a driver- 
desired value. In the presence of a preceding 
vehicle, the controller determines whether the vehicle
should switch from speed control to spacing
control. In spacing control, the distance to the 
preceding car is controlled to a desired value. 

A different form of advanced cruise control is 
a forward collision avoidance (FCA) system. An 
FCA system uses a distance sensor to determine 
if the vehicle is approaching a car ahead too 
quickly and will automatically apply brakes to 
minimize the chances of a forward collision. 
For the 2013 model year, 29 % of vehicles have
forward collision warning as an available option
and 12 % include autonomous braking for a full
FCA system. Examples of models in which an 
FCA system is standard are the Mercedes Benz 
G-class and the Volvo S-60, S-80, XC-60, and 
XC-70. 

It should be noted that an FCA system does 
not involve steady-state vehicle following. An 
ACC system on the other hand involves control of 
speed and spacing to desired steady-state values. 

ACC systems have been in the market in Japan 
since 1995, in Europe since 1998, and in the US 
since 2000. An ACC system provides enhanced 
driver comfort and convenience by allowing extended
operation of the cruise control option even
in the presence of other traffic. 


Controller Architecture 

The ACC system has two modes of steady state 
operation: speed control and vehicle following 
(i.e., spacing control). Speed control is traditional 
cruise control and is a well-established technology.
A proportional-integral controller based
on feedback of vehicle speed (calculated from 
rotational wheel speeds) is used in cruise control 
(Rajamani 2012).

Adaptive Cruise Control, Fig. 1 Structure of longitudinal
control system


Controller design for vehicle following is the 
primary topic of discussion in the sections titled 
“Vehicle Following Requirements” and “String 
Stability Analysis” in this chapter. 

Transitional maneuvers and transitional control
algorithms are discussed in the section titled
“Transitional Maneuvers” in this chapter. 

The longitudinal control system architecture 
for an ACC vehicle is typically designed to be 
hierarchical, with an upper-level controller and a 
lower-level controller, as shown in Fig. 1. 

The upper-level controller determines the desired
acceleration for the vehicle. The lower-level
controller determines the throttle and/or brake
commands required to track the desired acceleration.
Vehicle dynamic models, engine maps, and
nonlinear control synthesis techniques are used 
in the design of the lower controller (Rajamani 
2012). This chapter will focus only on the design 
of the upper controller, also known as the ACC 
controller. 

As far as the upper-level controller is concerned,
the plant model for control design is

$$\ddot{x}_i = u \qquad (1)$$

where the subscript $i$ denotes the $i$th car in a string
of consecutive ACC cars. The acceleration of
the car is thus assumed to be the control input.
However, due to the finite bandwidth associated
with the lower-level controller, each car is actually
expected to track its desired acceleration
imperfectly. The objective of the upper-level controller
design is therefore stated as that of meeting
required performance specifications robustly in
the presence of a first-order lag in the lower-level
controller performance:

$$\ddot{x}_i = \frac{1}{\tau s + 1} \ddot{x}_{i\_des} = \frac{1}{\tau s + 1} u_i. \qquad (2)$$

Equation (1) is thus assumed to be the nominal
plant model while the performance specifications
have to be met even if the actual plant model were
given by Eq. (2). The lag $\tau$ typically has a value
between 0.2 and 0.5 s (Rajamani 2012).

Adaptive Cruise Control, Fig. 2 String of adaptive
cruise control vehicles


Vehicle Following Requirements 

In the vehicle following mode of operation, the 
ACC vehicle maintains a desired spacing from 
the preceding vehicle. The two important performance
specifications that the vehicle following
control system must satisfy are individual vehicle
stability and string stability.

(a) Individual vehicle stability 
Consider a string of vehicles on the highway 
using a longitudinal control system for vehicle 
following, as shown in Fig. 2. Let $x_i$ be the
location of the $i$th vehicle measured from an
inertial reference. The spacing error for the $i$th
vehicle (the ACC vehicle under consideration) is
then defined as

$$\delta_i = x_i - x_{i-1} + L_{des}. \qquad (3)$$

Here, $L_{des}$ is the desired spacing and includes
the preceding vehicle length $\ell_{i-1}$. $L_{des}$ could be
chosen as a function of variables such as the
vehicle speed $\dot{x}_i$. The ACC control law is said to
provide individual vehicle stability if the spacing
error of the ACC vehicle converges to zero when
the preceding vehicle is operating at constant
speed:

$$\ddot{x}_{i-1} \to 0 \Rightarrow \delta_i \to 0. \qquad (4)$$


(b) String stability 

The spacing error is expected to be non-zero 
during acceleration or deceleration of the preceding
vehicle. It is important then to describe how
the spacing error would propagate from vehicle 
to vehicle in a string of ACC vehicles during 
acceleration. The string stability of a string of 
ACC vehicles refers to a property in which spacing
errors are guaranteed not to amplify as they
propagate towards the tail of the string (Swaroop 
and Hedrick 1996). 


String Stability Analysis 

In this section, mathematical conditions that ensure
string stability are provided.

Let $\delta_i$ and $\delta_{i-1}$ be the spacing errors of consecutive
ACC vehicles in a string. Let $H(s)$ be
the transfer function relating these errors:

$$H(s) = \frac{\delta_i}{\delta_{i-1}}(s). \qquad (5)$$

The following two conditions can be used to
determine if the system is string stable:

(a) The transfer function $H(s)$ should satisfy

$$\|H(s)\|_\infty \le 1. \qquad (6)$$

(b) The impulse response function $h(t)$ corresponding
to $H(s)$ should not change sign
(Swaroop and Hedrick 1996), i.e.,

$$h(t) > 0 \quad \forall t \ge 0. \qquad (7)$$

The reasons for these two requirements to be
satisfied are described in Rajamani (2012).
Roughly speaking, Eq. (6) ensures that
$\|\delta_i\|_2 \le \|\delta_{i-1}\|_2$, which means that the energy in the
spacing error signal decreases as the spacing
error propagates towards the tail of the string.
Equation (7) ensures that the steady state spacing
errors of the vehicles in the string have the same
sign. This is important because a positive spacing
error implies that a vehicle is closer than desired
while a negative spacing error implies that it is
further apart than desired. If the steady state value
of $\delta_i$ is positive while that of $\delta_{i-1}$ is negative, then
this might be dangerous due to the vehicle being
closer, even though in terms of magnitude $\delta_i$
might be smaller than $\delta_{i-1}$.

If conditions (6) and (7) are both satisfied, then
$\|\delta_i\|_\infty \le \|\delta_{i-1}\|_\infty$ (Rajamani 2012).


Constant Inter-vehicle Spacing

The ACC system only utilizes on-board sensors
like radar and does not depend on inter-vehicle
communication from other vehicles. Hence the
only variables available as feedback for the upper
controller are inter-vehicle spacing, relative
velocity, and the ACC vehicle’s own velocity.

Under the constant spacing policy, the spacing
error of the $i$th vehicle was defined in Eq. (3).

If the acceleration of the vehicle can be instantaneously
controlled, then it can be shown that a
linear control system of the type

$$\ddot{x}_i = -k_p \delta_i - k_v \dot{\delta}_i \qquad (8)$$

results in the following closed-loop transfer function
between consecutive spacing errors

$$H(s) = \frac{\delta_i}{\delta_{i-1}}(s) = \frac{k_p + k_v s}{s^2 + k_v s + k_p}. \qquad (9)$$

Equation (9) describes the propagation of spacing
errors along the vehicle string.

All positive values of $k_p$ and $k_v$ guarantee
individual vehicle stability. However, it can be
shown that there are no positive values of $k_p$
and $k_v$ for which the magnitude of $H(s)$ can be
guaranteed to be less than unity at all frequencies.
The details of this proof are available in Rajamani
(2012).

Thus, the constant spacing policy will always
be string unstable.

Constant Time-Gap Spacing

Since the constant spacing policy is unsuitable
for autonomous control, a better spacing policy
that can ensure both individual vehicle stability
and string stability must be used. The constant
time-gap (CTG) spacing policy is such a spacing
policy. In the CTG spacing policy, the desired
inter-vehicle spacing is not constant but varies
with velocity. The spacing error is defined as

$$\delta_i = x_i - x_{i-1} + L_{des} + h \dot{x}_i. \qquad (10)$$

The parameter $h$ is referred to as the time-gap.

The following controller based on the CTG
spacing policy can be used to regulate the spacing
error at zero (Swaroop et al. 1994):

$$\ddot{x}_{i\_des} = -\frac{1}{h} (\dot{x}_i - \dot{x}_{i-1} + \lambda \delta_i). \qquad (11)$$

With this control law, it can be shown that the
spacing errors of successive vehicles $\delta_i$ and $\delta_{i-1}$
are independent of each other:

$$\dot{\delta}_i = -\lambda \delta_i. \qquad (12)$$

Thus, $\delta_i$ is independent of $\delta_{i-1}$ and is expected to
converge to zero as long as $\lambda > 0$. However, this
result is only true if any desired acceleration can
be instantaneously obtained by the vehicle, i.e., if
$\tau = 0$.

In the presence of the lower controller and
actuator dynamics given by Eq. (2), it can be
shown that the dynamic relation between $\delta_i$ and
$\delta_{i-1}$ in the transfer function domain is

$$H(s) = \frac{s + \lambda}{h \tau s^3 + h s^2 + (1 + \lambda h) s + \lambda}. \qquad (13)$$

The string stability of this system can be analyzed
by checking if the magnitude of the above
transfer function is always less than or equal to
1. It can be shown that this is the case at all
frequencies if and only if (Rajamani 2012)

$$h \ge 2\tau. \qquad (14)$$

Further, if Eq. (14) is satisfied, then it is also
guaranteed that one can find a value of $\lambda$ such
that Eq. (7) is satisfied. Thus the condition (14) is
necessary (Swaroop and Hedrick 1996) for string
stability.

Since the typical value of $\tau$ is of the order
of 0.5 s, Eq. (14) implies that ACC vehicles must
maintain at least a 1-s time gap between vehicles
for string stability.
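
The magnitude condition can be checked numerically for both spacing policies. The sketch below is an illustration rather than a proof: it evaluates the constant-spacing transfer function, H(s) = (k_p + k_v s)/(s^2 + k_v s + k_p), and the constant time-gap transfer function with actuator lag, H(s) = (s + lambda)/(h tau s^3 + h s^2 + (1 + lambda h) s + lambda), on a logarithmic frequency grid and reports the largest magnitude found. The grid limits, the gain values, and the function names are assumptions made for this example, and a finite grid only approximates the supremum over all frequencies.

```python
def peak_gain(H, w_min=1e-3, w_max=1e3, points=4000):
    """Largest |H(jw)| on a logarithmic frequency grid."""
    ratio = (w_max / w_min) ** (1.0 / (points - 1))
    return max(abs(H(1j * w_min * ratio ** k)) for k in range(points))


def constant_spacing(kp, kv):
    """Spacing-error propagation under the constant spacing policy."""
    return lambda s: (kp + kv * s) / (s * s + kv * s + kp)


def constant_time_gap(h, tau, lam):
    """Spacing-error propagation under the CTG policy with first-order lag tau."""
    return lambda s: (s + lam) / (h * tau * s ** 3 + h * s * s + (1 + lam * h) * s + lam)


# Usage with an assumed lag of 0.5 s:
g_const = peak_gain(constant_spacing(kp=1.0, kv=2.0))                 # exceeds 1
g_ctg_ok = peak_gain(constant_time_gap(h=1.2, tau=0.5, lam=0.4))      # does not exceed 1
g_ctg_bad = peak_gain(constant_time_gap(h=0.6, tau=0.5, lam=0.4))     # exceeds 1
```

With these values the constant spacing policy and the CTG policy with h < 2 tau both show a peak above unity, while the CTG policy with h > 2 tau stays at or below unity, in agreement with the condition stated above.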

Transitional Maneuvers 

While under speed control, an ACC vehicle might 
suddenly encounter a new vehicle in its lane 
(either due to a lane change or due to a slower 
moving preceding vehicle). The ACC vehicle 
must then decide whether to continue to operate 
under the speed control mode or transition to the 
vehicle following mode or initiate hard braking. 
If a transition to vehicle following is required, a 


transitional trajectory that will bring the ACC vehicle
to its steady state following distance needs
to be designed. Similarly, a decision on the mode 
of operation and design of a transitional trajectory 
are required when an ACC vehicle loses its target. 

The regular CTG control law cannot directly
be used to follow a newly encountered vehicle;
see Rajamani (2012) for illustrative examples.

When a new target vehicle is encountered by 
the ACC vehicle, a “range - range rate” diagram 
can be used (Fancher and Bareket 1994) to decide 
if 

(a) The vehicle should use speed control. 

(b) The vehicle should use spacing control (with
a defined transition trajectory in which desired
spacing varies slowly with time).

(c) The vehicle should brake as hard as possible 
in order to avoid a crash. 

The maximum allowable values for acceleration 
and deceleration need to be taken into account in 
making these decisions. 

For the range - range rate ($R$ - $\dot{R}$) diagram,
define range $R$ and range rate $\dot{R}$ as

$$R = x_{i-1} - x_i \qquad (15)$$

$$\dot{R} = \dot{x}_{i-1} - \dot{x}_i = V_{i-1} - V_i \qquad (16)$$

where $x_{i-1}$, $x_i$, $V_{i-1}$, and $V_i$ are inertial positions
and velocities of the preceding vehicle and the
ACC vehicle respectively.

A typical $R$ - $\dot{R}$ diagram is shown in Fig. 3
(Fancher and Bareket 1994). Depending on the
measured real-time values of $R$ and $\dot{R}$ and
the $R$ - $\dot{R}$ diagram in Fig. 3, the ACC system
determines the mode of longitudinal control.
For instance, in region 1, the vehicle continues
to operate under speed control. In region 2,
the vehicle operates under spacing control. In
region 3, the vehicle decelerates at the maximum
allowed deceleration so as to try and avoid a
crash.

Adaptive Cruise Control, Fig. 3 Range vs.
range-rate diagram

Adaptive Cruise Control, Fig. 4 Switching line for
spacing control

The switching line from speed to spacing control
is given by

$$R = -T \dot{R} + R_{final} \qquad (17)$$

where $T$ is the slope of the switching line. When a
slower vehicle is encountered at a distance larger
than the desired final distance $R_{final}$, the switching
line shown in Fig. 4 can be used to determine
when and whether the vehicle should switch to
spacing control. If the distance $R$ is greater than
that given by the line, speed control should be
used.
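
The mode decision can be sketched as a small function of the measured range and range rate. This is an illustration under stated assumptions, not the logic of a production ACC system. The switching-line test follows the rule just described. The boundary of the hard-braking region is not given in the text, so the sketch assumes a parabola R = R_dot^2 / (2 d_max), the range needed to cancel a closing speed at a constant deceleration d_max. The function names, parameter names, and numerical values are hypothetical.

```python
def switching_range(R_dot, T, R_final):
    """Range on the switching line: R = -T * R_dot + R_final."""
    return -T * R_dot + R_final


def select_mode(R, R_dot, T, R_final, d_max=None):
    """Return 'speed', 'spacing', or 'brake' for a measured (R, R_dot) point."""
    # Assumed braking boundary (not from the text): when closing in (R_dot < 0),
    # a range below R_dot^2 / (2 * d_max) cannot be held with deceleration d_max.
    if d_max is not None and R_dot < 0 and R < R_dot ** 2 / (2.0 * d_max):
        return "brake"
    if R > switching_range(R_dot, T, R_final):
        return "speed"
    return "spacing"


# Usage with assumed values T = 10 s, R_final = 30 m, d_max = 5 m/s^2:
far = select_mode(R=100.0, R_dot=-5.0, T=10.0, R_final=30.0, d_max=5.0)    # 'speed'
near = select_mode(R=60.0, R_dot=-5.0, T=10.0, R_final=30.0, d_max=5.0)    # 'spacing'
crit = select_mode(R=5.0, R_dot=-10.0, T=10.0, R_final=30.0, d_max=5.0)    # 'brake'
```

In the first case the measured range of 100 m lies above the switching-line value of 80 m, so speed control continues; in the second it lies below the line, so spacing control takes over.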

The overall strategy (shown by trajectory
ABC) is to first reduce the gap at constant $\dot{R}$ and
then follow the desired spacing given by the
switching line of Eq. (17).

The control law during spacing control on this
transitional trajectory is as follows. Depending on
the value of $\dot{R}$, determine $R$ from Eq. (17). Then
use $R$ as the desired inter-vehicle spacing in the
PD control law

$$\ddot{x}_{des} = -k_p (x_i - R) - k_d (\dot{x}_i - \dot{R}). \qquad (18)$$


The trajectory of the ACC vehicle during constant 
deceleration is a parabola on the R — R diagram 
(Rajamani 2012). 

The switching line should be such that travel 
along the line is comfortable and does not constitute
high deceleration. The deceleration during
coasting (zero throttle and zero braking) can be 
used to determine the slope of the switching line 
(Rajamani 2012). 

Note that string stability is not a concern 
during transitional maneuvers (Rajamani 2012). 

Traffic Stability 

In addition to individual vehicle stability and 
string stability, another type of stability analysis 
that has received significant interest in ACC literature
is traffic flow stability. Traffic flow stability
refers to the stable evolution of traffic velocity
and traffic density on a highway section, for given
inflow and outflow conditions. One well-known
result in this regard in the literature is that traffic flow
is defined to be stable if $\partial q / \partial \rho$ is positive, i.e., as
the density $\rho$ of traffic increases, traffic flow rate
$q$ must increase (Swaroop and Rajagopal 1999).
If this condition is not satisfied, the highway 
section would be unable to accommodate any 
constant inflow of vehicles from an oncoming 
ramp. The steady state traffic flow on the highway 
section would come to a stop, if the ramp inflow 
did not stop (Swaroop and Rajagopal 1999). 

It has been shown that the constant time-gap
spacing policy used in ACC systems has
a negative $q$ - $\rho$ slope and thus does not lead
to traffic flow stability (Swaroop and Rajagopal
1999). It has also been shown that it is possible 
to design other spacing policies (in which the 
desired spacing between vehicles is a nonlinear 
function of speed, instead of being proportional 
to speed) that can provide stable traffic flow 
(Santhanakrishnan and Rajamani 2003). 
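
The negative slope for the constant time-gap policy can be seen from a short steady-state calculation. It is sketched here under assumptions that are not stated in the text above: all vehicles are identical, travel at a common constant speed $v$, and keep exactly the CTG spacing $L_{des} + h v$ measured between the same points on consecutive vehicles, so that the density $\rho$ is the reciprocal of that spacing.

```latex
\rho = \frac{1}{L_{des} + h v}
\quad\Longrightarrow\quad
v = \frac{1 - \rho L_{des}}{\rho h},
\qquad
q = \rho v = \frac{1 - \rho L_{des}}{h},
\qquad
\frac{\partial q}{\partial \rho} = -\frac{L_{des}}{h} < 0 .
```

Under these assumptions the flow rate falls linearly as the density rises, whatever the value of the time gap $h$. This is why a spacing that depends nonlinearly on speed is needed to obtain a positive slope.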

The importance of traffic flow stability has
not been fully understood by the research community.
Traffic flow stability is likely to become
important when the number of ACC vehicles
on the highway increases and their penetration
percentage into vehicles on the road becomes
significant.

Recent Automotive Market 
Developments 

The latest versions of ACC systems on the market 
have been enhanced with collision warning, integrated
brake support, and stop-and-go operation
functionality. 

The collision warning feature uses the same 
radar as the ACC system to detect moving 
vehicles ahead and determine whether driver 
intervention is required. In this case, visual 
and audio warnings are provided to alert the 
driver and brakes are pre-charged to allow 
quick deceleration. On Ford’s ACC-equipped 
vehicles, brakes are also automatically applied 
when the driver lifts a foot off the
accelerator pedal in a detected collision warning
scenario. 

When enabled with stop-and-go functionality,
the ACC system can also operate at low
vehicle speeds in heavy traffic. The vehicle can 
be automatically brought to a complete stop when 
needed and restarted automatically. Stop-and-go 
is an expensive option and requires the use of 
multiple radar sensors on each car. For instance, 
the BMW ACC system uses two short range 
and one long range radar sensor for stop-and-go 
operation. 

The 2013 versions of ACC on the Cadillac 
ATS and on the Mercedes Distronic systems are 
also being integrated with camera based lateral 
lane position measurement systems. On the Mercedes
Distronic systems, a camera steering assist
system provides automatic steering, while on the 
Cadillac ATS, a camera based system provides 
lane departure warnings. 

Future Directions 

Current ACC systems use only on-board sensors 
and do not use wireless communication with 


other vehicles. There is a likelihood of evolution 
of current systems into co-operative adaptive 
cruise control (CACC) systems which utilize 
wireless communication with other vehicles 
and highway infrastructure. This evolution 
could be facilitated by the dedicated short- 
range communications (DSRC) capability being 
developed by government agencies in the US, 
Europe and Japan. In the US, DSRC is being 
developed with a primary goal of enabling 
communication between vehicles and with 
infrastructure to reduce collisions and support 
other safety applications. In CACC, wireless 
communication could provide acceleration 
signals from several preceding downstream 
vehicles. These signals could be used in better 
spacing policies and control algorithms to 
improve safety, ensure string stability, and 
improve traffic flow. 


Cross-References

► Vehicle Dynamics Control

► Vehicular Chains

Abstract

This entry deals with the kinematic self-coordination
aspects to be managed by parts
of underwater floating manipulators, whenever
employed for sample collections at the seafloor.

Kinematic self-coordination is here intended 
as the autonomous ability exhibited by the system 
in closed loop specifying the most appropriate 
reference velocities for its main constitutive parts 
(i.e., the supporting vehicle and the arm) in order 
to execute the sample collection with respect to 
both safety and best operability conditions for 
the system while also guaranteeing the needed 
“execution agility” in performing the task, particularly
useful in case of underwater repeated
collections. To this end, the devising and employment
of a unifying control framework capable
of guaranteeing the above properties will be
outlined. 

Such a framework is however intended to only 
represent the so-called Kinematic Control Layer 
(KCL) overlaying a Dynamic Control Layer 
(DCL), where the overall system dynamic and 
hydrodynamic effects are suitably accounted 
for, to the benefit of closed loop tracking 
of the reference system velocities. Since the 
DCL design is carried out in a way which 
is substantially independent from the system 
mission(s), it will not constitute a specific topic 
of this entry, even if some orienting references 
about it will be provided. 

At the end of this entry, building on the structural invariance of the
devised KCL framework, future challenges are outlined that address much
wider and more complex underwater applications than the sample
collection considered here.


Keywords 

Kinematic Control Layer (KCL); Manipulator;
Motion priorities

Introduction 

An automated system for underwater sampling is here intended to be an
autonomous underwater floating manipulator (see Fig. 1) capable of
collecting samples corresponding to an a priori assigned template. The
snapshots of Fig. 1 show in operation the most recent realization of a
system of this kind (completed in 2012 within the EU-funded project
TRIDENT; Sanz et al. 2012). It is characterized by a vehicle and a
7-dof arm exhibiting comparable masses and inertia, thus resulting in a
potentially faster and more agile design than the very few similar
previous realizations.

Its general operational mode consists of exploring an assigned area of
the seafloor and executing a collection each time a feature
corresponding to the assigned template is recognized (by the vehicle,
which is endowed with a stereovision system) as a sample to be
collected.

Thus the autonomous functionalities to be exhibited are the following
(sequenced as listed, on an event-driven basis): (1) explore an
assigned seabed area while visually performing model-based sample
recognition, (2) suspend the exploration and grasp a recognized sample,
(3) deposit the sample inside an onboard container, and (4) restart
exploring until the next sample is recognized.

Functionalities (1) and (4), since they do not require the use of the
arm, naturally fall within the topics of navigation, patrolling, visual
mapping, etc., which are typical of traditional AUVs and consequently
will not be discussed here. Only functionality (2) will be discussed,
since it is the most distinctive of the considered system (often termed
I-AUV, with “I” for “Intervention”) and because functionality (3) can
be established along the same lines as (2), as a simpler particular
case.

Advanced Manipulation for Underwater Sampling, Fig. 1 Snapshots showing the underwater floating manipulator
TRIDENT when autonomously picking an identified object


Focusing then on functionality (2), note that the ultimate objective of
grasping the sample, which translates into a specific position/attitude
to be reached by the end-effector, must be achieved only after other
objectives have been fulfilled, each one reflecting the need to keep
the system operating within both its safety and best operability
conditions. For instance, the arm’s joint limits must be respected and
singular arm postures avoided. Moreover, since the sample position is
estimated by the vehicle via a stereo camera, the sample must stay
roughly centered inside the camera’s visual cone, since otherwise the
visual feedback would be lost and the sample search would need to start
again. Also, the sample must stay within suitable horizontal and
vertical distance limits from the camera frame for the vision algorithm
to perform well. Furthermore, in these conditions the vehicle should
maintain an approximately horizontal attitude to save energy.

With the exception of the objective of bringing the end-effector
position/attitude to the grasping one, which is clearly an equality
condition, the related safety/enabling objectives are instead
represented by a set of inequality conditions (involving various system
variables), whose achievement (in accordance with their safety/enabling
role) must therefore be given the highest priority.

System motions guaranteeing the achievement of such prioritized
objectives should moreover manage them concurrently (i.e., avoiding
sequential motion management whenever possible). This means requiring
each objective to progress toward its achievement by exploiting, at
each time instant, only the residual system mobility allowed by the
current progress of its higher priority objectives. Since the available
system mobility progressively increases over time, as the inequality
objectives are progressively achieved, the grasping objective is
eventually also completed, while progressing within adequate system
safety and best operability conditions. In this way the system also
exhibits the necessary “agility” in executing its maneuvers, which are
faster than they would be if executed sequentially.

An effective way to incorporate all the inequality and equality
objectives within a uniform and computationally efficient
task-priority-based algorithmic framework for underwater floating
manipulators has resulted from the developments outlined in the next
section.

The developed framework, however, solely represents the so-called
Kinematic Control Layer (KCL) of the overall control architecture, that
is, the one in charge of the closed-loop real-time control generating
the system velocity vector $\dot{y}$ as a reference signal. This
reference is in turn concurrently tracked, via the action of the arm
joint torques and vehicle thrusters, by an adequate underlying Dynamic
Control Layer (DCL), where the overall dynamic and hydrodynamic effects
are taken into account to the benefit of such velocity tracking. Since
the DCL can actually be designed in a way that is substantially
independent of the system mission(s), it will not constitute a specific
topic of this entry. Its detailed dynamic-hydrodynamic model-based
structure, also including a stability analysis, can be found in
Casalino (2011), together with a more detailed description of the
overlying KCL, while more general references on underwater dynamic
control aspects can be found, for instance, in Antonelli (2006).

Task-Priority-Based Control of 
Floating Manipulators 

The above-outlined typical set of objectives (of 
inequality and/or equality types) to be achieved 
within a sampling mission are here formalized. 
Then some helpful generalizing definitions are 
given, prior to presenting the related unifying 
task-priority-based algorithmic framework to 
be used. 

Inequality and Equality Objectives 

One of the objectives, of inequality type, related to both arm safety
and its operability is that of maintaining each joint within
corresponding minimum and maximum limits, that is,

$$ q_{im} < q_i < q_{iM}; \quad i = 1, 2, \ldots, 7 $$

Moreover, in order to have the arm operating with dexterity, its
manipulability measure $\mu$ (Nakamura 1991; Yoshikawa 1985) must
ultimately stay above a minimum threshold value, thus also requiring
the achievement of the inequality type objective

$$ \mu > \mu_m $$

While the above objectives arise from inherently scalar variables,
other objectives instead arise as conditions to be achieved within the
Cartesian space, where each of them can be conveniently expressed in
terms of the modulus of a corresponding Cartesian vector variable.

To be more specific, consider, for instance, the need to avoid
occlusions between the sample and the stereo camera, which might
occasionally occur due to the motion of the arm links. This need can be
translated into the ultimate achievement of the following set of
inequalities, for suitably chosen values of the boundaries:

$$ \|l\| > l_m; \quad \|\tau\| > \tau_m; \quad \|\eta\| < \eta_M $$

where $l$ is the vector lying on the vehicle x-y plane, joining the arm
elbow with the line parallel to the vehicle z-axis and passing through
the camera frame origin, as sketched in Fig. 2a. Moreover, $\eta$ is
the misalignment vector formed by vector $\tau$, also lying on the
vehicle x-y plane, joining the lines parallel to the vehicle z-axis and
passing, respectively, through the elbow and the end-effector origin.

As for the vehicle, it must keep the object of interest roughly
centered in the camera frame (see Fig. 2b). This means that the modulus
of the orientation error $\xi$, formed by the unit vector $n_p$ of
vector $p$ from the sample to the camera frame and the unit vector
$k_c$ of the z-axis of the camera frame itself, must ultimately satisfy
the inequality

$$ \|\xi\| < \xi_M $$

Furthermore, the camera must also be closer than a given horizontal
distance $d_M$ to the vertical line passing through the sample, and it
must lie between a minimum and a maximum height with respect to the
sample itself, thus implying the achievement of the following
inequalities (Fig. 2c, d):

$$ \|d\| < d_M; \quad h_m < \|h\| < h_M $$

Advanced Manipulation for Underwater Sampling, Fig. 2 Vectors allowing for the definition of some inequality
objectives in the Cartesian space: (a) camera occlusion, (b) camera centering, (c) camera distance, (d) camera height

Also, since the vehicle should exhibit an almost horizontal attitude,
this further requires the achievement of the following additional
inequality:

$$ \|\phi\| < \phi_M $$

with $\phi$ the misalignment vector formed by the absolute vertical
unit vector $k_o$ with the vehicle z-axis one $k_v$.

Finally, the end-effector must eventually reach the sample in order to
pick it. Thus the following objectives, now of equality type, must also
be ultimately achieved, where $r$ is the position error and $\vartheta$
the orientation error of the end-effector frame with respect to the
sample frame:

$$ \|r\| = 0; \quad \|\vartheta\| = 0 $$

As already remarked, the achievement of the above inequality objectives
(since they are related to the system safety and/or its best
operability) must globally be given a priority higher than that of
these last equality objectives.

Basic Definitions 

The following definitions regard only a generic vector
$s \in \mathbb{R}^3$ characterizing a corresponding generic objective
defined in the Cartesian space (for instance, all the above-reported
objectives with the exclusion of the joint and manipulability limits).
In this case the vector is termed the error vector of the objective,
and it is assumed to be measured with components on the vehicle frame.
Its modulus

$$ \sigma \doteq \|s\| $$

is termed the error, while its unit vector

$$ n \doteq s/\sigma; \quad \sigma \neq 0 $$

is accordingly denoted as the unit error vector. The following
differential Jacobian relationship can always be evaluated for each of
them:

$$ \dot{s} = H\dot{y} $$

where $\dot{y} \in \mathbb{R}^N$ ($N = 7 + 6$ for the system of Fig. 1)
is the stacked vector composed of the joint velocity vector
$\dot{q} \in \mathbb{R}^7$ plus the stacked vector $v \in \mathbb{R}^6$
of the absolute vehicle velocities (linear and angular) with components
on the vehicle frame, and with $\dot{s}$ clearly representing the time
derivative of vector $s$ itself, as seen from the vehicle frame and
with components on it (see Casalino (2011) for details on the real-time
evaluation of the Jacobian matrices $H$).

Obviously, for the time derivative $\dot{\sigma}$ of the error, the
following differential relationship also holds:

$$ \dot{\sigma} = n^T H\dot{y} $$
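
The rate of the error follows from differentiating the modulus of the error vector. The short Python sketch below, written under illustrative assumptions (a constant 3x2 Jacobian and arbitrary numbers, none of them taken from the TRIDENT system), compares the closed-form rate with a finite-difference estimate. The helper names `norm`, `matvec`, and `error_rate` are hypothetical.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(x * x for x in v))

def matvec(H, y):
    """Matrix-vector product with the matrix stored as a list of rows."""
    return [sum(h * x for h, x in zip(row, y)) for row in H]

def error_rate(s, H, y_dot):
    """Closed-form rate of the error: n^T H y_dot, with n = s / ||s||."""
    sigma = norm(s)
    n = [x / sigma for x in s]      # unit error vector (requires sigma != 0)
    s_dot = matvec(H, y_dot)        # rate of the error vector
    return sum(a * b for a, b in zip(n, s_dot))

# Illustrative values (assumed, not taken from the entry)
s = [3.0, 4.0, 0.0]
H = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
y_dot = [1.0, 0.0]

closed_form = error_rate(s, H, y_dot)

# Finite-difference estimate of the rate of ||s|| over a small time step
dt = 1e-6
s_next = [a + dt * b for a, b in zip(s, matvec(H, y_dot))]
finite_diff = (norm(s_next) - norm(s)) / dt
```

The two values agree up to the discretization error of the finite difference, which is the sense in which the scalar relationship is a projection of the vector one onto the unit error vector.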

Further, each error variable $\sigma$ is assigned in real time a
so-called error reference rate of the form

$$ \dot{\bar{\sigma}} \doteq -\gamma(\sigma - \sigma^o)\,\alpha(\sigma) $$

where $\gamma$ is a positive gain; for equality objectives $\sigma^o$
is the target value and $\alpha(\sigma) \equiv 1$, while for inequality
ones $\sigma^o$ is the threshold value and $\alpha(\sigma)$ is a
left-cutting or right-cutting (in correspondence of $\sigma^o$) smooth
sigmoidal activation function, depending on whether the objective is to
force $\sigma$ to be below or above $\sigma^o$, respectively.
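
As a rough illustration of the error reference rate, the Python sketch below pairs it with one possible activation function. The raised-cosine shape, the transition width `delta`, and all function names are assumptions made here for illustration; the entry only requires a smooth sigmoid that cuts on the appropriate side of the threshold.

```python
import math

def smooth_step(x):
    """Smooth transition from 0 to 1 on [0, 1] (raised cosine), clamped outside."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 0.5 * (1.0 - math.cos(math.pi * x))

def activation(sigma, sigma_o, delta, keep_below):
    """Activation around the threshold sigma_o with an assumed transition width delta.

    keep_below=True  : left-cutting, zero for sigma <= sigma_o (objective met)
    keep_below=False : right-cutting, zero for sigma >= sigma_o (objective met)
    """
    if keep_below:
        return smooth_step((sigma - sigma_o) / delta)
    return smooth_step((sigma_o - sigma) / delta)

def reference_rate(sigma, sigma_o, gamma, alpha=1.0):
    """Error reference rate; alpha = 1 corresponds to an equality objective."""
    return -gamma * (sigma - sigma_o) * alpha

# Inequality objective "keep sigma below 1.0", currently violated (sigma = 1.3)
alpha_violated = activation(1.3, 1.0, 0.2, keep_below=True)
rate_violated = reference_rate(1.3, 1.0, 2.0, alpha_violated)

# Same objective once satisfied (sigma = 0.5): the reference rate vanishes
rate_satisfied = reference_rate(0.5, 1.0, 2.0, activation(0.5, 1.0, 0.2, True))
```

With these assumed numbers the violated objective gets a negative reference rate (the error is driven down toward the threshold), while the satisfied one requests no motion at all.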


If the error rate $\dot{\sigma}$ could be exactly assigned to its
corresponding reference $\dot{\bar{\sigma}}$, it would consequently
smoothly drive $\sigma$ toward the achievement of its associated
objective. Note, however, that for inequality objectives this would
necessarily impose $\dot{\sigma} = 0$ at a point located inside the
interval of validity of the inequality objective itself, whereas such
an error-rate zeroing effect should be relaxed there, so as to allow
the helpful subsequent increase in system mobility, which permits
further progress toward other lower priority control objectives. This
relaxation aspect is dealt with shortly.

Furthermore, in correspondence of a reference error rate
$\dot{\bar{\sigma}}$, the so-called reference error vector rate can
also be defined as

$$ \dot{\bar{s}} \doteq n\dot{\bar{\sigma}} $$

which, for equality objectives requiring the zeroing of their error
$\sigma$, simply becomes

$$ \dot{\bar{s}} = -\gamma s $$

whose evaluation, since it does not require the unit vector $n$, will
be useful for managing equality objectives.

Finally, note that for each objective not defined in the Cartesian
space (like, for instance, the above joint limits and manipulability),
the corresponding scalar error variable, its rate, and its reference
error rate can instead be managed directly, since they obviously do not
require any preliminary scalar reduction process.

Managing the Higher Priority Inequality 
Objectives 

A prioritized list of the various scalar inequality objectives, to be
concurrently and progressively achieved, is suitably established in
descending priority order.

Then, starting from the highest priority one, the linear manifold of
the system velocity vector $\dot{y}$ (i.e., the arm joint velocity
vector $\dot{q}$ stacked with vector $v$ of the vehicle linear and
angular velocities) capable of driving toward its achievement results,
at each time instant, as the set of solutions of the following
minimization problem with scalar argument, with row vector
$G_1 \doteq \alpha_1 n_1^T H_1$ and scalar $\alpha_1$ the same
activation function embedded within the reference error rate
$\dot{\bar{\sigma}}_1$:

$$ S_1 \doteq \left\{ \arg\min_{\dot{y}} \|\dot{\bar{\sigma}}_1 - G_1\dot{y}\|^2 \right\} \iff \dot{y} = G_1^{\#}\dot{\bar{\sigma}}_1 + (I - G_1^{\#}G_1) z_1 \doteq \rho_1 + Q_1 z_1; \quad \forall z_1 \qquad (1) $$

The above minimization, whose solution manifold appears on the right
(also expressed in a concise notation with an obvious correspondence of
terms), parameterized by the arbitrary vector $z_1$, has to be assumed
executed without extracting the common factor $\alpha_1$, that is, by
evaluating the pseudo-inverse matrix $G_1^{\#}$ via the regularized
form

$$ G_1^{\#} = (\alpha_1^2 n_1^T H_1 H_1^T n_1 + p_1)^{-1} \alpha_1 H_1^T n_1 $$

with $p_1$ a suitably chosen bell-shaped, finite-support regularizing
function, centered on zero, of the norm of row vector $G_1$.
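
The effect of the regularization can be seen numerically. The Python sketch below is only illustrative: the raised-cosine bell, its `width` and `peak` values, and the function names are assumptions, since the entry does not specify the regularizing function beyond its qualitative shape.

```python
import math

def bell_regularizer(norm_g, width=0.1, peak=0.01):
    """Bell-shaped, zero-centered, finite-support value (assumed raised cosine)."""
    if norm_g >= width:
        return 0.0
    return 0.5 * peak * (1.0 + math.cos(math.pi * norm_g / width))

def regularized_pinv_row(g, width=0.1, peak=0.01):
    """Regularized pseudo-inverse of a row vector g, returned as a list."""
    gg = sum(x * x for x in g)                        # g g^T, a scalar
    p = bell_regularizer(math.sqrt(gg), width, peak)  # vanishes for large ||g||
    return [x / (gg + p) for x in g]                  # g^T (g g^T + p)^-1

# Far from the threshold (large row vector): exact pseudo-inverse
g_active = [3.0, 4.0]
pinv_active = regularized_pinv_row(g_active)
product_active = sum(a * b for a, b in zip(g_active, pinv_active))

# Objective achieved (activation equal to zero, hence a null row vector)
pinv_inactive = regularized_pinv_row([0.0, 0.0])
```

When the row vector is large the regularizer has no effect and the product of the row vector with its pseudo-inverse is one; when the activation drives the row vector to zero, the regularized pseudo-inverse also goes to zero instead of blowing up.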

In the above solution manifold, when $\alpha_1 = 1$ (i.e., when the
first inequality is still far from being achieved), the second
arbitrary term $Q_1 z_1$ is orthogonal to the first, thus having no
influence on the generated $\dot{\sigma}_1 = \dot{\bar{\sigma}}_1$, and
is consequently suitable for also progressing toward the achievement of
other lower priority objectives without perturbing the current
progressive achievement of the first one. Note, however, that since in
this condition the span of the second term has one dimension less than
the whole system velocity space $\dot{y} \in \mathbb{R}^N$, the lower
priority objectives can be progressed only by acting within a system
velocity subspace reduced by one dimension.

When $\alpha_1 = 0$ (i.e., when the first inequality is achieved),
since $G_1^{\#} = 0$ (as granted by the regularization) and
consequently $\dot{y} = z_1$, the lower priority objectives can instead
be progressed by exploiting the whole system velocity space.


When instead $\alpha_1$ is within its transition zone
$0 < \alpha_1 < 1$ (i.e., when the first inequality is close to being
achieved), the two terms of the solution manifold become only
approximately orthogonal. The second term, when used for managing lower
priority tasks, can then possibly counteract the first, which is
currently acting in favor of the highest priority one. In any case,
this happens without any possibility of making the primary error
variable $\sigma_1$ leave its enlarged boundaries (i.e., the ones
inclusive of the transition zone): once $\sigma_1$ has entered such
larger boundaries, it never leaves them.

With the above considerations in mind, the remaining
priority-descending sequence of inequality objectives can be managed by
applying the same philosophy to each of them, within the mobility space
left free by the preceding ones, that is, as the result of the
following sequence of nested minimization problems:

$$ S_i \doteq \left\{ \arg\min_{\dot{y} \in S_{i-1}} \|\dot{\bar{\sigma}}_i - G_i\dot{y}\|^2 \right\}; \quad i = 1, 2, \ldots, k $$

with $G_i \doteq \alpha_i n_i^T H_i$, with $k$ indexing the lowest
priority inequality objective, and where the highest priority objective
has also been included for the sake of completeness (upon letting
$S_0 = \mathbb{R}^N$). In this way the procedure guarantees the
concurrent prioritized convergence (actually occurring as a sort of
“domino effect” propagating along the prioritized objective list)
toward the ultimate fulfillment of all inequality objectives, each one
within its enlarged bounds at worst and with no possibility of leaving
them once reached.

Further, simple algebra allows translating the above sequence of $k$
nested minimizations into the following algorithmic structure, with
initialization $\rho_0 = 0$; $Q_0 = I$ (see Casalino et al. 2012a,b for
more details):

$$ \hat{G}_i \doteq G_i Q_{i-1} $$
$$ T_i = (I - Q_{i-1}\hat{G}_i^{\#} G_i) $$
$$ \rho_i = T_i\rho_{i-1} + Q_{i-1}\hat{G}_i^{\#}\dot{\bar{\sigma}}_i $$
$$ Q_i = Q_{i-1}(I - \hat{G}_i^{\#}\hat{G}_i) $$

ending with the last $k$-th iteration with the solution manifold

$$ \dot{y} = \rho_k + Q_k z_k; \quad \forall z_k $$

where the residual arbitrariness space $Q_k z_k$ has then to be used
for managing the remaining equality objectives, as indicated hereafter.
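
The recursion can be sketched in a few lines of plain Python for scalar objectives, each described by one row vector and one reference rate. This is a simplified illustration, not the implementation used on the real system: the constant `damping` term is a stand-in assumed here in place of the bell-shaped regularizing function, and the three toy tasks and all names are hypothetical.

```python
def prioritized_velocity(tasks, n, damping=1e-12):
    """Run the prioritized recursion for scalar objectives.

    tasks: list of (G_row, ref_rate) pairs in descending priority, where
    G_row has n entries. Returns (rho, Q): the particular solution and the
    matrix of the residual freedom, so that y_dot = rho + Q z for any z.
    """
    rho = [0.0] * n
    Q = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for g, ref_rate in tasks:
        # Row vector seen through the freedom left by higher priorities
        g_hat = [sum(g[r] * Q[r][c] for r in range(n)) for c in range(n)]
        # Damped pseudo-inverse of a row vector (a column vector)
        denom = sum(x * x for x in g_hat) + damping
        g_hat_pinv = [x / denom for x in g_hat]
        # w = Q g_hat_pinv, shared by both updates
        w = [sum(Q[r][c] * g_hat_pinv[c] for c in range(n)) for r in range(n)]
        # rho update, written as a correction of the previous solution
        residual = ref_rate - sum(g[c] * rho[c] for c in range(n))
        rho = [rho[r] + w[r] * residual for r in range(n)]
        # Q update: Q (I - g_hat_pinv g_hat) = Q - w g_hat
        Q = [[Q[r][c] - w[r] * g_hat[c] for c in range(n)] for r in range(n)]
    return rho, Q

# Three toy objectives in a 3-dimensional velocity space (assumed values);
# the third one conflicts with the first and has the lowest priority.
tasks = [
    ([1.0, 0.0, 0.0], 1.0),
    ([1.0, 1.0, 0.0], 0.5),
    ([1.0, 0.0, 0.0], -3.0),
]
rho, Q = prioritized_velocity(tasks, 3)
```

In this toy case the first two objectives are met exactly (up to the damping), the conflicting third one has practically no effect, and the residual freedom is confined to the third velocity component.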

Managing the Lower Priority Equality 
Objectives and Subsystem Motion 
Priorities 

For managing the lower priority equality objectives when these require
the zeroing of their associated error $\sigma_i$ (as, for instance, for
the end-effector sample reaching task), the following sequence of
nested minimization problems has instead to be considered (with
initialization $\rho_k$; $Q_k$):

$$ S_i \doteq \left\{ \arg\min_{\dot{y} \in S_{i-1}} \|\dot{\bar{s}}_i - H_i\dot{y}\|^2 \right\}; \quad i = (k+1), \ldots, m $$

with $m$ indexing the last priority equality objective, and where the
whole reference error vector rates $\dot{\bar{s}}_i$ and associated
whole error vectors $s_i$ now have to be used. The reason is that, for
$\alpha_i \equiv 1$ (as is the case for any equality objective), the
otherwise needed evaluation of the unit vectors $n_i$ (which become ill
defined as the relevant error $\sigma_i$ approaches zero) would most
probably provoke unwanted chattering phenomena around $\sigma_i = 0$.
The above formulation avoids this risk (since $\dot{\bar{s}}_i$ and
$s_i$ can be evaluated without requiring $n_i$), even if at the cost of
requiring, for each equality objective, three degrees of mobility
instead of the single one required by each inequality objective.
However, note that the algorithmic translation of the above procedure
remains structurally the same as the one for the inequality objectives
(obviously with $\dot{\bar{s}}_i$ and $H_i$ in place of
$\dot{\bar{\sigma}}_i$ and $G_i$, and with initialization $\rho_k$,
$Q_k$), thus ending in correspondence of the $m$-th last equality
objective with the solution manifold

$$ \dot{y} = \rho_m + Q_m z_m; \quad \forall z_m $$

where the possibly still existing residual arbitrariness space
$Q_m z_m$ can be further used for assigning motion priorities between
the arm and the vehicle, for instance, via the following additional
least-priority ending task:

$$ \dot{y} = \arg\min_{\dot{y} \in S_m} \|v\|^2 \doteq \rho_{m+1} $$

whose solution $\rho_{m+1}$ (with no more arbitrariness required)
finally assures (while respecting all previous priorities) a motion
minimality of the vehicle, thus implicitly assigning to the arm a
greater mobility, which in turn allows the exploitation of its
generally higher motion precision, especially during the ultimate
convergence toward the final grasping.


Implementations 

The recently realized TRIDENT system of Fig. 1, embedding the
above-introduced task-priority-based control architecture, operated at
sea in 2012 (Port Soller Harbor, Mallorca, Spain). A detailed
presentation of the preliminary simulations, followed by pool
experiments and finally by field trials executed in a true underwater
sea environment, can be found in Simetti et al. (2013). The related
EU-funded TRIDENT project (Sanz et al. 2012) is the first one where
agile manipulation could be effectively achieved by an underwater
floating manipulator, not only as a consequence of the comparable
masses and inertia exhibited by the vehicle and arm, but mainly due to
the adopted unified task-priority-based control framework. Capabilities
for autonomous underwater floating manipulation had, however, already
been achieved for the first time in 2009 at the University of Hawaii,
within the SAUVIM project (Marani et al. 2009, 2014; Yuh et al. 1998),
even if without effective agility (the related system was in fact a 6-t
vehicle endowed with an arm of less than 35 kg).

Future Directions

The presented task-priority-based KCL structure is invariant with
respect to the addition, deletion, and substitution (even on the fly)
of the various objectives, as well as to changes in their priority
ordering, thus constituting an invariant core potentially capable of
supporting intervention tasks beyond sample collection alone. On this
basis, more complex systems and operational cases, such as, for
instance, multi-arm systems and/or even cooperating ones, can be
expected to be developed along the lines established by the roadmap of
Fig. 3 (with case 0 the current development state).

The future availability of agile floating single-arm or multi-arm
manipulators, also implementing cooperative interventions by virtue of
a unified control and coordination structure (purposely extended to
this aim), might in fact pave the way toward the realization of
underwater robotized heavy-work sites, where different intervention
agents might individually or cooperatively perform different object
manipulation and transportation activities, also including assembly
ones, thus going far beyond the case of sample collection considered
here. Such scenarios deserve the attention not only of the science
community, when it needs to execute underwater works (excavation,
coring, instrument handling, etc., other than sample collection) at
increasing depths, but obviously also of the offshore industry.

Fig. 3 A sketch of the foreseen roadmap for future development of marine intervention robotics

Moreover, by exploiting current and future developments in underwater
exploration and survey missions performed by normal (i.e.,
nonmanipulative) AUVs, a possible work scenario might also include the
presence of the latter, accomplishing different service activities in
support of the intervention ones: for instance, relays with the
surface; informative activities (for instance, the delivery of the area
model built during a previous survey phase or the delivery of the
intervention mission, both downloaded at the surface and then
transferred to the intervention agents upon docking); or even, when
hovering over the work area (for instance, close to a well-recognized
feature), acting as a local reference system for the self-localization
of the operative agents via twin USBL devices.


Abstract

This entry provides a broad overview of how air traffic for commercial
air travel is scheduled and managed throughout the world. The major
causes of delays and congestion are described, which include tight
scheduling, safety restrictions, infrastructure limitations, and major
disturbances. The technical and financial challenges to air traffic
management are outlined, along with some of the promising developments
for future modernization.

Introduction: How Does Air Traffic
Management Work?

This entry focuses on air traffic management for commercial air travel,
the passenger- and cargo-carrying operations with which most of us are
familiar. This is the segment of air travel with a pressing need for
modernization to address current and future congestion. Passenger and
cargo traffic is projected to double over the next 20 years, with
growth rates of 3-4% annually in developed markets such as the USA and
Europe and growth rates of 6% and more in developing markets such as
Asia Pacific and the Middle East.
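
The link between annual growth rates and doubling can be checked with compound-growth arithmetic. The Python sketch below assumes constant year-over-year growth, which is a simplification; the function names are illustrative only.

```python
import math

def doubling_time(annual_rate):
    """Years needed to double at a constant compound annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)

def rate_for_doubling(years):
    """Constant annual growth rate that doubles traffic in the given years."""
    return 2.0 ** (1.0 / years) - 1.0

years_at_3 = doubling_time(0.03)    # developed markets, lower bound
years_at_4 = doubling_time(0.04)    # developed markets, upper bound
years_at_6 = doubling_time(0.06)    # developing markets
implied_rate = rate_for_doubling(20)
```

Under this assumption, growth of 3-4% per year doubles traffic in roughly 18 to 23 years, consistent with the 20-year projection (which corresponds to about 3.5% per year), while 6% growth doubles traffic in about 12 years.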

In most of the world, air travel is a distributed, market-driven
system. Airlines schedule flights based on when people want to fly and
when it is optimal to transport cargo. Most passenger flights are
scheduled during the day; most package carrier flights are overnight.
Some airports limit how many flights can be scheduled by having a slot
system; others do not. This decentralized schedule of flights to and
from airports around the world is controlled by a network of air
navigation service providers (ANSPs) staffed with air traffic
controllers, who ensure that aircraft are separated safely.

The International Civil Aviation Organization (ICAO) has divided the
world’s airspace into flight information regions. Each region has a
country that controls the airspace, and the ANSP for each country can
be a government department, state-owned company, or private
organization. For example, in the United States, the ANSP is the
Federal Aviation Administration (FAA), which is a government
department. The Canadian ANSP is NAV CANADA, which is a private
company.

Each country is different in terms of the services provided by the
ANSP, how the ANSP operates, and the tools available to the
controllers. In the USA and Europe, the airspace is divided into
sectors and areas around airports. An air traffic control center is
responsible for traffic flow within its sector, and rules and
procedures are in place to cover transfer of control between sectors.
The areas around busy airports are usually handled by a terminal radar
approach control. The air traffic control tower personnel handle
departing aircraft, landing aircraft, and the movement of aircraft on
the airport surface.

Air traffic controllers in developed air travel 
markets like the USA and Europe have tools that 
help them with the business of controlling and 
separating aircraft. Tower controllers operating at 
airports can see aircraft directly through windows 
or on computer screens through surveillance 
technology such as radar and Automatic 
Dependent Surveillance-Broadcast (ADS-B). 
Tower controllers may have additional tools 
to help detect and prevent potential collisions 
on the airport surface. En route controllers can 
see aircraft on computer screens and may have 
additional tools to help detect potential losses 
of separation between aircraft. Controllers can 
communicate with aircraft via radio and some 
have datalink communication available such 
as Controller-Pilot Datalink Communications 
(CPDLC). 

Flight crews have tools to help with navigating and flying the
airplane. Autopilots and autothrottles relieve the pilot of having to
continuously control the aircraft; instead the pilot can specify the
speed, altitude, and heading, and the autopilot and autothrottle will
maintain those commands. Flight management systems (FMS) assist in
flight planning in addition to providing lateral and vertical control
of the airplane. Many aircraft have special safety systems such as the
Traffic Alert and Collision Avoidance System (TCAS), which alerts the
flight crew to potential collisions with other airborne aircraft, and
the Terrain Awareness and Warning System (TAWS), which alerts the
flight crew to potential flight into terrain.


Causes of Congestion and Delays 

Congestion and delays have multiple causes. These include tight
scheduling; safety limitations on how quickly aircraft can take off and
land and how closely they can fly; infrastructure limitations such as
the number of runways at an airport and the airway structure; and
disturbances such as weather and unscheduled maintenance.

Tight Scheduling 

Tight scheduling is a major contributor to congestion and delays. The
hub-and-spoke system that many major airlines operate to minimize
connection times means that aircraft arrive and depart in multiple
banks during the day. During the arrival and departure banks, airports
are very busy. As mentioned previously, passengers have preferred times
to travel, which also increases demand at certain times. At airports
that do not limit flight schedules by using slot scheduling, the number
of flights scheduled can actually exceed the departure and arrival
capacity of the airport even in best-case conditions. One of the
reasons that airlines are asked to report on-time statistics is to make
the published airline schedules more reflective of the average time
from departure to arrival, not the best-case time.
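
The way scheduled banks that exceed capacity turn into delay can be illustrated with a simple deterministic queue. The Python sketch below uses made-up hourly numbers and a fixed capacity purely for illustration; it is not a model of any particular airport, and the function name is hypothetical.

```python
def simulate_backlog(scheduled, capacity):
    """Track the unserved operations left at the end of each period.

    scheduled: operations (arrivals or departures) scheduled per period
    capacity:  maximum operations the airport can handle per period
    """
    backlog = 0
    history = []
    for demand in scheduled:
        waiting = backlog + demand          # carried over plus newly scheduled
        served = min(waiting, capacity)     # the airport cannot exceed capacity
        backlog = waiting - served
        history.append(backlog)
    return history

# Hypothetical bank structure: demand peaks above a capacity of 30 per hour
scheduled_per_hour = [20, 35, 40, 25, 15, 10]
backlog_per_hour = simulate_backlog(scheduled_per_hour, capacity=30)
```

In this toy schedule the backlog builds during the two peak hours and is only cleared two hours later, even though the later hours are scheduled below capacity, which is why delays outlast the peak that caused them.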

Aircraft themselves are also tightly scheduled. Aircraft are an
expensive capital asset. Since customers are very sensitive to ticket
prices, airlines need to have their aircraft flying as many hours as
possible per day. To control costs, airlines also limit the number of
spare aircraft and flight crews available to fill in when operations
are disrupted.

Safety Restrictions

Safety restrictions contribute to congestion. 
There is a limit to how quickly aircraft can 
take off from and land on a runway. Sometimes 
runways are used for both departing and arriving 
aircraft; at other times a runway may be used 
for departures only or arrivals only. Either way, 
the rule that controllers follow for safety is that 
only one aircraft can occupy the runway at one 
time. Thus, a landing aircraft must turn off of 
the runway before another aircraft can take off 
or land. This limitation and other limitations like 
the ability of controllers to manage the arrival 
and departure aircraft propagate backwards from 
the airport. Aircraft need to be spaced in an 
orderly flow and separated no closer than what 
can be supported by airport arrival rates. The 
backward propagation can go all the way to the 
departure airports and cause aircraft to be held on 
the ground as a means to regulate the traffic flow 
into a congested airport or through a congested 
air traffic sector. 

There is a limit on how close aircraft can fly. 
Aircraft produce a wake that can be dangerous 
for other aircraft that are following too closely 
behind. Pilots are aware of this limitation and 
space safely when doing visual separation. Rules 
that controllers apply for separation take into 
account wake turbulence limitations, surveillance 
limitations, and limitations on how well aircraft 
can navigate and conform to the required speed, 
altitude, and heading. 

The human is a safety limitation. Controllers and pilots are human.
Being human, they have excellent reasoning capability. However, they
are limited as to the number of tasks they can perform and are subject
to fatigue. The rules and procedures in place to manage and fly
aircraft take into account human limitations.


Infrastructure Limitations 

Infrastructure limitations contribute to congestion 
and delays. Airport capacity is one infrastructure 
limitation. The number of runways combined 
with the available aircraft gates and capacity to 
process passengers through the terminal limit the 
airport capacity. 


The airspace itself is a limitation. The airspace 
where controllers provide separation services is 
divided into an orderly structure of airways. 
The airways are like one-way, one-lane roads in 
the sky. They are stacked at different altitudes, 
which are usually separated by either 1,000 ft. 
or 2,000 ft. The width of the airways depends 
on how well aircraft can navigate. In the US 
domestic airspace where there are regular 
navigation aids and direct surveillance of aircraft, 
the airways have a width of plus or minus 4 NM.
Over the ocean, airways may need to be separated
laterally by as much as 120 NM since there are
fewer navigation aids and aircraft are not under 
direct control but separated procedurally. The 
limited number of airways that the airspace can 
support limits available capacity. 

The airways themselves have capacity limitations
just as traditional roads do. There are
special challenges for airways since aircraft need 
a minimum separation distance, aircraft cannot 
slow down to a stop, and airways do not allow 
passing. So, although it may look like there is a 
lot of space in which aircraft can fly, there are 
actually a limited number of routes between a city 
pair or over oceanic airspace. 

The radio that is used for pilots and controllers 
to communicate is another infrastructure limitation.
At busy airports, there is significant radio
congestion and pilots may need to wait to get an 
instruction or response from a controller. 

Disturbances 

Weather is a significant disturbance in air traffic 
management. Weather acts negatively in many 
ways. Wet or icy pavement affects the braking 
ability of aircraft so they cannot vacate a runway 
as quickly as in dry conditions. Low cloud ceilings
mean that all approaches must be instrument
approaches rather than visual approaches, which 
also reduces runway arrival rates. Snow must be 
cleared from runways, closing them for some 
period of time. High winds can mean that certain 
approaches cannot be used because they are not 
safe. In extreme weather, an airport may need to 
close. Weather can block certain airways from 
use, requiring rerouting of aircraft. Rerouting 
increases demand on nearby airways, which may
or may not have the required additional capacity, 
so the rerouting cascades on both sides of the 
weather. 


Why Is Air Traffic Management 
Modernization So Hard? 

Air traffic management modernization is difficult 
for financial and technical reasons. The air traffic 
management system operates around the clock. It 
cannot be taken down for a significant period of 
time without a major effect on commerce and the 
economy. 

Financing is a significant challenge for air
traffic management modernization. Governments
worldwide are facing budgetary challenges and
improvements to air travel are one of many competing
financial interests. Local airport authorities
have similar challenges in raising money for
airport improvements. Airlines have competitive
limitations on how much ticket prices can rise
and therefore need to see a payback on investment
in aircraft upgrades that can be as short as
2 years.

Another financial challenge is that the entity
that needs to pay for the majority of an improvement
may not be the entity that gets the majority
of the benefit, at least near term. One example of
this is the installation of ADS-B transmitters on
aircraft. Buying and installing an ADS-B transmitter
costs the aircraft owner money. It benefits
the ANSPs, who can receive the transmissions
and have them augment or replace expensive
radar surveillance, but only if a large number of
aircraft are equipped. Eventually the ANSP benefit
will be seen by the aircraft operator through
lower operating costs but it takes time. This is
one reason that ADS-B transmitter equipage was
mandated in the USA, Europe, and other parts of
the world rather than letting market forces drive
equipage.

All entities, whether governmental or private,
need some sort of business case to justify investment,
where it can be shown that the benefit of the
improvement outweighs the cost. The same system
complexity that makes congestion and delays
in one region propagate throughout the system
makes it a challenge to accurately estimate benefits.
It is complicated to understand if an improvement
in one part of the system will really help
or just shift where the congestion points are.
Decisions need to be made on what improvements
are the best to invest in. For government entities,
societal benefits can be as important as financial
payback, and someone needs to decide whose
interests are more important. For example, the
people living around an airport might want longer
arrival paths at night to minimize noise while air
travelers and the airline want aircraft to fly the
most direct route into an airport. A combination
of subject matter expertise and simulation can
provide a starting point to estimate benefit, but
often only operational deployment will provide
realistic estimates.

It is a long process to develop new technologies
and operational procedures even when the
benefit is clear and financing is available. The
typical development steps include describing the
operational concept; developing new controller
procedures, pilot procedures, or phraseology if
needed; performing a safety and performance
analysis to determine high-level requirements;
performing simulations that at some point may
include controllers or pilots; designing and building
equipment that can include software, hardware,
or both; and field testing or flight testing the
new equipment. Typically, new ground tools are
field tested in a shadow mode, where controllers
can use the tool in a mock situation driven by
real data before the tool is made fully operational.
Flight testing is performed on aircraft that
are flying with experimental certificates so that
equipment can be tested and demonstrated prior
to formal certification.

Avionics need to be certified before operational
use to meet the rules established to ensure
that a high safety standard is applied to air travel.
To support certification, standards are developed.
Frequently the standards are developed through
international cooperation and through consensus
decision-making that includes many different
organizations such as ANSPs, airlines, aircraft
manufacturers, avionics suppliers, pilot associations,
controller associations, and more. This is
a slow process but an important one, since it
reduces development risk for avionics suppliers
and helps ensure that equipment can be used
worldwide.

Once new avionics or ground tools are available,
it takes time for them to be deployed.
For example, aircraft fleets are upgraded as aircraft
come in for major maintenance rather than
pulling them out of scheduled service. Flight
crews need to be trained on new equipment before
it can be used, and training takes time. Ground
tools are typically deployed site by site, and the
controllers also require training on new equipment
and new procedures.

Promise for the Future 

Despite the challenges and complexity of air
traffic management, there is a path forward for
significant improvement in both developed and
developing air travel markets. Developing air
travel markets in countries like China and India
can improve air traffic management using procedures,
tools, and technology that are already
used in developed markets such as the USA and
Europe. Emerging markets like China are willing
to make significant investments in improving
air traffic management by building new airports,
expanding existing airports, changing controller
procedures, and investing in controller tools. In
developed markets, new procedures, tools, and
technologies will need to be implemented. In
some regions, mandates and financial incentives
may play a part in enabling infrastructure and
equipment changes that are not driven by the
marketplace.

The USA and Europe are both supporting
significant research, development, and implementation
programs to support air traffic
management modernization. In the USA, the
FAA has a program known as NextGen, the
Next Generation Air Transportation System. In
Europe, the European Commission oversees
a program known as SESAR, the Single
European Sky Air Traffic Management Research,
which is a joint effort between the European
Union, EUROCONTROL, and industry partners.
Both programs have substantial support and
financing. Each program has organized its efforts
differently but there are many similarities in the
operational objectives and improvements being
developed.

Airport capacity problems are being addressed
in multiple ways. Controllers are being provided
with advanced surface movement guidance and
control systems that combine radar surveillance,
ADS-B surveillance, and sensors installed at the
airport with value-added tools to assist with traffic
control and alert controllers to potential collisions.
Datalink communications between controllers
and pilots will reduce radio-frequency
congestion, reduce communication errors, and
enable more complex communication. The USA
and Europe have plans to develop a modernized
datalink communication infrastructure between
controllers and pilots that would include information
like departure clearances and the taxiway
route clearance. Aircraft on arrival to an airport
will be controlled more precisely by equipping
aircraft with capabilities such as the ability to fly
to a required time of arrival and the ability to
space with respect to another aircraft.

Domestic airspace congestion is being addressed
in Europe by moving towards a single
European sky where the ANSPs for the individual
nations coordinate activities and airspace is
structured not as 27 national regions but operated
as larger blocks. Similar efforts are underway
in the USA to improve the cooperation and coordination
between the individual airspace sectors.
In some countries, large blocks of airspace are
reserved for special use by the military. In those
countries, efforts are in place to have dynamic
special use airspace that is reserved on an
as-needed basis but otherwise available for civil
use.

Oceanic airspace congestion is being
addressed by leveraging the improved navigation
performance of aircraft. Some route structures
are available only to aircraft that can fly to
a required navigation performance. These route
structures have less required lateral separation,
and thus more routes can be flown in the same
airspace. Pilot tools that leverage ADS-B are
allowing aircraft to make flight level changes
with reduced separation and in the future
are expected to allow pilots to do additional
maneuvering that is restricted today, such as
passing slower aircraft.

Weather cannot be controlled but efforts are
underway to do better prediction and provide
more accurate and timely information to pilots,
controllers, and aircraft dispatchers at airlines.
On-board radars that pilots use to divert around
weather are adding more sophisticated processing
algorithms to better differentiate hazardous
weather. Future flight management systems will
have the capability to include additional weather
information. Datalinks between the air and the
ground or between aircraft may be updated to
include information from the on-board radar systems,
allowing aircraft to act as local weather
sensors. Improved weather information for pilots,
controllers, and dispatchers improves flight planning
and minimizes the necessary size of deviations
around hazardous weather while retaining
safety.

Weather is also addressed by providing aircraft
and airports with equipment to improve
airport access in reduced visibility. Ground-based
augmentation systems installed at airports provide
aircraft with the capability to do precision-based
navigation for approaches to airports with
low weather ceilings. Other technologies like
enhanced vision and synthetic vision, which can
be part of a combined vision system, provide the
capability to land in poor visibility.

Summary 

Air traffic management is a complex and interesting
problem. The expected increase in air travel
worldwide is driving a need for improvements
to the existing system so that more passengers
can be handled while at the same time reducing
congestion and delays. Significant research and
development efforts are underway worldwide to
develop safe and effective solutions that include
controller tools, pilot tools, aircraft avionics,
infrastructure improvements, and new procedures.
Despite the technical and financial challenges,
many promising technologies and new procedures
will be implemented in the near, mid-,
and far term to support air traffic management
modernization worldwide.

Abstract

Aircraft flight control is concerned with using
the control surfaces to change aerodynamic moments,
to change attitude angles of the aircraft
relative to the air flow, and ultimately change
the aerodynamic forces to allow the aircraft to
achieve the desired maneuver or steady condition.
Control laws create the commanded control
surface positions based on pilot and sensor
inputs. Traditional control laws employ proportional
and integral compensation with scheduled
gains, limiting elements, and cross feeds between
coupled feedback loops. Dynamic inversion is an
approach to develop control laws that systematically
addresses the equivalent of gain schedules
and the multivariable cross feeds, can incorporate
constrained optimization for the limiting elements,
and maintains the use of proportional and
integral compensation to achieve the benefits of
feedback.

Although the following discussion is applicable
to a wide range of flight vehicles including
gliders, unmanned aerial vehicles, lifting bodies,
missiles, rockets, helicopters, and satellites, the
focus of this entry will be on fixed wing commercial
and military aircraft with human pilots.


Introduction 


Flight 


Flying is made possible by flight control and this
applies to birds and the Wright Flyer, as well as
modern flight vehicles. In addition to balancing
lift and weight forces, successful flight also requires
a balance of moments or torques about the
mass center. Control is a means to adjust these
moments to stay in equilibrium and to perform
maneuvers. While birds use their feathers and
the Wright Flyer warped its wings, modern flight
vehicles utilize hinged control surfaces to adjust
the moments. The control action can be open
or closed loop, where closed loop refers to a
feedback loop consisting of sensors, computer,
and actuation. A direct connection between the
cockpit pilot controls and the control surfaces
without a feedback loop is open loop control. The
computer in the feedback loop implements a control
law (computer program). The development of
the control law is discussed in this entry.


Aircraft are maneuvered by changing the forces
acting on the mass center, e.g., a steady level
turn requires a steady force towards the direction
of turn. The force is the aerodynamic lift force
($L$) and it is banked or rotated into the direction
of the turn. The direction can be adjusted with
the bank angle ($\mu$) and for a given airspeed ($V$)
and air density ($\rho$), the magnitude of the force
can be adjusted with the angle-of-attack ($\alpha$). This
is called bank-to-turn. Aircraft, e.g., missiles,
can also skid-to-turn where the aerodynamic side
force ($Y$) is adjusted with the sideslip angle ($\beta$)
but this entry will focus on bank-to-turn.

Equations of motion (Enns et al. 1996; Stevens
and Lewis 1992) can be used to relate the time
rates of change of $\mu$, $\alpha$, and $\beta$ to roll ($p$), pitch
($q$), and yaw ($r$) rate. See Fig. 1. Approximate
relations (for near steady level flight with no
wind) are

$$\dot{\mu} = p$$

$$\dot{\alpha} = q + \frac{L - mg}{mV}$$

where $m$ is the aircraft mass, and $g$ is the gravitational
acceleration. In straight and level flight
conditions $L = mg$ and $Y = 0$ so we think of
these equations as kinematic equations where the
rates of change of the angles $\mu$, $\alpha$, and $\beta$ are the
angular velocities $p$, $q$, and $r$.

Three moments called roll, pitch, and yaw for
angular motion to move the right wing up or
down, nose up or down, and nose right or left,
respectively create the angular accelerations to
change $p$, $q$, and $r$, respectively. The equations
are Newton's 2nd law for rotational motion. The
moments (about the mass center) are dominated
by aerodynamic contributions and depend on $\rho$,
$V$, $\alpha$, $\beta$, $p$, $q$, $r$, and the control surfaces. The
control surfaces are aileron ($\delta_a$), elevator ($\delta_e$),
and rudder ($\delta_r$) and are arranged to contribute primarily
roll, pitch, and yaw moments respectively.

The control surfaces ($\delta_a$, $\delta_e$, $\delta_r$) contribute
to angular accelerations which are integrated to
obtain the angular rates ($p$, $q$, $r$). The integral of
angular rates contributes to the attitude angles
($\mu$, $\alpha$, $\beta$). The direction and magnitude of aerodynamic
forces can be adjusted with the attitude
angles. The forces create the maneuvers or steady
conditions for operation of the aircraft.

Pure Roll Axis Example

Consider just the roll motion. The differential
equation (Newton's 2nd law for the roll
degree-of-freedom) for this dynamical system is

$$\dot{p} = L_p p + L_{\delta_a} \delta_a$$

where $L_p$ is the stability derivative and $L_{\delta_a}$ is the
control derivative, both of which can be regarded
as constants for a given airspeed and air density.

Pitch Axis or Short Period Example

Consider just the pitch and heave motion. The
differential equations (Newton's 2nd law for the
pitch and heave degrees-of-freedom) for this dynamical
system are

$$\dot{q} = M_\alpha \alpha + M_q q + M_{\delta_e} \delta_e$$

$$\dot{\alpha} = Z_\alpha \alpha + q + Z_{\delta_e} \delta_e$$

where $M_\alpha$, $M_q$, $Z_\alpha$ are stability derivatives, and
$M_{\delta_e}$ is the control derivative, all of which can be
regarded as constants for a given airspeed and air
density.

Although $Z_\alpha < 0$ and $M_q < 0$ are stabilizing,
$M_\alpha > 0$ makes the short period motion inherently
unstable. In fact, the short period motion of the
Wright Flyer was unstable. Some modern aircraft
are also unstable.

Lateral-Directional Axes Example

Consider just the roll, yaw, and side motion with
four state variables ($\mu$, $p$, $r$, $\beta$) and two inputs
($\delta_a$, $\delta_r$). We will use the standard state space
equations with matrices $A$, $B$, $C$ for this example.

The short period equations apply for yaw and
side motion (or Dutch roll motion) with appropriate
replacements, e.g., $q$ with $r$, $\alpha$ with $-\beta$,
$M$ with $N$. We add the term $V^{-1} g \mu$ to the $\dot{\beta}$
equation. We include the kinematic equation $\dot{\mu} = p$
and add the term $L_\beta \beta$ to the $\dot{p}$ equation. The
Dutch roll, like the short period, can be unstable
if $N_\beta < 0$, e.g., airplanes without a vertical tail.

There is coupling between the motions associated
with stability derivatives $L_r$, $L_\beta$, $N_p$ and
control derivatives $L_{\delta_r}$ and $N_{\delta_a}$. This is a fourth
order multivariable coupled system where $\delta_a$, $\delta_r$
are the inputs and we can consider ($p$, $r$) or
($\mu$, $\beta$) as the outputs.


Control


The control objectives are to provide stability,
disturbance rejection, desensitization, and
satisfactory steady state and transient response
to commands. Specifications and guidelines for
these objectives are assessed quantitatively with
frequency, time, and covariance analyses and
simulations.

Aircraft Flight Control, Fig. 2 Closed loop feedback system and desired dynamics


Integrator with P + I Control

The system to be controlled is the integrator for
$y$ in Fig. 2 and the output of the integrator ($y$)
is the controlled variable. The proportional gain
($K_b > 0$) is a frequency and sets the bandwidth
or crossover frequency of the feedback loop. The
value of $K_b$ will be between 1 and 10 rad/s in
most aircraft applications. Integral action can be
included with the gain, $f_i > 0$ with a value
between 0 and 1.5 in most applications. The value
of the command gain, $f_c > 0$, is set to achieve a
desired closed loop response from the command
$y_c$ to the output $y$. Values of $f_i = 0.25$ and $f_c = 0.5$
are typical. In realistic applications, there is a
limit that applies at the input to the integrator. In
these cases, we are obligated to include an
anti-integral windup gain, $f_a > 0$ (typical value of 2)
to prevent continued integration beyond the limit.
The input to the limiter ($\dot{y}_{des}$) is called the desired
rate of change of the controlled variable (Enns
et al. 1996).

The closed loop transfer function is

$$\frac{y}{y_c} = \frac{K_b (f_c s + f_i K_b)}{s^2 + K_b s + f_i K_b^2}$$

and the pilot produces the commands ($y_c$) with
cockpit inceptors, e.g., sticks, pedals.

The control system robustness can be adjusted
with the choices made for $y$, $K_b$, $f_i$, and $f_c$.
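
As a rough numerical check of the closed loop transfer function, the following Python sketch evaluates its poles, zero, and steady state gain for the typical values $f_i = 0.25$ and $f_c = 0.5$. The function names and the choice $K_b = 4$ rad/s are illustrative assumptions, not taken from the source.

```python
import cmath

def closed_loop_poles(Kb, fi):
    """Roots of s^2 + Kb*s + fi*Kb^2, the denominator of y/y_c."""
    disc = cmath.sqrt(Kb * Kb - 4.0 * fi * Kb * Kb)
    return ((-Kb + disc) / 2.0, (-Kb - disc) / 2.0)

def closed_loop_zero(Kb, fi, fc):
    """Root of the numerator Kb*(fc*s + fi*Kb)."""
    return -fi * Kb / fc

def dc_gain(Kb, fi):
    """y/y_c evaluated at s = 0 (requires fi > 0)."""
    return (Kb * fi * Kb) / (fi * Kb * Kb)

# Kb = 4 rad/s is an assumed value inside the quoted 1 to 10 rad/s range.
Kb, fi, fc = 4.0, 0.25, 0.5
poles = closed_loop_poles(Kb, fi)
zero = closed_loop_zero(Kb, fi, fc)
print("poles:", poles)
print("zero:", zero)
print("dc gain:", dc_gain(Kb, fi))
```

For these values both poles sit at $-K_b/2$ and the zero sits at the same location, so one pole is cancelled and the command response reduces to a first order lag with unity steady state gain.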

These desired dynamics are utilized in all of
the examples to follow. In the following, we use
dynamic inversion (Enns et al. 1996; Wacker
et al. 2001) to algebraically manipulate the equations
of motion into the equivalent of the integrator
for $y$ in Fig. 2.


Pure Roll Motion Example 

With algebraic manipulations called dynamic inversion
we can use the pure integrator results
in the previous section for the pure roll motion
example. For the controlled variable $y = p$,
given a measurement of the state $x = p$ and
values for $L_p$ and $L_{\delta_a}$, we simply solve for the
input ($u = \delta_a$) that gives the desired rate of
change of the output $\dot{y}_{des} = \dot{p}_{des}$. The solution
is

$$\delta_a = L_{\delta_a}^{-1} (\dot{p}_{des} - L_p p)$$

Since $L_{\delta_a}$ and $L_p$ vary with air density and
airspeed, we are motivated to schedule these
portions of the control law accordingly.
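
The roll inversion can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the numerical values of $L_p$, $L_{\delta_a}$, and $K_b$ are hypothetical, the model used by the control law is assumed to match the plant exactly, and the desired dynamics are reduced to proportional action only (no integral, command gain, or limiter).

```python
def roll_inversion(p_dot_des, p, L_p, L_da):
    """Aileron command delta_a = (p_dot_des - L_p * p) / L_da."""
    return (p_dot_des - L_p * p) / L_da

def simulate_roll(p_cmd, L_p=-2.0, L_da=10.0, Kb=4.0, dt=0.001, t_end=3.0):
    """Euler simulation of p_dot = L_p*p + L_da*delta_a under dynamic inversion.

    L_p, L_da and Kb are hypothetical values chosen for illustration.
    """
    p = 0.0
    for _ in range(int(round(t_end / dt))):
        p_dot_des = Kb * (p_cmd - p)              # proportional-only desired dynamics
        delta_a = roll_inversion(p_dot_des, p, L_p, L_da)
        p += dt * (L_p * p + L_da * delta_a)      # plant update
    return p

delta_a_example = roll_inversion(p_dot_des=1.0, p=0.2, L_p=-2.0, L_da=10.0)
p_final = simulate_roll(p_cmd=0.5)
print(delta_a_example, p_final)
```

Because the inversion cancels the $L_p p$ term, the simulated roll rate behaves like the pure integrator driven by $\dot{p}_{des}$ and settles at the commanded value.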

Short Period Example 

Similar algebraic manipulations use the general
state space notation

$$\dot{x} = Ax + Bu$$

$$y = Cx$$

We want to solve for $u$ to achieve a desired rate
of change of $y$, so we start with

$$\dot{y} = CAx + CBu$$

If we can invert $CB$, i.e., it is not zero, for the
short period case, we solve for $u$ with

$$u = (CB)^{-1} (\dot{y}_{des} - CAx)$$

Implementation requires a measurement of the
state, $x$, and models for the matrices $CA$ and $CB$.

The closed loop poles include the open loop
zeros of the transfer function (zero dynamics)
in addition to the roots of the desired dynamics
characteristic equation. Closed loop stability requires
stable zero dynamics. The zero dynamics
have an impact on control system robustness and
can influence the precise choice of $y$.
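
The state space inversion step can be illustrated with a short pure Python sketch for one input and one output. The helper names and all numerical values are assumptions made for illustration; the state ordering $x = (\alpha, q)$ and the choice $y = q$ are one possible arrangement of the short period model.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def invert_single_output(A, B, C, x, y_dot_des):
    """Return u = (CB)^-1 (y_dot_des - C A x) for one input and one output."""
    n = len(x)
    CA = [dot(C, [A[i][j] for i in range(n)]) for j in range(n)]  # row vector C*A
    CB = dot(C, B)
    if CB == 0.0:
        raise ValueError("CB is zero, so y_dot cannot be set directly with u")
    return (y_dot_des - dot(CA, x)) / CB

# Hypothetical short period numbers: state x = [alpha, q], input u = delta_e.
Z_a, M_a, M_q, Z_de, M_de = -1.0, 2.0, -0.5, -0.1, -5.0
A = [[Z_a, 1.0], [M_a, M_q]]
B = [Z_de, M_de]
C = [0.0, 1.0]                      # controlled variable y = q
x = [0.05, 0.1]
u = invert_single_output(A, B, C, x, y_dot_des=0.3)

# Check: substitute u back into y_dot = C (A x + B u).
x_dot = [dot(A[i], x) + B[i] * u for i in range(2)]
y_dot = dot(C, x_dot)
print(u, y_dot)
```

Substituting the computed input back into the state equations reproduces the requested rate of change, which is all the inversion step is meant to guarantee; stability still depends on the zero dynamics discussed above.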

When $y = q$, the control law includes the
following dynamic inversion equation

$$\delta_e = M_{\delta_e}^{-1} (\dot{q}_{des} - M_q q - M_\alpha \alpha)$$

and the open loop zero is $Z_\alpha - Z_{\delta_e} M_{\delta_e}^{-1} M_\alpha$,
which in almost every case of interest is a negative
number.

Note that there are no restrictions on the open 
loop poles. This control law is effective and 
practical in stabilization of an aircraft with an 
open loop unstable short period mode. 
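
A small simulation can make this concrete. The sketch below uses hypothetical stability and control derivatives with $M_\alpha > 0$, so the open loop short period has a pole in the right half plane, and applies the pitch rate inversion with proportional-only desired dynamics. The numbers, function names, and the simplified desired dynamics are assumptions for illustration only.

```python
import math

# Hypothetical derivatives; M_a > 0 gives an unstable open loop short period.
Z_a, M_a, M_q, Z_de, M_de = -1.0, 2.0, -0.5, -0.1, -5.0

def open_loop_poles():
    """Roots of s^2 - (Z_a + M_q) s + (Z_a*M_q - M_a); real for these numbers."""
    b = -(Z_a + M_q)
    c = Z_a * M_q - M_a
    disc = math.sqrt(b * b - 4.0 * c)
    return ((-b + disc) / 2.0, (-b - disc) / 2.0)

def open_loop_zero():
    """Zero of the elevator to pitch rate transfer function."""
    return Z_a - Z_de / M_de * M_a

def simulate(q_cmd=0.1, Kb=4.0, dt=0.001, t_end=10.0):
    """Euler simulation of the short period with pitch rate dynamic inversion."""
    alpha, q = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        q_dot_des = Kb * (q_cmd - q)                          # proportional only
        delta_e = (q_dot_des - M_q * q - M_a * alpha) / M_de  # inversion
        alpha_dot = Z_a * alpha + q + Z_de * delta_e
        q_dot = M_a * alpha + M_q * q + M_de * delta_e
        alpha += dt * alpha_dot
        q += dt * q_dot
    return alpha, q

poles = open_loop_poles()
zero = open_loop_zero()
alpha_final, q_final = simulate()
print(poles, zero, alpha_final, q_final)
```

With these numbers one open loop pole is positive while the zero is negative, and the closed loop pitch rate and angle-of-attack both settle to constant values.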

Since $M_{\delta_e}$, $M_q$ and $M_\alpha$ vary with air density
and airspeed we are motivated to schedule these
portions of the control law accordingly.

When $y = \alpha$, the zero dynamics are not
suitable as closed loop poles. In this case, the
pitch rate controller described above is the inner
loop and we apply dynamic inversion a second
time as an outer loop (Enns and Keviczky 2006)
where we approximate the angle-of-attack dynamics
with the simplification that pitch rate has
reached steady state, i.e., $\dot{q} = 0$ and regard pitch
rate as the input ($u = q$) and angle-of-attack as
the controlled variable ($y = \alpha$). The approximate
equation of motion is

$$\dot{\alpha} = Z_\alpha \alpha + q - Z_{\delta_e} M_{\delta_e}^{-1} (M_\alpha \alpha + M_q q)$$

$$= (Z_\alpha - Z_{\delta_e} M_{\delta_e}^{-1} M_\alpha) \alpha + (1 - Z_{\delta_e} M_{\delta_e}^{-1} M_q) q$$

This equation is inverted to give

$$q_c = (1 - Z_{\delta_e} M_{\delta_e}^{-1} M_q)^{-1} \left[ \dot{\alpha}_{des} - (Z_\alpha - Z_{\delta_e} M_{\delta_e}^{-1} M_\alpha) \alpha \right]$$

$q_c$ obtained from this equation is passed to the
inner loop as a command, i.e., $y_c$ of the inner
loop.
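
The outer loop inversion is a single algebraic expression, sketched below in Python. The function name and numerical values are hypothetical; the check simply substitutes the resulting pitch rate command back into the approximate angle-of-attack equation.

```python
def alpha_outer_loop(alpha_dot_des, alpha, Z_a, Z_de, M_a, M_q, M_de):
    """Pitch rate command q_c from the approximate angle-of-attack dynamics."""
    k = Z_de / M_de                          # Z_de * M_de^-1
    return (alpha_dot_des - (Z_a - k * M_a) * alpha) / (1.0 - k * M_q)

# Hypothetical derivatives and operating point.
Z_a, M_a, M_q, Z_de, M_de = -1.0, 2.0, -0.5, -0.1, -5.0
alpha, alpha_dot_des = 0.05, 0.2
q_c = alpha_outer_loop(alpha_dot_des, alpha, Z_a, Z_de, M_a, M_q, M_de)

# Check with q = q_c in the approximate equation of motion.
k = Z_de / M_de
alpha_dot = (Z_a - k * M_a) * alpha + (1.0 - k * M_q) * q_c
print(q_c, alpha_dot)
```

The check holds only to the extent that the inner loop has driven the pitch rate to its command, which is the steady state simplification stated above.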


Lateral-Directional Example 

If we choose the two angular rates as the controlled
variables ($p$, $r$), then the zero dynamics
are favorable. We use the same proportional plus
integral desired dynamics in Fig. 2 but there are
two signals represented by each wire (one associated
with $p$ and the other $r$).

The same state space equations are used for
the dynamic inversion step but now $CA$ and $CB$
are $2 \times 4$ and $2 \times 2$ matrices, respectively instead
of scalars. The superscript in $u = (CB)^{-1} (\dot{y}_{des} - CAx)$
now means matrix inverse instead of reciprocal.
The zero dynamics are assessed with the
transmission zeros of the matrix transfer function.

In the practical case where the aileron and
rudder are limited, it is possible to place a higher
priority on solving one equation vs. another if the
equations are coupled, by proper allocation of the
commands to the control surfaces which is called
control allocation (Enns 1998). In these cases, we
use a constrained optimization approach

$$\min_{u_{\min} \le u \le u_{\max}} \left\| CBu - (\dot{y}_{des} - CAx) \right\|$$

instead of the matrix inverse followed by a limiter.
In cases where there are redundant controls,
i.e., the matrix $CB$ has more columns than rows,
we introduce a preferred solution, $u_p$ and solve a
different constrained optimization problem

$$\min_{CBu + CAx = \dot{y}_{des}} \left\| u - u_p \right\|$$

to find the solution that solves the equations that
is closest to the preferred solution. We utilize
weighted norms to accomplish the desired priority.
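
Both optimization problems can be sketched in pure Python for small cases. The source does not specify a solver, so the bounded problem below is solved with projected gradient descent as a stand-in technique, and the redundant control problem is restricted to a single equation with an unweighted Euclidean norm, where the closest solution has a closed form. All matrices, limits, and names are hypothetical.

```python
def allocate_bounded(CB, d, u_min, u_max, weights, iters=5000):
    """Projected gradient for min ||W (CB u - d)||^2 with u_min <= u <= u_max.

    d stands for y_dot_des - C A x; weights set the priority of each equation.
    """
    m, n = len(CB), len(CB[0])
    WA = [[weights[i] * CB[i][j] for j in range(n)] for i in range(m)]
    Wd = [weights[i] * d[i] for i in range(m)]
    # Squared Frobenius norm bounds the largest eigenvalue of WA^T WA.
    step = 1.0 / sum(a * a for row in WA for a in row)
    u = [min(max(0.0, lo), hi) for lo, hi in zip(u_min, u_max)]
    for _ in range(iters):
        r = [sum(WA[i][j] * u[j] for j in range(n)) - Wd[i] for i in range(m)]
        g = [sum(WA[i][j] * r[i] for i in range(m)) for j in range(n)]
        u = [min(max(u[j] - step * g[j], u_min[j]), u_max[j]) for j in range(n)]
    return u

def allocate_redundant(c, d, u_p):
    """Closest u to u_p (Euclidean norm) satisfying the single equation c.u = d."""
    cu = sum(ci * ui for ci, ui in zip(c, u_p))
    cc = sum(ci * ci for ci in c)
    return [ui + ci * (d - cu) / cc for ci, ui in zip(c, u_p)]

# Hypothetical 2 x 2 case: rows are the roll and yaw equations,
# columns are aileron and rudder.
CB = [[10.0, 2.0], [1.0, -4.0]]
d = [6.0, 1.0]
u_bounded = allocate_bounded(CB, d, [-0.4, -0.5], [0.4, 0.5], [1.0, 1.0])

# Hypothetical redundant case: one equation, two surfaces, preferred solution zero.
c = [10.0, 5.0]
u_red = allocate_redundant(c, 3.0, [0.0, 0.0])
print(u_bounded, u_red)
```

In the bounded case the unconstrained solution would ask for more aileron than the limit allows, so the aileron saturates and the rudder is adjusted to reduce the remaining error; changing the weights shifts that error between the roll and yaw equations.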

An outer loop to control the attitude angles
($\mu$, $\beta$) can be obtained with an approach analogous
to the one used in the previous section.

Nonlinear Example 

Dynamic inversion can be used directly with the
nonlinear equations of motion (Enns et al. 1996;
Wacker et al. 2001). General equations of motion,
e.g., 6 degree-of-freedom rigid body can be
expressed with $\dot{x} = f(x, u)$ and the controlled
variable is given by $y = h(x)$. With the chain
rule of calculus we obtain

$$\dot{y} = \frac{\partial h}{\partial x}(x) \, f(x, u)$$

and for a given $\dot{y} = \dot{y}_{des}$ and (measured) $x$ we
can solve this equation for $u$ either directly or
approximately. In practice, the first order Taylor
Series approximation is effective

$$\dot{y} = a(x, u_0) + b(x, u_0) (u - u_0)$$

where $u_0$ is typically the past value of $u$, in
a discrete implementation. As in the previous
example, Fig. 2 can be used to obtain $\dot{y}_{des}$. The
terms $a(x, u_0) - b(x, u_0) u_0$ and $b(x, u_0)$ are
analogous to the terms $CAx$ and the matrix $CB$,
respectively. Control allocation can be utilized
in the same way as discussed above. The zero
dynamics are evaluated with transmission zeros
at the intended operating points. Outer loops can
be employed in the same manner as discussed in
the previous section.
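
The Taylor series form of the inversion can be sketched with a scalar example in Python. The plant $f(x, u)$ below is invented purely for illustration, with $y = h(x) = x$, and the desired dynamics are again reduced to proportional action; $u_0$ is taken as the previous control value, as described above.

```python
import math

def f(x, u):
    """Hypothetical scalar plant x_dot = f(x, u); not taken from the source."""
    return -x * abs(x) + 3.0 * math.sin(u)

def b(x, u):
    """Partial derivative of f with respect to u."""
    return 3.0 * math.cos(u)

def simulate(y_cmd=1.0, Kb=2.0, dt=0.01, t_end=5.0):
    """Discrete dynamic inversion using y_dot = a + b*(u - u0), u0 = past u."""
    x, u_prev = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        y_dot_des = Kb * (y_cmd - x)             # proportional-only desired dynamics
        a0 = f(x, u_prev)                        # a(x, u0), since y = x
        b0 = b(x, u_prev)                        # b(x, u0)
        u = u_prev + (y_dot_des - a0) / b0       # solve the linearized equation
        x += dt * f(x, u)                        # plant update with the true f
        u_prev = u
    return x, u_prev

x_final, u_final = simulate()
print(x_final, u_final)
```

Each step solves only the linearized equation, so the applied control differs slightly from the exact inverse; for this example the error stays small because $u_0$ is refreshed every sample.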

The control law with this approach utilizes
the equations of motion which can include table
lookup for aerodynamics, propulsion, mass properties,
and reference geometry as appropriate.
The raw aircraft data or an approximation to the
data takes the place of gain schedules with this
approach.

Summary and Future Directions 

Flight control is concerned with tracking commands
for angular rates. The commands may
come directly from the pilot or indirectly from
the pilot through an outer loop, where the pilot
directly commands the outer loop. Feedback control
enables stabilization of aircraft that are inherently
unstable and provides disturbance rejection
and insensitive closed-loop response in the face
of uncertain or varying vehicle dynamics. Proportional
and integral control provide these benefits
of feedback. The aircraft dynamics are significantly
different for low altitude and high speed
compared to high altitude and low speed and so
portions of the control law are scheduled. Aircraft
do exhibit coupling between axes and so multivariable
feedback loop approaches are effective.
Nonlinearities in the form of limits (noninvertible)
and nonlinear expressions, e.g., trigonometric,
polynomial, and table look-up (invertible)
are present in flight control development. The
dynamic inversion approach has been shown to
include the traditional feedback control principles,
systematically develops the equivalent of the
gain schedules, applies to multivariable systems,
applies to invertible nonlinearities, and can be
used to avoid issues with noninvertible nonlinearities
to the extent it is physically possible.

Future developments will include adaptation, 
reconfiguration, estimation, and nonlinear 
analyses. Adaptive control concepts will continue 
to mature and become integrated with approaches 
such as dynamic inversion to deal with 
unstructured or nonparameterized uncertainty or 
variations in the aircraft dynamics. Parameterized 
uncertainty will be incorporated with near real 
time reconfiguration of the aircraft model used 
as part of the control law, e.g., reallocation of 
control surfaces after an actuation failure. State 
variables used as measurements in the control law 
will be estimated as well as directly measured 
in nominal and sensor failure cases. Advances 
in nonlinear dynamical systems analyses will 
create improved intuition, understanding, and 
guidelines for control law development. 


Cross-References 

► PID Control 

► Satellite Control 

► Tactical Missile Autopilots

Abstract

This entry provides an overview of the problems
addressed by discrete-event systems (DES)
theory, with an emphasis on their connection to
various application contexts. The primary intentions
are to reveal the caliber and the strengths
of this theory and to direct the interested reader,
through the listed citations, to the corresponding
literature. The concluding part of the entry also
identifies some remaining challenges and further
opportunities for the area.

Introduction

Discrete-event systems (DES) theory (►Models 
for Discrete Event Systems: An Overview) 
(Cassandras and Lafortune 2008) emerged in 
the late 1970s/early 1980s from the effort
of the controls community to address the
control needs of applications concerning some 
complex production and service operations, 
like those taking place in manufacturing and 
other workflow systems, telecommunication 
and data-processing systems, and transportation 
systems. These operations were seeking the 
ability to support higher levels of efficiency 
and productivity and more demanding notions 
of quality of product and service. At the same 
time, the thriving computing technologies of 
the era, and in particular the emergence of 
the microprocessor, were cultivating, and to a 
significant extent supporting, visions of
ever-increasing automation and autonomy for the
aforementioned operations. The DES community 
set out to provide a systematic and rigorous 
understanding of the dynamics that drive the 
aforementioned operations and their complexity, 
and to develop a control paradigm that would 
define and enforce the target behaviors for those 
environments in an effective and robust manner. 

In order to address the aforementioned objectives,
the controls community had to extend its
methodological base, borrowing concepts, models,
and tools from other disciplines. Among
these disciplines, the following two played a
particularly central role in the development of
the DES theory: (i) Theoretical Computer
Science (TCS) and (ii) Operations Research
(OR). As a new research area, DES thrived on
the analytical strength and the synergies that
resulted from the rigorous integration of the modeling
frameworks that were borrowed from TCS
and OR. Furthermore, the DES community substantially
extended those borrowed frameworks,
bringing into them many of its control-theoretic
perspectives and concepts.

In general, DES-based approaches are characterized
by (i) their emphasis on a rigorous and
formal representation of the investigated systems
and the underlying dynamics; (ii) a double focus
on time-related aspects and metrics that define
traditional/standard notions of performance for
the considered systems, but also on a more
behaviorally oriented analysis that is necessary for
ensuring fundamental notions of “correctness,”
“stability,” and “safety” of the system operation,
especially in the context of the aspired levels
of autonomy; (iii) the interplay between the two
lines of analysis mentioned in item (ii) above
and the further connection of this analysis to
structural attributes of the underlying system;
and (iv) an effort to complement the analytical
characterizations and developments with design
procedures and tools that will provide solutions
provably consistent with the posed specifications
and effectively implementable within the time
and other resource constraints imposed by the
“real-time” nature of the target applications.

The rest of this entry overviews the current 
achievements of DES theory with respect to 
(w.r.t.) the different classes of problems that 
have been addressed by it and highlights the 
potential that is defined by these achievements 
for a range of motivating applications. On the 
other hand, the constricted nature of this entry 
does not allow an expansive treatment of the 
aforementioned themes. Hence, the provided 
coverage is further supported and supplemented 
by an extensive list of references that will 
connect the interested reader to the relevant 
literature. 


A Tour of DES Problems 
and Applications 

DES-Based Behavioral Modeling, Analysis, 
and Control 

The basic characterization of behavior in the DES-theoretic framework
is through the various event sequences that can be generated by the
underlying system. Collectively, these sequences are known as the
(formal) language generated by the plant system, and the primary
intention is to restrict the plant behavior within a subset of the
generated event strings. The investigation of this problem is further
facilitated by the introduction of certain mechanisms that act as
formal representations of the studied systems, in the sense that they
generate the same strings of events (i.e., the same formal language).
Since these models are concerned with the representation of the event
sequences that are generated by DES, and not with the exact timing of
these events, they are frequently characterized as untimed DES models.
In the practical applications of DES theory, the most popular such
models are the Finite State Automaton (FSA) (Cassandras and Lafortune
2008; Hopcroft and Ullman 1979; ►Supervisory Control of Discrete-Event
Systems; ►Diagnosis of Discrete Event Systems) and the Petri net (PN)
(Cassandras and Lafortune 2008; Murata 1989; ►Modeling, Analysis, and
Control with Petri Nets).
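
As a minimal illustration of an untimed model and the language it
generates, the following sketch encodes a small deterministic automaton
as a transition table and enumerates its event strings up to a given
length. The machine, its state names, and its event names are
hypothetical and chosen only for illustration; they are not taken from
the cited references.

```python
def generated_strings(transitions, initial, max_len):
    """Enumerate the event strings of length <= max_len generated by a
    deterministic automaton given as {(state, event): next_state}."""
    frontier = [((), initial)]      # (string so far, current state)
    strings = [()]                  # the empty string is always generated
    for _ in range(max_len):
        next_frontier = []
        for string, state in frontier:
            for (src, event), dst in sorted(transitions.items()):
                if src == state:
                    next_frontier.append((string + (event,), dst))
        strings.extend(s for s, _ in next_frontier)
        frontier = next_frontier
    return strings


# Hypothetical single machine: it can be started, and while busy it
# either finishes its job or breaks down and must be repaired.
MACHINE = {
    ("idle", "start"): "busy",
    ("busy", "finish"): "idle",
    ("busy", "breakdown"): "down",
    ("down", "repair"): "idle",
}

language_up_to_2 = generated_strings(MACHINE, "idle", 2)
```

Only the order of events matters here; no timing information is
attached to the transitions, which is what makes the model untimed.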

In the context of DES applications, these 
modeling frameworks have been used to provide 
succinct characterizations of the underlying 
event-driven dynamics and to design controllers, 
in the form of supervisors, that will restrict these
dynamics so that they abide by safety, consistency,
fairness, and other similar considerations
(►Supervisory Control of Discrete-Event 
Systems). As a more concrete example, in the
context of contemporary manufacturing, DES-based
behavioral control, frequently referred to
as supervisory control (SC), has been promoted
as a systematic methodology for the synthesis and
verification of the control logic that is necessary
for the support of the so-called SCADA
(Supervisory Control and Data Acquisition)
function. This control function is typically
implemented through the Programmable Logic 
Controllers (PLCs) that have been employed in 
contemporary manufacturing shop-floors, and 
DES SC theory can support it (i) by providing 
more rigor and specificity to the models that are 
employed for the underlying plant behavior and 
the imposed specifications and (ii) by offering 
the ability to synthesize control policies that are 
provably correct by construction. Some example 
works that have pursued the application of DES 
SC along these lines can be found in Balemi 
et al. (1993), Brandin (1996), Park et al. (1999), 
Chandra et al. (2003), Endsley et al. (2006), and 
Andersson et al. (2010). 
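
To make the idea of a supervisor that is correct by construction more
tangible, here is a minimal sketch for one simple kind of
specification: keeping the plant out of a set of forbidden states when
some events cannot be disabled. It is a simplified state-avoidance
computation under full observation that ignores nonblocking
requirements, and it is not the synthesis procedure of any of the works
cited above. The plant (a machine feeding a one-slot buffer) and all
names are hypothetical.

```python
def unsafe_states(transitions, forbidden, uncontrollable):
    """States from which uncontrollable events alone can reach a
    forbidden state (computed as a backward fixpoint)."""
    bad = set(forbidden)
    changed = True
    while changed:
        changed = False
        for (src, event), dst in transitions.items():
            if event in uncontrollable and dst in bad and src not in bad:
                bad.add(src)
                changed = True
    return bad


def synthesize_supervisor(transitions, forbidden, uncontrollable):
    """Return the unsafe states and, per safe state, the events to disable."""
    bad = unsafe_states(transitions, forbidden, uncontrollable)
    disabled = {}
    for (src, event), dst in transitions.items():
        # From a safe state, any event entering the unsafe set must be
        # controllable (otherwise src itself would be unsafe).
        if src not in bad and dst in bad:
            disabled.setdefault(src, set()).add(event)
    return bad, disabled


# Hypothetical plant: states are (machine status, buffer content).
# "finish" is uncontrollable; "start" and "take" are controllable.
PLANT = {
    (("I", 0), "start"): ("W", 0),
    (("W", 0), "finish"): ("I", 1),
    (("I", 1), "start"): ("W", 1),
    (("I", 1), "take"): ("I", 0),
    (("W", 1), "take"): ("W", 0),
    (("W", 1), "finish"): "overflow",
}

bad, disabled = synthesize_supervisor(PLANT, {"overflow"}, {"finish"})
```

In this example the supervisor must prevent the machine from starting
while the buffer is full, because once the machine is working the
overflow can no longer be prevented.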

On the other hand, the aforementioned activity 
has also defined a further need for pertinent 
interfaces that will translate (a) the plant structure 
and the target behavior to the necessary DES- 
theoretic models and (b) the obtained policies to 
PLC executables. This need has led to a line of 
research, in terms of representational models and 
computational tools, that is complementary to the
core DES developments described in the previous
paragraphs. Indicatively, we mention the development
of GRAFCET (David and Alla 1992) and
of the sequential function charts (SFCs) (Lewis 
1998) from the earlier times, while some more 
recent endeavor along these lines is reported in 
Wightkin et al. (2011) and Alenljung et al. (2012) 
and the references cited therein. 

Besides its employment in the manufacturing domain, DES SC theory has
also been considered for the coordination of the communicating
processes that take place in various embedded systems (Feng et al.
2007); the systematic validation of the embedded software that is
employed in various control applications, ranging from power systems
and nuclear plants to aircraft and automotive electronics (Li and
Kumar 2012); the synthesis of the control logic in the electronic
switches that are utilized in telecom and data networks; and the
modeling, analysis, and control of the operations that take place in
health-care systems (Sampath et al. 2008). Wassyng et al. (2011) give
a very interesting account of the gains, but also the extensive
challenges, experienced by a team of researchers who have tried to
apply formal methods, similar to those that have been promoted by the
behavioral DES theory, to the development and certification of the
software that manages some safety-critical operations for Canadian
nuclear plants.

Apart from control, untimed DES models 
have also been employed for the diagnosis of 
critical events, like certain failures, that cannot
be observed explicitly but whose occurrence
can be inferred from some resultant behavioral
patterns (Sampath et al. 1996; ►Diagnosis of
Discrete Event Systems). More recently, the 
relevant methodology has been extended with 
prognostic capability (Kumar and Takai 2010), 
while an interesting variation of it addresses 
the “dual” problem that concerns the design 
of systems where certain events or behavioral 
patterns must remain undetectable by an external 
observer who has only partial observation of the 
system behavior; this last requirement has been 
formally characterized by the notion of “opacity” 
in the relevant literature, and it finds application
in the design and operation of secure systems
(Dubreil et al. 2010; Saboori and Hadjicostis 
2012, 2014). 

Dealing with the Underlying 
Computational Complexity 

As revealed by the discussion of the previous paragraphs, many of the
applications of DES SC theory concern the integration and coordination
of behavior that is generated by a number of interacting components.
In these cases, the formal models that are necessary for the
description of the underlying plant behavior may grow very fast in
size, and the algorithms that are involved in the behavioral analysis
and control synthesis may become practically intractable.
Nevertheless, the rigorous methodological base that underlies DES
theory also provides a framework for addressing these computational
challenges in an effective and structured manner.
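
The growth in model size can be seen in a small experiment. The sketch
below counts the reachable states of the composition of n identical
components that do not share any events, so that their transitions
simply interleave. The three-state component is a hypothetical example;
with shared events or constraints the reachable set would typically be
smaller, but the same exploration applies.

```python
from collections import deque


def reachable_product_size(component, initial, n):
    """Count the reachable states of n copies of `component`
    ({(state, event): next_state}) evolving with private events."""
    start = (initial,) * n
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for i, local in enumerate(state):
            for (src, _event), dst in component.items():
                if src == local:
                    successor = state[:i] + (dst,) + state[i + 1:]
                    if successor not in seen:
                        seen.add(successor)
                        queue.append(successor)
    return len(seen)


# Hypothetical three-state component (idle, busy, down).
COMPONENT = {
    ("idle", "start"): "busy",
    ("busy", "finish"): "idle",
    ("busy", "breakdown"): "down",
    ("down", "repair"): "idle",
}

sizes = [reachable_product_size(COMPONENT, "idle", n) for n in range(1, 6)]
```

For this component the count is 3 to the power n, so each added
component triples the state space that an analysis or synthesis
algorithm has to explore.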

More specifically, DES SC theory provides conditions under which the
control specifications can be decomposed over the constituent plant
components while maintaining the integrity and correctness of the
overall plant behavior (►Supervisory Control of Discrete-Event
Systems; Wonham 2006). The aforementioned works of Brandin (1996) and
Endsley et al. (2006) provide some concrete examples for the
application of modular control synthesis. On the other hand, there are
fundamental problems addressed by SC theory and practice that require
a holistic view of the underlying plant and its operation, and thus,
they are not amenable to modular solutions. For such cases, DES SC
theory can still provide effective solutions through (i) the
identification of special plant structure, of practical relevance, for
which the target supervisors are implementable in a computationally
efficient manner and (ii) the development of structured approaches
that can systematically trade off the original specifications for
computational tractability.

A particular application that has benefited from, and at the same time
has significantly promoted, this last capability of DES SC theory is
that concerning the deadlock-free operation of many systems where a
set of processes that execute concurrently and in a staged manner are
competing, at each of their processing stages, for the allocation of a
finite set of reusable resources. In DES theory, this problem is known
as the liveness-enforcing supervision of sequential resource
allocation systems (RAS) (Reveliotis 2005), and it underlies the
operation of many contemporary applications: from the resource
allocation taking place in contemporary manufacturing shop floors
(Ezpeleta et al. 1995; Reveliotis and Ferreira 1996; Jeng et al.
2002), to the traveling and/or workspace negotiation in robotic
systems (Reveliotis and Roszkowska 2011), automated railway (Giua
et al. 2006), and other guidepath-based traffic systems (Reveliotis
2000); to Internet-based workflow management systems like those
envisioned for e-commerce and certain banking and insurance claim
processing applications (Van der Aalst 1997); and to the allocation of
the semaphores that control the accessibility of shared resources by
concurrently executing threads in parallel computer programs (Liao
et al. 2013). A systematic introduction to the DES-based modeling of
RAS and their liveness-enforcing supervision is provided in Reveliotis
(2005) and Zhou and Fanti (2004), while some more recent developments
in the area are epitomized in Reveliotis (2007), Li et al. (2008),
and Reveliotis and Nazeem (2013).

Closing the above discussion on the ability of 
DES theory to address effectively the complexity 
that underlies the DES SC problem, we should 
point out that the same merits of the theory 
have also enabled the effective management of 
the complexity that underlies problems related 
to the performance modeling and control of the 
various DES applications. We shall return to this 
capability in the next section that discusses the 
achievements of DES theory in this domain. 

DES Performance Control
and the Interplay Among Structure,
Behavior, and Performance

DES theory is also interested in the performance 
modeling, analysis, and control of its target 
applications w.r.t. time-related aspects like 
throughput, resource utilization, experienced 
latencies, and congestion patterns. To support
this type of analysis, the untimed DES behavioral
models are extended to their timed versions. This 
extension takes place by endowing the original 
untimed models with additional attributes that 
characterize the experienced delays between 
the activation of an event and its execution 
(provided that it is not preempted by some other 
conflicting event). Timed models are further 
classified by the extent and the nature of the 
randomness that is captured by them. A basic 
such categorization is between deterministic
models, where the aforementioned delays take
fixed values for every event, and stochastic
models, which admit more general distributions.
From an application standpoint, timed DES 
models connect DES theory to the multitude 
of applications that have been addressed by 
Dynamic Programming, Stochastic Control, 
and scheduling theory (Bertsekas 1995; Meyn 
2008; Pinedo 2002). Also, in their most general 
definition, stochastic DES models provide 
the theoretical foundation of discrete-event 
simulation (Banks et al. 2009). 
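
A stochastic timed model of this kind can be simulated by keeping a
list of scheduled events and always executing the earliest one. The
following sketch does this for a single-server queue with exponential
interarrival and service delays. The choice of system, the function
name, and the reported statistics are assumptions made for
illustration only.

```python
import heapq
import random


def simulate_queue(arrival_rate, service_rate, horizon, seed=0):
    """Event-scheduling simulation of a single-server FIFO queue."""
    rng = random.Random(seed)
    events = [(rng.expovariate(arrival_rate), "arrival")]  # future event list
    clock = 0.0
    in_system = 0
    busy_time = 0.0
    departures = 0
    while events:
        time, kind = heapq.heappop(events)
        if time > horizon:
            break
        if in_system > 0:                  # server was busy since last event
            busy_time += time - clock
        clock = time
        if kind == "arrival":
            in_system += 1
            # Schedule the next arrival; start service if the server is free.
            heapq.heappush(events, (clock + rng.expovariate(arrival_rate), "arrival"))
            if in_system == 1:
                heapq.heappush(events, (clock + rng.expovariate(service_rate), "departure"))
        else:
            in_system -= 1
            departures += 1
            if in_system > 0:              # next customer enters service
                heapq.heappush(events, (clock + rng.expovariate(service_rate), "departure"))
    return {"utilization": busy_time / clock, "departures": departures}


result = simulate_queue(arrival_rate=0.5, service_rate=1.0, horizon=20000.0)
```

The delay attached to each event is sampled when the event is
activated, which mirrors the description of timed models given above;
replacing the exponential samples with constants would give a
deterministic timed model.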

Similar to the case of behavioral DES theory, a practical concern that
challenges the application of timed DES models for performance
modeling, analysis, and control is the very large size of these
models, even for fairly small systems. DES theory has tried to
circumvent these computational challenges through the development of
methodology that enables the assessment of the system performance,
over a set of possible configurations, from the observation of its
behavior and the resultant performance at a single configuration. The
required observations can be obtained through simulation, and in many
cases, they can be collected from a single realization, or sample
path, of the observed behavior; but then, the considered methods can
also be applied on the actual system, and thus, they become a tool for
real-time optimization, adaptation, and learning. Collectively, the
aforementioned methods define a “sensitivity”-based approach to DES
performance modeling, analysis, and control (Cassandras and Lafortune
2008; ►Perturbation Analysis of Discrete Event Systems). Historically,
DES sensitivity analysis originated in the early 1980s in an effort to
address the performance analysis and optimization of queueing systems
w.r.t. certain structural parameters like the arrival and processing
rates (Ho and Cao 1991). But the current theory addresses more general
stochastic DES models that bring it closer to broader endeavors to
support incremental optimization, approximation, and learning in the
context of stochastic optimal control (Cao 2007). Some particular
applications of DES sensitivity analysis for the performance
optimization of production, telecom, and computing systems can be
found in Cassandras and Strickland (1988), Cassandras (1994),
Panayiotou and Cassandras (1999), Homem-de-Mello et al. (1999), Fu and
Xie (2002), and Santoso et al. (2005).

Another interesting development in time- 
based DES theory is the theory of (max,+) 
algebra (Baccelli et al. 1992). In its practical 
applications, this theory addresses the timed 
dynamics of systems that involve the synchronization
of a number of concurrently executing
processes with no conflicts among them, and 
it provides important structural results on the 
factors that determine the behavior of these 
systems in terms of the occurrence rates of 
various critical events and the experienced 
latencies among them. Motivational applications 
of (max,+) algebra can be traced in the design and 
control of telecommunication and data networks, 
manufacturing, and railway systems, and more 
recently the theory has found considerable 
practical application in the computation of 
repetitive/cyclical schedules that seek to optimize 
the throughput rate of automated robotic cells and 
of the cluster tools that are used in semiconductor 
manufacturing (Kim and Lee 2012; Lee 2008; 
Park et al. 1999). 
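
A small numerical sketch may help to fix ideas. In the (max,+) setting,
the firing times of the events evolve linearly once ordinary addition
is replaced by maximization and ordinary multiplication by addition.
The matrix below is a hypothetical example in which entry A[i][j] is
read as the delay between the k-th occurrence of event j and the
earliest (k+1)-th occurrence of event i; it does not come from the
cited references.

```python
NO_ARC = float("-inf")   # (max,+) "zero": no precedence constraint


def maxplus_step(A, x):
    """One step of x(k+1) = A (x) x(k): component i is max_j (A[i][j] + x[j])."""
    return [max(a + xj for a, xj in zip(row, x)) for row in A]


def trajectory(A, x0, steps):
    """Return [x(0), x(1), ..., x(steps)]."""
    xs = [list(x0)]
    for _ in range(steps):
        xs.append(maxplus_step(A, xs[-1]))
    return xs


# Hypothetical two-event system with delays in time units.
A = [[3, 7],
     [2, 4]]

xs = trajectory(A, [0, 0], 5)
# Average time between consecutive occurrences once the periodic regime
# is reached (here the regime has period 2).
cycle_time = (xs[5][0] - xs[3][0]) / 2
```

In this example the computed cycle time equals the largest mean delay
over the circuits of the precedence graph, here the circuit through
both events with mean (7 + 2) / 2, which illustrates the kind of
structural result on occurrence rates mentioned above.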

Both the sensitivity-based methods and the theory of (max,+) algebra
that were discussed in the previous paragraphs are enabled by the
explicit, formal modeling of the DES structure and behavior in the
pursued performance analysis and control. This integrative modeling
capability that is supported by DES theory also enables a profound
analysis of the impact of the imposed behavioral-control policies upon
the system performance and, thus, the pursuit of a more integrative
approach to the synthesis of the behavioral and the
performance-oriented control policies that are necessary for any
particular DES instantiation. This is a rather novel topic in the
relevant DES literature, and some recent works in this direction can
be found in Cao (2005), Li and Reveliotis (2013), Markovski and Su
(2013), and David-Henriet et al. (2013).

The Roles of Abstraction and Fluidification 

The notions of “abstraction” and “fluidification” play a significant
role in mastering the complexity that arises in many DES applications.
Furthermore, both of these concepts have an important role in defining
the essence and the boundaries of DES-based modeling.

In general systems theory, abstraction can be broadly defined as the
effort to develop simplified models for the considered dynamics that
retain, however, adequate information to resolve the posed questions
in an effective manner. In DES theory, abstraction has been pursued
w.r.t. the modeling of both the timed and untimed behaviors, giving
rise to hierarchical structures and models. A theory for hierarchical
SC is presented in Wonham (2006), while some applications of
hierarchical SC in the manufacturing domain are presented in Hill
et al. (2010) and Schmidt (2012). In general, hierarchical SC relies
on a “spatial” decomposition that tries to localize/encapsulate the
plant behavior into a number of modules that interact through the
communication structure that is defined by the hierarchy. On the other
hand, when it comes to timed DES behavior and models, a popular
approach seeks to define a hierarchical structure for the underlying
decision-making process by taking advantage of the different time
scales that correspond to the occurrence of the various event types.
Some particular works that formalize and systematize this idea in the
application context of production systems can be found in Gershwin
(1994) and Sethi and Zhang (1994) and the references cited therein.

In fact, the DES models that have been employed in many application
areas can themselves be perceived as abstractions of dynamics of a
more continuous, time-driven nature, where the underlying plant
undergoes some fundamental (possibly structural) transition upon the
occurrence of certain events that are defined either endogenously or
exogenously w.r.t. these dynamics. The combined consideration of the
discrete-event dynamics that are generated in the manner described
above, with the continuous, time-driven dynamics that characterize the
modalities of the underlying plant, has led to the extension of the
original DES theory to the so-called hybrid systems theory. Hybrid
systems theory is itself very rich, and it is covered in another
section of this encyclopedia (see also ►Discrete Event Systems and
Hybrid Systems, Connections Between). From an application standpoint,
it increases substantially the relevance of the DES modeling framework
and brings this framework to some new and exciting applications. Some
of the most prominent applications concern the coordination of
autonomous vehicles and robotic systems, and a nice anthology of works
concerning the application of hybrid systems theory in this particular
application area can be found in the IEEE Robotics and Automation
Magazine of September 2011. These works also reveal the strong
affinity that exists between hybrid systems theory and the DES
modeling paradigm. Along similar lines, hybrid systems theory also
underlies the endeavors for the development of the Automated Highway
Systems that have been explored for the support of future urban
traffic needs (Horowitz and Varaiya 2000). Finally, hybrid systems
theory and its DES component have been explored more recently as
potential tools for the formal modeling and analysis of the molecular
dynamics that are studied by systems biology (Curry 2012).

Fluidification, on the other hand, is the effort to represent as
continuous flows dynamics that are essentially of discrete-event type,
in order to alleviate the computational challenges that typically
result from discreteness and its combinatorial nature. The resulting
models serve as approximations of the original dynamics, they
frequently have the formal structure of hybrid systems, and they
define a basis for developing “relaxations” for the originally
addressed problems. Usually, their justification is of an ad hoc
nature, and the quality of the established approximations is
empirically assessed on the basis of the delivered results (by
comparing these results to some “baseline” performance). There are,
however, a number of cases where the relaxed fluid model has been
shown to retain important behavioral attributes of its original
counterpart (Dai 1995). Furthermore, some recent works have
investigated more analytically the impact of the approximation that is
introduced by these models on the quality of the delivered results
(Wardi and Cassandras 2013). Some more works regarding the application
of fluidification in the DES-theoretic modeling frameworks, and the
potential advantages that it brings in various application contexts,
can be found in Srikant (2004), Meyn (2008), David and Alla (2005),
and Cassandras and Yao (2013).

Summary and Future Directions 

The discussion of the previous section has 
revealed the extensive application range and 
potential of DES theory and its ability to provide 
structured and rigorous solutions to complex 
and sometimes ill-defined problems. On the 
other hand, the same discussion has revealed 
the challenges that underlie many of the DES 
applications. The complexity that arises from 
the intricate and integrating nature of most DES 
models is perhaps the most prominent of these 
challenges. This complexity manifests itself in 
the involved computations, but also in the need 
for further infrastructure, in terms of modeling interfaces
and computational tools, that will render
DES theory more accessible to the practitioner. 

The DES community is aware of this need, 
and the last few years have seen the development 
of a number of computational platforms that seek 
to implement and leverage the existing theory 
by connecting it to various application settings; 
indicatively, we mention DESUMA (Ricker et al.
2006), SUPREMICA (Akesson et al. 2006), and 
TCT (Feng and Wonham 2006) that support DES 
behavioral modeling, analysis, and control along 
the lines of DES SC theory, while the website 
entitled “The Petri Nets World” has an extensive 
database of tools that support modeling and 
analysis through untimed and timed variations
of the Petri net model. Model checking tools, like 
SMV, NuSMV, and SPIN, that are used for verification
purposes are also important enablers for the practical
application of DES theory, and, of course,
there are a number of programming languages 
and platforms, like Arena, AutoMod, and Simio, 
that support discrete-event simulation. However, 
with the exception of the discrete-event- 
simulation software, which is a pretty mature 
industry, the rest of the aforementioned endeavors 
currently evolve primarily within the academic 
and the broader research community. Hence, a 
remaining challenge for the DES community is 
the strengthening and expansion of the aforementioned
computational platforms to robust
and user-friendly computational tools. The availability
of such industrial-strength computational
tools, combined with the development of a body 
of control engineers well-trained in DES theory, 
will be catalytic for bringing all the developments 
that were described in the earlier parts of this 
document even closer to the industrial practice. 

Cross-References 
► Discrete Event Systems and Hybrid Systems, Connections Between

► Modeling, Analysis, and Control with Petri Nets
Abstract

Auctions are procedures for selling one or more items to one or more
bidders. Auctions induce games among the bidders, so notions of
equilibrium from game theory can be applied to auctions. Auction
theory aims to characterize and compare the equilibrium outcomes for
different types of auctions. Combinatorial auctions arise when
multiple related items are sold simultaneously.

Keywords 
Introduction

Three commonly used types of auctions for the 
sale of a single item are the following: 

• First price auction: Each bidder submits a bid,
one of the bidders submitting the maximum
bid wins, and the payment for the item is
the maximum bid. (In this context “wins”
means receives the item, no matter what the
payment.)

• Second price auction or Vickrey auction: Each
bidder submits a bid, one of the bidders submitting
the maximum bid wins, and the payment
for the item is the second highest bid.

• English auction: The price for the item increases
continuously or in some small increments,
and bidders drop out at some points
in time. Once all but one of the bidders have
dropped out, the remaining bidder wins and
the payment is the price at which the last of
the other bidders dropped out.
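
The three payment rules can be written down in a few lines. The sketch
below is only an illustration: the function names are hypothetical,
ties are broken in favor of the lowest bidder index (an assumption, as
the rules above only say that one of the maximum bidders wins), and the
English auction is idealized as a continuously rising price with given
drop-out prices.

```python
def sealed_bid_outcome(bids, rule):
    """Return (winner index, payment) for a first or second price auction."""
    winner = max(range(len(bids)), key=lambda i: bids[i])  # first maximum wins ties
    ranked = sorted(bids, reverse=True)
    if rule == "first":
        payment = ranked[0]          # winner pays the maximum bid
    elif rule == "second":
        payment = ranked[1]          # winner pays the second highest bid
    else:
        raise ValueError("rule must be 'first' or 'second'")
    return winner, payment


def english_outcome(dropout_prices):
    """Return (winner index, payment) when bidder i drops out at
    dropout_prices[i] and the price rises continuously."""
    winner = max(range(len(dropout_prices)), key=lambda i: dropout_prices[i])
    others = [p for i, p in enumerate(dropout_prices) if i != winner]
    return winner, max(others)       # price at which the last rival dropped out


first = sealed_bid_outcome([10, 30, 20], "first")
second = sealed_bid_outcome([10, 30, 20], "second")
english = english_outcome([10, 30, 20])
```

With the same numbers used as bids and as drop-out prices, the second
price and English formats produce the same winner and payment, while
the first price format charges the winner's own bid.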

A key goal of the theory of auctions is to 
predict how the bidders will bid, and predict 
the resulting outcomes of the auction: which 
bidder is the winner and what is the payment. 
For example, a seller may be interested in the 
expected payment (seller revenue). A seller may 
have the option to choose one auction format over 
another and be interested in revenue comparisons. 
Another item of interest is efficiency or social 
welfare. For sale of a single item, the outcome is 
efficient if the item is sold to the bidder with the 
highest value for the item. The book of V. Krishna 
(2002) provides an excellent introduction to the 
theory of auctions. 
Auctions Versus Seller Mechanisms

An important class of mechanisms within the theory
of mechanism design is that of seller mechanisms,
which implement the sale of one or more items
to one or more bidders. Some authors would 
consider all such mechanisms to be auctions, but 
the definition of auctions is often more narrowly 
interpreted, with auctions being the subclass of 
seller mechanisms which do not depend on the 
fine details of the set of bidders. The rules of the 
three types of auction mentioned above do not 
depend on fine details of the bidders, such as the 
number of bidders or statistical information about 
how valuable the item is to particular bidders. In 
contrast, designing a procedure to sell an item to 
a known set of bidders under specific statistical 
assumptions about the bidders’ preferences in 
order to maximize the expected revenue (as in 
Myerson (1981)) would be considered a problem 
of mechanism design, which is outside the more 
narrowly defined scope of auctions. The narrower
definition of auctions was championed by R. Wilson
(1987). An article on ►Mechanism Design
appears in this encyclopedia.

Equilibrium Strategies in Auctions 

An auction induces a noncooperative game among the bidders, and a
commonly used predictor of the outcome of the auction is an
equilibrium of the game, such as a Nash or Bayes-Nash equilibrium.
For a risk neutral bidder $i$ with value $x_i$ for the item, if the
bidder wins and the payment is $M_i$, the payoff of the bidder is
$x_i - M_i$. If the bidder does not win, the payoff of the bidder is
zero. If, instead, the bidder is risk averse with risk aversion
measured by an increasing utility function $u_i$, the payoff of the
bidder would be $u_i(x_i - M_i)$ if the bidder wins and $u_i(0)$ if
the bidder does not win.

The second price auction format is characterized by simplicity of the
bidding strategies. If bidder $i$ knows the value $x_i$ of the item to
himself, then for the second price auction format, a weakly dominant
strategy for the bidder is to truthfully report $x_i$ as his bid for
the item. Indeed, if $y_i$ is the highest bid of the other bidders,
the payoff of bidder $i$ is $u_i(x_i - y_i)$ if he wins and $u_i(0)$
if he does not win. Thus, bidder $i$ would prefer to win whenever
$u_i(x_i - y_i) > u_i(0)$ and not win whenever
$u_i(x_i - y_i) < u_i(0)$. That is precisely what happens if bidder
$i$ bids $x_i$, no matter what the bids of the other bidders are. That
is, bidding $x_i$ is a weakly dominant strategy for bidder $i$.
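
The dominance argument can be checked numerically on a grid of bids.
The sketch below is illustrative only: it assumes that a tie with the
highest competing bid is lost (the conclusion does not depend on this
choice), and the function names and the grid are hypothetical.

```python
import math


def second_price_payoff(value, bid, highest_other_bid, utility=lambda w: w):
    """Payoff of a bidder in a second price auction; the default utility
    is risk neutral. Ties are assumed to be lost (an assumption)."""
    if bid > highest_other_bid:
        return utility(value - highest_other_bid)   # wins, pays rival's bid
    return utility(0)                               # loses, pays nothing


def truthful_is_weakly_dominant(value, grid, utility=lambda w: w):
    """Check that bidding `value` is never worse than any bid on the
    grid, whatever the highest competing bid on the grid is."""
    return all(
        second_price_payoff(value, value, y, utility)
        >= second_price_payoff(value, b, y, utility)
        for b in grid
        for y in grid
    )


grid = range(0, 11)
risk_neutral_ok = truthful_is_weakly_dominant(6, grid)
# An increasing, concave utility as an example of risk aversion.
risk_averse_ok = truthful_is_weakly_dominant(6, grid, lambda w: 1 - math.exp(-w))
```

The check succeeds for any increasing utility because the bid only
decides whether the bidder wins, never what he pays.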

Nash equilibrium can be found for the other types of auctions under a
model with incomplete information, in which the type of each bidder
$i$ is equal to the value of the object to the bidder and is modeled
as a random variable $X_i$ with a density function $f_i$ supported by
some interval $[a_i, b_i]$. A simple case is that the bidders are all
risk neutral, the densities are all equal to some fixed density $f$,
and the $X_i$'s are mutually independent. The English auction in this
context is equivalent to the second price auction: in an English
auction, dropping out when the price reaches his true value is a
weakly dominant strategy for a bidder, and for the weakly dominant
strategy equilibrium, the outcome of the auction is the same as for
the second price auction. For the first price auction in this
symmetric case, there exists a symmetric Bayesian equilibrium. It
corresponds to all bidders using the bidding function $\beta$ (so the
bid of bidder $i$ is $\beta(X_i)$), where $\beta$ is given by
$\beta(x) = E[Y_1 \mid Y_1 < x]$ and $Y_1$ denotes the highest value
among the bidders other than bidder 1. The expected revenue to the
seller in this case is $E[Y_1 \mid Y_1 < X_1]$, which is the same as
the expected revenue for the second price auction and English auction.
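
For a concrete instance, suppose (as an assumption made only for this
illustration) that the values are independent and uniformly
distributed on [0, 1] and that there are n bidders. Then the highest
competing value has distribution function y^(n-1), and the conditional
expectation E[Y_1 | Y_1 < x] evaluates to (n - 1) x / n. The sketch
below checks this closed form by numerical integration and compares
the seller's average revenue in the two formats by simulation.

```python
import random


def beta_uniform(x, n):
    """Closed form of E[Y1 | Y1 < x] for n bidders with U[0, 1] values."""
    return (n - 1) * x / n


def beta_numeric(x, n, steps=2000):
    """Same quantity by the midpoint rule: the density of Y1 is
    (n - 1) y^(n - 2) and P(Y1 < x) = x^(n - 1)."""
    h = x / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        total += y * (n - 1) * y ** (n - 2) * h
    return total / x ** (n - 1)


def average_revenues(n, samples=50000, seed=1):
    """Average seller revenue under equilibrium bidding in each format."""
    rng = random.Random(seed)
    first = second = 0.0
    for _ in range(samples):
        values = sorted(rng.random() for _ in range(n))
        first += beta_uniform(values[-1], n)   # highest bid, first price
        second += values[-2]                   # second highest truthful bid
    return first / samples, second / samples


first_price_revenue, second_price_revenue = average_revenues(n=3)
```

For three bidders both averages should be close to (n - 1) / (n + 1) =
0.5, in line with the equality of expected revenues stated above.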

Equilibrium for Auctions 
with Interdependent Valuations 

Seminal work of Milgrom and Weber (1982) addresses the performance of
the above three auction formats in case the bidders do not know the
value of the item, but each bidder $i$ has a private signal $X_i$
about the value $V_i$ of the item to bidder $i$. The values and
signals $(X_1, \ldots, X_n, V_1, \ldots, V_n)$ can be interdependent.
Under the assumption of invariance of the joint distribution of
$(X_1, \ldots, X_n, V_1, \ldots, V_n)$ under permutation of the
bidders and a strong form of positive correlation of the random
variables $(X_1, \ldots, X_n, V_1, \ldots, V_n)$ (see Milgrom and
Weber 1982 or Krishna 2002 for details), a symmetric Bayes-Nash
equilibrium is identified for each of the three auction formats
mentioned above, and the expected revenues for the three auction
formats are shown to satisfy the ordering
$R^{(\text{first price})} \le R^{(\text{second price})} \le R^{(\text{English})}$.

A significant extension of the theory of Milgrom 
and Weber due to DeMarzo et al. (2005) is the 
theory of security-bid auctions in which bidders 
compete to buy an asset and the final payment is 
determined by a contract involving the value of 
the asset as revealed after the auction. 


Combinatorial Auctions 

Combinatorial auctions implement the simultaneous
sale of multiple items. A simple version is the
simultaneous ascending price auction with activity
constraints (Cramton 2006; Milgrom 2004).
Such an auction procedure was originally proposed
by Preston McAfee, Paul Milgrom, and
Robert Wilson for the US FCC wireless spectrum
auction in 1994 and has been used for the vast majority
of spectrum auctions worldwide since then
(Cramton 2013). The auction proceeds in rounds. In
each round a minimum price is set for each item,
with the minimum prices for the initial round
being reserve prices set by the seller. A given
bidder may place a bid on an item in a given
round such that the bid is greater than or equal
to the minimum price for the item. If one or more
bidders bid on an item in a round, a provisional
winner of the item is selected from among the
bidders with the highest bid for the item in the
round, with the new provisional price being the
highest bid. The minimum price for the item is
then set 10 % (or some other small percentage)
above the new provisional price. Once there is a
round with no bids, the set of provisional winners
is identified. Often constraints are placed on the
bidders in the form of activity rules. An activity
rule requires a bidder to maintain a history of
bidding in order to continue bidding, so as to
prevent bidders from holding back in early rounds
and bidding aggressively in later rounds. The
motivation for activity rules is to promote price
discovery to help bidders select the packages (or
bundles) of items most suitable for them to buy.
A key issue is that complementarities may exist among
the items for a given bidder. Complementarity
means that a bidder may place a significantly
higher value on a bundle of items than the sum of
values the bidder would place on the items
individually. Complementarities lead to the exposure
problem, which occurs when a bidder wins only
a subset of items of a desired bundle at a price
which is significantly higher than the value of that
subset to the bidder.
For example, a customer might place a high value
on a particular pair of shoes, but little value on a
single shoe alone.
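
The round structure described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `run_round`, the data layout, and the tie-breaking rule (the first-listed bidder among those with the highest bid) are assumptions made for the example, and activity rules are not modeled.

```python
def run_round(min_prices, bids, increment=0.10):
    """Process one round of a simultaneous ascending price auction.

    min_prices: dict item -> current minimum price
    bids:       dict item -> list of (bidder, amount) placed this round
    Returns (provisional, new_min_prices), where provisional maps each
    item that received a valid bid to (provisional winner, provisional price).
    """
    provisional = {}
    new_min_prices = dict(min_prices)
    for item, offers in bids.items():
        # Only bids at or above the current minimum price are admissible.
        valid = [(b, amt) for b, amt in offers if amt >= min_prices[item]]
        if not valid:
            continue
        top = max(amt for _, amt in valid)
        # Assumed tie-breaking: first-listed bidder with the highest bid.
        winner = next(b for b, amt in valid if amt == top)
        provisional[item] = (winner, top)
        # Minimum price for the next round: a small percentage above
        # the new provisional price.
        new_min_prices[item] = top * (1.0 + increment)
    return provisional, new_min_prices


provisional, next_min = run_round(
    {"A": 100.0, "B": 50.0},
    {"A": [("x", 100.0), ("y", 120.0)], "B": [("x", 40.0)]},
)
```

In this example the bid on item B is below the minimum price and is ignored, so the minimum price of B is unchanged. The auction would stop after a round in which `provisional` comes back empty.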

A variation of simultaneous ascending price
auctions for combinatorial auctions is auctions
with package bidding (see, e.g., Ausubel and
Milgrom 2002; Cramton 2013). A bidder will
either win a package of items he bid for or no
items, thereby eliminating the exposure problem.
For example, in simultaneous clock auctions with
package bidding, the price for each item increases
according to a fixed schedule (the clock), and bidders
report the packages of items they would prefer
to purchase for the given prices. The price for
a given item stops increasing when the number of
bidders for that item drops to zero or one, and the
clock phase of the auction is complete when the
number of bidders for every item is zero or one.
Following the clock phase, bidders can submit
additional bids for packages of items. With the
inputs from bidders acquired during the clock
phase and supplemental bid phase, the auctioneer
then runs a winner determination algorithm to
select a set of bids for non-overlapping packages
that maximizes the sum of the bids. This winner
determination problem is NP-hard, but is
computationally feasible using integer programming
or dynamic programming methods for moderate
numbers of items (perhaps up to 30). In addition,
the vector of payments charged to the winners
is determined by a two-step process. First, the
(generalized) Vickrey price for each bidder is
determined, which is defined to be the minimum
the bidder would have had to bid in order to
be a winner. Second, the vector of Vickrey
prices is projected onto the core of the reported
prices. The second step ensures that no coalition
consisting of a set of bidders and the seller can
achieve a higher sum of payoffs (calculated using
the bids received) for some different selection
of winners than the coalition received under the
outcome of the auction. While this is a promising
family of auctions, the projection to the core
introduces some incentive for bidders to deviate
from truthful reporting, and much remains to be
understood about such auctions.
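
To make the winner determination step concrete, the following Python sketch enumerates all subsets of bids and keeps the best set of non-overlapping packages. It is an exhaustive search for illustration only, not the integer programming or dynamic programming methods mentioned above; the function name and the bid format are assumptions, and any restriction on the number of packages a single bidder may win is left out.

```python
from itertools import combinations


def winner_determination(bids):
    """Select non-overlapping package bids that maximize the sum of bids.

    bids: list of (bidder, frozenset_of_items, amount).
    Exhaustive search over subsets of bids: exponential in len(bids),
    so usable only for very small examples.
    """
    best_value, best_bids = 0, []
    for r in range(1, len(bids) + 1):
        for subset in combinations(bids, r):
            taken = set()
            feasible = True
            for _, package, _ in subset:
                if taken & package:      # some item would be sold twice
                    feasible = False
                    break
                taken |= package
            if feasible:
                value = sum(amount for _, _, amount in subset)
                if value > best_value:
                    best_value, best_bids = value, list(subset)
    return best_value, best_bids


example_bids = [
    ("A", frozenset({"x", "y"}), 10),
    ("B", frozenset({"x"}), 6),
    ("C", frozenset({"y"}), 5),
    ("D", frozenset({"z"}), 3),
]
value, winners = winner_determination(example_bids)
```

Here the single-item bids of B and C together outbid the package bid of A, so the selected winners are B, C, and D.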

Summary and Future Directions 

Auction theory provides a good understanding
of the outcomes of the standard auctions for the
sale of a single item. Recently emerging auctions,
such as those for the generation and consumption
of electrical power and for the selection of online
advertisements, are challenging to analyze and
constitute a direction for future research. Much
remains to be understood in the theory of
combinatorial auctions, such as the degree of incentive
compatibility offered by core-projecting auctions.

Cross-References 

► Game Theory: Historical Overview


Abstract

Autotuning, or automatic tuning, means that the
controller is tuned automatically. Autotuning is
normally applied to PID controllers, but the
technique can also be used to initialize more advanced
controllers. The main approaches to autotuning
are based on step response analysis or frequency
response analysis obtained using relay feedback.
Autotuning has been well received in industry,
and today most distributed control systems have
some kind of autotuning technique.

Keywords 

Automatic tuning; Gain scheduling; PID control; 
Process control; Proportional-integral-derivative 
control; Relay feedback 

Background 

In the late 1970s and early 1980s, there was a
rapid change of controller implementation
in process control. The analog controllers were
replaced by computer-based controllers and
distributed control systems. The functionality of the
new controllers was often more or less a copy
of the old analog equipment, but new functions
that utilized the computer implementation were
gradually introduced. One of the first functions of
this type was autotuning. Autotuning is a method
to tune the controllers, normally PID controllers,
automatically.


What Is Autotuning? 

A PID controller in its basic form has the structure

$$u(t) = K\left(e(t) + \frac{1}{T_i}\int_0^t e(\tau)\,d\tau + T_d\,\frac{d}{dt}e(t)\right),$$

where $u$ is the controller output and $e = y_{sp} - y$ is
the control error, where $y_{sp}$ is the setpoint and $y$
is the process output. There are three parameters
in the controller: gain $K$, integral time $T_i$, and
derivative time $T_d$. These parameters have to be
set by the user. Their values depend on the
process dynamics and the specifications of the
control loop.
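
As an illustration of the roles of the three parameters, here is a minimal discrete-time sketch of the basic PID structure in Python. The discretization (rectangular integration and a backward difference for the derivative, with a fixed sampling period `dt`) is an assumption made for this example; practical controllers usually add derivative filtering, anti-windup, and setpoint weighting, none of which is shown.

```python
class PID:
    """Textbook PID: u = K * (e + (1/Ti) * integral(e) + Td * de/dt)."""

    def __init__(self, K, Ti, Td, dt):
        self.K, self.Ti, self.Td, self.dt = K, Ti, Td, dt
        self.integral = 0.0      # running approximation of the integral of e
        self.prev_error = None   # previous error, for the difference quotient

    def update(self, setpoint, y):
        e = setpoint - y                          # control error e = ysp - y
        self.integral += e * self.dt              # rectangular integration
        if self.prev_error is None:
            derivative = 0.0                      # no history on first sample
        else:
            derivative = (e - self.prev_error) / self.dt
        self.prev_error = e
        return self.K * (e + self.integral / self.Ti + self.Td * derivative)


pid = PID(K=2.0, Ti=4.0, Td=0.5, dt=1.0)
u1 = pid.update(setpoint=1.0, y=0.0)   # e = 1.0
u2 = pid.update(setpoint=1.0, y=0.5)   # e = 0.5
```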

A process control plant may have thousands 
of control loops, which means that maintaining 
high-performance controller tuning can be very 
time consuming. This was the main reason why 
procedures for automatic tuning were installed so 
rapidly in the computer-based controllers. 

When a controller is to be tuned, the following 
steps are normally performed by the user: 

1. To determine the process dynamics, a minor
disturbance is injected by changing the control
signal.

2. By studying the response in the process output,
the process dynamics can be determined,
i.e., a process model is derived.

3. The controller parameters are finally determined
based on the process model and the
specifications.

Autotuning means simply that these three
steps are performed automatically. Instead of
having a human perform these tasks, they
are performed automatically on demand from
the user. Ideally, the autotuning should be fully
automatic, which means that no information
about the process dynamics is required from
the user.

Automatic tuning can be performed in many
ways. The process disturbance can take different
forms, e.g., step changes or
some kind of oscillatory excitation. The model
obtained can be more or less accurate. There are
also many ways to tune the controller based on
the process model.

Here, we will discuss two main approaches 
for autotuning, namely, those that are based on 
step response analysis and those that are based 
on frequency response analysis. 


Methods Based on Step Response 
Analysis 

Most methods for automatic tuning of PID 
controllers are based on step response analysis. 
When the operator wishes to tune the controller, 
an open-loop step response experiment is 
performed. A process model is then obtained 
from the step response, and controller parameters 
are determined. This is usually done using simple 
formulas or look-up tables. 

The most common process model used for
PID controller tuning based on step response
experiments is the first-order plus dead-time model

$$G(s) = \frac{K_p}{1 + sT}\,e^{-sL},$$

where $K_p$ is the static gain, $T$ is the apparent
time constant, and $L$ is the apparent dead time.
These three parameters can be obtained from a
step response experiment according to Fig. 1.

Static gain $K_p$ is given by the ratio between
the steady-state change in process output and
the magnitude of the control signal step, $K_p = \Delta y / \Delta u$.
Dead time $L$ is determined from the
time elapsed from the step change to the
intersection of the largest slope of the process output
with the level of the process output before the step
change. Finally, time constant $T$ is the time at which
the process output has reached 63 % of its final
change, minus $L$.
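
The graphical procedure just described can be approximated numerically. The Python sketch below is one possible way to do so under simplifying assumptions: the data are noise-free, the first sample is taken as the level before the step, the last sample is taken as the final value, and the largest slope is estimated from consecutive samples. The function name `fopdt_from_step` and the simulated test process are hypothetical.

```python
import math


def fopdt_from_step(t, y, t_step, du):
    """Estimate (Kp, T, L) of a first-order plus dead-time model from a
    sampled open-loop step response (assumes noise-free, settled data)."""
    y0, dy = y[0], y[-1] - y[0]
    Kp = dy / du                                  # static gain
    sign = 1.0 if dy >= 0 else -1.0

    # Segment with the largest slope in the direction of the response.
    best_slope, k_best = 0.0, None
    for k in range(len(t) - 1):
        slope = (y[k + 1] - y[k]) / (t[k + 1] - t[k])
        if sign * slope > sign * best_slope:
            best_slope, k_best = slope, k
    if k_best is None:
        raise ValueError("no response detected")

    # Intersection of the steepest tangent with the pre-step level.
    t_cross = t[k_best] - (y[k_best] - y0) / best_slope
    L = t_cross - t_step                          # apparent dead time

    # Time at which 63 % of the total change is reached (interpolated).
    level = y0 + 0.63 * dy
    t63 = None
    for k in range(len(t) - 1):
        if sign * (y[k] - level) < 0 <= sign * (y[k + 1] - level):
            frac = (level - y[k]) / (y[k + 1] - y[k])
            t63 = t[k] + frac * (t[k + 1] - t[k])
            break
    if t63 is None:
        raise ValueError("63 % level never reached")
    T = (t63 - t_step) - L                        # apparent time constant
    return Kp, T, L


# Simulated unit step at t = 0 on a process with Kp = 2, T = 5, L = 1.
ts = [0.01 * k for k in range(6001)]
ys = [0.0 if tk < 1.0 else 2.0 * (1.0 - math.exp(-(tk - 1.0) / 5.0)) for tk in ts]
Kp_est, T_est, L_est = fopdt_from_step(ts, ys, t_step=0.0, du=1.0)
```

With measurement noise, the slope estimate in particular becomes unreliable, which is one reason the step amplitude matters in practice.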


Autotuning, Fig. 1 Determination of $K_p$, $L$, and $T$ from
a step response experiment

The greatest difficulty in carrying out tuning
automatically is in selecting the amplitude of the 
step. The user naturally wants the disturbance to 
be as small as possible so that the process is not 
disturbed more than necessary. On the other hand, 
it is easier to determine the process model if the 
disturbance is large. The result of this dilemma 
is usually that the user has to decide how large 
the step in the control signal should be. Another 
problem is to determine when the step response 
has reached its final value. 


Methods Based on Frequency 
Response Analysis 

Frequency-domain characteristics of the process
can be obtained by adding sinusoids to the
control signal, but without knowing the frequency
response of the process, the interesting frequency
range and acceptable amplitudes are not known.
A relevant frequency response can instead be
determined automatically from
experiments with relay feedback according to
Fig. 2. Notice that there is a switch that selects
either relay feedback or ordinary PID feedback.
When it is desired to tune the system, the PID
function is disconnected and the system is
connected to relay feedback control. Relay feedback
control is the same as on/off control, but with
on and off levels that are carefully chosen rather than
0 and 100 %. The relay feedback makes the
control loop oscillate. The period and the amplitude
of the oscillation are determined when steady-state
oscillation is obtained. This gives the ultimate
period and the ultimate gain. The parameters of a
PID controller can then be determined from these
values. The PID controller is then automatically
switched in again, and the control is executed
with the new PID parameters.

For large classes of processes, relay feedback
gives an oscillation with frequency close to the
ultimate frequency $\omega_u$, as shown in Fig. 3, where the
control signal is a square wave and the process
output is close to a sinusoid. The gain of the
transfer function at this frequency is also easy to
obtain from amplitude measurements.



Autotuning, Fig. 2 The relay autotuner. In the tuning 
mode the process is connected to relay feedback 


Describing function analysis can be used to
determine the process characteristics. The
describing function of a relay with hysteresis is

$$N(a) = \frac{4d}{\pi a}\left(\sqrt{1 - \left(\frac{\epsilon}{a}\right)^2} - i\,\frac{\epsilon}{a}\right),$$

where $d$ is the relay amplitude, $\epsilon$ the relay
hysteresis, and $a$ the amplitude of the input signal.
The negative inverse of this describing function is
a straight line parallel to the real axis; see Fig. 4.
The oscillation corresponds to the point where the
negative inverse describing function crosses the
Nyquist curve of the process, i.e., where

$$G(i\omega) = -\frac{1}{N(a)}.$$

Since $N(a)$ is known, $G(i\omega)$ can be determined
from the amplitude $a$ and the frequency $\omega$ of the
oscillation.
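
The relation between the measured oscillation and the process frequency response can be evaluated directly. The short Python sketch below computes $N(a)$ and the corresponding point $G(i\omega) = -1/N(a)$ from the oscillation amplitude $a$, the relay amplitude $d$, and the hysteresis $\epsilon$; the function names are illustrative. For a relay without hysteresis ($\epsilon = 0$) the point lies on the negative real axis and $1/|G(i\omega)| = 4d/(\pi a)$ is the estimate of the ultimate gain.

```python
import math


def relay_describing_function(a, d, eps=0.0):
    """Describing function N(a) of a relay with amplitude d and hysteresis eps."""
    if a <= eps:
        raise ValueError("oscillation amplitude must exceed the hysteresis")
    return (4.0 * d / (math.pi * a)) * complex(
        math.sqrt(1.0 - (eps / a) ** 2), -eps / a
    )


def process_point(a, d, eps=0.0):
    """Point G(i*omega) on the Nyquist curve implied by the relay oscillation."""
    return -1.0 / relay_describing_function(a, d, eps)


G_ideal = process_point(a=0.5, d=1.0)             # relay without hysteresis
Ku = 1.0 / abs(G_ideal)                           # ultimate gain estimate
G_hyst = process_point(a=0.5, d=1.0, eps=0.1)     # relay with hysteresis
```

For any amplitude $a$, the imaginary part of `G_hyst` equals $-\pi\epsilon/(4d)$, which is the straight line parallel to the real axis mentioned above.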

Notice that the relay experiment is easily
automated. There is often an initialization phase
where the noise level in the process output is
determined during a short period of time. The noise
level is used to determine the relay hysteresis
and a desired oscillation amplitude in the process
output. After this initialization phase, the relay
function is introduced. Since the amplitude of the
oscillation is proportional to the relay output, it is
easy to control it by adjusting the relay output.


Autotuning, Fig. 3 Process output $y$ and control signal $u$ during relay feedback

Autotuning, Fig. 4 The negative inverse describing
function of a relay with hysteresis, $-1/N(a)$, and a Nyquist
curve $G(i\omega)$


Different Adaptive Techniques

In the late 1970s, at the same time as autotuning
procedures were developed and implemented in
industrial controllers, there was a large academic
interest in adaptive control. These two concepts
are often mixed up with each other. Autotuning
is sometimes called tuning on demand. An
identification experiment is performed, controller
parameters are determined, and the controller
is then run with fixed parameters. An adaptive
controller is, however, a controller where the
controller parameters are adjusted online based on
information from routine data. Automatic tuning
and adaptive control have, however, one thing in
common, namely, that they are methods to adapt
the controller parameters to the actual process
dynamics. Therefore, they are both called
adaptive techniques.

There is a third adaptive technique, namely, 
gain scheduling. Gain scheduling is a system 
where controller parameters are changed 
depending on measured operating conditions. 
The scheduling variable can, for instance, be 
the measurement signal, controller output, or an 
external signal. For historical reasons the word 
gain scheduling is used even if other parameters 
like integral time or derivative time are changed. 
Gain scheduling is a very effective way of 
controlling systems whose dynamics change 
with the operating conditions. Automatic tuning 
has made it possible to generate gain schedules 
automatically. 
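
A gain schedule can be as simple as a table that maps ranges of the scheduling variable to controller parameter sets. The Python sketch below illustrates this idea with a piecewise-constant table; the class name, the breakpoints, and the parameter values are hypothetical and chosen only for illustration. In practice, each entry could be produced by running an autotuning experiment at the corresponding operating point.

```python
from bisect import bisect_right


class GainSchedule:
    """Piecewise-constant gain schedule over one scheduling variable."""

    def __init__(self, breakpoints, parameter_sets):
        # N sorted breakpoints split the range into N + 1 operating regions.
        if len(parameter_sets) != len(breakpoints) + 1:
            raise ValueError("need one parameter set per operating region")
        self.breakpoints = list(breakpoints)
        self.parameter_sets = list(parameter_sets)

    def lookup(self, scheduling_variable):
        """Return the (K, Ti, Td) set for the current operating region."""
        region = bisect_right(self.breakpoints, scheduling_variable)
        return self.parameter_sets[region]


# Hypothetical schedule: the scheduling variable is the controller output in %.
schedule = GainSchedule(
    breakpoints=[30.0, 70.0],
    parameter_sets=[(2.0, 10.0, 1.0), (1.2, 8.0, 0.8), (0.7, 6.0, 0.5)],
)
low = schedule.lookup(10.0)
mid = schedule.lookup(50.0)
high = schedule.lookup(90.0)
```

A real implementation would also need to handle the transition between regions smoothly, for instance by interpolation or hysteresis, which is not shown here.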

Although research on adaptive techniques has
almost exclusively focused on adaptation,
experience has shown that autotuning and gain
scheduling have much wider industrial
applicability. Figure 5 illustrates the appropriate use of
the different techniques.

Controller performance is the first issue to
consider. If requirements are modest, a controller
with constant parameters and conservative tuning
can be used. Other solutions should be considered
when higher performance is required.

Autotuning, Fig. 5 When to use different adaptive
techniques. Constant but unknown dynamics: autotuning,
constant controller parameters. Predictable changes in
dynamics: autotuning, gain scheduling, predictable parameter
changes. Unpredictable changes in dynamics: autotuning,
adaptation, unpredictable parameter changes


If the process dynamics are constant, a
controller with constant parameters should be used.
The parameters of the controller can be obtained
by autotuning.

If the process dynamics or the character
of the disturbances are changing, it is
useful to compensate for these changes by
changing the controller. If the variations
can be predicted from measured signals,
gain scheduling should be used since it is
simpler and gives superior and more robust
performance than continuous adaptation.
Typical examples are variations caused by
nonlinearities in the control loop. Autotuning
can be used to build up the gain schedules
automatically.

There are also cases where the variations in
process dynamics are not predictable. Typical
examples are changes due to unmeasurable
variations in raw material, wear, fouling, etc. These
variations cannot be handled by gain scheduling
but must be dealt with by adaptation. An
autotuning procedure is often used to initialize the
adaptive controller. It is then sometimes called
pre-tuning or initial tuning.

To summarize, autotuning is a key component 
in all adaptive techniques and a prerequisite for 
their use in practice. 


Industrial Products 

Commercial PID controllers with adaptive
techniques have been available since the
late 1970s, both in single-station controllers
and in distributed control systems.

Two important, but distinct, applications of 
PID autotuners are temperature controllers and 
process controllers. Temperature controllers are 
primarily designed for temperature control, 
whereas process controllers are supposed to 
work for a wide range of control loops in the 
process industry such as flow, pressure, level, 
temperature, and concentration control loops. 
Automatic tuning is easier to implement in 
temperature controllers, since most temperature 
control loops have several common features. 
This is the main reason why automatic tuning 
was introduced more rapidly in these controllers. 

Since the processes that are controlled with 
process controllers may have large differences 
in their dynamics, tuning becomes more difficult 
compared to the pure temperature control loops. 

Automatic tuning can also be performed by
external devices which are connected to the
control loop during the tuning phase. Since these
devices are supposed to work together with
controllers from different manufacturers, they must
be provided with quite a lot of information about
the controller structure and parameterization in
order to provide appropriate controller
parameters. Such information includes signal ranges,
controller structure (series or parallel form),
sampling rate, filter time constants, and units of
the different controller parameters (gain or
proportional band, minutes or seconds, time or
repeats/time).


Summary and Future Directions 

Most of the autotuning methods that are
available in industrial products today were
developed about 30 years ago, when computer-based
controllers started to appear. These autotuners
are often based on simple models and simple
tuning rules. With the computer power available
today, and the increased knowledge about PID
controller design, there is a potential for
improving the autotuners, and more efficient autotuners
will probably appear in industrial products quite
soon.


Abstract

In this article, we overview averaging algorithms
and consensus in the context of distributed
coordination and control of networked systems. The
two subjects are closely related but not
identical. Distributed consensus means that a team
of agents reaches an agreement on certain
variables of interest by interacting with their
neighbors. Distributed averaging aims at computing
the average of certain variables of interest among
multiple agents by local communication. Hence
averaging can be treated as a special case of
consensus: average consensus. For distributed
consensus, we introduce distributed algorithms
for agents with single-integrator, general linear,
and nonlinear dynamics. For distributed
averaging, we introduce static and dynamic averaging
algorithms. The former is useful for computing
the average of initial conditions (or constant
signals), while the latter is useful for computing the
average of time-varying signals. Future research
directions are also discussed.

Keywords 

Cooperative control; Coordination; Distributed 
control; Multi-agent systems; Networked systems 


Introduction 

In the area of control of networked systems, low
cost, high adaptivity and scalability, great
robustness, and easy maintenance are critical factors.
To meet these requirements, distributed coordination
and control algorithms that rely on only local
interaction between neighboring agents to achieve
collective group behavior are more favorable than
centralized ones. In this article, we overview
averaging algorithms and consensus in the
context of distributed coordination and control of
networked systems.

Distributed consensus means that a team of
agents reaches an agreement on certain variables
of interest by interacting with their neighbors.
A consensus algorithm is an update law that
drives the variables of interest of all agents in
the network to converge to a common value
(Jadbabaie et al. 2003; Olfati-Saber et al. 2007;
Ren and Beard 2008). Examples of the variables
of interest include a local representation of the
center and shape of a formation, the rendezvous
time, the length of a perimeter being monitored,
the direction of motion for a multi-vehicle swarm,
and the probability that a target has been
identified. Consensus algorithms have applications in
rendezvous, formation control, flocking, attitude
alignment, and sensor networks (Bai et al. 2011a;
Bullo et al. 2009; Mesbahi and Egerstedt 2010;
Qu 2009; Ren and Cao 2011). Distributed
averaging algorithms aim at computing the average
of certain variables of interest among multiple
agents by local communication. Distributed
averaging finds applications in distributed
computing, distributed signal processing, and distributed
optimization (Tsitsiklis et al. 1986). Hence the
variables of interest are dependent on the
applications (e.g., a sensor measurement or a network
quantity). Consensus and averaging algorithms
are closely connected and yet nonidentical. When
all agents are able to compute the average, they
essentially reach a consensus, the so-called
average consensus. On the other hand, when the
agents reach a consensus, the consensus value
might or might not be the average value.

Graph Theory Notations. Suppose that there
are $n$ agents in a network. A network topology
(equivalently, graph) $\mathcal{G}$ consisting of a node set
$\mathcal{V} = \{1, \ldots, n\}$ and an edge set $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ will
be used to model interaction (communication or
sensing) between the $n$ agents. An edge $(i, j)$ in
a directed graph denotes that agent $j$ can obtain
information from agent $i$, but not necessarily vice
versa. In contrast, an edge $(i, j)$ in an undirected
graph denotes that agents $i$ and $j$ can obtain
information from each other. Agent $j$ is a (in-)
neighbor of agent $i$ if $(j, i) \in \mathcal{E}$. Let $\mathcal{N}_i$ denote
the neighbor set of agent $i$. We assume that
$i \in \mathcal{N}_i$. A directed path is a sequence of edges in
a directed graph of the form $(i_1, i_2), (i_2, i_3), \ldots$,
where $i_j \in \mathcal{V}$. An undirected path in an
undirected graph is defined analogously. A directed
graph is strongly connected if there is a directed
path from every agent to every other agent. An
undirected graph is connected if there is an
undirected path between every pair of distinct agents.
A directed graph has a directed spanning tree if
there exists at least one agent that has directed
paths to all other agents. For example, Fig. 1
shows a directed graph that has a directed
spanning tree but is not strongly connected. The adjacency
matrix $\mathcal{A} = [a_{ij}] \in \mathbb{R}^{n \times n}$ associated with $\mathcal{G}$ is
defined such that $a_{ij}$ (the weight of edge $(j, i)$)
is positive if agent $j$ is a neighbor of agent $i$
while $a_{ij} = 0$ otherwise. The (nonsymmetric)
Laplacian matrix (Agaev and Chebotarev 2005)
$\mathcal{L} = [l_{ij}] \in \mathbb{R}^{n \times n}$ associated with $\mathcal{A}$ and hence $\mathcal{G}$
is defined as $l_{ii} = \sum_{j \ne i} a_{ij}$ and $l_{ij} = -a_{ij}$ for
all $i \ne j$. For an undirected graph, we assume
that $a_{ij} = a_{ji}$. A graph is balanced if for every
agent the total edge weight of its incoming links
is equal to the total edge weight of its outgoing
links ($\sum_{j=1}^{n} a_{ij} = \sum_{j=1}^{n} a_{ji}$ for all $i$).
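
The definitions of the Laplacian matrix and of a balanced graph translate directly into code. The following Python sketch uses plain nested lists, with `A[i][j]` holding the weight $a_{ij}$ and zero diagonal entries assumed; the function names and the two small example graphs are illustrative and are not the graph of Fig. 1.

```python
def laplacian(A):
    """Laplacian with l_ii = sum_{j != i} a_ij and l_ij = -a_ij for i != j."""
    n = len(A)
    L = [[-A[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        L[i][i] = sum(A[i][j] for j in range(n) if j != i)
    return L


def is_balanced(A, tol=1e-12):
    """True if, for every agent, incoming weight equals outgoing weight."""
    n = len(A)
    return all(
        abs(sum(A[i][j] for j in range(n)) - sum(A[j][i] for j in range(n))) <= tol
        for i in range(n)
    )


# Directed cycle on three agents (balanced) and a single directed edge (not).
A_cycle = [[0, 0, 1],
           [1, 0, 0],
           [0, 1, 0]]
A_edge = [[0, 0],
          [1, 0]]
L_cycle = laplacian(A_cycle)
```

By construction every row of the Laplacian sums to zero, which is why the vector of all ones is always a right eigenvector associated with the eigenvalue zero.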



Averaging Algorithms and Consensus, Fig. 1 A directed
graph that characterizes the interaction among five
agents, where $A_i$, $i = 1, \ldots, 5$, denotes agent $i$. An
arrow from agent $j$ to agent $i$ indicates that agent $i$
receives information from agent $j$. The directed graph
has a directed spanning tree but is not strongly connected.
Here both agents 1 and 2 have directed paths to all other
agents










Consensus 

Consensus has a long history in management
science, statistical physics, and distributed
computing and has recently attracted interest in distributed
control. In the area of distributed control
of networked systems, the term consensus
initially referred mainly to
a continuous-time version of a distributed
linear averaging algorithm, but its meaning has since been
broadened considerably. Problems related
to consensus include synchronization,
agreement, and rendezvous. The study of
consensus can be categorized in various manners.
For example, in terms of the final consensus
value, the agents could reach a consensus on
the average, a weighted average, the maximum
value, the minimum value, or a general function
of their initial conditions, or even a (changing)
state that serves as a reference. A consensus
algorithm could be linear or nonlinear. Consensus
algorithms can be designed for agents with linear
or nonlinear dynamics. As the agent dynamics
become more complicated, so do the algorithm
design and analysis. Numerous issues are also
involved in consensus such as network topologies
(fixed vs. switching, deterministic vs. random,
directed vs. undirected, asynchronous vs.
synchronous), time delay, quantization, optimality,
sampling effects, and convergence speed. For
example, in real applications, due to nonuniform
communication/sensing ranges or limited field of
view of sensors, the network topology could be
directed rather than undirected. Also due to
unreliable communication/sensing and limited
communication/sensing ranges, the network topology
could be switching rather than fixed.

Consensus for Agents with Single-Integrator Dynamics

We start with a fundamental consensus
algorithm for agents with single-integrator dynamics.
The results in this section follow from Jadbabaie
et al. (2003), Olfati-Saber et al. (2007), Ren and
Beard (2008), Moreau (2005), and Agaev and
Chebotarev (2000). Consider agents with
single-integrator dynamics

$$\dot{x}_i(t) = u_i(t), \quad i = 1, \ldots, n, \tag{1}$$

where $x_i$ is the state and $u_i$ is the control input.
A common consensus algorithm for (1) is

$$u_i(t) = -\sum_{j \in \mathcal{N}_i(t)} a_{ij}(t)\,[x_i(t) - x_j(t)], \tag{2}$$

where $\mathcal{N}_i(t)$ is the neighbor set of agent $i$ at time
$t$ and $a_{ij}(t)$ is the $(i, j)$ entry of the adjacency
matrix $\mathcal{A}$ of the graph $\mathcal{G}$ at time $t$. A consequence
of (2) is that the state $x_i(t)$ of agent $i$ is driven
toward the states of its neighbors or equivalently
toward the weighted average of its neighbors'
states. The closed-loop system of (1) using (2)
can be written in matrix form as

$$\dot{x}(t) = -\mathcal{L}(t)\,x(t), \tag{3}$$

where $x$ is a column stack vector of all $x_i$ and
$\mathcal{L}$ is the Laplacian matrix. Consensus is reached
if for all initial states, the agents' states
eventually become identical. That is, for all $x_i(0)$,
$|x_i(t) - x_j(t)|$ approaches zero eventually.

The properties of the Laplacian matrix $\mathcal{L}$ play
an important role in the analysis of the closed-loop
system (3). When the graph $\mathcal{G}$ (and hence
the associated Laplacian matrix $\mathcal{L}$) is fixed, (3)
can be analyzed by studying the eigenvalues and
eigenvectors of $\mathcal{L}$. Due to its special structure,
for any graph $\mathcal{G}$, the associated Laplacian
matrix $\mathcal{L}$ has at least one zero eigenvalue with an
associated right eigenvector $\mathbf{1}$ (column vector of
all ones) and all other eigenvalues have positive
real parts. To ensure consensus, it is equivalent
to ensure that $\mathcal{L}$ has a simple zero eigenvalue. It
can be shown that the following three statements
are equivalent: (i) the agents reach a consensus
exponentially for arbitrary initial states; (ii) the
graph $\mathcal{G}$ has a directed spanning tree; and (iii) the
Laplacian matrix $\mathcal{L}$ has a simple zero eigenvalue
with an associated right eigenvector $\mathbf{1}$ and all
other eigenvalues have positive real parts. When
consensus is reached, the final consensus value is
a weighted average of the initial states of those
agents that have directed paths to all other agents
(see Fig. 2 for an illustration).

Averaging Algorithms and Consensus, Fig. 2
Consensus for five agents using the algorithm (2) for
(1). Here the graph $\mathcal{G}$ is given by Fig. 1. The initial
states are chosen as $x_i(0) = 2i$, where
$i = 1, \ldots, 5$. Consensus is reached as $\mathcal{G}$ has a directed
spanning tree. The final consensus value is a
weighted average of the initial states of agents 1
and 2
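
The behavior of algorithm (2) on a fixed graph can be checked with a short simulation. The Python sketch below integrates the closed-loop system with a forward Euler step; the step size, the horizon, and the three-agent chain graph are assumptions made for this example (the edge weights of Fig. 1 are not reproduced here). In the chain, agent 1 is the only agent with directed paths to all others, so the states should converge to its initial value.

```python
def consensus_input(x, A):
    """Algorithm (2): u_i = -sum_j a_ij * (x_i - x_j), with A[i][j] = a_ij."""
    n = len(x)
    return [-sum(A[i][j] * (x[i] - x[j]) for j in range(n)) for i in range(n)]


def simulate(x0, A, dt=0.01, steps=3000):
    """Forward-Euler integration of the single-integrator dynamics (1)."""
    x = list(x0)
    for _ in range(steps):
        u = consensus_input(x, A)
        x = [xi + dt * ui for xi, ui in zip(x, u)]
    return x


# Hypothetical chain: agent 2 listens to agent 1, agent 3 listens to agent 2.
A_chain = [[0.0, 0.0, 0.0],
           [1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0]]
x_final = simulate([2.0, 4.0, 6.0], A_chain)
```

Agent 1 receives no information and therefore keeps its initial state 2, while agents 2 and 3 are driven toward it. This is a special case of the statement above in which the weighted average puts all weight on agent 1.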



When the graph $\mathcal{G}(t)$ is switching at time
instants $t_0, t_1, \ldots$, the solution to the closed-loop
system (3) is given by $x(t) = \Phi(t, 0)\,x(0)$, where
$\Phi(t, 0)$ is the transition matrix corresponding to
$-\mathcal{L}(t)$. Consensus is reached if $\Phi(t, 0)$ eventually
converges to a matrix with identical rows. Here
$\Phi(t, 0) = \Phi(t, t_k)\,\Phi(t_k, t_{k-1}) \cdots \Phi(t_1, 0)$, where
$\Phi(t_k, t_{k-1})$ is the transition matrix corresponding
to $-\mathcal{L}(t)$ at time interval $[t_{k-1}, t_k]$. It turns out that
each transition matrix is a row-stochastic matrix
with positive diagonal entries. A square matrix is
row stochastic if all its entries are nonnegative
and all of its row sums are one. The
consensus convergence can be analyzed by studying
the product of row-stochastic matrices. Another
analysis technique is a Lyapunov approach (e.g.,
$\max_i x_i - \min_i x_i$). It can be shown that the agents'
states reach a consensus if there exists an infinite
sequence of contiguous, uniformly bounded time
intervals, with the property that across each such
interval, the union of the graphs $\mathcal{G}(t)$ has a
directed spanning tree. That is, across each such
interval, there exists at least one agent that can
directly or indirectly influence all other agents. It
is also possible to achieve certain nice features by
designing nonlinear consensus algorithms of the
form $u_i(t) = \sum_{j \in \mathcal{N}_i(t)} a_{ij}(t)\,\psi[x_j(t) - x_i(t)]$,
where $\psi(\cdot)$ is a nonlinear function satisfying
certain properties. One example is a
continuous nondecreasing odd function. For example, a
saturation type function could be introduced to
account for actuator saturation and a signum type
function could be introduced to achieve
finite-time convergence.

As shown above, for single-integrator
dynamics, the consensus convergence is determined
entirely by the network topologies. The primary
reason is that the single-integrator dynamics are
internally stable. However, when more
complicated agent dynamics are involved, the
consensus algorithm design and analysis become more
complicated. On one hand, whether the graph is
undirected (respectively, switching) or not has
significant influence on the complexity of the
consensus analysis. On the other hand, not only
the network topology but also the agent dynamics
themselves and the parameters in the consensus
algorithm play important roles. Next we
introduce consensus for agents with general linear and
nonlinear dynamics.

Consensus for Agents with General Linear 
Dynamics 

In some circumstances, it is relevant to deal with
agents with general linear dynamics, which can
also be regarded as linearized models of certain
nonlinear dynamics. The results in this section
follow from Li et al. (2010). Consider agents with
general linear dynamics

$$\dot{x}_i = A x_i + B u_i, \quad y_i = C x_i, \tag{4}$$

where $x_i \in \mathbb{R}^m$, $u_i \in \mathbb{R}^p$, and $y_i \in \mathbb{R}^q$
are, respectively, the state, the control input, and
the output of agent $i$ and $A$, $B$, $C$ are constant
matrices with compatible dimensions.

When each agent has access to the relative
states between itself and its neighbors, a
distributed static consensus algorithm is designed
for (4) as

$$u_i = cK \sum_{j \in \mathcal{N}_i} a_{ij}\,(x_i - x_j), \tag{5}$$

where $c > 0$ is a coupling gain, $K \in \mathbb{R}^{p \times m}$
is the feedback gain matrix, and $\mathcal{N}_i$ and $a_{ij}$ are
defined as in (2). It can be shown that if the
graph $\mathcal{G}$ has a directed spanning tree, consensus
is reached using (5) for (4) if and only if all the
matrices $A + c\lambda_i(\mathcal{L})BK$, where $\lambda_i(\mathcal{L}) \ne 0$, are
Hurwitz. Here $\lambda_i(\mathcal{L})$ denotes the $i$th eigenvalue
of the Laplacian matrix $\mathcal{L}$. A necessary condition
for reaching a consensus is that the pair $(A, B)$ is
stabilizable. The consensus algorithm (5) can be
designed via two steps:

(a) Solve the linear matrix inequality $A^T P +
PA - 2BB^T < 0$ to get a positive-definite
solution $P$. Then let the feedback gain matrix
$K = -B^T P^{-1}$.

(b) Select the coupling strength $c$ larger than the
threshold value $1 / \min_{\lambda_i(\mathcal{L}) \ne 0} \mathrm{Re}[\lambda_i(\mathcal{L})]$, where
$\mathrm{Re}(\cdot)$ denotes the real part.

Note that here the threshold value depends on the eigenvalues of
the Laplacian matrix, which is in some sense global information.
To overcome such a limitation, it is possible to introduce
adaptive gains in the algorithm design. The gains could be updated
dynamically using local information.
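The Hurwitz condition on $A + c\lambda_i(\mathcal{L})BK$ can be checked numerically for a small case. The sketch below is only an illustration under stated assumptions, none of which come from the source: the agents are double integrators ($A = [[0,1],[0,0]]$, $B = [0,1]^T$), the graph is a directed cycle with unit weights (so its Laplacian eigenvalues are known in closed form), and the gain is $K = -[k_1, k_2]$ with hand-picked values rather than the output of steps (a) and (b). The function names are hypothetical.

```python
import cmath
import math


def directed_cycle_laplacian_eigs(n):
    """Laplacian eigenvalues of a directed cycle with unit edge weights.

    The Laplacian is I - P, where P is a cyclic permutation matrix, so the
    eigenvalues are 1 - exp(2*pi*1j*k/n) for k = 0, ..., n-1.
    """
    return [1 - cmath.exp(2j * math.pi * k / n) for k in range(n)]


def closed_loop_eigs(lam, c, k1, k2):
    """Eigenvalues of A + c*lam*B*K for a double integrator (assumed model).

    With A = [[0, 1], [0, 0]], B = [0, 1]^T and K = -[k1, k2], the matrix is
    [[0, 1], [-c*lam*k1, -c*lam*k2]], whose characteristic polynomial is
    s^2 + c*lam*k2*s + c*lam*k1.
    """
    a1 = c * lam * k2
    a0 = c * lam * k1
    disc = cmath.sqrt(a1 * a1 - 4 * a0)
    return [(-a1 + disc) / 2, (-a1 - disc) / 2]


def reaches_consensus(n, c, k1, k2, tol=1e-9):
    """Return True if A + c*lam_i*B*K is Hurwitz for every nonzero lam_i."""
    for lam in directed_cycle_laplacian_eigs(n):
        if abs(lam) < tol:  # skip the zero eigenvalue of the Laplacian
            continue
        if any(s.real >= -tol for s in closed_loop_eigs(lam, c, k1, k2)):
            return False
    return True
```

For instance, `reaches_consensus(3, 1.0, 1.0, 1.0)` tests a three-agent directed cycle with position and velocity gains both equal to one, while setting `k2 = 0.0` removes the velocity feedback and makes the condition fail.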

When the relative states between each agent and its neighbors are
not available, one is motivated to make use of the output
information and employ observer-based design to estimate the
relative states. An observer-type consensus algorithm is designed
for (4) as

$$\dot{v}_i = (A + BF)v_i + cL \sum_{j \in \mathcal{N}_i} a_{ij}\left[C(v_i - v_j) - (y_i - y_j)\right],$$

$$u_i = F v_i, \quad i = 1, \ldots, n, \tag{6}$$

where $v_i \in \mathbb{R}^m$ are the observer states,
$F \in \mathbb{R}^{p \times m}$ and $L \in \mathbb{R}^{m \times q}$
are the feedback gain matrices, and $c > 0$ is a coupling gain.
Here the algorithm (6) uses not only the relative outputs between
each agent and its neighbors but also its own and neighbors'
observer states. While relative outputs could be obtained through
local measurements, the neighbors' observer states can only be
obtained via communication. It can be shown that if the graph
$\mathcal{G}$ has a directed spanning tree, consensus is reached
using (6) for (4) if the matrices $A + BF$ and
$A + c\lambda_i(\mathcal{L})LC$, where
$\lambda_i(\mathcal{L}) \neq 0$, are Hurwitz. The observer-type
consensus algorithm (6) can be seen as an extension of the
single-system observer design to multi-agent systems. Here the
separation principle of the traditional observer design still
holds in the multi-agent setting in the sense that the feedback
gain matrices $F$ and $L$ can be designed separately.

Consensus for Agents with Nonlinear 
Dynamics 

In multi-agent applications, agents usually represent physical
vehicles whose dynamics are, for the most part, nonlinear.
Examples include Lagrangian systems for robotic manipulators and
autonomous robots, nonholonomic systems for unicycles, attitude
dynamics for rigid bodies, and general nonlinear systems. Similar
to the consensus algorithms for linear multi-agent systems, the
consensus algorithms used for these nonlinear agents are often
designed based on state differences between each agent and its
neighbors. But due to the inherent nonlinearity, the problem is
more complicated and additional terms might be required in the
algorithm design. The main techniques used in the consensus
analysis for nonlinear multi-agent systems are often
Lyapunov-based techniques (Lyapunov functions, passivity theory,
nonlinear contraction analysis, and potential functions).

Early results on consensus for agents with nonlinear dynamics
primarily focus on undirected graphs, whose symmetry facilitates
the construction of Lyapunov function candidates. Unfortunately,
the extension from an undirected graph to a directed one is
nontrivial. For example, the directed graph does not preserve the
passivity properties in general. Moreover, the directed graph
could cause difficulties in the design of (positive-definite)
Lyapunov functions. One approach is to integrate the nonnegative
left eigenvector of the Laplacian matrix associated with the zero
eigenvalue into the Lyapunov function, which is valid for strongly
connected graphs and has been applied in some problems. Another
approach is based on sliding mode control. The idea is to design a
sliding surface for reaching a consensus. Taking multiple
Lagrangian systems as an example, the agent dynamics are
represented by

$$M_i(q_i)\ddot{q}_i + C_i(q_i, \dot{q}_i)\dot{q}_i + g_i(q_i) = \tau_i, \quad i = 1, \ldots, n, \tag{7}$$

where $q_i \in \mathbb{R}^p$ is the vector of generalized
coordinates, $M_i(q_i) \in \mathbb{R}^{p \times p}$ is the
symmetric positive-definite inertia matrix,
$C_i(q_i, \dot{q}_i)\dot{q}_i \in \mathbb{R}^p$ is the vector of
Coriolis and centrifugal torques, $g_i(q_i) \in \mathbb{R}^p$ is
the vector of gravitational torque, and $\tau_i \in \mathbb{R}^p$
is the vector of control torque on the $i$th agent. The sliding
surface can be designed as

$$s_i = \dot{q}_i - \dot{q}_{ri} = \dot{q}_i + \alpha \sum_{j \in \mathcal{N}_i} a_{ij}(q_i - q_j), \tag{8}$$

where $\alpha$ is a positive scalar. Note that when $s_i = 0$, (8)
is actually the closed-loop system of a consensus algorithm for
single integrators. Then if the control torque $\tau_i$ can be
designed using only local information from neighbors to drive
$s_i$ to zero, consensus will be reached as $s_i$ can be treated
as a vanishing disturbance to a system that reaches consensus
exponentially.

It is generally very challenging to deal with general directed or
switching graphs for agents with dynamics more complicated than
single-integrator dynamics. In some cases, the challenge could be
overcome by introducing and updating additional auxiliary
variables (often observer-based algorithms) and exchanging these
variables between neighbors (see, e.g., (6)). In the algorithm
design, the agents might use not only relative physical states
between neighbors but also local auxiliary variables from
neighbors. While relative physical states could be obtained
through sensing, the exchange of auxiliary variables can only be
achieved by communication. Hence such generalization is obtained
at the price of increased communication between the neighboring
agents. Unlike some other algorithms, it is generally impossible
to implement the algorithm relying on purely relative sensing
between neighbors without the need for communication.


Averaging Algorithms 

Existing distributed averaging algorithms are primarily static
averaging algorithms based on linear local average iterations or
gossip iterations. These algorithms are capable of computing the
average of the initial conditions of all agents (or constant
signals) in a network. In particular, the linear
local-average-iteration algorithms are usually synchronous, where
at each iteration each agent repeatedly updates its state to be
the average of those of its neighbors. The gossip algorithms are
asynchronous, where at each iteration a random pair of agents are
selected to exchange their states and update them to be the
average of the two. Dynamic averaging algorithms are of
significance when there exist time-varying signals. The objective
is to compute the average of these time-varying signals in a
distributed manner.
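The gossip iteration described above can be sketched in a few lines. This is a minimal illustration, not the formulation of any cited paper: it assumes the random pair is drawn uniformly from a fixed list of undirected edges, and the function name and arguments are hypothetical.

```python
import random


def gossip_average(values, edges, num_events, seed=0):
    """Pairwise gossip: at each event one edge (i, j) is picked at random
    and both endpoints replace their states by the average of the two."""
    rng = random.Random(seed)
    x = list(values)
    for _ in range(num_events):
        i, j = rng.choice(edges)
        mean = 0.5 * (x[i] + x[j])
        x[i] = mean
        x[j] = mean
    return x


# Example: five agents on an undirected ring.
ring_edges = [(k, (k + 1) % 5) for k in range(5)]
initial_states = [1.0, 2.0, 3.0, 4.0, 10.0]
final_states = gossip_average(initial_states, ring_edges, num_events=2000)
```

Each pairwise update leaves the sum of the two states unchanged, so the network-wide sum is preserved and, on a connected graph, the states settle at the average of the initial values.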

Static Averaging 

Take a linear local-average-iteration algorithm as an example.
The results in this section follow from Tsitsiklis et al. (1986),
Jadbabaie et al. (2003), and Olfati-Saber et al. (2007). Let
$x_i$ be the information state of agent $i$. A linear
local-average-iteration-type algorithm has the form

$$x_i[k+1] = \sum_{j \in \mathcal{N}_i[k] \cup \{i\}} a_{ij}[k]\, x_j[k], \quad i = 1, \ldots, n, \tag{9}$$

where $k$ denotes a communication event, $\mathcal{N}_i[k]$
denotes the neighbor set of agent $i$, and $a_{ij}[k]$ is the
$(i, j)$ entry of the adjacency matrix $\mathcal{A}$ of the graph
$\mathcal{G}$ that represents the communication topology at time
$k$, with the additional assumption that $\mathcal{A}$ is row
stochastic and $a_{ii}[k] > 0$ for all $i = 1, \ldots, n$.
Intuitively, the information state of each agent is updated as the
weighted average of its current state and the current states of
its neighbors at each iteration. Note that an agent maintains its
current state if it does not exchange information with other
agents at that event instant. In fact, a discretized version of
the closed-loop system of (1) using (2) (with a sufficiently small
sampling period) takes the form of (9). The objective here is for
all agents to compute the average of their initial states by
communicating with only their neighbors. That is, each $x_i[k]$
approaches $\frac{1}{n}\sum_{j=1}^{n} x_j[0]$ eventually. To
compute the average of multiple constant signals $c_i$, we could
simply set $x_i[0] = c_i$. The algorithm (9) can be written in
matrix form as $x[k+1] = \mathcal{A}[k]x[k]$, where $x$ is a
column stack vector of all $x_i$ and
$\mathcal{A}[k] = [a_{ij}[k]]$ is a row-stochastic matrix.

Averaging Algorithms and Consensus, Fig. 3 Illustration of
distributed averaging of multiple (time-varying) signals. Here
$A_i$ denotes agent $i$ and $r_i(t)$ denotes a (time-varying)
signal associated with agent $i$. Each agent needs to compute the
average of all agents' signals but can communicate with only its
neighbors

When the graph $\mathcal{G}$ (and hence the matrix $\mathcal{A}$)
is fixed, the convergence of the algorithm (9) can be analyzed by
studying the eigenvalues and eigenvectors of the row-stochastic
matrix $\mathcal{A}$. Because all diagonal entries of
$\mathcal{A}$ are positive, Gershgorin's disc theorem implies that
all eigenvalues of $\mathcal{A}$ are either within the open unit
disk or at one. When the graph $\mathcal{G}$ is strongly
connected, the Perron-Frobenius theorem implies that $\mathcal{A}$
has a simple eigenvalue at one with an associated right
eigenvector $\mathbf{1}$ and an associated positive left
eigenvector. Hence when $\mathcal{G}$ is strongly connected, it
turns out that
$\lim_{k \to \infty} \mathcal{A}^k = \mathbf{1}\nu^T$, where
$\nu^T$ is a positive left eigenvector of $\mathcal{A}$ associated
with the eigenvalue one and satisfies $\nu^T \mathbf{1} = 1$. Note
that $x[k] = \mathcal{A}^k x[0]$. Hence, each agent's state
$x_i[k]$ approaches $\nu^T x[0]$ eventually. If it can be further
ensured that $\nu = \frac{1}{n}\mathbf{1}$, then averaging is
achieved. It can be shown that the agents' states converge to the
average of their initial values if and only if the directed graph
$\mathcal{G}$ is both strongly connected and balanced or the
undirected graph $\mathcal{G}$ is connected. When the graph is
switching, the convergence of the algorithm (9) can be analyzed by
studying the product of row-stochastic matrices. Such analysis is
closely related to Markov chains. It can be shown that the agents'
states converge to the average of their initial values if the
directed graph $\mathcal{G}$ is balanced at each communication
event and strongly connected in a joint manner or the undirected
graph $\mathcal{G}$ is jointly connected.
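The difference between reaching consensus and reaching the average can be seen on two small fixed graphs. The sketch below iterates (9) with a fixed row-stochastic matrix; the specific weight matrices and helper names are illustrative assumptions. The first matrix is symmetric (an undirected ring), so the limit is the average; the second corresponds to a strongly connected but unbalanced directed graph, for which the left eigenvector works out by hand to $\nu = (0.2, 0.4, 0.4)$, so the limit is $\nu^T x[0]$ instead.

```python
def local_average_step(x, A):
    """One synchronous iteration of x[k+1] = A x[k] with A row stochastic."""
    n = len(x)
    return [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]


def iterate(x0, A, num_steps):
    """Apply the local-average iteration num_steps times."""
    x = list(x0)
    for _ in range(num_steps):
        x = local_average_step(x, A)
    return x


def ring_weights(n, self_weight=0.5):
    """Symmetric row-stochastic weights for an undirected ring (n >= 3):
    a_ii = self_weight, the remainder split equally between two neighbors."""
    w = (1.0 - self_weight) / 2.0
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = self_weight
        A[i][(i - 1) % n] += w
        A[i][(i + 1) % n] += w
    return A


# Undirected ring: states converge to the average of the initial values.
x_ring = iterate([4.0, 0.0, 0.0, 0.0], ring_weights(4), 200)

# Strongly connected but unbalanced directed graph (assumed example):
# consensus is reached, but on nu^T x[0] with nu = (0.2, 0.4, 0.4).
A_unbalanced = [
    [0.50, 0.50, 0.00],
    [0.00, 0.50, 0.50],
    [0.25, 0.25, 0.50],
]
x_unbalanced = iterate([3.0, 0.0, 0.0], A_unbalanced, 200)
```

In the second case the average of the initial states is 1, yet all agents agree on 0.6, which is why balance is needed in addition to strong connectivity.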

Dynamic Averaging 

In a more general setting, there exist $n$ time-varying signals,
$r_i(t)$, $i = 1, \ldots, n$, each of which could be an external
signal or an output from a dynamical system. Here $r_i(t)$ is
available to only agent $i$ and each agent can exchange
information with only its neighbors. Each agent maintains a local
estimate, denoted by $x_i(t)$, of the average of all the signals
$\bar{r}(t) = \frac{1}{n}\sum_{k=1}^{n} r_k(t)$. The objective is
to design a distributed algorithm for agent $i$ based on $r_i(t)$
and $x_j(t)$, $j \in \mathcal{N}_i(t)$, such that all agents will
finally track the average that changes over time. That is,
$\|x_i(t) - \bar{r}(t)\|$, $i = 1, \ldots, n$, approaches zero
eventually. Such a dynamic averaging idea finds applications in
distributed sensor fusion with time-varying measurements (Bai
et al. 2011b; Spanos and Murray 2005) and distributed estimation
and tracking (Yang et al. 2008).

Figure 3 illustrates the dynamic averaging idea. If there exists
a central station that can always access the signals of all
agents, then it is trivial to compute the average. Unfortunately,
in a distributed context, where there does not exist a central
station and each agent can only communicate with its local
neighbors, it is challenging for each agent to compute the average
that changes over time. While each agent could compute the average
of its own and local neighbors' signals, this will not be the
average of all signals.

When the signal $r_i(t)$ can be arbitrary but its derivative
exists and is bounded almost everywhere, a distributed nonlinear
nonsmooth algorithm is designed in Chen et al. (2012) as

$$\dot{\phi}_i(t) = \alpha \sum_{j \in \mathcal{N}_i} \mathrm{sgn}[x_j(t) - x_i(t)],$$

$$x_i(t) = \phi_i(t) + r_i(t), \quad i = 1, \ldots, n, \tag{10}$$

where $\alpha$ is a positive scalar, $\mathcal{N}_i$ denotes the
neighbor set of agent $i$, $\mathrm{sgn}(\cdot)$ is the signum
function defined componentwise, $\phi_i$ is the internal state of
the estimator with $\phi_i(0) = 0$, and $x_i$ is the estimate of
the average $\bar{r}(t)$. Due to the existence of the
discontinuous signum function, the solution of (10) is understood
in the Filippov sense (Cortes 2008).

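The idea behind the algorithm (10) is as follows. First, (10) is
designed to ensure that
$\sum_{i=1}^{n} x_i(t) = \sum_{i=1}^{n} r_i(t)$ holds for all
time. Note that
$\sum_{i=1}^{n} x_i(t) = \sum_{i=1}^{n} \phi_i(t) + \sum_{i=1}^{n} r_i(t)$.
When the graph $\mathcal{G}$ is undirected and $\phi_i(0) = 0$, it
follows that

$$\sum_{i=1}^{n} \phi_i(t) = \sum_{i=1}^{n} \phi_i(0) + \alpha \sum_{i=1}^{n} \sum_{j \in \mathcal{N}_i} \int_0^t \mathrm{sgn}[x_j(\tau) - x_i(\tau)]\,d\tau = 0,$$

because on an undirected graph each pair of neighbors contributes
two signum terms of opposite sign. As a result,
$\sum_{i=1}^{n} x_i(t) = \sum_{i=1}^{n} r_i(t)$ holds for all
time. Second, when $\mathcal{G}$ is connected, if the algorithm
(10) guarantees that all estimates $x_i$ approach the same value
in finite time, then it can be guaranteed that each estimate
approaches the average of all signals in finite time.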
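Both properties can be observed in a crude simulation of (10). The sketch below uses a forward-Euler discretization, which is an assumption of this example and not part of the cited design: the true solution is defined in the Filippov sense, and a fixed step size makes the estimates chatter in a small band around the average instead of reaching it exactly. The scalar signals, the three-agent path graph, the gain, and all names are illustrative.

```python
import math


def sgn(v):
    """Signum function returning -1, 0, or 1."""
    return (v > 0) - (v < 0)


def simulate_dynamic_averaging(signals, neighbors, alpha, h, steps):
    """Forward-Euler simulation of phi_i' = alpha * sum_j sgn(x_j - x_i),
    x_i = phi_i + r_i, with phi_i(0) = 0.

    Returns the final estimates, the final true average, and the largest
    observed gap |sum_i x_i - sum_i r_i| over the run.
    """
    n = len(signals)
    phi = [0.0] * n
    t = 0.0
    max_sum_gap = 0.0
    for _ in range(steps):
        r = [s(t) for s in signals]
        x = [phi[i] + r[i] for i in range(n)]
        max_sum_gap = max(max_sum_gap, abs(sum(x) - sum(r)))
        phi = [
            phi[i] + h * alpha * sum(sgn(x[j] - x[i]) for j in neighbors[i])
            for i in range(n)
        ]
        t += h
    r = [s(t) for s in signals]
    x = [phi[i] + r[i] for i in range(n)]
    return x, sum(r) / n, max_sum_gap


# Three agents on an undirected path graph 0 - 1 - 2 (assumed example).
signals = [lambda t, k=k: k + math.sin(t + k) for k in range(3)]
neighbors = [[1], [0, 2], [1]]
estimates, average, sum_gap = simulate_dynamic_averaging(
    signals, neighbors, alpha=5.0, h=1e-3, steps=5000
)
```

Here `alpha` is chosen larger than the bound on the signal derivatives, in the spirit of the bounded-derivative assumption above; the precise gain condition should be taken from Chen et al. (2012).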


Summary and Future Research 
Directions 

Averaging algorithms and consensus play an 
important role in distributed control of networked 
systems. While there is significant progress 
in this direction, there are still numerous 
open problems. For example, it is challenging 
to achieve averaging when the graph is not 
balanced. It is generally not clear how to deal 
with a general directed or switching graph for 
nonlinear agents or nonlinear algorithms when 
the algorithms are based on only interagent 
physical state coupling without the need for 
communicating additional auxiliary variables 
between neighbors. The study of consensus 
for multiple underactuated agents remains 
a challenge. Furthermore, when the agents’ 
dynamics are heterogeneous, it is challenging 
to design consensus algorithms. In addition, in 
the existing study, it is often assumed that the 
agents are cooperative. When there exist faulty 
or malicious agents, the problem becomes more 
involved. 





Cross-References 

► Distributed Optimization 

► Dynamic Graphs, Connectivity of 

► Flocking in Networked Systems 

► Graphs for Modeling Networked Interactions 

► Networked Systems 

► Oscillator Synchronization 

► Vehicular Chains 

Bibliography 

Agaev R, Chebotarev P (2000) The matrix of maximum
out forests of a digraph and its applications. Autom
Remote Control 61(9):1424-1450

Agaev R, Chebotarev P (2005) On the spectra of
nonsymmetric Laplacian matrices. Linear Algebra Appl
399:157-178

Bai H, Arcak M, Wen J (2011a) Cooperative control
design: a systematic, passivity-based approach. Springer,
New York

Bai H, Freeman RA, Lynch KM (2011b) Distributed
Kalman filtering using the internal model average
consensus estimator. In: Proceedings of the
American control conference, San Francisco,
pp 1500-1505

Bullo F, Cortes J, Martinez S (2009) Distributed
control of robotic networks. Princeton University Press,
Princeton

Chen F, Cao Y, Ren W (2012) Distributed average
tracking of multiple time-varying reference signals
with bounded derivatives. IEEE Trans Autom Control
57(12):3169-3174

Cortes J (2008) Discontinuous dynamical systems. IEEE
Control Syst Mag 28(3):36-73

Jadbabaie A, Lin J, Morse AS (2003) Coordination of
groups of mobile autonomous agents using nearest
neighbor rules. IEEE Trans Autom Control
48(6):988-1001

Li Z, Duan Z, Chen G, Huang L (2010) Consensus of
multiagent systems and synchronization of complex
networks: a unified viewpoint. IEEE Trans Circuits
Syst I Regul Pap 57(1):213-224

Mesbahi M, Egerstedt M (2010) Graph theoretic methods
for multiagent networks. Princeton University Press,
Princeton

Moreau L (2005) Stability of multi-agent systems with
time-dependent communication links. IEEE Trans
Autom Control 50(2):169-182

Olfati-Saber R, Fax JA, Murray RM (2007) Consensus
and cooperation in networked multi-agent systems.
Proc IEEE 95(1):215-233

Qu Z (2009) Cooperative control of dynamical
systems: applications to autonomous vehicles. Springer,
London

Ren W, Beard RW (2008) Distributed consensus in
multi-vehicle cooperative control. Springer, London

Ren W, Cao Y (2011) Distributed coordination of
multi-agent networks. Springer, London

Spanos DP, Murray RM (2005) Distributed sensor fusion
using dynamic consensus. In: Proceedings of the IFAC
world congress, Prague

Tsitsiklis JN, Bertsekas DP, Athans M (1986) Distributed
asynchronous deterministic and stochastic gradient
optimization algorithms. IEEE Trans Autom Control
31(9):803-812

Yang P, Freeman RA, Lynch KM (2008) Multi-agent
coordination by decentralized estimation and control.
IEEE Trans Autom Control 53(11):2480-2496



B 


Backward Stochastic Differential
Equations and Related Control Problems

Synonyms

BSDE 

Abstract 

A conditional expectation of the form
$Y_t = E[\xi + \int_t^T f_s\,ds \mid \mathcal{F}_t]$ is regarded
as a simple and typical example of a backward stochastic
differential equation (abbreviated as BSDE). BSDEs are widely
applied to formulate and solve problems related to stochastic
optimal control, stochastic games, and stochastic valuation.

Keywords 

Brownian motion; Feynman-Kac formula; Lipschitz
condition; Optimal stopping

Definition 

A typical real valued backward stochastic differential equation
defined on a time interval $[0, T]$ and driven by a
$d$-dimensional Brownian motion $B$ is

$$\begin{cases} dY_t = -f(t, Y_t, Z_t)\,dt + Z_t\,dB_t, \\ Y_T = \xi, \end{cases}$$

or its integral form

$$Y_t = \xi + \int_t^T f(s, \omega, Y_s, Z_s)\,ds - \int_t^T Z_s\,dB_s, \tag{1}$$

where $\xi$ is a given random variable depending on the
(canonical) Brownian path $B_t(\omega) = \omega(t)$ on $[0, T]$,
$f(t, \omega, y, z)$ is a given function of the time $t$, the
Brownian path $\omega$ on $[0, t]$, and the pair of variables
$(y, z) \in \mathbb{R}^m \times \mathbb{R}^{m \times d}$. A
solution of this BSDE is a pair of stochastic processes
$(Y_t, Z_t)$ solving the above equation on $[0, T]$ and satisfying
the following constraint: for each $t$, the value of
$Y_t(\omega)$, $Z_t(\omega)$ depends only on the Brownian path
$\omega$ on $[0, t]$. Notice that, because of this constraint, the
extra freedom $Z_t$ is needed. For simplicity we set $d = m = 1$.

Often square-integrable conditions for $\xi$ and $f$ and a
Lipschitz condition for $f$ with respect to $(y, z)$ are assumed,
under which there exists a unique square-integrable solution
$(Y_t, Z_t)$ on $[0, T]$ (existence and uniqueness theorem of
BSDE). We can also consider a multidimensional process $Y$ and/or
a multidimensional Brownian motion $B$, $L^p$-integrable
conditions ($p > 1$) for $\xi$ and $f$, as well as local Lipschitz
conditions of $f$ with respect to $(y, z)$. If $Y_t$ is real
valued, we often call the equation a real valued BSDE.

We compare this BSDE with the classical stochastic differential
equation (SDE):

$$dX_s = \sigma(X_s)\,dB_s + b(X_s)\,ds$$

with given initial condition $X_s|_{s=0} = x \in \mathbb{R}^n$.
Its integral form is

$$X_t(\omega) = x + \int_0^t \sigma(X_s(\omega))\,dB_s(\omega) + \int_0^t b(X_s(\omega))\,ds. \tag{2}$$

Linear backward stochastic differential equations were first
introduced by Bismut (1973) in stochastic optimal control problems
to solve the adjoint equation in the stochastic maximum principle
of Pontryagin's type. The above existence and uniqueness theorem
was obtained by Pardoux and Peng (1990). In economics, this type
of one-dimensional BSDE was also independently derived by Duffie
and Epstein (1992). The comparison theorem of BSDE was obtained in
Peng (1992) and improved in El Karoui et al. (1997a). The
nonlinear Feynman-Kac formula was obtained in Peng (1991, 1992)
and improved in Pardoux and Peng (1992). BSDE is applied as a
nonlinear Black-Scholes option pricing formula in finance; this
formulation was given in El Karoui et al. (1997b). We refer to a
recent survey in Peng (2010) for more details.


Hedging and Risk Measuring 
in Finance 

Let us consider the following hedging problem in a financial
market with a typical model of continuous time asset price: the
basic securities consist of two assets, a riskless one called
bond, and a risky security called stock. Their prices are governed
by $dP_t^0 = P_t^0 r\,dt$, for the bond, and

$$dP_t = P_t[b\,dt + \sigma\,dB_t], \quad \text{for the stock}.$$

Here we only consider the situation where the volatility rate
$\sigma > 0$. The case of multidimensional stocks with degenerate
volatility matrix $\sigma$ can be treated by constrained BSDE.
Consider a small investor whose investment behavior cannot affect
market prices and who invests at time $t \in [0, T]$ the amount
$\pi_t$ of his or her wealth $Y_t$ in the security and $\pi_t^0$
in the bond, thus $Y_t = \pi_t^0 + \pi_t$. If the investment
strategy is self-financing, then we have
$dY_t = \pi_t^0\,dP_t^0/P_t^0 + \pi_t\,dP_t/P_t$, thus

$$dY_t = (rY_t + \pi_t\sigma\theta)\,dt + \pi_t\sigma\,dB_t,$$

where $\theta = \sigma^{-1}(b - r)$. A strategy
$(Y_t, \pi_t)_{t \in [0,T]}$ is said to be feasible if
$Y_t \geq 0$, $t \in [0, T]$. A European path-dependent contingent
claim settled at time $T$ is a given nonnegative function of path
$\xi = \xi((P_t)_{t \in [0,T]})$. A feasible strategy $(Y, \pi)$
is called a hedging strategy against a contingent claim $\xi$ at
the maturity $T$ if it satisfies

$$dY_t = (rY_t + \pi_t\sigma\theta)\,dt + \pi_t\sigma\,dB_t, \quad Y_T = \xi.$$

This problem can be regarded as finding a stochastic control
$\pi$ and an initial condition $Y_0$ such that the final state
replicates the contingent claim $\xi$, i.e., $Y_T = \xi$. This
type of replication is also called "exact controllability" in
terms of stochastic control (see Peng 2005 for more general
results).
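The hedging equation above is a BSDE of the form (1) with $Z_t = \pi_t\sigma$ and generator $f(t, y, z) = -(ry + \theta z)$, so $Y_0$ can be approximated by a backward recursion. The sketch below is one simple possibility and not a method taken from the source: it replaces the Brownian motion by a symmetric random walk with steps $\pm\sqrt{\Delta t}$, uses an explicit one-step scheme, and handles only claims that depend on $B_T$ (here a call option with strike `K`, an assumed example). All function names are hypothetical. For this claim, $Y_0$ should be close to the classical Black-Scholes price, which the code evaluates for comparison.

```python
import math


def solve_bsde_binomial(terminal, f, T, N):
    """Explicit backward scheme for Y_t = xi + int f ds - int Z dB on a
    binomial random walk with steps +/- sqrt(dt) approximating B.

    terminal(b) gives xi as a function of B_T = b; f(t, y, z) is the
    generator. Returns the approximation of Y_0.
    """
    dt = T / N
    sq = math.sqrt(dt)
    # Node k at the final step corresponds to B_T = (2k - N) * sqrt(dt).
    Y = [terminal((2 * k - N) * sq) for k in range(N + 1)]
    for i in range(N - 1, -1, -1):
        t = i * dt
        new = []
        for k in range(i + 1):
            up, down = Y[k + 1], Y[k]
            mean = 0.5 * (up + down)        # conditional expectation
            z = (up - down) / (2.0 * sq)    # discrete estimate of Z
            new.append(mean + f(t, mean, z) * dt)
        Y = new
    return Y[0]


def replication_price(P0, K, r, b, sigma, T, N):
    """Y_0 of the hedging BSDE for the claim xi = max(P_T - K, 0)."""
    theta = (b - r) / sigma

    def terminal(bT):
        PT = P0 * math.exp((b - 0.5 * sigma ** 2) * T + sigma * bT)
        return max(PT - K, 0.0)

    def f(t, y, z):
        return -(r * y + theta * z)

    return solve_bsde_binomial(terminal, f, T, N)


def black_scholes_call(P0, K, r, sigma, T):
    """Closed-form Black-Scholes call price, used only as a reference."""
    d1 = (math.log(P0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)

    def Phi(v):
        return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

    return P0 * Phi(d1) - K * math.exp(-r * T) * Phi(d2)
```

Note that the stock drift `b` enters both the terminal value and, through `theta`, the generator, and the two effects offset each other in the limit: the replication cost does not depend on `b`.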

Observe that $(Y, \pi\sigma)$ is the solution of the above BSDE.
A strategy is called a superhedging strategy if there exists an
increasing process $K_t$, often called an accumulated consumption
process, such that

$$dY_t = (rY_t + \pi_t\sigma\theta)\,dt + \pi_t\sigma\,dB_t - dK_t, \quad Y_T = \xi.$$

This type of strategy is often applied in a constrained market in
which a certain constraint $(Y_t, \pi_t) \in \Gamma$ is imposed.
In fact a real market has many frictions and constraints. An
example is the common case where the interest rate $R$ for
borrowing money is higher than the bond rate $r$. The above
equation for the hedging strategy becomes

$$dY_t = [rY_t + \pi_t\sigma\theta - (R - r)(\pi_t - Y_t)^+]\,dt + \pi_t\sigma\,dB_t, \quad Y_T = \xi,$$

where $[\alpha]^+ = \max\{\alpha, 0\}$. A short selling constraint
$\pi_t \geq 0$ is also a typical requirement in markets. The
method of constrained BSDE can be applied to this type of
problems. BSDE theory provides powerful tools for the robust
pricing and risk measures for contingent claims (see El Karoui
et al. 1997a). For the dynamic risk measure under Brownian
filtration, see Rosazza Gianin (2006), Peng (2004), Barrieu and
El Karoui (2005), Hu et al. (2005), and Delbaen et al. (2010).


Comparison Theorem

The comparison theorem, for a real valued BSDE, tells us that, if
$(Y_t, Z_t)$ and $(Y'_t, Z'_t)$ are two solutions of BSDE (1) with
terminal conditions $Y_T = \xi$, $Y'_T = \xi'$ such that
$\xi(\omega) \geq \xi'(\omega)$, $\omega \in \Omega$, then one has
$Y_t \geq Y'_t$. This theorem holds if $f$ and $\xi$, $\xi'$
satisfy the abovementioned $L^2$-integrability condition and $f$
is a Lipschitz function in $(y, z)$. This theorem plays the same
important role as the maximum principle in PDE theory. The theorem
also has several very interesting generalizations (see Buckdahn
et al. 2000).

Stochastic Optimization and Two-Person
Zero-Sum Stochastic Games

An important point of view is to regard an expectation value as a
solution of a special type of BSDE. Consider an optimal control
problem

$$\min_u J(u): \quad J(u) = E\left[\int_0^T l(X_s, u_s)\,ds + h(X_T)\right].$$

Here the state process $X$ is controlled by the control process
$u_t$ which is valued in a control (compact) domain $U$ through
the following $d$-dimensional SDE

$$dX_s = b(X_s, u_s)\,ds + \sigma(X_s)\,dB_s$$

defined in a Wiener probability space $(\Omega, \mathcal{F}, P)$
with the Brownian motion $B_t(\omega) = \omega(t)$ which is the
canonical process. Here we only discuss the case
$\sigma \equiv I_d$ for simplicity. Observe that in fact the
expected value $J(u)$ is $Y_0^u = E[Y_0^u]$, where $Y_t^u$ solves
the BSDE

$$Y_t^u = h(X_T) + \int_t^T l(X_s, u_s)\,ds - \int_t^T Z_s^u\,dB_s.$$

From Girsanov transformation, under the probability measure
$\tilde{P}$ defined by

$$\left.\frac{d\tilde{P}}{dP}\right|_T = \exp\left\{-\int_0^T b(X_s, u_s)\,dB_s - \frac{1}{2}\int_0^T |b(X_s, u_s)|^2\,ds\right\},$$

$X_t$ is a Brownian motion, and the above BSDE is changed to

$$Y_t^u = h(X_T) + \int_t^T \left[l(X_s, u_s) + \langle Z_s^u, b(X_s, u_s)\rangle\right]ds - \int_t^T Z_s^u\,dX_s,$$

where $\langle\cdot,\cdot\rangle$ is the Euclidean scalar product
in $\mathbb{R}^d$. Notice that $P$ and $\tilde{P}$ are absolutely
continuous with each other. Compare this BSDE with the following
one:

$$Y_t = h(X_T) + \int_t^T H(X_s, Z_s)\,ds - \int_t^T Z_s\,dX_s, \tag{3}$$

where $H(x, z) := \inf_{u \in U}\{l(x, u) + \langle z, b(x, u)\rangle\}$.
It is a direct consequence of the comparison theorem of BSDE that
$Y_0 \leq Y_0^u = J(u)$, for any admissible control $u_t$.
Moreover, one can find a feedback control $\hat{u}$ such that
$Y_0 = J(\hat{u})$.

The above BSDE method has been introduced to solve the following
two-person zero-sum game (Hamadène and Lepeltier 1995):

$$\max_v \min_u J(u, v), \quad J(u, v) = E\left[\int_0^T l(X_s, u_s, v_s)\,ds + h(X_T)\right]$$

with

$$dX_s = b(X_s, u_s, v_s)\,ds + dB_s,$$

where $(u_s, v_s)$ is formulated as above with compact control
domains $u_s \in U$ and $v_s \in V$. In this case the equilibrium
of the game exists if the following Isaacs condition is satisfied:

$$H(x, z) := \max_{v \in V} \inf_{u \in U} \{l(x, u, v) + \langle z, b(x, u, v)\rangle\} = \inf_{u \in U} \max_{v \in V} \{l(x, u, v) + \langle z, b(x, u, v)\rangle\},$$

and the equilibrium is also obtained through a BSDE (3) defined
above.

Nonlinear Feynman-Kac Formula 

A very interesting situation is when $f = g(X_t, y, z)$ and
$Y_T = \varphi(X_T)$ in BSDE (1). In this case we have the
following relation, called "nonlinear Feynman-Kac formula,"

$$Y_t = u(t, X_t), \quad Z_t = \sigma^T(X_t)\nabla u(t, X_t),$$

where $u = u(t, x)$ is the solution of the following quasilinear
parabolic PDE:

$$\partial_t u + \mathcal{L}u + g(x, u, \sigma^T \nabla u) = 0, \tag{4}$$

$$u(T, x) = \varphi(x), \tag{5}$$

where $\mathcal{L}$ is the following, possibly degenerate,
elliptic operator:

$$\mathcal{L}\varphi(x) = \frac{1}{2}\sum_{i,j=1}^{d} a_{ij}(x)\,\partial^2_{x_i x_j}\varphi(x) + \sum_{i=1}^{d} b_i(x)\,\partial_{x_i}\varphi(x), \quad a(x) = \sigma(x)\sigma^T(x).$$

The nonlinear Feynman-Kac formula can be used to solve a nonlinear
PDE of form (4) to (5) by a BSDE (1) coupled with an SDE (2).

A general principle is that, once we solve a BSDE driven by a
Markov process $X$ for which the terminal condition $Y_T$ at time
$T$ depends only on $X_T$ and the generator $f(t, \omega, y, z)$
also depends on the state $X_t$ at each time $t$, then the
corresponding solution of the BSDE is also state dependent,
namely, $Y_t = u(t, X_t)$, where $u$ is the solution of the
corresponding quasilinear PDE. Once $Y_T$ and $g$ are path
functions of $X$, then the solution of the BSDE also becomes path
dependent. In this sense, we can say that the PDE is in fact a
"state-dependent BSDE," and BSDE gives us a new generalization of
"path-dependent PDE" of parabolic and/or elliptic types. This
principle was illustrated in Peng (2010) for both quasilinear and
fully nonlinear situations.

Observe that BSDE (1) and forward SDE (2) are only partially
coupled. A fully coupled system of SDE and BSDE is called a
forward-backward stochastic differential equation (FBSDE). It has
the following form:

$$dX_t = b(t, X_t, Y_t, Z_t)\,dt + \sigma(t, X_t, Y_t, Z_t)\,dB_t, \quad X_0 = x \in \mathbb{R}^n,$$

$$-dY_t = f(t, X_t, Y_t, Z_t)\,dt - Z_t\,dB_t, \quad Y_T = \varphi(X_T).$$

In general the Lipschitz assumptions for $b$, $\sigma$, $f$, and
$\varphi$ w.r.t. $(x, y, z)$ are not enough. Ma et al. (1994)
proposed a four-step scheme method of FBSDE for the nondegenerate
Markovian case with $\sigma$ independent of $Z$. For the case
$\dim(x) = \dim(y) = n$, Hu and Peng (1995) proposed a new type of
monotonicity condition. This method does not need to assume the
coefficients to be deterministic. Peng and Wu (1999) have weakened
the monotonicity condition. Observe that in the case where
$b = \nabla_y H(x, y, z)$, $\sigma = \nabla_z H(x, y, z)$, and
$f = \nabla_x H(x, y, z)$, for a given real valued function $H$
convex in $x$ and concave in $(y, z)$, the above FBSDE is called
the stochastic Hamilton equation associated to a stochastic
optimal control problem. We also refer to the book of Ma and Yong
(1999) for a systematic exposition on this subject. For
time-symmetric forward-backward stochastic differential equations
and their relation with stochastic optimality, see Peng and Shi
(2003) and Han et al. (2010).

Reflected BSDE and Optimal Stopping 

If $(Y, Z)$ solves the BSDE

$$dY_s = -g(s, Y_s, Z_s)\,ds + Z_s\,dB_s - dK_s, \quad Y_T = \xi, \tag{6}$$

where $K$ is a càdlàg and increasing process with $K_0 = 0$ and
$K_t \in L^2(\mathcal{F}_t)$, then $Y$ or $(Y, Z, K)$ is called a
supersolution of the BSDE, or $g$-supersolution. This notion is
often used for constrained BSDEs. A typical situation is as
follows: for a given continuous adapted process
$(L_t)_{t \in [0,T]}$, find a smallest $g$-supersolution
$(Y, Z, K)$ such that $Y_t \geq L_t$. This problem was initiated
in El Karoui et al. (1997b). It is proved that this problem is
equivalent to finding a triple $(Y, Z, K)$ satisfying (6) and the
following reflecting condition of Skorohod type:

$$Y_s \geq L_s, \quad \int_0^T (Y_s - L_s)\,dK_s = 0. \tag{7}$$

In fact $\tau^* := \inf\{t \in [0, T] : K_t > 0\}$ is the optimal
stopping time associated to this BSDE. A well-known example is the
pricing of American options.
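A discrete analogue of the reflected BSDE (6) to (7) is obtained by taking, at every time step, the maximum of the continuation value and the obstacle. The sketch below illustrates this for an American put in the bond and stock market of the hedging section. It rests on assumptions that are not in the source: a symmetric random-walk approximation of $B$, an explicit one-step scheme, the generator $g(t, y, z) = -(ry + \theta z)$ with $\theta = (b - r)/\sigma$, the obstacle $L_t = (K - P_t)^+$ with an assumed strike `K`, and hypothetical function names.

```python
import math


def reflected_bsde_binomial(terminal, g, obstacle, T, N):
    """Backward scheme on a binomial random walk with optional reflection.

    terminal(b): terminal value as a function of B_T = b
    g(t, y, z): generator
    obstacle(t, b): lower barrier L_t as a function of (t, B_t), or None
    Returns the approximation of Y_0.
    """
    dt = T / N
    sq = math.sqrt(dt)
    Y = [terminal((2 * k - N) * sq) for k in range(N + 1)]
    for i in range(N - 1, -1, -1):
        t = i * dt
        new = []
        for k in range(i + 1):
            up, down = Y[k + 1], Y[k]
            mean = 0.5 * (up + down)
            z = (up - down) / (2.0 * sq)
            value = mean + g(t, mean, z) * dt   # continuation value
            if obstacle is not None:
                # Reflection: the solution is pushed up to stay above L_t.
                value = max(value, obstacle(t, (2 * k - i) * sq))
            new.append(value)
        Y = new
    return Y[0]


def put_prices(P0, K, r, b, sigma, T, N):
    """Return (European, American) put values from the same tree."""
    theta = (b - r) / sigma

    def stock(t, bt):
        return P0 * math.exp((b - 0.5 * sigma ** 2) * t + sigma * bt)

    def payoff(t, bt):
        return max(K - stock(t, bt), 0.0)

    def g(t, y, z):
        return -(r * y + theta * z)

    european = reflected_bsde_binomial(lambda bT: payoff(T, bT), g, None, T, N)
    american = reflected_bsde_binomial(lambda bT: payoff(T, bT), g, payoff, T, N)
    return european, american
```

The nodes at which the maximum is attained by the obstacle approximate the region where $K$ increases, that is, where early exercise is optimal.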

Moreover, a new type of nonlinear Feynman-Kac formula was
introduced: if all coefficients are given as in the formulation of
the above nonlinear Feynman-Kac formula and $L_s = \Phi(X_s)$,
where $\Phi$ satisfies the same condition as $\varphi$, then we
have $Y_s = u(s, X_s)$, where $u = u(t, x)$ is the solution of the
following variational inequality:

$$\min\{\partial_t u + \mathcal{L}u + g(x, u, \sigma^T \nabla u),\ u - \Phi\} = 0, \quad (t, x) \in [0, T] \times \mathbb{R}^n, \tag{8}$$

with terminal condition $u|_{t=T} = \varphi$. They also
demonstrated that this reflected BSDE is a powerful tool to deal
with contingent claims of American types in a financial market
with constraints.

BSDE reflected within two barriers, a lower 
one L and an upper one U, was first investigated 
by Cvitanic and Karatzas (1996) where a type 
of nonlinear Dynkin games was formulated for a 
two-player model with zero-sum utility and each 
player chooses his own optimal exit time. 

Stochastic optimal switching problems can be 
also solved by new types of oblique-reflected 
BSDEs. 

A more general case of constrained BSDE is to find the smallest
$g$-supersolution $(Y, Z, K)$ with constraint
$(Y_t, Z_t) \in \Gamma_t$ where, for each $t \in [0, T]$,
$\Gamma_t$ is a given constraint set. See El Karoui and Quenez
(1995), Cvitanic and Karatzas (1993), and El Karoui et al. (1997a)
for the problem of superhedging in a market with convex
constrained portfolios (see also Cvitanic et al. 1998). The case
with an arbitrary closed constraint was proved in Peng (1999).

Backward Stochastic Semigroup 
and g-Expectations 

Let $\mathcal{E}^g_{t,T}[\xi] = Y_t$ where $Y$ is the solution of
BSDE (1) with generator $f = g$.
$(\mathcal{E}^g_{t,T}[\cdot])_{0 \leq t \leq T < \infty}$ has the
(backward) semigroup property (Peng 1997)

$$\mathcal{E}^g_{s,t}[\mathcal{E}^g_{t,T}[\xi]] = \mathcal{E}^g_{s,T}[\xi], \quad \mathcal{E}^g_{T,T}[\xi] = \xi, \quad 0 \leq s \leq t \leq T.$$

For a real valued BSDE, by the comparison theorem, the semigroup
is monotone: $\mathcal{E}^g_{t,T}[\xi] \geq \mathcal{E}^g_{t,T}[\xi']$
if $\xi \geq \xi'$. If moreover $g|_{z=0} = 0$, then the semigroup
is constant preserving: $\mathcal{E}^g_{t,T}[c] = c$. Thus the
semigroup forms in fact a nonlinear expectation called
$g$-expectation (since this nonlinear expectation is totally
determined by the generator $g$).

This notion allows us to establish a nonlinear $g$-martingale
theory, e.g., the $g$-supermartingale decomposition theorem. Peng
(1999) shows that, if $Y$ is a square-integrable càdlàg
$g$-supermartingale, then it has a unique decomposition: there
exists a unique predictable, increasing, and càdlàg process $A$
such that $Y$ solves

$$-dY_t = g(t, Y_t, Z_t)\,dt + dA_t - Z_t\,dB_t.$$

A theoretically challenging and practically important
problem is as follows: given an abstract
family of expectations $(\mathcal{E}_{s,t}[\cdot])_{s \le t}$ satisfying the
same backward semigroup properties as those of
g-expectation, can we find a function g such that
$\mathcal{E}_{s,t} = \mathcal{E}^g_{s,t}$? Coquet, Hu et al. (2005) proved that
if $\mathcal{E}$ is dominated by the $g_\mu$-expectation with $g_\mu(z) =
\mu|z|$ for a large enough constant $\mu > 0$, then there
exists a unique function $g = g(t, \omega, z)$ satisfying the
$\mu$-Lipschitz condition such that $(\mathcal{E}_{s,t}[\cdot])_{s \le t}$ is in
fact a g-expectation. For a concave dynamic expectation
with an assumption much weaker than





the above domination condition, we can still find 
a function g = g(t,z ) with possibly singular 
values (Delbaen et al. 2010). For the case without 
the assumption of constant preservation, see Peng 
(2005). In practice, the above criterion is very 
useful to test whether a dynamic pricing mech¬ 
anism of contingent contracts can be represented 
through a concrete function g. 

A serious challenge in stochastic control theory
is that it is based on a given probability space
$(\Omega, \mathcal{F}, P)$, an assumption that is far from
true in most practical situations. In many risky
situations, it is necessary to consider the uncertainty
of the probability measures themselves, e.g.,
$\{P_\theta\}_{\theta \in \Theta}$, namely,
the well-known Knightian uncertainty (Knight
1921). A new framework of G-expectation space
$(\Omega, \mathcal{H}, \hat{\mathbb{E}})$ and the corresponding random and
stochastic analysis (Ito's analysis) is introduced
(see Peng 2007, 2010 and Soner et al. 2012) to
replace the probability framework $(\Omega, \mathcal{F}, P)$.
g-expectation is a special and typical case in this
new theory.


Cross-References 

► Numerical Methods for Continuous-Time 
Stochastic Control Problems 

► Risk-Sensitive Stochastic Control 

► Stochastic Dynamic Programming 

► Stochastic Linear-Quadratic Control 

► Stochastic Maximum Principle 


Recommended Reading 

BSDE theory applied in maximization of stochastic
control can be found in the book of Yong
and Zhou (1999); stochastic control problems in
finance in El Karoui et al. (1997a); optimal stopping
and reflected BSDE in El Karoui et al.
(1997b). Maximization under Knightian uncertainty
using nonlinear expectation can be found
in Chen and Epstein (2002) and a survey paper in
Peng (2010).
Abstract

Basic principles for the development of computational
methods for the analysis and design of linear
time-invariant systems are discussed. These
have been used in the design of the subroutine
library SLICOT. The principles are illustrated on
the basis of a method to check the controllability
of a linear system.
Introduction

Basic numerical methods for the analysis and
design of dynamical systems are at the heart of
most techniques in systems and control theory
that are used to describe, control, or optimize
industrial and economic processes. There are
many methods available for all the different tasks
in systems and control, but even though most
of these methods are based on sound theoretical
principles, many of them still fail when applied
to real-life problems. The reasons for this may
be quite diverse, such as the fact that the system
dimensions are very large, that the underlying
problem is very sensitive to small changes in
the data, or that the method lacks numerical
robustness when implemented in a finite precision
environment.

To overcome such failures, major efforts have
been made in the last few decades to develop
robust, well-implemented, and standardized
software packages for computer-aided control
systems design (Grübel 1983; NAG SLICOT 1990;
Wieslander 1977). Following the standards of
modern software design, such packages should
consist of numerically robust routines with
known performance in terms of reliability and
efficiency that can be used to form the basis
of more complex control methods. Also, to
avoid duplication and to achieve efficiency
and portability to different computational
environments, it is essential to make maximal
use of the established standard packages that
are available for numerical computations, e.g.,
the Basic Linear Algebra Subprograms (BLAS)
(Dongarra et al. 1990) or the Linear Algebra
Package (LAPACK) (Anderson et al. 1992).
On the basis of such standard packages, the next
layer of more complex control methods can then
be built in a robust way.

In the late 1980s, a working group was created
in Europe to coordinate efforts and integrate
and extend the earlier software developments
in systems and control. Thanks to the support
of the European Union, this eventually led to
the development of the Subroutine Library in
Control Theory (SLICOT) (Benner et al. 1999;
SLICOT 2012). This library contains most of the
basic computational methods for the design of
linear time-invariant control systems.

An important feature of this and similar
subroutine libraries is that the development
of further higher-level methods is not restricted
by specific requirements of the languages or data
structures used and that the routines can be easily
incorporated within other more user-friendly
software systems (Gomez et al. 1997; MATLAB
2013). Usually, this low-level reusability can only
be achieved by using a general-purpose programming
language like C or Fortran.

We cannot present all the features of the SLICOT
library here. Instead, we discuss its general
philosophy in section "The Control Subroutine
Library SLICOT" and illustrate these concepts in
section "An Illustration" using one specific task,
namely, checking the controllability of a system.
We refer to SLICOT (2012) for more details
on SLICOT and to Varga (2004) for a general
discussion on numerical software for systems and
control.


The Control Subroutine Library 
SLICOT 

When designing a subroutine library of basic
algorithms, one should make sure that it satisfies
certain basic requirements and that it follows
a strict standardization in implementation and
documentation. It should also contain standardized
test sets that can be used for benchmarking,
and it should provide means for maintenance
and portability to new computing environments.
The subroutine library SLICOT was designed to
satisfy the following basic recommendations that
are typically expected in this context (Benner
et al. 1999).

Robustness: A subroutine must either return 
reliable results or it must return an error or 
warning indicator, if the problem has not been 
well posed or if the problem does not fall in 
the class to which the algorithm is applicable 
or if the problem is too ill-conditioned 
to be solved in a particular computing 
environment. 

Numerical stability and accuracy: Subroutines 
are supposed to return results that are as good 
as can be expected when working at a given 
precision. They also should provide an option 
to return a parameter estimating the accuracy 
actually achieved. 

Efficiency: An algorithm should never be chosen
for its speed if it fails to meet the usual
standards of robustness, numerical stability,
and accuracy, as described above. Efficiency
must be evaluated, e.g., in terms of the number
of floating-point operations, the memory
requirements, or the number and cost of iterations
to be performed.

Modern computer architectures: The requirements
of modern computer architectures
must be taken into account, such as shared
or distributed memory parallel processors,
which are the standard environments of
today. The differences in the various
architectures may imply different choices of
algorithms.

Comprehensive functional coverage: The
routines of the library should solve the
computational problems relevant to control
systems and cover a comprehensive set of tasks,
so that the library is useful for a wide range of
users. The SLICOT library covers most of the
numerical linear algebra methods needed in
systems analysis and synthesis problems for
standard and generalized state space models,
such as Lyapunov, Sylvester, and Riccati equation
solvers, transfer matrix factorizations,
similarity and equivalence transformations,
structure exploiting algorithms, and condition
number estimators.
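
The robustness and accuracy recommendations above can be sketched in a few lines. The following is a hypothetical illustration in Python with NumPy, not SLICOT code: the function name, the `info` codes, and the singularity threshold are all assumptions made for this example. It shows the calling convention the text describes, in which a routine either returns a result together with an accuracy indicator or returns an error flag.

```python
import numpy as np

def solve_with_diagnostics(A, b):
    """Solve A x = b and return (x, info, rcond).

    Hypothetical convention (not the SLICOT interface):
      info = 0  success
      info = 1  problem not well posed (A not square, or sizes mismatch)
      info = 2  A is numerically singular at working precision
    rcond is the ratio of smallest to largest singular value of A.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    if (A.ndim != 2 or A.shape[0] != A.shape[1] or A.shape[0] == 0
            or b.ndim < 1 or b.shape[0] != A.shape[0]):
        return None, 1, 0.0
    s = np.linalg.svd(A, compute_uv=False)
    rcond = 0.0 if s[0] == 0.0 else float(s[-1] / s[0])
    # Assumed threshold: n times machine epsilon.
    if rcond < A.shape[0] * np.finfo(float).eps:
        return None, 2, rcond
    return np.linalg.solve(A, b), 0, rcond

x, info, rcond = solve_with_diagnostics([[2.0, 0.0], [0.0, 4.0]], [2.0, 4.0])
```

A caller checks `info` before using `x` and can use `rcond` to judge how many digits of the result are meaningful.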

The implementation of subroutines for a library
should be highly standardized, and it should
be accompanied by well-written online documentation
as well as a user manual (see, e.g., the
standard of Denham and Benson 1981; Working
Group Software 1996) that is compatible with
that of the LAPACK library (Anderson et al.
1992). Although such highly restrictive standards
often put a heavy burden on the programmer,
they have proved very important
for the reusability of software, and they have also
turned out to be a very valuable tool in teaching
students how to implement algorithms in the
context of their studies.

Benchmarking 

In the validation of numerical software, it is
extremely important to be able to test the correctness
of the implementation as well as the
performance of the method; this is one of the
major steps in the construction of a software
library. To achieve this, one needs a standardized
set of benchmark examples that allows one to
evaluate a method with respect to correctness,
accuracy, and efficiency and to analyze
the behavior of the method in extreme situations,
i.e., on problems where the limit of the
possible accuracy is reached. In the context of
basic systems and control methods, several such
benchmark collections have been developed (see,
e.g., Benner et al. 1997; Frederick 1998, or
http://www.slicot.org/index.php?site=benchmarks).


Maintenance, Open Access, and Archives 

It is a major challenge to keep a well-developed
library accessible and usable over
time when computer architectures and operating
systems are changing rapidly, while keeping the
library open for access to the user community.
This usually requires financial resources that
have to be provided either by public funding or
by licensing the commercial use.

In the SLICOT library, this challenge has been
addressed by the formation of the Niconet Association
(http://www.niconet-ev.info/en/) which
provides the current versions of the codes and
all the documentation. Those of Release 4.5 are
available under the GNU General Public License
or from the archives of http://www.slicot.org/.


An Illustration 

To give an illustration for the development of a
basic control system routine, we consider the
specific problem of checking controllability
of a linear time-invariant control system. A
linear time-invariant control problem has the
form

$$\frac{dx}{dt} = Ax + Bu, \qquad t \in [t_0, \infty). \qquad (1)$$

Here $x$ denotes the state and $u$ the input function,
and the system matrices are typically of the form
$A \in \mathbb{R}^{n,n}$, $B \in \mathbb{R}^{n,m}$.

One of the most important topics in control
is the question whether by an appropriate choice
of input function $u(t)$ we can control the system
from an arbitrary state to the null state. This property,
called controllability, can be characterized
by one of the following equivalent conditions (see
Paige 1981).

Theorem 1 The following are equivalent:

(i) System (1) is controllable.

(ii) $\operatorname{Rank}\,[B, AB, A^2B, \ldots, A^{n-1}B] = n$.

(iii) $\operatorname{Rank}\,[B, A - \lambda I] = n \quad \forall \lambda \in \mathbb{C}$.

(iv) $\exists F$ such that $A$ and $A + BF$ have no
common eigenvalues.

The conditions of Theorem 1 are nice for
theoretical purposes, but none of them is really
adequate for the implementation of an algorithm
that satisfies the requirements described in
the previous section. Condition (ii) creates
difficulties because the controllability matrix
$K = [B, AB, A^2B, \ldots, A^{n-1}B]$ will be highly
corrupted by roundoff errors. Condition (iii)
cannot be checked in finite time as stated, since it
ranges over all $\lambda \in \mathbb{C}$. It is
sufficient to check this condition only for the
eigenvalues of $A$, but this is extremely expensive.
And finally, condition (iv) will almost always
give disjoint spectra between $A$ and $A + BF$
since the computation of eigenvalues is sensitive
to roundoff.
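
The difficulty with condition (ii) of Theorem 1 can be seen on a small experiment. The following sketch (Python with NumPy; the test pair and the function names are assumptions chosen for illustration) builds the controllability matrix for a diagonal $A$ with distinct eigenvalues $1, \ldots, 20$ and a vector $B$ of ones. This pair is controllable in exact arithmetic, but $K$ is then a Vandermonde-type matrix whose columns grow like $20^k$, so its numerical rank in double precision falls below $n$. Condition (iii), checked at the computed eigenvalues, does not suffer from this on the same example.

```python
import numpy as np

def controllability_matrix(A, B):
    """Return K = [B, AB, ..., A^(n-1) B] for an n-by-n matrix A."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    blocks = [B]
    for _ in range(A.shape[0] - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def hautus_ranks(A, B):
    """Rank of [B, A - lambda I] at each computed eigenvalue of A."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    n = A.shape[0]
    return [int(np.linalg.matrix_rank(np.hstack([B, A - lam * np.eye(n)])))
            for lam in np.linalg.eigvals(A)]

n = 20
A = np.diag(np.arange(1.0, n + 1.0))   # distinct eigenvalues 1, ..., 20
B = np.ones((n, 1))                    # every mode is excited by the input
K = controllability_matrix(A, B)
rank_K = int(np.linalg.matrix_rank(K))  # numerical rank with NumPy's default tolerance
ranks_pbh = hautus_ranks(A, B)          # one rank per eigenvalue
```

The rank decision on $K$ depends on the tolerance, which is exactly the ambiguity that the orthogonal reductions discussed next are designed to control.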

To devise numerical procedures, one often
resorts to the computation of canonical or condensed
forms of the underlying system. To obtain
such a form one employs controllability-preserving
linear transformations $x \mapsto Px$, $u \mapsto Qu$
with nonsingular matrices $P \in \mathbb{R}^{n,n}$, $Q \in \mathbb{R}^{m,m}$.
The canonical form under these transformations
is the Luenberger form (see Luenberger 1967).
This form allows one to check controllability using
the above criterion (iii) by simple inspection
of the condensed matrices. This is ideal from
a theoretical point of view but is very sensitive
to small perturbations in the data, in particular
because the transformation matrices may have
arbitrarily large norm, which may lead to large
errors.

For the implementation as robust numerical
software one uses instead transformations with
real orthogonal matrices $P$, $Q$ that can be implemented
in a backward stable manner, i.e.,
the resulting backward error is bounded by a
small constant times the unit roundoff $\mathbf{u}$ of the
finite precision arithmetic, and employs for reliable
rank determinations the well-known singular
value decomposition (SVD) (see, e.g., Golub and
Van Loan 1996).


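Theorem 2 (Singular value decomposition)
Given $A \in \mathbb{R}^{n,m}$, there exist orthogonal
matrices $U$, $V$ with $U \in \mathbb{R}^{n,n}$, $V \in \mathbb{R}^{m,m}$, such
that $A = U \Sigma V^T$ and $\Sigma \in \mathbb{R}^{n,m}$ is quasi-diagonal,
i.e.,

$$\Sigma = \begin{bmatrix} \Sigma_r & 0 \\ 0 & 0 \end{bmatrix}, \quad \text{where} \quad \Sigma_r = \begin{bmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_r \end{bmatrix},$$

and the nonzero singular values $\sigma_i$ are ordered
as $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$.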

The SVD presents the best way to determine
(numerical) ranks of matrices in finite precision
arithmetic by counting the number of singular
values satisfying $\sigma_j \ge \mathbf{u}\sigma_1$ and by putting those
for which $\sigma_j < \mathbf{u}\sigma_1$ equal to zero. The computational
method for the SVD is well established
and analyzed, and it has been implemented in
the LAPACK routine SGESVD (see
http://www.netlib.org/lapack/). A faster but less reliable
alternative to compute the numerical rank of a matrix
$A$ is its QR factorization with pivoting (see, e.g.,
Golub and Van Loan 1996).
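
The rank rule just described can be sketched as follows (Python with NumPy, whose `numpy.linalg.svd` calls LAPACK). Two choices here are assumptions that go beyond the text: the unit roundoff is replaced by the machine epsilon of double precision, and an optional safety factor equal to the larger matrix dimension is applied, as is common in practice.

```python
import numpy as np

def numerical_rank_svd(M, safety=None):
    """Count singular values sigma_j >= safety * eps * sigma_1.

    Returns (rank, singular_values). `safety` defaults to max(M.shape),
    an assumed practical choice; safety=1 is the plain rule of the text
    with u taken as machine epsilon.
    """
    M = np.asarray(M, dtype=float)
    s = np.linalg.svd(M, compute_uv=False)
    if s.size == 0 or s[0] == 0.0:
        return 0, s
    if safety is None:
        safety = max(M.shape)
    tol = safety * np.finfo(float).eps * s[0]
    return int(np.sum(s >= tol)), s

rank_a, _ = numerical_rank_svd(np.diag([1.0, 1e-3]))    # both values above tol
rank_b, _ = numerical_rank_svd(np.diag([1.0, 1e-20]))   # second value below tol
rank_c, _ = numerical_rank_svd(np.zeros((3, 2)))        # zero matrix
```

Singular values below the tolerance are treated as zero, so the decision is made relative to the largest singular value rather than on an absolute scale.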

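Theorem 3 (QRE decomposition) Given $A \in
\mathbb{R}^{n,m}$, there exists an orthogonal matrix $Q \in
\mathbb{R}^{n,n}$ and a permutation $E \in \mathbb{R}^{m,m}$, such that
$A = QRE^T$ and $R \in \mathbb{R}^{n,m}$ is trapezoidal, i.e.,

$$R = \begin{bmatrix} r_{11} & \cdots & r_{1l} & \cdots & r_{1m} \\ & \ddots & \vdots & & \vdots \\ & & r_{ll} & \cdots & r_{lm} \\ & & 0 & & \end{bmatrix},$$

and the nonzero diagonal entries $r_{ii}$ are ordered
as $r_{11} \ge \cdots \ge r_{ll} > 0$.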

The (numerical) rank in this case is again obtained
by counting the diagonal elements $r_{ii} \ge
\mathbf{u} r_{11}$.
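
For concreteness, the pivoted factorization of Theorem 3 can be sketched with Householder reflectors (Python with NumPy). This is an illustrative implementation written for this entry, not the LAPACK routine; the function names and the default tolerance are assumptions. The permutation $E$ is returned as an index vector `perm` with `M[:, perm] = Q @ R`.

```python
import numpy as np

def qr_column_pivoting(M):
    """Householder QR with column pivoting: M[:, perm] = Q @ R."""
    R = np.array(M, dtype=float)
    n, m = R.shape
    Q = np.eye(n)
    perm = np.arange(m)
    for k in range(min(n, m)):
        # Pivot: move the trailing column of largest norm to position k.
        norms = np.linalg.norm(R[k:, k:], axis=0)
        j = k + int(np.argmax(norms))
        if norms[j - k] == 0.0:
            break  # the remaining block is exactly zero
        R[:, [k, j]] = R[:, [j, k]]
        perm[[k, j]] = perm[[j, k]]
        # Householder reflector H = I - 2 v v^T that zeros R[k+1:, k].
        x = R[k:, k]
        v = x.copy()
        v[0] += np.copysign(np.linalg.norm(x), x[0])
        v /= np.linalg.norm(v)
        R[k:, :] -= 2.0 * np.outer(v, v @ R[k:, :])
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v)
        # Sign normalization so that the diagonal entry is positive.
        if R[k, k] < 0.0:
            R[k, :] *= -1.0
            Q[:, k] *= -1.0
    return Q, np.triu(R), perm

def numerical_rank_qr(M, rtol=None):
    """Count diagonal entries r_ii >= rtol * r_11 (rtol is an assumed knob)."""
    M = np.asarray(M, dtype=float)
    _, R, _ = qr_column_pivoting(M)
    d = np.abs(np.diag(R))
    if d.size == 0 or d[0] == 0.0:
        return 0
    if rtol is None:
        rtol = max(M.shape) * np.finfo(float).eps
    return int(np.sum(d >= rtol * d[0]))

M = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])   # third column = first + second
Q, R, perm = qr_column_pivoting(M)
rank_M = numerical_rank_qr(M, rtol=1e-12)
```

Only one orthogonal factor is accumulated and no iteration is needed, which is why this route is cheaper than the SVD, at the price of a less reliable rank decision in unfavorable cases.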

One can use such orthogonal transformations 
to construct the controllability staircase form (see 
Van Dooren 1981). 

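Theorem 4 (Staircase form) Given matrices
$A \in \mathbb{R}^{n,n}$, $B \in \mathbb{R}^{n,m}$, there exist orthogonal
matrices $P$, $Q$ with $P \in \mathbb{R}^{n,n}$, $Q \in \mathbb{R}^{m,m}$, so that

$$
PAP^T = \begin{bmatrix}
A_{11} & \cdots & \cdots & A_{1,r-1} & A_{1,r} \\
A_{21} & \ddots & & \vdots & \vdots \\
 & \ddots & \ddots & \vdots & \vdots \\
 & & A_{r-1,r-2} & A_{r-1,r-1} & A_{r-1,r} \\
0 & \cdots & 0 & 0 & A_{rr}
\end{bmatrix},
\qquad
PBQ = \begin{bmatrix}
B_1 & 0 \\
0 & 0 \\
\vdots & \vdots \\
0 & 0
\end{bmatrix},
\qquad (2)
$$

where the block rows and block columns of $PAP^T$ have sizes
$n_1, \ldots, n_{r-1}, n_r$, the block rows of $PBQ$ have sizes
$n_1, n_2, \ldots, n_r$, and its block columns have sizes $n_1$ and $m - n_1$.
Here $n_1 \ge n_2 \ge \cdots \ge n_{r-1} \ge n_r \ge 0$, $n_{r-1} >
0$, $A_{i,i-1} = [\Sigma_{i,i-1} \;\; 0]$, with nonsingular blocks
$\Sigma_{i,i-1} \in \mathbb{R}^{n_i,n_i}$ and $B_1 \in \mathbb{R}^{n_1,n_1}$.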

Notice that when using the reduced pair in
condition (iii) of Theorem 1, the controllability
condition is just $n_r = 0$, which is simply checked
by inspection. A numerically stable algorithm
to compute the staircase form of Theorem 4 is
given below. It is based on the use of the singular
value decomposition, but one could also have
used instead the QR decomposition with column
pivoting.


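Staircase Algorithm

Input: $A \in \mathbb{R}^{n,n}$, $B \in \mathbb{R}^{n,m}$

Output: $PAP^T$, $PBQ$ in the form (2), $P$, $Q$ orthogonal

Step 0: Perform an SVD

$$B = U_B \begin{bmatrix} \Sigma_B & 0 \\ 0 & 0 \end{bmatrix} V_B^T$$

with nonsingular and diagonal $\Sigma_B \in \mathbb{R}^{n_1,n_1}$. Set
$P := U_B^T$, $Q := V_B$, so that

$$A := U_B^T A U_B = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \qquad B := U_B^T B V_B = \begin{bmatrix} \Sigma_B & 0 \\ 0 & 0 \end{bmatrix}$$

with $A_{11}$ of size $n_1 \times n_1$.

Step 1: Perform an SVD

$$A_{21} = U_{21} \begin{bmatrix} \Sigma_{21} & 0 \\ 0 & 0 \end{bmatrix} V_{21}^T$$

with nonsingular and diagonal $\Sigma_{21} \in \mathbb{R}^{n_2,n_2}$. Set

$$P_2 := \begin{bmatrix} V_{21}^T & 0 \\ 0 & U_{21}^T \end{bmatrix}, \qquad P := P_2 P$$

so that

$$A := P_2 A P_2^T =: \begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ 0 & A_{32} & A_{33} \end{bmatrix}, \qquad B := P_2 B =: \begin{bmatrix} B_1 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix},$$

where $A_{21} = [\Sigma_{21} \;\; 0]$.

Step 2:
$i = 3$
DO WHILE ($n_{i-1} > 0$ AND $A_{i,i-1} \ne 0$).
Perform an SVD of

$$A_{i,i-1} = U_{i,i-1} \begin{bmatrix} \Sigma_{i,i-1} & 0 \\ 0 & 0 \end{bmatrix} V_{i,i-1}^T$$

with $\Sigma_{i,i-1} \in \mathbb{R}^{n_i,n_i}$ nonsingular and diagonal. Set

$$P_i := \begin{bmatrix} I_{n_1} & & & & \\ & \ddots & & & \\ & & I_{n_{i-2}} & & \\ & & & V_{i,i-1}^T & \\ & & & & U_{i,i-1}^T \end{bmatrix}, \qquad P := P_i P$$

so that

$$A := P_i A P_i^T =: \begin{bmatrix} A_{11} & \cdots & \cdots & A_{1,i+1} \\ A_{21} & \ddots & & A_{2,i+1} \\ & \ddots & \ddots & \vdots \\ & A_{i,i-1} & A_{i,i} & A_{i,i+1} \\ & 0 & A_{i+1,i} & A_{i+1,i+1} \end{bmatrix},$$

where $A_{i,i-1} = [\Sigma_{i,i-1} \;\; 0]$,
$i := i + 1$
END
$r := i$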

It is clear that this algorithm will stop with
$n_i = 0$ or $A_{i,i-1} = 0$. In every step, the
remaining block shrinks at least by 1 row/column,
as long as $\operatorname{Rank} A_{i,i-1} \ge 1$, so that the algorithm
stops after maximally $n - 1$ steps. It has been
shown in Van Dooren (1981) that system (1) is
controllable if and only if in the staircase form of
$(A, B)$ one has $n_r = 0$.
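
A compact sketch of the staircase reduction is given here (Python with NumPy). It is a simplified variant written for illustration and is not the SLICOT implementation: it applies only the row compressions $U^T$ from each SVD, so the subdiagonal blocks obtain full row rank but are not brought to the form $[\Sigma \;\; 0]$, and the input transformation $Q$ is not formed. The function name and the default tolerance are assumptions. The controllability decision, namely whether the trailing uncontrollable block has size zero, is the same as in Theorem 4.

```python
import numpy as np

def staircase(A, B, tol=None):
    """Orthogonal staircase reduction of the pair (A, B).

    Returns (Ac, Bc, P, sizes, n_unc) with Ac = P A P^T and Bc = P B,
    where `sizes` lists the stair sizes n_1, n_2, ... and n_unc is the
    dimension of the trailing uncontrollable block (0 iff controllable).
    """
    A = np.array(A, dtype=float)
    B = np.array(B, dtype=float)
    n = A.shape[0]
    if tol is None:
        # Assumed absolute threshold for the rank decisions.
        scale = max(np.linalg.norm(A), np.linalg.norm(B))
        tol = max(n, B.shape[1]) * np.finfo(float).eps * scale
    P = np.eye(n)
    sizes = []
    start, prev = 0, 0
    while start < n:
        # Block to compress: B first, then the newest subdiagonal block of A.
        C = B[start:, :] if not sizes else A[start:, prev:start]
        U, s, _ = np.linalg.svd(C)
        rho = int(np.sum(s > tol))
        if rho == 0:
            break  # no coupling left: rows start..n-1 are uncontrollable
        # Similarity transformation with diag(I, U^T) on rows/columns start..n-1.
        A[start:, :] = U.T @ A[start:, :]
        A[:, start:] = A[:, start:] @ U
        B[start:, :] = U.T @ B[start:, :]
        P[start:, :] = U.T @ P[start:, :]
        sizes.append(rho)
        prev, start = start, start + rho
    return A, B, P, sizes, n - start

# Double integrator: controllable, stairs of sizes [1, 1].
A1 = np.array([[0.0, 1.0], [0.0, 0.0]])
B1 = np.array([[0.0], [1.0]])
Ac1, Bc1, P1, sizes1, n_unc1 = staircase(A1, B1)

# Decoupled second mode: one uncontrollable state.
A2 = np.diag([1.0, 2.0])
B2 = np.array([[1.0], [0.0]])
Ac2, Bc2, P2, sizes2, n_unc2 = staircase(A2, B2)
```

Each rank decision is made on a block that has only been multiplied by orthogonal matrices, so its singular values are those of the exact reduced problem up to a small backward error.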
It should be noted that the updating
transformations $P_i$ of this algorithm will affect
previously created "stairs" so that the blocks
denoted as $\Sigma_{i,i-1}$ will not be diagonal anymore,
but their singular values are unchanged. This is
critical in the decision about the controllability of
the pair $(A, B)$ since it depends on the numerical
rank of the submatrices $A_{i,i-1}$ and $B$ (see
Demmel and Kagstrom 1993). Based on this
and a detailed error and perturbation analysis,
the Staircase Algorithm has been implemented
in the SLICOT routine AB01OD, and it uses
in the worst case $O(n^4)$ flops (a "flop" is an
elementary floating-point operation $+$, $-$, $*$,
or $/$). For efficiency reasons, the SLICOT
routine AB01OD does not use SVDs for rank
decisions, but QR decompositions with column
pivoting. When applying the corresponding
orthogonal transformations to the system without
accumulating them, the complexity can be
reduced to $O(n^3)$ flops. The routine has been provided with
error bounds, condition estimates, and warning
strategies.

Summary and Future Directions 

We have presented the SLICOT library and the
basic principles for the design of such basic
subroutine libraries. To illustrate these principles,
we have presented the development of a
method for checking controllability for a linear
time-invariant control system. But the SLICOT
library contains much more than that. It essentially
covers most of the problems listed in the
selected reprint volume (Patel et al. 1994). In 1994,
this volume contained the state of the art
in numerical methods for systems and control,
but the field has strongly evolved since then.
Examples of areas that were not in this volume
but that are included in SLICOT are periodic
systems, differential algebraic equations,
and model reduction. Areas which still need new
results and software are the control of large-scale
systems, obtained either from discretizations
of partial differential equations or from the
interconnection of a large number of interacting
systems. But it is unclear for the moment
which will be the methods of choice for such
problems. We still need to understand the numerical
challenges in such areas before we can
propose numerically reliable software for these
problems: the area is still quite open for new
developments.


Cross-References 

► Computer-Aided Control Systems Design: 
Introduction and Historical Overview 

► Interactive Environments and Software Tools 
for CACSD 
Abstract

This entry is an introduction to modern issues
about controllability of Schrödinger PDEs with
bilinear controls. This model is pertinent for a
quantum particle, controlled by an electric field.
We review recent developments in the field, with
discrimination between exact and approximate
controllabilities, in finite or infinite time. We also
underline the variety of mathematical tools used
by various teams in the last decade. The results
are illustrated on several classical examples.

Keywords 

Approximate controllability; Global exact controllability;
Local exact controllability; Quantum
particles; Schrödinger equation; Small-time controllability

Introduction 

A quantum particle, in a space with dimension $N$
($N = 1, 2, 3$), in a potential $V = V(x)$, and in an
electric field $u = u(t)$, is represented by a wave
function $\psi : (t, x) \in \mathbb{R} \times \Omega \to \mathbb{C}$ on the
$L^2(\Omega, \mathbb{C})$ sphere $S$:

$$\int_\Omega |\psi(t, x)|^2 \, dx = 1, \qquad \forall t \in \mathbb{R},$$

where $\Omega \subset \mathbb{R}^N$ is a possibly unbounded open
domain. In first approximation, the time evolution
of the wave function is given by the Schrödinger
equation,

$$
\begin{cases}
i \partial_t \psi(t, x) = (-\Delta + V)\,\psi(t, x) - u(t)\,\mu(x)\,\psi(t, x), & t \in (0, +\infty),\ x \in \Omega, \\
\psi(t, x) = 0, & x \in \partial\Omega,
\end{cases}
\qquad (1)
$$

where $\mu$ is the dipolar moment of the particle
and $\hbar = 1$ here. Sometimes, this equation is
considered in the more abstract framework

$$i \frac{d\psi}{dt} = \bigl(H_0 + u(t) H_1\bigr)\,\psi \qquad (2)$$

where $\psi$ lives on the unit sphere of a separable
Hilbert space $\mathcal{H}$ and the Hamiltonians $H_0$, $H_1$
are Hermitian operators on $\mathcal{H}$. A natural question,
with many practical applications, is the existence
of a control $u$ that steers the wave function $\psi$
from a given initial state $\psi_0$ to a prescribed target
$\psi_f$.

The goal of this survey is to present
well-established results concerning exact and
approximate controllabilities for the bilinear
control system (1), with applications to relevant
examples. The main difficulties are the infinite
dimension of $\mathcal{H}$ and the nonlinearity of the
control system.


Preliminary Results 

When the Hilbert space $\mathcal{H}$ has finite dimension $n$,
then controllability of Eq. (2) is well understood
(D'Alessandro 2008). If, for example, the Lie
algebra spanned by $H_0$ and $H_1$ coincides with
$u(n)$, the set of skew-Hermitian matrices, then
system (2) is globally controllable: for any initial
and final states $\psi_0, \psi_f \in \mathcal{H}$ of length one, there
exist $T > 0$ and a bounded open-loop control
$[0, T] \ni t \mapsto u(t)$ steering $\psi$ from $\psi(0) = \psi_0$ to
$\psi(T) = \psi_f$.
In infinite dimension, this idea served to intuit
a negative controllability result in Mirrahimi and
Rouchon (2004), but the above characterization
cannot be generalized because iterated Lie brackets
of unbounded operators are not necessarily
well defined. For example, the quantum harmonic
oscillator

$$i \partial_t \psi(t, x) = -\partial_x^2 \psi(t, x) + x^2 \psi(t, x) - u(t)\, x\, \psi(t, x), \qquad x \in \mathbb{R}, \qquad (3)$$

is not controllable (in any reasonable sense) (Mirrahimi
and Rouchon 2004) even if all its Galerkin
approximations are controllable (Fu et al. 2001).
Thus, much care is required in the use of Galerkin
approximations to prove controllability in infinite
dimension. This motivates the search for different
methods to study exact controllability of bilinear
PDEs of form (1).

In infinite dimension, the norms need to be
specified. In this article, we use Sobolev norms.
For $s \in \mathbb{N}$, the Sobolev space $H^s(\Omega)$ is the
space of functions $\psi : \Omega \to \mathbb{C}$ with square integrable
derivatives $d^k \psi$ for $k = 0, \ldots, s$ (derivatives
are well defined in the distribution sense).
$H^s(\Omega)$ is endowed with the norm

$$\|\psi\|_{H^s} := \Bigl( \sum_{k=0}^{s} \|d^k \psi\|_{L^2(\Omega)}^2 \Bigr)^{1/2}.$$

We also use the space
$H^1_0(\Omega)$ which contains functions $\psi \in H^1(\Omega)$
that vanish on the boundary $\partial\Omega$ (in the trace
sense) (Brezis 1999).


The first control result of the literature
states the noncontrollability of system (1) in
$(H^2 \cap H^1_0)(\Omega) \cap S$ with controls $u \in L^2((0, T), \mathbb{R})$
(Ball et al. 1982; Turinici 2000). More
precisely, by applying $L^2(0, T)$ controls $u$, the
reachable wave functions form a subset of
$(H^2 \cap H^1_0)(\Omega) \cap S$ with empty interior. This
statement does not give obstructions for system
(1) to be controllable in different functional
spaces as we will see below, but it indicates
that controllability issues are much more subtle
in infinite dimension than in finite dimension.


Local Exact Controllability 

In 1D and with Discrete Spectrum

This section is devoted to the 1D PDE:

$$
\begin{cases}
i \partial_t \psi(t, x) = -\partial_x^2 \psi(t, x) - u(t)\,\mu(x)\,\psi(t, x), & x \in (0, 1),\ t \in (0, T), \\
\psi(t, 0) = \psi(t, 1) = 0.
\end{cases}
\qquad (4)
$$

We call "ground state" the solution of the
free system ($u = 0$) built with the first eigenvalue
and eigenvector of $-\partial_x^2$: $\psi_1(t, x) =
\sqrt{2} \sin(\pi x)\, e^{-i \pi^2 t}$. Under appropriate assumptions
on the dipolar moment $\mu$, system (4)
is controllable around the ground state, locally in
$H^3_{(0)}(0, 1) \cap S$, with controls in $L^2((0, T), \mathbb{R})$, as
stated below.

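Theorem 1 Assume $\mu \in H^3((0, 1), \mathbb{R})$ and

$$\left| \int_0^1 \mu(x) \sin(\pi x) \sin(k \pi x)\, dx \right| \ge \frac{c}{k^3}, \qquad \forall k \in \mathbb{N}^*, \qquad (5)$$

for some constant $c > 0$. Then, for every
$T > 0$, there exists $\delta > 0$ such that for
every $\psi_0, \psi_f \in S \cap H^3_{(0)}((0, 1), \mathbb{C})$ with

$$\|\psi_0 - \psi_1(0)\|_{H^3} + \|\psi_f - \psi_1(T)\|_{H^3} < \delta,$$

there exists $u \in L^2((0, T), \mathbb{R})$ such that the
solution of (4) with initial condition $\psi(0, x) =
\psi_0(x)$ satisfies $\psi(T) = \psi_f$.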

Here, $H^3_{(0)}(0, 1) := \{\psi \in H^3((0, 1), \mathbb{C});\
\psi = \psi'' = 0 \text{ at } x = 0, 1\}$. We refer to Beauchard
and Laurent (2010) and Beauchard et al. (2013)
for proof and generalizations to nonlinear PDEs.
The proof relies on the linearization principle, by
applying the classical inverse mapping theorem
to the endpoint map. Controllability of the linearized
system around the ground state is a consequence
of assumption (5) and classical results
about trigonometric moment problems. A subtle
smoothing effect allows one to prove $C^1$-regularity of
the endpoint map.

The assumption (5) holds for generic $\mu \in
H^3((0, 1), \mathbb{R})$ and plays a key role for local exact
controllability to hold in small time $T$. In
Beauchard and Morancey (2014), local exact controllability
is proved under a weaker assumption,
namely, $\mu'(0) \pm \mu'(1) \ne 0$, but only in large
time $T$.
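
Assumption (5) can be explored numerically for specific dipolar moments. The sketch below (plain Python; the function name and the number of quadrature panels are assumptions for this illustration) approximates the integral in (5) with the composite Simpson rule. For $\mu(x) = x^2$ the integral has the closed form $(-1)^{k-1}\, 4k / (\pi^2 (k^2 - 1)^2)$ for $k \ge 2$, which decays exactly like $k^{-3}$, so (5) holds for those $k$. For $\mu(x) = x$, the integral vanishes for every odd $k \ge 3$, so (5) fails.

```python
import math

def dipole_coefficient(mu, k, panels=2000):
    """Composite Simpson approximation of
    int_0^1 mu(x) sin(pi x) sin(k pi x) dx  (panels must be even)."""
    h = 1.0 / panels

    def f(x):
        return mu(x) * math.sin(math.pi * x) * math.sin(k * math.pi * x)

    total = f(0.0) + f(1.0)
    for i in range(1, panels):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0

# mu(x) = x^2: compare with the closed form for k = 2.
c2 = dipole_coefficient(lambda x: x * x, 2)
c2_exact = -8.0 / (9.0 * math.pi ** 2)

# mu(x) = x: the coefficient for k = 3 vanishes.
c3_linear = dipole_coefficient(lambda x: x, 3)

# k^3 |c_k| stays bounded away from zero for mu(x) = x^2.
scaled = [k ** 3 * abs(dipole_coefficient(lambda x: x * x, k))
          for k in range(2, 13)]
```

The quantities in `scaled` approach $4/\pi^2$ from above as $k$ grows, which gives an admissible constant $c$ in (5) for this choice of $\mu$.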

Moreover, under appropriate assumptions on
$\mu$, references Coron (2006) and Beauchard and
Morancey (2014) propose explicit motions that
are impossible in small time $T$, with small controls
in $L^2$. Thus, a positive minimal time is
required for local exact controllability, even if
information propagates at infinite speed. This
minimal time is due to nonlinearities; its characterization
is an open problem.

Actually, assumption $\mu'(0) \pm \mu'(1) \ne 0$ is not
necessary for local exact controllability in large
time. For instance, the quantum box, i.e.,

$$
\begin{cases}
i \partial_t \psi(t, x) = -\partial_x^2 \psi(t, x) - u(t)\, x\, \psi(t, x), & x \in (0, 1), \\
\psi(t, 0) = \psi(t, 1) = 0,
\end{cases}
\qquad (6)
$$

is treated in Beauchard (2005). Of course, these
results are proved with additional techniques:
power series expansions and Coron's return
method (Coron 2007).

There is no contradiction between the negative
result of section "Preliminary Results" and
the positive result of Theorem 1. Indeed, the
wave function cannot be steered between any
two points $\psi_0, \psi_f$ of $H^2 \cap H^1_0$, but it can
be steered between any two points $\psi_0, \psi_f$ of
$H^3_{(0)}$, which is smaller than $H^2 \cap H^1_0$. In particular,
$H^3_{(0)}((0, 1), \mathbb{C})$ has an empty interior in
$(H^2 \cap H^1_0)((0, 1), \mathbb{C})$. Thus, there is no incompatibility
between the reachable set having empty interior
in $H^2 \cap H^1_0$ and the reachable set coinciding
with $H^3_{(0)}$.

Open Problems in Multi-D or 
with Continuous Spectrum 

The linearization principle used to prove Theorem 1
does not work in multi-D: the trigonometric
moment problem, associated to the controllability
of the linearized system, cannot be solved.
Indeed, its frequencies, which are the eigenvalues
of the Dirichlet Laplacian operator, do not satisfy
a required gap condition (Loreti and Komornik
2005).

The study of a toy model (Beauchard 2011)
suggests that if local controllability holds in 2D
(with a priori bounded $L^2$-controls) then a positive
minimal time is required, whatever $\mu$ is. The
appropriate functional frame for such a result is
an open problem.

In 3D or in the presence of continuous
spectrum, we conjecture that local exact controllability
does not hold (with a priori bounded
$L^2$-controls) because the gap condition in the
spectrum of the Dirichlet Laplacian operator is
violated (see Beauchard et al. (2010) for a toy
model from nuclear magnetic resonance and
ensemble controllability as originally stated in Li
and Khaneja (2009)). Thus, exact controllability
should be investigated with controls that are
not a priori bounded in $L^2$; this requires new
techniques. We refer to Nersesyan and Nersisyan
(2012a) for precise negative results.

Finally, we emphasize that exact controllability
in multi-D but in infinite time has been
proved in Nersesyan and Nersisyan (2012a,b),
with techniques similar to those used in the proof
of Theorem 1.


Approximate Controllability 

Different approaches have been developed to 
prove approximate controllability. 

Lyapunov Techniques 

Due to measurement effect and back action,
closed-loop controls in the Schrödinger frame are
not appropriate. However, closed-loop controls
may be computed via numerical simulations
and then applied to real quantum systems
in open loop, without measurement. The
strategy then consists in designing damping
feedback laws, thanks to a controlled Lyapunov
function, which encodes the distance to the
target. In finite dimension, the convergence
proof relies on the LaSalle invariance principle. In
infinite dimension, this principle works when
the trajectories of the closed-loop system are
compact (in the appropriate space), which is
often difficult to prove. Thus, two adaptations
have been proposed: approximate convergence
(Beauchard and Mirrahimi 2009; Mirrahimi
2009) and weak convergence (Beauchard and
Nersesyan 2010) to the target.

Variational Methods and Global Exact 
Controllability 

The global approximate controllability of (1),
in any Sobolev space, is proved in Nersesyan
(2010), under generic assumptions on $(V,
\mu)$, with Lyapunov techniques and variational
arguments.

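Theorem 2 Let $V, \mu \in C^\infty(\Omega, \mathbb{R})$ and
$(\lambda_j)_{j \in \mathbb{N}^*}$, $(\varphi_j)_{j \in \mathbb{N}^*}$ be the eigenvalues and
normalized eigenvectors of $(-\Delta + V)$. Assume
$\langle \mu \varphi_j, \varphi_1 \rangle \ne 0$ for all $j \ge 2$ and $\lambda_1 - \lambda_j \ne
\lambda_p - \lambda_q$ for all $j, p, q \in \mathbb{N}^*$ such that
$\{1, j\} \ne \{p, q\}$; $j \ne 1$. Then, for every
$s > 0$, the system (1) is globally approximately
controllable in $H^s_{(V)} := D[(-\Delta + V)^{s/2}]$, the
domain of $(-\Delta + V)^{s/2}$: for every $\epsilon, \delta > 0$
and $\psi_0 \in S \cap H^s_{(V)}$, there exist a time $T > 0$
and a control $u \in C_0^\infty((0, T), \mathbb{R})$ such that the
solution of (1) with initial condition $\psi(0) = \psi_0$
satisfies $\|\psi(T) - \varphi_1\|_{H^{s-\delta}} < \epsilon$.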

This theorem is of particular importance. Indeed,
in 1D and for appropriate choices of $(V,
\mu)$, global exact controllability of (1) in $H^{3+}$ can
be proved by combining the following:

• Global approximate controllability in $H^3$
given by Theorem 2,

• Local exact controllability in $H^3$ given by
Theorem 1,

• Time reversibility of the Schrödinger equation
(i.e., if $(\psi(t, x), u(t))$ is a trajectory, then so
is $(\psi^*(T - t, x), u(T - t))$ where $\psi^*$ is the
complex conjugate of $\psi$).

Let us illustrate this strategy on the quantum box
(6). First, one can check the assumptions of Theorem 2
with $V(x) = \gamma x$ and $\mu(x) = (1 - \gamma)x$ when
$\gamma > 0$ is small enough. This means that, in (6),
we consider controls $u(t)$ of the form $\gamma + u(t)$.
Thus, an initial condition $\psi_0 \in H^{3+}_{(0)}$ can be
steered arbitrarily close to the first eigenvector
$\varphi_{1,\gamma}$ of $(-\partial_x^2 + \gamma x)$, in $H^3$ norm. Moreover, by
a variant of Theorem 1, the local exact controllability
of (6) holds in $H^3_{(0)}$ around $\varphi_{1,\gamma}$. Therefore,
the initial condition $\psi_0 \in H^{3+}_{(0)}$ can be
steered exactly to $\varphi_{1,\gamma}$ in finite time. By the time
reversibility of the Schrödinger equation, we can
also steer exactly the solution from $\varphi_{1,\gamma}$ to any
target $\psi_f \in H^{3+}_{(0)}$. Therefore, the solution can be
steered exactly from any initial condition $\psi_0 \in
H^{3+}_{(0)}$ to any target $\psi_f \in H^{3+}_{(0)}$ in finite time.

Geometric Techniques Applied to Galerkin 
Approximations 

In Boscain et al. (2012, 2013) and Chambrion
et al. (2009) the authors study the control of
Schrödinger PDEs, in the abstract form (2) and
under technical assumptions on the (unbounded)
operators $H_0$ and $H_1$ that ensure the existence of
solutions with piecewise constant controls $u$:

1. $H_0$ is skew-adjoint on its domain $D(H_0)$.

2. There exists a Hilbert basis $(\varphi_k)_{k \in \mathbb{N}}$ of $\mathcal{H}$ made
of eigenvectors of $H_0$: $H_0 \varphi_k = i\lambda_k \varphi_k$ and
$\varphi_k \in D(H_1)$, $\forall k \in \mathbb{N}$.

3. $H_0 + uH_1$ is essentially skew-adjoint (not
necessarily with domain $D(H_0)$) for every $u \in [0, \delta]$
for some $\delta > 0$.

4. $\langle H_1 \varphi_j, \varphi_k \rangle = 0$ for every $j, k \in \mathbb{N}$ such that
$\lambda_j = \lambda_k$ and $j \neq k$.

Theorem 3 Assume that, for every $j, k \in \mathbb{N}$,
there exists a finite number of integers
$p_1, \ldots, p_r \in \mathbb{N}$ such that

$p_1 = j, \quad p_r = k, \quad \langle H_1 \varphi_{p_l}, \varphi_{p_{l+1}} \rangle \neq 0, \quad \forall l = 1, \ldots, r - 1,$

$|\lambda_L - \lambda_M| \neq |\lambda_{p_l} - \lambda_{p_{l+1}}|, \quad \forall\, 1 \leq l \leq r - 1, \; L, M \in \mathbb{N}$ with $\{L, M\} \neq \{p_l, p_{l+1}\}$.

Then for every $\epsilon > 0$ and $\psi_0, \psi_f$ in the unit
sphere of $\mathcal{H}$, there exists a piecewise constant
function $u : [0, T_\epsilon] \to [0, \delta]$ such that the solution
of (2) with initial condition $\psi(0) = \psi_0$ satisfies

$\|\psi(T_\epsilon) - \psi_f\|_{\mathcal{H}} < \epsilon.$

We refer to Boscain et al. (2012, 2013) and
Chambrion et al. (2009) for the proof and additional
results such as estimates on the $L^1$ norm of
the control. Note that $H_0$ is not necessarily of
the form $(-\Delta + V)$, $H_1$ can be unbounded, $\delta$
may be arbitrarily small, and the two assumptions
of Theorem 3 are generic with respect to $(H_0, H_1)$. The
connectivity and transition frequency conditions in
Theorem 3 mean physically that each pair of
$H_0$ eigenstates is connected via a finite number
of first-order (one-photon) transitions and that
the transition frequencies between pairs of
eigenstates are all different.
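The two conditions of Theorem 3 can be explored numerically on a finite truncation of the spectrum. The sketch below is only an illustration under stated assumptions: the function names, the tolerance `tol`, and the 4-level toy data are hypothetical, and a check on finitely many levels is a sanity test rather than a proof for the infinite-dimensional system. It keeps the pairs of levels that are coupled by $H_1$ and whose transition frequency is shared by no other pair, then tests whether these pairs connect all levels.

```python
from collections import deque


def admissible_edges(eigs, h1, tol=1e-9):
    """Pairs (j, k), j < k, coupled by h1 and with a non-resonant gap.

    eigs: list of the (real) numbers lambda_k of a truncated model.
    h1:   square matrix (list of lists) of the couplings <H1 phi_j, phi_k>.
    """
    n = len(eigs)
    pairs = [(j, k) for j in range(n) for k in range(j + 1, n)]
    gaps = {p: abs(eigs[p[0]] - eigs[p[1]]) for p in pairs}
    edges = []
    for p in pairs:
        coupled = abs(h1[p[0]][p[1]]) > tol
        # The gap of this pair must differ from the gap of every other pair.
        non_resonant = all(abs(gaps[p] - gaps[q]) > tol for q in pairs if q != p)
        if coupled and non_resonant:
            edges.append(p)
    return edges


def is_connected(n, edges):
    """Breadth-first search on the graph with nodes 0..n-1."""
    neighbours = {i: set() for i in range(n)}
    for j, k in edges:
        neighbours[j].add(k)
        neighbours[k].add(j)
    seen, queue = {0}, deque([0])
    while queue:
        i = queue.popleft()
        for m in neighbours[i] - seen:
            seen.add(m)
            queue.append(m)
    return len(seen) == n


def satisfies_chain_condition(eigs, h1, tol=1e-9):
    """True if every pair of levels is linked by a chain of admissible edges."""
    return is_connected(len(eigs), admissible_edges(eigs, h1, tol))


# Hypothetical toy data: nearest-neighbour couplings on four levels.
chain = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
generic = satisfies_chain_condition([0.0, 1.0, 2.5, 4.7], chain)
harmonic = satisfies_chain_condition([0.0, 1.0, 2.0, 3.0], chain)
```

With the irregular spectrum all transition frequencies differ and the chain 0-1-2-3 is admissible. With the equally spaced spectrum every nearest-neighbour transition has the same frequency, so no edge survives and the condition fails on this truncation.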

Note that, contrary to Theorem 2, Theorem 3
cannot be combined with Theorem 1 to prove
global exact controllability. Indeed, the functional
spaces are different: $\mathcal{H} = L^2(\Omega)$ in Theorem 3,
whereas $H^3$-regularity is required for Theorem 1.

Results of this kind apply to several relevant
examples such as the control of a particle in
a quantum box by an electric field (6) and the
control of the planar rotation of a linear molecule
by means of two electric fields:

$i\partial_t \psi(t, \theta) = \left(-\partial_\theta^2 + u_1(t)\cos(\theta) + u_2(t)\sin(\theta)\right)\psi(t, \theta), \quad \theta \in \mathbb{T},$

where $\mathbb{T}$ is the 1D-torus. However, several other
systems of physical interest are not covered by
these results, such as trapped ions modeled by two
coupled quantum harmonic oscillators. In Ervedoza
and Puel (2009), specific methods have been
used to prove their approximate controllability.

Concluding Remarks 

The variety of methods developed by different
authors to characterize controllability of
Schrödinger PDEs with bilinear control is the
sign of a rich structure and of the subtle nature of
control issues. New methods will probably be
necessary to answer the remaining open problems
in the field.

This survey is far from being complete.
In particular, we do not consider numerical
methods to derive the steering control such as
those used in NMR (Nielsen et al. 2010) to
achieve robustness versus parameter uncertainties
or such as monotone algorithms (Baudouin and
Salomon 2008; Liao et al. 2011) for optimal
control (Cances et al. 2000). Nor do we consider
open quantum systems, where the state
is the density operator $\rho$, a nonnegative
Hermitian operator with unit trace on $\mathcal{H}$. The
Schrödinger equation is then replaced by the
Lindblad equation:

$\frac{d}{dt}\rho = -i[H_0 + uH_1, \rho] + \sum_\nu \left( L_\nu \rho L_\nu^\dagger - \frac{1}{2}\left(L_\nu^\dagger L_\nu \rho + \rho L_\nu^\dagger L_\nu\right) \right)$

with operator $L_\nu$ related to the decoherence
channel $\nu$. Even in the case of a finite-dimensional
Hilbert space $\mathcal{H}$, controllability of such systems
is not yet well understood and characterized
(see Altafini (2003) and Kurniawan et al. (2012)).
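For a finite-dimensional Hilbert space the right-hand side of the Lindblad equation is a plain matrix expression. The following sketch evaluates it for a two-level system, assuming that the Hamiltonian $H_0 + uH_1$ is supplied as a Hermitian matrix; the helper names, the control value 0.3, the decay rate 0.1, and the chosen state are illustrative assumptions. It shows that the right-hand side has zero trace and is Hermitian, which is what keeps $\rho$ a unit-trace Hermitian operator along trajectories.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def lindblad_rhs(h, jump_ops, rho):
    """-i[h, rho] + sum_nu (L rho L^+ - (L^+ L rho + rho L^+ L) / 2)."""
    n = len(rho)
    h_rho, rho_h = matmul(h, rho), matmul(rho, h)
    out = [[-1j * (h_rho[i][j] - rho_h[i][j]) for j in range(n)]
           for i in range(n)]
    for l_op in jump_ops:
        l_dag = dagger(l_op)
        gain = matmul(matmul(l_op, rho), l_dag)
        ldl = matmul(l_dag, l_op)
        loss_left, loss_right = matmul(ldl, rho), matmul(rho, ldl)
        for i in range(n):
            for j in range(n):
                out[i][j] += gain[i][j] - 0.5 * (loss_left[i][j] + loss_right[i][j])
    return out


# Hypothetical qubit: H0 = diag(0, 1), H1 = sigma_x, control value u = 0.3,
# and one decoherence channel (decay from level 1 to level 0, rate 0.1).
u = 0.3
hamiltonian = [[0.0, u], [u, 1.0]]
decay = [[0.0, math.sqrt(0.1)], [0.0, 0.0]]
rho = [[0.25, 0.1 + 0.2j], [0.1 - 0.2j, 0.75]]
drho = lindblad_rhs(hamiltonian, [decay], rho)
```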

Cross-References 

► Control of Quantum Systems 

► Robustness Issues in Quantum Control 
Abstract

One-dimensional hyperbolic systems are commonly
used to describe the evolution of various
physical systems. For many of these systems,
controls are available on the boundary. There
are then two natural questions: controllability
(steer the system from a given state to a desired
target) and stabilization (construct feedback laws
leading to a good behavior of the closed loop
system around a given set point).


Keywords 

Chromatography; Controllability; Electrical 
lines; Hyperbolic systems; Open channels; Road 
traffic; Stabilization 



One-Dimensional Hyperbolic Systems

The operation of many physical systems may be
represented by hyperbolic systems in one space
dimension. These systems are described by the
following partial differential equation:

$Y_t + A(Y) Y_x = 0, \quad t \in [0, T], \; x \in [0, L],$ (1)

where:

• $t$ and $x$ are two independent variables: a time
variable $t \in [0, T]$ and a space variable $x \in [0, L]$
over a finite interval.

• $Y : [0, T] \times [0, L] \to \mathbb{R}^n$ is the vector of state
variables.

• $A : \mathbb{R}^n \to \mathcal{M}_{n,n}(\mathbb{R})$, with $\mathcal{M}_{n,n}(\mathbb{R})$ the set
of $n \times n$ real matrices.

• $Y_t$ and $Y_x$ denote the partial derivatives of $Y$
with respect to $t$ and $x$, respectively.

The system (1) is hyperbolic, which means that
$A(Y)$ has $n$ distinct real eigenvalues (called
characteristic velocities) for all $Y$ in a domain of
$\mathbb{R}^n$. Here are some typical examples of physical
models having the form of a hyperbolic system.

Electrical Lines

First proposed by Heaviside in (1885, 1886 and
1887), the equations of (lossless) electrical lines
(also called telegrapher equations) describe the
propagation of current and voltage along electrical
transmission lines (see Fig. 1). It is a hyperbolic
system of the following form:

$\partial_t \begin{pmatrix} I \\ V \end{pmatrix} + \begin{pmatrix} 0 & L_s^{-1} \\ C_s^{-1} & 0 \end{pmatrix} \partial_x \begin{pmatrix} I \\ V \end{pmatrix} = 0,$ (2)

where $I(t,x)$ is the current intensity, $V(t,x)$ is
the voltage, $L_s$ is the self-inductance per unit
length, and $C_s$ is the self-capacitance per unit
length. The system has two characteristic
velocities (which are the eigenvalues of the
matrix $A$):

$\lambda_1 = \frac{1}{\sqrt{L_s C_s}} > 0 > \lambda_2 = -\frac{1}{\sqrt{L_s C_s}}.$ (3)

Saint-Venant Equation for Open Channels

First proposed by Barré de Saint-Venant
in (1871), the Saint-Venant equations (also
called shallow water equations) describe the
propagation of water in open channels (see
Fig. 2). In the case of a horizontal channel
with rectangular cross section, unit width, and
negligible friction, the Saint-Venant model is a
hyperbolic system of the form

$\partial_t \begin{pmatrix} H \\ V \end{pmatrix} + \begin{pmatrix} V & H \\ g & V \end{pmatrix} \partial_x \begin{pmatrix} H \\ V \end{pmatrix} = 0,$ (4)

where $H(t,x)$ is the water depth, $V(t,x)$ is the
horizontal water velocity, and $g$ is the gravity
acceleration. Under subcritical flow conditions, the
system is hyperbolic with characteristic velocities

$\lambda_1 = V + \sqrt{gH} > 0 > \lambda_2 = V - \sqrt{gH}.$ (5)
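The characteristic velocities (3) and (5) can be checked by computing the eigenvalues of the 2 x 2 matrix $A$ directly. The sketch below assumes the standard matrix forms $A = \begin{pmatrix} 0 & 1/L_s \\ 1/C_s & 0 \end{pmatrix}$ for the lossless line and $A = \begin{pmatrix} V & H \\ g & V \end{pmatrix}$ for the Saint-Venant model; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]], largest first.

    Raises ValueError unless the eigenvalues are real and distinct,
    which is the hyperbolicity requirement for a 2 x 2 system.
    """
    half_trace = 0.5 * (a + d)
    disc = half_trace ** 2 - (a * d - b * c)
    if disc <= 0.0:
        raise ValueError("matrix is not strictly hyperbolic")
    root = math.sqrt(disc)
    return half_trace + root, half_trace - root


def telegrapher_velocities(l_s, c_s):
    """Velocities of the lossless line: +/- 1 / sqrt(l_s * c_s)."""
    return eigenvalues_2x2(0.0, 1.0 / l_s, 1.0 / c_s, 0.0)


def saint_venant_velocities(depth, velocity, g=9.81):
    """Velocities of the horizontal frictionless channel: V +/- sqrt(g * H)."""
    return eigenvalues_2x2(velocity, depth, g, velocity)


def is_subcritical(depth, velocity, g=9.81):
    """Subcritical flow: one velocity positive, the other negative."""
    lam1, lam2 = saint_venant_velocities(depth, velocity, g)
    return lam1 > 0.0 > lam2


# Illustrative values: a line with 250 nH/m and 100 pF/m, and a 2 m deep
# pool flowing at 1 m/s.
line = telegrapher_velocities(2.5e-7, 1.0e-10)
pool = saint_venant_velocities(2.0, 1.0)
```

For the pool above, $\sqrt{gH} \approx 4.43$ m/s exceeds the flow velocity, so the two characteristic velocities have opposite signs, which is the subcritical regime assumed in (5).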

Aw-Rascle Equations for Fluid Models
of Road Traffic

In the fluid paradigm for road traffic modeling,
the traffic is described in terms of two basic
macroscopic state variables: the density $\varrho(t,x)$
and the speed $V(t,x)$ of the vehicles at position $x$
along the road at time $t$. The following dynamical
model for road traffic was proposed by Aw and
Rascle in (2000):

$\partial_t \begin{pmatrix} \varrho \\ V \end{pmatrix} + \begin{pmatrix} V & \varrho \\ 0 & V - Q(\varrho) \end{pmatrix} \partial_x \begin{pmatrix} \varrho \\ V \end{pmatrix} = 0.$ (6)
Boundary Control of 1-D Hyperbolic Systems, Fig. 1 Transmission line connecting a power supply to a resistive
load $R_\ell$; the power supply is represented by a Thevenin equivalent with emf $U(t)$ and internal resistance $R_g$

Boundary Control of 1-D Hyperbolic Systems, Fig. 2 Lateral view of a pool of a horizontal open channel


The system is hyperbolic with characteristic
velocities

$\lambda_1 = V > \lambda_2 = V - Q(\varrho).$ (7)

In this model the first equation of (6) is a
continuity equation that represents the conservation of
the number of vehicles on the road. The second
equation of (6) is a phenomenological model
describing the speed variations induced by the
driver’s behavior.


Chromatography 

In chromatography, a mixture of species with 
different affinities is injected in the carrying fluid 
at the entrance of the process as illustrated in 
Fig. 3. The various substances travel at different 
propagation speeds and are ultimately separated 
in different bands. The dynamics of the mixture 
are described by a system of partial differential 
equations: 
Boundary Control of 1-D Hyperbolic Systems, Fig. 3 Principle of chromatography


Lt(P) 


kiPj 
(8)

where $P_i$ ($i = 1, \ldots, n$) denote the densities
of the $n$ carried species. The function $L_i(P)$
(called the “Langmuir isotherm”) was proposed
by Langmuir in (1916).

Boundary Control 

Boundary control of 1-D hyperbolic systems 
refers to situations where manipulated control 
inputs are physically located at the boundaries. 
Formally, this means that the system (1) is 
considered under n boundary conditions having 
the general form 

B(Y(t,0),Y(t,L),U(t))=0, (9) 

with $B : \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^q \to \mathbb{R}^n$. The
dependence of the map $B$ on $(Y(t,0), Y(t,L))$ refers
to natural physical constraints on the system. The
function $U(t) \in \mathbb{R}^q$ represents a set of $q$
exogenous control inputs. The following examples
illustrate how the control boundary conditions (9)
may be defined for some commonly used control
devices:

1. Electrical lines. For the circuit represented in 
Fig. 1, the line model (2) is to be considered 
under the following boundary conditions: 

$V(t,0) + R_g I(t,0) = U(t),$
$V(t,L) - R_\ell I(t,L) = 0.$

The telegrapher equations (2) coupled with 
these boundary conditions constitute therefore 
a boundary control system with the voltage 
U(t) as control input. 


2. Open channels. A standard situation is when 
the boundary conditions are assigned by tun¬ 
able hydraulic gates as in irrigation canals and 
navigable rivers; see Fig. 4. 

The hydraulic model of mobile spillways
gives the boundary conditions

$H(t,0)V(t,0) = k_G \sqrt{[Z_0(t) - U_0(t)]^3},$

$H(t,L)V(t,L) = k_G \sqrt{[H(t,L) - U_L(t)]^3},$

where $H(t,0)$ and $H(t,L)$ denote the water
depth at the boundaries inside the pool, $Z_0(t)$
and $Z_L(t)$ are the water levels on the other
side of the gates, $k_G$ is a constant gate shape
parameter, and $U_0$ and $U_L$ represent the weir
elevations. The Saint-Venant equations coupled
to these boundary conditions constitute a
boundary control system with $U_0(t)$ and $U_L(t)$
as command signals.

3. Ramp metering. Ramp metering is a strategy
that uses traffic lights to regulate the flow of
traffic entering freeways according to measured
traffic conditions as illustrated in Fig. 5.
For the stretch of motorway represented in this
figure, the boundary conditions are

$\varrho(t,0)V(t,0) = Q_{in}(t) + U(t),$

$\varrho(t,L)V(t,L) = Q_{out}(t),$

where $U(t)$ is the inflow rate controlled by
the traffic lights. The Aw-Rascle equations (6)
coupled to these boundary conditions constitute
a boundary control system with $U(t)$ as
the command signal. In a feedback implementation
of the ramp metering strategy, $U(t)$ may
be a function of the measured disturbances
$Q_{in}(t)$ or $Q_{out}(t)$ that are imposed by the
traffic conditions.

Boundary Control of 1-D Hyperbolic Systems, Fig. 4 Hydraulic gates at the input and the output of a pool

Boundary Control of 1-D Hyperbolic Systems, Fig. 5 Ramp metering on a stretch of a motorway

4. Simulated moving bed chromatography. Simulated
moving bed (SMB) chromatography is
a technology where several interconnected
chromatographic columns are switched
periodically against the fluid flow. This
allows for a continuous separation with a
better performance than the discontinuous
single-column chromatography. An efficient
operation of SMB chromatography requires a
tight control of the process by manipulating
the inflow rates in the columns. This process
is therefore a typical example of a periodic
boundary control hyperbolic system.


Controllability 

In this section and in the following one, $Y^* \in \mathbb{R}^n$
is such that none of the eigenvalues of $A(Y^*)$
are 0. After an appropriate linear state
transformation, the matrix $A(Y^*)$ can be assumed to be
diagonal, with distinct and nonzero entries:

$A(Y^*) = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n),$ (10)

$\lambda_1 > \lambda_2 > \cdots > \lambda_m > 0 > \lambda_{m+1} > \cdots > \lambda_n.$

Let $Y^+ \in \mathbb{R}^m$ and $Y^- \in \mathbb{R}^{n-m}$ be such that
$Y^T = (Y^{+T} \; Y^{-T})$.

For the boundary control system (1), (9), the
local controllability issue is to investigate if,
starting from a given initial state $Y_0 : x \in [0, L] \mapsto Y_0(x) \in \mathbb{R}^n$,
it is possible to reach in
time $T$ a desired target state $Y_1 : x \in [0, L] \mapsto Y_1(x) \in \mathbb{R}^n$,
with $Y_0(x)$ and $Y_1(x)$ close to $Y^*$.

Theorem 4 (See Li and Rao 2003) If there exist
control inputs $U^+(t)$ and $U^-(t)$ such that the
boundary conditions (9) are equivalent to

$Y^+(t,0) = U^+(t), \quad Y^-(t,L) = U^-(t),$ (11)

then the boundary control system (1), (11) is
locally controllable for the $C^1$-norm if and only
if $T > T_c$, where $T_c$ is a critical control time
determined by $L$ and the characteristic velocities.

Feedback Stabilization 

For the boundary control system (1), (9), the
problem of local boundary feedback stabilization
is the problem of finding boundary feedback
control actions

$U(t) = F(Y(t,0), Y(t,L), Y^*), \quad F : \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^q,$ (12)

such that the system trajectory exponentially
converges to a desired steady state $Y^*$ (called the set
point) from any initial condition $Y_0(x)$ close to
$Y^*$. In such a case, the set point is said to be
exponentially stable.

Theorem 5 (See Coron et al. 2008) If
there exists a boundary feedback $U(t) = F(Y(t,0), Y(t,L), Y^*)$
such that the boundary
conditions (9) are written in the form

$\begin{pmatrix} Y^+(t,0) \\ Y^-(t,L) \end{pmatrix} = G\begin{pmatrix} Y^+(t,L) \\ Y^-(t,0) \end{pmatrix}, \quad G(Y^*) = Y^*,$ (13)

then, for the boundary control system (1), (13),
the set point $Y^*$ is locally exponentially stable for
the $H^2$-norm if

$\mathrm{Inf}\{\|\Delta G'(Y^*) \Delta^{-1}\|;\ \Delta \in \mathcal{D}\} < 1,$

where $\|\cdot\|$ denotes the usual 2-norm of $n \times n$ real
matrices, $G'(Y^*)$ denotes the Jacobian matrix
of the map $G$ at $Y^*$, and $\mathcal{D}$ denotes the set of
$n \times n$ diagonal real matrices with strictly positive
diagonal entries.
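The role of the diagonal scaling in the condition of Theorem 5 can be seen on a 2 x 2 example. The sketch below is a rough numerical illustration, not a general algorithm: it uses the closed-form largest singular value of a 2 x 2 matrix and scans scalings of the form diag(1, d) on a logarithmic grid, which is enough in dimension 2 because only the ratio of the two diagonal entries matters. The function names, the grid, and the sample Jacobian are hypothetical, and the grid search returns an upper estimate of the infimum.

```python
import math


def spectral_norm_2x2(m):
    """Largest singular value of a real 2 x 2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    frob2 = a * a + b * b + c * c + d * d   # sum of squared singular values
    det = a * d - b * c                     # product of singular values, up to sign
    disc = max(frob2 * frob2 - 4.0 * det * det, 0.0)
    return math.sqrt(0.5 * (frob2 + math.sqrt(disc)))


def scaled(m, d1, d2):
    """Delta * m * Delta^{-1} for Delta = diag(d1, d2), d1 > 0, d2 > 0."""
    (a, b), (c, d) = m
    return [[a, b * d1 / d2], [c * d2 / d1, d]]


def scaled_norm_estimate(m, samples=2001, span=4.0):
    """Smallest 2-norm found over Delta = diag(1, 10**e), e in [-span, span]."""
    best = float("inf")
    for i in range(samples):
        e = -span + 2.0 * span * i / (samples - 1)
        best = min(best, spectral_norm_2x2(scaled(m, 1.0, 10.0 ** e)))
    return best


# Hypothetical Jacobian G'(Y*): its plain 2-norm is 2, yet a suitable
# scaling brings the norm below 1, so the condition of Theorem 5 holds.
jacobian = [[0.0, 2.0], [0.1, 0.0]]
plain = spectral_norm_2x2(jacobian)
estimate = scaled_norm_estimate(jacobian)
```

For this matrix the scaled norm equals $\max(2/d, 0.1\,d)$, which is minimal at $d = \sqrt{20}$ with value $\sqrt{0.2} \approx 0.447$.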


For the stabilization in the $C^1$-norm, another
sufficient condition is given in Li (1994).


Summary and Future Directions

With suitable boundary controls, hyperbolic
systems can be controlled and stabilized around a
desired set point. However, in many situations
the hyperbolic model is not sufficient: one needs
to add a zero-order term and (1) has to be
replaced by

$Y_t + A(Y) Y_x + C(Y) = 0, \quad t \in [0, T], \; x \in [0, L],$ (14)

where $C : \mathbb{R}^n \to \mathbb{R}^n$. This is, for example,
the case for the open channels when slope and
friction cannot be neglected. Note that the set
point $Y^*$ may now depend on $x$. For the
controllability issue, the new term $C(Y)$ turns out
to be not essential; see in particular Li (2010).
The situation is not the same for the stabilization
and only partial results are known. In particular,
Coron et al. (2013) uses Krstic’s backstepping
approach (Krstic and Smyshlyaev 2008) to treat
the case $n = 2$ and $m = 1$.

Another important issue for the system (1) is
the observability problem: assuming that the state
is measured on the boundary during the interval
of time $[0, T]$, can one recover the initial data?
As shown in Li (2010), this problem has strong
connections with the controllability problem and
the system (1) is observable if the time $T$ is large
enough.

The above results are on smooth solutions of
(1). However, the system (1) is known to be
well posed in the class of BV-solutions (Bounded
Variations), with extra conditions (e.g., entropy
type); see in particular Bressan (2000). There are
partial results on the controllability in this class.
See, in particular, Ancona and Marson (1998) and
Horsin (1998) for $n = 1$. For $n = 2$, it is shown
in Bressan and Coclite (2002) that Theorem 4 no
longer holds in general in the BV class. However,
there are positive results for important physical
systems; see, for example, Glass (2007) for the
1-D isentropic Euler equation.


Cross-References

► Controllability and Observability 

► Control of Fluids and Fluid-Structure Interac¬ 
tions 

► Control of Linear Systems with Delays 

► Feedback Stabilization of Nonlinear Systems 

► Lyapunov’s Stability Theory 
Abstract

The Korteweg-de Vries (KdV) and the Kuramoto-
Sivashinsky (KS) partial differential equations
are used to model nonlinear propagation of
one-dimensional phenomena. The KdV equation
is used in fluid mechanics to describe wave
propagation on shallow water surfaces, while
the KS equation models front propagation in
reaction-diffusion systems. In this article, the
boundary control of these equations is considered
when they are posed on a bounded interval.
Different choices of controls are studied for each
equation.

Keywords 

Controllability; Dispersive equations; Higher- 
order partial differential equations; Parabolic 
equations; Stabilizability 

Introduction 

The Korteweg-de Vries (KdV) and the Kuramoto-
Sivashinsky (KS) equations have very different
properties because they do not belong to the same
class of partial differential equations (PDEs).
The first one is a third-order nonlinear dispersive
equation

$y_t + y_x + y_{xxx} + y y_x = 0,$ (1)

and the second one is a fourth-order nonlinear
parabolic equation

$u_t + u_{xxxx} + \lambda u_{xx} + u u_x = 0,$ (2)

where $\lambda > 0$ is called the anti-diffusion
parameter. However, they have one important
characteristic in common. They are both used to model
nonlinear propagation phenomena in the space
$x$-direction when the variable $t$ stands for time.
The KdV equation serves as a model for wave
propagation on shallow water surfaces (Korteweg
and de Vries 1895), and the KS equation models
front propagation in reaction-diffusion phenomena
including some instability effects (Kuramoto
and Tsuzuki 1975; Sivashinsky 1977).

From a control point of view, a new com¬ 
mon characteristic arises. Because of the order 
of the spatial derivatives involved, when studying 
these equations on a bounded interval [0, L], two 
boundary conditions have to be imposed at the 
same point, for instance, at x = L. Thus, we 
can consider control systems where we control 
one boundary condition but not all the boundary
data at one endpoint of the interval. This
configuration is not possible for the classical
wave and heat equations where at each endpoint,
only one boundary condition exists and therefore
controlling one or all the boundary data at one
point is the same.

The KdV equation being of third order in 
space, three boundary conditions have to be im¬ 
posed: one at the left endpoint x = 0 and two at 
the right endpoint x = L. For the KS equation, 
four boundary conditions are needed to get a
well-posed system, two at each endpoint. We will
focus on the cases where Dirichlet and Neumann
boundary conditions are considered, because a lack
of controllability then appears. This happens
for some special values of the length of the
interval for the KdV equation and depends on the
anti-diffusion coefficient $\lambda$ for the KS equation.

The particular cases where the lack of 
controllability occurs can be seen as isolated 
anomalies. However, those phenomena give 
us important information on the systems. In 
particular, any method independent of the value 
of those constants cannot control or stabilize 
the system when acting from the corresponding 
control input where trouble appears. In all of 
these cases, for both the KdV and the KS 
equations, the space of uncontrollable states is 
finite dimensional, and therefore, some methods 
coming from the control of ordinary differential 
equations can be applied. 

General Definitions 

Infinite-dimensional control systems described 
by PDEs have attracted a lot of attention since the 
1970s. In this framework, the state of the control 
system is given by the solution of an evolution 
PDE. This solution can be seen as a trajectory
in an infinite-dimensional Hilbert space $H$, for
instance, the space of square integrable functions
or some Sobolev space. Thus, for any time $t$, the
state belongs to $H$. Concerning the control input,
this is either an internal force distributed in the
domain, or a point force localized within the
domain, or some boundary data as considered in
this article. For any time $t$, the control belongs
to a control space U, which can be, for instance, 


the space of bounded functions. The main control 
properties to be mentioned in this article are con¬ 
trollability, stability, and stabilization. A control 
system is said to be exactly controllable if the 
system can be driven from any initial state to 
another one in finite time. This kind of property
holds, for instance, for hyperbolic systems such as the
wave equation. The notion of null-controllability
means that the system can be driven to the origin 
from any initial state. The main example for 
this property is the heat equation, which presents 
regularizing effects. Even if the initial data is 
discontinuous, right after t = 0, the solution 
of the heat equation becomes very smooth, and 
therefore, it is not possible to impose a discontin¬ 
uous final state. A system is said to be asymptoti¬ 
cally stable if the solutions of the system without 
any control converge as the time goes to infinity 
to a stationary solution of the PDE. When this 
convergence holds with a control depending at 
each time on the state of the system (feedback 
control), the system is said to be stabilizable by 
means of a feedback control law. 

All these properties have local versions when a 
smallness condition for the initial and/or the final 
state is added. This local character is normally 
due to the nonlinearity of the system. 


The KdV Equation 

The classical approach to deal with nonlinearities 
is first to linearize the system around a given 
state or trajectory, then to study the linear system 
and finally to go back to the nonlinear one by 
means of an inversion argument or a fixed-point 
theorem. Linearizing (1) around the origin, we 
get the equation 

$y_t + y_x + y_{xxx} = 0,$ (3)

which can be studied on a finite interval $[0, L]$
under the following three boundary conditions:

$y(t,0) = h_1(t), \quad y(t,L) = h_2(t), \quad$ and $\quad y_x(t,L) = h_3(t).$ (4)





Boundary Control of Korteweg-de Vries and Kuramoto-Sivashinsky PDEs 


Thus, viewing $h_1(t), h_2(t), h_3(t) \in \mathbb{R}$ as controls
and the solution $y(t, \cdot) : [0, L] \to \mathbb{R}$ as the state,
we can consider the linear control system (3)-(4)
and the nonlinear one (1)-(4).

We will report on the role of each input control
when the other two are off. The tools used are
mainly the duality controllability-observability,
Carleman estimates, the multiplier method, the
compactness-uniqueness argument, the backstepping
method, and fixed-point theorems. Surprisingly,
the control properties of the system depend
strongly on the location of the controls.

Theorem 1 The linear KdV system (3)-(4) is:

1. Null-controllable when controlled from $h_1$
(i.e., $h_2 = h_3 = 0$) (Glass and Guerrero
2008).

2. Exactly controllable when controlled from $h_2$
(i.e., $h_1 = h_3 = 0$) if and only if $L$ does not
belong to a set $\mathcal{O}$ of critical lengths defined in
Glass and Guerrero (2010).

3. Exactly controllable when controlled from $h_3$
(i.e., $h_1 = h_2 = 0$) if and only if $L$ does not
belong to a set of critical lengths $\mathcal{N}$ defined in
Rosier (1997).

4. Asymptotically stable to the origin if $L \notin \mathcal{N}$
and no control is applied (Perla Menzala et al.
2002).

5. Stabilizable by means of a feedback law using
$h_1$ only (i.e., $h_2 = h_3 = 0$) (Cerpa and Coron
2013).

If $L \in \mathcal{N}$ or $L \in \mathcal{O}$, one says that $L$ is
a critical length since the linear control system
(3)-(4) loses controllability properties when only
one control input is applied. In those cases, there
exists a finite-dimensional subspace of $L^2(0, L)$
which is unreachable from 0 for the linear system.
The sets $\mathcal{N}$ and $\mathcal{O}$ contain infinitely many critical
lengths, but they are countable sets.
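The set of critical lengths of Rosier (1997) is only cited here, not written out. In the literature it is usually stated as $\{2\pi\sqrt{(k^2 + kl + l^2)/3} : k, l \text{ positive integers}\}$; this expression is recalled from that literature rather than from the text above. The sketch below enumerates these values for small indices and tests whether a given length is numerically close to one of them. The function names, the index bound, and the tolerance are illustrative assumptions.

```python
import math


def rosier_critical_lengths(max_index):
    """Sorted distinct values 2*pi*sqrt((k*k + k*l + l*l) / 3), 1 <= k, l <= max_index."""
    values = {
        2.0 * math.pi * math.sqrt((k * k + k * l + l * l) / 3.0)
        for k in range(1, max_index + 1)
        for l in range(1, max_index + 1)
    }
    return sorted(values)


def is_near_critical(length, max_index=50, tol=1e-9):
    """True if length is within tol of a critical length with indices <= max_index.

    Lengths that would need larger indices (roughly length > 180 for the
    default max_index) are not covered by this finite check.
    """
    return any(abs(length - c) <= tol for c in rosier_critical_lengths(max_index))


first_lengths = rosier_critical_lengths(3)[:3]
```

The smallest value, obtained for $k = l = 1$, is $2\pi$. The enumeration also makes the countability of the set concrete: the values are indexed by pairs of positive integers.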

When one is allowed to use more than one 
boundary control input, there is no critical spatial 
domain, and the exact controllability holds for 
any L > 0. This is proved in Zhang (1999) when 
three boundary controls are used. The case of two 
control inputs is solved in Rosier (1997), Glass 
and Guerrero (2010), and Cerpa et al. (2013). 

The previous results concern the linearized control
system. Considering the nonlinearity $y y_x$, we
obtain the original KdV control system and the
following results.

Theorem 2 The nonlinear KdV system (1)-(4) is:

1. Locally null-controllable when controlled
from $h_1$ (i.e., $h_2 = h_3 = 0$) (Glass and
Guerrero 2008).

2. Locally exactly controllable when controlled
from $h_2$ (i.e., $h_1 = h_3 = 0$) if $L$ does not
belong to the set $\mathcal{O}$ of critical lengths (Glass
and Guerrero 2010).

3. Locally exactly controllable when controlled
from $h_3$ (i.e., $h_1 = h_2 = 0$). If $L$ belongs to the
set of critical lengths $\mathcal{N}$, then a minimal time
of control may be required (see Cerpa 2014).

4. Asymptotically stable to the origin if $L \notin \mathcal{N}$
and no control is applied (Perla Menzala et al.
2002).

5. Locally stabilizable by means of a feedback
law using $h_1$ only (i.e., $h_2 = h_3 = 0$) (Cerpa
and Coron 2013).

Item 3 in Theorem 2 is a truly nonlinear result
obtained by applying a power series method
introduced in Coron and Crepeau (2004). All
other items are implied by perturbation arguments
based on the linear control system. The related
control system formed by (1) with boundary
controls

$y(t,0) = h_1(t), \quad y_x(t,L) = h_2(t), \quad$ and $\quad y_{xx}(t,L) = h_3(t),$ (5)

is studied in Cerpa et al. (2013), and the same
phenomenon of critical lengths appears.


The KS Equation 

Applying the same strategy as for KdV, we
linearize (2) around the origin to get the equation

$u_t + u_{xxxx} + \lambda u_{xx} = 0,$ (6)

which can be studied on the finite interval $[0, 1]$
under the following four boundary conditions:

$u(t,0) = v_1(t), \quad u_x(t,0) = v_2(t),$

$u(t,1) = v_3(t), \quad$ and $\quad u_x(t,1) = v_4(t).$ (7)

Thus, viewing $v_1(t), v_2(t), v_3(t), v_4(t) \in \mathbb{R}$ as
controls and the solution $u(t, \cdot) : [0, 1] \to \mathbb{R}$
as the state, we can consider the linear control
system (6)-(7) and the nonlinear one (2)-(7). The
role of the parameter $\lambda$ is crucial. The KS
equation is parabolic and the eigenvalues of system
(6)-(7) with no control ($v_1 = v_2 = v_3 = v_4 = 0$)
go to $-\infty$. If $\lambda$ increases, then the eigenvalues
move to the right. When $\lambda > 4\pi^2$, the system
becomes unstable because there are a finite
number of positive eigenvalues. In this unstable
regime, the system loses control properties for
some values of $\lambda$.

Theorem 3 The linear KS control system (6)-(7)
is:

1. Null-controllable when controlled from $v_1$ and
$v_2$ (i.e., $v_3 = v_4 = 0$). The same is true when
controlling $v_3$ and $v_4$ (i.e., $v_1 = v_2 = 0$)
(Cerpa and Mercado 2011; Lin Guo 2002).

2. Null-controllable when controlled from $v_2$
(i.e., $v_1 = v_3 = v_4 = 0$) if and only if $\lambda$ does
not belong to a countable set $\mathcal{M}$ defined in
Cerpa (2010).

3. Asymptotically stable to the origin if $\lambda < 4\pi^2$
and no control is applied (Liu and Krstic
2001).

4. Stabilizable by means of a feedback law using
$v_2$ only (i.e., $v_1 = v_3 = v_4 = 0$) if and only if
$\lambda \notin \mathcal{M}$ (Cerpa 2010).

In the critical case $\lambda \in \mathcal{M}$, the linear system
is no longer null-controllable if we control
$v_2$ only (item 2 in Theorem 3). The space of
noncontrollable states is finite dimensional. To
obtain the null-controllability of the linear system
in these cases, we have to add another control.
Controlling with $v_2$ and $v_4$ does not improve the
situation in the critical cases. In contrast, the
system becomes null-controllable if we can act
on $v_1$ and $v_2$. This result with two input controls
has been proved in Lin Guo (2002) for the case
$\lambda = 0$ and in Cerpa and Mercado (2011) in the
general case (item 1 in Theorem 3).


It is known from Liu and Krstic (2001) that
if $\lambda < 4\pi^2$, then the system is exponentially
stable in $L^2(0, 1)$. On the other hand, if $\lambda = 4\pi^2$,
then zero becomes an eigenvalue of the system,
and therefore the asymptotic stability fails. When
$\lambda > 4\pi^2$, the system has positive eigenvalues
and becomes unstable. In order to stabilize this
system, a finite-dimensional-based feedback law
can be designed by using the pole placement
method (item 4 in Theorem 3).
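The appearance of a zero eigenvalue at the threshold can be verified by hand. As a short check, assuming the uncontrolled boundary conditions of (7) (all four boundary data equal to zero), take the candidate function $\phi(x) = 1 - \cos(2\pi x)$, which is chosen here for illustration and is not taken from the cited references:

```latex
\[
\phi(x) = 1 - \cos(2\pi x), \qquad
\phi(0) = \phi(1) = 0, \qquad
\phi'(x) = 2\pi \sin(2\pi x) \;\Rightarrow\; \phi'(0) = \phi'(1) = 0,
\]
\[
\phi''''(x) + \lambda \phi''(x)
  = -(2\pi)^4 \cos(2\pi x) + \lambda (2\pi)^2 \cos(2\pi x)
  = (2\pi)^2 \left( \lambda - 4\pi^2 \right) \cos(2\pi x).
\]
```

The right-hand side vanishes identically exactly when $\lambda = 4\pi^2$. In that case $u(t, x) = \phi(x)$ is a nonzero stationary solution of (6) with homogeneous boundary data, so solutions cannot all converge to the origin.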

The previous results concern the linearized control
system. If we add the nonlinearity $u u_x$, we obtain
the original KS control system and the following
results.

Theorem 4 The KS control system (2)-(7) is:

1. Locally null-controllable when controlled
from $v_1$ and $v_2$ (i.e., $v_3 = v_4 = 0$). The
same is true when controlling $v_3$ and $v_4$ (i.e.,
$v_1 = v_2 = 0$) (Cerpa and Mercado 2011).

2. Asymptotically stable to the origin if $\lambda < 4\pi^2$
and no control is applied (Liu and Krstic
2001).

There are fewer results for the nonlinear system
than for the linear one. This is due to the fact that
the spectral techniques used to study the linear 
system with only one control input are not robust 
enough to deal with perturbations in order to 
address the nonlinear control system. 


Summary and Future Directions 

Both the KdV and the KS equations exhibit
negative controllability results when a single boundary
control input is applied. This is due to the fact that
both are higher-order equations, and therefore,
when posed on a bounded interval, more than
one boundary condition should be imposed at
the same point. The KdV equation is exactly
controllable when acting from the right and
null-controllable when acting from the left. On the
other hand, the KS equation, being parabolic like
the heat equation, is not exactly controllable but
null-controllable. Most of the results are implied
by the behavior of the corresponding linear
system, which is very well understood.

For the KdV equation, the main directions to
investigate at this moment are the controllability
and the stability for the nonlinear equation in
critical domains. Among others, some questions
concerning controllability, minimal time of
control, and decay rates for the stability are open.
Regarding the KS equation, there are few results
for the nonlinear system with one control input,
even when the anti-diffusion parameter is not at
a critical value. In the critical cases, the
controllability and stability issues are wide open.

In general, for PDEs, there are few results 
about delay phenomena, output feedback laws, 
adaptive control, and other classical questions 
in control theory. The existing results on these 
topics mainly concern the more popular heat and 
wave equations. As the KdV and KS equations are
one dimensional in space, many mathematical
tools are available to tackle those problems. For
all these reasons, in our opinion, the KdV and KS equations
are excellent candidates to continue investigating
these control properties in a PDE framework.

Cross-References 

► Boundary Control of 1-D Hyperbolic Systems 

► Controllability and Observability 

► Control of Fluids and Fluid-Structure Interac¬ 
tions 

► Feedback Stabilization of Nonlinear Systems 

► Stability: Lyapunov, Linear Systems 

Recommended Reading 

The book Coron (2007) is a very good reference
to study the control of PDEs. In Cerpa
(2014), a tutorial presentation of the KdV control
system is given. Control systems for PDEs
with boundary conditions and internal controls are
considered in Rosier and Zhang (2009) and the
references therein for the KdV equation and in
Armaou and Christofides (2000) and Christofides
and Armaou (2000) for the KS equation. Control
topics such as delay and adaptive control are studied
in the framework of PDEs in Krstic (2009) and
Smyshlyaev and Krstic (2010), respectively.
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Abstract 

We review several universal lower bounds on
statistical estimation, including deterministic
bounds on unbiased estimators such as the
Cramer-Rao bound and the Barankin-type bound, as
well as Bayesian bounds such as the Ziv-Zakai bound. We
present explicit forms of these bounds, illustrate
their usage for parameter estimation in Gaussian
additive noise, and compare their tightness.


Keywords 

Barankin-type bound; Cramer-Rao bound; 
Mean-squared error; Statistical estimation; 
Ziv-Zakai bound 


Introduction 

Statistical estimation involves inferring the values
of parameters specifying a statistical model from
data. The performance of a particular statistical
algorithm is measured by the error between the
true parameter values and those estimated by the
algorithm. However, explicit forms of estimation
error are usually difficult to obtain except for
the simplest statistical models. Therefore,
performance bounds are derived as a way of
quantifying estimation accuracy while maintaining
tractability.

In many cases, it is beneficial to quantify
performance using universal bounds that are
independent of the estimation algorithms and rely
only upon the model. In this regard, universal
lower bounds are particularly useful, as they provide
a means to assess the difficulty of performing
estimation for a particular model and can act as
benchmarks to evaluate the quality of any
algorithm: the closer the estimation error of the
algorithm to the lower bound, the better the algorithm.
In the following, we review three widely used
universal lower bounds on estimation: the Cramer-Rao
bound (CRB), the Barankin-type bound (BTB),
and the Ziv-Zakai bound (ZZB). These bounds find
numerous applications in determining the
performance of sensor arrays, radar, and nonlinear
filtering; in benchmarking various algorithms;
and in optimal design of systems.


Statistical Model and Related 
Concepts 

To formalize matters, we define a statistical
model for estimation as a family of parameterized
probability density functions in \(\mathbb{R}^N\):
\(\{p(x;\theta) : \theta \in \Theta \subset \mathbb{R}^d\}\).
We observe a realization of \(x \in \mathbb{R}^N\)
generated from a distribution \(p(x;\theta)\),
where \(\theta \in \Theta\) is the true parameter to be
estimated from data \(x\). Though we assume a
single observation \(x\), the model is general enough
to encompass multiple independent, identically
distributed (i.i.d.) samples by considering the
joint probability distribution.

An estimator of \(\theta\) is a measurable function of
the observation \(\hat{\theta}(x) : \mathbb{R}^N \to \Theta\). An unbiased
estimator is one such that

\[ E_\theta\{\hat{\theta}(x)\} = \theta, \quad \forall\, \theta \in \Theta. \qquad (1) \]

Here we used the subscript \(\theta\) to emphasize
that the expectation is taken with respect to
\(p(x;\theta)\). We focus on the performance of
unbiased estimators in this entry. There are
various ways to measure the error of the estimator
\(\hat{\theta}(x)\). Two typical ones are the error covariance
matrix:

\[ E_\theta\{(\hat{\theta} - \theta)(\hat{\theta} - \theta)^T\} = \mathrm{Cov}(\hat{\theta}), \qquad (2) \]

where the equation holds only for unbiased
estimators, and the mean-squared error (MSE):

\[ E_\theta\{\|\hat{\theta}(x) - \theta\|_2^2\} = \mathrm{trace}\left(E_\theta\{(\hat{\theta} - \theta)(\hat{\theta} - \theta)^T\}\right). \qquad (3) \]

Example 1 (Signal in additive Gaussian noise
(SAGN)) To illustrate the usage of different
estimation bounds, we use the following
statistical model as a running example:

\[ x_n = s_n(\theta) + w_n, \quad n = 0, \ldots, N-1. \qquad (4) \]

Here \(\theta \in \Theta \subset \mathbb{R}\) is a scalar parameter to be
estimated and the noise \(w_n\) follows an i.i.d. Gaussian
distribution with mean 0 and known variance \(\sigma^2\).
Therefore, the density function for \(x\) is

\[ p(x;\theta) = \prod_{n=0}^{N-1} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(x_n - s_n(\theta))^2}{2\sigma^2}\right\} = \frac{1}{(\sqrt{2\pi}\,\sigma)^N} \exp\left\{-\sum_{n=0}^{N-1} \frac{(x_n - s_n(\theta))^2}{2\sigma^2}\right\}. \]

In particular, we consider the frequency estimation
problem where \(s_n(\theta) = \cos(2\pi n\theta)\) with
\(\Theta = [0, 1/4)\).


Cramer-Rao Bound 

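The Cramer-Rao bound (CRB) (Kay 2001a; Stoica
and Nehorai 1989; Van Trees 2001) is arguably
the best-known lower bound on
estimation. Define the Fisher information matrix
\(I(\theta)\) via

\[ I_{ij}(\theta) = E_\theta\left\{\frac{\partial \log p(x;\theta)}{\partial \theta_i} \frac{\partial \log p(x;\theta)}{\partial \theta_j}\right\} = -E_\theta\left\{\frac{\partial^2 \log p(x;\theta)}{\partial \theta_i \, \partial \theta_j}\right\}. \]

Then for any unbiased estimator \(\hat{\theta}\), the error
covariance matrix is bounded by

\[ E_\theta\{(\hat{\theta} - \theta)(\hat{\theta} - \theta)^T\} \succeq [I(\theta)]^{-1}, \qquad (5) \]

where \(A \succeq B\) for two symmetric matrices means
\(A - B\) is positive semidefinite. The inverse of the
Fisher information matrix, \(\mathrm{CRB}(\theta) = [I(\theta)]^{-1}\), is
called the Cramer-Rao bound.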

When \(\theta\) is scalar, \(I(\theta)\) measures the expected
sensitivity of the density function with respect to
changes in the parameter. A density family that
is more sensitive to parameter changes (larger
\(I(\theta)\)) will generate observations that look more
different when the true parameter varies, making
it easier to estimate (smaller error).

Example 2 For the SAGN model (4), the CRB is

\[ \mathrm{CRB}(\theta) = I(\theta)^{-1} = \frac{\sigma^2}{\sum_{n=0}^{N-1} \left[\frac{\partial s_n(\theta)}{\partial \theta}\right]^2}. \qquad (6) \]

The inverse dependence on the \(\ell_2\) norm of the signal
derivative suggests that signals more sensitive to
parameter change are easier to estimate.

For the frequency estimation problem with
\(s_n(\theta) = \cos(2\pi n\theta)\), the CRB as a function of
\(\theta\) is plotted in Fig. 1.
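
As an illustration, the CRB (6) for the frequency estimation problem can be evaluated numerically. The following is a minimal sketch, assuming the signal model s_n(θ) = cos(2πnθ) of Example 1; the function name `crb_frequency` and its arguments are hypothetical and not part of the original entry.

```python
import math


def crb_frequency(theta, n_samples, sigma):
    """Evaluate the CRB (6) for s_n(theta) = cos(2*pi*n*theta)."""
    # Squared l2 norm of the signal derivative:
    # d s_n / d theta = -2*pi*n*sin(2*pi*n*theta)
    energy = sum(
        (2.0 * math.pi * n * math.sin(2.0 * math.pi * n * theta)) ** 2
        for n in range(n_samples)
    )
    if energy == 0.0:
        # Zero Fisher information (e.g., theta = 0): the bound is infinite.
        return math.inf
    return sigma ** 2 / energy


if __name__ == "__main__":
    # Compare the two sample sizes used in Fig. 1 at one parameter value.
    for n_samples in (5, 10):
        print(n_samples, crb_frequency(0.1, n_samples, sigma=1.0))
```

Because every additional sample adds a nonnegative term to the denominator, the bound for N = 10 can never exceed the bound for N = 5 at the same θ and σ.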

There are many modifications of the basic
CRB, such as the posterior CRB (Tichavsky et al.
1998; Van Trees 2001), the hybrid CRB (Rockah
and Schultheiss 1987), the modified CRB
(D’Andrea et al. 1994), the concentrated CRB
(Hochwald and Nehorai 1994), and the constrained
CRB (Gorman and Hero 1990; Marzetta 1993;
Stoica and Ng 1998). The posterior CRB
takes into account the prior information of the
parameters when they are modeled as random
variables, while the hybrid CRB considers the
case that the parameters contain both random and
deterministic parts. The modified CRB and the
concentrated CRB focus on handling nuisance
parameters in a tractable manner. The application
of these CRBs requires a regular parameter
space (e.g., an open set in \(\mathbb{R}^d\)). However, in many
cases, the parameter space \(\Theta\) is a low-dimensional
manifold in \(\mathbb{R}^d\) specified by equalities and
inequalities. In this case, the constrained CRB
provides tighter lower bounds by incorporating
knowledge of the constraints.

Bounds on Estimation, Fig. 1 Cramer-Rao bound on frequency estimation: N = 5 vs. N = 10


Barankin Bound 

The CRB is a local bound in the sense that it involves
only local properties (the first- or second-order
derivatives) of the log-likelihood function. So if
two families of log-likelihood functions coincide
in a region near \(\theta^0\), the CRB at \(\theta^0\) would be the
same, even if they are drastically different in other
regions of the parameter space.

However, the entire parameter space should
play a role in determining the difficulty of
parameter estimation. To see this, imagine that there are
two statistical models. In the first model there is
another point \(\theta^1 \in \Theta\) such that the likelihood
family \(p(x;\theta)\) behaves similarly around \(\theta^0\) and
\(\theta^1\), but these two points are not in
neighborhoods of each other. Then it would be difficult
to distinguish these two points for any estimation
algorithm, and the estimation performance for the
first statistical model would be bad (an extreme
case is \(p(x;\theta^0) = p(x;\theta^1)\), in which case the
model is non-identifiable; more discussion of
identifiability and the Fisher information matrix can
be found in Hochwald and Nehorai (1997)). In
the second model, we remove the point \(\theta^1\) and its
near neighborhood from \(\Theta\); the performance
should then get better. However, the CRB for both models
would remain the same whether we exclude \(\theta^1\)
from \(\Theta\) or not. As a matter of fact, \(\mathrm{CRB}(\theta^0)\) uses
only the fact that the estimator is unbiased in a
neighborhood of the true parameter \(\theta^0\).

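The Barankin bound addresses the CRB’s shortcoming
of not respecting the global structure of the
statistical model by introducing finitely many test
points \(\{\theta^i, i = 1, \ldots, M\}\) and requiring that the
estimator be unbiased in the neighborhood of \(\theta^0\)
as well as at these test points (Forster and Larzabal
2002). The original Barankin bound (Barankin
1949) is derived for a scalar parameter \(\theta \in \Theta \subset \mathbb{R}\)
and any unbiased estimator \(\hat{g}(x)\) of a function
\(g(\theta)\):

\[ E_\theta\{(\hat{g}(x) - g(\theta))^2\} \ge \sup_{M,\, \theta^i,\, a^i} \frac{\left[\sum_{i=1}^{M} a^i \left(g(\theta^i) - g(\theta)\right)\right]^2}{E_\theta\left\{\left[\sum_{i=1}^{M} a^i \frac{p(x;\theta^i)}{p(x;\theta)}\right]^2\right\}}. \qquad (7) \]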

Using (7), we can derive a Barankin-type bound
on the error covariance matrix of any unbiased
estimator \(\hat{\theta}(x)\) for a vector parameter
\(\theta \in \Theta \subset \mathbb{R}^d\) (Forster and Larzabal 2002):

\[ E_\theta\{(\hat{\theta} - \theta)(\hat{\theta} - \theta)^T\} \succeq \Phi \left(B - \mathbf{1}\mathbf{1}^T\right)^{-1} \Phi^T, \qquad (8) \]

where the matrices are defined via

\[ B_{ij} = E_\theta\left\{\frac{p(x;\theta^i)}{p(x;\theta)} \, \frac{p(x;\theta^j)}{p(x;\theta)}\right\}, \quad 1 \le i, j \le M, \]

\[ \Phi = \left[\theta^1 - \theta \;\; \cdots \;\; \theta^M - \theta\right], \]

and \(\mathbf{1}\) is the vector in \(\mathbb{R}^M\) with all ones. Note
that we have used \(\theta^i\) with a superscript to denote
different points in \(\Theta\), and \(\theta_i\) with a subscript to
denote the \(i\)th component of a point \(\theta\).

Bounds on Estimation, Fig. 2 Cramer-Rao bound vs. Barankin-type bound on frequency estimation when \(\theta^0 = 0.1\). The BTB is obtained using M = 10 uniform random points

Since the bound (8) is valid for any \(M\) and
any choice of test points \(\{\theta^i\}\), we obtain the
tightest bound by taking the supremum over all
finite families of test points. Note that when we
have \(d\) test points that approach \(\theta\) in \(d\) linearly
independent directions, the Barankin-type bound
(8) converges to the CRB. If we have more than
\(d\) test points, however, the Barankin-type bound
is never worse than the CRB. In particular,
the Barankin-type bound is much tighter in the
regime of low signal-to-noise ratio (SNR) and
small number of measurements, which allows
one to investigate the “threshold” phenomenon, as
shown in the next example.

Example 3 For the SAGN model, if we have \(M\)
test points, the elements of matrix \(B\) are of the
following form:

\[ B_{ij} = \exp\left\{\frac{1}{\sigma^2} \sum_{n=0}^{N-1} \left[s_n(\theta^i) - s_n(\theta)\right] \left[s_n(\theta^j) - s_n(\theta)\right]\right\}. \]

In most cases, it is extremely difficult to derive
an analytical form of the Barankin bound by
optimizing with respect to \(M\) and the test points
\(\{\theta^j\}\). In Fig. 2, we plot the Barankin-type bound
for \(s_n(\theta) = \cos(2\pi n\theta)\) with \(M = 10\) randomly
selected test points. We observe that the
Barankin-type bound is tighter than the CRB when the SNR
is small. There is an SNR region around 0 dB in which
the Barankin-type bound drops drastically. This is
usually called the “threshold” phenomenon.
Practical systems operate much better in the region
above the threshold.
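
The simplest instance of (8) uses a single test point (M = 1) and a scalar parameter, in which case the bound reduces to (θ^1 − θ)^2 / (B_11 − 1). The sketch below evaluates that special case for the SAGN frequency model and takes the largest value over a grid of candidate test points. It is an illustration only: it is not the M = 10 random-point computation behind Fig. 2, and the function names and the grid size are assumptions.

```python
import math


def signal(theta, n_samples):
    """Noise-free samples s_n(theta) = cos(2*pi*n*theta)."""
    return [math.cos(2.0 * math.pi * n * theta) for n in range(n_samples)]


def btb_single_point(theta, test_point, n_samples, sigma):
    """Bound (8) for scalar theta and M = 1: (theta^1 - theta)^2 / (B_11 - 1)."""
    if test_point == theta:
        raise ValueError("the test point must differ from theta")
    s0 = signal(theta, n_samples)
    s1 = signal(test_point, n_samples)
    dist2 = sum((a - b) ** 2 for a, b in zip(s1, s0))
    exponent = dist2 / sigma ** 2  # log of B_11 for the SAGN model
    if exponent == 0.0:
        return math.inf  # identical signals: the two points cannot be told apart
    if exponent > 700.0:
        return 0.0  # B_11 would overflow a float; the bound is numerically zero
    # expm1 evaluates B_11 - 1 accurately when the test point is close to theta
    return (test_point - theta) ** 2 / math.expm1(exponent)


def btb_best_single_point(theta, n_samples, sigma, grid_size=500, theta_max=0.25):
    """Largest single-test-point bound over a uniform grid on [0, theta_max)."""
    candidates = (theta_max * k / grid_size for k in range(grid_size))
    return max(
        btb_single_point(theta, t, n_samples, sigma)
        for t in candidates
        if t != theta
    )


if __name__ == "__main__":
    print(btb_best_single_point(0.1, n_samples=5, sigma=1.0))
```

As the test point approaches θ, the single-point value tends to the CRB (6), which is consistent with the limiting behavior of (8) described in the text.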

The basic CRB and BTB belong to the family
of deterministic “covariance inequality” bounds
in the sense that the unknown parameter is
assumed to be a deterministic quantity (as opposed
to a random quantity). Additionally, both bounds
work only for unbiased estimators, making them
inappropriate performance indicators for biased
estimators such as many regularization-based
estimators.


Ziv-Zakai Bound 

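In this section, we introduce the Ziv-Zakai bound
(ZZB) (Bell et al. 1997) that is applicable to any
estimator (not necessarily unbiased). Unlike the
CRB and BTB, the ZZB is a Bayesian bound and
the errors are averaged by the prior distribution
\(p_\theta(\varphi)\) of the parameter. For any \(a \in \mathbb{R}^d\), the ZZB
states that

\[ a^T E\{(\hat{\theta}(x) - \theta)(\hat{\theta}(x) - \theta)^T\}\, a \ge \frac{1}{2} \int_0^\infty V\left\{\max_{\delta:\, a^T\delta = h} \left[\int_{\mathbb{R}^d} \left(p_\theta(\varphi) + p_\theta(\varphi + \delta)\right) P_{\min}(\varphi, \varphi + \delta)\, d\varphi\right]\right\} h \, dh, \]

where the expectation is taken with respect to
the joint density \(p(x;\theta)\,p_\theta(\theta)\),
\(V\{q(h)\} = \max_{r \ge 0} q(h + r)\) is the valley-filling function,
and \(P_{\min}(\varphi, \varphi + \delta)\) is the minimal probability of
error for the following binary hypothesis testing
problem:

\[ H_0 : \theta = \varphi; \quad x \sim p(x;\varphi) \]
\[ H_1 : \theta = \varphi + \delta; \quad x \sim p(x;\varphi + \delta) \]

with

\[ \Pr(H_0) = \frac{p_\theta(\varphi)}{p_\theta(\varphi) + p_\theta(\varphi + \delta)}, \qquad \Pr(H_1) = \frac{p_\theta(\varphi + \delta)}{p_\theta(\varphi) + p_\theta(\varphi + \delta)}. \]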


Example 4 For the SAGN model, we assume a
uniform prior probability, i.e., \(p_\theta(\varphi) = 4\), \(\varphi \in
[0, 1/4)\). The ZZB simplifies to

\[ E\{\|\hat{\theta}(x) - \theta\|_2^2\} \ge \frac{1}{2} \int_0^{1/4} V\left\{\int_0^{1/4 - h} 8\, P_{\min}(\varphi, \varphi + h)\, d\varphi\right\} h \, dh. \]

The binary hypothesis testing problem is to
decide which one of two signals is buried in
additive Gaussian noise. The optimal detector with
minimal probability of error is the minimum
distance receiver (Kay 2001b), and the associated
probability of error is

\[ P_{\min}(\varphi, \varphi + h) = Q\left(\frac{1}{2\sigma} \sqrt{\sum_{n=0}^{N-1} \left(s_n(\varphi) - s_n(\varphi + h)\right)^2}\right), \]

where \(Q(h) = \int_h^\infty \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\, dt\). For the
frequency estimation problem, we numerically
estimate the integral and plot the resulting
ZZB in Fig. 3 together with the mean-squared
error for the maximum likelihood estimator
(MLE).
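
The numerical evaluation mentioned above can be sketched as follows. This is only one possible discretization, assuming the uniform prior on [0, 1/4) and the signal s_n(θ) = cos(2πnθ); the midpoint and trapezoid rules, the grid sizes, and all function names are choices made for this illustration rather than the procedure used to produce Fig. 3.

```python
import math


def q_function(z):
    """Gaussian tail probability Q(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def p_min(phi, h, n_samples, sigma):
    """Error probability of the minimum distance receiver for phi vs. phi + h."""
    dist2 = sum(
        (math.cos(2.0 * math.pi * n * phi)
         - math.cos(2.0 * math.pi * n * (phi + h))) ** 2
        for n in range(n_samples)
    )
    return q_function(math.sqrt(dist2) / (2.0 * sigma))


def zzb_frequency(n_samples, sigma, theta_max=0.25, n_h=100, n_phi=100):
    """Approximate the scalar ZZB for a uniform prior on [0, theta_max)."""
    step = theta_max / n_h
    inner = []
    for j in range(n_h + 1):
        h = j * step
        d_phi = (theta_max - h) / n_phi
        # Midpoint rule over phi in [0, theta_max - h)
        total = sum(
            p_min((i + 0.5) * d_phi, h, n_samples, sigma) for i in range(n_phi)
        ) * d_phi
        # p(phi) + p(phi + h) = 2 / theta_max (equal to 8 for theta_max = 1/4)
        inner.append(2.0 / theta_max * total)
    # Valley filling: V{q}(h) = max over r >= 0 of q(h + r)
    filled = inner[:]
    for j in range(n_h - 1, -1, -1):
        filled[j] = max(filled[j], filled[j + 1])
    # Trapezoid rule for the outer integral of V{.} * h
    values = [filled[j] * j * step for j in range(n_h + 1)]
    integral = step * (sum(values) - 0.5 * (values[0] + values[-1]))
    return 0.5 * integral


if __name__ == "__main__":
    print(zzb_frequency(n_samples=5, sigma=1.0))
```

A useful sanity check is the limit of very large noise: P_min tends to 1/2 everywhere, and the bound approaches the variance of the uniform prior, (1/4)^2 / 12.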

Summary and Future Directions 

We have reviewed several important performance
bounds on statistical estimation problems,
particularly, the Cramer-Rao bound, the
Barankin-type bound, and the Ziv-Zakai bound.
These bounds provide a universal way to quantify
the performance of statistically modeled physical
systems that is independent of any specific
algorithm.

Bounds on Estimation, Fig. 3 Ziv-Zakai bound vs. maximum likelihood estimator for frequency estimation

Future directions of performance bounds on
estimation include deriving tighter bounds,
developing computational schemes to approximate
existing bounds in a tractable way, and applying
them to practical problems.

Abstract

This entry provides an overview of systems and
issues related to providing optimized controls for
commercial buildings. It includes a description of
the evolution of the control systems over time,
typical equipment and control variables, the typical
two-level hierarchical structure for feedback and
supervisory control, a definition of the optimal
supervisory control problem, references to typical
heuristic control approaches, and a description of
current and future developments.


Keywords 

Building automation systems (BAS); Cooling 
plant optimization; Energy management and 
controls systems (EMCS); Intelligent building 
controls 


Introduction 

Computerized control systems were developed
in the 1980s for commercial buildings and are
typically termed energy management and control
systems (EMCS) or building automation systems
(BAS). They have been most successfully
applied to large commercial buildings that have
hundreds of building zones and thousands of
control points. Fewer than about 15 % of commercial
buildings have an EMCS, but they serve about 40 %
of the floor area. Small commercial buildings
tend not to have an EMCS, although there is
a recent trend towards the use of wireless
thermostats with cloud-based energy management
solutions.

EMCS architectures for buildings have
evolved from centralized to highly distributed
systems, as depicted in Fig. 1, in order to reduce
wiring costs and provide more modular solutions.
The development of open communications
protocols, such as BACnet, has enabled the
use of distributed control devices from different
vendors and improved the cost-effectiveness
of EMCS. There has also been a recent trend
towards the use of existing enterprise networks
to reduce system installed costs and to more
easily allow remote access and control from any
Internet-accessible device.

Building Control Systems, Fig. 1 Evolution from centralized to distributed network architectures (panels: centralized configuration, distributed configuration)

An EMCS for a large commercial building can
automate the control of many of the building and
system functions, including scheduling of lights
and zone thermostat settings according to
occupancy patterns. Security and fire safety systems
tend to be managed using separate systems. In
addition to scheduling, an EMCS manages the
control of individual equipment and subsystems
that provide heating, ventilation, and air
conditioning (HVAC) of the building. This control is
achieved using a two-level hierarchical structure
of local-loop and supervisory control. Local-loop
control of individual set points is typically
implemented using individual proportional-integral
(PI) feedback algorithms that manipulate
individual actuators in response to deviations from the
set points. For example, supply air temperature
from a cooling coil is controlled by adjusting
a valve opening that provides chilled water to
the coil. The second level of supervisory
control specifies the set points and other modes
of operation that depend on time and external
conditions.
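
A local-loop PI controller of the kind described above can be sketched in a few lines. The example below is only illustrative: the first-order coil response, the gains, and every numerical value are hypothetical assumptions and do not describe any particular product or building.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Discrete PI controller with output limits and simple anti-windup."""
    kp: float
    ki: float
    out_min: float = 0.0
    out_max: float = 1.0
    integral: float = 0.0

    def update(self, error, dt):
        candidate = self.integral + error * dt
        output = self.kp * error + self.ki * candidate
        if self.out_min <= output <= self.out_max:
            # Integrate only while the actuator is not saturated
            self.integral = candidate
        return min(self.out_max, max(self.out_min, output))


def simulate_coil(setpoint=13.0, entering_temp=27.0, max_drop=20.0,
                  tau=60.0, dt=1.0, steps=1800):
    """Toy supply-air loop: an open valve pulls the air temperature down.

    Hypothetical model: the steady-state supply temperature is
    entering_temp - max_drop * valve, approached with time constant tau (s).
    """
    controller = PIController(kp=0.2, ki=0.01)
    temp = entering_temp
    valve = 0.0
    for _ in range(steps):
        # A temperature above the set point calls for a larger valve opening
        valve = controller.update(temp - setpoint, dt)
        steady = entering_temp - max_drop * valve
        temp += dt / tau * (steady - temp)
    return temp, valve


if __name__ == "__main__":
    print(simulate_coil())
```

In this sketch the supervisory level would only change `setpoint`; the PI loop then moves the valve until the supply air temperature matches it.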

Each local-loop feedback controller acts
independently, but its performance can be
coupled to that of other local-loop controllers if it is not
tuned appropriately. Adaptive tuning algorithms
have been developed in recent years to enable
controllers to automatically adjust to changing
weather and load conditions. There are typically
a number of degrees of freedom in adjusting
supervisory control set points over a wide
range while still achieving adequate comfort
conditions. Optimal control of supervisory set
points involves minimizing a cost function
with respect to the free variables and subject
to constraints. Although model-based control
optimization approaches are not typically
employed in buildings, they have been used to
inform the development and assessment of some
heuristic control strategies. Most commonly,
strategies for adjusting supervisory control
variables are established at the control design
phase based on some limited analysis of the
HVAC system and specified as a sequence of
operations that is programmed into the EMCS.

Systems, Equipment, and Controls 

The greatest challenges and opportunities for
optimizing supervisory control variables exist for
centralized cooling systems that are employed
in large commercial buildings because of the
large number of control variables and degrees
of freedom along with utility rate incentives.
A simplified schematic of a typical centralized
cooling plant is shown in Fig. 2 with components
grouped under air distribution, chilled water loop,
chiller plant, and condenser water loop.

Building Control Systems, Fig. 2 Schematic of a chilled water cooling system

Typical air distribution systems include VAV
(variable-air volume) boxes within the zones,
air-handling units, ducts, and controls. An
air-handling unit (AHU) provides the primary
conditioning, ventilation, and flow of air and includes
cooling and heating coils, dampers, fans, and
controls. A single air handler typically serves many
zones, and several air handlers are utilized in a
large commercial building. For each AHU,
outdoor ventilation air is mixed with return air from
the zones and fed to the cooling coil. Outdoor and
return air dampers are typically controlled using
an economizer control that selects between
minimum and maximum ventilation air depending
upon the condition of the outside air. The cooling
coil provides both cooling and dehumidification
of the process air. The air outlet temperature from
the coil is controlled with a local feedback
controller that adjusts the flow of water using a valve.
A supply fan and return fan (not shown in Fig. 2)
provide the necessary airflow to and from the
zones. With a VAV system, zone temperature set
points are regulated using a feedback controller
applied to dampers within the VAV boxes. The
overall airflow provided by the AHU is typically
controlled to maintain a duct static pressure set
point within the supply duct.

The chilled water loop connects
the cooling coils within the AHUs
to the chillers that provide the primary source
for cooling. It consists of pumps, pipes,
valves, and controls. Primary/secondary chilled
water systems are commonly employed to
accommodate variable-speed pumping. In the
primary loop, fixed-speed pumps are used to
provide relatively constant chiller flow rates to
ensure good performance and reduce the risk of
evaporator tube freezing. Each pump is
typically cycled on and off with the chiller that
it serves. The secondary loop incorporates one
or more variable-speed pumps that are typically
controlled to maintain a set point for chilled water
loop differential pressure between the building
supplies and returns.

The primary source of cooling for the system
is typically provided by one or more chillers
that are arranged in parallel and have dedicated
pumps. Each chiller has an on-board local-loop
feedback controller that adjusts its cooling
capacity to maintain a specified set point for chilled
water supply temperature. Additional chiller control
variables include the number of chillers operating
and the relative loading for each chiller. The
relative loading can be controlled for a given total
cooling requirement by utilizing different chilled
water supply set points for constant individual
flow or by adjusting individual flows for
identical set points. Chillers can be augmented with
thermal storage to reduce the amount of chiller
power required during occupied periods in order
to reduce on-peak energy and power demand
costs. The thermal storage medium is cooled
during the unoccupied, nighttime period using the
chillers when electricity is less expensive. During
occupied times, a combination of the chillers and
storage is used to meet cooling requirements.
Control of thermal storage is defined by the
manner in which the storage medium is charged
and discharged over time.

The condenser water loop includes cooling
towers, pumps, piping, and controls. Cooling
towers reject energy to the ambient air through
heat transfer and possibly evaporation (for wet
towers). Larger systems tend to have multiple
cooling towers, with each tower having multiple
cells that share a common sump and individual
fans having two or more speed settings. The
number of operating cells and tower fan speeds
are often controlled using a local-loop feedback
controller that maintains a set point for the
water temperature leaving the cooling tower.
Typically, condenser water pumps are dedicated
to individual chillers (i.e., each pump is cycled
on and off with the chiller that it serves).

In order to better understand building control
variables, interactions, and opportunities,
consider how controls change in response to
increasing building cooling requirements for the
system of Fig. 2. As energy gains to the zones
increase, zone temperatures rise in the absence
of any control changes. However, zone feedback
controllers respond to higher temperatures by
increasing VAV box airflow through increased
damper openings. This leads to reduced static
pressure in the primary supply duct, which causes
the AHU supply fan controller to create
additional airflow. The greater airflow causes an
increase in supply air temperatures leaving the
cooling coils in the absence of any additional
control changes. However, the supply air
temperature feedback controllers respond by opening
the cooling coil valves to increase water flow and
the heat transfer to the chilled water (the cooling
load). For variable-speed pumping, a feedback
controller would respond to decreasing pressure
differential by increasing the pump speed. The
chillers would then experience increased loads
due to higher return water temperature and/or
flow rate that would lead to increases in chilled
water supply temperatures. However, the chiller
controllers would respond by increasing chiller
cooling capacities in order to maintain the chilled
water supply set points (and match the cooling
coil loads). In turn, the heat rejection to the
condenser water loop would increase to balance the
increased energy removed by the chiller, which
would increase the temperature of water leaving
the condenser. The temperature of water
leaving the cooling tower would then increase due
to an increase in its entering water temperature.
However, a feedback controller would respond to
the higher condenser water supply temperature
and increase the tower airflow. At some load,
the current set of operating chillers would not
be sufficient to meet the load (i.e., maintain the
chilled water supply set points) and an additional
chiller would need to be brought online.

This example illustrates how different local-loop
controllers might respond to load changes
in order to maintain individual set points.
Supervisory control might change these set points and
modes of operation. At any given time, it is
possible to meet the cooling needs with any number
of different modes of operation and set points,
which creates the potential for control optimization
to minimize an objective function.

The system depicted in Fig. 2 and described
in the preceding paragraphs represents one of
many different types of systems employed in
commercial buildings. Medium-sized commercial
buildings often employ multiple direct
expansion (DX) cooling systems where refrigerant
flows between each AHU and an outdoor
condensing unit that employs variable capacity
compressors. The compressor capacity is typically
controlled to maintain a supply air temperature
set point, which is still available as a supervisory
control variable. However, the other condensing
unit controls (e.g., condensing fans, expansion
valve) are typically prepackaged with the unit
and not available to the EMCS. For smaller
commercial buildings, rooftop units (RTUs) are
typically employed that contain a prepackaged AHU,
refrigeration cycle, and controls. Each RTU
directly cools the air in a portion of the
building in response to an individual thermostat. The
capacity control is typically on/off staging of
the compressor, and constant volume airflow is
most commonly employed. In this case, the
only free supervisory control variables are the
thermostat set points. In general, the degrees
of freedom for supervisory control decrease in
going from chilled water cooling plants to DX
systems to RTUs. In addition, the utility rate
incentives for taking advantage of thermal storage
and advanced controls are greater for large
commercial building applications.


Optimal Supervisory Control 

In commercial buildings, it is common to have
electric utility rates that have energy and
demand charges that vary with time of use. The
different rate periods can often include on-peak,
off-peak, and mid-peak periods. For this type of
rate structure, the time horizon necessary to truly
minimize operating costs extends over the entire
month. In order to better understand the
control issues, consider the general optimal control
problem for minimizing monthly electrical utility
charges associated with operating an all-electric
cooling system in the presence of time-of-use
and demand charges. The dynamic optimization
involves minimizing


\[ J = \sum_{p=1}^{\text{rate periods}} J_p + R_{d,a} \max_{k = 1 \text{ to } N_{\text{month}}} [P_k] \]

with respect to a trajectory of controls \(u_k\), \(k = 1\) to \(N_{\text{month}}\), and \(M_k\), \(k = 1\) to \(N_{\text{month}}\),
where

\[ J_p = R_{e,p} \sum_{j=1}^{N_p} P_{p,j}\, \Delta t + R_{d,p} \max_{j = 1 \text{ to } N_p} [P_{p,j}] \]

with the optimization subject to the following
general constraints

\[ P_k = P(f_k, u_k, M_k) \]
\[ x_k = x(x_{k-1}, f_k, u_k, M_k) \]
\[ x_{k,\min} \le x_k \le x_{k,\max} \]
\[ y_k(f_k, u_k, M_k) \le y_{k,\max} \]

where \(J\) is the monthly electrical cost (\$), the
subscript \(p\) denotes that a quantity is limited
to a particular type of rate period \(p\) (e.g.,
on-peak, off-peak, mid-peak), \(R_{d,a}\) is an anytime
demand charge (\$/kW) that is applied to the
maximum power consumption occurring over the
month, \(P_k\) is average building power (kW) for
stage \(k\) within the month, \(N_{\text{month}}\) is the number
of stages in the month, \(R_{e,p}\) is the unit cost of
electrical energy (\$/kWh) for rate period type
\(p\), \(\Delta t\) is the length of the stage (h), \(N_p\) is the
number of stages within rate period type \(p\) in
the month, \(R_{d,p}\) is a rate period specific demand
charge (\$/kW) that is applied to the maximum
power consumption occurring during the month
within rate period \(p\), \(f_k\) is a vector of uncontrolled
inputs that affect building power consumption
(e.g., weather, internal gains), \(u_k\) is a vector of
continuous supervisory control variables (e.g.,
supply air temperature set point), \(M_k\) is a vector
of discrete supervisory control variables (chiller
on/off controls), \(x_k\) is a vector of state variables,
\(y_k\) is a vector of outputs, and subscripts min and
max denote minimum and maximum allowable
values.
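
To make the structure of the monthly cost concrete, the sketch below evaluates J for a given power trajectory. It is a hedged illustration: the function name, the use of string labels for rate periods, and the numbers in the usage example are assumptions, not data from any actual utility tariff.

```python
def monthly_cost(power_kw, period_of_stage, energy_rate, demand_rate,
                 anytime_demand_rate, dt_hours):
    """Evaluate J = sum_p J_p + R_da * max_k P_k for one billing month.

    power_kw[k]        average building power P_k for stage k (kW)
    period_of_stage[k] rate period label of stage k (e.g., "on", "off")
    energy_rate[p]     R_e,p in $/kWh for rate period p
    demand_rate[p]     R_d,p in $/kW for rate period p
    """
    if len(power_kw) != len(period_of_stage):
        raise ValueError("one rate period label is needed per stage")
    # Anytime demand charge on the overall monthly peak
    total = anytime_demand_rate * max(power_kw)
    for period in set(period_of_stage):
        in_period = [p for p, tag in zip(power_kw, period_of_stage)
                     if tag == period]
        # Energy charge plus period-specific demand charge (J_p)
        total += energy_rate[period] * sum(in_period) * dt_hours
        total += demand_rate[period] * max(in_period)
    return total


if __name__ == "__main__":
    cost = monthly_cost(
        power_kw=[100.0, 200.0, 150.0, 50.0],
        period_of_stage=["off", "on", "on", "off"],
        energy_rate={"on": 0.2, "off": 0.1},
        demand_rate={"on": 10.0, "off": 0.0},
        anytime_demand_rate=5.0,
        dt_hours=1.0,
    )
    print(cost)
```

The two `max` terms are what couple all stages of the month together: a single high-power stage raises the demand charges regardless of how the plant is operated the rest of the time.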

The state variables could characterize the state
of a storage device such as a chilled water or
ice storage tank. In this case, the states would
be constrained between limits associated with
the device’s practical storage potential. When
variations in zone temperature set points are
considered within an optimization, then state
variables associated with the distributed nature of
energy storage within the building structure are
important to consider. The outputs are additional
quantities of interest, such as equipment cooling
capacities, occupant comfort conditions, etc., that
often need to be constrained. In order to
implement a model-based predictive control scheme, it
would be necessary to have models for the
building power, state variables, and outputs in terms
of the controlled and uncontrolled variables. The
uncontrolled variables would generally include
weather (temperature, humidity, solar radiation)
and internal gains due to lights and occupants,
etc., that would need to be forecast over a
prediction horizon.

It is not feasible to solve this type of monthly
optimization problem for buildings for a variety
of reasons, including that forecasting of
uncontrolled inputs beyond a day is unreliable. Also,
it is very costly to develop the models necessary
to implement a model-based control approach of
this scale for a particular building. However, it
is instructive to consider some special cases that
have led to some practical control approaches.
First of all, consider the problem of
optimizing only the cooling plant supervisory control
variables when energy storage effects are not
important. This is usually the case for
systems that do not include ice or chilled water
storage. For this scenario, the future does not
matter and the problem can be reformulated as
a static optimization problem, such that for each
stage \(k\) the goal is to minimize the building power
consumption, \(J = P_k\), with respect to the current
supervisory control variables, \(u_k\) and \(M_k\), and
subject to constraints. ASHRAE (2011) presents
a number of heuristic approaches for adjusting
supervisory control variables that have been
developed through consideration of this type of
optimization problem. This includes algorithms
for adjusting cooling tower fan settings, chilled
water supply air set points, and chiller sequencing
and loading.
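
The static problem described above can be illustrated with an exhaustive search over a small grid of candidate controls. The sketch below is not one of the ASHRAE heuristics; the search routine is generic, and the plant power model and feasibility rule in the usage example are entirely hypothetical placeholders for a real system model.

```python
from itertools import product


def static_supervisory_optimum(power_model, is_feasible,
                               continuous_grid, discrete_options):
    """Minimize P_k over candidate (u_k, M_k) pairs by exhaustive search.

    Returns (power, u, m) for the best feasible pair, or None if no
    candidate satisfies the constraints.
    """
    best = None
    for u, m in product(continuous_grid, discrete_options):
        if not is_feasible(u, m):
            continue
        power = power_model(u, m)
        if best is None or power < best[0]:
            best = (power, u, m)
    return best


def toy_plant_power(chw_setpoint_c, chillers_on, load_kw=900.0):
    """Hypothetical plant power (kW); NOT a validated equipment model."""
    part_load = load_kw / (chillers_on * 500.0)
    cop = 4.0 + 0.15 * (chw_setpoint_c - 5.0) - 2.0 * (part_load - 0.8) ** 2
    pump_kw = 10.0 + 4.0 * (chw_setpoint_c - 5.0) ** 2
    return load_kw / cop + pump_kw + 15.0 * chillers_on


def toy_feasible(chw_setpoint_c, chillers_on, load_kw=900.0):
    """Hypothetical capacity constraint: 500 kW of cooling per chiller."""
    return chillers_on * 500.0 >= load_kw


if __name__ == "__main__":
    setpoints = [5.0 + 0.5 * i for i in range(11)]  # 5 to 10 degrees C
    print(static_supervisory_optimum(toy_plant_power, toy_feasible,
                                     setpoints, (1, 2, 3)))
```

Exhaustive search is only practical for a handful of variables; it is used here because it makes the roles of the continuous variables, the discrete variables, and the constraints explicit.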


Other heuristic approaches have been developed
(e.g., ASHRAE 2011; Braun 2007) for
controlling the charging and discharging of ice
or chilled water storage that were derived from
a daily optimization formulation. For the case
of real-time pricing of energy, heuristic charging
and discharging strategies were derived from
minimizing a daily cost function

$$J_{day} = \sum_{k=1}^{N_{day}} R_{e,k} P_k \Delta t$$

with respect to a trajectory of charging and discharging
rates, subject to a constraint of equal
beginning and ending storage states along with
other constraints previously described. For the
case of typical time-of-use (e.g., on-peak, off-peak)
or real-time pricing energy charges with
demand charges, heuristic strategies have been
developed based on the same form of the daily
cost function above with an added demand cost
constraint $R_{d,k} P_k \le TDC$, where $TDC$ is a
target demand cost that is set heuristically at the
beginning of each billing period and updated at
each stage as $TDC_{k+1} = \max(TDC_k, R_{d,k} P_k)$.
The heuristic storage control strategies can be
readily combined with heuristic strategies for the
cooling plant components.
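
The daily cost function and the target-demand-cost update described above can be illustrated with a minimal Python sketch. The function names, the rates, and the power values below are hypothetical and chosen only for illustration; they do not come from ASHRAE (2011) or Braun (2007).

```python
def daily_energy_cost(energy_rates, powers, dt_hours):
    """Daily energy cost: sum over stages of R_e,k * P_k * dt."""
    return sum(r * p * dt_hours for r, p in zip(energy_rates, powers))


def update_target_demand_cost(tdc_k, demand_rate_k, power_k):
    """Stage update TDC_{k+1} = max(TDC_k, R_d,k * P_k)."""
    return max(tdc_k, demand_rate_k * power_k)


# Hypothetical four-stage day with one-hour stages (illustrative values only).
energy_rates = [0.05, 0.05, 0.12, 0.12]   # $/kWh
powers = [100.0, 120.0, 150.0, 90.0]      # kW
demand_rates = [0.0, 0.0, 10.0, 10.0]     # $/kW, nonzero only on-peak

j_day = daily_energy_cost(energy_rates, powers, dt_hours=1.0)

tdc = 1200.0              # initial target, assumed to be set heuristically
tdc_history = [tdc]
for rate, power in zip(demand_rates, powers):
    # The target only ratchets upward when the realized demand cost exceeds it.
    tdc = update_target_demand_cost(tdc, rate, power)
    tdc_history.append(tdc)
```

In this sketch the target stays at its initial value until the third stage, where the realized demand cost exceeds it, and it never decreases afterwards.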

There has been considerable interest in developing
practical methods for dynamic control of
zone temperature set points within the bounds of
comfort in order to minimize utility costs.
However, this is a very difficult problem, and it
remains in the research realm for the time
being, with limited commercial success.

Summary and Future Directions 

Although there is great opportunity for reducing
energy use and operating costs in buildings
through optimal supervisory control, it is rarely
implemented in practice because of high costs associated
with engineering site-specific solutions.
Efforts are underway to develop scalable
approaches that utilize general methods for
configuring and learning models needed to implement
model-based predictive control (MPC).
The current thinking is that solutions for optimal
supervisory control will be implemented in the
cloud and overlay on existing building automation
systems (BMS) through the use of universal
middleware. This will reduce the cost of implementation
compared to programming within
existing BMS. There is also a need to reduce
the cost of the additional sensors needed to implement
MPC. One approach involves the use of
virtual sensors that employ models with low-cost
sensor inputs to provide higher value information
that would normally require expensive sensors to
obtain.
Abstract

Cascading failure consists of complicated sequences
of dependent failures and can cause large
blackouts. The emerging risk analysis, simulation,
and modeling of cascading blackouts are
briefly surveyed, and key references are suggested.

Keywords 

Branching process; Dependent failures; Outage; 
Power law; Risk; Simulation 

Introduction 

The main mechanism for the rare and costly
widespread blackouts of bulk power transmission
systems is cascading failure. Cascading failure
can be defined as a sequence of dependent events
that successively weaken the power system (IEEE
PES CAMS Task Force on Cascading Failure
2008). The events and their dependencies are
very varied and include outages or failures of
many different parts of the power system and
a whole range of possible physical, cyber, and
human interactions. The events and dependencies
tend to be rare or complicated, since the
common and straightforward failures tend to be
already mitigated by engineering design or operating
practice.

Examples of a small initial outage cascading
into a complicated sequence of dependent
outages are the August 10, 1996, blackout of
the Northwest United States that disconnected
power to about 7.5 million customers (Kosterev
et al. 1999) and the August 14, 2003, blackout
of about 50 million customers in Northeastern
United States and Canada (US-Canada Power
System Outage Task Force 2004). Although such
extreme events are rare, the direct costs run to
billions of dollars and the disruption to society
is substantial. Large blackouts also have a strong
effect on shaping the way power systems are
regulated and the reputation of the power industry.
Moreover, some blackouts involve social
disruptions that can multiply the economic damage.
The hardship to people and possible deaths
underscore the engineer's responsibility to work
to avoid blackouts.

It is useful when analyzing cascading failure to
consider cascading events of all sizes, including
the short cascades that do not lead to interruption
of power to customers and cascades that involve
events in other infrastructures, especially
since loss of electricity can significantly impair
other essential or economically important infrastructures.
Note that in the context of interacting
infrastructures, the term "cascading failure"
sometimes has the more restrictive definition of
events cascading between infrastructures (Rinaldi
et al. 2001).

Blackout Risk 

Cascading failure is a sequence of dependent
events that successively weaken the power system.
At a given stage in the cascade, the previous
events have weakened the power system so that
further events are more likely. It is this dependence
that makes the long series of cascading
events that cause large blackouts likely enough
to pose a substantial risk. (If the events were
independent, then the probability of a large number
of events would be the product of the small
probabilities of individual events and would be
vanishingly small.) The statistics for the distribution
of sizes of blackouts have correspondingly
"heavy tails" indicating that blackouts of all sizes,
including large blackouts, can occur. Large blackouts
are rare, but they are expected to happen
occasionally, and they are not "perfect storms."

In particular, it has been observed in several
developed countries that the probability distribution
of blackout size has an approximate power
law dependence (Carreras et al. 2004b; Dobson
et al. 2007; Hines et al. 2009). (The power law
is of course limited in extent because every grid
has a largest possible blackout in which the entire
grid blacks out.) The power law region can
be explained using ideas from complex systems
theory. The main idea is that over the long term,
the power grid reliability is shaped by the engineering
responses to blackouts and the slow load
growth and tends to evolve towards the power law
distribution of blackout size (Dobson et al. 2007;
Ren et al. 2008).

Blackout risk can be defined as the product
of blackout probability and blackout cost.
One simple assumption is that blackout cost is
roughly proportional to blackout size, although
larger blackouts may well have costs (especially
indirect costs) that increase faster than linearly. In
the case of the power law dependence, the larger
blackouts can become rarer at a similar rate as
costs increase, and then the risk of large blackouts
is comparable to or even exceeding the risk
of small blackouts. Mitigation of blackout risk
should consider both small and large blackouts,
because mitigating the small blackouts that are
easiest to analyze may inadvertently increase the
risk of large blackouts (Newman et al. 2011).
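
The argument about comparable risk can be made concrete with a short worked calculation. This is only an illustrative sketch: the exponent $\gamma$ and the strictly linear cost are assumptions introduced here, not values taken from the cited studies. Suppose the probability density of blackout size $S$ follows a power law over some range and cost is proportional to size:

```latex
p(S) \propto S^{-\gamma}, \qquad \mathrm{cost}(S) \propto S
\quad\Longrightarrow\quad
\mathrm{risk}(S) = p(S)\,\mathrm{cost}(S) \propto S^{\,1-\gamma}.
```

If $\gamma \approx 1$, the risk is roughly constant across sizes, so large blackouts contribute about as much risk as small ones. If costs grow faster than linearly, say $\mathrm{cost}(S) \propto S^{\eta}$ with $\eta > \gamma$, the risk $S^{\eta - \gamma}$ increases with size within the power law region.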

Approaches to quantify blackout risk are challenging
and emerging, but there are also valuable
approaches to mitigating blackout risk that do
not quantify the blackout risk. The n-1 criterion,
which requires the power system to survive any single
component failure, has the effect of reducing
cascading failures. The individual mechanisms
of dependence in cascades (overloads, protection
failures, voltage collapse, transient stability, lack
of situational awareness, human error, etc.) can
be addressed individually by specialized analyses
or simulations or by training and procedures.
Credible initiating outages can be sampled and
simulated, and those resulting in cascading can
be mitigated (Hardiman et al. 2004). This can be
thought of as a "defense in depth" approach in
which mitigating a subset of credible contingencies
is likely to mitigate other possible contingencies
not studied. Moreover, when blackouts
occur, a postmortem analysis of that particular sequence
of events leads to lessons learned that can
be implemented to mitigate the risk of some similar
blackouts (US-Canada Power System Outage
Task Force 2004).

Simulation and Models 

There are many simulations of cascading
blackouts using Monte Carlo and other methods,
for example, Hardiman et al. (2004), Carreras
et al. (2004a), Chen et al. (2005), Kirschen et al.
(2004), Anghel et al. (2007), and Bienstock
and Mattia (2007). All these simulations
select and approximate a modest subset of the
many physical and engineering mechanisms of
cascading failure, such as line overloads, voltage
collapse, and protection failures. In addition,
operator actions or the effects of engineering the
network may also be crudely represented. Some
of the simulations give a set of credible cascades,
and others approximately estimate blackout risk.

Except for describing the initial outages,
where there is a wealth of useful knowledge,
much of standard risk analysis and modeling does
not easily apply to cascading failure in power
systems because of the variety of dependencies
and mechanisms, the combinatorial explosion of
rare possibilities, and the heavy-tailed probability
distributions. However, progress has been made
in probabilistic models of cascading (Chen et al.
2006; Dobson 2012; Rahnamay-Naeini et al.
2012).

The goal of high-level probabilistic models is
to capture salient features of the cascade without
detailed models of the interactions and dependencies.
They provide insight into cascading failure
data and simulations, and parameters of the high-level
models can serve as metrics of cascading.

Branching process models are transient
Markov probabilistic models in which, after
some initial outages, the outages are produced
in successive generations. Each outage in each
generation (a "parent" outage) produces a
probabilistic number of outages ("children"
outages) in the next generation according to an
offspring probability distribution. The children
outages then become parents to produce the next
generation and so on, until there is a generation
with zero children and the cascade stops. As
might be expected, a key parameter describing
the cascading is its average propagation, which
is the average number of children outages
per parent outage. Branching processes have
traditionally been applied to many cascading
processes outside of risk analysis (Harris 1989), but
they have recently been validated and applied to
estimate the distribution of the total number of
outages from utility outage data (Dobson 2012).
A probabilistic model that tracks the cascade as
it progresses in time through lumped grid states
is presented in Rahnamay-Naeini et al. (2012).
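
The generation-by-generation mechanism of a branching process can be sketched in a few lines of Python. This is a minimal illustration, not the estimation procedure of Dobson (2012): the Poisson offspring distribution, the propagation value 0.4, the single initial outage, and all function names are assumptions made here for the example.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_cascade(initial_outages, propagation, rng, max_generations=1000):
    """Return the number of outages in each generation of one cascade."""
    generations = [initial_outages]
    while generations[-1] > 0 and len(generations) < max_generations:
        # Every parent outage independently produces a Poisson number of children.
        children = sum(sample_poisson(propagation, rng)
                       for _ in range(generations[-1]))
        generations.append(children)
    return generations


def estimate_propagation(cascades):
    """Estimate average propagation as total children / total parents."""
    parents = sum(sum(c[:-1]) for c in cascades)
    children = sum(sum(c[1:]) for c in cascades)
    return children / parents


rng = random.Random(0)
cascades = [simulate_cascade(1, 0.4, rng) for _ in range(20000)]
lambda_hat = estimate_propagation(cascades)
mean_total = sum(sum(c) for c in cascades) / len(cascades)
```

With one initial outage and average propagation below 1, the expected total number of outages is 1/(1 - propagation), so `mean_total` should be close to 1.67 here, and `lambda_hat` should recover a value near the assumed 0.4.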

There is an extensive complex networks literature
on cascading in abstract networks that is
largely motivated by idealized models of propagation
of failures in the Internet. The way that
failures propagate only along the network links
is not realistic for power systems, which satisfy
Kirchhoff's laws so that many types of failures
propagate differently. For example, line overloads
tend to propagate along cutsets of the network.
However, the high-level qualitative results
of phase transitions in the complex networks
have provided inspiration for similar effects to
be discovered in power system models (Dobson
et al. 2007). There is also a possible research
opportunity to elaborate the complex network
models to incorporate some of the realities of
power systems and then validate them.

Summary and Future Directions 

One challenge for simulation is deciding which
phenomena to model, and in how much detail,
in order to get useful engineering results. Faster
simulations would help to ease the requirements
of sampling appropriately from all the sources
of uncertainty. Better metrics of cascading in
addition to average propagation need to be developed
and extracted from real and simulated
data in order to better quantify and understand
blackout risk. There are many new ideas emerging
to analyze and simulate cascading failure,
and the next step is to validate and improve
these new approaches by comparing them with
observed blackout data. Overall, there is an exciting
challenge to build on the more deterministic
approaches to mitigate cascading failure and find
ways to more directly quantify and mitigate cascading
blackout risk by coordinated analysis of
real data, simulation, and probabilistic models.

Cross-References

► Lyapunov Methods in Power System Stability
► Small Signal Stability in Electric Power Systems






Bibliography 

Anghel M, Werley KA, Motter AE (2007) Stochastic
model for power grid dynamics. In: 40th Hawaii international
conference on system sciences, Hawaii, Jan 2007

Bienstock D, Mattia S (2007) Using mixed-integer programming
to solve power grid blackout problems.
Discret Optim 4(1):115-141

Carreras BA, Lynch VE, Dobson I, Newman DE (2004a)
Complex dynamics of blackouts in power transmission
systems. Chaos 14(3):643-652

Carreras BA, Newman DE, Dobson I, Poole AB (2004b)
Evidence for self-organized criticality in a time series
of electric power system blackouts. IEEE Trans Circuits
Syst Part I 51(9):1733-1740

Chen J, Thorp JS, Dobson I (2005) Cascading dynamics
and mitigation assessment in power system disturbances
via a hidden failure model. Int J Electr Power
Energy Syst 27(4):318-326

Chen Q, Jiang C, Qiu W, McCalley JD (2006) Probability
models for estimating the probabilities of cascading
outages in high-voltage transmission network. IEEE
Trans Power Syst 21(3):1423-1431

Dobson I, Carreras BA, Newman DE (2005) A loading-dependent
model of probabilistic cascading failure.
Probab Eng Inf Sci 19(1):15-32

Dobson I, Carreras BA, Lynch VE, Newman DE (2007)
Complex systems analysis of series of blackouts: cascading
failure, critical points, and self-organization.
Chaos 17:026103

Dobson I (2012) Estimating the propagation and extent
of cascading line outages from utility data with a
branching process. IEEE Trans Power Syst 27(4):2146-215

Hardiman RC, Kumbale MT, Makarov YV (2004) An
advanced tool for analyzing multiple cascading failures.
In: Eighth international conference on probability
methods applied to power systems, Ames, Sept 2004

Harris TE (1989) Theory of branching processes. Dover,
New York

Hines P, Apt J, Talukdar S (2009) Large blackouts in North
America: historical trends and policy implications.
Energy Policy 37(12):5249-5259

IEEE PES CAMS Task Force on Cascading Failure (2008)
Initial review of methods for cascading failure analysis
in electric power transmission systems. In: IEEE
power and energy society general meeting, Pittsburgh,
July 2008

Kirschen DS, Strbac G (2004) Why investments do not
prevent blackouts. Electr J 17(2):29-36

Kirschen DS, Jayaweera D, Nedic DP, Allan RN (2004)
A probabilistic indicator of system stress. IEEE Trans
Power Syst 19(3):1650-1657

Kosterev D, Taylor C, Mittelstadt W (1999) Model validation
for the August 10, 1996 WSCC system outage.
IEEE Trans Power Syst 14:967-979

Newman DE, Carreras BA, Lynch VE, Dobson I (2011)
Exploring complex systems aspects of blackout risk
and mitigation. IEEE Trans Reliab 60(1):134-143

Rahnamay-Naeini M, Wang Z, Ghani N, Mammoli A,
Hayat MM (2014) Stochastic analysis of cascading-failure
dynamics in power grids. IEEE Trans Power Syst (to appear)

Ren H, Dobson I, Carreras BA (2008) Long-term effect
of the n-1 criterion on cascading line outages in an
evolving power transmission grid. IEEE Trans Power
Syst 23(3):1217-1225

Rinaldi SM, Peerenboom JP, Kelly TK (2001) Identifying,
understanding, and analyzing critical infrastructure
interdependencies. IEEE Control Syst Mag 21:11-25

US-Canada Power System Outage Task Force (2004)
Final report on the August 14, 2003 blackout in the
United States and Canada


Cash Management 

Abel Cadenillas 

University of Alberta, Edmonton, AB, Canada 

Abstract 

Cash on hand (or cash held in highly liquid form
in a bank account) is needed for routine business
and personal transactions. The problem of
determining the right amount of cash to hold involves
balancing liquidity against investment opportunity
costs. This entry traces solutions using
both discrete-time and continuous-time stochastic
models.


Keywords 

Brownian motion; Inventory theory; Stochastic 
impulse control 


Definition 

A firm needs to keep cash, either in the form
of cash on hand or as a bank deposit, to meet
its daily transaction requirements. Daily inflows
and outflows of cash are random. There is a finite
target for the cash balance, which could be zero in
some cases. The firm wants to select a policy that
minimizes the expected total discounted cost for
being far away from the target during some time
horizon. This time horizon is usually infinite.
The firm has an incentive to keep the cash level
low, because each unit of positive cash leads to
a holding cost since cash has alternative uses
like dividends or investments in earning assets.
The firm has an incentive to keep the cash level
high, because penalty costs are generated as a
result of delays in meeting demands for cash.
The firm can increase its cash balance by raising
new capital or by selling some earning assets,
and it can reduce its cash balance by paying
dividends or investing in earning assets. This
control of the cash balance generates fixed and
proportional transaction costs. Thus, there is a
cost when the cash balance is different from its
target, and there is also a cost for increasing
or reducing the cash reserve. The objective of
the manager is to minimize the expected total
discounted cost.

Hasbrouck (2007), Madhavan and Smidt
(1993), and Manaster and Mann (1996) study
stock inventory problems that are similar to the cash
management problem.


The Solution 

The qualitative form of optimal policies of the
cash management problem in discrete time was
studied by Eppen and Fama (1968, 1969), Girgis
(1968), and Neave (1970). However, their solutions
were incomplete.

Many of the difficulties that they and other
researchers encountered in a discrete-time framework
disappeared when it was assumed that decisions
were made continuously in time and that
demand is generated by a Brownian motion with
drift. Vial (1972) formulated the cash management
problem in continuous time with fixed and
proportional transaction costs, linear holding and
penalty costs, and demand for cash generated
by a Brownian motion with drift. Under very
strong assumptions, Vial (1972) proved that if an
optimal policy exists, then it is of a simple form
$(a, \alpha, \beta, b)$.

This means that the cash balance should
be increased to level $\alpha$ when it reaches
level $a$ and should be reduced to level $\beta$
when it reaches level $b$. Constantinides (1976)
assumed that an optimal policy exists and
that it is of a simple form, determined the
above levels, and discussed the properties
of the optimal solution. Constantinides and
Richard (1978) proved the main assumptions of
Vial (1972) and therefore rigorously obtained
a solution for the cash management problem.

Constantinides and Richard (1978) applied
the theory of stochastic impulse control developed
by Bensoussan and Lions (1973, 1975,
1982). They used a Brownian motion $W$ to model
the uncertainty in the inventory. Formally, they
considered a probability space $(\Omega, \mathcal{F}, P)$ together
with a filtration $(\mathcal{F}_t)$ generated by a one-dimensional
Brownian motion $W$. They considered
$X_t$ := inventory level at time $t$, and assumed
that $X$ is an adapted stochastic process given
by

$$X_t = x - \int_0^t \mu \, ds - \int_0^t \sigma \, dW_s + \sum_{i=1}^{\infty} I_{\{\tau_i < t\}} \, \xi_i .$$

Here, $\mu > 0$ is the drift of the demand and $\sigma > 0$
is the volatility of the demand. Furthermore, $\tau_i$
is the time of the $i$-th intervention and $\xi_i$ is the
intensity of the $i$-th intervention.

A stochastic impulse control is a pair

$$((\tau_n); (\xi_n)) = (\tau_0, \tau_1, \tau_2, \ldots, \tau_n, \ldots; \xi_0, \xi_1, \xi_2, \ldots, \xi_n, \ldots),$$

where

$$\tau_0 = 0 < \tau_1 < \tau_2 < \cdots < \tau_n < \cdots$$

is an increasing sequence of stopping times and
$(\xi_n)$ is a sequence of random variables such that
each $\xi_n : \Omega \to \mathbb{R}$ is measurable with respect
to $\mathcal{F}_{\tau_n}$. We assume $\xi_0 = 0$. The management (the
controller) decides to act at time $\tau_i$, so that

$$X_{\tau_i^+} = X_{\tau_i} + \xi_i .$$

We note that $\xi_i$ and $X$ can also take negative
values. The management wants to select the pair

$$((\tau_n); (\xi_n))$$

that minimizes the functional $J$ defined by

$$J(x; ((\tau_n); (\xi_n))) := E\left[ \int_0^{\infty} e^{-\lambda t} f(X_t)\, dt + \sum_{n=1}^{\infty} e^{-\lambda \tau_n} g(\xi_n) I_{\{\tau_n < \infty\}} \right],$$

where

$$f(x) = \max(hx, -px)$$

and

$$g(\xi) = \begin{cases} C + c\,\xi & \text{if } \xi > 0 \\ \min(C, D) & \text{if } \xi = 0 \\ D - d\,\xi & \text{if } \xi < 0. \end{cases}$$

Furthermore, $\lambda > 0$, $C, c, D, d \in (0, \infty)$,
and $h, p \in (0, \infty)$. Here, $f$ represents the running
cost incurred by deviating from the target
cash level 0, $C$ represents the fixed cost per
intervention when the management pushes the
cash level upwards, $D$ represents the fixed cost
per intervention when the management pushes
the cash level downwards, $c$ represents the proportional
cost per intervention when the management
pushes the cash level upwards, $d$ represents
the proportional cost per intervention when the
management pushes the cash level downwards,
and $\lambda$ is the discount rate.
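
To make the cost functional concrete, the following Python sketch simulates one path of the cash balance under a band policy of the form (a, α, β, b) and accumulates the discounted holding, penalty, and transaction costs. It is only an illustration under stated assumptions: the Euler time discretization, the finite horizon that truncates the infinite-horizon integral, all numerical parameter values, and the function names are hypothetical and are not taken from Constantinides and Richard (1978).

```python
import math
import random


def holding_penalty_cost(x, h, p):
    """Running cost f(x) = max(h*x, -p*x)."""
    return max(h * x, -p * x)


def intervention_cost(xi, C, c, D, d):
    """Transaction cost g(xi) for an intervention of size xi."""
    if xi > 0:
        return C + c * xi
    if xi < 0:
        return D - d * xi
    return min(C, D)


def simulate_band_policy(x0, a, alpha, beta, b, mu, sigma, lam,
                         h, p, C, c, D, d, horizon, dt, rng):
    """Discounted cost of one Euler-simulated path under (a, alpha, beta, b)."""
    x, cost, t = x0, 0.0, 0.0
    for _ in range(int(round(horizon / dt))):
        discount = math.exp(-lam * t)
        if x <= a:        # balance too low: jump up to alpha
            cost += discount * intervention_cost(alpha - x, C, c, D, d)
            x = alpha
        elif x >= b:      # balance too high: jump down to beta
            cost += discount * intervention_cost(beta - x, C, c, D, d)
            x = beta
        cost += discount * holding_penalty_cost(x, h, p) * dt
        # Uncontrolled dynamics: dX = -mu dt - sigma dW
        x += -mu * dt - sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
    return cost


# Hypothetical parameters, chosen only so that a < alpha <= beta < b.
rng = random.Random(1)
params = dict(a=-1.0, alpha=1.0, beta=3.0, b=5.0, mu=0.5, sigma=1.0,
              lam=0.1, h=1.0, p=5.0, C=2.0, c=0.1, D=2.0, d=0.1,
              horizon=50.0, dt=0.01)
costs = [simulate_band_policy(2.0, rng=rng, **params) for _ in range(20)]
average_cost = sum(costs) / len(costs)
```

Averaging over many simulated paths gives a Monte Carlo estimate of the expected discounted cost for a given set of band levels, which could then be compared across candidate policies.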

The results of Constantinides were complemented,
extended, or improved by Cadenillas
et al. (2010), Cadenillas and Zapatero (1999),
Feng and Muthuraman (2010), Harrison et al.
(1983), and Ormeci et al. (2008).
Classical Frequency-Domain Design Methods

Abstract

The design of feedback control systems in industry
is probably accomplished using frequency-response
(FR) methods more often than any other.
Frequency-response design is popular primarily
because it provides good designs in the face of
uncertainty in the plant model (G(s) in Fig. 1).
For example, for systems with poorly known
or changing high-frequency resonances, we can
temper the feedback design to alleviate the effects
of those uncertainties. Currently, this tempering is
carried out more easily using FR design than with
any other method. The method is most effective
for systems that are stable in open loop; however,
it can also be applied to systems with instabilities.
This section will introduce the reader to methods
of design (i.e., finding D(s) in Fig. 1) using lead
and lag compensation. It will also cover the use
of FR design to reduce steady-state errors and
to improve robustness to uncertainties in high-frequency
dynamics.


Keywords 

Amplitude stabilization; Bandwidth; Bode plot; 
Crossover frequency; Frequency response; Gain; 
Gain stabilization; Gain margin; Notch filter; 
Phase; Phase margin; Stability 


Introduction 

Finding an appropriate compensation (D(s) in
Fig. 1) using frequency response is probably the
easiest of all feedback control design methods.
Designs are achievable starting with the FR
plots of both magnitude and phase of G(s), then
selecting D(s) to achieve certain values of the
gain and/or phase margins and system bandwidth
or error characteristics. This section will cover
the design process for finding an appropriate
D(s).


Design Specifications 

As discussed in Section X, the gain margin
(GM) is the factor by which the gain can be
raised before instability results. The phase
margin (PM) is the amount by which the
phase of $D(j\omega)G(j\omega)$ exceeds -180° when
$|D(j\omega)G(j\omega)| = 1$, that is, at the crossover frequency.
Performance requirements for control systems
are often partially specified in terms of PM and/or
GM. For example, a typical specification might
include the requirement that PM > 50° and GM
> 5. It can be shown that the PM tends to
correlate well with the damping ratio, $\zeta$, of the
closed-loop roots. In fact, it is shown in Franklin
et al. (2010) that

$$\zeta \cong \frac{PM}{100}$$

for many systems; however, the actual resulting
damping and/or response overshoot of the final
closed-loop system will need to be verified if they
are specified as well as the PM. A PM of 50°
would tend to yield a $\zeta$ of 0.5 for the closed-loop
roots, which is a modestly damped system. The
GM does not generally correlate directly with the
damping ratio, but is a measure of the degree of
stability and is a useful secondary specification to
ensure stability.

Another design specification is the bandwidth,
which was defined in Section X. The
bandwidth is a direct measure of the frequency
at which the closed-loop system starts to fail in
following the input command. It is also a measure
of the speed of response of a closed-loop system.
Generally speaking, it correlates well with the
step response rise time of the system.

In some cases, the steady-state error must
be less than a certain amount. As discussed in
Franklin et al. (2010), the steady-state error is
a direct function of the low-frequency gain of
the FR magnitude plot. However, increasing the
low-frequency gain typically will raise the entire
magnitude plot upward, thus increasing the
magnitude-1 crossover frequency and, therefore,
increasing the speed of response and bandwidth
of the system.

Classical Frequency-Domain Design Methods, Fig. 1 Feedback system showing compensation, D(s) (Source:
Franklin et al. (2010, p. 249), Reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)

Compensation Design 

In some cases, the design of a feedback compensation
can be accomplished by using proportional
control only, i.e., setting D(s) = K
(see Fig. 1) and selecting a suitable value for K.
This can be accomplished by plotting the magnitude
and phase of G(s), looking at $|G(j\omega)|$
at the frequency where $\angle G(j\omega) = -180^\circ$, and
then selecting K so that $|KG(j\omega)|$ yields the
desired GM. Similarly, if a particular value of
PM is desired, one can find the frequency where
$\angle G(j\omega) = -180^\circ +$ the desired PM. The value
of $|KG(j\omega)|$ at that frequency must equal 1;
therefore, the value of $|G(j\omega)|$ must equal 1/K.
Note that the $|KG(j\omega)|$ curve moves vertically
based on the value of K; however, the $\angle KG(j\omega)$
curve is not affected by the value of K. This
characteristic simplifies the design process.
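
The proportional-gain procedure for a desired PM can be sketched numerically. The example below is a minimal illustration, not a general design tool: the plant G(s) = 1/(s(s + 1)), the 45° target, the frequency search range, and the function names are assumptions chosen here, and the bisection relies on the phase of this particular plant decreasing monotonically with frequency.

```python
import cmath
import math


def plant(s):
    """Example plant G(s) = 1 / (s (s + 1)); an assumed test case."""
    return 1.0 / (s * (s + 1.0))


def phase_deg(g, omega):
    """Phase of g(j*omega) in degrees."""
    return math.degrees(cmath.phase(g(1j * omega)))


def gain_for_phase_margin(g, pm_deg, w_lo=1e-3, w_hi=1e3, iters=200):
    """Return (K, crossover frequency) so that K*g has the requested PM.

    Assumes the phase of g falls monotonically on [w_lo, w_hi] and stays
    between -180 and 0 degrees there, so bisection on frequency applies.
    """
    target = -180.0 + pm_deg
    for _ in range(iters):
        w_mid = math.sqrt(w_lo * w_hi)      # bisect on a log-frequency scale
        if phase_deg(g, w_mid) > target:
            w_lo = w_mid
        else:
            w_hi = w_mid
    w_c = math.sqrt(w_lo * w_hi)
    # |K G(j w_c)| must equal 1, so K = 1 / |G(j w_c)|.
    return 1.0 / abs(g(1j * w_c)), w_c


K, w_c = gain_for_phase_margin(plant, 45.0)
```

For this plant the phase is -90° - arctan(ω), so a 45° margin places the crossover at ω = 1 rad/sec, where |G| = 1/√2 and therefore K = √2.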

In more typical cases, proportional feedback
alone is not sufficient. There is a need for a
certain damping (i.e., PM) and/or speed of response
(i.e., bandwidth) and there is no value of
K that will meet the specifications. Therefore,
some increased damping from the compensation
is required. Likewise, a certain steady-state error
requirement and its resulting low-frequency gain
will cause the $|D(j\omega)G(j\omega)|$ to be greater than
desired for an acceptable PM, so more phase
lead is required from the compensation. This is
typically accomplished by lead compensation.
A phase increase (or lead) is accomplished by
placing a zero in D(s). However, that alone
would cause an undesirable high-frequency gain
which would amplify noise; therefore, a first-order
pole is added in the denominator at frequencies
substantially higher than the zero break point
of the compensator. Thus, the phase lead still
occurs, but the amplification at high frequencies
is limited. The resulting lead compensation has a
transfer function of

$$D(s) = K\,\frac{Ts + 1}{\alpha Ts + 1}, \qquad \alpha < 1, \tag{1}$$

where $1/\alpha$ is the ratio between the pole/zero
break-point frequencies. Figure 2 shows the frequency
response of this lead compensation. The
maximum amount of phase lead supplied is dependent
on the ratio of the pole to zero and is
shown in Fig. 3 as a function of that ratio.
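
The dependence of the maximum phase lead on the lead ratio can also be written in closed form. The following is the standard result for a compensator of the form in Eq. (1), given here as a worked aid; the symbols $\phi$, $\phi_{max}$, and $\omega_{max}$ are introduced for this sketch.

```latex
\phi(\omega) = \arctan(T\omega) - \arctan(\alpha T\omega), \qquad
\omega_{max} = \frac{1}{T\sqrt{\alpha}}, \qquad
\sin\phi_{max} = \frac{1-\alpha}{1+\alpha}.
```

For a lead ratio of $1/\alpha = 5$ this gives $\phi_{max} = \arcsin(2/3) \approx 42^\circ$, and for $1/\alpha = 10$ it gives $\phi_{max} = \arcsin(9/11) \approx 55^\circ$. The peak occurs at the geometric mean of the zero and pole break-point frequencies.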

For example, a lead compensator with a
zero at s = -2 (T = 0.5) and a pole at
s = -10 ($\alpha T$ = 0.1) (and thus $\alpha = 1/5$) would
yield the maximum phase lead of $\phi_{max} = 40^\circ$.
Note from the figure that we could increase the
phase lead almost up to 90° using higher values
of the lead ratio, $1/\alpha$; however, Fig. 2 shows that
increasing values of $1/\alpha$ also produces higher
amplifications at higher frequencies. Thus, our
task is to select a value of $1/\alpha$ that is a good
compromise between an acceptable PM and
acceptable noise sensitivity at high frequencies.
Usually the compromise suggests that a lead
compensation should contribute a maximum
of 70° to the phase. If a greater phase lead is
needed, then a double-lead compensation would
be suggested, where

$$D(s) = K \left( \frac{Ts + 1}{\alpha Ts + 1} \right)^2 .$$
Classical Frequency-Domain Design Methods, Fig. 2 Lead-compensation frequency response with
$1/\alpha$ = 10, K = 1 (Source: Franklin et al. (2010, p. 349), Reprinted by permission of Pearson
Education, Inc.)

Even if a system had negligible amounts of
noise present, the pole must exist at some point
because of the impossibility of building a pure
differentiator. No physical system, whether mechanical,
electrical, or digital, responds with infinite
amplitude at infinite frequencies, so there will be
a limit in the frequency range (or bandwidth) for
which derivative information (or phase lead) can
be provided.

As an example of designing a lead compensator,
let us design compensation for a DC motor
with the transfer function

$$G(s) = \frac{1}{s(s + 1)}.$$

We wish to obtain a steady-state error of less than
0.1 for a unit-ramp input and we desire a system
bandwidth greater than 3 rad/sec. Furthermore,
we desire a PM of 45°. To accomplish the error
requirement, Franklin et al. (2010) show that

$$e_{ss} = \lim_{s \to 0} s \left[ \frac{1}{1 + D(s)G(s)} \right] R(s), \tag{2}$$

and if $R(s) = 1/s^2$ for a unit ramp, Eq. (2)
reduces to

$$e_{ss} = \lim_{s \to 0} \left\{ \frac{1}{s + D(s)\,[1/(s + 1)]} \right\} = \frac{1}{D(0)}.$$

Therefore, we find that D(0), the steady-state
gain of the compensation, cannot be less than 10
if it is to meet the error criterion, so we pick
K = 10. The frequency response of KG(s) in
Fig. 4 shows that the PM = 20° if no phase
lead is added by compensation. If it were possible
to simply add phase without affecting the
magnitude, we would need an additional phase
of only 25° at the KG(s) crossover frequency
of $\omega$ = 3 rad/sec. However, maintaining the
same low-frequency gain and adding a compensator
zero will increase the crossover frequency;
hence, more than a 25° phase contribution will
be required from the lead compensation. To be
safe, we will design the lead compensator so
that it supplies a maximum phase lead of 40°.
Figure 3 shows that $1/\alpha$ = 5 will accomplish
that goal. We will derive the greatest benefit from
the compensation if the maximum phase lead
from the compensator occurs at the crossover frequency.
With some trial and error, we determine
that placing the zero at $\omega$ = 2 rad/sec and the
pole at $\omega$ = 10 rad/sec causes the maximum
phase lead to be at the crossover frequency. The
compensation, therefore, is

$$D(s) = 10\,\frac{s/2 + 1}{s/10 + 1}.$$

Classical Frequency-Domain Design Methods, Fig. 4 Frequency response for lead-compensation design (Source:
Franklin et al. (2010, p. 352), Reprinted by permission of Pearson Education, Inc.)

The frequency-response characteristics of
L(s) = D(s)G(s) in Fig. 4 can be seen to yield
a PM of 53°, which satisfies the PM and steady-state
error design goals. In addition, the crossover
frequency of 5 rad/sec will also yield a bandwidth
greater than 3 rad/sec, as desired.
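
The margins quoted for this design can be checked numerically without a plotting package. The sketch below is a hedged illustration: the helper names and the frequency search range are assumptions made here, and the bisection relies on the loop magnitude of these two particular transfer functions decreasing monotonically with frequency.

```python
import cmath
import math


def loop_uncompensated(s):
    """K G(s) with K = 10 and G(s) = 1 / (s (s + 1))."""
    return 10.0 / (s * (s + 1.0))


def loop_compensated(s):
    """L(s) = D(s) G(s) with D(s) = 10 (s/2 + 1) / (s/10 + 1)."""
    return 10.0 * (s / 2.0 + 1.0) / (s / 10.0 + 1.0) / (s * (s + 1.0))


def crossover_and_pm(loop, w_lo=1e-2, w_hi=1e2, iters=200):
    """Return (crossover frequency in rad/sec, phase margin in degrees).

    Assumes |loop(jw)| falls monotonically from above 1 to below 1 on
    [w_lo, w_hi], which holds for both loops defined above.
    """
    for _ in range(iters):
        w_mid = math.sqrt(w_lo * w_hi)      # bisect on a log-frequency scale
        if abs(loop(1j * w_mid)) > 1.0:
            w_lo = w_mid
        else:
            w_hi = w_mid
    w_c = math.sqrt(w_lo * w_hi)
    pm = 180.0 + math.degrees(cmath.phase(loop(1j * w_c)))
    return w_c, pm


w_unc, pm_unc = crossover_and_pm(loop_uncompensated)
w_comp, pm_comp = crossover_and_pm(loop_compensated)
```

Computed this way, the uncompensated crossover is close to 3 rad/sec with a PM somewhat below the 20° read from the plot, and the compensated loop crosses over a little below 5 rad/sec with a PM of about 53°, in line with the values read from Fig. 4.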

Lag compensation has the same form as the
lead compensation in Eq. (1) except that $\alpha > 1$.
Therefore, the pole is at a lower frequency than
the zero and it produces higher gain at lower
frequencies. The compensation is used primarily
to reduce steady-state errors by raising the
low-frequency gain but without increasing the
crossover frequency and speed of response. This
can be accomplished by placing the pole and zero
of the lag compensation well below the crossover
frequency. Alternatively, lag compensation can
also be used to improve the PM by keeping the
low-frequency gain the same, but reducing the
gain near crossover, thus reducing the crossover
frequency. That will usually improve the PM
since the phase of the uncompensated system
typically is higher at lower frequencies.

Systems being controlled often have high- 
frequency dynamic phenomena, such as 
mechanical resonances, that could have an 
impact on the stability of a system. In very- 
high-performance designs, these high-frequency 
dynamics are included in the plant model, 
and a compensator is designed with a specific 
knowledge of those dynamics. However, a more 
robust approach for designing with uncertain 
high-frequency dynamics is to keep the high- 
frequency gain low, just as we did for sensor- 
noise reduction. The reason for this can be 
seen from the gain-frequency relationship of 
a typical system, shown in Fig. 5. The only 
way instability can result from high-frequency 
dynamics is if an unknown high-frequency 
resonance causes the magnitude to rise above 1. 



Classical Frequency-Domain Design Methods, Fig. 5 Effect of
high-frequency plant uncertainty (Source: Franklin et al. (2010,
p. 372), Reprinted by permission of Pearson Education, Inc.)


Conversely, if all unknown high-frequency 
phenomena are guaranteed to remain below a 
magnitude of 1, stability can be guaranteed. 
The likelihood of an unknown resonance in 
the plant G rising above 1 can be reduced if 
the nominal high-frequency loop gain (L) is 
lowered by the addition of extra poles in D(s). 
When the stability of a system with resonances 
is assured by tailoring the high-frequency 
magnitude never to exceed 1, we refer to this 
process as amplitude or gain stabilization. 
Of course, if the resonance characteristics are 
known exactly and remain the same under all 
conditions, a specially tailored compensation, 
such as a notch filter at the resonant frequency, 
can be used to tailor the phase for stability even 
though the amplitude does exceed magnitude 1 
as explained in Franklin et al. (2010). Design 
of a notch filter is more easily carried out using 
root locus or state-space design methods, all of 
which are discussed in Franklin et al. (2010). This 
method of stabilization is referred to as phase 
stabilization. A drawback to phase stabilization 
is that the resonance information is often not 
available with adequate precision or varies with 
time; therefore, the method is more susceptible 
to errors in the plant model used in the design. 
Thus, we see that the sensitivities to plant uncertainty and to
sensor noise are both reduced by sufficiently low gain at high
frequency.

Summary and Future Directions 

Before the common use of computers in design, frequency-response
design was the only widely used method. While it is still the most
widely used method for routine designs, the design of complex
systems is now carried out using a multitude of methods. This
section introduces just one of many possible methods.

Cross-References 

► Frequency-Response and Frequency-Domain Models

► Polynomial/Algebraic Design Methods

► Quantitative Feedback Theory

► Spectral Factorization

Abstract

Robust control theory has introduced several new and challenging
problems for researchers. Some of these problems have been solved by
innovative approaches and have led to the development of new and
efficient algorithms. However, some other problems in robust control
theory have attracted a significant amount of research, but none of
the proposed algorithms were efficient, namely, had execution time
bounded by a polynomial of the “problem size.” Several important
problems in robust control theory are either of decision type or of
computation/approximation type, and one would like to have an
algorithm which can be used to answer all or most of the possible
cases and can be executed on a classical computer in a reasonable
amount of time. There is a branch of theoretical computer science,
called the theory of computation, which can be used to study the
difficulty of problems in robust control theory. In the following,
classical computer system, algorithm, efficient algorithm,
unsolvability, tractability, NP-hardness, and NP-completeness will
be introduced in a more rigorous fashion, with applications to
problems from robust control theory.

Keywords 

Approximation complexity; Computational complexity; NP-complete;
NP-hard; Unsolvability

Introduction 

The term algorithm is used to refer to different notions which are
all somewhat consistent with our intuitive understanding. This
ambiguity may sometimes generate significant confusion, and
therefore, a rigorous definition is of extreme importance. One
commonly accepted “intuitive” definition is a set of rules that a
person can perform with paper and pencil. However, there are
“algorithms” which involve random number generation, for example,
finding a primitive root in ℤ_p (Knuth 1997). Based on this
observation, one may ask whether a random number generation-based
set of rules should also be considered as an algorithm, provided
that it will terminate after finitely many steps for all instances
of the problem or for a significant majority of the cases. In a
similar fashion, one may ask whether any real number, including
irrational ones which cannot be represented on a digital computer
without an approximation error, should be allowed as an input to an
algorithm and, furthermore, whether all calculations should be
limited to algebraic functions only or whether exact calculation of
non-algebraic functions, e.g., trigonometric functions, the gamma
function, etc., should be acceptable in an algorithm. Although all
of these seem acceptable with respect to our intuitive understanding
of the algorithm, from a rigorous point of view, they are different
notions. In the context of robust control theory, as well as many
other engineering disciplines, there is a separate and widely
accepted definition of algorithm, which is based on today’s digital
computers, more precisely the Turing machine (Turing 1936). Alan M.
Turing defined a “hypothetical computation machine” to formally
define the notions of algorithm and computability. A Turing machine
is, in principle, quite similar to today’s digital computers widely
used in many engineering applications. The engineering community
seems to widely accept the use of current digital computers and
Turing’s definitions of algorithm and computability.

However, depending on new scientific, engineering, and technological
developments, superior computation machines may be constructed. For
example, there is no guarantee that quantum computing research will
not lead to superior computation machines (Chen et al. 2006; Kaye
et al. 2007). In the future, the engineering community may feel the
need to revise formal definitions of algorithm, computability,
tractability, etc., if such superior computation machines can be
constructed and used for scientific/engineering applications.

Turing Machines and Unsolvability 

A Turing machine is basically a mathematical model of a simplified
computation device. The original definition involves a tape-like
device for memory. For an easy-to-read introduction to the Turing
machine model, see Garey and Johnson (1979) and Papadimitriou
(1995), and for more details, see Hopcroft et al. (2001), Lewis and
Papadimitriou (1998), and Sipser (2006). Although this is quite
simple and low-performance “hardware” compared to today’s
engineering standards, the following two observations justify its
use in the study of the computational complexity of engineering
problems. Anything which can be solved on today’s digital computers
can be solved on a Turing machine. Furthermore, a polynomial-time
algorithm on today’s digital computers will again correspond to a
polynomial-time algorithm on the original Turing machine, and vice
versa. A widely accepted definition for an algorithm is a Turing
machine with a program, which is guaranteed to terminate after
finitely many steps.
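
To make the tape-and-program picture concrete, here is a minimal sketch of a single-tape Turing machine simulator. It is an illustration only, not a construction taken from the references above: the transition-table format, the function name `run_turing_machine`, and the example program `INCREMENT` (which adds one to a binary number) are all assumptions made for this example.

```python
def run_turing_machine(program, tape_input, start, halt, blank="_", max_steps=10_000):
    """Simulate a single-tape Turing machine.

    program maps (state, symbol) -> (new_state, symbol_to_write, move),
    with move equal to "L" or "R". Returns (final tape contents, steps used).
    """
    tape = {i: ch for i, ch in enumerate(tape_input)}   # sparse, unbounded tape
    state, head, steps = start, 0, 0
    while state != halt:
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted")
        symbol = tape.get(head, blank)
        state, write, move = program[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    cells = [tape.get(i, blank) for i in range(min(tape), max(tape) + 1)]
    return "".join(cells).strip(blank), steps

# Hypothetical example program: binary increment.
# "right": scan to the end of the input; "carry": propagate the carry leftwards.
INCREMENT = {
    ("right", "0"): ("right", "0", "R"),
    ("right", "1"): ("right", "1", "R"),
    ("right", "_"): ("carry", "_", "L"),
    ("carry", "1"): ("carry", "0", "L"),
    ("carry", "0"): ("halt", "1", "L"),
    ("carry", "_"): ("halt", "1", "L"),
}

result, steps = run_turing_machine(INCREMENT, "1011", start="right", halt="halt")
print(result, steps)
```

This particular program halts on every input after a number of steps that grows linearly with the input length, so in the terminology of this entry it is an algorithm, and a polynomial-time one. The `max_steps` guard is a convenience of the simulator; it is not part of the Turing machine model.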

For some mathematical and engineering problems, it can be shown that
there can be no algorithm which can handle all possible cases. Such
problems are called unsolvable. The condition “all cases” may be
considered too tough, and one may argue that such negative results
have only theoretical importance and have no practical implications.
But such results do imply that we should concentrate our efforts on
alternative research directions, like the development of algorithms
only for cases which appear more frequently in real
scientific/engineering applications, without asking the algorithm to
work for the remaining cases as well.

Here is a famous unsolvable mathematical problem: Hilbert’s tenth
problem asks for an algorithm that tests whether a Diophantine
equation has an integer solution. However, in 1970, Matiyasevich
showed that there can be no such algorithm (Matiyasevich 1993).
Therefore, we say that the problem of checking whether a Diophantine
equation has an integer solution is unsolvable.
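
The asymmetry behind this result can be illustrated with a brute-force search. The sketch below is a hypothetical helper, not a method from the cited work: it tries every integer point inside a box and returns the first root it finds. If a solution exists, enlarging the bound will eventually find it, but an empty result for any finite bound proves nothing, and the unsolvability result says that no algorithm can close this gap for all Diophantine equations.

```python
from itertools import product

def search_integer_solution(poly, num_vars, bound):
    """Return an integer root of poly with every |x_i| <= bound, or None.

    A None result is inconclusive: a root may exist outside the box.
    """
    values = range(-bound, bound + 1)
    for point in product(values, repeat=num_vars):
        if poly(*point) == 0:
            return point
    return None

# x^3 + y^3 = 35 has the integer solution (2, 3); the search finds it.
found = search_integer_solution(lambda x, y: x**3 + y**3 - 35, 2, bound=5)
print(found)

# x^2 + y^2 = 3 has no integer solution. The search returns None, but only
# a separate argument (here, elementary) shows that no solution exists at all.
missing = search_integer_solution(lambda x, y: x**2 + y**2 - 3, 2, bound=10)
print(missing)
```

The search space grows as (2·bound + 1) raised to the number of variables, so even when a solution exists this procedure says nothing useful about running time.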

Several unsolvability results for dynamical systems can be proved by
using the Post correspondence problem (Davis 1985) and the embedding
of free semigroups into matrices. For example, the problem of
checking the stability of saturated linear dynamical systems is
proved to be undecidable (Blondel et al. 2001), meaning that no
general stability test algorithm can be developed for such systems.
A similar unsolvability result is reported in Blondel and Tsitsiklis
(2000a) for boundedness of switching systems of the type

$$x(k+1) = A_{f(k)} \, x(k),$$

where f is assumed to be an arbitrary and unknown function from ℕ
into {0, 1}. A closely related asymptotic stability problem is
equivalent to testing whether the joint spectral radius (JSR) (Rota
and Strang 1960) of a set of matrices is less than one. For quite a
long period of time, there was a conjecture called the finiteness
conjecture (FC) (Lagarias and Wang 1995), which was generally
believed or hoped to be true, at least by a group of researchers. FC
may be interpreted as “For asymptotic stability of
x(k+1) = A_{f(k)} x(k) type switching systems, it is enough to
consider periodic switchings only.” There was no known
counterexample, and the truth of this conjecture would imply
existence of an algorithm for the abovementioned JSR problem.
However, it was shown in Bousch and Mairesse (2002) that FC is not
true (see Blondel et al. (2003) for a simplified proof). There are
numerous computationally very valuable procedures related to JSR
approximation; for example, see Blondel and Nesterov (2005) and
references therein. However, the development of an algorithm to test
whether the JSR is less than one remains an open problem.
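
One elementary way to approximate the JSR, sketched below, uses two standard facts: for every product P of length k of matrices from the set, the spectral radius of P raised to the power 1/k is a lower bound on the JSR, and the largest induced norm of such products raised to the power 1/k is an upper bound. This is a minimal illustration for 2×2 matrices using only the standard library; it is not one of the procedures from the references above, and the function names are hypothetical.

```python
import cmath
from itertools import product

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spectral_radius(M):
    """Largest eigenvalue modulus of a 2x2 matrix via trace and determinant."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + root) / 2), abs((tr - root) / 2))

def inf_norm(M):
    """Induced infinity norm (maximum absolute row sum)."""
    return max(abs(row[0]) + abs(row[1]) for row in M)

def jsr_bounds(matrices, max_len):
    """Bracket the JSR using all products of length 1..max_len."""
    lower, upper = 0.0, float("inf")
    for length in range(1, max_len + 1):
        best_rho, best_norm = 0.0, 0.0
        for word in product(matrices, repeat=length):
            P = word[0]
            for M in word[1:]:
                P = matmul(P, M)
            best_rho = max(best_rho, spectral_radius(P))
            best_norm = max(best_norm, inf_norm(P))
        lower = max(lower, best_rho ** (1.0 / length))
        upper = min(upper, best_norm ** (1.0 / length))
    return lower, upper

A0 = [[1, 1], [0, 1]]
A1 = [[1, 0], [1, 1]]
lower, upper = jsr_bounds([A0, A1], max_len=6)
print(lower, upper)
```

For this pair, the length-2 product A0·A1 already yields the lower bound (1 + √5)/2 ≈ 1.618. Two caveats tie back to the text: the number of products grows exponentially with the length, and because the finiteness conjecture is false, the lower bound obtained from periodic switching sequences need not reach the JSR at any finite length, so a bracket that contains 1 leaves the stability question undecided.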

For further results on unsolvability and unsolved problems in robust
control, see Blondel et al. (1999), Blondel and Megretski (2004),
and references therein.

Tractability, NP-Hardness, and
NP-Completeness

The engineering community is interested not only in solution
algorithms but in algorithms which are fast even in the worst case
or, if not, at least on the average. Sometimes, this speed
requirement may be relaxed to being fast for most of the cases and
sometimes to only a significant percentage of the cases. Currently,
the theory of computation is developed around the idea of algorithms
which are polynomial time even in the worst case, namely, with
execution time bounded by a polynomial of the problem size (Garey
and Johnson 1979; Papadimitriou 1995). Such algorithms are also
called efficient, and associated problems are classified as
tractable. The term problem size means the number of bits used in a
suitable encoding of the problem (Garey and Johnson 1979;
Papadimitriou 1995).

One may argue that this worst-case approach of being always
polynomial time is a quite conservative requirement. In reality, a
practicing engineer may consider being polynomial time on the
average quite satisfactory for many applications. The same may be
true for algorithms which are polynomial time for most of the cases.
However, the existing computational complexity theory is developed
around this idea of being polynomial time even in the worst case.
Therefore, many of the computational complexity results proved in
the literature do not rule out algorithms which are polynomial time
on the average or polynomial time for most of the cases. Note that
despite not being efficient in the worst-case sense, such algorithms
may be considered quite valuable by a practicing engineer.
Tractability and efficiency can be defined in several different
ways, but the abovementioned approach of polynomial-time solvability
even in the worst case is widely adopted by the engineering
community.

NP-hardness and NP-completeness were originally defined to express
the inherent difficulty of decision-type problems, not of
approximation-type problems. Although approximation complexity is an
important and active research area in the theory of computation
(Papadimitriou 1995), most of the classical results are on
decision-type problems. Many robust control-related problems can be
formulated as “Check whether γ < 1,” which is a decision-type
problem. An approximate value of γ may not always be good enough to
“solve” the problem, i.e., to decide about robust stability. For
certain other engineering applications for which approximate values
of optimization problems are good enough to “solve” the problem, the
complexity of a decision problem may not be very relevant. For
example, in a minimum effort control problem, usually there may be
no point in computing the exact value of the minimum, because good
approximations will be just fine for most cases. However, for a
robust control problem, a result like γ = 0.99 ± 0.02 may not be
enough to decide about robust stability, although the approximation
error is only about 2 %. Basically, both the conservativeness of the
current tractability definition and the differences between
decision- and approximation-type problems should always be kept in
mind when interpreting computational complexity results reported
here as well as in the literature.
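
The gap between approximating a quantity and deciding a threshold test on it can be written down in a few lines. The helper below is a hypothetical sketch, not part of any cited method: given an estimate of the robustness measure and a guaranteed absolute error bound, it answers the “less than 1” question only when the whole error interval lies on one side of the threshold.

```python
def decide_less_than_one(estimate, abs_error):
    """Decide 'true value < 1' from an estimate with a guaranteed error bound.

    Returns True or False when the interval [estimate - abs_error,
    estimate + abs_error] settles the question, and None otherwise.
    """
    if estimate + abs_error < 1.0:
        return True          # every value in the interval is below 1
    if estimate - abs_error >= 1.0:
        return False         # every value in the interval is at least 1
    return None              # interval straddles 1: approximation is not enough

print(decide_less_than_one(0.99, 0.02))   # the example from the text: undecided
print(decide_less_than_one(0.90, 0.02))
print(decide_less_than_one(1.05, 0.02))
```

With the values from the text, the interval is [0.97, 1.01], which contains 1, so a 2 % approximation leaves the robust stability question open even though it would be more than adequate for, say, a minimum-effort computation.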

In this subsection, and in the next one, we will consider decision
problems only. The class P corresponds to decision problems which
can be solved by a Turing machine with a suitable program in
polynomial time (Garey and Johnson 1979). This is interpreted as
decision problems which have polynomial-time solution algorithms.
The definition of the class NP is more technical and involves
nondeterministic Turing machines (Garey and Johnson 1979). It may be
interpreted as the class of decision problems for which a “yes”
answer can be verified in polynomial time, given a suitable
certificate. It is currently unknown whether P and NP are equal or
not. This is a major open problem, and its importance in the theory
of computation is comparable to the importance of the Riemann
hypothesis in number theory.
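
The “easy to verify, possibly hard to find” reading of NP can be illustrated with Boolean satisfiability, the SATISFIABILITY problem discussed later in this entry. The sketch below is an assumption-laden illustration: formulas are given in conjunctive normal form with signed integers as literals (a common convention, chosen here for convenience), and the function names are hypothetical. Checking a proposed assignment takes time linear in the formula size, whereas the naive search examines up to 2^n assignments.

```python
from itertools import product

def verify(cnf, assignment):
    """Check a certificate: does the assignment satisfy every clause?

    cnf is a list of clauses; a literal k means variable k, and -k its negation.
    Runs in time linear in the total number of literals.
    """
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in cnf)

def brute_force_sat(cnf, num_vars):
    """Exhaustive search over all 2**num_vars assignments (exponential time)."""
    for bits in product([False, True], repeat=num_vars):
        assignment = {i + 1: bit for i, bit in enumerate(bits)}
        if verify(cnf, assignment):
            return assignment
    return None

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
cnf = [[1, -2], [2, 3], [-1, -3]]
print(verify(cnf, {1: True, 2: True, 3: False}))   # certificate check
print(brute_force_sat(cnf, 3))                     # search
print(brute_force_sat([[1], [-1]], 1))             # unsatisfiable formula
```

Whether the exponential search can always be replaced by a polynomial-time procedure is exactly the P versus NP question; the sketch shows only that verification is cheap, not that search is necessarily expensive.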

A problem is NP-complete if it is in NP and every NP problem
polynomially reduces to it (Garey and Johnson 1979). For an
NP-complete problem, being in P is equivalent to P = NP. There are
literally hundreds of such problems, and it is generally argued that
since after many years of research nobody has been able to develop a
polynomial-time algorithm for these NP-complete problems, there is
probably no such algorithm, and most likely P ≠ NP. Although current
evidence is more toward P ≠ NP, this does not constitute a formal
proof, and the history of mathematics and science is full of
surprising discoveries.

A problem (not necessarily in NP) is called NP-hard if and only if
there is an NP-complete problem which is polynomial-time reducible
to it (Garey and Johnson 1979). Being NP-hard is sometimes called
being intractable and means that unless P = NP, which is considered
to be very unlikely by a group of researchers, no polynomial-time
solution algorithm can be developed. All NP-complete problems are
also NP-hard, but they are only as “hard” as any other problem in
the set of NP-complete problems.

The first known NP-complete problem is SATISFIABILITY (Cook 1971).
In this problem, there is a single Boolean expression with several
variables, and we would like to test whether there is an assignment
to these variables which makes the Boolean expression true. This
important discovery enabled proofs of NP-completeness or NP-hardness
of several other problems by using simple polynomial reduction
techniques only (Garey and Johnson 1979). Among these, quadratic
programming is an important one and led to the discovery of several
other complexity results in robust control theory. The quadratic
programming (QP) problem can be defined as

$$q = \min_{Ax \le b} \; x^T Q x + c^T x,$$

more precisely, testing whether q < 1 or not (decision version).
When the matrix Q is positive definite, convex optimization
techniques can be used; however, the general version of the problem
is NP-hard.

A related problem is linear programming (LP)

$$q = \min_{Ax \le b} \; c^T x,$$

which is used in certain robust control problems (Dahleh and
Diaz-Bobillo 1994) and has a quite interesting status. The simplex
method (Dantzig 1963) is a very popular computational technique for
LP and is known to have polynomial-time complexity on the “average”
(Smale 1983). However, there are examples where the simplex method
requires a number of steps that grows exponentially with the problem
size (Klee and Minty 1972). In 1979, Khachiyan proposed the
ellipsoid algorithm for LP, which was the first known
polynomial-time approximation algorithm (Schrijver 1998). Because of
the nature of the problem, one can answer the decision version of LP
in polynomial time by using the ellipsoid algorithm for
approximation and stopping when the error is below a certain level.
But all of these results are for standard Turing machines with input
parameters restricted to rational numbers. An interesting open
problem is whether LP admits a polynomial algorithm in the real
number model of computation.

Complexity of Certain Robust Control 
Problems 

There are several computational complexity results for robust
control problems (see Blondel and Tsitsiklis (2000b) for a more
detailed survey). Here we summarize some of the key results on
interval matrices and structured singular values.

The Kharitonov theorem is about robust Hurwitz stability of
polynomials with coefficients restricted to intervals (Kharitonov
1978). The problem is known to have a surprisingly simple solution;
however, the matrix version of the problem has a quite different
nature. If we have a matrix family

$$\mathcal{A} = \{ A \in \mathbb{R}^{n \times n} : \alpha_{i,j} \le A_{i,j} \le \beta_{i,j}, \; i,j = 1, \ldots, n \},$$

where α_{i,j}, β_{i,j} are given constants for i, j = 1, ..., n,
then it is called an interval matrix. Such matrices do occur in
descriptions of uncertain dynamical systems. The following two
stability problems about interval matrices are known to be NP-hard:

Interval Matrix Problem 1 (IMP1): Decide
whether a given interval matrix, A, is robust
Hurwitz stable or not. Namely, check
whether all members of A are Hurwitz-stable
matrices, i.e., all eigenvalues are in the open left
half plane.

Interval Matrix Problem 2 (IMP2): Decide 
whether a given interval matrix, A, has a 
Hurwitz-stable matrix or not. Namely, check 
whether there exists at least one matrix in A 
which is Hurwitz stable. 

For a proof of NP-hardness of IMP1, see Poljak and Rohn (1993) and
Nemirovskii (1993), and for a proof of NP-hardness of IMP2, see
Blondel and Tsitsiklis (1997).
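
A natural first attempt at these problems is to test the “corner” matrices of the interval family, and the sketch below shows why that does not lead to an efficient algorithm. It is a hypothetical illustration restricted to the 2×2 case, where a real matrix is Hurwitz stable exactly when its trace is negative and its determinant is positive. The enumeration visits 2^(n²) vertex matrices, and even then the outcome is only partial: a single unstable vertex answers IMP1 with “no” and a single stable vertex answers IMP2 with “yes,” but in general stability of all vertices does not prove robust stability of the whole family.

```python
from itertools import product

def is_hurwitz_2x2(M):
    """A real 2x2 matrix is Hurwitz stable iff trace < 0 and determinant > 0."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return trace < 0 and det > 0

def vertex_matrices(lower, upper):
    """Yield all 2**(n*n) matrices whose entries sit at an interval endpoint."""
    n = len(lower)
    cells = [(i, j) for i in range(n) for j in range(n)]
    for choice in product((0, 1), repeat=len(cells)):
        M = [[0.0] * n for _ in range(n)]
        for (i, j), pick in zip(cells, choice):
            M[i][j] = upper[i][j] if pick else lower[i][j]
        yield M

# Hypothetical interval bounds, chosen only for illustration.
lower = [[-3.0, 0.0], [-1.0, -2.0]]
upper = [[-1.0, 0.5], [1.0, -1.0]]

results = [is_hurwitz_2x2(M) for M in vertex_matrices(lower, upper)]
print(len(results), all(results), any(results))
```

The vertex count doubles with every additional uncertain entry, which is consistent with, though of course no proof of, the NP-hardness results cited above.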

Another important problem is related to structured singular values
(SSV) and linear fractional transformations (LFT), which are mainly
used to study systems which have component-level uncertainties
(Packard and Doyle 1993). Basically, bounds on the component-level
uncertainties are given, and we would like to check whether the
system is robustly stable or not. This is known to be NP-hard.

Structured Singular Value Problem (SSVP):
Given a matrix M and uncertainty structure
Δ, check whether the structured singular value
μ_Δ(M) < 1.

This is proved to be NP-hard for real, and mixed, uncertainty
structures (Braatz et al. 1994), as well as for complex
uncertainties with no repetitions (Toker and Ozbay 1996, 1998).


Approximation Complexity 

The decision version of QP is NP-hard, but approximation of QP is
quite “difficult” as well. An approximation is called a
μ-approximation if the absolute value of the error is bounded by μ
times the absolute value of max − min of the function. The following
is a classical result on QP (Bellare and Rogaway 1995): Unless
P = NP, QP does not admit a polynomial-time μ-approximation
algorithm even for μ < 1/3. This is sometimes informally stated as
“QP is NP-hard to approximate.” Much work is needed toward similar
results on robustness margin and related optimization problems of
the classical robust control theory.

An interesting case is the complex structured singular value
computation with no repeated uncertainties. There is a convex
relaxation of the problem, the standard upper bound $\bar{\mu}$,
which is known to result in quite tight approximations for most
cases of the original problem (Packard and Doyle 1993). However,
despite strong numerical evidence, a formal proof of the “good
approximation for most cases” result is not available. We also do
not have much theoretical information about how hard it is to
approximate the complex structured singular value. For example, it
is not known whether it admits a polynomial-time approximation
algorithm with error bounded by, say, 5 %. In summary, much work
needs to be done in these directions for many other robust control
problems whose decision versions are NP-hard.


Summary and Future Directions 

The study of the “Is P ≠ NP?” question turned out to be a quite
difficult one. Researchers agree that really new and innovative
tools are needed to study this problem. At the other extreme, one
can question whether we can really say something about this problem
within the Zermelo-Fraenkel (ZF) set theory or whether it will have
a status similar to the axiom of choice (AC) and the continuum
hypothesis (CH), where we can neither refute nor provide a proof
(Aaronson 1995). Therefore, the question may indeed be much deeper
than we thought, and standard axioms of today’s mathematics may not
be enough to provide an answer. As for any such problem, we can
still hope that in the future, new “self-evident” axioms may be
discovered, and with their help, we may provide an answer.

All of the complexity results mentioned here are with respect to the
standard Turing machine, which is a simplified model of today’s
digital computers. Depending on the progress in science,
engineering, and technology, if superior computation machines can be
constructed, then some of the abovementioned problems can be solved
much faster on these devices, and current results/problems of the
theory of computation may no longer be of great importance or
relevance for engineering and scientific applications. In such a
case, one may also need to revise definitions of the terms
algorithm, tractable, etc., according to these new devices.

Currently, there are several NP-hardness results about robust
control problems, mostly NP-hardness of decision problems about
robustness. However, much work is needed on the approximation
complexity and conservatism of various convex relaxations of these
problems. Even if a robust stability problem is NP-hard, a
polynomial-time algorithm to estimate the robustness margin with,
say, 5 % error is not ruled out by the NP-hardness of the decision
version of the problem. Indeed, a polynomial-time and 5 %
error-bounded result would be of great importance for practicing
engineers. Therefore, such directions should also be studied, and
various meaningful alternatives, like being polynomial time on the
average or for most cases, or anything else which makes sense for a
practicing engineer, should be considered.

In summary, computational complexity theory guides research on the
development of algorithms, indicating which directions are dead ends
and which directions are worth investigating.

Cross-References 

► Optimization Based Robust Control 

► Robust Control in Gap Metric 

► Robust Fault Diagnosis and Control 

► Robustness Issues in Quantum Control

Analyzing the Effect of Linear Time-Invariant
Uncertainty in Linear Systems

CACSD


Abstract 

Computer-aided control system design (CACSD) encompasses a broad
range of methods, tools, and technologies for system modelling,
control system synthesis and tuning, dynamic system analysis and
simulation, as well as validation and verification. The domain of
CACSD enlarged progressively over decades from simple collections of
algorithms and programs for control system analysis and synthesis to
comprehensive tool sets and user-friendly environments supporting
all aspects of developing and deploying advanced control systems in
various application fields. This entry gives a brief introduction to
CACSD and reviews the evolution of key concepts and technologies
underlying the CACSD domain. Several cornerstone achievements in
developing reliable numerical algorithms; implementing robust
numerical software; and developing sophisticated integrated
modelling, simulation, and design environments are highlighted.

Keywords 

CACSD; Modelling; Numerical analysis; Simulation; Software tools

Introduction 

To design a control system for a plant, a typical 
computer-aided control system design (CACSD) 
work flow comprises several interlaced activities. 

Model building is often a necessary first step consisting of
developing suitable mathematical models to accurately describe the
plant’s dynamical behavior. High-fidelity physical plant models
obtained, for example, by first-principles modelling, primarily
serve for analysis and validation purposes using appropriate
simulation techniques. These dynamic models are usually defined by a
set of ordinary differential equations (ODEs), differential
algebraic equations (DAEs), or partial differential equations
(PDEs). However, for control system synthesis purposes simpler
models are required, which are derived by simplifying high-fidelity
models (e.g., by linearization, discretization, or model reduction)
or directly determined in a specific form from input-output
measurement data using system identification techniques. Frequently
used synthesis models are continuous or discrete-time linear
time-invariant (LTI) models describing the nominal behavior of the
plant at a specific operating point. The more accurate linear
parameter varying (LPV) models may serve to account for
uncertainties due to the approximations performed, nonlinearities,
or varying model parameters.

Simulation of dynamical systems is a closely related activity to
modelling and is concerned with performing virtual experiments on a
given plant model to analyze and predict the dynamic behavior of a
physical plant. Often, modelling and simulation are closely
connected parts of dedicated environments, where specific classes of
models can be built and appropriate simulation methods can be
employed. Simulation is also a powerful tool for the validation of
mathematical models and their approximations. In the context of
CACSD, simulation is frequently used as a control system tuning aid,
as, for example, in an optimization-based tuning approach using
time-domain performance criteria.

System analysis and synthesis are concerned with the investigation
of properties of the underlying synthesis model and with the
determination of a control system which fulfills basic requirements
for the closed-loop controlled plant, such as stability or various
time or frequency response requirements. The analysis also serves to
check existence conditions for the solvability of synthesis
problems, according to established design methodologies. An
important synthesis goal is the guarantee of performance robustness.
To achieve this goal, robust control synthesis methodologies often
employ optimization-based parameter tuning in conjunction with
worst-case analysis techniques. A rich collection of reliable
numerical algorithms is available to perform such analysis and
synthesis tasks. These algorithms form the core of CACSD, and their
development represented one of the main motivations for
CACSD-related research.

Performance robustness assessment of the resulting closed-loop
control system is a key aspect of the verification and validation
activity. For a reliable assessment, simulation-based worst-case
analysis often represents the only way to prove the performance
robustness of the synthesized control system in the presence of
parametric uncertainties and variabilities.

Development of CACSD Tools 

The development of CACSD tools for system analysis and synthesis
started around 1960, immediately after general-purpose digital
computers and new programming languages became available for
research and development purposes. In what follows, we give a
historical survey of these developments in the main CACSD areas.

Modelling and Simulation Tools 

Among the first developments were modelling and simulation tools for
continuous-time systems described by differential equations based on
dedicated simulation languages. Over 40 continuous-system simulation
languages had been developed as of 1974 (Nilsen and Karplus 1974),
which evolved out of attempts at digitally emulating the behavior of
widely used analog computers before 1960. A notable development in
this period was the CSSL standard (Augustin et al. 1967), which
defined a system as an interconnection of blocks corresponding to
operators which emulated the main analog simulation blocks and
implied the integration of the underlying ODEs using suitable
numerical methods. For many years, the ACSL preprocessor to Fortran
(Mitchel and Gauthier 1976) was one of the most successful
implementations of the CSSL standard.

A turning point was the development of graphical user interfaces
allowing graphical block diagram-based modelling. The most important
developments were SystemBuild (Shah et al. 1985) and SIMULAB (later
marketed as Simulink) (Grace 1991). Both products used a
customizable set of block libraries and were seamlessly integrated
in, respectively, MATRIXx and MATLAB, two powerful interactive
matrix computation environments (see below). SystemBuild provided
several advanced features such as event management, code generation,
and DAE-based modelling and simulation. Simulink excelled from the
beginning with its intuitive, easy-to-use user interface. Recent
extensions of Simulink allow the modelling and simulation of hybrid
systems, code generation for real-time applications, and various
enhancements of the model building process (e.g., object-oriented
modelling).

The object-oriented paradigm for system modelling
was introduced with Dymola (Elmqvist
1978) to support physical system modelling
based on interconnections of subsystems. The
underlying modelling language served as the
basis of the first version of Modelica (Elmqvist
et al. 1997), a modern equation-based modelling
language which was the result of a coordinated
effort for the unification and standardization of
expertise gained over many years with object-oriented
physical modelling. The latest developments
in this area are comprehensive model
libraries for different application domains such
as mechanical, electrical, electronic, hydraulic,
thermal, control, and electric power systems.
Notable commercial front-ends for Modelica
are Dymola, MapleSim, and SystemModeler,
where the last two are tightly integrated in the
symbolic computation environments Maple and
Mathematica, respectively.

Numerical Software Tools 

The computational tools for CACSD rely on
many numerical algorithms whose development
and implementation in computer codes have been
the primary motivation of this research area
since its beginnings. The Automatic Synthesis
Program (ASP) developed in 1966 (Kalman and
Englar 1966) was implemented in FAP (Fortran
Assembly Program) and ran only on an IBM
7090-7094 machine. The Fortran II version of
ASP (known as FASP) can be considered to be
the first collection of computational CACSD
tools ported to several mainframe computers.
Interestingly, the linear algebra computations
were covered by only three routines (diagonal
decomposition, inversion, and pseudoinverse),
and no routines were used for eigenvalue or
polynomial roots computation. The main analysis
and synthesis functions covered the sampled-data
discretization (via matrix exponential), minimal
realization, time-varying Riccati equation solution
for quadratic control, filtering, and stability
analysis. FASP itself performed the required
computational sequences by interpreting simple
commands with parameters. The extensive documentation,
containing a detailed description of
algorithmic approaches and many examples,
marked the starting point of intensive
research on algorithms and numerical software,
which culminated in the development of the
high-performance control and systems library
SLICOT (Benner et al. 1999; Huffel et al.
2004). In what follows, we highlight the main
achievements along this development process.

The direct successor of FASP is the Variable 
Dimension Automatic Synthesis Program 
(VASP) (implemented in Fortran IV on IBM 
360) (White and Lee 1971), while a further 
development was ORACLS (Armstrong 1978), 
which included several routines from the 
newly developed eigenvalue package EISPACK 
(Garbow et al. 1977; Smith et al. 1976) as 
well as solvers for linear (Lyapunov, Sylvester) 
and quadratic (Riccati) matrix equations. 
From this point, the mainstream development 
of numerical algorithms for linear system 
analysis and synthesis closely followed the 
development of algorithms and software for 
numerical linear algebra. A common feature of all 
subsequent developments was the extensive use 
of robust linear algebra software with the Basic 
Linear Algebra Subprograms (BLAS) (Lawson 
et al. 1979) and the Linear Algebra Package 
(LINPACK) for solving linear systems (Dongarra 
et al. 1979). Several control libraries have been 
developed almost simultaneously, relying on the 
robust numerical linear algebra core software 
formed of BLAS, LINPACK, and EISPACK. 
Notable examples are RASP (based partly on
VASP and ORACLS) (Grübel 1983) - developed
originally by the University of Bochum and later
by the German Aerospace Center (DLR); BIMAS
(Varga and Sima 1985) and BIMASC (Varga and
Davidoviciu 1986) - two Romanian initiatives;
and SLICOT (Boom et al. 1991) - a Benelux
initiative of several universities jointly with the
Numerical Algorithms Group (NAG).

The last development phase was marked
by the availability of the Linear Algebra
Package (LAPACK) (Anderson et al. 1992),
whose declared goal was to make the widely
used EISPACK and LINPACK libraries run
efficiently on shared memory vector and parallel
processors. To minimize the development efforts,
several active research teams from Europe
started, in the framework of the NICONET
project, a concentrated effort to develop a
high-performance numerical software library
for CACSD as a new significantly extended
version of the original SLICOT. The goals
of the new library were to cover the main
computational needs of CACSD, by relying
exclusively on LAPACK and BLAS, and to
guarantee numerical performance similar to
that of the LAPACK routines. The software
development used rigorous standards for
implementation in Fortran 77, modularization,
testing, and documentation (similar to those used
in LAPACK). The development of the latest
versions of RASP and SLICOT eliminated
practically any duplication of efforts and led to
a library which contained the best software from
RASP, SLICOT, BIMAS, and BIMASC. The
current version of SLICOT is fully maintained by
the NICONET association (http://www.niconet-ev.info/en/)
and serves as the basis for implementing
advanced computational functions for CACSD in
interactive environments such as MATLAB
(http://www.mathworks.com), Maple
(http://www.maplesoft.com/products/maple/), Scilab
(http://www.scilab.org/), and Octave
(http://www.gnu.org/software/octave/).

Interactive Tools 

Early experiments during 1970-1985 included 
the development of several interactive CACSD 
tools employing menu-driven interaction, 
question-answer dialogues, or command 
languages. The April 1982 special issue of IEEE 
Control Systems Magazine was dedicated to 
CACSD environments and presented software 
summaries of 20 interactive CACSD packages. 
This development period was marked by the 
establishment of new standards for programming 
languages (Fortran 77, C), availability of high- 
quality numerical software libraries (BLAS, 
EISPACK, LINPACK, ODEPACK), transition 
from mainframe computers to minicomputers, 
and finally to the nowadays-ubiquitous personal 
computers as computing platforms, spectacular 
developments in graphical display technologies,
and application of sound programming
paradigms (e.g., strong data typing).

A remarkable event in this period was the 
development of MATLAB, a command language- 
based interactive matrix laboratory (Moler 1980).
The original version of MATLAB was written
in Fortran 77. It was primarily intended as a 
student teaching tool and provided interactive 
access to selected subroutines from LINPACK 
and EISPACK. The tool circulated for a while 
in the public domain, and its high flexibility 
was soon recognized. Several CACSD-oriented 
commercial clones have been implemented in 
the C language, the most important among them 
being MATRIXx (Walker et al. 1982) and PC- 
MATLAB (Moler et al. 1985). 

The period after 1985 until around 2000 can 
be seen as a consolidation and expansion period 
for many commercial and noncommercial tools. 
In an inventory of CACSD-related software 
issued by the Benelux Working Group on 
Software (WGS) under the auspices of the 
IEEE Control Systems Society, there were in 
1992 in active development 70 stand-alone 
CACSD packages, 21 tools based on or similar 
to MATLAB, and 27 modelling/simulation 
environments. It is interesting to look more 
closely at the evolutions of the two main 
players MATRIXx and MATLAB, which 
took place under harshly competitive conditions.

MATRIXx with its main components Xmath, 
SystemBuild, and AutoCode had over many 
years a leading role (especially among industrial 
customers), excelling with a rich functionality 
in domains such as system identification, 
control system synthesis, model reduction, 
modelling, simulation, and code generation. 
After 2003, MATRIXx (http://www.ni.com/matrixx/)
became a product of the National
Instruments Corporation and complements
its main product family LabVIEW, a visual
programming language-based system design
platform and development environment
(http://www.ni.com/labview).

MATLAB gained broad academic acceptance
by integrating many new methodological developments
in the control field into several control-related
toolboxes. MATLAB also evolved as a
powerful programming language, which allows
easy object-oriented manipulation of different
system descriptions via operator overloading.
At present, the CACSD tools of MATLAB and
Simulink represent the industrial and academic
standard for CACSD. The existing CACSD
tools are constantly extended and enriched
with new model classes, new computational
algorithms (e.g., structure-exploiting eigenvalue
computations based on SLICOT), dedicated
graphical user interfaces (e.g., tuning of PID
controllers or control-related visualizations),
advanced robust control system synthesis, etc.
Also, many third-party toolboxes contribute to
the wide usage of this tool.

Basic CACSD functionality incorporating 
symbolic processing techniques and higher 
precision computations is available in the Maple 
product MapleSim Control Design Toolbox as 
well as in the Mathematica Control Systems 
product. Free alternatives to MATLAB are the 
MATLAB-like environments Scilab, a French 
initiative pioneered by INRIA, and Octave, which 
has recently added some CACSD functionality. 


Summary and Future Directions 

The development and maintenance of integrated
CACSD environments, which provide support
for all aspects of the CACSD cycle such as modelling,
design, and simulation, involve sustained,
strongly interdisciplinary efforts. Therefore, the
CACSD tool development activities must rely
on the expertise of many professionals covering
such diverse fields as control system engineering,
programming languages and techniques, man-machine
interaction, numerical methods in linear
algebra and control, optimization, computer
visualization, and model building techniques.
This may explain why currently only a few of
the commercial developments of prior years are
still in use and actively maintained/developed.
Unfortunately, the number of actively developed
noncommercial alternative products is even
lower. The dominance of MATLAB, as a
de facto standard for both industrial and
academic usage of integrated tools covering
all aspects of the broader area of computer-aided
control engineering (CACE), cannot be
overlooked.





The new trends in CACSD are partly 
related to handling more complex applications, 
involving time-varying (e.g., periodic, multirate
sampled-data, and differential algebraic)
linear dynamic systems, nonlinear systems with 
many parametric uncertainties, and large-scale 
models (e.g., originating from the discretization 
of PDEs). To address many computational 
aspects of model building (e.g., model reduction 
of large order systems), optimization-based 
robust controller tuning using multiple-model 
approaches, or optimization-based robustness 
assessment using global-optimization techniques, 
parallel computation techniques allow substantial 
savings in computational times and facilitate 
addressing computational problems for large- 
scale systems. A topic which needs further 
research is the exploitation of the benefits of 
combining numerical and symbolic computations 
(e.g., in model building and manipulation). 

Cross-References 

► Interactive Environments and Software Tools 
for CACSD 

► Model Building for Control System Synthesis 

► Model Order Reduction: Techniques and Tools 

► Multi-domain Modeling and Simulation 

► Optimization-Based Control Design Techniques
and Tools

► Robust Synthesis and Robustness Analysis 
Techniques and Tools 

► Validation and Verification Techniques and 
Tools 


Recommended Reading 

The historical development of CACSD concepts
and techniques was the subject of several articles
in reference works (Rimvall and Jobling 1995;
Schmid 2002). A selection of papers
on numerical algorithms underlying the development
of CACSD appeared in Patel et al. (1994).
The special issue No. 2/2004 of the IEEE Control
Systems Magazine on Numerical Awareness
in Control presents several surveys on different
aspects of developing numerical algorithms and
software for CACSD.

The main trends over the last three decades
in CACSD-related research can be followed in
the programs/proceedings of the biennial IEEE
Symposia on CACSD from 1981 to 2013 (partly
available at http://ieeexplore.ieee.org) as well as
of the triennial IFAC Symposia on CACSD from
1979 to 2000. Additional information can be
found in several CACSD-focused survey articles
and special issues (e.g., No. 4/1982; No. 2/2000)
of the IEEE Control Systems Magazine.

Abstract

This entry provides a broad overview of the basic
elements of consensus dynamics. It describes the
classical Perron-Frobenius theorem that provides
the main theoretical tool to study the convergence
properties of such systems. Classes of consensus
models that are treated include simple random
walks on grid-like graphs and in graphs with a
bottleneck, consensus on graphs with intermittently
and randomly appearing edges between nodes
(gossip models), and models with nodes that
do not modify their state over time (stubborn
agent models). Applications to cooperative control,
sensor networks, and socioeconomic models
are mentioned.

Multi-agent Systems and Consensus

Multi-agent systems constitute one of the fundamental
paradigms of science and technology
of the present century (Castellano et al. 2009;
Strogatz 2003). The main idea is that of creating
complex dynamical evolutions from the interactions
of many simple units. Indeed, such collective
behaviors are quite evident in biological and
social systems and were already considered in
earlier times. More recently, the digital revolution
and the miniaturization in electronics have
made possible the creation of man-made complex
architectures of interconnected simple devices
(computers, sensors, cameras). Moreover,
the creation of the Internet has opened a totally
new form of social and economic aggregation.
This has strongly pushed towards a systematic
and deep study of multi-agent dynamical systems.
Mathematically they typically consist of a
graph where each node possesses a state variable;
states are coupled at the dynamical level
through dependences determined by the edges
in the graph. One of the challenging problems
in the field of multi-agent systems is to analyze
the emergence of complex collective phenomena
from the interactions of the units, which are
typically quite simple. Complexity is typically
the outcome of the topology and the nature of
interconnections.

Consensus dynamics (also known as average
dynamics) (Carli et al. 2008; Jadbabaie et al.
2003) is one of the most popular and simplest
multi-agent dynamics. One convenient way
to introduce it is with the language of social
sciences. Imagine that a number of independent
units each possess a piece of information represented by a real
number; for instance, such a number can represent
an opinion on a given fact. Units interact and
change their opinion by averaging with the opinions
of other units. Under certain assumptions,
this will lead the whole community to converge to a
consensus opinion which takes into consideration
all the initial opinions of the agents. In social
sciences, empirical evidence (Galton 1907) has
shown how such an aggregate opinion may give a
very good estimate of unknown quantities: this
phenomenon has been described in the literature
as the wisdom of crowds (Surowiecki 2004).

Consensus Dynamics, Graphs, and 
Stochastic Matrices 

Mathematically, consensus dynamics are special
linear dynamical systems of type

$$x(t + 1) = Px(t) \qquad (1)$$

where $x(t) \in \mathbb{R}^V$ and $P \in \mathbb{R}^{V \times V}$ is a stochastic
matrix (i.e., a matrix with nonnegative elements
such that every row sums to 1). $V$ represents the
finite set of units (agents) in the network and
$x(t)_v$ is to be interpreted as the state (opinion)
of agent $v$ at time $t$. Equation (1) implies
that the states of the agents at time $t + 1$ are convex
combinations of the components of $x(t)$: this motivates
the term averaging dynamics. Stochastic
matrices owe their name to their use in probability:
the term $P_{vw}$ can be interpreted as the
probability of making a jump in the graph from
state $v$ to state $w$. In this way one constructs what
is called a random walk on the graph $\mathcal{G}$.

The network structure is hidden in the nonzero
pattern of $P$. Indeed, we can associate to $P$ a
graph $\mathcal{G}_P = (V, \mathcal{E}_P)$, where the set of edges is
given by $\mathcal{E}_P := \{(u, v) \in V \times V \mid P_{uv} > 0\}$.
Elements in $\mathcal{E}_P$ represent the communication
edges among the units: if $(u, v) \in \mathcal{E}_P$, it means
that unit $u$ has access to the state of unit $v$. Denote
by $\mathbb{1} \in \mathbb{R}^V$ the all-ones vector. Notice that $P\mathbb{1} = \mathbb{1}$:
this shows that once the states of units are at
consensus, they will no longer evolve. Will the
dynamics always converge to a consensus point?

Remarkably, some of the key properties of
$P$ responsible for the transient and asymptotic
behavior of the linear system (1) are determined
by the connectivity properties of the underlying
graph $\mathcal{G}_P$. We recall that, given two vertices
$u, v \in V$, a path (of length $l$) from $u$ to $v$ in $\mathcal{G}_P$ is
any sequence of vertices $u = u_1, u_2, \dots, u_{l+1} = v$
such that $(u_i, u_{i+1}) \in \mathcal{E}_P$ for every
$i = 1, \dots, l$. $\mathcal{G}_P$ is said to be strongly connected if
for any pair of vertices $u \neq v$ in $V$ there is a
path in $\mathcal{G}_P$ connecting $u$ to $v$. The period of a
node $u$ is defined as the greatest common divisor
of the lengths of all closed paths from $u$ to $u$. In
a strongly connected graph, all nodes have the
same period, and the graph is called aperiodic if
such a period is 1. If $v$ is a vector, we will use
the notation $v^*$ to denote its transpose. If $A$ is
a finite set, $|A|$ denotes the number of elements
in $A$. The following classical result holds true
(Gantmacher 1959):

Theorem 1 (Perron-Frobenius) Assume that
$P \in \mathbb{R}^{V \times V}$ is such that $\mathcal{G}_P$ is strongly connected
and aperiodic. Then,

1. 1 is an algebraically simple eigenvalue of $P$.

2. There exists a (unique) probability vector $\pi \in \mathbb{R}^V$
($\pi_v > 0$ for all $v$ and $\sum_v \pi_v = 1$) which is
a left eigenvector for $P$, namely, $\pi^* P = \pi^*$.

3. All the remaining eigenvalues of $P$ are of
modulus strictly less than 1.

A straightforward linear algebra consequence
of this result is that $P^t \to \mathbb{1}\pi^*$ for $t \to +\infty$.
This yields

$$\lim_{t \to +\infty} x(t) = \lim_{t \to +\infty} P^t x(0) = \mathbb{1}(\pi^* x(0)) \qquad (2)$$

All agents' states thus converge to the
common value $\pi^* x(0)$, called the consensus point,
which is a convex combination of the initial states
with weights given by the components of the
invariant probability $\pi$.
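The limit in (2) can be checked numerically on a small case. The following is a minimal sketch, assuming a hypothetical 3-agent stochastic matrix `P` and initial state `x0` that are not part of the original text; the helper names `step` and `left_step` are likewise illustrative. It approximates $\pi$ by repeated left multiplication and compares the iterated state with $\pi^* x(0)$.

```python
def step(P, x):
    """One consensus update x(t+1) = P x(t)."""
    return [sum(p * xj for p, xj in zip(row, x)) for row in P]


def left_step(pi, P):
    """One update pi <- pi P (row vector times matrix)."""
    n = len(P)
    return [sum(pi[u] * P[u][v] for u in range(n)) for v in range(n)]


# Hypothetical example: rows are nonnegative and sum to 1, and the
# associated graph is strongly connected and aperiodic (self loops).
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
x0 = [1.0, 2.0, 4.0]

# Approximate the invariant probability pi by iterating pi <- pi P.
pi = [1.0 / len(P)] * len(P)
for _ in range(200):
    pi = left_step(pi, P)

# Run the consensus dynamics (1) starting from x(0).
x = list(x0)
for _ in range(200):
    x = step(P, x)

# Consensus point predicted by (2).
consensus_point = sum(p * xi for p, xi in zip(pi, x0))

print("pi        =", pi)
print("x(200)    =", x)
print("pi^* x(0) =", consensus_point)
```

In this example the matrix is not symmetric, so the middle agent, which has the largest invariant weight, influences the consensus point more than the other two.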

If $\pi$ is the uniform vector (i.e., $\pi_v = |V|^{-1}$
for all units $v$), the common asymptotic value is
simply the arithmetic mean of the initial states:
all agents equally contribute to the final common
state. A special case when this happens
is when $P$ is symmetric. The distributed computation
of the arithmetic mean is an important
step to solve estimation problems for sensor
networks. As a specific example, consider the
situation where there are $N$ sensors deployed
in a certain area and each of them makes a
noisy measurement of a physical quantity $x$. Let
$y_v = x + \omega_v$ be the measurement obtained by
sensor $v$, where $\omega_v$ is a zero mean Gaussian
noise. It is well known that if noises are independent
and identically distributed, the optimal
mean square estimator of the quantity $x$ given the
entire set of measurements $\{y_v\}$ is exactly given
by $\hat{x} = N^{-1} \sum_v y_v$. Another field of application
is the control of cooperative autonomous vehicles
(Fax and Murray 2004; Jadbabaie et al.
2003).

Basic linear algebra allows us to study the
rate of convergence to consensus. Indeed,
if $\mathcal{G}_P$ is strongly connected and aperiodic,
the matrix $P$ has all its eigenvalues in the
unit ball: 1 is the only eigenvalue with
modulus equal to 1, while all the others have
modulus strictly less than 1. If we denote
by $\rho_2 < 1$ the largest modulus of such
eigenvalues (different from 1), we can show
that $x(t) - \mathbb{1}(\pi^* x(0))$ converges exponentially
to 0 as $\rho_2^t$. In the following, we will briefly
refer to $\rho_2$ as the second eigenvalue
of $P$.


Examples and Large-Scale Analysis 

In this section, we present some classical examples.
Consider a strongly connected graph $\mathcal{G} = (V, \mathcal{E})$.
The adjacency matrix of $\mathcal{G}$ is a square
matrix $A_\mathcal{G} \in \{0, 1\}^{V \times V}$ such that $(A_\mathcal{G})_{uv} = 1$
iff $(u, v) \in \mathcal{E}$. $\mathcal{G}$ is said to be symmetric if $A_\mathcal{G}$ is
symmetric. Given a symmetric graph $\mathcal{G} = (V, \mathcal{E})$,
we can consider the stochastic matrix $P$ given
by $P_{uv} = d_u^{-1} (A_\mathcal{G})_{uv}$, where $d_u = \sum_{v \in V} (A_\mathcal{G})_{uv}$
is the degree of node $u$. $P$ is called the simple
random walk (SRW) on $\mathcal{G}$: each agent gives the
same weight to the state of its neighbors. Clearly,
$\mathcal{G}_P = \mathcal{G}$. A simple check shows that $\pi_v = d_v/|\mathcal{E}|$
is the invariant probability for $P$. The consensus
point is given by

$$\pi^* x(0) = |\mathcal{E}|^{-1} \sum_v d_v x(0)_v$$

Each node contributes with its initial state to this
consensus with a weight which is proportional to
the degree of the node. Notice that the SRW $P$
is symmetric iff the graph is regular, namely, all
units have the same degree.
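The "simple check" of the invariant probability can be carried out numerically. The sketch below assumes a hypothetical 4-node symmetric graph and initial state (neither appears in the original text) and counts $|\mathcal{E}|$ as the number of ordered pairs $(u, v)$, in line with $\mathcal{E} \subseteq V \times V$.

```python
# Hypothetical symmetric graph on 4 nodes: a triangle 0-1-2 plus node 3
# attached to node 2. Each undirected link appears as two directed edges.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
n = len(A)

degree = [sum(row) for row in A]   # d_u = sum_v A_uv
num_edges = sum(degree)            # |E| counts ordered pairs (u, v)

# Simple random walk: P_uv = A_uv / d_u.
P = [[A[u][v] / degree[u] for v in range(n)] for u in range(n)]

# Candidate invariant probability pi_v = d_v / |E|.
pi = [d / num_edges for d in degree]

# Left multiplication pi P; invariance means pi P equals pi.
pi_P = [sum(pi[u] * P[u][v] for u in range(n)) for v in range(n)]

# Degree-weighted consensus point for an illustrative initial state.
x0 = [3.0, -1.0, 0.5, 2.0]
consensus_point = sum(d * x for d, x in zip(degree, x0)) / num_edges

print("pi   =", pi)
print("pi P =", pi_P)
print("consensus point =", consensus_point)
```

Because node 2 has the largest degree, its initial state carries the largest weight in the consensus point.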

We now present a number of classical examples
based on families of graphs with larger
and larger number of nodes $N$. In this setting,
it is particularly relevant to understand the behavior
of the second eigenvalue $\rho_2$ as a function of
$N$. Typically, one considers $\epsilon > 0$ fixed and
solves the equation $\rho_2^\tau = \epsilon$. The solution
$\tau = (\ln \rho_2^{-1})^{-1} \ln \epsilon^{-1}$ will be called the convergence
time: it essentially represents the time needed to
shrink the distance to consensus by a factor $\epsilon$.
The dependence of $\rho_2$ on $N$ implies that $\tau$
is also a function of $N$. In the sequel, we will
investigate such dependence for SRWs on certain
classical families of graphs.

Example 1 (SRW on a complete graph) Consider
the complete graph on the set $V$: $K_V := (V, V \times V)$
(also self loops are present). The SRW
on $K_V$ is simply given by $P = N^{-1} \mathbb{1}\mathbb{1}^*$, where
$N = |V|$. Clearly, $\pi = N^{-1}\mathbb{1}$. Eigenvalues of $P$
are 1 with multiplicity 1 and 0 with multiplicity
$N - 1$. Therefore, $\rho_2 = 0$. Consensus in this case
is achieved in just one step: $x(t) = N^{-1} \mathbb{1}\mathbb{1}^* x(0)$
for all $t \geq 1$.

Example 2 (SRW on a cycle graph) Consider the
symmetric cycle graph $C_N = (V, \mathcal{E})$, where
$V = \{0, \dots, N - 1\}$ and $\mathcal{E} = \{(v, v + 1), (v + 1, v)\}$,
where the sum is mod $N$. The graph $C_N$ is clearly
strongly connected and is also aperiodic if $N$ is
odd. The corresponding SRW $P$ has eigenvalues

$$\lambda_k = \cos \frac{2\pi k}{N}$$

Therefore, if $N$ is odd, the second eigenvalue is
given by

$$\rho_2 = \cos \frac{2\pi}{N} = 1 - 2\pi^2 \frac{1}{N^2} + o(N^{-2}) \quad \text{for } N \to +\infty \qquad (3)$$

while the corresponding convergence time is
given by

$$\tau = (\ln \rho_2^{-1})^{-1} \ln \epsilon^{-1} \asymp N^2 \quad \text{for } N \to +\infty$$
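The eigenvalue formula and the quadratic growth of the convergence time can be illustrated with a short sketch. It assumes, as a standard fact not spelled out in the text, that the vectors with entries $\cos(2\pi k v/N)$ are eigenvectors of the SRW on the cycle, and it takes expression (3) at face value when evaluating $\tau$; the values of `N` and `eps` and all function names are illustrative.

```python
import math


def cycle_srw_apply(x):
    """Apply the SRW on the cycle: each node averages its two neighbors."""
    n = len(x)
    return [0.5 * (x[(v - 1) % n] + x[(v + 1) % n]) for v in range(n)]


def eigen_residual(n, k):
    """Largest entry of |P w - lambda_k w| for w_v = cos(2 pi k v / n)."""
    lam = math.cos(2 * math.pi * k / n)
    w = [math.cos(2 * math.pi * k * v / n) for v in range(n)]
    Pw = cycle_srw_apply(w)
    return max(abs(a - lam * b) for a, b in zip(Pw, w))


def convergence_time(rho2, eps):
    """tau = ln(1/eps) / ln(1/rho2), the solution of rho2**tau = eps."""
    return math.log(1.0 / eps) / math.log(1.0 / rho2)


N = 101      # odd, so the cycle is aperiodic
eps = 1e-3   # illustrative tolerance (an assumption)

# Residuals are at rounding level if cos(2 pi k / N) are eigenvalues.
worst_residual = max(eigen_residual(N, k) for k in range(N))

# Convergence time from expression (3), for N and for roughly 2N nodes.
tau_N = convergence_time(math.cos(2 * math.pi / N), eps)
tau_2N = convergence_time(math.cos(2 * math.pi / (2 * N + 1)), eps)
ratio = tau_2N / tau_N   # close to 4 if tau grows like N^2

print("worst eigen-residual:", worst_residual)
print("tau(N), tau(2N+1), ratio:", tau_N, tau_2N, ratio)
```

Doubling the number of nodes roughly quadruples the convergence time, which is the $N^2$ scaling stated above.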

Example 3 (SRW on toroidal grids) The toroidal
$d$-grid $C_n^d$ is formally obtained as a product
of cycle graphs. The number of nodes is $N = n^d$.
It can be shown that the convergence time
behaves as

$$\tau \asymp N^{2/d} \quad \text{for } N \to +\infty$$

Convergence time exhibits a slower growth in $N$
as the dimension $d$ of the grid increases: this is
intuitive since the increase in $d$ determines a better
connectivity of the graph and a consequently
faster diffusion of information.

For a general stochastic matrix (even for SRW
on general graphs), the computation of the second
eigenvalue is not possible in closed form and can
actually also be difficult from a numerical point
of view. It is therefore important to develop tools
for efficient estimation. One of these is based
on the concept of bottleneck: if a graph can be
split into two loosely connected parts, then
consensus dynamics will necessarily exhibit a
slow convergence.

Formally, given a symmetric graph $\mathcal{G} = (V, \mathcal{E})$
and a subset of nodes $S \subset V$, define
$e_S$ as the number of edges with at least one node
in $S$ and $e_{SS}$ as the number of edges with both
nodes in $S$. The bottleneck of $S$ in $\mathcal{G}$ is defined
as $\Phi(S) = e_{SS}/e_S$. Finally, the bottleneck ratio
of $\mathcal{G}$ is defined as

$$\Phi^* := \min_{S : e_S/e \leq 1/2} \Phi(S)$$

where $e = |\mathcal{E}|$ is the number of edges in the
graph.

Let $P$ be the SRW on $\mathcal{G}$ and let $\rho_2$ be its second
eigenvalue. Then,

Proposition 1 (Cheeger bound, Levin et al.
2008)

$$1 - \rho_2 \leq 2\Phi^*. \qquad (4)$$

Example 4 (Graphs with a bottleneck) Consider 
two complete graphs with n nodes connected by 
just one edge. If $S$ is the set of nodes of one of
the two complete graphs, we obtain 


Bound (4) implies that the convergence time is at
least of the order of $n^2$ in spite of the fact that
in each complete graph convergence would be in
finite time!
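The slowdown caused by a bottleneck can be observed by direct simulation. The sketch below is an illustration under stated assumptions, not a computation of the bottleneck ratio: it builds two complete graphs without self loops joined by one edge, starts the two halves at opposite opinions, and estimates the per-step contraction of the disagreement from the ratio of successive distances to consensus. The function names and the sizes 10 and 20 are hypothetical choices.

```python
def barbell_neighbors(n):
    """Two complete graphs on n nodes (no self loops) joined by one edge.

    Nodes 0..n-1 form the first clique, n..2n-1 the second; the single
    connecting edge links node n-1 and node n.
    """
    nbrs = []
    for u in range(2 * n):
        lo = 0 if u < n else n
        nbrs.append([v for v in range(lo, lo + n) if v != u])
    nbrs[n - 1].append(n)
    nbrs[n].append(n - 1)
    return nbrs


def slow_mode_rate(n, steps=400):
    """Estimate the per-step contraction of the disagreement between cliques."""
    nbrs = barbell_neighbors(n)
    # Opposite initial opinions in the two cliques; by symmetry the
    # consensus point is 0, so max|x| measures the distance to consensus.
    x = [1.0] * n + [-1.0] * n
    prev = 1.0
    rate = 0.0
    for _ in range(steps):
        # SRW update: every node averages the states of its neighbors.
        x = [sum(x[v] for v in nb) / len(nb) for nb in nbrs]
        size = max(abs(xi) for xi in x)
        rate = size / prev
        prev = size
    return rate


rate_10 = slow_mode_rate(10)
rate_20 = slow_mode_rate(20)
gap_10 = 1.0 - rate_10
gap_20 = 1.0 - rate_20

print("n = 10: rate", rate_10, "gap", gap_10)
print("n = 20: rate", rate_20, "gap", gap_20)
print("gap ratio:", gap_10 / gap_20)
```

Doubling $n$ shrinks the estimated gap by roughly a factor of four, which agrees with a convergence time growing like $n^2$.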


Other Models 

The systems studied so far are based on the 
assumptions that units all behave the same, and 
they share a common clock and update their state 
in a synchronous fashion. In this section, we 
discuss more general models. 

Random Consensus Models 

Regarding the assumption of synchronicity, it
turns out to be infeasible in many contexts. For
instance, in opinion dynamics modeling, it
is not realistic to assume that all interactions
happen at the same time: agents are embedded in
a physical continuous time, and interactions can
be imagined to take place at random, for instance,
in a pairwise fashion.

One of the most famous random consensus
models is the gossip model. Fix a real number
$q \in (0, 1)$ and a symmetric graph $\mathcal{G} = (V, \mathcal{E})$.
At every time instant $t$, an edge $(u, v) \in \mathcal{E}$
is activated with uniform probability $|\mathcal{E}|^{-1}$, and
nodes $u$ and $v$ exchange their states and produce
a new state according to the equations

$$x_u(t + 1) = (1 - q)x_u(t) + qx_v(t)$$
$$x_v(t + 1) = qx_u(t) + (1 - q)x_v(t)$$

The states of the other units remain unchanged.

Will this dynamics lead to a consensus? If
the same edge is activated at every time instant,
clearly consensus will not be achieved. However,
it can be shown that, with probability one, consensus
will be reached (Boyd et al. 2006).
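The gossip update is easy to simulate. The following minimal sketch assumes a hypothetical 6-node symmetric cycle, $q = 0.3$, and a fixed random seed, none of which come from the original text; the function name `gossip` is illustrative.

```python
import random


def gossip(x0, edges, q, steps, seed=0):
    """Run the gossip model: one uniformly chosen edge is activated per step."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        u, v = rng.choice(edges)
        xu, xv = x[u], x[v]
        # Pairwise update from the text; all other states stay unchanged.
        x[u] = (1 - q) * xu + q * xv
        x[v] = q * xu + (1 - q) * xv
    return x


# Hypothetical setup: symmetric cycle on 6 nodes, both orientations listed.
N = 6
edges = [(v, (v + 1) % N) for v in range(N)]
edges += [((v + 1) % N, v) for v in range(N)]
x0 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

x_final = gossip(x0, edges, q=0.3, steps=5000)
spread = max(x_final) - min(x_final)   # distance from consensus
mean0 = sum(x0) / N                    # arithmetic mean of x(0)

print("final states:", x_final)
print("spread:", spread, "initial mean:", mean0)
```

Each pairwise update leaves $x_u + x_v$ unchanged, so in this symmetric version the consensus value coincides with the arithmetic mean of the initial states.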

Consensus Dynamics with Stubborn 
Agents 

In this section, we investigate consensus dynamics
models where some of the agents do not modify
their own state (stubborn agents). These systems
are of interest in socioeconomic models (Acemoglu
et al. 2013).

Consider a symmetric connected graph $\mathcal{G} = (V, \mathcal{E})$.
We assume a splitting $V = \mathcal{S} \cup \mathcal{R}$ with
the understanding that agents in $\mathcal{S}$ are stubborn
agents not changing their state, while those in
$\mathcal{R}$ are regular agents whose state changes in
time according to a SRW consensus dynamics,
namely,

$$x_u(t + 1) = \frac{1}{d_u} \sum_{v \in V} (A_\mathcal{G})_{uv} x_v(t), \quad \forall u \in \mathcal{R}$$

By assembling the states of the regular and of the
stubborn agents in vectors denoted, respectively,
as $x^\mathcal{R}(t)$ and $x^\mathcal{S}(t)$, the dynamics can be recast in
matrix form as

$$x^\mathcal{R}(t + 1) = Q^{11} x^\mathcal{R}(t) + Q^{12} x^\mathcal{S}(t)$$
$$x^\mathcal{S}(t + 1) = x^\mathcal{S}(t) \qquad (5)$$

It can be proven that $Q^{11}$ is asymptotically stable
($(Q^{11})^t \to 0$). Hence, $x^\mathcal{R}(t) \to x^\mathcal{R}(\infty)$ for
$t \to +\infty$, with the limit opinions satisfying the
relation

$$x^\mathcal{R}(\infty) = Q^{11} x^\mathcal{R}(\infty) + Q^{12} x^\mathcal{S}(0) \qquad (6)$$

If we define $\Xi := (I - Q^{11})^{-1} Q^{12}$, we can write
$x^\mathcal{R}(\infty) = \Xi x^\mathcal{S}(0)$. It is easy to see that $\Xi$ has
nonnegative elements and that $\sum_s \Xi_{us} = 1$ for
all $u \in \mathcal{R}$: asymptotic opinions of regular agents
are thus convex combinations of the opinions
of stubborn agents. If all stubborn agents are
in the same state $x$, then consensus is reached
by all agents in the point $x$. However, typically,
consensus is not reached in such a system: we
will discuss an example below.

There is a useful alternative interpretation
of the asymptotic opinions. Interpreting the
graph $\mathcal{G}$ as an electrical circuit where edges
are unit resistors, relation (6) can be seen as
a Laplace-type equation on the graph $\mathcal{G}$ with
boundary conditions given by assigning the
voltage $x^\mathcal{S}(0)$ to the stubborn agents. In this
way, $x^\mathcal{R}(\infty)$ can be interpreted as the vector of
voltages of the regular agents when stubborn
agents have fixed voltage $x^\mathcal{S}(0)$. Thanks to
the electrical analogy, we can compute the
asymptotic opinion of the agents by computing
the voltages in the various nodes in the graph. We
propose a concrete application in the following
example.

Example 5 (Stubborn agents in a line graph)
Consider the line graph $L_N = (V, \mathcal{E})$, where
$V = \{1, 2, \dots, N\}$ and where
$\mathcal{E} = \{(u, u + 1), (u + 1, u) \mid u = 1, \dots, N - 1\}$. Assume that
$\mathcal{S} = \{1, N\}$ and $\mathcal{R} = V \setminus \mathcal{S}$. Consider the graph
as an electrical circuit. Replacing the line with a
single edge connecting 1 and $N$ having resistance
$N - 1$ and applying Ohm's law, we obtain that
the current flowing from 1 to $N$ is equal to
$\Phi = (N - 1)^{-1} [x_N^\mathcal{S}(0) - x_1^\mathcal{S}(0)]$. If we now fix
an arbitrary node $v \in V$ and apply again the
same arguments in the part of the graph from 1 to $v$,
we obtain that the voltage at $v$, $x_v^\mathcal{R}(\infty)$, satisfies
the relation $x_v^\mathcal{R}(\infty) - x_1^\mathcal{S}(0) = \Phi (v - 1)$. We
thus obtain

$$x_v^\mathcal{R}(\infty) = x_1^\mathcal{S}(0) + \frac{v - 1}{N - 1} \left[ x_N^\mathcal{S}(0) - x_1^\mathcal{S}(0) \right].$$

In Acemoglu et al. (2013), further examples 
are discussed showing how, because of the 
topology of the graph, different asymptotic 
configurations may show up. While in graphs 
presenting bottlenecks polarization phenomena 
can be recorded, in graphs where the convergence
rate is low, there will be a typical
asymptotic opinion shared by most of the regular 
agents. 
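The line-graph case of Example 5 can be checked by iterating the stubborn-agent dynamics directly. This is a minimal sketch with illustrative values ($N = 8$ and stubborn opinions 2 and -1, which are assumptions rather than values from the text): regular agents average their two neighbors, the two end nodes never change, and the result is compared with the linear interpolation obtained from the electrical analogy.

```python
def stubborn_line_limit(N, x_first, x_last, steps=2000):
    """Iterate the SRW dynamics on the line 1..N with nodes 1 and N stubborn."""
    # x[0] is unused so that list indices match the node labels 1..N.
    x = [0.0] * (N + 1)
    x[1], x[N] = x_first, x_last
    for _ in range(steps):
        new = list(x)
        for u in range(2, N):          # regular agents, each of degree 2
            new[u] = 0.5 * (x[u - 1] + x[u + 1])
        x = new                        # stubborn entries are never changed
    return x


def ohm_prediction(N, v, x_first, x_last):
    """Voltage at node v predicted by the electrical analogy."""
    return x_first + (v - 1) / (N - 1) * (x_last - x_first)


N = 8
x_first, x_last = 2.0, -1.0   # illustrative stubborn opinions (assumptions)

x_inf = stubborn_line_limit(N, x_first, x_last)
max_error = max(
    abs(x_inf[v] - ohm_prediction(N, v, x_first, x_last))
    for v in range(2, N)
)

print("limit opinions:", x_inf[1:])
print("max deviation from the linear profile:", max_error)
```

The regular agents do not reach a consensus here: their limit opinions are spread linearly between the two stubborn values.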


Cross-References 

► Averaging Algorithms and Consensus 

► Information-Based Multi-Agent Systems 


Bibliography 

Acemoglu D, Como G, Fagnani F, Ozdaglar A (2013)
Opinion fluctuations and disagreement in social networks.
Math Oper Res 38(1):1-27
Boyd S, Ghosh A, Prabhakar B, Shah D (2006) Randomized
gossip algorithms. IEEE Trans Inf Theory
52(6):2508-2530
Carli R, Fagnani F, Speranzon A, Zampieri S (2008)
Communication constraints in the average consensus
problem. Automatica 44(3):671-684
Castellano C, Fortunato S, Loreto V (2009) Statistical
physics of social dynamics. Rev Mod Phys 81:591-646
Fax JA, Murray RM (2004) Information flow and cooperative
control of vehicle formations. IEEE Trans Autom
Control 49(9):1465-1476
Galton F (1907) Vox populi. Nature 75:450-451
Gantmacher FR (1959) The theory of matrices. Chelsea
Publishers, New York
Jadbabaie A, Lin J, Morse AS (2003) Coordination of
groups of mobile autonomous agents using nearest
neighbor rules. IEEE Trans Autom Control 48(6):988-1001
Levin DA, Peres Y, Wilmer EL (2008) Markov chains and
mixing times. AMS, Providence
Strogatz SH (2003) Sync: the emerging science of spontaneous
order. Hyperion, New York
Surowiecki J (2004) The wisdom of crowds: why the
many are smarter than the few and how collective
wisdom shapes business, economies, societies and nations.
Little, Brown. (Italian translation: La saggezza
della folla, Fusi Orari, 2007)


Control and Optimization of Batch Processes

Dominique Bonvin
Laboratoire d’Automatique, École Polytechnique Fédérale de Lausanne
(EPFL), 1015 Lausanne, Switzerland

Abstract 

A batch process is characterized by the 
repetition of time-varying operations of finite 
duration. Due to the repetition, there are two 
independent “time” variables, namely, the run 
time during a batch and the batch counter. 
Accordingly, the control and optimization 
objectives can be defined for a given batch 
or over several batches. This entry describes 
the various control and optimization strategies 
available for the operation of batch processes. 
These include conventional feedback control, 
predictive control, iterative learning control, and 
run-to-run control on the one hand and model- 
based repeated optimization and model-free self- 
optimizing schemes on the other. 

Keywords 

Batch control; Batch process optimization; Dynamic optimization;
Iterative learning control; Run-to-run control; Run-to-run optimization

Introduction 

Batch processing is widely used in the manufacturing of goods and
commodity products, in particular in the chemical, pharmaceutical, and
food industries. These industries account for several billion US
dollars in annual sales. Batch operation differs significantly from
continuous operation. While in continuous operation the process is
maintained at an economically desirable operating point, the process
evolves continuously from an initial to a final time in batch
processing. In the chemical industry, for example, since the design of
a continuous plant requires substantial engineering effort, continuous
operation is rarely used for low-volume production. Discontinuous
operations can be of the batch or semi-batch type. In batch
operations, the products to be processed are loaded in a vessel and
processed without material addition or removal. This operation permits
more flexibility than continuous operation by allowing adjustment of
the operating conditions and the final time. Additional flexibility is
available in semi-batch operations, where products are continuously
added by adjusting the feed rate profile. We use the term batch
process to include semi-batch processes.

[Figure 1: 2 x 2 classification of control strategies]
Control objectives (columns): run-time references y_ref(t) or
y_ref[0, t_f]; run-end references z_ref.
Implementation aspect (rows): online (within-run); iterative
(run-to-run).
1 Feedback control (online, run-time references):
  u_k(t) → y_k(t) → y_k[0, t_f], via FBC
2 Predictive control (online, run-end references):
  u_k(t) → z_pred,k(t), via MPC
3 Iterative learning control (iterative, run-time references):
  u_k[0, t_f] → y_k[0, t_f], via ILC with run delay
4 Run-to-run control (iterative, run-end references):
  U(π_k) = u_k[0, t_f] → z_k, via R2R with run delay

Control and Optimization of Batch Processes, Fig. 1
Control strategies for batch processes. The strategies are classified
according to the control objectives (horizontal division) and the
implementation aspect (vertical division). Each objective can be met
either online or iteratively over several batches depending on the
type of measurements available. u_k represents the input vector for
the kth batch, u_k[0, t_f] the corresponding input trajectories,
y_k(t) the run-time outputs measured online, and z_k the run-end
outputs available at the final time. FBC stands for “feedback
control,” MPC for “model predictive control,” ILC for “iterative
learning control,” and R2R for “run-to-run control”

Batch processes dealing with reaction and separation operations
include reaction, distillation, absorption, extraction, adsorption,
chromatography, crystallization, drying, filtration, and
centrifugation. The operation of batch processes involves recipes
developed in the laboratory. A sequence of operations is performed in
a prespecified order in specialized process equipment, yielding a
fixed amount of product. The sequence of tasks to be carried out on
each piece of equipment, such as heating, cooling, reaction,
distillation, crystallization, and drying, is predefined. The desired
production volume is then achieved by repeating the processing steps
on a predetermined schedule.

The main characteristics of batch process operations include the
absence of steady state, the presence of constraints, and the
repetitive nature. These characteristics bring both challenges and
opportunities to the operation of batch processes (Bonvin 1998). The
challenges are related to the fact that the available models are often
poor and incomplete, especially since they need to represent a wider
range of operating conditions than in the case of continuous
processes. Furthermore, although product quality must be controlled,
this variable is usually not available online but only at run end. On
the other hand, opportunities stem from the fact that industrial
chemical processes are often slow, which permits larger sampling
periods and extensive online computations. In addition, the repetitive
nature of batch processes opens the way to run-to-run process
improvement (Bonvin et al. 2006). More information on batch processes
and their operation can be found in Seborg et al. (2004) and Nagy and
Braatz (2003). Next, we will successively address the control and the
optimization of batch processes.

Control of Batch Processes

Control of batch processes differs from control of continuous
processes in two main ways. First, since batch processes have no
steady-state operating point, at least some of the set points are
time-varying profiles. Second, batch processes are repeated over time
and are characterized by two independent variables, the run time t and
the run counter k. The independent variable k provides additional
degrees of freedom for meeting the control objectives when these
objectives do not necessarily have to be completed in a single batch
but can be distributed over several successive batches. This situation
brings into focus the concept of run-end outputs, which need to be
controlled but are only available at the completion of the batch. The
most common run-end output is product quality. Consequently, the
control of batch processes encompasses four different strategies
(Fig. 1):

1. Online control of run-time outputs. This control approach is
similar to that used in continuous processing. However, although some
controlled variables, such as temperature in isothermal operation,
remain constant, the key process characteristics, such as process gain
and time constants, can vary considerably because operation occurs
along state trajectories rather than at a steady-state operating
point. Hence, adaptation in run time t is needed to handle the
expected variations. Feedback control is implemented using PID
techniques or more sophisticated alternatives (Seborg et al. 2004).

2. Online control of run-end outputs. In this case it is necessary to
predict the run-end outputs z based on measurements of the run-time
outputs y. Model predictive control (MPC) is well suited to this task
(Nagy and Braatz 2003). However, the process models available for
prediction are often simplified and thus of limited accuracy.

3. Iterative control of run-time outputs. The manipulated variable
profiles can be generated using iterative learning control (ILC),
which exploits information from previous runs (Moore 1993). This
strategy exhibits the limitations of open-loop control with respect to
the current run, in particular the fact that there is no feedback
correction for run-time disturbances. Nevertheless, this scheme is
useful for generating a time-varying feedforward input term.

4. Iterative control of run-end outputs. In this case the input
profiles are parameterized as u_k[0, t_f] = U(π_k) using the input
parameters π_k. The batch process is thus seen as a static map between
the input parameters π_k and the run-end outputs z_k (Francois et al.
2005).
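
Strategy 4 above can be illustrated with a minimal sketch. It assumes a scalar input parameter, a hypothetical static map `run_batch` standing in for one complete batch, and a simple integral update law with a hand-picked gain; none of these specifics come from the entry.

```python
# Hypothetical static map from the input parameter pi_k to the run-end
# output z_k. In practice this is one complete batch run, not a formula.
def run_batch(pi):
    return 2.0 * pi + 0.5  # assumed plant: gain 2, offset 0.5


def run_to_run(z_ref, pi0, gain, n_runs):
    """Integral run-to-run update: pi_{k+1} = pi_k + gain * (z_ref - z_k)."""
    pi = pi0
    history = []
    for k in range(n_runs):
        z = run_batch(pi)              # z_k is measured only at the end of run k
        history.append((k, pi, z))
        pi = pi + gain * (z_ref - z)   # correction is applied to the next run
    return history


history = run_to_run(z_ref=3.0, pi0=0.0, gain=0.25, n_runs=20)
final_z = history[-1][2]
```

With this assumed map the run-end error shrinks by the factor 1 - 2 * gain from one run to the next, so the choice of gain trades convergence speed against sensitivity to errors in the assumed plant gain.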

It is also possible to combine online and run- 
to-run control for both y and z. However, in such 
a combined scheme, care must be taken so that 
the online and run-to-run corrective actions do 
not oppose each other. Stability during run time 
and convergence in run index must be guaranteed 
(Srinivasan and Bonvin 2007a). 
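
Strategy 3 in the list above (iterative learning control) can be sketched in the same spirit. The example assumes a first-order discrete-time plant and a P-type learning law with gain `gamma`; the plant, the gain, and the reference profile are illustrative choices, not values from the entry.

```python
def simulate(u, a=0.3, b=1.0):
    """One batch of the assumed plant x(t+1) = a*x(t) + b*u(t), y = x, x(0) = 0."""
    x, y = 0.0, []
    for ut in u:
        x = a * x + b * ut
        y.append(x)  # y[t] is the output one step after u[t] is applied
    return y


def ilc(y_ref, n_runs, gamma=0.8):
    """P-type ILC: u_{k+1}(t) = u_k(t) + gamma * e_k(t), with e_k = y_ref - y_k."""
    u = [0.0] * len(y_ref)
    errors = []
    for _ in range(n_runs):
        y = simulate(u)  # open loop within the run: no feedback correction
        e = [r - yt for r, yt in zip(y_ref, y)]
        errors.append(max(abs(v) for v in e))
        u = [ut + gamma * et for ut, et in zip(u, e)]  # update between runs
    return u, errors


y_ref = [t / 10.0 for t in range(1, 11)]  # time-varying reference profile
u_final, errors = ilc(y_ref, n_runs=30)
```

The list `errors` records the largest tracking error of each run. For the assumed plant and gain the error contracts from run to run, but a disturbance acting during a run would only be corrected in the following run.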

Optimization of Batch Processes 

The process variables undergo significant 
changes during batch operation. Hence, the 
major objective in batch operations is not to 
keep the system at optimal constant set points but 
rather to determine input profiles that optimize 
an objective function expressing the system 
performance. 

Problem Formulation 

A typical optimization problem in the context of batch processes is

    \min_{u_k[0,t_f]} J_k = \phi(x_k(t_f)) + \int_0^{t_f} L(x_k(t), u_k(t), t)\,dt    (1)

subject to

    \dot{x}_k(t) = F(x_k(t), u_k(t)), \quad x_k(0) = x_{k,0}    (2)

    S(x_k(t), u_k(t)) \le 0, \quad T(x_k(t_f)) \le 0,    (3)

where x represents the state vector, J the scalar cost to be
minimized, S the run-time constraints, T the run-end constraints, and
t_f the final time.

In constrained optimal control problems, the solution often lies on
the boundary of the feasible region. Batch processes involve run-time
constraints on inputs and states as well as run-end constraints.
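
To make the structure of (1)-(3) concrete, the following sketch evaluates the cost of a toy problem for a piecewise-constant input and searches a coarse grid of candidate profiles. The dynamics, cost terms, input bound, and grid are all assumptions chosen for illustration; a real application would use a proper dynamic optimization method rather than enumeration.

```python
def batch_cost(u_profile, t_f=1.0, x0=1.0, u_max=2.0):
    """Euler evaluation of J = phi(x(t_f)) + integral of L dt.

    Assumed toy problem (not from the entry):
        dx/dt = -x + u,  phi(x) = (x - 2)**2,  L = 0.1 * u**2,
        run-time input constraint 0 <= u <= u_max.
    """
    steps_per_stage = 100
    dt = t_f / (len(u_profile) * steps_per_stage)
    x, integral = x0, 0.0
    for u in u_profile:
        if not 0.0 <= u <= u_max:
            return float("inf")  # infeasible input profile
        for _ in range(steps_per_stage):
            integral += 0.1 * u * u * dt  # running cost L
            x += (-x + u) * dt            # explicit Euler step of the dynamics
    return (x - 2.0) ** 2 + integral      # terminal cost phi plus integral


# Coarse enumeration over two-stage input profiles (illustration only).
candidates = [0.0, 0.5, 1.0, 1.5, 2.0]
best_cost, best_u = min(
    (batch_cost([u1, u2]), (u1, u2)) for u1 in candidates for u2 in candidates
)
```

Inspecting `best_u` for this toy problem shows whether the selected profile sits on the input bound, which is the situation described in the paragraph above.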

Optimization Strategies 

[Figure 2: 2 x 2 classification of optimization strategies]
Use of process model (columns): explicit optimization (with process
model); implicit optimization (without process model).
Implementation aspect (rows): online (within-run); iterative
(run-to-run).
1 Repeated online optimization (online, explicit):
  y_k[0, t] → EST → x̂_k(t) → OPT → u_k[t, t_f], repeat online
2 Online input update using measurements (online, implicit):
  y_k(t) → approximation of optimal solution → u_k(t), or
  y_k[0, t] → NCO prediction
3 Repeated run-to-run optimization (iterative, explicit):
  y_k[0, t_f] → IDENT → θ̂_k → OPT → u_{k+1}[0, t_f], repeat with run
  delay
4 Run-to-run input update using measurements (iterative, implicit):
  y_k[0, t_f] → NCO evaluation → NCO → u_{k+1}[0, t_f], repeat with
  run delay

Control and Optimization of Batch Processes, Fig. 2
Optimization strategies for batch processes. The strategies are
classified according to whether or not a process model is used for
implementation (horizontal division). Furthermore, each class can be
implemented either online or iteratively over several runs (vertical
division). EST stands for “estimation,” IDENT for “identification,”
OPT for “optimization,” and NCO for “necessary conditions of
optimality”

As can be seen from the cost objective (1), optimization requires
information about the complete run and thus cannot be implemented in
real time using only online measurements. Some information regarding
the future of the run is needed in the form of either a process model
capable of prediction or measurements from previous runs. Accordingly,
measurement-based optimization methods can be classified depending on
whether or not a process model is used explicitly for implementation,
as illustrated in Fig. 2 and discussed next:

1. Online explicit optimization. This approach is similar to model
predictive control (Nagy and Braatz 2003). Optimization uses a process
model explicitly and is repeated whenever a new set of measurements
becomes available. This scheme involves two steps, namely, updating
the initial conditions for the subsequent optimization (and optionally
the parameters of the process model) and numerical optimization based
on the updated process model (Abel et al. 2000). Since both steps are
repeated as measurements become available, the procedure is also
referred to as repeated online optimization. The weakness of this
method is its reliance on the model; if the model is not updated, its
accuracy plays a crucial role. However, when the model is updated,
there is a conflict between parameter identification and optimization
since parameter identification requires persistency of excitation,
that is, the inputs must be sufficiently varied to uncover the unknown
parameters, a condition that is usually not satisfied when
near-optimal inputs are applied. Note that, instead of computing the
input u_k[t, t_f], it is also possible to use a receding horizon and
compute only u_k[t, t + T], with T the control horizon (Abel et al.
2000).

2. Online implicit optimization. In this scenario, measurements are
used to update the inputs directly, that is, without the intermediary
of a process model. Two classes of techniques can be identified. In
the first class, an update law that approximates the optimal solution
is sought. For example, a neural network is trained with data
corresponding to optimal behavior for various uncertainty realizations
and used to update the inputs (Rahman and Palanki 1996). The second
class of techniques relies on transforming the optimization problem
into a control problem that enforces the necessary conditions of
optimality (NCO) (Srinivasan and Bonvin 2007b). The NCO involve
constraints that need to be made active and sensitivities that need to
be pushed to zero. Since some of these NCO are evaluated at run time
and others at run end, the control problem involves both run-time and
run-end outputs. The main issue is the measurement or estimation of
the controlled variables, that is, the constraints and sensitivities
that constitute the NCO.

3. Iterative explicit optimization. The steps followed in run-to-run
explicit optimization are the same as in online explicit optimization.
However, there is substantially more data available at the end of the
run as well as sufficient computational time to refine the model by
updating its parameters and, if needed, its structure. Furthermore,
data from previous runs can be collected for model update (Rastogi
et al. 1992). As with online explicit optimization, this approach
suffers from the conflict between estimation and optimization.

4. Iterative implicit optimization. In this scenario, the optimization
problem is transformed into a control problem, for which the control
approaches in the second row of Fig. 1 are used to meet the run-time
and run-end objectives (Francois et al. 2005). The approach, which is
conceptually simple, might be experimentally expensive since it relies
more on data.
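
The idea of pushing a sensitivity to zero from run to run (strategy 4) can be sketched as follows. The sketch assumes a scalar input parameter, a hypothetical `run_batch` that returns the measured cost of one run, and a finite-difference estimate of the sensitivity obtained from one extra perturbed run per iteration; it illustrates the principle and is not the specific scheme of the cited references.

```python
def run_batch(pi):
    """Hypothetical batch: returns the measured cost of one complete run."""
    return (pi - 1.5) ** 2 + 0.2  # assumed plant, unknown to the optimizer


def run_to_run_gradient(pi0, step, delta, n_iter):
    """Drive the estimated sensitivity dJ/dpi to zero over successive runs."""
    pi = pi0
    for _ in range(n_iter):
        j_nominal = run_batch(pi)            # nominal run
        j_perturbed = run_batch(pi + delta)  # run with perturbed parameter
        sensitivity = (j_perturbed - j_nominal) / delta
        pi -= step * sensitivity             # integral action on the sensitivity
    return pi


pi_opt = run_to_run_gradient(pi0=0.0, step=0.3, delta=0.01, n_iter=40)
```

Each iteration costs two batches here, which illustrates why the text calls the implicit run-to-run approach conceptually simple but potentially expensive in experiments.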

These complementary measurement-based 
optimization strategies can be combined by 
implementing some aspects of the optimization 
online and others on a run-to-run basis. For 
instance, in explicit schemes, the states can be 
estimated online, while the model parameters 
can be estimated on a run-to-run basis. Similarly, 
in implicit optimization, approximate update 


laws can be implemented online, leaving the 
responsibility for satisfying terminal constraints 
and sensitivities to run-to-run controllers. 


Summary and Future Directions 

Batch processing presents several challenges. 
Since there is little time for developing 
appropriate dynamic models, there is a need for 
improved data-driven control and optimization 
approaches. These approaches require the 
availability of online concentration-specific 
measurements such as chromatographic and 
spectroscopic sensors, which are not yet readily 
available in production. 

Technically, the main operational difficulty in 
batch process improvement lies in the presence 
of run-end outputs such as final quality, which 
cannot be measured during the run. Although 
model-based solutions are available, process 
models in the batch area tend to be poor. On 
the other hand, measurement-based optimization 
for a given batch faces the challenge of having 
to know about the future to act during the batch. 
Consequently, the main research push is in the 
area of measurement-based optimization and the 
use of data from both the current and previous 
batches for control and optimization purposes. 

Cross-References 

► Industrial MPC of continuous processes 

► Iterative Learning Control 

► Multiscale Multivariate Statistical Process 
Control 

► Scheduling of Batch Plants 

► State Estimation for Batch Processes

Abstract

This entry gives a brief overview of the recent developments in audio
sound reproduction via modern sampled-data control theory. We first
review the basics of current sound processing technology and then
proceed to the new idea derived from sampled-data control theory,
which is different from the conventional Shannon paradigm based on the
perfect band-limiting hypothesis. The hybrid nature of sampled-data
systems provides an optimal platform for dealing with signal
processing where the ultimate objective is to reconstruct the original
analog signal one started with. After discussing some fundamental
problems in the Shannon paradigm, we give our basic problem
formulation that can be solved using modern sampled-data control
theory. Examples are given to illustrate the results.


Keywords 

Digital signal processing; Multirate signal processing; Sampled-data
control; Sampling theorem; Sound reconstruction


Introduction: Status Quo 

Consider the problem of reproducing sounds from recorded media such as
compact discs. The current CD format is recorded at the sampling
frequency 44.1 kHz. It is commonly claimed that the highest frequency
for human audibility is 20 kHz, whereas the upper bound of
reproduction in this format is believed to be half of 44.1 kHz, i.e.,
22.05 kHz, and hence, this format should have about 10% margin against
the alleged audible limit of 20 kHz.

Early CD players processed such digital signals with a simple
zero-order hold at this frequency, followed by an analog low-pass
filter. This process requires a sharp low-pass characteristic to cut
out unnecessary high-frequency components beyond 20 kHz. However, a
sharp cutoff low-pass characteristic inevitably requires a high-order
filter, which in turn introduces a large amount of phase shift
distortion around the cutoff frequency.

To circumvent this defect, the idea of an oversampling DA converter
was introduced. It is realized by the combination of a digital filter
and a low-order analog filter (Zelniker and Taylor 1994). This is
based on the following principle:

Let {f(nh)}_{n=-∞}^{∞} be a discrete-time signal obtained from a
continuous-time signal f(·) by sampling it with sampling period h. The
upsampler ↑M appends the value 0, M - 1 times, between two adjacent
sampling points:

    (\uparrow M\, w)[k] := \begin{cases} w(l), & k = Ml \\ 0, & \text{elsewhere.} \end{cases}    (1)

Fig. 1 Upsampler for M = 2

See Fig. 1 for the case M = 2. This has the effect of making the unit
operational time M times faster.
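
Definition (1) translates directly into a few lines of code. The sketch below assumes a finite list of samples indexed from 0; the function name `upsample` is chosen here for illustration.

```python
def upsample(w, M):
    """Zero-insertion upsampler: out[k] = w[l] if k == M*l, and 0 elsewhere."""
    out = [0] * (M * len(w))
    out[::M] = w  # every M-th slot receives an original sample
    return out


doubled = upsample([1, 2, 3], 2)
```

For M = 2, a single zero is inserted after each original sample, as in Fig. 1.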

The bandwidth will also be expanded by M times and the Nyquist
frequency (i.e., half the sampling frequency) becomes Mπ/h [rad/sec].
As we see in the next section, the Nyquist frequency is often regarded
as the true bandwidth of the discrete-time signal {f(nh)}_{n=-∞}^{∞}.
But this upsampling process just inserts zeros between sampling
points, and the real information content (the true bandwidth) is not
really expanded. As a result, the copy of the frequency content for
[0, π/h) appears as a mirror image repeatedly over the frequency range
above π/h. This distortion is called imaging. In order to avoid the
effect of such mirrored frequency components, one often truncates the
frequency components beyond the (original) Nyquist frequency via a
digital low-pass filter that has a sharp roll-off characteristic. One
can then complete the digital to analog (DA) conversion process by
appending a slowly decaying analog filter. This is the idea of an
oversampling DA converter (Zelniker and Taylor 1994). The advantage
here is that by allowing a much wider frequency range, the final
analog filter can be a low-order filter and hence yields a relatively
small amount of phase distortion, supported in part by the
linear-phase characteristic endowed on the digital filter preceding
it.


Signal Reconstruction Problem 

As before, consider the sampled discrete-time signal
{f(nh)}_{n=-∞}^{∞} obtained from a continuous-time signal f. The main
question is how we can recover the original continuous-time signal
f(·) from sampled data. This is clearly an ill-posed problem without
any assumption on f because there are infinitely many functions that
can match the sampled data {f(nh)}_{n=-∞}^{∞}. Hence, one has to
impose a reasonable a priori assumption on f to sensibly discuss this
problem.

The following sampling theorem gives one answer to this question:

Theorem 1 Suppose that the signal f ∈ L² is perfectly band-limited, in
the sense that there exists ω₀ < π/h such that the Fourier transform
\hat{f} of f satisfies

    \hat{f}(\omega) = 0, \quad |\omega| > \omega_0.    (2)

Then

    f(t) = \sum_{n=-\infty}^{\infty} f(nh)\, \frac{\sin \pi (t/h - n)}{\pi (t/h - n)}.    (3)

This theorem states that if the signal f does not contain any
high-frequency components beyond the Nyquist frequency π/h, then the
original signal f can be uniquely reconstructed from its sampled data
{f(nh)}_{n=-∞}^{∞}. On the other hand, if this assumption does not
hold, then the result does not necessarily hold. This is easy to see
via a schematic representation in Fig. 2.
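
The reconstruction formula (3) can be checked numerically on a signal that satisfies the band-limiting hypothesis. The sketch below assumes h = 1 and uses the test signal sinc²(0.4t), whose spectrum vanishes beyond 0.8π < π/h; the test signal and the truncation length are choices made here for illustration.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def f(t):
    # Band-limited test signal: the spectrum of sinc(0.4 t)**2 is a triangle
    # supported on |omega| <= 0.8*pi, below the Nyquist frequency pi/h for h = 1.
    return sinc(0.4 * t) ** 2


def reconstruct(t, h=1.0, n_terms=200):
    """Truncated version of formula (3) using samples f(nh), |n| <= n_terms."""
    return sum(f(n * h) * sinc(t / h - n) for n in range(-n_terms, n_terms + 1))


error_between_samples = abs(reconstruct(0.37) - f(0.37))
```

Note that evaluating `reconstruct(t)` uses samples on both sides of t, including future ones, and that the number of terms needed depends on how quickly the samples of the signal decay.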

If we sample the sinusoid in the upper part of Fig. 2, the sampled
values turn out to be compatible with another sinusoid of much lower
frequency, as the lower part shows. In other words, this sampling
period does not have enough resolution to distinguish these two
sinusoids. The maximum frequency below which no such phenomenon occurs
is the Nyquist frequency. The sampling theorem above asserts that it
is half of the sampling frequency 2π/h, that is, π/h [rad/sec]. In
other words, if we can assume that the original signal contains no
frequency components beyond the Nyquist frequency, then one can
uniquely reconstruct the original analog signal f from its sampled
data {f(nh)}_{n=-∞}^{∞}. On the other hand, if this assumption does
not hold, the distortion depicted in Fig. 2 occurs; this is called
aliasing.

Fig. 2 Aliasing
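
A short numerical check makes the aliasing effect tangible. The sampling rate and the two frequencies below are assumed values chosen so that the arithmetic is easy to follow.

```python
import math

h = 1.0 / 8.0                 # assumed sampling period: 8 samples per second
nyquist_hz = 1.0 / (2.0 * h)  # 4 Hz
f_high = 7.0                  # sinusoid above the Nyquist frequency
f_alias = f_high - 1.0 / h    # -1 Hz: a 1 Hz sinusoid with inverted sign

samples_high = [math.sin(2 * math.pi * f_high * n * h) for n in range(16)]
samples_alias = [math.sin(2 * math.pi * f_alias * n * h) for n in range(16)]

# The two sample sequences coincide up to rounding error.
max_difference = max(abs(a - b) for a, b in zip(samples_high, samples_alias))
```

From the samples alone, the 7 Hz sinusoid cannot be told apart from the 1 Hz one, which is exactly the ambiguity that the band-limiting hypothesis rules out.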

This is the content of the sampling theorem. 
It has been widely accepted as the basis for 
digital signal processing that bridges analog to 
digital. Concrete applications such as CD, MP3, 
or images are based on this principle in one way 
or another. 

Difficulties 

However, this paradigm (hereafter the Shannon paradigm) of the perfect
band-limiting hypothesis and the resulting sampling theorem gives rise
to several difficulties:

• The reconstruction formula (3) is not causal, i.e., one needs future
sampled values to reconstruct the present value f(t). One can remedy
this defect by allowing a certain amount of delay in reconstruction,
but this delay can depend on how fast the formula converges.

• The terms of this formula are known to decay slowly; that is, we
need many terms to approximate f if we use this formula as it is.

• The perfect band-limiting hypothesis is hardly satisfied in reality.
For example, for CDs, the Nyquist frequency is 22.05 kHz, and the
energy distribution of real sounds often extends well beyond 20 kHz.

• To remedy this, one often introduces a band-limiting low-pass
filter, but the sharp decay it requires in the frequency domain can
introduce distortions. See Fig. 3.

This is the Gibbs phenomenon well known in Fourier analysis. A sharp
truncation in the frequency domain yields such a ringing effect.
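
The ringing caused by sharp truncation in the frequency domain can be reproduced with the partial Fourier sums of a square wave. This is a standard textbook illustration rather than the computation behind Fig. 3; the grid resolution and the numbers of harmonics are assumptions.

```python
import math


def square_partial_sum(t, n_harmonics):
    """Fourier partial sum of a square wave (+1 on (0, pi), -1 on (-pi, 0))."""
    return (4.0 / math.pi) * sum(
        math.sin(k * t) / k for k in range(1, n_harmonics + 1, 2)
    )


def peak(n_harmonics, n_grid=5000):
    """Largest value of the partial sum on a grid over (0, pi)."""
    return max(
        square_partial_sum(math.pi * i / n_grid, n_harmonics)
        for i in range(1, n_grid)
    )


peak_few = peak(19)   # harmonics 1, 3, ..., 19
peak_many = peak(99)  # harmonics 1, 3, ..., 99
```

Comparing `peak_few` and `peak_many` shows that keeping more harmonics narrows the ringing but does not remove the overshoot above the level 1.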

In view of such drawbacks, there has been revived interest in the
extension of the sampling theorem in various forms since the 1990s.
There is by now a stream of papers that aim at studying signal
reconstruction under the assumption of nonideal signal acquisition
devices; an excellent survey is given in Unser (2000). In this
research framework, the incoming signal is supposed to be acquired
through a nonideal analog filter (acquisition device) and sampled, and
then the reconstruction process attempts to recover the original
signal. The idea is to place the problem into the framework of the
(orthogonal or oblique) projection theorem in a Hilbert space (usually
L²) and then project the signal space to the subspace generated by the
shifted reconstruction functions. It is often required that the
process give a consistent result, i.e., if we subject the
reconstructed signal to the whole process again, it should yield the
same sampled values from which it was reconstructed (Unser and
Aldroubi 1994).

In what follows, we take a similar viewpoint, that is, the incoming
signals are acquired through a nonideal filter, but develop a
methodology different from the projection method, relying on
sampled-data control theory.


The Signal Class 

We have seen that the perfect band-limiting hypothesis is restrictive.
Even if we adopt it, it is too crude a model of analog signals to
allow a more elaborate study.

Let us now pose the question: What class of functions should we
process in such systems?

Fig. 3 Gibbs phenomenon

Consider the situation where one plays a musical instrument, say, a
guitar. A guitar naturally has a frequency characteristic. When one
picks a string, it produces a certain tone along with its harmonics,
as well as a characteristic transient response. All these are governed
by a certain frequency decay curve, determined by the physical
characteristics of the guitar. Let us suppose that such a frequency
decay is governed by a rational transfer function F(s), and it is
driven by varied exogenous inputs.

Consider Fig. 4. The exogenous analog signal w_c ∈ L² is applied to
the analog filter F(s). This F(s) is not an ideal filter and hence its
bandwidth is not limited below the Nyquist frequency. The signal w_c
drives F(s) to produce the target analog signal y_c, which is the
signal to be reconstructed. It is then sampled by the sampler S_h and
becomes the recorded or transmitted digital signal y_d. The objective
here is to reconstruct the target analog signal y_c out of this
sampled signal y_d. In order to recover the frequency components
beyond the Nyquist frequency, one needs a shorter sampling period, so
we insert the upsampler ↑L to make the sampling period h/L. This
upsampled signal is processed by the digital filter K(z) and then
becomes a continuous-time signal again by going through the hold
device H_{h/L}. It will then be processed by the analog filter P(s) to
be smoothed out. The obtained signal is then compared with the delayed
analog signal y_c(t - mh) to form the delayed error signal e_c. The
objective is then to make this error e_c as small as possible. The
reason for allowing the delay e^{-mhs} is to accommodate certain
processing delays. This is the idea of the block diagram in Fig. 4.

The performance index we minimize is the induced norm of the transfer
operator T_ew from w_c to e_c:

    \|T_{ew}\|_\infty := \sup_{w_c \neq 0} \frac{\|e_c\|_2}{\|w_c\|_2}.    (4)

In other words, it is the H^∞-norm of the sampled-data control system
in Fig. 4. Our objective is then to solve the following problem:

Filter Design Problem

Consider the system specified by Fig. 4. For a given performance level
γ > 0, find a filter K(z) such that

    \|T_{ew}\|_\infty < \gamma.

This is a sampled-data H^∞ (sub-)optimal control problem. It can be
solved by using the standard solution method for sampled-data control
systems (Chen and Francis 1995a; Yamamoto 1999; Yamamoto et al. 2012).
The only anomaly here is that the system in Fig. 4 contains a delay
element e^{-mhs}, which is infinite dimensional. However, by suitably
approximating this delay by a successive series of shift registers,
one can convert the problem to an appropriate finite-dimensional
discrete-time problem (Yamamoto et al. 1999, 2002, 2012).

This problem setting has the following features:

1. One can optimize the continuous-time performance under the
constraint of discrete-time filters.

2. By setting the class of input functions as L² functions
band-limited by F(s), one can capture the continuous-time error signal
e_c and its worst-case norm in the sense of (4).

The first feature is an advantage of sampled-data control theory,
which allows the mixture of continuous- and discrete-time components.
This is in marked contrast to the Shannon paradigm where
continuous-time performance is really demanded by the artificial
perfect band-limiting hypothesis.

The second feature is an advantage due to H^∞ control theory.
Naturally, we cannot have access to each individual error signal e_c,
but we can still control the overall performance from w_c to e_c in
terms of the H^∞ norm that guarantees the worst-case performance. This
is in clear contrast with the classical case where only a
representative response, e.g., the impulse response in the case of
H², is targeted. Furthermore, since we can control the continuous-time
performance of the worst-case error signal, the present method can
indeed minimize (continuous-time) phase errors. This is an advantage
usually not possible with conventional methods since they mainly
discuss the gain characteristics of the designed filters only. By the
very property of minimizing the H^∞ norm of the continuous-time error
signal e_c, the present method can even control the phase errors and
yield much less phase distortion even around the cutoff frequency.

Figure 5 shows the response of the proposed sampled-data filter
against a rectangular wave, with a suitable first- or second-order
analog filter F(s); see Yamamoto et al. (2012) for more details.
Unlike Fig. 3, the overshoot is kept to a minimum.

The present method has been patented (Fujiyama et al. 2008; Yamamoto
2006; Yamamoto and Nagahara 2006) and implemented into sound
processing LSI chips as a core technology by Sanyo Semiconductors and
successfully used in mobile phones, digital voice recorders, and MP3
players; their cumulative production has exceeded 40 million units as
of the end of 2012.

Summary and Future Directions 

We have presented the basic ideas of a new signal processing theory
derived from sampled-data control theory. The theory offers advantages
that are not attainable with the conventional projection methods,
whether based on the perfect band-limiting hypothesis or not.

The application of sampled-data control theory to digital signal
processing was first made by Chen and Francis (1995b) with performance
measure in the discrete-time domain; see also Hassibi et al. (2006).
The present author and his group have pursued the idea presented in
this entry since 1996 (Khargonekar and Yamamoto 1996). See Yamamoto
et al. (2012) and references therein. For the background of
sampled-data control theory, consult, e.g., Chen and Francis (1995a)
and Yamamoto (1999).

Fig. 5 Response of the proposed sampled-data filter against a
rectangular wave

The same philosophy of emphasizing the importance of analog
performance was proposed and pursued recently by Unser and co-workers
(1994), Unser (2005), and Eldar and Dvorkind (2006). The crucial
difference is that they rely on L²/H² type optimization and orthogonal
or oblique projections, which are very different from our method here.
In particular, such projection methods can behave poorly for signals
outside the projected space. The response shown in Fig. 3 is a typical
example.

Applications to image processing are discussed
in Yamamoto et al. (2012). An application
to Delta-Sigma DA converters is studied in 
Nagahara and Yamamoto (2012). Again, the 
crux of the idea is to assume a signal generator 
model and then design an optimal filter in 
the sense of Fig. 4 or a similar diagram with 
the same idea. This idea should be applicable 
to a much wider class of problems in signal 
processing and should prove to have more 
impact. 

Some processed examples of still and moving
images are downloadable from the site:
http://www-ics.acs.i.kyoto-u.ac.jp/~yy/


For the sampling theorem, see Shannon (1949),
Unser (2000), and Zayed (1996), for example.
Note, however, that Shannon himself (1949) did
not claim originality on this theorem; hence, it
is misleading to attribute this theorem solely to
Shannon. See Unser (2000) and Zayed (1996)
for some historical accounts. For a general
background in signal processing, Vetterli et al.
(2013) is useful.

Cross-References 

► H-Infinity Control 

► Optimal Sampled-Data Control 

► Sampled-Data Systems

Abstract

Over the last two and a half decades we have
observed astonishing progress in the field of
nanotechnology. This progress is largely due to
the invention of the Scanning Tunneling Microscope
(STM) and the Atomic Force Microscope (AFM)
in the 1980s. Central to the operation of the AFM
and STM is a nanopositioning system that
moves a sample or a probe with extremely
high precision, down to a fraction of an Angstrom
in certain applications. This note concentrates
on the fundamental role of feedback and the
need for model-based control design methods in
improving the accuracy and speed of operation of
nanopositioning systems.

Keywords 

Atomic force microscopy; High-precision
mechatronic systems; Nanopositioning;
Scanning probe microscopy

Introduction 

Controlling the motion of an actuator to within a
single atom, known as nanopositioning, may
seem an impossible task. Yet, it has become
a key requirement in many systems that have
emerged in recent years. In scanning probe
microscopy, nanopositioning is needed to scan a
probe over a sample surface for imaging and to
control the interaction between the probe and the
surface during interrogation and manipulation
(Meyer et al. 2004). Nanopositioning is the
enabling technology for mask-less lithography
tools under development to replace optical
lithography systems (Vettiger et al. 2002). Novel
nanopositioning tools are required for positioning
of wafers and for mask alignment in the
semiconductor industry (Verma et al. 2005).
Nanopositioning systems are vital in molecular
biology for imaging, alignment, and
nanomanipulation in applications such as DNA
analysis (Meldrum et al. 2001) and nanoassembly
(Whitesides and Christopher Love 2001).
Nanopositioning is an important technology in
optical alignment systems (Krogmann 1999). In
data storage systems, nanometer-scale precision
is needed for emerging probe-storage devices, for
dual-stage hard-disk drives, and for
next-generation tape drives (Cherubini et al. 2012).

The Need for High-Speed 
Nanopositioning 

In all applications of nanopositioning, there is a
significant and growing demand for high speeds.
The ability to operate a nanopositioner at a
bandwidth of tens of kHz, as opposed to today's
hundreds of Hz, is the key to unlocking countless
technological possibilities in the future (Gao et al.
2000; Pantazi et al. 2008; Salapaka 2003;
Sebastian et al. 2008b; Yong et al. 2012). The
atomic force microscope (AFM) is an example of
such technologies. A typical commercial atomic
force microscope is a slow device, taking up to a
minute or longer to generate an image. Such
imaging speeds are too slow to investigate
phenomena with fast dynamics. For example, rapid
biological processes that occur in seconds, such as
rapid movement of cells or fast dehydration and
denaturation of collagen, are too fast to be
observed by a typical commercial AFM (Zou et al.
2004). A key obstacle in realizing high-speed and
video-rate atomic force microscopy is the limited
speed of nanopositioners.

The Vital Role of Feedback Control in 
High-Speed Nanopositioning 

The systems described above depend on a
precision mechatronic device, known as a
nanopositioner or a scanner, for their operation.
A high-speed scanner is shown in Fig. 1.
In all applications where nanopositioning is
a necessity, the key objective is to make the
scanner follow, or track, a given reference
trajectory (Devasia et al. 2007). A large number
of control design methods have been proposed
for this purpose, including feedforward control
(Clayton et al. 2009), feedback control (Salapaka
2003), and combinations of the two (Yong et al.
2009). These control techniques are required
in order to compensate for the mechanical
resonances of the scanner as well as for various
nonlinearities and uncertainties in the dynamics
of the nanopositioner. At low speeds, feedforward
techniques are usually sufficient to address
many of the arising challenges. However, over
a wide bandwidth, model uncertainties, sensor
noise, and mechanical cross-couplings become
significant, and hence feedback control becomes
essential to achieve the requisite nanoscale
accuracy and precision at high speeds (Devasia
et al. 2007; Salapaka 2003).

Fig. 1 3DoF flexure-guided high-speed nanopositioner (Yong
et al. 2013). The three axes are actuated independently
using piezoelectric stack actuators. Movement of lateral
axes is measured using capacitive sensors

Control Design Challenges 

A feedback loop typically encountered in
nanopositioning is illustrated in Fig. 2. The
purpose of the feedback controller is to control
the position of the scanner such that it follows
a given reference trajectory based on the
measurement provided by a displacement
sensor. The resulting tracking error contains
both deterministic and stochastic components.
Deterministic errors are typically due to
insufficient closed-loop bandwidth. They may
also arise from excitation of mechanical resonant
modes of the scanner or actuator nonlinearities
such as piezoelectric hysteresis and creep (Croft
et al. 2001). The factors that limit the achievable
closed-loop bandwidth include phase delays and
non-minimum phase zeros associated with the
actuator and scanner dynamics (Devasia et al.
2007). The dynamics of the nanopositioner, the
controller, and the reference trajectory selected
for scanning play a key role in minimizing the
deterministic component of the tracking error.

Tracking errors of a stochastic nature mostly
arise from external noise and vibrations and from
position measurement noise. External noise and
vibrations can be significantly reduced by
operating the nanopositioner in a controlled
environment. However, dealing with the measurement
noise is a significant challenge (Sebastian et al.
2008a). The feedback loop allows the sensing
noise to generate a random positioning error that
deteriorates the positioning precision. Increasing
the closed-loop bandwidth (to decrease the
deterministic errors) tends to worsen this effect.
Low sensitivity to measurement noise is, therefore,
a key requirement in feedback control design
for high-speed nanopositioning and a very hard
problem to address.

Fig. 2 Feedback loop typically encountered in nanopositioning.
The purpose of the controller is to control the position of
the scanner such that it follows the intended reference
trajectory based on the position measurement obtained
from a position sensor
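
The trade-off between closed-loop bandwidth and sensitivity to measurement noise can be illustrated numerically. The following is a minimal sketch under stated assumptions, not a design taken from the cited works: the scanner is modeled as a single lightly damped resonance at 10 kHz, the controller is a pure integrator, and the sensor noise is white. The function name `closed_loop_metrics` and all numerical values are hypothetical.

```python
import numpy as np


def closed_loop_metrics(k, wn=2 * np.pi * 10e3, zeta=0.05, noise_density=1.0):
    """Return (bandwidth in Hz, RMS noise-induced position error).

    k is the integral gain in rad/s. wn and zeta describe a hypothetical
    lightly damped scanner resonance. noise_density is an assumed white
    sensor-noise density in (position units)/sqrt(Hz).
    """
    # Routh-Hurwitz bound for s^3 + 2*zeta*wn*s^2 + wn^2*s + k*wn^2
    if not 0 < k < 2 * zeta * wn:
        raise ValueError("closed loop is unstable for this gain")
    w = np.logspace(1, 7, 20_000)                    # frequency grid, rad/s
    s = 1j * w
    G = wn**2 / (s**2 + 2 * zeta * wn * s + wn**2)   # scanner model
    C = k / s                                        # integral controller
    T = G * C / (1 + G * C)                          # noise-to-position gain
    mag = np.abs(T)
    # first grid frequency at which |T| falls below -3 dB
    bandwidth_hz = w[np.argmax(mag < 1 / np.sqrt(2))] / (2 * np.pi)
    # variance = integral of noise PSD times |T|^2 over frequency in Hz
    f = w / (2 * np.pi)
    variance = noise_density**2 * np.sum(
        0.5 * (mag[1:] ** 2 + mag[:-1] ** 2) * np.diff(f)
    )
    return bandwidth_hz, np.sqrt(variance)


for k in (500.0, 2000.0, 5000.0):
    bw, rms = closed_loop_metrics(k)
    print(f"k = {k:6.0f} rad/s  bandwidth = {bw:7.1f} Hz  noise RMS = {rms:7.2f}")
```

In this simplified setting, raising the gain widens the bandwidth but also lets more sensor noise through to the position, which is the conflict described in the text. A real design would use an identified scanner model and a measured noise spectrum.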


Summary and Future Directions 

While high-precision nanoscale positioning
systems have been demonstrated at low speeds,
despite an intensive international race spanning
several years, the longstanding challenge remains
to achieve high-speed motion and positioning
with Angstrom-level accuracy. Overcoming this
barrier is believed to be the necessary catalyst for
the emergence of groundbreaking innovations across
a wide range of scientific and technological fields.
Control is a critical technology to facilitate the
emergence of such systems.

Abstract

This entry provides an overview of the so-called
control pyramid, which organizes the different
types of control tasks in a processing plant in a
set of interconnected layers, from basic control
and instrumentation to plant-wide economic
optimization. These layers have different functions,
all of them necessary for the optimal functioning
of large processing plants.

Introduction

Operating a process plant is a complex task
involving many different aspects ranging from the
control of individual pieces of equipment and of
process units to the management of the plant or
factory as a whole, including relations with other
plants or suppliers.

From the control point of view, the corresponding
tasks are traditionally organized in several
layers, placing at the bottom the ones closer
to the physical processes and at the top those
closer to plant-wide management, forming the
so-called control pyramid represented in Fig. 1.

The process industry currently faces many
challenges originating from factors such as
increased competition among companies and
better global market information, new environmental
regulations and safety standards, and improved
quality or energy efficiency requirements. Many
years ago, the main tasks were associated with the
correct and safe functioning of the individual
process units and with the global management
of the factory from the point of view of
organization and economy. Therefore, only the
lower and top layers of the control pyramid were
realized by computer-based systems, whereas
the intermediate tasks were largely performed
by human operators and managers. The
intermediate layers are now steadily gaining
importance in order to face the abovementioned
challenges.

Control Hierarchy of Large Processing Plants: An
Overview, Fig. 1 The control pyramid

Above the physical plant represented in Fig. 1,
there is a layer related to instrumentation and
basic control, devoted to obtaining direct process
information and maintaining selected process
variables close to their desired targets by
means of local controllers. Motivated by the need
for more efficient operation and better-quality
assurance, an improvement of this basic control
can be obtained using control structures such
as cascades, feedforward, ratios, and selectors.
This is called advanced control in industry, but
not in academia, where the term is reserved for
more sophisticated controls.

A big step forward took place in the control
field with the introduction of model-based
predictive control (MBPC/MPC) in the late 1970s
and 1980s (► Industrial MPC of Continuous
Processes; Camacho and Bordons 2004).
MPC aims at regulating a process unit as
a whole considering all manipulated and
controlled variables simultaneously. It handles
all interactions, disturbances, and process
constraints using a process model in order to
compute the control actions that optimize a
control performance index. MPC is built on top
of the basic control loops and partly replaces
the complex control structures of the advanced
control layer, adding new functionalities and
better control performance. The improvements
in control quality and the management of
constraints and interactions of the
model-predictive controllers open the door for the
implementation of local economic optimization.
Linked to the MPC controller and taking
advantage of its model, an optimizer may look
for the best operating point of the unit by
computing the controller set points that optimize
an economic cost function of the process unit
considering the operational constraints of the
unit. This task is usually formulated and solved
as a linear programming (LP) problem, i.e., based
on linear or linearized economic models and cost
function (see Fig. 2).
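
To make the LP formulation of the local optimizer concrete, the following minimal sketch computes two set points that maximize a linear profit subject to linear operating constraints. It is an illustration under stated assumptions only: the profit coefficients, the constraint values, and the function name `solve_lp_2d` are hypothetical, the feasible region is assumed bounded, and the LP is solved by simple vertex enumeration rather than by the solver an industrial package would use.

```python
from itertools import combinations


def solve_lp_2d(c, A, b, tol=1e-9):
    """Maximize c.x subject to A x <= b for x in R^2 by vertex enumeration.

    Assumes a bounded, non-empty feasible region, so an optimum is attained
    at the intersection of two constraint boundaries.
    """
    best_x, best_val = None, float("-inf")
    for (a1, b1), (a2, b2) in combinations(zip(A, b), 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < tol:
            continue  # parallel constraints, no unique intersection
        # Cramer's rule for the intersection of the two boundaries
        x = ((b1 * a2[1] - a1[1] * b2) / det,
             (a1[0] * b2 - b1 * a2[0]) / det)
        feasible = all(ai[0] * x[0] + ai[1] * x[1] <= bi + tol
                       for ai, bi in zip(A, b))
        if feasible:
            val = c[0] * x[0] + c[1] * x[1]
            if val > best_val:
                best_x, best_val = x, val
    return best_x, best_val


# Hypothetical unit: profit 3*x1 + 2*x2 for two set points x1, x2
profit = (3.0, 2.0)
A = [(1, 1), (2, 1), (1, 0), (0, 1), (-1, 0), (0, -1)]
b = [10, 14, 6, 8, 0, 0]   # shared capacity, utility limit, bounds
setpoints, value = solve_lp_2d(profit, A, b)
print(setpoints, value)
```

The optimum lies at a vertex where two constraints are active, which is why such optimizers tend to push the unit against its operating limits and rely on the MPC layer to hold it there.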

A natural extension of these ideas was to
consider the interrelations among the different
parts of the processing plants and to look for
the steady-state operating point that provides the
best economic return and minimum energy
expense or optimizes any other economic criterion
while satisfying the global production aims and
constraints. These optimization tasks are known
as real-time optimization (RTO) (► Real-Time
Optimization of Industrial Processes) and form
another layer of the control pyramid.

Finally, when we consider the whole plant 
operation, obvious links between the RTO and 
the planning and economic management of the 
company appear. In particular, the organization 
and optimization of the flows of raw materials, 
purchases, etc., involved in the supply chains 
present important challenges that are placed in
the top layer of Fig. 1.

This entry provides an overview of the different
layers and associated tasks so that the
reader can place in context the different
controllers and related functionalities and tools, as
well as appreciate the trends in process control
focusing the attention toward the higher levels of
the hierarchy and the optimal operation of
large-scale processes.

Control Hierarchy of Large Processing Plants: An
Overview, Fig. 2 MPC implementation with a local
optimizer

An Alternative View 

The implementation in a process factory of the
tasks and layers previously mentioned is possible
nowadays due to important advances in many
fields, such as modeling and identification,
control and estimation, optimization methods, and, in
particular, software tools, communications, and
computing power. Today it is rather common
to find in many process plants an information
network that also follows a pyramidal structure,
represented in Fig. 3.

At the bottom, there is the instrumentation
layer. Besides sensors and actuators connected
by the classical analog 4-20 mA signals, possibly
enhanced by the transmission of information to
and from the sensors by the HART protocol, it
includes digital field buses and smart
transmitters and actuators that incorporate
improved information and intelligence. New
functionalities, such as remote calibration,
filtering, self-test, and disturbance compensation,
provide more accurate measurements that
contribute to improving the functioning of local
controllers, as do the new methods and tools
available nowadays for instrument monitoring
and fault detection and diagnosis. The increased
installation of wireless transmitters and the
advances in analytical instrumentation will
undoubtedly lead to the development of a stronger
information base to support better decisions and
operations in the plants.

Information from transmitters is collected in
the control rooms that are the core of the second
layer. Many of them are equipped with distributed
control systems (DCS) that implement monitoring
and control tasks. Field signals are received
in the control cabinets where a large number
of microprocessors execute the data acquisition
and regulatory control tasks, sending signals back
to the field actuators. Internal buses connect the
controllers with the computers that support the
displays of the human-machine interface (HMI)
for the plant operators of the control room. In
the past, DCS were mostly in charge of the
regulatory control tasks, including basic control,
alarm management, and historians, while
interlocking systems related to safety and sequences
related to batch operations were implemented
either in the DCS or in programmable logic
controllers (PLCs) (► Programmable Logic
Controllers). Today, the bounds are not so clear, due to
the increase of the computing power of the PLCs
and the added functionalities of the DCS. Safety
instrumented systems (SIS) for the maintenance
of plant safety are usually implemented in
dedicated PLCs, if not hard-wired, but for the rest of
the functions, a combination of PLC-like
processors with I/O cards and SCADA (supervisory
control and data acquisition) systems is the
prevailing architecture. SCADA systems act as HMI
and information systems collecting large amounts of
data that can be used at other levels with different
purposes.

Above the basic and advanced control layer,
using the information stored in the SCADA
as well as other sources, there is an increasing
number of applications covering diverse fields.
Figure 3 depicts the perspective of the computing
and information flow architecture and includes
a level called supervisory control, placed in
direct connection with the control room and
the production tasks. It includes, for instance,
MPC with local optimizers, statistical process
control (SPC) for quality and production
supervision (► Multiscale Multivariate Statistical
Process Control), data reconciliation, inferences
and estimation of unmeasured quantities,
fault detection and diagnosis, or controller
performance monitoring (CPM) (► Controller
Performance Monitoring).

The information flow becomes more complex
when we move up from the basic control layer,
looking more like a web than a pyramid when we
enter the world of what can generally be called
asset (plant and equipment) management: a
collection of different activities oriented to
sustaining performance and economic return,
considering the entire life cycle of the assets and,
in particular, aspects such as maintenance,
efficiency, or production organization. Above the
supervisory layer, one can usually distinguish at
least two levels denoted generically as
manufacturing execution systems (MES) and
enterprise resource planning (ERP)
(Scholten 2009), as can be seen in Fig. 4.


MES are information systems that support the 
functions that a production department must 
perform in order to prepare and to manage 
work instructions, schedule production activities, 
monitor the correct execution of the production 
process, gather and analyze information about 
the production process, and optimize procedures. 
Notice that regarding the control of process units, 
up to this level no fundamental differences appear 
between continuous and batch processes. But at 
the MES level, which corresponds to RTO of 
Fig. 1, many process units may be involved, and 
the tools and problems are different, the main task 
in batch production being the optimal scheduling 
of those process units (►Scheduling of Batch 
Plants; Mendez et al. 2006). 

MES are part of a larger class of systems
called manufacturing operation management
(MOM) that cover not only the management of
production operations but also other functions
such as maintenance, quality, laboratory
information systems, or warehouse management.
One of their main tasks is to generate elaborate
information, quite often in the form of key
performance indicators (KPIs), with the purpose
of facilitating the implementation of corrective
actions.

Control Hierarchy of Large Processing Plants:
An Overview, Fig. 4 Software/hardware view

ERP systems represent the top of the pyramid,
corresponding to the enterprise business planning
activities that allow assigning global targets to
production scheduling. For many years, this level
was considered to be outside the scope of the field
of control, but nowadays supply chain management
is increasingly viewed and addressed as a
control and optimization problem in research.



Control Hierarchy of Large Processing Plants: An 
Overview, Fig. 5 Two possible implementations of RTO 


Future Control and Optimization at 
Plant Scale 

Going back to Fig. 1, the variety of control and
optimization problems increases as we move up
in the control hierarchy, entering the field of
dynamic process operations and considering not
only individual process units but also larger sets
of equipment or whole plants. Examples at the
RTO (or MES) level are optimal management of
shared resources or utilities, production
bottleneck avoidance, optimal energy use or maximum
efficiency, smooth transitions against production
changes, etc.

Above, we have mentioned RTO as the most
common approach for plant-wide optimization.
Normally, RTO systems perform the optimization
of an economic cost function using a nonlinear
process model in steady state and the corresponding
operational constraints to generate targets for
the control systems on the lower layers. The
implementation of RTO provides consistent
benefits by looking at the optimal operation problem
from a plant-wide perspective. Nevertheless, in
practice, when MPCs with local optimizers are
operating the process units, many coordination
problems appear between these layers, due to
differences in models and targets, so that driving the
operation of these process units in a coherent way
with the global economic targets is an additional
challenge.

A different perspective is taken by the
so-called self-optimizing control (Fig. 5 right,
Skogestad 2000) that, instead of implementing
the RTO solution online, uses it to design a
control structure that assures near-optimum
operation if some specially chosen variables are
maintained close to their targets.

Control Hierarchy of Large Processing Plants: An
Overview, Fig. 6 Direct dynamic optimization

As in any model-based approach, the problem
emerges of how to implement or modify the
theoretical optimum computed by RTO so that the
optimum computed with the model and the real
optimum of the process coincide in spite of model
errors, disturbances, etc. A common choice to
deal with this problem is to update the model
periodically using parameter estimation methods
or data reconciliation with plant data in steady
state. Also, uncertainty can be explicitly taken
into account by considering different scenarios
and optimizing the worst case, but this is
conservative and does not take advantage of the
plant measurements. Along this line, other
solutions have been proposed, such as
modifier-adaptation methods that use a fixed model and
process measurements to modify the optimization
problem so that the final result corresponds to the
process optimum (Marchetti et al. 2009) or the
use of stochastic optimization where several
scenarios are taken into account and future decisions
are used as recourse variables (Lucia et al. 2013).
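
The idea behind modifier adaptation can be sketched on a scalar toy problem. This is only an illustration under stated assumptions, not the algorithm of Marchetti et al. (2009) in full: the plant cost, the mismatched model cost, the filter gain, and all function names are hypothetical, only a first-order (gradient) modifier is used, and the plant gradient is estimated by perturbing the toy plant function where a real application would use process measurements.

```python
def plant_cost(u):
    """Hypothetical 'true' plant cost, unknown to the optimizer (optimum at u = 2)."""
    return 2.0 * (u - 2.0) ** 2


def model_gradient(u):
    """Gradient of the mismatched model cost (u - 1)^2 (optimum at u = 1)."""
    return 2.0 * (u - 1.0)


def argmin_modified_model(lam):
    """Minimizer of the modified model cost (u - 1)^2 + lam * u."""
    return 1.0 - lam / 2.0


def modifier_adaptation(n_iter=30, gain=0.4, du=1e-3):
    """Iterate a filtered gradient modifier until the model optimum moves to the plant optimum."""
    lam = 0.0
    u = argmin_modified_model(lam)
    for _ in range(n_iter):
        # plant gradient estimated from two perturbed evaluations
        grad_plant = (plant_cost(u + du) - plant_cost(u - du)) / (2 * du)
        # filtered update of the gradient modifier
        lam = (1 - gain) * lam + gain * (grad_plant - model_gradient(u))
        u = argmin_modified_model(lam)
    return u


print(modifier_adaptation())
```

With these numbers the iteration settles at the plant optimum u = 2 even though the model alone would suggest u = 1. The filter gain matters: in this toy case an unfiltered update (gain = 1) oscillates between two operating points instead of converging.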

RTO is formulated in steady state, but in
practice, the plants are in transients most of the
time, and there are many problems, such as start-up
optimization, that require a dynamic formulation.
A natural evolution in this direction is to combine
nonlinear MPC (NMPC) with economic optimization
so that the target of the NMPC is not set point
following but direct economic optimization, as in
the right-hand side of Fig. 6 (► Economic Model
Predictive Control; ► Model-Based Performance
Optimizing Control; Engell 2007).

Control Hierarchy of Large Processing Plants: An
Overview, Fig. 7 Hierarchical, price coordination, and
distributed approaches

The type of problems that can be formulated
within this framework is very wide, as are the
possible fields of application. Processes with
distributed parameter structure or mixtures of real
and on/off variables, batch and continuous units,
statistical distribution of particle sizes or
properties, etc., give rise to special types of NMPC
problems (see, e.g., Lunze and Lamnabhi-Lagarrigue
2009), but a common characteristic of all of them
is the fact that they are computationally intensive
and should be solved taking into account the
different forms of uncertainty always present.

Control and optimization are nowadays
inseparable and essential parts of any advanced
approach to dynamic process operation.
Progress in the field and the spread of
industrial applications are possible thanks to
the advances in optimization methods and tools
and the computing power available at the plant
level, but implementation is still a challenge
from many points of view, not only technical.
Few suppliers offer commercial products,
and finding optimal operation policies for a
whole factory is a complex task that requires
taking into consideration many aspects and
elaborate information not available directly
as process measurements. Solving large
NMPC problems in real time may require
breaking the associated optimization problem
into subproblems that can be solved in parallel.
This leads to several local controllers/optimizers,
each one solving one subproblem involving
variables of a part of the process and linked
by some type of coordination. This offers a
new point of view of the control hierarchy.
Typically, three types of architectures are
mentioned for dealing with this problem,
represented in Fig. 7. In the hierarchical
approach, coordination between local controllers
is made by an upper layer that deals with
the interactions, assigning targets to them. In
price coordination, the coordination task is
performed by a market-like mechanism that
assigns different prices to the cost functions of
every local controller/optimizer. Finally, in
the distributed approach, the local controllers
coordinate their actions by interchanging
information about their decisions or states with
neighbors (Scattolini 2009).
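
A price-coordination mechanism of the kind described above can be sketched with a simple dual-decomposition iteration. The example below is a hedged illustration, not a scheme taken from Scattolini (2009): it assumes two hypothetical units with concave quadratic benefit functions that compete for one shared resource, a single scalar price, and a fixed step size; the names `local_optimum` and `price_coordination` are invented for the sketch.

```python
def local_optimum(a, b, price):
    """Local optimizer: maximize a*x - 0.5*b*x**2 - price*x over x >= 0."""
    return max(0.0, (a - price) / b)


def price_coordination(units, capacity, step=0.5, n_iter=200):
    """Coordinator adjusts the resource price until total demand meets capacity."""
    price = 0.0
    for _ in range(n_iter):
        demands = [local_optimum(a, b, price) for a, b in units]
        # raise the price when the shared resource is over-subscribed,
        # lower it (never below zero) when capacity is left unused
        price = max(0.0, price + step * (sum(demands) - capacity))
    demands = [local_optimum(a, b, price) for a, b in units]
    return price, demands


# Two hypothetical units (a, b) sharing 6 units of a common utility
price, demands = price_coordination([(10.0, 1.0), (8.0, 2.0)], capacity=6.0)
print(price, demands)
```

Each local optimizer only needs its own model and the current price, while the coordinator only needs the aggregate demand. The step size has to be small enough for the price iteration to converge, which depends on the curvature of the local problems.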

Summary and Future Research 

Process control is a key element in the operation
of process plants. At the lowest layer, it can
be considered a mature, well-proven technology,
even if many problems, such as control structure
selection and controller tuning, are in reality often
not solved well. The range of problems under
consideration is continuously expanding to the
upper layers of the hierarchy, merging control
with process operation and optimization. This
creates new research challenges that range from
modeling and estimation to efficient large-scale
optimization and robustness against uncertainty,
and it may lead to large improvements in plant
operations.

Cross-References 

► Controller Performance Monitoring 

► Economic Model Predictive Control 

► Industrial MPC of Continuous Processes 

► Model-Based Performance Optimizing Control 

► Multiscale Multivariate Statistical Process Control

► Real-Time Optimization of Industrial Processes

Abstract

Closed-loop control can significantly improve the
performance of bioprocesses, e.g., by an increase
of the production rate of a target molecule or
by guaranteeing reproducibility of the production
with low variability. In contrast to the control
of chemical reaction systems, the biological
reactions take place inside cells which constitute
highly regulated, i.e., internally controlled
systems by themselves. As a result, through
evolution, the same cell can and will mimic a system
of first order in some situations and a
high-dimensional, highly nonlinear system in others.
A complete mathematical description of the
possible behaviors of the cell is still beyond reach
and would be far too complicated as a basis for
model-based process control. This makes
supervision, control, and optimization of biosystems
very demanding.

Keywords 

Bioprocess control; Control of uncertain systems;
Optimal control; Parameter identification;
State estimation; Structure identification;
Structured models

Introduction 

Biotechnology offers solutions to a broad
spectrum of challenges faced today, e.g., for
health care, remediation of environmental
pollution, new sources for energy supplies,
sustainable food production, and the supply of
bulk chemicals. To explain the needs for control
of bioprocesses, especially for the production of
high-value and/or large-volume compounds, it
is instructive to look at the development
of a new process. If a potential strain is found
or genetically engineered, the biologist will
determine favorable environmental factors for the
growth of the cells and for the production of the
target product. These factors typically comprise the
levels of temperature, pH, dissolved oxygen, etc.
Moreover, concentration regions for the nutrients,
precursors, and so-called trace elements are
specified. Whereas for the former variables
"optimal" setpoints are often provided which,
at least in smaller scale reactors, can be easily
maintained by independent classically designed
controllers, information about the best nutrient
supply is incomplete from a control engineering
point of view. It is this dynamic nutrient supply
which is most often not revealed in the biological
laboratory and which nevertheless offers substantial
room for production improvements by control.

Irrespective of whether bacteria, yeasts, fungi, or
animal cells are used for production, these cells
will consist of thousands of different compounds
which react with each other in hundreds or more
reactions. All reactions are tightly regulated on a
molecular and genetic basis; see ► Deterministic
Description of Biochemical Networks. For
so-called unlimited growth conditions, all cellular
compartments will be built up with the same
specific growth rate, meaning that the cellular
composition will not change over time. In a
mathematical model describing growth and
production, only one state variable will be needed
to describe the biotic phase. This will give rise
to unstructured models; see below. Whenever a
cell enters a limitation, which is often needed
for production, the cell will start to reorganize
its internal reaction pathways. Model-based
approaches to supervision and control that rely on
unstructured models are then bound to fail. More
biotic state variables are needed. However, it is
not clear which and how many. As a result,
modeling of limiting behaviors is challenging and
crucial for the control of biotechnological processes.
It requires a large amount of process-specific
information. Moreover, model-based estimates of
the state of the cell and of the environment are a
key factor as online measurements of the internal
processes in the cell and of the nutrient
concentrations are usually impossible. Finally, as models
used for process control have to be limited in size
and thus only give an approximate description,
robustness of the methods has to be addressed.

Mathematical Models 

For the production of biotechnical goods, many
up- and downstream unit operations are involved
besides the biological reactions. As these pose
no typical bio-related challenges, we will
concentrate here on the cultivation of the organisms only.
This is mostly performed in aqueous solutions in
special bioreactors through which air is sparged
for a supply with oxygen. In some cases, other
gases are supplied as well; see Fig. 1.
Disregarding wastewater treatment plants, most
cultivations are still performed in a fed-batch mode,
meaning that a small amount of cells and part
of the nutrients are put into the reactor initially.
Then more nutrients and correcting fluids, e.g.,
for pH or antifoam control, are added with
variable rates leading to an unsteady behavior. The
system to be modeled consists of the gaseous, the
liquid, and the biotic phase inside the reactor. For
the former ones, balance equations can be
formulated readily. The biotic phase can be modeled
in a structured or unstructured way. Moreover, as
not all cells behave similarly, this may give rise to
a segregated model formulation which is omitted
here for brevity.

Unstructured Models 

If the biotic phase is represented by just one
state variable, $m_X$, a typical example of a simple
unstructured model of the liquid phase would be

$$
\begin{aligned}
\dot{m}_X &= \mu_X m_X \\
\dot{m}_P &= \mu_P m_X \\
\dot{m}_S &= -a_1 \mu_X m_X - a_2 \mu_P m_X + c_{S,\mathrm{feed}}\, u \\
\dot{m}_O &= a_3 (a_4 - c_O) - a_5 \mu_X m_X - a_6 \mu_P m_X \\
\dot{V} &= u
\end{aligned}
$$

with the masses $m_i$ with $i = X, P, S, O$ for
cells, product, substrate or nutrient, and dissolved
oxygen, respectively. The volume is given by $V$,
and the specific growth and production rates $\mu_X$
and $\mu_P$ depend on concentrations $c_i = m_i/V$,
e.g., of the substrate $S$ or oxygen $O$ according to
formal kinetics, e.g.,

$$
\mu_X = \frac{a_7\, c_S\, c_O}{(c_S + a_8)(c_O + a_9)}, \qquad
\mu_P = \frac{a_{10}\, c_S}{a_{11} c_S^2 + c_S + a_{12}}
$$

Control of Biotechnological Processes, Fig. 1 Modern laboratory reactor platform for control-oriented process
development

Control of Biotechnological Processes, Table 1 Multiplicative rates depending on several concentrations $c_1, \ldots, c_k$
with possible kinetic terms

The nutrient supply can be changed by the feed
rate $u(t)$ as a control input, with inflow concentration
$c_{S,\mathrm{feed}}$. Very often, just one feed stream
is considered in unstructured models. As all parameters
$a_i$ have to be identified from noisy and
infrequently sampled data, a low-dimensional
nonlinear uncertain model results. All steps prior
to the cultivation in which, e.g., from frozen
cells, enough cells are produced to start the
fermentation add to the uncertainty. Whereas the
balance equations follow from first-principles-based
modeling, the structure of the kinetics $\mu_X$
and $\mu_P$ is unknown, i.e., empirical relations are
exploited. Many different kinetic expressions can
be used here; see Bastin and Dochain (1990) or a
small selection shown in Table 1.

It has to be pointed out that, most often,
neither $c_X$, $c_P$, nor $c_S$ is measured online. As
the measurement of $c_O$ might be unreliable, the
exhaust gas concentration of the gaseous phase
is the main online measurement which can be
used by employing an additional balance equation
for the gaseous phase. Infrequent at-line
measurements, though, are sometimes available
for X, P, S, especially at the lab-scale during
process development.
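
As an illustration of how such an unstructured model behaves in fed-batch mode, the following minimal Python sketch integrates the five balance equations with an explicit Euler scheme. All parameter values, the constant feed rate, the initial state, and the minus signs assumed in front of the oxygen consumption terms are hypothetical choices made only for this sketch; they are not identified from any real cultivation.

```python
# Minimal fed-batch simulation of the unstructured model.
# All numbers (a_1..a_12, c_S_feed, feed rate, initial state) are hypothetical.

def rates(c_s, c_o, a):
    """Formal kinetics for the specific growth rate mu_X and production rate mu_P."""
    mu_x = a[7] * c_s * c_o / ((c_s + a[8]) * (c_o + a[9]))
    mu_p = a[10] * c_s / (a[11] * c_s ** 2 + c_s + a[12])
    return mu_x, mu_p

def rhs(state, u, a, c_s_feed):
    """Right-hand side of the balances for (m_X, m_P, m_S, m_O, V)."""
    m_x, m_p, m_s, m_o, vol = state
    c_s, c_o = max(m_s, 0.0) / vol, max(m_o, 0.0) / vol
    mu_x, mu_p = rates(c_s, c_o, a)
    return (
        mu_x * m_x,
        mu_p * m_x,
        -a[1] * mu_x * m_x - a[2] * mu_p * m_x + c_s_feed * u,
        # Assumed signs: oxygen transfer minus consumption by growth and production.
        a[3] * (a[4] - c_o) - a[5] * mu_x * m_x - a[6] * mu_p * m_x,
        u,
    )

def simulate(state, feed, a, c_s_feed, t_end, dt=0.01):
    """Explicit Euler integration; feed(t) returns the feed rate u(t)."""
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        d = rhs(state, feed(t), a, c_s_feed)
        state = tuple(x + dt * dx for x, dx in zip(state, d))
        t += dt
    return state

# Hypothetical parameters, indexed so that A[i] corresponds to a_i.
A = [None, 2.0, 1.0, 50.0, 0.008, 0.02, 0.01, 0.3, 0.1, 0.001, 0.05, 0.5, 0.2]
x0 = (1.0, 0.0, 5.0, 0.008, 1.0)   # m_X, m_P, m_S, m_O, V
final = simulate(x0, lambda t: 0.02, A, c_s_feed=100.0, t_end=5.0)
```

With a constant feed, the volume grows linearly while biomass and product accumulate. In practice a stiff or adaptive integrator would likely be preferred over explicit Euler, which is used here only to keep the sketch self-contained.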


Structured Models 

In structured models, the changing composition
and reaction pathways of the cell are accounted for.
As detailed information about the cell’s complete
metabolism including all regulations is missing
for the majority if not all cells exploited in bioprocesses,
an approximative description is used.
Examples are models in which a part of the real
metabolism is described on a mechanistic level,
whereas the rest is lumped together into one or
very few states (Goudar et al. 2006), cybernetic
models (Varner and Ramkrishna 1998), or compartment
models (King 1997). As an example,
all compartment models can be written down as

$$
\begin{aligned}
\dot{m} &= A\,\mu(c) + f_{\mathrm{in}}(u) + f_{\mathrm{out}}(u) \\
\dot{V} &= \sum_i u_i
\end{aligned}
$$

with vectors of streams into and out of the reaction
mixture, $f_{\mathrm{in}}$ and $f_{\mathrm{out}}$, which depend
on control inputs $u$; a matrix of (stoichiometric)
parameters, $A$; a vector of reaction rates
$\mu = \mu(c)$; and, finally, a vector $m$ comprising
substrates, products, and more than one biotic
state. These biotic states can be motivated, for
example, by physiological arguments, describing
the total amounts of macromolecules in the cell,
such as the main building blocks DNA, RNA, and
proteins. In very simple compartment models, the
cell is only divided up into what is called active
and inactive biomass. Again, all coefficients
in $A$ and the structure and the coefficients of
all entries in $\mu(c)$ (see Table 1) are unknown
and have to be identified based on experimental
data. Issues of structural and practical identifiability
are of major concern. For models of system
biology (see ► Deterministic Description of
Biochemical Networks), algebraic equations are
added that describe the dependencies between individual
fluxes. Then at least part of $A$ is known.

Describing the biotic phase with a higher degree
of granularity does not change the measurement
situation in the laboratory or in the
production scale, i.e., still only very few online
measurements will be available for control.
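
To make the compartment form concrete, the following Python sketch evaluates the right-hand side $A\,\mu(c) + f_{\mathrm{in}}(u) + f_{\mathrm{out}}(u)$ for a very simple model with active and inactive biomass. The stoichiometric matrix, the two reactions, the kinetic constants, and the scaling of the rates by the volume are assumptions made only for illustration.

```python
# Sketch of a compartment model  m' = A*mu(c) + f_in(u) + f_out(u),  V' = sum(u_i).
# Matrix entries, kinetics, and all numbers below are hypothetical.

C_S_FEED = 100.0   # hypothetical substrate concentration in the first feed

# Rows: substrate, active biomass, inactive biomass.
# Columns: growth reaction, inactivation reaction.
A = [[-2.0, 0.0],
     [1.0, -1.0],
     [0.0, 1.0]]

def reaction_rates(c, vol):
    """Growth of active biomass on substrate and first-order inactivation.
    Rates are scaled by the volume (an assumption) to obtain mass per time."""
    c_s, c_act, _ = c
    growth = 0.4 * c_s / (c_s + 0.05) * c_act * vol
    inactivation = 0.02 * c_act * vol
    return [growth, inactivation]

def rhs(m, vol, u):
    """Return (dm/dt, dV/dt) for masses m, volume vol, and feed rates u."""
    c = [m_i / vol for m_i in m]
    mu = reaction_rates(c, vol)
    f_in = [C_S_FEED * u[0], 0.0, 0.0]   # only the first feed carries substrate
    f_out = [0.0, 0.0, 0.0]              # fed-batch: nothing leaves the reactor
    dm = [sum(a * r for a, r in zip(row, mu)) + fi + fo
          for row, fi, fo in zip(A, f_in, f_out)]
    return dm, sum(u)
```

Without feeding, the quantity $m_S + 2(m_{\mathrm{act}} + m_{\mathrm{inact}})$ is conserved by this particular choice of $A$, which gives a quick consistency check on the stoichiometry.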


Identification 

Even if the growth medium initially “only” consists
of some 10-20 different, chemically well-defined
substances, from which only few are
described in the model, this situation will change
over the cultivation time as the organisms release
further compounds from which only few may be
known. If, for economic reasons, complex raw
materials are used, even the initial composition
is unknown. Hence, measuring the concentrations
of some of the compounds of the large set of
substances as a basis for modeling is not trivial.
For structured models, intracellular substances
have to be determined additionally. These are
embedded in an even larger matrix of compounds
making chemical analysis more difficult. Therefore,
the basis for parameter and structure identification
is uncertain.

As the expensive experiments and chemical
analysis tasks are very time consuming, sometimes
lasting up to several weeks, methods of
optimal experimental design should always be
considered in biotechnology; see ► Experiment
Design and Identification for Control.

The models to be built up should possess
some predictive capability for a limited range
of environmental conditions. This rules out unstructured
models for many practical situations.
However, for process control, the models should
still be of manageable complexity. Medium-sized
structured models seem to be well suited for
such a situation. The choice of biotic states in
$m$ and possible structures for the reaction rates
$\mu_i$, however, is hardly supported by biological
or chemical evidence. As a result, a combined
structure and parameter identification problem
has to be solved. The choices of possible terms
$\mu_{ij}$ in all $\mu_i$ give rise to a problem that exhibits
a combinatorial explosion. Although approaches
exist to support this modeling step (see Herold
and King 2013 or Mangold et al. 2005), finally,
the modeler will have to settle with a compromise
with respect to the accuracy of the model found
versus the number of fully identified model candidates.
As a result, all control methods applied
should be robust in some sense.
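
The size of this combinatorial problem can be estimated with a few lines of Python. The sketch assumes that every rate is a product of one kinetic term per concentration and that each term is picked from a small library; the library of four term types is hypothetical.

```python
from itertools import product

# Hypothetical library of kinetic terms available for each (rate, concentration) pair.
TERMS = ("none", "monod", "inhibition", "haldane")

def count_candidates(n_rates, n_concs, n_terms=len(TERMS)):
    """Number of model structures if every term is chosen independently."""
    return n_terms ** (n_rates * n_concs)

def enumerate_structures(n_rates, n_concs):
    """Lazily enumerate all structures as tuples of term names."""
    return product(TERMS, repeat=n_rates * n_concs)
```

Under these assumptions, two rates depending on two concentrations already give 256 structures, and five rates with four concentrations give $4^{20}$, roughly $10^{12}$, which is why only a subset of candidates can be fully identified.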

Soft Sensors 

Despite many advances in the development
of online measurements (see Mandenius and
Titchener-Hooker 2013), systems for supervision
and control of biotechnical processes often
include model-based estimation schemes, such
as extended Kalman filters (EKF); see ► Kalman
Filters. Concentration estimates are needed for
unmeasured substances and for quantities which
depend on these concentrations like the growth
rate of the cells. In real applications, formulations
have to be used which account for delays in
laboratory analysis of up to several hours and for
situations in which results from the laboratory
will not be available in the same sequence as
the samples were taken. An example from a
real cultivation is shown in Fig. 2. Here, the at-line
measurement of the biomass concentration,
$c_X = m_X/V$, is the only measurement available.
The result of a single measurement is obtained
about 30 min after sampling. For reference,
inaccessible state variables, which were analyzed
later, are shown as well along with the online
estimates. The scatter of the data, especially of
DNA and RNA, gives a qualitative impression of
the measurement accuracy in biotechnology.
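
The basic predict/update cycle of such a soft sensor can be sketched for a single state in Python. The logistic growth model, the noise variances, and the sampling pattern below are hypothetical and merely stand in for a real process model; the handling of delayed or out-of-sequence laboratory results mentioned above is deliberately left out.

```python
def model(x, dt, mu_max, cap):
    """One explicit Euler step of logistic growth (hypothetical process model)."""
    return x + dt * mu_max * x * (1.0 - x / cap)

def ekf_step(x, p, y, dt, mu_max, cap, q, r):
    """One scalar EKF cycle. x, p: estimate and variance; y: sample or None."""
    jac = 1.0 + dt * mu_max * (1.0 - 2.0 * x / cap)   # d(model)/dx at the estimate
    x = model(x, dt, mu_max, cap)                     # nonlinear prediction
    p = jac * p * jac + q                             # linearized variance propagation
    if y is not None:                                 # update only if a sample exists
        gain = p / (p + r)
        x = x + gain * (y - x)
        p = (1.0 - gain) * p
    return x, p

def run(n_steps=100, sample_every=10):
    """Track a simulated biomass with infrequent, noise-free at-line samples."""
    dt, mu_max, cap = 0.1, 0.3, 20.0
    x_true, x_est, p = 1.0, 0.5, 1.0      # estimator starts with a wrong guess
    for k in range(1, n_steps + 1):
        x_true = model(x_true, dt, mu_max, cap)
        y = x_true if k % sample_every == 0 else None
        x_est, p = ekf_step(x_est, p, y, dt, mu_max, cap, q=1e-3, r=1e-2)
    return x_true, x_est, p
```

Between samples the estimate is propagated by the model alone, so its quality depends directly on the model accuracy discussed in the identification section.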

Control 

Control of Biotechnological Processes, Fig. 2 Estimation
of states of a structured model with an EKF with an
unexpected growth delay initially. At-line measurement
$m_X$ (red filled circles), initially predicted evolution of
states (black), online estimated evolution (blue), off-line
data analyzed after the experiment (open red circles) (Data
obtained by T. Heine)

Besides the relatively simple control of physical
parameters, such as temperature, pH, dissolved
oxygen, or carbon dioxide concentration, only
few biotic variables are typically controlled
with respect to a setpoint. The most prominent
example is the growth rate of the biomass with
the goal to reach a high cell concentration
in the reactor as fast as possible. This is the
predominant goal when the cells are the primary
target as in baker’s yeast cultivations or when
the expression of the desired product is growth
associated. For other non-growth-associated
products, a high cell mass is desirable as well,
as production is proportional to the amount of
cells. If the nutrient supply is maintained above a
certain level, unlimited growth behavior results,
allowing the use of unstructured models for
model-based control. An excess of nutrients
has to be avoided, though, as some organisms,
like baker’s yeast, will initiate an overflow
metabolism, with products which may be
inhibitory in later stages of the cultivation. For
some products, such as the antibiotic penicillin,
the organism has to grow slowly to obtain
a high production rate. For these so-called
secondary metabolites, low but not vanishing
concentrations for some limiting substrates
are needed. If setpoints are given for these
concentrations instead, this can pose a rather
challenging control problem. As the organisms
try to grow exponentially, the controller must
be able to increase the feed exponentially as
well. The difficulty mainly arises from the
inaccurate and infrequent measurements that
the soft sensors/controller has to work with
and from the danger that an intermediate
shortage or oversupply with nutrients may switch
the metabolism to an undesired state of low
productivity.
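
Why the feed has to grow exponentially can be seen from the substrate balance of the unstructured model. If the substrate mass is to stay approximately constant while the cells grow with a constant specific rate, the supply $c_{S,\mathrm{feed}}\,u$ must match the consumption $a_1 \mu_X m_X$, with $m_X(t) = m_X(0)\,e^{\mu_X t}$. The following Python sketch computes this open-loop profile; neglecting the product term $a_2 \mu_P m_X$ and the numerical values used in any call are assumptions of this sketch.

```python
import math

def exponential_feed(t, mu_set, m_x0, a1, c_s_feed):
    """Feed rate u(t) that supplies substrate at the rate consumed by biomass
    growing at the constant specific rate mu_set (product formation neglected)."""
    return a1 * mu_set * m_x0 * math.exp(mu_set * t) / c_s_feed
```

For example, with the hypothetical values `mu_set=0.1`, `m_x0=1.0`, `a1=2.0`, and `c_s_feed=100.0`, the feed starts at 0.002 and doubles every $\ln 2/\mu_{\mathrm{set}}$ time units. Such a profile is purely feedforward and therefore inherits every error in $m_X(0)$ and $a_1$, which is one reason for closing the loop.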

For control of biotechnical processes, 
many methods explained in this encyclopedia 
including feedforward, feedback, model-based, 
optimal, adaptive, fuzzy, neural nets, etc., can 
be and have been used (cf. Dochain 2008; 
Gnoth et al. 2008; Rani and Rao 1999). 
As in other areas of application, (robust) 
model-predictive control schemes (MPC) (see 
► Industrial MPC of Continuous Processes) are 
applied with great success in biotechnology.

Control of Biotechnological Processes, Fig. 3 MPC
control and state estimation of a cultivation with S. tendae.
At-line measurement $m_X$ (red filled circles), initially
predicted evolution of states (black), online estimated
evolution (blue), off-line data analyzed after the experiment
(open red circles). Off-line optimal feeding profiles $u_i$
(blue broken line), MPC-calculated feeds (black, solid)
(Data obtained by T. Heine)

For the antibiotic production shown in Fig. 3,
optimal feeding profiles $u_i$ for ammonia (AM),
phosphate (PH), and glucose (C) were calculated
before the experiment was performed in a
trajectory optimization such that the final
mass of the desired antibiotic nikkomycin
(Ni) was maximized. This resulted in the blue
broken lines for the feeds $u_i$. However, due to
disturbances and model inaccuracies, an MPC 
scheme had to significantly change the feeding 
profiles, to actually obtain this high amount 
of nikkomycin; see the feeding profiles given 
in black solid lines. This example shows that, 
especially in biotechnology, off-line trajectory 
planning has to be complemented by closed-loop 
concepts. 

On the other hand, the experimental data given
in Fig. 2 show that significant disturbances, such
as an unexpected initial growth delay, may occur
in real systems as well. For this reason, the
classical receding horizon MPC with an off-line
determined optimal reference trajectory will not
always be the best solution, and an online optimization
over the whole horizon has a larger
potential (cf. Kawohl et al. 2007).

Summary and Future Directions 

Advanced process control including soft sensors
can significantly improve biotechnical processes.
Using these techniques promotes quality
and reproducibility of processes (Junker and
Wang 2006). These methods should, however,
not only be exploited in the production scale.
For new pharmaceutical products, the time to
market is the decisive factor. Methods of (model-based)
monitoring and control can help here to
speed up process development. For a few years now,
a clear trend can be seen in biotechnology to
miniaturize and parallelize process development
using multi-fermenter systems and robotic technologies.
This trend gives rise to new challenges
for modeling on the basis of huge data sets and
for control in very small scales. At the same
time, it is expected that a continued increase
of information from bioinformatic tools will be
available which has to be utilized for process
control as well. Going to large-scale cultivations
adds further spatial dimensions to the problem.
Now, the assumption of a well-stirred, ideally
mixed reactor no longer holds. Substrate
concentrations will be space dependent. Cells
will frequently experience changes between good
and bad nutrient environments. Thus, mass transfer has
to be accounted for, leading to partial differential
equations as models for the process.

Cross-References 

► Control and Optimization of Batch Processes 

► Deterministic Description of Biochemical 
Networks 

► Extended Kalman Filters 

► Experiment Design and Identification for 
Control 

► Industrial MPC of Continuous Processes 

► Nominal Model-Predictive Control 

► Nonlinear System Identification: An Overview 
of Common Approaches

Abstract

We introduce control and stabilization issues for
fluid flows along with known results in the field.
Some models coupling fluid flow equations and
equations for rigid or elastic bodies are presented,
together with a few controllability and stabilization
results.

Keywords 

Control; Fluid flows; Fluid-structure systems; 
Stabilization 


Some Fluid Models 

We consider a fluid flow occupying a bounded
domain $\Omega_F \subset \mathbb{R}^N$, with $N = 2$ or $N = 3$,
at the initial time $t = 0$, and a domain $\Omega_F(t)$
at time $t > 0$. Let us denote by $\rho(x,t) \in \mathbb{R}^+$
the density of the fluid at time $t$ at the point
$x \in \Omega_F(t)$ and by $u(x,t) \in \mathbb{R}^N$ its velocity.
The fluid flow equations are derived by writing
the mass conservation

$$
\frac{\partial \rho}{\partial t} + \mathrm{div}(\rho u) = 0 \quad \text{in } \Omega_F(t), \text{ for } t > 0, \tag{1}
$$

and the balance of momentum

$$
\rho\left(\frac{\partial u}{\partial t} + (u \cdot \nabla)u\right) = \mathrm{div}\,\sigma + \rho f \quad \text{in } \Omega_F(t), \text{ for } t > 0, \tag{2}
$$

where $\sigma$ is the so-called constraint tensor (the stress tensor) and
$f$ represents a volumic force. For an isothermal
fluid, there is no need to complete the system
by the balance of energy. The physical nature
of the fluid flow is taken into account in the
choice of the constraint tensor $\sigma$. When the volume
is preserved by the fluid flow transport, the
fluid is called incompressible. The incompressibility
condition reads as $\mathrm{div}\, u = 0$ in $\Omega_F(t)$.
The incompressible Navier-Stokes equations are
the classical model to describe the evolution of
isothermal incompressible and Newtonian fluid
flows. When in addition the density of the fluid
is assumed to be constant, $\rho(x,t) = \rho_0$, the
equations reduce to

$$
\mathrm{div}\, u = 0, \qquad
\rho_0\left(\frac{\partial u}{\partial t} + (u \cdot \nabla)u\right) = \nu \Delta u - \nabla p + \rho_0 f
\quad \text{in } \Omega_F(t),\ t > 0, \tag{3}
$$

which are obtained by setting

$$
\sigma = \nu\left(\nabla u + (\nabla u)^T\right) + \left(\mu - \frac{2\nu}{3}\right) \mathrm{div}\, u\, I - p I, \tag{4}
$$

in Eq. (2). When $\mathrm{div}\, u = 0$, the expression of $\sigma$
simplifies. The coefficients $\nu > 0$ and $\mu > 0$ are
the viscosity coefficients of the fluid, and $p(x,t)$
its pressure at the point $x \in \Omega_F(t)$ and at time
$t > 0$.

This model has to be completed with boundary
conditions on $\partial\Omega_F(t)$ and an initial condition at
time $t = 0$.

The incompressible Euler equations with constant
density are obtained by setting $\nu = 0$ in the
above system.

The compressible Navier-Stokes system is obtained
by coupling the equation of conservation
of mass Eq. (1) with the balance of momentum
Eq. (2), where the tensor $\sigma$ is defined by Eq. (4),
and by completing the system with a constitutive
law for the pressure.

Control Issues


There are unstable steady states of the Navier-Stokes
equations which give rise to interesting
control problems (e.g., to maximize the ratio
“lift over drag”), but which cannot be observed
in real life because of their unstable nature. In
such situations, we would like to maintain the
physical model close to an unstable steady state
by the action of a control expressed in feedback
form, that is, as a function either depending on
an estimation of the velocity or depending on the
velocity itself. The estimation of the velocity of
the fluid may be recovered by using some real-time
measurements. In that case, we speak of a
feedback stabilization problem with partial information.
Otherwise, when the control is expressed
in terms of the velocity itself, we speak of a feedback
stabilization problem with full information.

Another interesting issue is to maintain a fluid 
flow (described by the Navier-Stokes equations) 
in the neighborhood of a nominal trajectory (not 
necessarily a steady state) in the presence of 
perturbations. This is a much more complicated 
issue which is not yet solved. 

In the case of a perturbation in the initial
condition of the system (the initial condition at
time $t = 0$ is different from the nominal velocity
field at time $t = 0$), the exact controllability
to the nominal trajectory consists in looking for
controls driving the system in finite time to the
desired trajectory.

Thus, control issues for fluid flows are those
encountered in other fields. However, there are
specific difficulties which make the corresponding
problems challenging. When we deal with the
incompressible Navier-Stokes system, the pressure
plays the role of a Lagrange multiplier associated
with the incompressibility condition. Thus,
we have to deal with an infinite-dimensional nonlinear
differential algebraic system. In the case
of a Dirichlet boundary control, the elimination
of the pressure, by using the so-called Leray or
Helmholtz projector, leads to an unusual form of
the corresponding control operator; see Raymond
(2006). In the case of an internal control, the
estimation of the pressure to prove observability
inequalities is also quite tricky; see Fernandez-Cara
et al. (2004). From the numerical viewpoint,
the approximation of feedback control laws leads
to very large-size problems, and new strategies
have to be found for tackling these issues.

Moreover, the issues that we have described
for the incompressible Navier-Stokes equations
may be studied for other models like the
compressible Navier-Stokes equations, the
Euler equations (describing nonviscous fluid
flows) both for compressible and incompressible
models, or even more complicated
models.


Feedback Stabilization of Fluid Flows 

Let us now describe what are the known results
for the incompressible Navier-Stokes equations
in 2D or 3D bounded domains, with a control acting
locally in a Dirichlet boundary condition. Let
us consider a given steady state $(u_s, p_s)$ satisfying
the equation

$$
-\nu \Delta u_s + (u_s \cdot \nabla)u_s + \nabla p_s = f_s
\quad \text{and} \quad \mathrm{div}\, u_s = 0 \quad \text{in } \Omega_F,
$$

with some boundary conditions which may be
of Dirichlet type or of mixed type (Dirichlet-Neumann-Navier
type). For simplicity, we only
deal with the case of Dirichlet boundary conditions

$$
u_s = g_s \quad \text{on } \partial\Omega_F,
$$

where $g_s$ and $f_s$ are time-independent functions.
In the case $\Omega_F(t) = \Omega_F$, not depending on $t$, the
corresponding instationary model is

$$
\begin{aligned}
&\frac{\partial u}{\partial t} - \nu \Delta u + (u \cdot \nabla)u + \nabla p = f_s
\ \text{ and } \ \mathrm{div}\, u = 0 \quad \text{in } \Omega_F \times (0, \infty), \\
&u = g_s + \sum_{i=1}^{N_c} f_i(t)\, g_i \quad \text{on } \partial\Omega_F \times (0, \infty), \\
&u(0) = u_0 \quad \text{on } \Omega_F.
\end{aligned}
\tag{5}
$$

In this model, we assume that $u_0 \neq u_s$, $g_i$ are
given functions with localized supports in $\partial\Omega_F$
and $f(t) = (f_1(t), \ldots, f_{N_c}(t))$ is a finite-dimensional
control. Due to the incompressibility
condition, the functions $g_i$ have to satisfy

$$
\int_{\partial\Omega_F} g_i \cdot n = 0,
$$

where $n$ is the unit normal to $\partial\Omega_F$, outward $\Omega_F$.

The stabilization problem, with a prescribed
decay rate $-\alpha < 0$, consists in looking for a
control $f$ in feedback form, that is, of the form

$$
f(t) = K(u(t) - u_s), \tag{6}
$$

such that the solution to the Navier-Stokes system
Eq. (5), with $f$ defined by Eq. (6), obeys

$$
\| e^{\alpha t}(u(t) - u_s) \|_Z \le \varphi\left(\| u_0 - u_s \|_Z\right),
$$

for some norm $Z$, provided $\| u_0 - u_s \|_Z$ is small
enough and where $\varphi$ is a nondecreasing function.
The mapping $K$, called the feedback gain, may
be chosen linear.

The usual procedure to solve this stabilization 
problem consists in writing the system satisfied 
by u — u s , in linearizing this system, and in 
looking for a feedback control stabilizing this 
linearized model. The issue is first to study the 
stabilizability of the linearized model and, when 
it is stabilizable, to find a stabilizing feedback 
gain. Among the feedback gains that stabilize 
the linearized model, we have to find one able 
to stabilize, at least locally, the nonlinear system 
too. 

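The linearized controlled system associated
with Eq. (5) is

$$
\begin{aligned}
&\frac{\partial v}{\partial t} - \nu \Delta v + (u_s \cdot \nabla)v + (v \cdot \nabla)u_s + \nabla q = 0
\ \text{ and } \ \mathrm{div}\, v = 0 \quad \text{in } \Omega_F \times (0, \infty), \\
&v = \sum_{i=1}^{N_c} f_i(t)\, g_i \quad \text{on } \partial\Omega_F \times (0, \infty), \\
&v(0) = v_0 \quad \text{on } \Omega_F.
\end{aligned}
\tag{7}
$$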
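
The step from Eq. (5) to Eq. (7) can be made explicit. The following short derivation is only a formal sketch; the notation $v = u - u_s$, $q = p - p_s$, $v_0 = u_0 - u_s$ is the natural one but is introduced here for illustration. Subtracting the steady-state equation from Eq. (5) and expanding the convective term gives

$$
(u \cdot \nabla)u - (u_s \cdot \nabla)u_s = (u_s \cdot \nabla)v + (v \cdot \nabla)u_s + (v \cdot \nabla)v,
$$

so that $(v, q)$ satisfies

$$
\frac{\partial v}{\partial t} - \nu \Delta v + (u_s \cdot \nabla)v + (v \cdot \nabla)u_s + (v \cdot \nabla)v + \nabla q = 0,
\qquad \mathrm{div}\, v = 0,
$$

with $v = \sum_{i=1}^{N_c} f_i(t)\, g_i$ on the boundary, since the time-independent data $f_s$ and $g_s$ cancel. Dropping the quadratic term $(v \cdot \nabla)v$, which is of second order for small perturbations, yields the linearized system.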

The easiest way for proving the stabilizability of
the controlled system Eq. (7) is to verify the Hautus
criterion. It consists in proving the following
unique continuation result. If $(\phi_j, \psi_j, \lambda_j)$ is the
solution to the eigenvalue problem

$$
\begin{aligned}
&\lambda_j \phi_j - \nu \Delta \phi_j - (u_s \cdot \nabla)\phi_j + (\nabla u_s)^T \phi_j + \nabla \psi_j = 0
\ \text{ and } \ \mathrm{div}\, \phi_j = 0 \quad \text{in } \Omega_F, \\
&\phi_j = 0 \quad \text{on } \partial\Omega_F, \qquad \mathrm{Re}\, \lambda_j \ge -\alpha,
\end{aligned}
\tag{8}
$$

and if in addition $(\phi_j, \psi_j)$ satisfies

$$
\int_{\partial\Omega_F} g_i \cdot \sigma(\phi_j, \psi_j)\, n = 0 \quad \text{for all } 1 \le i \le N_c,
$$

then $(\phi_j, \psi_j) = 0$. By using a unique continuation
theorem due to Fabre and Lebeau (1996),
we can explicitly determine the functions $g_i$ so
that this condition is satisfied; see Raymond and
Thevenet (2010). For feedback stabilization results
of the Navier-Stokes equations in two or
three dimensions, we refer to Fursikov (2004),
Raymond (2006), Barbu et al. (2006), Raymond
(2007), Badra (2009), and Vazquez and Krstic
(2008).
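
For orientation, it may help to recall the finite-dimensional counterpart of this criterion. The following statement concerns a generic system $z' = Az + Bf$ with $A \in \mathbb{R}^{n \times n}$ and $B \in \mathbb{R}^{n \times N_c}$; the symbols $z$, $A$, $B$ are introduced only for this comparison and are not part of the fluid model. The pair $(A, B)$ can be stabilized with decay rate $-\alpha$ if and only if

$$
\operatorname{rank}\,[\lambda I - A,\ B] = n \quad \text{for all } \lambda \in \mathbb{C} \text{ with } \mathrm{Re}\, \lambda \ge -\alpha,
$$

or equivalently

$$
A^* \phi = \bar{\lambda} \phi, \quad B^* \phi = 0, \quad \mathrm{Re}\, \lambda \ge -\alpha \quad \Longrightarrow \quad \phi = 0.
$$

In this analogy, the eigenvalue problem for $(\phi_j, \psi_j)$ plays the role of the adjoint eigenvalue equation, and the boundary integrals of $g_i \cdot \sigma(\phi_j, \psi_j)\, n$ play the role of the components of $B^* \phi$.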


Controllability to Trajectories of Fluid 
Flows 

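If $(\bar{u}(t), \bar{p}(t))_{0 \le t < \infty}$ is a solution to the Navier-Stokes
system, the controllability problem to the
trajectory $(\bar{u}(t), \bar{p}(t))_{0 \le t < \infty}$, in time $T > 0$,
may be rewritten as a null controllability problem
satisfied by $(v, q) = (u - \bar{u}, p - \bar{p})$. The local
null controllability in time $T > 0$ follows from
the null controllability of the linearized system
and from a fixed point argument. The linearized
controlled system is

$$
\begin{aligned}
&\frac{\partial v}{\partial t} - \nu \Delta v + (\bar{u}(t) \cdot \nabla)v + (v \cdot \nabla)\bar{u}(t) + \nabla q = 0
\ \text{ and } \ \mathrm{div}\, v = 0 \quad \text{in } \Omega_F \times (0, T), \\
&v = m_c f \quad \text{on } \partial\Omega_F \times (0, T), \\
&v(0) = v_0 \in L^2(\Omega_F; \mathbb{R}^N), \quad \mathrm{div}\, v_0 = 0.
\end{aligned}
\tag{9}
$$

The nonnegative function $m_c$ is used to localize
the boundary control $f$. The control $f$ is
assumed to satisfy

$$
\int_{\partial\Omega_F} m_c\, f \cdot n = 0. \tag{10}
$$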

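As for general linear dynamical systems, the null
controllability of the linearized system follows
from an observability inequality for the solutions
to the following adjoint system

$$
\begin{aligned}
&-\frac{\partial \phi}{\partial t} - \nu \Delta \phi - (\bar{u}(t) \cdot \nabla)\phi + (\nabla \bar{u}(t))^T \phi + \nabla \psi = 0
\ \text{ and } \ \mathrm{div}\, \phi = 0 \quad \text{in } \Omega_F \times (0, T), \\
&\phi = 0 \quad \text{on } \partial\Omega_F \times (0, T), \\
&\phi(T) \in L^2(\Omega_F; \mathbb{R}^N), \quad \mathrm{div}\, \phi(T) = 0.
\end{aligned}
\tag{11}
$$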

Contrary to the stabilization problem, the null
controllability by a control of finite dimension
seems to be out of reach and it will be
impossible in general. We look for a control
$f \in L^2(\partial\Omega_F; \mathbb{R}^N)$, satisfying Eq. (10), driving
the solution to system Eq. (9) in time $T$ to
zero, that is, such that the solution $v_{v_0, f}$
obeys $v_{v_0, f}(T) = 0$. The linearized system
Eq. (9) is null controllable in time $T > 0$ by a
boundary control $f \in L^2(\partial\Omega_F; \mathbb{R}^N)$ obeying
Eq. (10), if and only if there exists $C > 0$ such
that

$$
\int_{\Omega_F} |\phi(0)|^2 \, dx \le C \int_0^T \!\! \int_{\partial\Omega_F} m_c\, |\sigma(\phi, \psi)\, n|^2 \, dx \, dt, \tag{12}
$$

for all solutions $(\phi, \psi)$ of Eq. (11). The
observability inequality Eq. (12) may be proved
by establishing weighted energy estimates called
“Carleman-type estimates”; see Fernandez-Cara
et al. (2004) and Fursikov and Imanuvilov
(1996).

Additional Controllability Results for
Other Fluid Flow Models

The null controllability of the 2D incompressible 
Euler equation has been obtained by J.-M. Coron 
with the so-called Return Method (Coron 1996). 
See also Coron (2007) for additional references 
(in particular, the 3D case has been treated by O. 
Glass). 

Some null controllability results for the
one-dimensional compressible Navier-Stokes
equations have been obtained in Ervedoza et al.
(2012).


Fluid-Structure Models 

Fluid-structure models are obtained by coupling 
an equation describing the evolution of the fluid 
flow with an equation describing the evolution 
of the structure. The coupling comes from the 
balance of momentum and by writing that at 
the fluid-structure interface, the fluid velocity is 
equal to the displacement velocity of the structure.

The most important difficulty in studying 
those models comes from the fact that the domain 
occupied by the fluid at time t evolves and 
depends on the displacement of the structure. 
In addition, when the structure is deformable, 
its evolution is usually written in Lagrangian 
coordinates while fluid flows are usually 
described in Eulerian coordinates. 

The structure may be a rigid or a deformable 
body immersed into the fluid. It may also be a 
deformable structure located at the boundary of 
the domain occupied by the fluid. 


A Rigid Body Immersed in a Three-Dimensional Incompressible Viscous Fluid

In the case of a 3D rigid body $\Omega_S(t)$ immersed
in a fluid flow occupying the domain $\Omega_F(t)$, the
motion of the rigid body may be described by
the position $h(t) \in \mathbb{R}^3$ of its center of mass
and by a matrix of rotation $Q(t) \in \mathbb{R}^{3 \times 3}$.

Fig. 1

The domain $\Omega_S(t)$ and the flow $X_S$ associated
with the motion of the structure obey

$$
\begin{aligned}
&X_S(y, t) = h(t) + Q(t) Q_0^{-1} (y - h(0)), \quad \text{for } y \in \Omega_S(0) = \Omega_S, \\
&\Omega_S(t) = X_S(\Omega_S(0), t),
\end{aligned}
\tag{13}
$$

and the matrix $Q(t)$ is related to the angular
velocity $\omega : (0, T) \to \mathbb{R}^3$, by the differential
equation

$$
Q'(t) = \omega(t) \times Q(t), \quad Q(0) = Q_0. \tag{14}
$$

We consider the case when the fluid flow satisfies
the incompressible Navier-Stokes system Eq. (3)
in the domain $\Omega_F(t)$ corresponding to Fig. 1.
Denoting by $J(t) \in \mathbb{R}^{3 \times 3}$ the tensor of inertia
at time $t$, and by $m$ the mass of the rigid body, the
equations of the structure are obtained by writing
the balance of linear and angular momenta

$$
\begin{aligned}
&m h'' = \int_{\partial\Omega_S(t)} \sigma(u, p)\, n \, dx, \\
&J \omega' = J\omega \times \omega + \int_{\partial\Omega_S(t)} (x - h) \times \sigma(u, p)\, n \, dx, \\
&h(0) = h_0, \quad h'(0) = h_1, \quad \omega(0) = \omega_0,
\end{aligned}
\tag{15}
$$

where $n$ is the normal to $\partial\Omega_S(t)$ outward $\Omega_F(t)$.
The system Eqs. (3) and (13)-(15) has to be completed
with boundary conditions. At the fluid-structure
interface, the fluid velocity is equal to
the displacement velocity of the rigid solid:

$$
u(x, t) = h'(t) + \omega(t) \times (x - h(t)), \tag{16}
$$

for all $x \in \partial\Omega_S(t)$, $t > 0$. The exterior boundary
of the fluid domain is assumed to be fixed,
$\Gamma_e = \partial\Omega_F(t) \setminus \partial\Omega_S(t)$. The boundary condition
on $\Gamma_e \times (0, T)$ may be of the form

$$
u = m_c f \quad \text{on } \Gamma_e \times (0, \infty), \tag{17}
$$

with $\int_{\Gamma_e} m_c\, f \cdot n = 0$, $f$ is a control, and $m_c$ a
localization function.
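
The kinematic relations Eqs. (13), (14), and (16) can be checked numerically in a simple special case. The Python sketch below assumes a constant angular velocity about the third axis, a center of mass moving with constant velocity, and $Q_0 = I$; all numerical values are hypothetical. In this case the rotation about the third axis by the angle $|\omega| t$ solves Eq. (14), and the velocity of a material point of the solid, obtained by differentiating the flow $X_S$ in time, should coincide with the right-hand side of Eq. (16).

```python
import math

OMEGA = [0.0, 0.0, 0.7]    # constant angular velocity about the z-axis (hypothetical)
H0 = [1.0, 2.0, 0.0]       # initial position of the center of mass (hypothetical)
H_VEL = [0.3, -0.1, 0.0]   # constant velocity of the center of mass (hypothetical)

def rot_z(angle):
    """Rotation matrix about the z-axis; solves Q' = omega x Q with Q(0) = I here."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def h(t):
    return [H0[i] + H_VEL[i] * t for i in range(3)]

def flow(y, t):
    """X_S(y, t) = h(t) + Q(t) Q_0^{-1} (y - h(0)) with Q_0 = I."""
    rotated = matvec(rot_z(OMEGA[2] * t), [y[i] - H0[i] for i in range(3)])
    return [h(t)[i] + rotated[i] for i in range(3)]

def rigid_velocity(x, t):
    """Right-hand side of the interface condition: h'(t) + omega x (x - h(t))."""
    w = cross(OMEGA, [x[i] - h(t)[i] for i in range(3)])
    return [H_VEL[i] + w[i] for i in range(3)]
```

A central finite difference of `flow(y, t)` in `t` can then be compared with `rigid_velocity(flow(y, t), t)` for any reference point `y`.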


An Elastic Beam Located at the 
Boundary of a Two-Dimensional 
Domain Filled by an Incompressible 
Viscous Fluid 

When the structure is described by an infinite-dimensional
model (a partial differential equation
or a system of p.d.e.), there are a few existence
results for such systems and mainly existence of
weak solutions (Chambolle et al. 2005). But for
stabilization and control problems of nonlinear
systems, we are usually interested in strong solutions.
Let us describe a two-dimensional model
in which a one-dimensional structure is located
on a flat part $\Gamma_S = (0, L) \times \{y_0\}$ of the boundary
of the reference configuration of the fluid domain
$\Omega_F$. We assume that the structure is a Euler-Bernoulli
beam with or without damping. The
displacement $\eta$ of the structure in the direction
normal to the boundary $\Gamma_S$ is described by the
partial differential equation

$$
\begin{aligned}
&\eta_{tt} - b\, \eta_{xx} - c\, \eta_{txx} + a\, \eta_{xxxx} = F \quad \text{in } \Gamma_S \times (0, \infty), \\
&\eta = 0 \ \text{ and } \ \eta_x = 0 \quad \text{on } \partial\Gamma_S \times (0, \infty), \\
&\eta(0) = \eta_1^0 \ \text{ and } \ \eta_t(0) = \eta_2^0 \quad \text{in } \Gamma_S,
\end{aligned}
\tag{18}
$$

where $\eta_x$, $\eta_{xx}$, and $\eta_{xxxx}$ stand for the first,
the second, and the fourth derivative of $\eta$ with
respect to $x \in \Gamma_S$. The other derivatives are
defined in a similar way. The coefficients $b$
and $c$ are nonnegative, and $a > 0$. The term
$c\, \eta_{txx}$ is a structural damping term. At time
$t$, the structure occupies the position $\Gamma_S(t) =
\{(x, y) \mid x \in (0, L),\ y = y_0 + \eta(x, t)\}$. When
$\Omega_F$ is a two-dimensional domain, $\Gamma_S$ is of
dimension one, and $\partial\Gamma_S$ is reduced to the two
extremities of $\Gamma_S$. The momentum balance is
obtained by writing that $F$ in Eq. (18) is given
by $F = -\sqrt{1 + \eta_x^2}\; \sigma(u, p)\tilde{n} \cdot n$, where $\tilde{n}(x, y)$
is the unit normal at $(x, y) \in \Gamma_S(t)$ to $\Gamma_S(t)$
outward $\Omega_F(t)$, and $n$ is the unit normal to
$\Gamma_S$ outward $\Omega_F(0) = \Omega_F$. If in addition, a
control $f$ acts as a distributed control in the
beam equation, we shall have

$$
F = -\sqrt{1 + \eta_x^2}\; \sigma(u, p)\tilde{n} \cdot n + f. \tag{19}
$$

The equality of velocities on $\Gamma_S(t)$ reads as

$$
u(x, y_0 + \eta(x, t)) = (0, \eta_t(x, t)), \quad x \in (0, L),\ t > 0. \tag{20}
$$

Control of Fluid-Structure Models 

To control or to stabilize fluid-structure models,
the control may act either in the fluid equation or
in the structure equation or in both equations.
There are very few controllability and
stabilization results for systems coupling the
incompressible Navier-Stokes system with a
structure equation. We state below two of those
results. Some other results are obtained for
simplified one-dimensional models coupling
the viscous Burgers equation with the
motion of a mass; see Badra and Takahashi
(2013) and the references therein.

We also have to mention here recent papers on
control problems for systems coupling quasi-stationary
Stokes equations with the motion
of deformable bodies, modeling microorganism
swimmers at low Reynolds number; see Alouges
et al. (2008).


Null Controllability of the 
Navier-Stokes System Coupled with 
the Motion of a Rigid Body 

The system coupling the incompressible Navier-Stokes
system Eq. (3) in the domain drawn
in Fig. 1, with the motion of a rigid body
described by Eqs. (13)-(16), with the boundary
control Eq. (17) is null controllable locally in a
neighborhood of 0. Before linearizing the system
in a neighborhood of 0, the fluid equations have
to be rewritten in Lagrangian coordinates, that
is, in the cylindrical domain $\Omega_F \times (0, \infty)$. The
linearized system is the Stokes system coupled
with a system of ordinary differential equations.
The proof of this null controllability result relies
on a Carleman estimate for the adjoint system;
see, e.g., Boulakia and Guerrero (2013).

Feedback Stabilization of the 
Navier-Stokes System Coupled with a 
Beam Equation 

The system coupling the incompressible Navier-Stokes
system Eq. (3) in the domain drawn in
Fig. 2, with beam Eqs. (18)-(20), can be locally
stabilized with any prescribed exponential decay
rate $-\alpha < 0$, by a feedback control $f$ acting in
Eq. (18) via Eq. (19); see Raymond (2010). The
proof consists in showing that the linearized
model generates an analytic semigroup
(when c > 0), that the resolvent of its
infinitesimal generator is compact, and that the
Hautus criterion is satisfied.

When the control acts in the fluid equation, 
the system coupling Eq. (3) in the domain drawn 
in Fig. 2, with the beam Eqs. (18)-(20), can be 
stabilized when c > 0. To the best of our 
knowledge, there is no null controllability result 
for such systems, even with controls acting both 
in the structure and fluid equations. The case 
where the beam equation is approximated by a 
finite-dimensional model is studied in Lequeurre 
(2013). 



Fig. 2

Abstract

The presence of time delays in dynamical systems
may induce complex behavior, and this behavior
is not always intuitive. Even if a system’s
equation is scalar, oscillations may occur. Time
delays in control loops are usually associated
with degradation of performance and robustness,
but, at the same time, there are situations where
time delays are used as controller parameters.


Keywords 

Delay differential equations; Delays as controller 
parameters; Functional differential equation 


Introduction 

Time-delays are important components of many
systems from engineering, economics, and the
life sciences, due to the fact that the transfer
of material, energy, and information is mostly
not instantaneous. They appear, for instance, as
computation and communication lags, they
model transport phenomena and heredity, and
they arise as feedback delays in control loops.
An overview of applications, ranging from traffic
flow control and lasers with phase-conjugate
feedback, over (bio)chemical reactors and cancer
modeling, to control of communication networks
and control via networks, is included in Sipahi
et al. (2011).

The aim of this contribution is to describe
some fundamental properties of linear control
systems subjected to time-delays and to outline
principles behind analysis and synthesis methods.
Throughout the text, the results will be illustrated
by means of the scalar system

$\dot{x}(t) = u(t - \tau),$ (1)

which, controlled with instantaneous state feedback,
$u(t) = -kx(t)$, leads to the closed-loop
system

$\dot{x}(t) = -kx(t - \tau).$ (2)

Although this didactic example is extremely simple,
we shall see that its dynamics are already
very rich and shed light on delay effects in
control loops.

In some works, the analysis of (2) is called the
hot shower problem, as it can be interpreted as
an (over)simplified model for a human adjusting
the temperature in a shower: $x(t)$ then denotes
the difference between the water temperature and
the desired temperature as felt by the person, the
term $-kx(t)$ models the reaction of the person
by further opening or closing taps, and the delay
is due to the propagation with finite speed of the
water in the ducts.
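
The behavior of the closed-loop system (2) can be explored numerically. The following is a minimal sketch, assuming a forward-Euler discretization with step `h` and a constant initial function; the discretization, the function name `simulate_delayed_feedback`, and the parameter values are illustrative choices and are not taken from the text.

```python
def simulate_delayed_feedback(k, tau, t_end, h=1e-3, phi=1.0):
    """Forward-Euler approximation of x'(t) = -k * x(t - tau).

    The initial function is assumed constant: x(t) = phi for t <= 0.
    Returns a list xs where xs[n] approximates x(n * h).
    """
    d = round(tau / h)           # delay expressed in time steps
    n_steps = round(t_end / h)
    xs = [phi]
    for n in range(n_steps):
        # Before t = tau the delayed state is still the initial function.
        x_delayed = xs[n - d] if n >= d else phi
        xs.append(xs[n] - h * k * x_delayed)
    return xs


# Illustrative runs with tau = 1: a moderate gain and a large gain.
decaying = simulate_delayed_feedback(k=1.0, tau=1.0, t_end=20.0)
growing = simulate_delayed_feedback(k=2.0, tau=1.0, t_end=40.0)
print("k*tau = 1: min x =", min(decaying), " final x =", decaying[-1])
print("k*tau = 2: max |x| =", max(abs(v) for v in growing))
```

With these values the first run overshoots the target and oscillates while decaying, whereas the second run oscillates with growing amplitude, even though the equation is scalar.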


Basic Properties of Time-Delay
Systems 

Functional Differential Equation 

We focus on a model for a time-delay system
described by

$\dot{x}(t) = A_0 x(t) + A_1 x(t - \tau), \quad x(t) \in \mathbb{R}^n.$ (3)

This is an example of a functional differential
equation (FDE) of retarded type. The term FDE
stems from the property that the right-hand side
can be interpreted as a functional evaluated at a
piece of trajectory. The term retarded expresses
that the right-hand side does not explicitly depend
on $\dot{x}$.

As a first difference with an ordinary differential
equation, the initial condition of (3) at
$t = 0$ is a function $\phi$ from $[-\tau, 0]$ to $\mathbb{R}^n$. For
all $\phi \in C([-\tau, 0], \mathbb{R}^n)$, where $C([-\tau, 0], \mathbb{R}^n)$ is
the space of continuous functions mapping the
interval $[-\tau, 0]$ into $\mathbb{R}^n$, a forward solution $x(\phi)$
exists and is uniquely defined. In Fig. 1, a solution
of the scalar system (2) is shown.

The discontinuity in the derivative at $t = 0$
stems from $A_0\phi(0) + A_1\phi(-\tau) \neq \lim_{\theta \to 0^-} \dot{\phi}(\theta)$.
Due to the smoothing property of an integrator,
however, at $t = n \in \mathbb{N}$, the discontinuity will
only be present in the $(n + 1)$th derivative.
This illustrates a second property of functional
differential equations of retarded type: solutions
become smoother as time evolves. As a
third major difference with ODEs, backward
continuation of solutions is not always possible
(Michiels and Niculescu 2007).

Reformulation in a First-Order Form 

The state of system (3) at time $t$ is the minimal
information needed to continue the solution, which,
once again, boils down to a function segment
$x_t(\phi)$, where $x_t(\phi)(\theta) = x(t + \theta)$, $\theta \in [-\tau, 0]$ (in
Fig. 1, the function $x_t$ is shown in red for $t = 5$).
This suggests that (3) can be reformulated as a
standard ordinary differential equation over the
infinite-dimensional space $C([-\tau, 0], \mathbb{R}^n)$. This
equation takes the form

$\frac{d}{dt} z(t) = \mathcal{A} z(t), \quad z(t) \in C([-\tau, 0], \mathbb{R}^n),$ (4)

where operator $\mathcal{A}$ is given by

$\mathcal{D}(\mathcal{A}) = \{\phi \in C([-\tau, 0], \mathbb{R}^n) : \dot{\phi} \in C([-\tau, 0], \mathbb{R}^n),\ \dot{\phi}(0) = A_0\phi(0) + A_1\phi(-\tau)\}, \quad \mathcal{A}\phi = \frac{d\phi}{d\theta}.$ (5)

The relation between solutions of (3) and (4)
is given by $z(t)(\theta) = x(t + \theta)$, $\theta \in [-\tau, 0]$.
Note that all system information is concentrated
in the nonlocal boundary condition describing the
domain of $\mathcal{A}$. The representation (4) is closely
related to a description by an advection PDE with
a nonlocal boundary condition (Krstic 2009).

Asymptotic Growth Rate of Solutions 
and Stability 

The reformulation of (3) into the standard
form (4) allows us to define stability notions
and to generalize the stability theory for ordinary
differential equations in a straightforward way,
with the main change that the state space is
$C([-\tau, 0], \mathbb{R}^n)$. For example, the null solution
of (3) is exponentially stable if and only if there
exist constants $C > 0$ and $\gamma > 0$ such that

$\forall \phi \in C([-\tau, 0], \mathbb{R}^n): \quad \|x_t(\phi)\|_s \le C e^{-\gamma t} \|\phi\|_s,$

where $\|\cdot\|_s$ is the supremum norm and $\|\phi\|_s =
\sup_{\theta \in [-\tau, 0]} \|\phi(\theta)\|_2$. As the system is linear,
asymptotic stability and exponential stability are
equivalent.

Fig. 1 Solution of (2) for $\tau = 1$, $k = 1$, and initial condition $\phi \equiv 1$

A direct generalization of Lyapunov’s
second method yields:

Theorem 1 The null solution of linear system
(3) is asymptotically stable if there exist a
continuous functional $V : C([-\tau, 0], \mathbb{R}^n) \to \mathbb{R}$ (a
so-called Lyapunov-Krasovskii functional) and
continuous nondecreasing functions $u, v, w :
\mathbb{R}^+ \to \mathbb{R}^+$ with

$u(0) = v(0) = w(0) = 0$ and $u(s) > 0$,
$v(s) > 0$, $w(s) > 0$ for $s > 0$,

such that for all $\phi \in C([-\tau, 0], \mathbb{R}^n)$

$u(\|\phi(0)\|_2) \le V(\phi) \le v(\|\phi\|_s), \quad \dot{V}(\phi) \le -w(\|\phi(0)\|_2),$

where

$\dot{V}(\phi) = \limsup_{h \to 0^+} \frac{1}{h}\left[V(x_h(\phi)) - V(\phi)\right].$

Converse Lyapunov theorems and the construction
of the so-called complete-type
Lyapunov-Krasovskii functionals are discussed
in Kharitonov (2013). Imposing a particular
structure on the functional, e.g., a form depending
only on a finite number of free parameters,
often leads to easy-to-check stability criteria
(for instance, in the form of LMIs), yet, as a price
to pay, the obtained results may be conservative
in the sense that the sufficient stability conditions
might not be close to necessary conditions.
As an alternative to Lyapunov functionals,
Lyapunov functions can be used as well, provided
that the condition $\dot{V} \le 0$ is relaxed (the so-called
Lyapunov-Razumikhin approach); see, for
example, Gu et al. (2003).

Delay Differential Equations as 
Perturbation of ODEs 

Many results on stability, robust stability, and
control of time-delay systems are explicitly or
implicitly based on a perturbation point of view,
where delay differential equations are seen as
perturbations of ordinary differential equations.
For instance, in the literature, a classification
of stability criteria is often presented in terms
of delay-independent criteria (conditions holding
for all values of the delays) and delay-dependent
criteria (usually holding for all delays smaller
than a bound). This classification originates in
two different ways of seeing (3) as a perturbation
of an ODE, with as nominal system $\dot{x}(t) =
A_0 x(t)$ and $\dot{x}(t) = (A_0 + A_1)x(t)$ (system
for zero delay), respectively. This observation is
illustrated in Fig. 2 for results based on
input-output and Lyapunov-based approaches.


The Spectrum of Linear Time-Delay 
Systems 

Two Eigenvalue Problems 

The substitution of an exponential solution in (3)
leads us to the nonlinear eigenvalue problem

$(\lambda I - A_0 - A_1 e^{-\lambda\tau})v = 0, \quad \lambda \in \mathbb{C},\ v \in \mathbb{C}^n,\ v \neq 0.$ (6)

The solutions of the equation $\det(\lambda I - A_0 -
A_1 e^{-\lambda\tau}) = 0$ are called characteristic roots.
Similarly, formulation (4) leads to the equivalent
infinite-dimensional linear eigenvalue problem

$(\lambda I - \mathcal{A})u = 0, \quad \lambda \in \mathbb{C},\ u \in C([-\tau, 0], \mathbb{C}^n),\ u \neq 0.$ (7)

The combination of these two viewpoints lies
at the basis of most methods for computing
characteristic roots; see Michiels (2012). On the
one hand, discretizing (7), i.e., approximating
$\mathcal{A}$ with a matrix, and solving the resulting
standard eigenvalue problem allows one to obtain
global information, for example, estimates of
all characteristic roots in a given compact set
or in a given right half plane. On the other
hand, the (finitely many) nonlinear equations (6)
allow one to make local corrections on characteristic
root approximations up to the desired accuracy,
e.g., using Newton’s method or inverse residual
iteration. Linear time-delay systems satisfy
spectrum-determined growth properties of
solutions. For instance, the zero solution of (3)
is asymptotically stable if and only if all
characteristic roots are in the open left half plane.

Control of Linear Systems with Delays, Fig. 2 The classification of stability criteria in delay-independent results and
delay-dependent results stems from two different perturbation viewpoints. Here, perturbation terms are printed in red.
(Panels: delay-independent results, $\dot{x}(t) = A_0 x(t) + A_1 x(t - \tau)$, with $A_0^T P + P A_0 < 0$ in the Lyapunov setting;
delay-dependent results, $\dot{x}(t) = (A_0 + A_1)x(t) + A_1(x(t - \tau) - x(t))$, with $(A_0 + A_1)^T P + P(A_0 + A_1) < 0$;
in both cases $V = x^T P x + \int \ldots$)

Control of Linear Systems with Delays, Fig. 3 (Left) Rightmost characteristic roots of (2) for $k\tau = 1$. (Right) Real
parts of rightmost characteristic roots as a function of $k\tau$
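
The local correction step based on the nonlinear equation (6) can be sketched for the scalar system (2), whose characteristic equation is $\lambda + k e^{-\lambda\tau} = 0$. The code below is a minimal illustration, assuming a plain Newton iteration and a starting guess chosen by hand; the function names and the starting value are hypothetical, and in practice the starting guess would come from a discretization of (7).

```python
import cmath


def char_fun(lam, k, tau):
    """Characteristic function of (2): h(lam) = lam + k * exp(-lam * tau)."""
    return lam + k * cmath.exp(-lam * tau)


def char_fun_prime(lam, k, tau):
    """Derivative of h with respect to lam."""
    return 1.0 - k * tau * cmath.exp(-lam * tau)


def newton_root(lam0, k, tau, tol=1e-12, max_iter=50):
    """Refine an approximate characteristic root by Newton's method."""
    lam = complex(lam0)
    for _ in range(max_iter):
        step = char_fun(lam, k, tau) / char_fun_prime(lam, k, tau)
        lam -= step
        if abs(step) < tol:
            return lam
    raise RuntimeError("Newton iteration did not converge")


# Rough guess for one of the rightmost roots when k * tau = 1.
root = newton_root(-0.3 + 1.3j, k=1.0, tau=1.0)
print("refined root:", root)
print("residual:", abs(char_fun(root, 1.0, 1.0)))
```

The complex conjugate of the refined value is also a root, since the characteristic function has real coefficients.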


In Fig. 3 (left), the rightmost characteristic
roots of (2) are depicted for $k\tau = 1$. Note that
since the characteristic equation can be written
as $\lambda\tau + k\tau e^{-\lambda\tau} = 0$, $k$ and $\tau$ can be combined
into one parameter. In Fig. 3 (right), we show the
real parts of the characteristic roots as a function
of $k\tau$. The plots illustrate some important
spectral properties of retarded-type FDEs. First,
even though there are in general infinitely many
characteristic roots, the number of them in any
right half plane is always finite. Second, the
individual characteristic roots, as well as the spectral
abscissa, i.e., the supremum of the real parts of
all characteristic roots, continuously depend on
parameters. Related to this, a loss or gain of
stability is always associated with characteristic
roots crossing the imaginary axis. Figure 3 (right)
also illustrates the transition to a delay-free system
as $k\tau \to 0^+$.
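
The continuous dependence of a characteristic root on the combined parameter, and the loss of stability through an imaginary-axis crossing, can be traced numerically. The sketch below assumes the normalized variables $\mu = \lambda\tau$ and $p = k\tau$, so that the characteristic equation of (2) reads $\mu + p e^{-\mu} = 0$, and follows one root by Newton continuation on a grid of $p$ values. The grid, the starting guess, and the function names are illustrative assumptions. Since $\lambda = jk$ solves the characteristic equation when $k\tau = \pi/2$, the crossing is expected near that value.

```python
import cmath
import math


def newton(mu, p, tol=1e-13, max_iter=50):
    """Newton iteration for g(mu) = mu + p * exp(-mu) = 0."""
    for _ in range(max_iter):
        g = mu + p * cmath.exp(-mu)
        dg = 1.0 - p * cmath.exp(-mu)
        step = g / dg
        mu -= step
        if abs(step) < tol:
            return mu
    raise RuntimeError("Newton iteration did not converge")


def find_crossing(p_start=1.0, p_end=2.0, n=100):
    """Follow a root from p_start to p_end; return the first grid value of p
    at which its real part is nonnegative, together with the root."""
    mu = newton(-0.3 + 1.3j, p_start)   # hand-picked guess for p = 1
    for i in range(1, n + 1):
        p = p_start + (p_end - p_start) * i / n
        mu = newton(mu, p)              # previous root is the next guess
        if mu.real >= 0.0:
            return p, mu
    return None


p_cross, mu_cross = find_crossing()
print("first grid point with Re(mu) >= 0:", p_cross)
print("pi / 2 =", math.pi / 2)
```

Each continuation step starts from the root computed for the previous parameter value, which relies on exactly the continuity property stated above.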

Critical Delays: A Finite-Dimensional 
Characterization 

Assume that for a given value of $k$, we are
looking for values of the delay $\tau_c$ for which (2)
has a characteristic root $j\omega_c$ on the imaginary
axis. From $j\omega = -k e^{-j\omega\tau}$, we get

$\omega_c = k, \quad \tau_c = \frac{\frac{\pi}{2} + 2\pi l}{\omega_c}, \quad \Re\left[\left(\frac{d\lambda}{d\tau}\right)^{-1}\right]_{(\omega_c, \tau_c)} = \frac{1}{\omega_c^2}, \quad l = 0, 1, \ldots$ (8)

Critical delay values $\tau_c$ are indicated with green
circles on Fig. 3 (right). The above formulas first
illustrate an invariance property of imaginary
axis roots and their crossing direction with respect
to delay shifts of $2\pi/\omega_c$. Second, the number
of possible values of $\omega_c$ is one and thus finite.
More generally, substituting $\lambda = j\omega$ in (6) and
treating $\tau$ as a free parameter lead to a
two-parameter eigenvalue problem

$(j\omega I - A_0 - A_1 z)v = 0,$ (9)

with $\omega$ on the real axis and $z := \exp(-j\omega\tau)$
on the unit circle. Most methods to solve such a
problem boil down to an elimination of one of the
independent variables $\omega$ or $z$. As an example of an
elimination technique, we directly get from (9)

$j\omega \in \sigma(A_0 + A_1 z),\ -j\omega \in \sigma(A_0^* + A_1^* z^{-1})
\ \Rightarrow\ \det\left((A_0 + A_1 z) \oplus (A_0^* + A_1^* z^{-1})\right) = 0,$

where $\sigma(\cdot)$ denotes the spectrum and $\oplus$ the
Kronecker sum. Clearly, the resulting eigenvalue
problem in $z$ is finite dimensional.

Control of Linear Time-Delay Systems

Limitations Induced by Delays

It is well known that delays in control loops
may lead to a significant degradation of performance
and robustness and even to instability
(Niculescu 2001; Richard 2003). Let us return
to example (2). As illustrated with Fig. 3 and
expressions (8), the system loses stability if $\tau$
reaches the value $\pi/(2k)$, while stability cannot
be recovered for larger delays. The maximum
achievable exponential decay rate of the solutions,
which corresponds to the minimum of the
spectral abscissa, is given by $-1/\tau$; hence, large
delays can only be tolerated at the price of a
degradation of the rate of convergence. It should
be noted that the limitations induced by delays are
even more stringent if the uncontrolled systems
are exponentially unstable, which is not the case
for (2).

The analysis in the previous sections gives
a hint why control is difficult in the presence
of delays: the system is inherently infinite
dimensional. As a consequence, most control
design problems which involve determining a
finite number of parameters can be interpreted
as reduced-order control design problems or
as control design problems for under-actuated
systems, which both are known to be hard
problems.


Fixed-Order Control 

Most standard control design techniques lead to
controllers whose dimension is larger than or equal
to the dimension of the system. For
infinite-dimensional time-delay systems, such controllers
might have the disadvantage of being complicated
and hard to implement. To see this, for a system
with delay in the state, the generalization of
static state feedback, $u(t) = Kx(t)$, is given by
$u(t) = \int_{-\tau}^{0} x(t + \theta)\,d\mu(\theta)$, where $\mu$ is a function
of bounded variation. However, in the context
of large-scale systems, it is known that
reduced-order controllers often perform relatively well
compared to full-order controllers, while they are
much easier to implement.

Recently, new methods for the design of controllers
with a prescribed order (dimension) or
structure have been proposed (Michiels 2012).
These methods rely on a direct optimization of
appropriately defined cost functions (spectral
abscissa, $\mathcal{H}_2$/$\mathcal{H}_\infty$ criteria). While $\mathcal{H}_2$ criteria can be
addressed within a derivative-based optimization
framework, $\mathcal{H}_\infty$ criteria and the spectral abscissa
require targeted methods for non-smooth
optimization problems. To illustrate the need for such
methods, consider again Fig. 3 (right): minimizing
the spectral abscissa for a given value of $\tau$
as a function of the controller gain $k$ leads to an
optimum where the objective function is not
differentiable, not even locally Lipschitz, as shown
by the red circle. In case of multiple controller
parameters, the path of steepest descent in the
parameter space typically has phases along a
manifold characterized by the non-differentiability of
the objective function.
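
For the scalar example (2), the non-smooth optimum can be checked directly. A short calculation (not spelled out in the text, so treat it as a worked assumption) shows that for the gain $k = 1/(e\tau)$ the value $\lambda = -1/\tau$ is a double root of $h(\lambda) = \lambda + k e^{-\lambda\tau}$, i.e., both $h$ and $h'$ vanish there. A double root generically splits with a square-root dependence on the perturbed parameter, which is consistent with the loss of the Lipschitz property at the optimum. The sketch below only verifies the double-root conditions numerically; all names are illustrative.

```python
import math


def char_fun(lam, k, tau):
    """Characteristic function of (2) for real lam."""
    return lam + k * math.exp(-lam * tau)


def char_fun_derivative(lam, k, tau):
    """Derivative of the characteristic function with respect to lam."""
    return 1.0 - k * tau * math.exp(-lam * tau)


tau = 1.0
k_opt = 1.0 / (math.e * tau)    # gain at which two real roots merge
lam_opt = -1.0 / tau            # candidate double root

print("h(lam_opt)  =", char_fun(lam_opt, k_opt, tau))
print("h'(lam_opt) =", char_fun_derivative(lam_opt, k_opt, tau))
```

Both printed values are zero up to rounding, in agreement with the decay rate $-1/\tau$ quoted for the minimum of the spectral abscissa.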

Using Delays as Controller Parameters 

In contrast to the detrimental effects of delays,
there are situations where delays have a beneficial
effect and are even used as controller parameters;
see Sipahi et al. (2011). For instance, delayed
feedback can be used to stabilize oscillatory systems
where the delay serves to adjust the phase in
the control loop. Delayed terms in control laws
can also be used to approximate derivatives in
the control action. Control laws which depend
on the difference $x(t) - x(t - \tau)$, the so-called
Pyragas-type feedback, have the property that the
position of equilibria and the shape of periodic
orbits with period $\tau$ are not affected, in contrast
to their stability properties. Last but not least,
delays can be used in control schemes to generate
predictions or to stabilize predictors, which make it
possible to compensate delays and improve performance
(Krstic 2009; Zhong 2006). Let us illustrate the
main idea once more with system (1).

System (1) has a special structure, in the sense
that the delay is only in the input, and it is
advantageous to exploit this structure in the context of
control. Coming back to the didactic example, the
person who is taking a shower is - possibly after
some bad experiences - aware of the delay
and will take into account his/her prediction of
the system’s reaction when adjusting the cold and
hot water supply. Let us, to conclude, formalize
this. The uncontrolled system can be rewritten as
$\dot{x}(t) = v(t)$, where $v(t) = u(t - \tau)$. We know
$u$ up to the current time $t$; thus, we know $v$ up
to time $t + \tau$, and if $x(t)$ is also known, we can
predict the value of $x$ at time $t + \tau$,

$x_p(t + \tau) = x(t) + \int_{t}^{t+\tau} v(s)\,ds
= x(t) + \int_{t-\tau}^{t} u(s)\,ds,$

and use the predicted state for feedback. With the
control law $u(t) = -k x_p(t + \tau)$, there is only
one closed-loop characteristic root at $\lambda = -k$,
i.e., as long as the model used in the predictor
is exact, the delay in the loop is compensated by
the prediction. For further reading on
prediction-based controllers, see, e.g., Krstic (2009) and the
references therein.
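
The prediction-based control law can be tried out on system (1) in discrete time. The following is a minimal sketch, assuming a forward-Euler discretization, zero control before $t = 0$, an exact model in the predictor, and a rectangle-rule approximation of the integral of $u$ over the last delay interval; the function name and the parameter values are hypothetical.

```python
import math


def simulate_predictor(k, tau, t_end, h=1e-3, x0=1.0):
    """Euler simulation of x'(t) = u(t - tau) with u(t) = -k * x_p(t + tau),
    where x_p(t + tau) = x(t) + integral of u over [t - tau, t]."""
    d = round(tau / h)       # delay expressed in time steps
    n_steps = round(t_end / h)
    u = []                   # u[i] approximates u(i * h); zero before t = 0
    window = 0.0             # running sum of the last d control samples
    x = x0
    xs = [x]
    for n in range(n_steps):
        x_pred = x + h * window              # predicted state at t + tau
        u_now = -k * x_pred                  # feedback on the prediction
        u_delayed = u[n - d] if n >= d else 0.0
        u.append(u_now)
        window += u_now - u_delayed          # slide the integration window
        x += h * u_delayed                   # plant sees the delayed input
        xs.append(x)
    return xs


# k * tau = 2 is a value for which plain feedback u = -k x(t) is unstable.
xs = simulate_predictor(k=2.0, tau=1.0, t_end=5.0)
print("x(5) with predictor:", xs[-1])
print("exp(-k * (5 - tau)):", math.exp(-2.0 * 4.0))
```

In this idealized setting the state stays at its initial value during the first delay interval and then decays at the rate $k$, as expected from a single closed-loop root at $\lambda = -k$. The sketch does not address robustness against model or delay mismatch.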


Conclusions 

Time-delay systems, which appear in a large
number of applications, are a class of
infinite-dimensional systems, resulting in rich dynamics
and challenges from a control point of view. The
different representations and interpretations and,
in particular, the combination of viewpoints lead
to a wide variety of analysis and synthesis tools.

Abstract

Control of machining processes encompasses 
a broad range of technologies and innovations, 
ranging from optimized motion planning and 
servo drive loop design to on-the-fly regulation 
of cutting forces and power consumption to 
applying control strategies for damping out 
chatter vibrations caused by the interaction of 
the chip generation mechanism with the machine 
tool structural dynamics. This article provides a 
brief introduction to some of the concepts and 
technologies associated with machining process 
control. 

Keywords

Introduction

Machining is used extensively in the manufacturing
industry as a shaping process, where high
product accuracy, quality, and strength are required.
From automotive and aerospace components,
to dies and molds, to biomedical implants,
and even mobile device chassis, many manufactured
products rely on the use of machining.

Machining is carried out on machine tools,
which are multi-axis mechatronic systems designed
to provide the relative motion between
the tool and workpiece, in order to facilitate the
desired cutting operation. Figure 1 illustrates a
single axis of a ball screw-driven machine tool,
performing a milling operation. Here, the cutting
process is influenced by the motion of the servo
drive. The faster the part is fed in towards the rotating
cutter, the larger the cutting forces become,
following a typically proportional relationship
that holds for a large class of milling operations
(Altintas 2012). The generated cutting forces, in
turn, are absorbed by the machine tool and feed
drive structure. They cause mechanical deformation
and may also excite the vibration modes,
if their harmonic content is near the structural
natural frequencies. This may, depending on the
cutting speed and tool and workpiece engagement
conditions, lead to forced vibrations or chatter
(Altintas 2012).

The disturbance effect of cutting forces is
also felt by the servo control loop, consisting of
mechanical, electrical, and digital components.
This disturbance may result in the degradation
of tool positioning accuracy, thereby leading to
part errors. Another input that influences the
quality achieved in a machining operation is the
commanded trajectory. Discontinuous or poorly
designed motion commands, with acceleration
discontinuity, typically lead to jerky motion,
vibrations, and poor surface finish. Beyond motion
controller design and trajectory planning, emerging
trends in machining process control include
regulating, by feedback, various outcomes of the
machining process, such as peak resultant cutting
force, spindle power consumption, and amplitude
of vibrations caused by the machining process.
In addition to using actuators and instrumentation
already available on a machine tool, such as feed
and spindle drives and current sensors, additional
devices, such as dynamometers, accelerometers,
as well as inertial or piezoelectric actuators, may
need to be used in order to achieve the required
level of feedback and control injection capability.

Servo Drive Control 

Stringent requirements for part quality, typically
specified in microns, coupled with disturbance
force inputs coming from the machining process,
which can be on the order of tens to thousands
of newtons, require that the disturbance rejection
of feed drives, which act as dynamic (i.e.,
frequency dependent) “stiffness” elements, be
kept as strong as possible. In traditional machine
design, this is achieved by optimizing the mechanical
structure for maximum rigidity. Afterwards,
the motion control loop is tuned to yield
the highest possible bandwidth (i.e., responsive
frequency range), without interfering with the
vibratory modes of the machine tool in a way that
can cause instability. The P-PI position velocity
cascade control structure, shown in Fig. 2, is
the most widely used technique in machine tool
drives. Its tuning guidelines have been well established
in the literature (Ellis 2004). To augment
the command following accuracy, velocity and
acceleration feedforward and friction compensation
terms are added. Increasing the closed-loop
bandwidth yields better disturbance rejection and
more accurate tracking of the commanded trajectory
(Pritschow 1996), which is especially
important in high-speed machining applications
where elevated cutting speeds necessitate faster
feed motion.

Control of Machining Processes, Fig. 1 Single axis of a ball screw-driven machine tool performing milling


It can be seen in Fig. 3 that increased axis
tracking errors ($e_x$ and $e_y$) may result in increased
contour error ($\varepsilon$). A practical solution to mitigate
this problem, in machine tool engineering, is
to also match the dynamics of different motion
axes, so that the tracking errors always assume
an instantaneous proportion that brings the actual
tool position as close as possible to the desired
toolpath (Koren 1983). Sometimes, the control
action can be designed to directly reduce the
contour error as well, which leads to the structure
known as “cross-coupling control” (Koren 1980).
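
The benefit of matching axis dynamics can be illustrated for the simplest case of a straight toolpath traversed at constant feed. The sketch below rests on two assumptions that are not stated in the text: the contour error is taken as the component of the tracking error vector normal to the path, $\varepsilon = -e_x \sin\theta + e_y \cos\theta$ (the sign convention is arbitrary), and each axis is approximated by a proportional position loop whose steady-state following error equals the axis velocity divided by a velocity gain `kv`. Function names and numbers are hypothetical.

```python
import math


def following_errors(feed, path_angle, kv_x, kv_y):
    """Steady-state axis tracking errors for a straight path at constant feed,
    assuming each axis lags by (axis velocity) / (velocity gain)."""
    e_x = feed * math.cos(path_angle) / kv_x
    e_y = feed * math.sin(path_angle) / kv_y
    return e_x, e_y


def contour_error(e_x, e_y, path_angle):
    """Component of the tracking error normal to a straight toolpath."""
    return -e_x * math.sin(path_angle) + e_y * math.cos(path_angle)


angle = math.radians(30.0)   # path direction in the x-y plane
feed = 100.0                 # mm/s, illustrative value

matched = contour_error(*following_errors(feed, angle, 50.0, 50.0), angle)
mismatched = contour_error(*following_errors(feed, angle, 50.0, 25.0), angle)
print("contour error, matched gains:   ", matched)
print("contour error, mismatched gains:", mismatched)
```

With equal gains the tracking error vector is tangent to the path, so the tool lags behind the reference but stays on the line; with unequal gains a normal component appears.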


Trajectory Planning 

Smooth trajectory planning with at least acceleration
level continuity is required in machine
tool control, in order to avoid inducing unwanted
vibration or excessive tracking error during the
machining process. For this purpose, computer
numerical control (CNC) systems are equipped
with various spline toolpath interpolation functions,
such as B-splines and NURBS. The feedrate
(i.e., progression speed along the toolpath)
is planned in the “look-ahead” function of the
CNC so that the total machining cycle time is
reduced as much as possible. This has to be
done without violating the position-dependent
feedrate limits already programmed into the numerical
control (NC) code, which are specified
by considering various constraints coming from
the machining process.

Control of Machining Processes, Fig. 2 P-PI position velocity cascade
control used in machine tool drives (block labels: friction compensation,
feedforward acceleration and velocity compensation, PI velocity control,
drive, rotary feedback, linear feedback)

Control of Machining Processes, Fig. 3 Formation of contour error
($\varepsilon$), as a result of servo errors ($e_x$ and $e_y$) in the
individual axes (labels: reference position, reference toolpath, actual
toolpath, tool position, velocity)

In feedrate optimization, axis level trajectories
have to stay within the velocity and torque
limits of the drives, in order to avoid damaging
the machine tool or causing actuator saturation.
Moreover, as an indirect way of containing tracking
errors, the practice of limiting axis level jerk
(i.e., rate of change of acceleration) is applied
(Gordon and Erkorkmaz 2013). This results in
reduced machining cycle time, while avoiding
excessive vibration or positioning error due to
“jerky” motion.

An example of trajectory planning using quintic
(5th degree) polynomials for toolpath parameterization
is shown in Fig. 4. Here, comparison
is provided between unoptimized and optimized
feedrate profiles subject to the same axis velocity,
torque (i.e., control signal), and jerk limits. As
can be seen, significant machining time reduction
can be achieved through trajectory optimization,
while retaining the dynamic tool position
accuracy. While Fig. 4 shows the result of an
elaborate nonlinear optimization approach (Altintas
and Erkorkmaz 2003), practical look-ahead
algorithms have also been proposed which lead to
more conservative cycle times but are much better
suited for real-time implementation inside a CNC
(Weck et al. 1999).

Control of Machining Processes, Fig. 4 Example of quintic spline trajectory planning without and with feedrate
optimization
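
The interplay between velocity, acceleration, and jerk limits can be seen on a much simpler problem than the one in Fig. 4. The sketch below assumes a single axis, a rest-to-rest move of given length, and the quintic displacement law $s(u) = L(10u^3 - 15u^4 + 6u^5)$ with $u = t/T$, which has zero velocity and acceleration at both ends. It computes the shortest duration $T$ for which the peak velocity, acceleration, and jerk of this particular profile stay within given limits. This is an illustrative stand-in, not the optimization method of the cited references; names and limit values are hypothetical.

```python
import math


def quintic_profile(u):
    """Normalized quintic rest-to-rest profile at u = t / T in [0, 1].

    Returns (s, v, a, j): displacement and its first three derivatives
    with respect to u, for unit travel length.
    """
    s = 10 * u**3 - 15 * u**4 + 6 * u**5
    v = 30 * u**2 - 60 * u**3 + 30 * u**4
    a = 60 * u - 180 * u**2 + 120 * u**3
    j = 60 - 360 * u + 360 * u**2
    return s, v, a, j


def min_duration(length, v_max, a_max, j_max):
    """Shortest duration such that the quintic profile respects all limits.

    Peak values of the normalized profile: velocity 1.875 (at u = 0.5),
    acceleration 10 / sqrt(3), jerk 60 (at the end points).
    """
    t_v = 1.875 * length / v_max
    t_a = math.sqrt(10.0 / math.sqrt(3.0) * length / a_max)
    t_j = (60.0 * length / j_max) ** (1.0 / 3.0)
    return max(t_v, t_a, t_j)


# Illustrative limits: 100 mm move, 200 mm/s, 2000 mm/s^2, 50000 mm/s^3.
T = min_duration(100.0, 200.0, 2000.0, 50000.0)
print("minimum duration [s]:", T)
```

For the values shown, the velocity limit is the active constraint; tightening the jerk limit eventually makes the jerk term govern the cycle time instead.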

Adaptive Control of Machining 

There are established mathematical methods for
predicting cutting forces, torque, power, and even
surface finish for a variety of machining operations
like turning, boring, drilling, and milling
(Altintas 2012). However, when machining complex
components, such as gas turbine impellers,
or dies and molds, the tool and workpiece engagement
and workpiece geometry undergo continuous
change. Hence, it may be difficult to
apply such prediction models efficiently, unless
they are fully integrated inside a computer-aided
process planning environment, as reported for
3-axis machining by Altintas and Merdol (2007).

An alternative approach, which allows the machining
process to take place within safe and efficient
operating bounds, is to use feedback from
the machine tool during the cutting process. This
measurement can be of the cutting forces using a
dynamometer or the spindle power consumption.
This measurement is then used inside a feedback
control loop to override the commanded feedrate
value, which has direct impact on the cutting
forces and power consumption. This scheme can
be used to ensure that the cutting forces do not
exceed a certain limit for process safety or to
increase the feed when the machining capacity
is underutilized, thus boosting productivity. Since
the geometry and tool engagement are generally
continuously varying, the coefficients of a model
that relates the cutting force (or power) to the feed
command are also time-varying. Furthermore,
in CNC controllers, depending on the trajectory
generation architecture, the execution latency of
a feed override command may not always be
deterministic. Due to these sources of variability,
rather than using classical fixed gain feedback,
machining control research has evolved
around adaptive control techniques (Masory and
Koren 1980; Spence and Altintas 1991), where
changes in the cutting process dynamics are continuously
tracked and the control law, which computes
the next feedrate override, is updated
accordingly. This approach has produced significant
cycle time reduction in 5-axis machining of
gas turbine impellers, as reported in Budak and
Kops (2000) and shown in Fig. 5.

Control of Machining Processes, Fig. 5 Example of 5-axis impeller machining with adaptive force control (Source:
Budak and Kops (2000), courtesy of Elsevier)

Control of Machining Processes, Fig. 6 Schematic of the chatter
vibration mechanism for one degree of freedom (From: Altintas (2012),
courtesy of Cambridge University Press)
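
The idea of tracking a time-varying process gain and updating the feed command accordingly can be sketched in a few lines. The example below is a deliberately simplified illustration and not the scheme of the cited references: it assumes a static proportional process, force = gain x feed, with no measurement noise or latency, estimates the gain by scalar recursive least squares with a forgetting factor, and sets the feed so that the predicted force equals a reference value. All names, units, and numbers are hypothetical.

```python
def run_adaptive_force_control(f_ref, process_gains, theta0,
                               p0=1e3, forgetting=0.9):
    """Certainty-equivalence feed override with a recursive gain estimate.

    f_ref          force reference [N]
    process_gains  true gain at each control step [N per mm/tooth]
    theta0         initial gain estimate (must be nonzero)
    Returns the list of resulting forces and the final gain estimate.
    """
    theta, p = theta0, p0
    forces = []
    for k_c in process_gains:
        feed = f_ref / theta              # feed chosen from current estimate
        force = k_c * feed                # assumed static process response
        # Scalar recursive least squares with exponential forgetting.
        gain = p * feed / (forgetting + feed * p * feed)
        theta += gain * (force - theta * feed)
        p = (p - gain * feed * p) / forgetting
        forces.append(force)
    return forces, theta


# The process gain doubles at step 100, e.g., when the engagement increases.
gains = [1000.0] * 100 + [2000.0] * 300
forces, theta = run_adaptive_force_control(200.0, gains, theta0=800.0)
print("peak force after the gain change [N]:", max(forces[100:]))
print("final force [N]:", forces[-1], " final estimate:", theta)
```

In this noise-free setting the force jumps when the gain changes and then returns toward the reference as the estimate converges. A practical implementation would also need feed limits, filtering of the measurement, and a treatment of the override latency mentioned above.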

Control of Chatter Vibrations 

Chatter vibrations are caused by the interaction
of the chip generation mechanism with the
structural dynamics of the machine, tool, and
workpiece assembly (see Fig. 6). The relative
vibration between the tool and workpiece generates
a wavy surface finish. In the consecutive tool
pass, a new wave pattern, caused by the current
instantaneous vibration, is generated on top of
the earlier one. If the formed chip, which has an
undulated geometry, displays a steady average
thickness, then the resulting cutting forces and
vibrations also remain bounded. This leads to
a stable steady-state cutting regime, known as
“forced vibration.” On the other hand, if the
chip thickness keeps increasing at every tool
pass, resulting in increased cutting forces and
vibrations, then chatter vibration is encountered.
Chatter can be extremely detrimental to the
machined part quality, tool life, and the machine
tool.

Chatter has been reported in literature to be 
caused by two main phenomena: self-excitation 
through regeneration and mode coupling. For 
further information on chatter theory, the reader is 
referred to Altintas (2012) as an excellent starting 
point. 

Various mitigation measures have been investigated
and proposed in order to avoid and control
chatter. One widespread approach is to select
chatter-free cutting conditions through detailed
modal testing and stability analyses. Recently,
to achieve higher material removal rates, the
application of active damping has started to receive
interest. This has been realized through specially
designed tools and actuators (Munoa et al.
2013; Pratt and Nayfeh 2001) and has demonstrated
productivity improvement in boring and milling
operations. As another method for chatter suppression,
modulation of the cutting (i.e., spindle)
speed has been successfully applied as a means
of interrupting the regeneration mechanism (Soliman
and Ismail 1997; Zatarain et al. 2008).

Summary and Future Directions 

This article has presented an overview of various 
concepts and emerging technologies in the area of 
machining process control. The new generation 
of machine tools, designed to meet the ever-growing
productivity and efficiency demands,
will likely utilize advanced forms of these ideas
and technologies in an integrated manner. As
more computational power and better sensors
become available at lower cost, one can expect
to see new features, such as more elaborate
trajectory planning algorithms, active vibration
damping techniques, and real-time process
and machine simulation and control capability,
beginning to appear in CNC units. Undoubtedly,
the dynamic analysis and controller design for
such complicated systems will require higher
levels of rigor, so that these new technologies can
be utilized reliably and at their full potential.

Abstract

Control of networks of underwater vehicles is 
critical to underwater exploration, mapping, 
search, and surveillance in the multiscale, 
spatiotemporal dynamics of oceans, lakes, 
and rivers. Control methodologies have been 
derived for tasks including feature tracking and 
adaptive sampling and have been successfully 
demonstrated in the field despite the severe 
challenges of underwater operations.

Introduction

The development of theory and methodology 
for control of networks of underwater vehicles 
is motivated by a multitude of underwater
applications and by the unique challenges
associated with operating in the oceans, 
lakes, and rivers. Tasks include underwater 
exploration, mapping, search, and surveillance, 
associated with problems that include pollution 
monitoring, human safety, resource seeking, 
ocean science, and marine archeology. Vehicle 
networks collect data on underwater physics, 
biology, chemistry, and geology for improving 
the understanding and predictive modeling 
of natural dynamics and human-influenced 
changes in marine environments. Because the 
underwater environment is opaque, inhospitable, 
uncertain, and dynamic, control is critical to the 
performance of vehicle networks. 

Underwater vehicles typically carry sensors 
to measure external environmental signals and 
fields, and thus a vehicle network can be regarded 
as a mobile sensor array. The underlying principle 
of control of networks of underwater vehicles 
leverages their mobility and uses an interacting 
dynamic among the vehicles to yield a high- 
performing collective behavior. If the vehicles 
can communicate their state or measure the relative
state of others, then they can cooperate and
coordinate their motion. 

One of the major drivers of control of underwater
mobile sensor networks is the multiscale,
spatiotemporal dynamics of the environmental
fields and signals. In Curtin et al. (1993), the
concept of the autonomous oceanographic sampling
network (AOSN), featuring a network of
underwater vehicles, was introduced for dynamic
measurement of the ocean environment and resolution
of spatial and temporal gradients in the
sampled fields. For example, to understand the
coupled biological and physical dynamics of the
ocean, data are required both on the small-scale
dynamics of phytoplankton, which are major actors
in the marine ecosystem and the global climate,
and on the large-scale dynamics of the flow
field, temperature, and salinity.

Accordingly, control laws are needed to coordinate
the motion of networks of underwater
vehicles to match the many relevant spatial and
temporal scales. And for a network of underwater
vehicles to perform complex missions reliably
and efficiently, the control must address the many
uncertainties and real-world constraints, including
the influence of currents on the motion of the
vehicles and the limitations on underwater communication.

Vehicles 

Control of networks of underwater vehicles is 
made possible with the availability of small 
(e.g., 1.5-2 m long), relatively inexpensive
autonomous underwater vehicles (AUVs).
Propelled AUVs such as the REMUS provide 
maneuverability and speed. These kinds of AUVs 
respond quickly and agilely to the needs of 
the network, and because of their speed, they 
can often power through strong ocean flows. 
However, propelled AUVs are limited by their 
batteries; for extended missions, they need 
docking stations or other means to recharge their 
batteries. 

Buoyancy-driven autonomous underwater 
gliders, including the Slocum, the Spray, and 
the Seaglider, are a class of endurance AUVs 
designed explicitly for collecting data over large 
three-dimensional volumes continuously over 
periods of weeks or even months (Rudnick et al. 
2004). They move slowly and steadily, and, as a 
result, they are particularly well suited to network 
missions of long duration. 

Gliders propel themselves by alternately increasing
and decreasing their buoyancy using
either a hydraulic or a mechanical buoyancy engine.
Lift generated by flow over fixed wings
converts the vertical ascent/descent induced by
the change in buoyancy into forward motion, resulting
in a sawtooth-like trajectory in the vertical
plane. Gliders can actively redistribute internal
mass to control attitude; for example, they pitch
by sliding their battery pack forward and aft. For
heading control, they either shift mass to roll, bank,
and turn, or deflect a rudder. Some gliders are
designed for deep water, e.g., to 1,500 m, while
others are designed for shallower water, e.g., to 200 m.

Gliders are typically operated at their maximum
speed, and thus they move at approximately
constant speed relative to the flow. Because this is
relatively slow, on the order of 0.3-0.5 m/s in the
horizontal direction and 0.2 m/s in the vertical,
ocean currents can sometimes reach or even exceed
the speed of the gliders. Unlike a propelled
AUV, which typically has sufficient thrust to
maintain course despite currents, a glider trying
to move against a strong current will
make no forward progress. This makes coordinated
control of gliders challenging; for instance,
two sensors that should stay sufficiently far apart
may be pushed toward each other, leading to less
than ideal sampling conditions.

Communication and Sensing 

Underwater communication is one of the biggest 
challenges to the control of networks of underwater
vehicles and one that distinguishes it
from control of vehicles on land or in the air. 
Radio-frequency communication is not typically 
available underwater, and acoustic data telemetry 
has limitations including sensitivity to ambient 
noise, unpredictable propagation, limited bandwidth,
and latency.

When acoustic communication is too limiting, 
vehicles can surface periodically and communicate
via satellite. This method may be bandwidth
limited and will require time and energy. However,
in the case of profiling propelled AUVs
or underwater gliders, they already move in the 
vertical plane in a sawtooth pattern and thus 
regularly come closer to the surface. When on the 
surface, vehicles can also get a GPS fix whereas 
there is no access to GPS underwater. The GPS 
fix is used for correcting onboard dead reckoning 
of the vehicle’s absolute position and for updating 
onboard estimation of the underwater currents, 
both helpful for control. 
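
The use of a GPS fix to update the onboard current estimate can be illustrated with a minimal sketch. It assumes, as a simplification not stated in the text, that dead reckoning integrates only the through-water velocity, so that the whole discrepancy between the dead-reckoned position and the GPS fix is attributed to an average current over the dive. The function name and the local east/north coordinates in meters are hypothetical.

```python
def estimate_current(dr_position, gps_position, elapsed_s):
    """Average current (m/s) implied by the dead-reckoning error over one dive.

    dr_position  -- dead-reckoned (east, north) position at surfacing, in meters
    gps_position -- GPS (east, north) position at surfacing, in meters
    elapsed_s    -- time since the previous GPS fix, in seconds
    Assumption: dead reckoning ignores the current entirely.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    east_error = gps_position[0] - dr_position[0]
    north_error = gps_position[1] - dr_position[1]
    return (east_error / elapsed_s, north_error / elapsed_s)


# A glider surfaces 360 m north of its dead-reckoned position after one hour.
current = estimate_current((1000.0, 0.0), (1000.0, 360.0), 3600.0)
```

In this example the implied average current is 0.1 m/s to the north, which is a sizable fraction of the glider speeds quoted earlier.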

Vehicles are typically equipped with
conductivity-temperature-depth (CTD) sensors,
which measure conductivity, temperature, and pressure.
Salinity and density are derived from these measurements,
and depth and vertical speed follow from the pressure.
Attitude sensors
provide measurements of pitch, roll, and
heading. Position and velocity in the plane are
estimated using dead reckoning. Many sensors
for measuring the environment have been 
developed for use on underwater vehicles; these 
include chlorophyll fluorometers to estimate 
phytoplankton abundance, acoustic Doppler
profilers (ADPs) to measure variations in water
velocity, and sensors to measure pH, dissolved 
oxygen, and carbon dioxide. 


Control 

Described here is a selection of control methodologies
designed to serve a variety of underwater
applications and to address many of the
challenges described above for both propelled 
AUVs and underwater gliders. Some of these 
methodologies have been successfully field tested 
in the ocean. 

Formations for Tracking Gradients, 
Boundaries, and Level Sets in Sampled 
Fields 

While a small underwater vehicle can take only 
single-point measurements of a field, a network 
of N vehicles employing cooperative control 
laws can move as a formation and estimate or 
track a gradient in the field. This can be done in a 
straightforward way in 2D with three vehicles and 
can be extended to 3D with additional vehicles. 
Consider N = 3 vehicles moving together in an 
equilateral triangular formation and sampling a 
2D field $T : \mathbb{R}^2 \to \mathbb{R}$. The formation serves as a
sensor array and the triangle side length defines 
the resolution of the array. 

Let the position of the $i$th vehicle be $\mathbf{x}_i \in \mathbb{R}^2$.
Consider double integrator dynamics $\ddot{\mathbf{x}}_i = \mathbf{u}_i$,
where $\mathbf{u}_i \in \mathbb{R}^2$ is the control force on the $i$th
vehicle. Suppose that each vehicle can measure
the relative position of each of its neighbors,
$\mathbf{x}_{ij} = \mathbf{x}_i - \mathbf{x}_j$. Decentralized control that derives
from an artificial potential is a popular method for
each of the three vehicles to stay in the triangular
formation of prescribed resolution $d_0$. Consider
the nonlinear interaction potential $V_I : \mathbb{R}^2 \to \mathbb{R}$
defined as

$$V_I(\mathbf{x}_{ij}) = k_s \left( \ln \|\mathbf{x}_{ij}\| + \frac{d_0}{\|\mathbf{x}_{ij}\|} \right),$$

where $k_s > 0$ is a scalar gain. The control law
for the $i$th vehicle derives as the gradient of this
potential with respect to $\mathbf{x}_i$ as follows:

$$\ddot{\mathbf{x}}_i = \mathbf{u}_i = -\sum_{j=1,\, j \neq i}^{N} \nabla V_I(\mathbf{x}_{ij}) - k_d \dot{\mathbf{x}}_i,$$

where a damping term is added with scalar gain
$k_d > 0$. Stability of the triangle of resolution $d_0$
is proved with the Lyapunov function

$$V = \frac{1}{2} \sum_{i=1}^{N} \|\dot{\mathbf{x}}_i\|^2 + \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} V_I(\mathbf{x}_{ij}).$$

Now let each vehicle use the sequence of
single-point measurements it takes along its path
to compute the projection of the spatial gradient
onto its normalized velocity, $\mathbf{e}_i = \dot{\mathbf{x}}_i / \|\dot{\mathbf{x}}_i\|$,
i.e., $\nabla T_P(\mathbf{x}, \dot{\mathbf{x}}_i) = (\nabla T(\mathbf{x}) \cdot \mathbf{e}_i)\, \mathbf{e}_i$. Following
Bachmayer and Leonard (2002), let

$$\ddot{\mathbf{x}}_i = \mathbf{u}_i = \kappa \nabla T_P(\mathbf{x}, \dot{\mathbf{x}}_i) - \sum_{j=1,\, j \neq i}^{N} \nabla V_I(\mathbf{x}_{ij}) - k_d \dot{\mathbf{x}}_i,$$

where $\kappa$ is a scalar gain. For $\kappa > 0$, each vehicle
will accelerate along its path when it measures an
increasing $T$ and decelerate for a decreasing $T$.
Each vehicle will also turn to keep up with the
others so that the formation will climb the spatial
gradient of $T$ to find a local maximum.
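
The formation control law can be explored numerically with a minimal sketch, given here under stated assumptions rather than as the authors' implementation. It assumes the interaction potential has the form k_s (ln r + d_0 / r) with r the intervehicle distance, unit vehicle mass, and a semi-implicit Euler integrator. All function names, gains, and step sizes are illustrative choices. The gradient term is switched off by default (kappa = 0) so that only the formation-keeping behavior is exercised.

```python
import math

def grad_potential(xi, xj, k_s=1.0, d0=1.0):
    """Gradient of V_I = k_s*(ln r + d0/r) with respect to x_i, r = ||x_i - x_j||."""
    dx, dy = xi[0] - xj[0], xi[1] - xj[1]
    r = math.hypot(dx, dy)
    scale = k_s * (1.0 / r - d0 / r**2) / r  # (dV/dr) / r
    return (scale * dx, scale * dy)

def projected_gradient(grad_T, v, eps=1e-9):
    """Projection of grad_T onto the unit velocity (zero when nearly at rest)."""
    speed = math.hypot(v[0], v[1])
    if speed < eps:
        return (0.0, 0.0)
    ex, ey = v[0] / speed, v[1] / speed
    dot = grad_T[0] * ex + grad_T[1] * ey
    return (dot * ex, dot * ey)

def simulate(x, v, steps=20000, dt=0.005, k_s=1.0, k_d=1.0, d0=1.0,
             kappa=0.0, grad_T=None):
    """Integrate the double-integrator formation dynamics (semi-implicit Euler)."""
    x = [list(p) for p in x]
    v = [list(p) for p in v]
    n = len(x)
    for _ in range(steps):
        acc = []
        for i in range(n):
            ax, ay = -k_d * v[i][0], -k_d * v[i][1]  # damping term
            for j in range(n):
                if j != i:
                    gx, gy = grad_potential(x[i], x[j], k_s, d0)
                    ax -= gx
                    ay -= gy
            if grad_T is not None:  # optional gradient-climbing term
                px, py = projected_gradient(grad_T(x[i]), v[i])
                ax += kappa * px
                ay += kappa * py
            acc.append((ax, ay))
        for i in range(n):
            v[i][0] += dt * acc[i][0]
            v[i][1] += dt * acc[i][1]
            x[i][0] += dt * v[i][0]
            x[i][1] += dt * v[i][1]
    return x, v

def side_lengths(x):
    """All pairwise distances between vehicles."""
    return [math.hypot(x[i][0] - x[j][0], x[i][1] - x[j][1])
            for i in range(len(x)) for j in range(i + 1, len(x))]

# Three vehicles starting at rest in a scalene triangle.
x_final, v_final = simulate([(0.0, 0.0), (2.0, 0.0), (0.5, 1.5)],
                            [(0.0, 0.0)] * 3)
sides = side_lengths(x_final)
```

Under these assumptions the three side lengths settle near d_0, since the potential has its minimum at r = d_0. Passing a callable grad_T and a positive kappa adds the projected gradient term; a vehicle at rest receives no gradient force in this sketch because its direction of travel is undefined.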

Alternative control strategies have been developed
that add versatility in feature tracking. The
virtual body and artificial potential (VBAP) multivehicle
control methodology (Ogren et al. 2004)
was demonstrated with a network of Slocum 
autonomous underwater gliders in the AOSN II 
field experiment in Monterey Bay, California, 
in August 2003 (Fiorelli et al. 2006). VBAP is 
well suited to the operational scenario described 
above in which vehicles surface asynchronously 
to establish communication with a base. 

VBAP is a control methodology for coordinating
the translation, rotation, and dilation of a
group of vehicles. A virtual body is defined by 
a set of reference points that move according to 
dynamics that are computed centrally and made 
available to the vehicles in the group. Artificial 
potentials are used to couple the dynamics of 
vehicles and a virtual body so that control laws 
can be derived that stabilize desired formations 
of vehicles and a virtual body. When sampled
measurements of a scalar field can be communicated,
the local gradients can be estimated.
Gradient climbing algorithms prescribe virtual 
body direction, so that, for example, the vehicle 
network can be directed to head for the coldest 
water or the highest concentration of phytoplankton.
Further, the formation can be dilated so that
the resolution can be adapted to minimize error in 
estimates. Control of the speed of the virtual body 
ensures stability and convergence of the vehicle 
formation. 

These ideas have been extended further to 
design provable control laws for cooperative level 
set tracking, whereby small vehicle groups cooperate
to generate contour plots of noisy, unknown
fields, adjusting their formation shape to provide 
optimal filtering of their noisy measurements 
(Zhang and Leonard 2010). 

Motion Patterns for Adaptive Sampling 

A central objective in many underwater applications
is to design provable and reliable mobile
sensor networks for collecting the richest data
set in an uncertain environment given limited resources.
Consider the sampling of a single time-
and space-varying scalar field, like temperature
T, using a network of vehicles, where the control
problem is to coordinate the motion of the network
to maximize information on this field over
a given area or volume.

The definition of the information metric will 
depend on the application. If the data are to 
be assimilated into a high-resolution dynamical 
ocean model, then the metric would be defined 
by uncertainty as computed by the model. A 
general-purpose metric, based on objective analysis
(linear statistical estimation from given field
statistics), specifies the statistical uncertainty of
the field model as a function of where and when
the data were taken (Bennett 2002). The a posteriori
error $A(\mathbf{r}, t)$ is the variance of $T$ about
its estimate at location $\mathbf{r}$ and time $t$. Entropic
information over a spatial domain of area $\mathcal{A}$ is

$$\mathcal{I}(t) = -\log\left( \frac{1}{\sigma_0 \mathcal{A}} \int d\mathbf{r}\, A(\mathbf{r}, t) \right),$$

where $\sigma_0$ is a scaling factor (Grocholsky 2002).
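
As a rough illustration of how such a metric might be evaluated, the following sketch approximates the integral on a uniform grid. It assumes the metric has the form of the negative logarithm of the domain-averaged error variance divided by the scaling factor; the function name, the grid representation, and the argument names are hypothetical.

```python
import math

def entropic_information(error_variance, cell_area, sigma0):
    """Grid approximation of -log( (1/(sigma0*area)) * integral of A over the domain ).

    error_variance -- list of rows holding the error variance A in each grid cell
    cell_area      -- area of one grid cell
    sigma0         -- scaling factor
    """
    n_cells = sum(len(row) for row in error_variance)
    total_area = cell_area * n_cells
    integral = cell_area * sum(sum(row) for row in error_variance)
    return -math.log(integral / (sigma0 * total_area))

# Uniform error equal to the scaling factor: no information gained.
baseline = entropic_information([[1.0, 1.0], [1.0, 1.0]], 0.25, 1.0)
# Halving the error variance everywhere raises the metric by log(2).
improved = entropic_information([[0.5, 0.5], [0.5, 0.5]], 0.25, 1.0)
```

With this normalization the metric is zero when the average error variance equals the scaling factor and grows as sampling reduces the error.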


Computing coordinated trajectories to maximize
$\mathcal{I}(t)$ can in principle be addressed using
optimal coverage control methods. However, this 
coverage problem is especially challenging since 
the uncertainty field is spatially nonuniform and 
it changes with time and with the motion of 
the sampling vehicles. Furthermore, the optimal 
trajectories may become quite complex so that 
controlling vehicles to them in the presence of 
dynamic disturbances and uncertainty may lead 
to suboptimal performance. 

An alternative approach decouples the design 
of motion patterns to optimize the entropic information
metric from the decentralized control
laws that stabilize the network onto the motion 
patterns (see Leonard et al. 2007). This approach 
was demonstrated with a network of 6 Slocum 
autonomous underwater gliders in a 24-day-long 
field experiment in Monterey Bay, California, in 
August 2006 (see Leonard et al. 2010). The coordinating
feedback laws for the individual vehicles
derive systematically from a control methodology
that provides provable stabilization of a parameterized
family of collective motion patterns
(Sepulchre et al. 2008). These patterns consist of 
vehicles moving on a finite set of closed curves 
with spacing between vehicles defined by a small 
number of “synchrony” parameters. The feedback
laws that stabilize a given motion pattern use
the same synchrony parameters that distinguish 
the desired pattern. 

Each vehicle moves in response to the relative 
position and direction of its neighbors so that it 
keeps moving, it maintains the desired spacing, 
and it stays close to its assigned curve. It has been 
observed in the ocean, for vehicles carrying out 
this coordinated control law, that “when a vehicle 
on a curve is slowed down by a strong opposing 
flow field, it will cut inside a curve to make up 
distance and its neighbor on the same curve will 
cut outside the curve so that it does not overtake 
the slower vehicle and compromise the desired 
spacing” (Leonard et al. 2010). The approach is 
robust to vehicle failure since there are no leaders 
in the network, and it is scalable since the control 
law for each vehicle can be defined in terms of 
the state of a few other vehicles, independent of 
the total number of vehicles.

The control methodology prescribes steering
laws for vehicles operated at a constant speed.
Assume that the $i$th vehicle moves at unit speed in
the plane in the direction $\theta_i(t)$ at time $t$. Then, the
velocity of the $i$th vehicle is $\dot{\mathbf{x}}_i = (\cos\theta_i, \sin\theta_i)$.
The steering control $u_i$ is the component of the
force in the direction normal to velocity, such that
$\dot{\theta}_i = u_i$ for $i = 1, \ldots, N$. Define

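$$U(\theta_1, \ldots, \theta_N) = \frac{N}{2} \|p_\theta\|^2, \qquad p_\theta = \frac{1}{N} \sum_{j=1}^{N} \dot{\mathbf{x}}_j.$$

$U$ is a potential function that is maximal
when all vehicle directions are synchronized ($\|p_\theta\| = 1$) and
minimal at 0 when all vehicle directions are
perfectly anti-synchronized ($p_\theta = 0$). Let $\tilde{\mathbf{x}}_i = (\tilde{x}_i, \tilde{y}_i) =
(1/N) \sum_{j=1}^{N} \mathbf{x}_{ij}$ and let $\tilde{\mathbf{x}}_i^{\perp} = (-\tilde{y}_i, \tilde{x}_i)$. Define

$$S(\mathbf{x}_1, \ldots, \mathbf{x}_N, \theta_1, \ldots, \theta_N) = \frac{1}{2} \sum_{i=1}^{N} \|\dot{\mathbf{x}}_i - \omega_0 \tilde{\mathbf{x}}_i^{\perp}\|^2,$$

where $\omega_0 \neq 0$. $S$ is a potential function that is
minimal at 0 for circular motion of the vehicles
around their center of mass with radius $\rho_0 = |\omega_0|^{-1}$.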

Define the steering control as

$$\dot{\theta}_i = \omega_0 \left( 1 + K_c \langle \tilde{\mathbf{x}}_i, \dot{\mathbf{x}}_i \rangle \right) - K_\theta \sum_{j=1}^{N} \sin(\theta_j - \theta_i),$$

where $K_c > 0$ and $K_\theta$ are scalar gains. Then, circular
motion of the network is a steady solution,
with the phase-locked heading arrangement a
minimum of $K_\theta U$, i.e., synchronized or perfectly
anti-synchronized depending on the sign of $K_\theta$.
Stability can be proved with the Lyapunov function
$V_{c\theta} = K_c S + K_\theta U$. This steering control
law depends only on relative position and relative
heading measurements of the other vehicles.
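
The claim that circular motion is a steady solution can be checked with a short sketch. It assumes the steering law has the form: turning rate equals omega_0 times (1 + K_c times the inner product of the centroid-relative position and the unit velocity), minus K_theta times the sum of sines of heading differences. The function names, gain values, and the evenly spaced test configuration are illustrative assumptions.

```python
import math

def steering_rates(positions, headings, omega0=0.5, K_c=1.0, K_theta=0.1):
    """Turning rate of each unit-speed vehicle under the assumed steering law."""
    n = len(positions)
    cx = sum(p[0] for p in positions) / n
    cy = sum(p[1] for p in positions) / n
    rates = []
    for (x, y), th in zip(positions, headings):
        # (1/N) * sum_j (x_i - x_j) equals the position relative to the centroid.
        tx, ty = x - cx, y - cy
        inner = tx * math.cos(th) + ty * math.sin(th)
        coupling = sum(math.sin(tj - th) for tj in headings)
        rates.append(omega0 * (1.0 + K_c * inner) - K_theta * coupling)
    return rates

def evenly_spaced_circle(n, omega0):
    """Vehicles evenly spaced on a circle of radius 1/|omega0|, headings tangent."""
    rho = 1.0 / abs(omega0)
    positions, headings = [], []
    for k in range(n):
        phi = 2.0 * math.pi * k / n
        positions.append((rho * math.cos(phi), rho * math.sin(phi)))
        # Counterclockwise travel for omega0 > 0, clockwise for omega0 < 0.
        headings.append(phi + math.copysign(math.pi / 2.0, omega0))
    return positions, headings

positions, headings = evenly_spaced_circle(6, 0.5)
rates = steering_rates(positions, headings, omega0=0.5)
```

In this configuration each velocity is perpendicular to the centroid-relative position and the heading differences sum to zero, so every vehicle turns at the constant rate omega_0 and stays on the circle. The sketch only evaluates the law at one configuration; it does not demonstrate convergence from arbitrary initial conditions.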

The general form of the methodology extends
the above control law to network interconnections
defined by possibly time-varying graphs
with limited sensing or communication links,
and it provides systematic control laws to stabilize
symmetric patterns of heading distributions
about noncircular closed curves. It also allows
for multiple graphs to handle multiple scales.
For example, in the 2006 field experiment, the 
default motion pattern was one in which six 
gliders moved in coordinated pairs around three 
closed curves; one graph defined the smaller- 
scale coordination of each pair of gliders about 
its curve, while a second graph defined the larger- 
scale coordination of gliders across the three 
curves. 

Implementation 

Implementation of control of networks of 
underwater vehicles requires coping with the 
remote, hostile underwater environment. The 
control methodology for motion patterns 
and adaptive sampling, described above, was 
implemented in the field using a customized 
software infrastructure called the Glider 
Coordinated Control System (GCCS) (Paley et al. 
2008). The GCCS combines a simple model for 
control planning with a detailed model of glider 
dynamics to accommodate the constant speed of 
gliders, relatively large ocean currents, waypoint 
tracking routines, communication only when 
gliders surface (asynchronously), other latencies, 
and more. Other approaches consider control 
design in the presence of a flow field, formal 
methods to integrate high-resolution models of 
the flow field, and design tailored to propelled 
AUVs. 


Summary and Future Directions 

The multiscale, spatiotemporal dynamics of the
underwater environment drive the need for well-coordinated
control of networks of underwater
vehicles that can manage the significant operational
challenges of the opaque, uncertain, inhospitable,
and dynamic oceans, lakes, and rivers.
Control theory and algorithms have been developed
to enable networks of vehicles to successfully
operate as adaptable sensor arrays in
missions that include feature tracking and adaptive
sampling. Future work will improve control
in the presence of strong and unpredictable flow
fields and will leverage the latest in battery and
underwater communication technologies. Hybrid
vehicles and heterogeneous networks of vehicles
will also promote advances in control. Future
work will draw inspiration from the rapidly growing
literature in decentralized cooperative control
strategies and complex dynamic networks.
Dynamics of decision-making teams of robotic 
vehicles and humans is yet another important 
direction of research that will impact the success 
of control of networks of underwater vehicles. 


Cross-References 

► Motion Planning for Marine Control Systems 

► Underactuated Marine Control Systems 


Recommended Reading 

In Bellingham and Rajan (2007), it is argued that
cooperative control of robotic vehicles is especially
useful for exploration in remote and hostile
environments such as the deep ocean. A recent 
survey of robotics for environmental monitoring, 
including a discussion of cooperative systems, 
is provided in Dunbabin and Marques (2012). 
A survey of work on cooperative underwater 
vehicles is provided in Redfield (2013).

Abstract

The reader is introduced to the predictor feedback 
method for the control of general nonlinear systems
with input delays of arbitrary length. The
delays need not necessarily be constant but can 
be time-varying or state-dependent. The predictor 
feedback methodology employs a model-based 
construction of the (unmeasurable) future state of
the system. The analysis methodology is based on
the concept of infinite-dimensional backstepping 
transformation - a transformation that converts 
the overall feedback system to a new, cascade 
“target system” whose stability can be studied 
with the construction of a Lyapunov function. 


Keywords 

Distributed parameter systems; Delay systems; 
Backstepping; Lyapunov function 


Nonlinear Systems with Input Delay 

Nonlinear systems of the form

$$\dot{X}(t) = f\big(X(t), U(t - D(t, X(t)))\big), \tag{1}$$

where $t \in \mathbb{R}_+$ is time, $f : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}^n$
is a vector field, $X \in \mathbb{R}^n$ is the state, $D :
\mathbb{R}_+ \times \mathbb{R}^n \to \mathbb{R}_+$ is a nonnegative function of
time and the state of the system, and $U \in \mathbb{R}$ is the scalar
input, are ubiquitous in applications. The starting
point for designing a control law for (1), as well
as for analyzing the dynamics of (1), is to consider
the delay-free counterpart of (1), i.e., when
$D = 0$, for which a plethora of results exists
dealing with its stabilization and Lyapunov-based
analysis (Krstic et al. 1995).

Systems of the form (1) constitute more 
realistic models for physical systems than 
delay-free systems. The reason is that often 
in engineering applications the control that is 
applied to the system does not immediately affect 
the system. This dead time until the controller can 
affect the system might be due to, among other 
things, the long distance of the controller from 
the system, such as, for example, in networked 
control systems, or due to finite-speed transport 
or flow phenomena, such as, for example, in 
additive manufacturing and cooling systems, or 
due to various after-effects, such as, for example, 
in population dynamics. 


The first step toward control design and analysis
for system (1) is to consider the special
case in which D = const. The next step is to
consider the special case of system (1), in which 
D = D(t), i.e., the delay is an a priori given 
function of time. Systems with time-varying delays
model numerous real-world systems, such
as networked control systems, traffic systems,
or irrigation channels. Assuming that the input 
delay is an a priori defined function of time is 
a plausible assumption for some applications. 
Yet, the time-variation of the delay might be 
the result of the variation of a physical quantity 
that has its own dynamics, such as, for example, 
in milling processes (due to speed variations), 
3D printers (due to distance variations), cooling 
systems (due to flow rate variations), and population
dynamics (due to variations in population size).
Processes in this category can be modeled
by systems with a delay that is a function of 
the state of the system, i.e., by (1) with D = 
D(X). 

In this article control designs are presented 
for the stabilization of nonlinear systems with 
input delays, with delays that are constant (Krstic 
2009), time-varying (Bekiaris-Liberis and Krstic 
2012) or state-dependent (Bekiaris-Liberis and 
Krstic 2013b), employing predictor feedback, 
i.e., employing a feedback law that uses the future 
rather than the current state of the system. Since 
one employs in the feedback law the future values 
of the state, the predictor feedback completely 
cancels (compensates) the input delay, i.e., after 
the control signal reaches the system, the state 
evolves as if there were no delay at all. Since the 
future values of the state are not a priori known, 
the main control challenge is the implementation 
of the predictor feedback law. Having determined 
the predictor, the control law is then obtained 
by replacing the current state in a nominal state- 
feedback law (which stabilizes the delay-free 
system) by the predictor. 

A methodology is presented in the article 
for the stability analysis of the closed-loop 
system under predictor feedback by constructing 
Lyapunov functionals. The Lyapunov functionals 
are constructed for a transformed (rather than
the original) system. The transformed system
is, in turn, constructed by transforming the
original actuator state $U(\theta)$, $\theta \in [t - D, t]$,
to a transformed actuator state with the
aid of an infinite-dimensional backstepping 
transformation. The overall transformed system 
is easier to analyze than the original system 
because it is a cascade, rather than a feedback 
system, consisting of a delay line with zero 
input, whose effect fades away in finite time, 
namely, after D time units, cascaded with an 
asymptotically stable system. 


Predictor Feedback 

The predictor feedback designs are based on a
feedback law $U(t) = \kappa(X(t))$ that renders the
closed-loop system $\dot{X} = f(X, \kappa(X))$ globally
asymptotically stable. For stabilizing system (1),
the following control law is employed
instead

$$U(t) = \kappa(P(t)), \tag{2}$$

where

$$P(\theta) = X(t) + \int_{t - D(t, X(t))}^{\theta} \frac{f(P(s), U(s))}{1 - D_t(\sigma(s), P(s)) - \nabla D(\sigma(s), P(s))\, f(P(s), U(s))}\, ds, \tag{3}$$

$$\sigma(\theta) = t + \int_{t - D(t, X(t))}^{\theta} \frac{1}{1 - D_t(\sigma(s), P(s)) - \nabla D(\sigma(s), P(s))\, f(P(s), U(s))}\, ds, \tag{4}$$



for all $t - D(t, X(t)) \le \theta \le t$. The signal
$P$ is the predictor of $X$ at the appropriate
prediction time $\sigma$, i.e., $P(t) = X(\sigma(t))$.
This fact is explained in more detail in the next
paragraphs of this section. The predictor employs
the future values of the state $X$, which
are not a priori available. Therefore, for actually
implementing the feedback law (2) one has to
employ (3). Relation (3) is a formula for the
future values of the state that depends on the
available measured quantities, i.e., the current
state $X(t)$ and the history of the actuator state
$U(\theta)$, $\theta \in [t - D(t, X(t)), t]$. To make clear the
definitions of the predictor $P$ and the prediction
time $\sigma$, as well as their implementation through
formulas (3) and (4), the constant delay case is
discussed first.

The idea of predictor feedback is to employ in
the control law the future values of the state at
the appropriate future time, such that the effect
of the input delay is completely canceled (compensated).
Define the quantity $\phi(t) = t - D$,
which from now on is referred to as the delayed
time. This is the time instant at which the
control signal that currently affects the system
was actually applied. To cancel the effect of this
delay, the control law (2) is designed such that
$U(\phi(t)) = U(t - D) = \kappa(X(t))$, i.e., such
that $U(t) = \kappa(X(\phi^{-1}(t))) = \kappa(X(t + D))$.
Define the prediction time $\sigma$ through the relation
$\phi^{-1}(t) = \sigma(t) = t + D$. This is the time
instant at which an input signal that is currently
applied actually affects the system. In the case of
a constant delay, the prediction time is simply $D$
time-units in the future. Next an implementable
formula for $X(\sigma(t)) = X(t + D)$ is derived.
Performing a change of variables $t = \theta + D$, for
all $t - D \le \theta \le t$, in $\dot{X}(t) = f(X(t), U(t - D))$
and integrating in $\theta$ starting at $\theta = t - D$, one
can conclude that $P$ defined by (3) with $D_t =
\nabla D f = 0$ and $D = \text{const}$ is the $D$ time-units
ahead predictor of $X$, i.e., $P(t) = X(\sigma(t)) =
X(t + D)$.

To better understand definition (3), the
case of a linear system with a constant input
delay $D$, i.e., a system of the form $\dot{X}(t) =
AX(t) + BU(t - D)$, is considered next (see
also ► Control of Linear Systems with Delays
and Hale and Verduyn Lunel (1993)). In this
case, the predictor $P(t)$ is given explicitly
using the variation of constants formula, with
the initial condition $P(t - D) = X(t)$, as
$P(t) = e^{AD} X(t) + \int_{t-D}^{t} e^{A(t-\theta)} B U(\theta)\, d\theta$.
For systems that are nonlinear, $P(t)$ cannot be
written explicitly, for the same reason that a
nonlinear ODE cannot be solved explicitly. So
$P(t)$ is represented implicitly using the nonlinear
integral equation (3). The computation of $P(t)$
from (3) is straightforward with a discretized
implementation in which $P(t)$ is assigned values
based on the right-hand side of (3), which
involves earlier values of $P$ and the values of
the input $U$.
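
A discretized implementation of the constant-delay predictor can be sketched as follows. This is an illustrative forward-Euler scheme for a scalar state, not the scheme used in the cited works; the function names, the uniform sampling of the input history, and the left-endpoint rule are assumptions. The linear case with constant input is included only as a reference value, using the variation of constants formula.

```python
import math

def predict_constant_delay(f, x_now, input_history, delay):
    """Forward-Euler approximation of P(t) = X(t) + int_{t-D}^{t} f(P(s), U(s)) ds.

    f             -- right-hand side f(x, u) of the delay-free dynamics (scalar state)
    x_now         -- current state X(t), which equals P(t - D)
    input_history -- samples of U on [t - D, t), oldest first, uniformly spaced
    delay         -- constant delay D
    """
    n = len(input_history)
    if n == 0:
        raise ValueError("input_history must not be empty")
    h = delay / n
    p = x_now
    for u in input_history:
        p = p + h * f(p, u)  # earlier values of P and U give the next P
    return p

def exact_linear_predictor(a, b, x_now, u_const, delay):
    """Variation of constants for dX/dt = a*X + b*U with constant input (a != 0)."""
    growth = math.exp(a * delay)
    return growth * x_now + b * u_const * (growth - 1.0) / a

# Scalar unstable plant dX/dt = 0.5*X + U with D = 1 and constant input history.
approx = predict_constant_delay(lambda x, u: 0.5 * x + u, 2.0, [1.0] * 1000, 1.0)
exact = exact_linear_predictor(0.5, 1.0, 2.0, 1.0, 1.0)
```

With 1000 samples over the delay window, the Euler value agrees with the explicit linear formula to within a few parts in a thousand; a finer grid or a higher-order rule would tighten the agreement.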

The case $D = D(t)$ is considered next. As
in the case of constant delays the main goal is
to implement the predictor $P$. One needs first to
define the appropriate time interval over which
the predictor of the state is needed, which, in
the constant delay case, is simply $D$ time-units
in the future. The control law has to satisfy
$U(\phi(t)) = \kappa(X(t))$, or, $U(t) = \kappa(X(\sigma(t)))$.
Hence, one needs to find an implementable formula
for $P(t) = X(\sigma(t))$. In the constant
delay case the prediction horizon over which one
needs to compute the predictor can be determined
based on the knowledge of the delay time since
the prediction horizon and the delay time are
both equal to $D$. This is no longer true in
the time-varying case in which the delayed time
is defined as $\phi(t) = t - D(t)$, whereas the
prediction time as $\phi^{-1}(t) = \sigma(t) = t + D(\sigma(t))$.
Employing a change of variables in $\dot{X}(t) =
f(X(t), U(t - D(t)))$ as $t = \sigma(\theta)$, for all
$\phi(t) \le \theta \le t$, and integrating in $\theta$ starting at
$\theta = \phi(t)$ one obtains the formula for $P$ given
by (3) with $D_t = D'(\sigma(t))$, $\nabla D f = 0$ and
$D = D(t)$.
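
For a known time-varying delay, the prediction time can also be found numerically by inverting the delayed-time map. The sketch below uses bisection and rests on two assumptions that are not stated in the text: the delay is bounded above by a known constant d_max, and its derivative is smaller than one, so that s - D(s) is increasing in s. The function and argument names are hypothetical.

```python
import math

def prediction_time(t, delay, d_max, iters=60):
    """Solve s - delay(s) = t for s by bisection on [t, t + d_max].

    Assumes 0 <= delay(s) <= d_max and that s - delay(s) is increasing.
    """
    lo, hi = t, t + d_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - delay(mid) < t:
            lo = mid  # the input applied at t has not yet arrived at time mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Slowly oscillating delay between 0.25 and 0.75 time units.
D = lambda s: 0.5 + 0.25 * math.sin(s)
sigma = prediction_time(1.0, D, 0.75)
```

For a constant delay the routine reduces to adding the delay to the current time, which matches the constant-delay discussion above.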

Next the case $D = D(X(t))$ is considered.
First one has to determine the predictor, i.e.,
the signal $P$ such that $P(t) = X(\sigma(t))$, where
$\sigma(t) = \phi^{-1}(t)$ and $\phi(t) = t - D(X(t))$. In
the case of state-dependent delay, the prediction
time $\sigma(t)$ depends on the predictor itself, i.e.,
the time when the current control reaches
the system depends on the value of the state
at that time, namely, the following implicit
relationship holds: $P(t) = X(t + D(P(t)))$
(and $X(t) = P(t - D(X(t)))$). This implicit
relation can be solved by proceeding as in the
time-varying case, i.e., by performing the change
of variables $t = \sigma(\theta)$, for all $t - D(X(t)) \le
\theta \le t$, in $\dot{X}(t) = f(X(t), U(t - D(X(t))))$ and
integrating in $\theta$ starting at $\theta = t - D(X(t))$,
to obtain the formula (3) for $P$ with $D_t = 0$,
$\nabla D f = \nabla D(P(s))\, f(P(s), U(s))$ and $D =
D(X(t))$.

Analogously, one can derive the predictor for
the case $D = D(t, X(t))$ with the difference
that now the prediction time is not given explicitly
in terms of $P$, but it is defined through an
implicit relation, namely, it holds that $\sigma(t) =
t + D(\sigma(t), P(t))$. Therefore, for actually computing
$\sigma$ one has to proceed as in the derivation
of $P$, i.e., to differentiate relation $\sigma(\theta) =
\theta + D(\sigma(\theta), P(\theta))$ and then integrate starting
at the known value $\sigma(t - D(t, X(t))) = t$. It
is important to note that the integral equation (4)
is needed in the computation of $P$ only when $D$
depends on both $X$ and $t$.


Backstepping Transformation and 
Stability Analysis 

The predictor feedback designs are based on a
feedback law $\kappa(X)$ that renders the closed-loop
system $\dot{X} = f(X, \kappa(X))$ globally asymptotically
stable. However, in the rest of the section
it is assumed that the feedback law $\kappa(X)$ renders
the closed-loop system $\dot{X} = f(X, \kappa(X) + v)$
input-to-state stable (ISS) with respect to $v$, i.e.,
there exists a smooth function $S : \mathbb{R}^n \to \mathbb{R}_+$ and
class $\mathcal{K}_\infty$ functions $\alpha_1, \alpha_2, \alpha_3, \alpha_4$ such that

$$\alpha_3(|X(t)|) \le S(X(t)) \le \alpha_4(|X(t)|) \qquad (5)$$

$$\frac{\partial S(X(t))}{\partial X} f(X(t), \kappa(X(t)) + v(t)) \le -\alpha_1(|X(t)|) + \alpha_2(|v(t)|). \qquad (6)$$


Imposing this stronger assumption enables one
to construct a Lyapunov functional for the
closed-loop systems (1)-(4) with the aid of the
Lyapunov characterization of ISS defined in (5)
and (6).

The stability analysis of the closed-loop
systems (1)-(4) is explained next. Denote the
infinite-dimensional backstepping transformation
of the actuator state as

$$W(\theta) = U(\theta) - \kappa(P(\theta)), \quad \text{for all } t - D(t, X(t)) \le \theta \le t, \qquad (7)$$

where $P(\theta)$ is given in terms of $U(\theta)$ from (3).
Using the fact that $P(t - D(t, X(t))) = X(t)$,
for all $t \ge 0$, one gets from (7) that
$U(t - D(t, X(t))) = W(t - D(t, X(t))) + \kappa(X(t))$.
With the fact that for all $\theta \ge 0$,
$U(\theta) = \kappa(P(\theta))$ one obtains from (7) that
$W(\theta) = 0$, for all $\theta \ge 0$. Yet, for all
$t \le D(t, X(t))$, i.e., for all $\theta \le 0$, $W(\theta)$ might
be nonzero due to the effect of the arbitrary initial
condition $U(\theta)$, $\theta \in [-D(0, X(0)), 0]$. With the
above observations, one can transform system (1)
with the aid of transformation (7) to the following
target system

$$\dot{X}(t) = f(X(t), \kappa(X(t)) + W(t - D(t, X(t)))) \qquad (8)$$

$$W(t - D(t, X(t))) = 0, \quad \text{for } t - D(t, X(t)) \ge 0. \qquad (9)$$

Using relations (5), (6), and (8), (9) one can
construct the following Lyapunov functional
for showing asymptotic stability of the target
system (8), (9), i.e., for the overall system
consisting of the vector $X(t)$ and the transformed
infinite-dimensional actuator state $W(\theta)$,
$t - D(t, X(t)) \le \theta \le t$,

$$V(t) = S(X(t)) + \frac{2}{c} \int_0^{L(t)} \frac{\alpha_2(r)}{r}\,dr, \qquad (10)$$

where $c > 0$ is arbitrary and

$$L(t) = \sup_{t - D(t, X(t)) \le \theta \le t} \left| e^{c(\sigma(\theta) - t)} W(\theta) \right|. \qquad (11)$$

With the invertibility of the backstepping transformation
one can then show global asymptotic
stability of the closed-loop system in the original
variables $(X, U)$. In particular, there exists a class
$\mathcal{KL}$ function $\beta$ such that

$$|X(t)| + \sup_{t - D(t, X(t)) \le \theta \le t} |U(\theta)| \le \beta\left( |X(0)| + \sup_{-D(0, X(0)) \le \theta \le 0} |U(\theta)|,\; t \right), \quad \text{for all } t \ge 0. \qquad (12)$$

One of the main obstacles in designing globally
stabilizing control laws for nonlinear systems
with long input delays is the finite escape
phenomenon. The input delay may be so large
that the control signal cannot reach the system
before its state grows unbounded. Therefore, one
has to assume that the system $\dot{X} = f(X, \omega)$ is
forward complete, i.e., for every initial condition
and every bounded input signal the corresponding
solution is defined for all $t \ge 0$.

With the forward completeness requirement,
estimate (12) holds globally for constant but
arbitrarily large delays. For the case of time-varying
delays, estimate (12) holds globally as
well but under the following four conditions on
the delay:

C1. $D(t) \ge 0$. This condition guarantees the
causality of the system.

C2. $D(t) < \infty$. This condition guarantees that
all inputs applied to the system eventually
reach the system.

C3. $\dot{D}(t) < 1$. This condition guarantees that the
system never feels input values that are older
than the ones it has already felt, i.e., the input
signal's direction never gets reversed. (This
condition guarantees the existence of
$\sigma = \phi^{-1}$.)

C4. $\dot{D}(t) > -\infty$. This condition guarantees that
the delay cannot disappear instantaneously,
but only gradually.
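
The role of condition C3 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the delay $D(t) = 1 + 0.5\sin t$, the function names, and the bound `d_max` are hypothetical choices, not taken from the text. It spot-checks C1-C4 on a grid and computes the prediction time $\sigma(t) = \phi^{-1}(t)$ by bisection, which is well posed because C3 makes $\phi(t) = t - D(t)$ strictly increasing.

```python
import math

def D(t):
    """Hypothetical time-varying delay D(t) = 1 + 0.5 sin(t) (an assumption)."""
    return 1.0 + 0.5 * math.sin(t)

def D_dot(t):
    """Time derivative of the hypothetical delay."""
    return 0.5 * math.cos(t)

def phi(t):
    """Delayed time phi(t) = t - D(t)."""
    return t - D(t)

def sigma(t, d_max=1.5, tol=1e-12):
    """Prediction time sigma(t) = phi^{-1}(t), found by bisection.

    Because D_dot < 1 (condition C3), phi is strictly increasing, and
    because 0 <= D <= d_max, the root of phi(s) = t lies in [t, t + d_max].
    """
    lo, hi = t, t + d_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def check_conditions(t_grid):
    """Spot-check C1-C4 on a finite grid (a numerical check, not a proof)."""
    return all(
        D(t) >= 0.0 and math.isfinite(D(t))
        and D_dot(t) < 1.0 and math.isfinite(D_dot(t))
        for t in t_grid
    )

grid = [0.01 * i for i in range(2000)]
conditions_ok = check_conditions(grid)
s = sigma(2.0)
residual = abs(s - (2.0 + D(s)))  # sigma(t) should satisfy sigma = t + D(sigma)
```

If C3 were violated, $\phi$ would no longer be monotone and the bisection bracket could contain several roots, which mirrors the loss of existence of $\sigma = \phi^{-1}$.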

In the case of state-dependent delays, the delay
depends on time as a result of its dependency on
the state. Therefore, predictor feedback guarantees
stabilization of the system when the delay
satisfies the four conditions C1-C4. Yet, since
the delay is a nonnegative function of the state,
conditions C2-C4 are satisfied by restricting the
initial state $X$ and the initial actuator state. Therefore,
estimate (12) holds locally.

Cross-References 

► Control of Linear Systems with Delays 


Recommended Reading 

The main control design tool for general systems 
with input delays of arbitrary length is predictor 
feedback. The reader is referred to Artstein 
(1982) for the first systematic treatment of 
general linear systems with constant input delays. 
The applicability of predictor feedback was 
extended in Krstic (2009) to several classes of 
systems, such as nonlinear systems with constant 
input delays and linear systems with unknown 
input delays. Subsequently, predictor feedback 
was extended to general nonlinear systems with 
nonconstant input and state delays (Bekiaris-Liberis
and Krstic 2013a). The main stability
analysis tool for systems employing predictor
feedback is backstepping. Backstepping was
initially introduced for adaptive control of
finite-dimensional nonlinear systems (Krstic et al.
1995). The continuum version of backstepping
was originally developed for the boundary 
control of several classes of PDEs in Krstic and 
Smyshlyaev (2008). 
Abstract

Quantum control theory is concerned with the
control of systems whose dynamics are governed
by the laws of quantum mechanics. Quantum
control may take the form of open loop quantum
control or quantum feedback control. Also,
quantum feedback control may consist of
measurement based feedback control, in which the
controller is a classical system governed by the
laws of classical physics. Alternatively, quantum
feedback control may take the form of coherent
feedback control in which the controller is a
quantum system governed by the laws of quantum
mechanics. In the area of open loop quantum
control, questions of controllability along with
optimal control and Lyapunov control methods
are discussed. In the case of quantum feedback
control, LQG and $H^\infty$ control methods are
discussed.
Coherent quantum feedback; Measurement based
quantum feedback; Quantum control; Quantum

Introduction

Quantum control is the control of systems whose
dynamics are described by the laws of quantum
physics rather than classical physics. The
dynamics of quantum systems must be described
using quantum mechanics, which allows for
uniquely quantum behavior such as entanglement
and coherence. There are two main approaches
to quantum mechanics, which are referred to
as the Schrödinger picture and the Heisenberg
picture. In the Schrödinger picture, quantum
systems are modeled using the Schrödinger
equation or a master equation, which describe the
evolution of the system state or density operator.
In the Heisenberg picture, quantum systems are
modeled using quantum stochastic differential
equations, which describe the evolution of system
observables. These different approaches to
quantum mechanics lead to different approaches
to quantum control. Important areas in which
quantum control problems arise include physical
chemistry, atomic and molecular physics, and
optics. Detailed overviews of the field of quantum
control can be found in the survey papers Dong
and Petersen (2010) and Brif et al. (2010) and the
monographs Wiseman and Milburn (2010) and
D’Alessandro (2007).

A fundamental problem in a number of approaches
to quantum control is the controllability
problem. Quantum controllability problems are
concerned with finite dimensional quantum systems
modeled using the Schrödinger picture of
quantum mechanics and involve the structure of
corresponding Lie groups or Lie algebras; e.g.,
see D’Alessandro (2007). These problems are
typically concerned with closed quantum systems,
which are quantum systems isolated from
their environment. For a controllable quantum
system, an open loop control strategy can be
constructed in order to manipulate the quantum
state of the system in a general way. Such open
loop control strategies are referred to as coherent
control strategies. Time optimal control is one
method of constructing these control strategies
which has been applied in areas including
physical chemistry and nuclear magnetic resonance
systems; e.g., see Khaneja et al. (2001).


An alternative approach to open loop quantum 
control is the Lyapunov approach; e.g., see Wang 
and Schirmer (2010). This approach extends the 
classical Lyapunov control approach in which a 
control Lyapunov function is used to construct a 
stabilizing state feedback control law. However,
in quantum control, state feedback control is not
allowed since classical measurements change the
quantum state of a system and the Heisenberg
uncertainty principle forbids the simultaneous exact
classical measurement of noncommuting quantum
variables. Also, in many quantum control
applications, the timescales are such that real time 
classical measurements are not technically feasi¬ 
ble. Thus, in order to obtain an open loop control 
strategy, the deterministic closed loop system is 
simulated as if the state feedback control were 
available and this enables an open loop control 
strategy to be constructed. As an alternative to 
coherent open loop control strategies, some clas¬ 
sical measurements may be introduced leading to 
incoherent control strategies; e.g., see Dong et al. 
(2009). 

In addition to open loop quantum control 
approaches, a number of approaches to quantum 
control involve the use of feedback; e.g., see 
Wiseman and Milburn (2010). This quantum 
feedback may either involve the use of classical 
measurements, in which case the controller is a 
classical (nonquantum) system or it may involve 
the case where no classical measurements are 
used since the controller itself is a quantum 
system. The case in which the controller itself 
is a quantum system is referred to as coherent 
quantum feedback control; e.g., see Lloyd 
(2000) and James et al. (2008). Quantum 
feedback control may be considered using the
Schrödinger picture, in which case the quantum
systems under consideration are modeled using
stochastic master equations. Alternatively using
the Heisenberg picture, the quantum systems
under consideration are modeled using quantum
stochastic differential equations. Applications in
which quantum feedback control can be applied
include quantum optics and atomic physics. In
addition, quantum control can potentially be
applied to problems in quantum information (e.g.,
see Nielsen and Chuang 2000) such as quantum
error correction (e.g., see Kerckhoff et al. 2010)
or the preparation of quantum states. Quantum
information and quantum computing in turn have
great potential in solving intractable computing
problems such as factoring large integers using
Shor’s algorithm; see Shor (1994).


Schrödinger Picture Models of
Quantum Systems

The state of a closed quantum system can be
represented by a unit vector $|\psi\rangle$ in a complex Hilbert
space $\mathcal{H}$. Such a quantum state is also referred
to as a wavefunction. In the Schrödinger picture,
the time evolution of the quantum state is defined
by the Schrödinger equation, which is in general a
partial differential equation. An important class
of quantum systems are finite-level systems in
which the Hilbert space is finite dimensional. In
this case, the Schrödinger equation is a linear
ordinary differential equation of the form

$$i\hbar \frac{\partial}{\partial t} |\psi(t)\rangle = H_0 |\psi(t)\rangle$$

where $H_0$ is the free Hamiltonian of the system,
which is a self-adjoint operator on $\mathcal{H}$; e.g.,
see Merzbacher (1970). Also, $\hbar$ is the reduced
Planck’s constant, which can be assumed to be
one with a suitable choice of units. In the case of
a controlled closed quantum system, this differential
equation is extended to a bilinear ordinary
differential equation of the form

$$i \frac{\partial}{\partial t} |\psi(t)\rangle = \left[ H_0 + \sum_{k=1}^{m} u_k(t) H_k \right] |\psi(t)\rangle \qquad (1)$$

where the functions $u_k(t)$ are the control
variables and the $H_k$ are corresponding control
Hamiltonians, which are also assumed to be
self-adjoint operators on the underlying Hilbert space.
These models are used in the open loop control
of closed quantum systems.
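
A minimal numerical sketch of the bilinear model may help fix ideas. It assumes units with $\hbar = 1$, a two-level system, piecewise-constant controls, and hypothetical names (`step_unitary`, `propagate`); none of these choices come from the text. Each time step applies the exact propagator $\exp(-i(H_0 + \sum_k u_k H_k)\,\Delta t)$, so the norm of the state is preserved.

```python
import numpy as np

def step_unitary(H, dt):
    """Exact propagator exp(-i H dt) for a Hermitian H (units with hbar = 1)."""
    evals, evecs = np.linalg.eigh(H)
    return evecs @ np.diag(np.exp(-1j * evals * dt)) @ evecs.conj().T

def propagate(psi0, H0, Hc, controls, dt):
    """Integrate i d|psi>/dt = (H0 + sum_k u_k(t) H_k)|psi>.

    controls has shape (n_steps, m); row n holds u_1..u_m, held constant
    over time step n (a piecewise-constant control assumption).
    """
    psi = psi0.astype(complex)
    for u in controls:
        H = H0 + sum(uk * Hk for uk, Hk in zip(u, Hc))
        psi = step_unitary(H, dt) @ psi
    return psi

# Illustrative two-level example: H0 = 0, one control Hamiltonian sigma_x,
# constant control u = 1 applied for a total time pi/2.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
H0 = np.zeros((2, 2), dtype=complex)
n_steps = 100
dt = (np.pi / 2) / n_steps
controls = np.ones((n_steps, 1))
psi0 = np.array([1.0, 0.0])
psi_T = propagate(psi0, H0, [sigma_x], controls, dt)
```

In this example the control transfers the full population from the first basis state to the second, since $\exp(-i\sigma_x \pi/2) = -i\sigma_x$.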

To represent open quantum systems, it is necessary
to extend the notion of quantum state to
density operators $\rho$, which are positive operators
with trace one on the underlying Hilbert space
$\mathcal{H}$. In this case, the Schrödinger picture model of
a quantum system is given in terms of a master
equation which describes the time evolution of
the density operator. In the case of an open quantum
system with Markovian dynamics defined on
a finite dimensional Hilbert space of dimension
$N$, the master equation is a matrix differential
equation of the form

$$\dot{\rho}(t) = -i \left[ \left( H_0 + \sum_{k=1}^{m} u_k(t) H_k \right), \rho(t) \right] + \frac{1}{2} \sum_{j,k=0}^{N^2-1} \alpha_{j,k} \left( \left[ F_j \rho(t), F_k^\dagger \right] + \left[ F_j, \rho(t) F_k^\dagger \right] \right); \qquad (2)$$

e.g., see Breuer and Petruccione (2002). Here
the notation $[X, \rho] = X\rho - \rho X$ refers to the
commutator and the notation $\dagger$ denotes
the adjoint of an operator. Also, $\{F_j\}_{j=0}^{N^2-1}$ is a
basis set for the space of bounded linear operators
on $\mathcal{H}$ with $F_0 = I$. Also, the matrix $A = (\alpha_{j,k})$
is assumed to be positive definite. These models,
which include the Lindblad master equation for
dissipative quantum systems as a special case
(e.g., see Wiseman and Milburn 2010), are used
in the open loop control of finite-level Markovian
open quantum systems.
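
As a concrete instance, the sketch below integrates the Lindblad special case $\dot{\rho} = -i[H, \rho] + \gamma (L\rho L^\dagger - \tfrac{1}{2}(L^\dagger L \rho + \rho L^\dagger L))$ for a two-level system with a single dissipation operator. The choice of $L$ as a lowering operator, the rate `gamma`, the fixed-step RK4 integrator, and all function names are assumptions made for illustration only.

```python
import numpy as np

def lindblad_rhs(rho, H, L, gamma):
    """Right-hand side of a single-channel Lindblad master equation."""
    LdL = L.conj().T @ L
    unitary = -1j * (H @ rho - rho @ H)
    dissipative = gamma * (L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL))
    return unitary + dissipative

def rk4(rho, H, L, gamma, dt, n_steps):
    """Fixed-step fourth-order Runge-Kutta integration of the density matrix."""
    for _ in range(n_steps):
        k1 = lindblad_rhs(rho, H, L, gamma)
        k2 = lindblad_rhs(rho + 0.5 * dt * k1, H, L, gamma)
        k3 = lindblad_rhs(rho + 0.5 * dt * k2, H, L, gamma)
        k4 = lindblad_rhs(rho + dt * k3, H, L, gamma)
        rho = rho + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return rho

# Basis index 0 = ground state, index 1 = excited state (a labeling assumption).
H = 0.5 * np.diag([-1.0, 1.0]).astype(complex)
L = np.array([[0, 1], [0, 0]], dtype=complex)  # lowering operator |0><1|
gamma, dt, n_steps = 1.0, 0.01, 100
rho0 = np.diag([0.0, 1.0]).astype(complex)     # start in the excited state
rho_T = rk4(rho0, H, L, gamma, dt, n_steps)
```

For this choice the excited-state population obeys $\dot{\rho}_{11} = -\gamma \rho_{11}$, so it decays as $e^{-\gamma t}$ while the trace stays equal to one.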

In quantum mechanics, classical measurements
are described in terms of self-adjoint
operators on the underlying Hilbert space
referred to as observables; e.g., see Breuer
and Petruccione (2002). An important case
of measurements are projective measurements
in which an observable $M$ is decomposed as
$M = \sum_{k=1}^{m} k P_k$ where the $P_k$ are orthogonal
projection operators on $\mathcal{H}$; e.g., see Nielsen and
Chuang (2000). Then, for a closed quantum
system with quantum state $|\psi\rangle$, the probability
of an outcome $k$ from the measurement is given
by $\langle\psi|P_k|\psi\rangle$, which denotes the inner product
between the vector $|\psi\rangle$ and the vector $P_k|\psi\rangle$.
This notation is referred to as Dirac notation and
is commonly used in quantum mechanics. If the










Control of Quantum Systems 


outcome of the quantum measurement is $k$, the
state of the quantum system collapses to the new
value of $P_k|\psi\rangle / \sqrt{\langle\psi|P_k|\psi\rangle}$. This change in the quantum
state as a result of a measurement is an important
characteristic of quantum mechanics. For an open
quantum system which is in a quantum state
defined by a density operator $\rho$, the probability of
a measurement outcome $k$ is given by $\mathrm{tr}(P_k \rho)$. In
this case, the quantum state collapses to $P_k \rho P_k / \mathrm{tr}(P_k \rho)$.
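
The measurement rules above can be sketched for a two-level system as follows. The helper names `measure_pure` and `measure_mixed`, the zero-probability threshold, and the example state are hypothetical; the sketch simply evaluates the outcome probabilities and the collapsed states for each projector.

```python
import numpy as np

def measure_pure(psi, projectors):
    """Return (probability, collapsed state) per outcome for a pure state.

    Probability: <psi|P_k|psi>; collapsed state: P_k|psi> / sqrt(probability).
    """
    results = []
    for P in projectors:
        p = np.vdot(psi, P @ psi).real
        post = P @ psi / np.sqrt(p) if p > 1e-15 else None
        results.append((p, post))
    return results

def measure_mixed(rho, projectors):
    """Return (probability, collapsed state) per outcome for a density operator.

    Probability: tr(P_k rho); collapsed state: P_k rho P_k / tr(P_k rho).
    """
    results = []
    for P in projectors:
        p = np.trace(P @ rho).real
        post = P @ rho @ P / p if p > 1e-15 else None
        results.append((p, post))
    return results

# Orthogonal projectors onto the two basis states and an equal superposition.
P0 = np.diag([1.0, 0.0]).astype(complex)
P1 = np.diag([0.0, 1.0]).astype(complex)
psi = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)
rho = np.outer(psi, psi.conj())
pure = measure_pure(psi, [P0, P1])
mixed = measure_mixed(rho, [P0, P1])
```

Both descriptions give probability one half for each outcome here, and the collapsed states are the corresponding basis states.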

In the case of an open quantum system with
continuous measurements of an observable $X$,
we can consider a stochastic master equation as
follows:

$$d\rho(t) = -i \left[ \left( H_0 + \sum_{k=1}^{m} u_k(t) H_k \right), \rho(t) \right] dt - \kappa \left[ X, \left[ X, \rho(t) \right] \right] dt + \sqrt{2\kappa} \left( X\rho(t) + \rho(t)X - 2\,\mathrm{tr}(X\rho(t))\,\rho(t) \right) dW \qquad (3)$$

where $\kappa$ is a constant parameter related to the
measurement strength and $dW$ is a standard
Wiener increment which is related to the
continuous measurement outcome $y(t)$ by

$$dW = dy - 2\sqrt{\kappa}\,\mathrm{tr}(X\rho(t))\,dt; \qquad (4)$$

e.g., see Wiseman and Milburn (2010). These
models are used in the measurement feedback
control of Markovian open quantum systems.
Also, Eqs. (3) and (4) can be regarded as a
quantum filter in which $\rho(t)$ is the conditional
density of the quantum system obtained by filtering
the measurement signal $y(t)$; e.g., see Bouten
et al. (2007) and Gough et al. (2012).
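
A rough simulation sketch of the stochastic master equation (3) and the measurement relation (4) is given below. It uses a plain Euler-Maruyama discretization, which is an assumption of this sketch and is not a method described in the text; it preserves the trace and Hermiticity of $\rho$ up to rounding but does not guarantee positivity. The Hamiltonian, observable, parameter values, and function names are hypothetical.

```python
import numpy as np

def commutator(A, B):
    return A @ B - B @ A

def sme_step(rho, H, X, kappa, dt, dW):
    """One Euler-Maruyama step of the stochastic master equation (3)."""
    expect_X = np.trace(X @ rho).real
    drift = -1j * commutator(H, rho) - kappa * commutator(X, commutator(X, rho))
    diffusion = np.sqrt(2.0 * kappa) * (X @ rho + rho @ X - 2.0 * expect_X * rho)
    return rho + drift * dt + diffusion * dW

def simulate(rho0, H, X, kappa, dt, n_steps, seed=0):
    """Simulate one trajectory; returns the final state and the record y(T)."""
    rng = np.random.default_rng(seed)
    rho = rho0.astype(complex)
    y = 0.0
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))
        # Measurement increment implied by (4): dy = dW + 2 sqrt(kappa) tr(X rho) dt
        y += dW + 2.0 * np.sqrt(kappa) * np.trace(X @ rho).real * dt
        rho = sme_step(rho, H, X, kappa, dt, dW)
    return rho, y

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
rho0 = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)  # equal superposition
rho_T, y_T = simulate(rho0, 0.5 * sigma_x, sigma_z, kappa=0.5, dt=1e-3, n_steps=1000)
```

In a filtering setting the increments $dy$ would come from the measurement device rather than from a simulated $dW$, and (4) would be used to recover $dW$ before updating $\rho$.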


Heisenberg Picture Models of 
Quantum Systems 


In the Heisenberg picture of quantum mechanics,
the observables of a system evolve with time and
the quantum state remains fixed. This picture may
also be extended slightly by considering the time
evolution of general operators on the underlying
Hilbert space rather than just observables, which
are required to be self-adjoint operators. An important
class of open quantum systems which are
considered in the Heisenberg picture arise when
the underlying Hilbert space is infinite dimensional
and the system represents a collection of
independent quantum harmonic oscillators interacting
with a number of external quantum fields.
Such linear quantum systems are described in the
Heisenberg picture by linear quantum stochastic
differential equations (QSDEs) of the form

$$dx(t) = Ax(t)\,dt + B\,dw(t);$$
$$dy(t) = Cx(t)\,dt + D\,dw(t) \qquad (5)$$

where $A, B, C, D$ are real or complex matrices and
$x(t)$ is a vector of possibly noncommuting operators
on the underlying Hilbert space $\mathcal{H}$; e.g., see
James et al. (2008). Also, the quantity $dw(t)$ is
decomposed as

$$dw(t) = \beta_w(t)\,dt + d\tilde{w}(t)$$

where $\beta_w(t)$ is an adapted process and $\tilde{w}(t)$ is a
quantum Wiener process with Ito table:

$$d\tilde{w}(t)\,d\tilde{w}(t)^\dagger = F_{\tilde{w}}\,dt.$$

Here, $F_{\tilde{w}} \ge 0$ is a real or complex matrix. The
quantity $w(t)$ represents the components of the
input quantum fields acting on the system. Also,
the quantity $y(t)$ represents the components of
interest of the corresponding output fields that
result from the interaction of the harmonic oscillators
with the incoming fields.

In order to represent physical quantum systems,
the components of vector $x(t)$ are required
to satisfy certain commutation relations of the
form

$$[x_j(t), x_k(t)] = 2i\Theta_{jk}, \quad j, k = 1, 2, \ldots, n, \quad \forall t$$

where the matrix $\Theta = (\Theta_{jk})$ is skew symmetric.
The requirement to represent a physical quantum
system places restrictions on the matrices
$A, B, C, D$, which are referred to as physical
realizability conditions; e.g., see James et al.
(2008) and Shaiju and Petersen (2012). QSDE
models of the form (5) arise frequently in the area
of quantum optics. They can also be generalized
to allow for nonlinear quantum systems such as
arise in the areas of nonlinear quantum optics
and superconducting quantum circuits; e.g., see
Bertet et al. (2012). These models are used in
the feedback control of quantum systems in both
the case of classical measurement feedback and
in the case of coherent feedback in which the
controller is also a quantum system and
is represented by such a QSDE model.

(S, L, H) Quantum System Models

An alternative method of modeling an open 
quantum system as opposed to the stochastic 
master equation (SME) approach or the 
quantum stochastic differential equation (QSDE) 
approach, which were considered above, is to 
simply model the quantum system in terms of 
the physical quantities which underlie the SME 
and QSDE models. For a general open quantum 
system, these quantities are the scattering 
matrix S which is a matrix of operators on 
the underlying Hilbert space, the coupling 
operator L which is a vector of operators on 
the underlying Hilbert space, and the system
Hamiltonian $H$ which is a self-adjoint operator on
the underlying Hilbert space; e.g., see Gough 
and James (2009). For a given ( S , L, H ) model, 
the corresponding SME model or QSDE model 
can be calculated using standard formulas; e.g., 
see Bouten et al. (2007) and James et al. (2008). 
Also, in certain circumstances, an ( S,L,H ) 
model can be calculated from an SME model 
or a QSDE model. For example, if the linear 
QSDE model (5) is physically realizable, then a 
corresponding (S, L, H) model can be found. In 
fact, this amounts to the definition of physical 
realizability. 

Open Loop Control of Quantum 
Systems 

A fundamental question in the open loop 
control of quantum systems is the question of 


controllability. For the case of a closed quantum 
system of the form (1), the question of 
controllability can be defined as follows (e.g., 
see Albertini and D’Alessandro 2003): 

Definition 1 (Pure State Controllability) The
quantum system (1) is said to be pure state
controllable if for every pair of initial and final
states $|\psi_0\rangle$ and $|\psi_f\rangle$, there exist control functions
$\{u_k(t)\}$ and a time $T > 0$ such that the corresponding
solution of (1) with initial condition
$|\psi_0\rangle$ satisfies $|\psi(T)\rangle = |\psi_f\rangle$.

Alternative definitions have also been considered
for the controllability of the quantum
system (1); e.g., see Albertini and D’Alessandro
(2003) and Grigoriu et al. (2013) in the case
of open quantum systems. The following theorem
provides a necessary and sufficient condition
for pure state controllability in terms of
the Lie algebra $\mathcal{L}_0$ generated by the matrices
$\{-iH_0, -iH_1, \ldots, -iH_m\}$, $u(N)$ the Lie algebra
corresponding to the unitary group of dimension
$N$, $su(N)$ the Lie algebra corresponding to the
special unitary group of dimension $N$, $sp(N/2)$ the
$N/2$ dimensional symplectic group, and $\tilde{\mathcal{L}}$ the Lie
algebra conjugate to $sp(N/2)$.

Theorem 1 (See D’Alessandro 2007) The
quantum system (1) is pure state controllable
if and only if the Lie algebra $\mathcal{L}_0$ satisfies one of
the following conditions:

(1) $\mathcal{L}_0 = su(N)$;

(2) $\mathcal{L}_0$ is conjugate to $sp(N/2)$;

(3) $\mathcal{L}_0 = u(N)$;

(4) $\mathcal{L}_0 = \mathrm{span}\{iI_{N \times N}\} \oplus \tilde{\mathcal{L}}$.

Similar conditions have been obtained when 
alternative definitions of controllability are used. 
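
The Lie algebra generated by the matrices $-iH_0, -iH_1, \ldots, -iH_m$ can be explored numerically for small systems. The sketch below is a simple illustration, not a method from the cited references: it closes the generators under commutation and counts the real dimension of the resulting span. For an $N$-level system, a dimension of $N^2 - 1$ or $N^2$ corresponds to the $su(N)$ and $u(N)$ cases of the theorem; the symplectic cases are not checked here. The function name and tolerance are hypothetical.

```python
import numpy as np

def lie_algebra_dimension(generators, tol=1e-10):
    """Real dimension of the Lie algebra generated by skew-Hermitian matrices."""
    def vec(A):
        # Represent a complex matrix as a real vector (real and imaginary parts).
        return np.concatenate([A.real.ravel(), A.imag.ravel()])

    basis = []  # linearly independent matrices found so far

    def try_add(A):
        candidate = np.array([vec(B) for B in basis] + [vec(A)])
        if np.linalg.matrix_rank(candidate, tol=tol) > len(basis):
            basis.append(A)
            return True
        return False

    for G in generators:
        try_add(G)
    grew = True
    while grew:  # repeat until no commutator enlarges the span
        grew = False
        for A in list(basis):
            for B in list(basis):
                if try_add(A @ B - B @ A):
                    grew = True
    return len(basis)

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

# Drift sigma_z with control sigma_x: the generated algebra is all of su(2).
dim_full = lie_algebra_dimension([-1j * sigma_z, -1j * sigma_x])
# Drift and control both along sigma_z: only a one-dimensional algebra.
dim_deficient = lie_algebra_dimension([-1j * sigma_z, -1j * sigma_z])
```

The repeated rank computation is inefficient and is only reasonable for small $N$; it is meant to make the dimension count in the theorem tangible rather than to serve as a practical test.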

Once it has been determined that a quantum
system is controllable, the next task in open
loop quantum control is to determine the control
functions $\{u_k(t)\}$ which drive a given initial state
to a given final state. An important approach to
this problem is the optimal control approach
in which a time optimal control problem is
solved using Pontryagin’s maximum principle
to construct the control functions $\{u_k(t)\}$ which
drive the given initial state to the given final state
in minimum time; e.g., see Khaneja et al. (2001).
This approach works well for low dimensional
quantum systems but is computationally
intractable for high dimensional quantum
systems.

An alternative approach for high dimensional
quantum systems is the Lyapunov control approach.
In this approach, a Lyapunov function is
selected which provides a measure of the distance
between the current quantum state and the desired
terminal quantum state. An example of such a
Lyapunov function is

$$V = \langle \psi(t) - \psi_f \,|\, \psi(t) - \psi_f \rangle \ge 0;$$

e.g., see Mirrahimi et al. (2005). A state feedback
control law is then chosen to ensure that the time
derivative of this Lyapunov function is negative.
This state feedback control law is then simulated
with the quantum system dynamics (1) to give the
required open loop control functions $\{u_k(t)\}$.
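
The idea of simulating a fictitious state feedback law to obtain open loop controls can be sketched as follows. To keep the example short, the sketch swaps in a different Lyapunov function from the one quoted above, namely $V = 1 - |\langle\psi_f|\psi\rangle|^2$ with $\psi_f$ an eigenstate of $H_0$; for a single control Hamiltonian $H_1$ one then finds $\dot{V} = -2u\,\mathrm{Im}(\overline{\langle\psi_f|\psi\rangle}\,\langle\psi_f|H_1|\psi\rangle)$, so the feedback $u = K\,\mathrm{Im}(\cdot)$ with $K > 0$ gives $\dot{V} \le 0$. The two-level system, the gain, and all names are assumptions of this sketch.

```python
import numpy as np

def expm_herm(H, dt):
    """Exact propagator exp(-i H dt) for Hermitian H (hbar = 1)."""
    evals, evecs = np.linalg.eigh(H)
    return evecs @ np.diag(np.exp(-1j * evals * dt)) @ evecs.conj().T

def lyapunov_control(psi0, psi_f, H0, H1, gain, dt, n_steps):
    """Simulate the fictitious closed loop and record the control history.

    Assumes psi_f is an eigenstate of H0. The recorded controls u_hist are
    the open loop control functions produced by the simulation.
    """
    psi = psi0.astype(complex)
    u_hist, V_hist = [], []
    for _ in range(n_steps):
        c = np.vdot(psi_f, psi)                      # overlap <psi_f|psi>
        V_hist.append(1.0 - abs(c) ** 2)
        u = gain * np.imag(np.conj(c) * np.vdot(psi_f, H1 @ psi))
        u_hist.append(u)
        psi = expm_herm(H0 + u * H1, dt) @ psi       # control held over the step
    V_hist.append(1.0 - abs(np.vdot(psi_f, psi)) ** 2)
    return np.array(u_hist), np.array(V_hist), psi

H0 = np.diag([0.0, 1.0]).astype(complex)
H1 = np.array([[0, 1], [1, 0]], dtype=complex)
psi_f = np.array([0.0, 1.0], dtype=complex)          # eigenstate of H0
psi0 = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)
u_hist, V_hist, psi_T = lyapunov_control(psi0, psi_f, H0, H1, gain=1.0,
                                         dt=0.01, n_steps=2000)
```

Once `u_hist` has been computed offline, it can be applied to the physical system without any measurement, which is the sense in which the Lyapunov approach yields an open loop strategy.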


Classical Measurement Based
Quantum Feedback Control

A Schrödinger Picture Approach to
Classical Measurement Based Quantum
Feedback Control

In the Schrödinger picture approach to classical
measurement based quantum feedback control
with weak continuous measurements, we begin
with the stochastic master equations (3) and (4), which
are considered as both a model for the system
being controlled and as a filter which will form
part of the final controller. These filter equations
are then combined with a control law of the form

$$u(t) = f(\rho(t))$$

where the function $f(\cdot)$ is designed to achieve
a particular objective such as stabilization of the
quantum system. Here $u(t)$ represents the vector
of control inputs $u_k(t)$. An example of such a
quantum control scheme is given in the paper
Mirrahimi and van Handel (2007) in which a
Lyapunov method is used to design the control
law $f(\cdot)$ so that a quantum system consisting
of an atomic ensemble interacting with an
electromagnetic field is stabilized about a specified
state $\rho_f = |\psi_m\rangle\langle\psi_m|$.

A Heisenberg Picture Approach to
Classical Measurement Based Quantum
Feedback Control

In this Heisenberg picture approach to classical
measurement based quantum feedback control,
we begin with a quantum system which is described
by linear quantum stochastic equations of
the form (5). In these equations, it is assumed
that the components of the output vector all
commute with each other and so can be regarded
as classical quantities. This can be achieved if
each of the components is obtained via a process
of homodyne detection from the corresponding
electromagnetic field; e.g., see Bachor and Ralph
(2004). Also, it is assumed that the input electromagnetic
field $w(t)$ can be decomposed as

$$dw(t) = \begin{bmatrix} \beta_u(t)\,dt + dw_1(t) \\ dw_2(t) \end{bmatrix} \qquad (6)$$

where $\beta_u(t)$ represents the classical control input
signal and $w_1(t)$, $w_2(t)$ are quantum Wiener processes.
The control signal displaces components
of the incoming electromagnetic field acting on
the system via the use of an electro-optic modulator;
e.g., see Bachor and Ralph (2004).

The classical measurement feedback based
controllers to be considered are classical systems
described by stochastic differential equations of
the form

$$dx_K(t) = A_K x_K(t)\,dt + B_K\,dy(t)$$
$$\beta_u(t)\,dt = C_K x_K(t)\,dt. \qquad (7)$$

For a given quantum system model (5), the matrices
in the controller (7) can be designed using
standard classical control theory techniques such
as LQG control (see Doherty and Jacobs 1999) or
$H^\infty$ control (see James et al. 2008).

Coherent Quantum Feedback Control

Coherent feedback control of a quantum system 
corresponds to the case in which the controller 
itself is a quantum system which is coupled in a 
feedback interconnection to the quantum system 
being controlled; e.g., see Lloyd (2000). This 
type of control by interconnection is closely re¬ 
lated to the behavioral interpretation of feedback 
control; e.g., see Polderman and Willems (1998). 

An important approach to coherent quantum
feedback control occurs in the case when the
quantum system to be controlled is a linear quantum
system described by the QSDEs (5). Also, it
is assumed that the input field is decomposed as
in (6). However, in this case, the quantity $\beta_u(t)$
represents a vector of noncommuting operators
on the Hilbert space underlying the controller
system. These operators are described by the following
linear QSDEs, which represent the quantum
controller:

$$dx_K(t) = A_K x_K(t)\,dt + B_K\,dy(t) + \bar{B}_K\,d\bar{w}_K(t)$$
$$dy_K(t) = C_K x_K(t)\,dt + \bar{D}_K\,d\bar{w}_K(t). \qquad (8)$$

Then, the input $\beta_u(t)$ is identified as

$$\beta_u(t) = C_K x_K(t).$$

Here the quantity

$$dw_K(t) = \begin{bmatrix} dy(t) \\ d\bar{w}_K(t) \end{bmatrix} \qquad (9)$$

represents the quantum fields acting on the controller
quantum system and where $\bar{w}_K(t)$ corresponds
to a quantum Wiener process with a
given Ito table. Also, $y(t)$ represents the output
quantum fields from the quantum system being
controlled. Note that in the case of coherent quantum
feedback control, there is no requirement that
the components of $y(t)$ commute with each other
and this in fact represents one of the main advantages
of coherent quantum feedback control as
opposed to classical measurement based quantum
feedback control.

An important requirement in coherent feedback
control is that the QSDEs (8) should satisfy
the conditions for physical realizability; e.g., see
James et al. (2008). Subject to these constraints,
the controller (8) can then be designed according
to an $H^\infty$ or LQG criterion; e.g., see James
et al. (2008) and Nurdin et al. (2009). In the case
of coherent quantum $H^\infty$ control, it is shown
in James et al. (2008) that for any controller
matrices $(A_K, B_K, C_K)$, the matrices $(\bar{B}_K, \bar{D}_K)$
can be chosen so that the controller QSDEs (8)
are physically realizable. Furthermore, the choice
of the matrices $(\bar{B}_K, \bar{D}_K)$ does not affect the $H^\infty$
performance criterion considered in James et al.
(2008). This means that the coherent controller
can be designed using the same approach as
designing a classical $H^\infty$ controller.

In the case of coherent LQG control such as
considered in Nurdin et al. (2009), the choice of
the matrices $(\bar{B}_K, \bar{D}_K)$ significantly affects the
closed loop LQG performance of the quantum
control system. This means that the approach
used in solving the coherent quantum $H^\infty$ problem
given in James et al. (2008) cannot be applied
to the coherent quantum LQG problem. To date
there exist only some nonconvex optimization
methods which have been applied to the coherent
quantum LQG problem (e.g., see Nurdin et al.
2009), and the general solution to the coherent
quantum LQG control problem remains an open
question.

Cross-References

► Bilinear Control of Schrödinger PDEs

► Robustness Issues in Quantum Control 
Abstract

The undesirable effects of roll motion of ships
(rocking about the longitudinal axis) became
noticeable in the mid-nineteenth century when
significant changes were introduced to the design of
ships as a result of sails being replaced by steam
engines and the arrangement being changed from
broad to narrow hulls. The combination of these
changes led to lower transverse stability (lower
restoring moment for a given angle of roll) with
the consequence of larger roll motion. The increase
in roll motion and its effect on cargo
and human performance led to the development
of several control devices that aimed at reducing and
controlling roll motion. The control devices most
commonly used today are fin stabilizers, rudder,
anti-roll tanks, and gyrostabilizers. The use of
different types of actuators for control of ship
roll motion has been amply demonstrated for over
100 years. Performance, however, can still fall
short of expectations because of difficulties
associated with control system design, which have
proven to be far from trivial due to fundamental
performance limitations and large variations of
the spectral characteristics of wave-induced roll
motion. This short article provides an overview
of the fundamentals of control design for ship
roll motion reduction. The overview is limited to
the most common control devices. Most of the
material is based on Perez (Ship motion control.
Advances in industrial control. Springer, London,
2005) and Perez and Blanke (Ann Rev Control
36(1):1367-5788, 2012).

Keywords 

Roll damping; Ship motion control 

Ship Roll Motion Control Techniques 

One of the most commonly used devices to attenuate
ship motion is the fin stabilizer. Fin stabilizers
are small controllable fins located on the bilge of
the hull, usually amidships. These devices attain
a performance in the range of 60-90% of roll
reduction (root mean square) (Sellars and Martin
1992). They require control systems that sense
the vessel’s roll motion and act by changing the
angle of the fins. These devices are expensive
and introduce underwater noise that can affect
sonar performance, they add to propulsion losses,
and they can be damaged. Despite this, they
are among the most commonly used ship roll
motion control devices. From a control perspective,
highly nonlinear effects (dynamic stall) may
appear when operating in severe sea states and
heavy rolling conditions (Gaillarde 2002).

During studies of ship damage stability conducted
in the late 1800s, it was observed that
under certain conditions the water inside the
vessel moved out of phase with respect to the 
wave profile, and thus, the weight of the water on 
the vessel counteracted the increase of pressure 


on the hull, hence reducing the net roll excitation 
moment. This led to the development of fluid 
anti-roll tank stabilizers. The most common type 
of anti-roll tank is the U-tank, which comprises 
two reservoirs, located one on port and one on 
starboard, connected at the bottom by a duct. 
Anti-roll tanks can be either passive or active. In 
passive tanks, the fluid flows freely from side to 
side. According to the density and viscosity of 
the fluid used, the tank is dimensioned so that 
the time required for most of the fluid to flow 
from side to side equals the natural roll period 
of the ship. Active tanks operate in a similar 
manner, but they incorporate a control system 
that modifies the natural period of the tank to 
match the actual ship roll period. This is normally 
achieved by controlling the flow of air from the 
top of one reservoir to the other. Anti-roll tanks 
attain a medium to high performance in the range 
of 20-70% of roll angle reduction (RMS) (Marzouk
and Nayfeh 2009). Anti-roll tanks increase
the ship displacement. They can also be used to 
correct list (steady-state roll angle), and they are 
the preferred stabilizer for icebreakers. 

Rudder-roll stabilization (RRS) is a technique 
based on the fact that the rudder is located not 
only aft, but also below the center of gravity of 
the vessel, and thus the rudder imparts not only 
yaw but also roll moment. The idea of using the 
rudder for simultaneous course keeping and roll 
reduction was conceived in the late 1960s by 
observations of anomalous behavior of autopilots 
that did not have appropriate wave filtering - a 
feature of the autopilot that prevents the rudder 
from reacting to every single wave; see, for example,
Fossen and Perez (2009) for a discussion
on wave filtering. Rudder-roll stabilization
has been demonstrated to attain medium to high 
performance in the range of 50-75 % of roll 
reduction (RMS) (Baitis et al. 1983; Blanke et al. 
1989; Kallstrom et al. 1988; Oda et al. 1992; 
van Amerongen et al. 1990). The upgrade of the 
rudder machinery is required to be able to attain 
slew rates in the range 10-20 deg/s for RRS to 
have sufficient control authority. 

A gyrostabilizer uses the gyroscopic effects of 
large rotating wheels to generate a roll reducing 
torque. The use of gyroscopic effects was 
proposed in the early 1900s as a method to eliminate
roll, rather than to reduce it. Although the
performance of these systems was remarkable,
up to 95 % roll reduction, their high cost, the increase
in weight, and the large stress produced on
the hull masked their benefits and prevented further
developments. However, a recent increase in
development of gyro stabilizers has been seen in 
the yacht industry (Perez and Steinmann 2009). 

Fins and rudder give rise to lift forces in 
proportion to the square of flow velocity past the 
fin. Hence, roll stabilization by fin or rudder is 
not possible at low or zero speed. Only U-tanks 
and gyro devices are able to provide stabilization 
in these conditions. For further details about the 
performance of different devices, see Sellars and 
Martin (1992), and for a comprehensive description
of the early development of devices, see
Chalmers (1931). 


Modeling of Ship Roll Motion for 
Control Design 

The study of roll motion dynamics for control 
system design is normally done in terms of either 
one- or four-degrees-of-freedom (DOF) models. 
The choice between models of different complexity
depends on the type of motion control system
considered. 

For a one-degree-of-freedom (1DOF) case, the
following model is used:

$$\dot{\phi} = p, \tag{1}$$

$$I_{xx}\,\dot{p} = K_h + K_w + K_c, \tag{2}$$

where $\phi$ is the roll angle, $p$ is the roll rate, and $I_{xx}$ is the
rigid-body moment of inertia about the x-axis of
a body-fixed coordinate system; $K_h$ denotes the hydrostatic
and hydrodynamic torques, $K_w$ the torque
generated by wave forces acting on the hull, and
$K_c$ the control torques. The hydrodynamic torque
can be approximated by the following parametric
model: $K_h \approx K_{\dot{p}}\,\dot{p} + K_p\,p + K_{p|p|}\,p|p| + K(\phi)$.
The first term represents a hydrodynamic torque
in roll due to pressure change that is proportional
to the roll acceleration, and the coefficient $K_{\dot{p}}$
is called roll added mass (inertia). The second
term is a damping term, which captures forces
due to wave making and linear skin friction, and
the coefficient $K_p$ is a linear damping coefficient.
The third term is a nonlinear damping term,
which captures forces due to viscous effects. The
last term is the restoring torque due to gravity and
buoyancy.
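
As an illustration of how the 1DOF model (1) and (2) might be used in simulation, the following Python sketch integrates the roll equation with a fourth-order Runge-Kutta scheme. It is a minimal sketch under stated assumptions: the hydrodynamic terms are moved to the left-hand side with positive coefficients (`added_inertia`, `d_lin`, `d_quad`), the restoring torque is linearized as `c_restoring * phi`, and all numerical values and function names are hypothetical, chosen only to give a lightly damped response.

```python
def simulate_roll(phi0, t_end=60.0, dt=0.01,
                  inertia=1.0, added_inertia=0.2,
                  d_lin=0.1, d_quad=0.5, c_restoring=1.5,
                  k_wave=lambda t: 0.0,
                  k_ctrl=lambda t, phi, p: 0.0):
    """Integrate phi' = p, (I + A) p' = -d1 p - d2 p|p| - c phi + K_w + K_c.

    All coefficients are illustrative assumptions, not ship data.
    Returns a list of (t, phi, p) samples.
    """
    m = inertia + added_inertia  # rigid-body plus added inertia

    def f(t, phi, p):
        torque = (-d_lin * p - d_quad * p * abs(p) - c_restoring * phi
                  + k_wave(t) + k_ctrl(t, phi, p))
        return p, torque / m

    phi, p, t = phi0, 0.0, 0.0
    history = [(t, phi, p)]
    for _ in range(int(round(t_end / dt))):
        # Classical fourth-order Runge-Kutta step.
        k1 = f(t, phi, p)
        k2 = f(t + dt / 2, phi + dt / 2 * k1[0], p + dt / 2 * k1[1])
        k3 = f(t + dt / 2, phi + dt / 2 * k2[0], p + dt / 2 * k2[1])
        k4 = f(t + dt, phi + dt * k3[0], p + dt * k3[1])
        phi += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        p += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += dt
        history.append((t, phi, p))
    return history


# Free roll decay from 0.2 rad, and the same decay with an
# assumed rate-feedback control torque K_c = -0.5 p.
free = simulate_roll(0.2)
damped = simulate_roll(0.2, k_ctrl=lambda t, phi, p: -0.5 * p)
```

Comparing the two runs shows the role of the control torque as added roll damping; a wave torque can be supplied through `k_wave` in the same way.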

For a 4DOF model (surge, sway, roll, and
yaw), the motion variables considered are $\eta = [\phi\ \psi]^T$,
$\nu = [u\ v\ p\ r]^T$, and $\tau = [X\ Y\ K\ N]^T$, where
$\psi$ is the yaw angle, the body-fixed velocities are
$u$ (surge) and $v$ (sway), and $r$ is the yaw rate. The
forces and torques are $X$ (surge), $Y$ (sway), $K$ (roll),
and $N$ (yaw). With these variables, the following
mathematical model is usually considered:

$$\dot{\eta} = J(\eta)\,\nu, \tag{3}$$

$$M_{RB}\,\dot{\nu} + C_{RB}(\nu)\,\nu = \tau_h + \tau_c + \tau_d, \tag{4}$$

where $J(\eta)$ is a kinematic transformation, $M_{RB}$
is the rigid-body inertia matrix that corresponds
to expressing the inertia tensor in body-fixed coordinates,
$C_{RB}(\nu)$ is the rigid-body Coriolis and
centripetal matrix, and $\tau_h$, $\tau_c$, and $\tau_d$ represent
the hydrodynamic, control, and disturbance vectors
of force components and torques, respectively.

The hydrostatic and hydrodynamic forces are
$\tau_h \approx -M_A\,\dot{\nu} - C_A(\nu)\,\nu - D(\nu)\,\nu - K(\phi)$. The
first two terms have origin in the motion of a
vessel in an irrotational flow in a nonviscous
fluid. The third term corresponds to damping
forces due to potential (wave making), skin friction,
vortex shedding, and circulation (lift and
drag). The hydrodynamic effects involved are
quite complex, and different approaches based
on superposition of either odd-term Taylor expansions
or square modulus ($x|x|$) series expansions
are usually considered (Abkowitz 1964;
Fedyaevsky and Sobolev 1964). The $K(\phi)$
term represents the restoring forces in roll due to
buoyancy and gravity. The 4DOF model captures
parameter dependency on ship speed as well as
the couplings between steering and roll, and it is
useful for controller design. For additional details
about mathematical models of marine vehicles,
see Fossen (2011).
Wave-Disturbance Models

The action of the waves creates changes in pressure
on the hull of the ship, which translate into
forces and moments. It is common to model the
ship motion response due to waves within a linear
framework and to obtain two frequency-response
functions (FRFs): the wave-to-excitation $F_i(j\omega, U, \chi)$
and wave-to-motion $H_i(j\omega, U, \chi)$ response functions,
where $i$ indicates the degree of freedom.
These FRFs depend on the wave frequency, the
ship speed $U$, and the angle $\chi$ at which the waves
encounter the ship, which is called the encounter
angle.

The wave elevation in deep water is approximately
a stochastic process that is zero mean,
stationary for short periods of time, and Gaussian
(Haverre and Moan 1985). Under these assumptions,
the wave elevation $\zeta$ is fully described by
a power spectral density $\Phi_{\zeta\zeta}(\omega)$. With a linear
response assumption, the power spectral densities
of wave to excitation force and wave to motion
can be expressed as

$$\Phi_{FF,i}(j\omega) = |F_i(j\omega, U, \chi)|^2\,\Phi_{\zeta\zeta}(j\omega),$$

$$\Phi_{\eta\eta,i}(j\omega) = |H_i(j\omega, U, \chi)|^2\,\Phi_{\zeta\zeta}(j\omega).$$

These spectra are models of the wave-induced
forces and motions, respectively, from which
it is common to generate either time series of
wave excitation forces in terms of the encounter
frequency to be used as input disturbances in
simulation models or time series of wave-induced
motion to be used as output disturbance; see, for
example, Perez (2005) and references therein.
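
The relation between the wave spectrum, the FRF, and the motion spectrum can be sketched numerically as follows. This is only an illustration under assumptions that are not taken from the article: a Pierson-Moskowitz-type wave spectrum parameterized by significant wave height, a hypothetical second-order wave-to-roll FRF `roll_frf` (standing in for an FRF at one fixed speed and encounter angle), zero forward speed for the time-series realization, and the convention that an encounter angle of pi radians corresponds to head seas in `encounter_frequency`.

```python
import math
import random

G = 9.81  # gravitational acceleration [m/s^2]


def wave_spectrum(omega, hs=2.5):
    """Assumed Pierson-Moskowitz-type wave elevation spectrum [m^2 s/rad]."""
    a = 8.1e-3 * G ** 2
    b = 3.11 / hs ** 2
    return a / omega ** 5 * math.exp(-b / omega ** 4)


def encounter_frequency(omega, speed, chi):
    """Deep-water encounter frequency; chi = pi is taken as head seas."""
    return omega - omega ** 2 * speed * math.cos(chi) / G


def roll_frf(omega, omega_n=0.6, zeta=0.08, gain=0.05):
    """Hypothetical second-order wave-to-roll FRF [rad/m]."""
    s = 1j * omega
    return gain * omega_n ** 2 / (s ** 2 + 2 * zeta * omega_n * s + omega_n ** 2)


def roll_spectrum(omega):
    """Motion spectrum: |H(j omega)|^2 times the wave spectrum."""
    return abs(roll_frf(omega)) ** 2 * wave_spectrum(omega)


def roll_realization(n=200, w_lo=0.2, w_hi=2.0, seed=0):
    """Sum-of-sinusoids components (amplitude, frequency, phase)."""
    rng = random.Random(seed)
    dw = (w_hi - w_lo) / n
    comps = []
    for k in range(n):
        w = w_lo + (k + 0.5) * dw
        amp = math.sqrt(2.0 * roll_spectrum(w) * dw)
        comps.append((amp, w, rng.uniform(0.0, 2.0 * math.pi)))
    return comps


def roll_at(t, comps):
    """Wave-induced roll angle [rad] at time t for one realization."""
    return sum(a * math.cos(w * t + ph) for a, w, ph in comps)


components = roll_realization()
roll_variance = sum(a * a / 2.0 for a, _, _ in components)
```

The resulting time series `roll_at(t, components)` could serve as an output disturbance in a simulation; for nonzero speed, the component frequencies would first be mapped through `encounter_frequency`.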

Roll Motion Control and Performance 
Limitations 

The analysis of performance of ship roll motion
control by means of force actuators is usually
conducted within a linear framework by
linearizing the models. For a SISO loop where
the wave-induced roll motion is considered an
output disturbance, the Bode integral constraint
applies. This imposes restrictions on one’s freedom
to shape the closed-loop transfer function
to attenuate the motion due to the wave-induced
forces in different frequency ranges. These results
have important consequences for the design
of a roll motion control system since the
frequency of the waves seen from the vessel
changes significantly with the sea state, the speed
of the vessel, and the wave encounter angle.
The changing characteristics of open-loop roll
motion in conjunction with the Bode integral
constraint make the control design challenging
since roll amplification may occur if the control
design is not done properly. For some roll motion
control problems, like using the rudder for simultaneous
roll attenuation and heading control, the
system presents non-minimum phase dynamics.
In this case, the trade-off of reduced sensitivity
vs. amplification of roll motion is dominating
at frequencies close to the non-minimum phase
zero, a constraint with origin in the Poisson
integral (Hearns and Blanke 1998); see also Perez
(2005).
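
The Bode integral constraint can be checked numerically on a simple example. The sketch below assumes a hypothetical loop transfer function L(s) = k/(s(s+1)), which is not a ship model; it has relative degree two and no open-loop right-half-plane poles, so the integral of ln|S(jω)| over all frequencies should be zero, where S = 1/(1+L) is the sensitivity function. The code splits the integral into the attenuation area (negative) and the amplification area (positive).

```python
import math


def log_sensitivity(omega, k=2.0):
    """ln|S(j omega)| for the assumed loop L(s) = k / (s (s + 1))."""
    s = 1j * omega
    loop = k / (s * (s + 1.0))
    return -math.log(abs(1.0 + loop))


def bode_areas(k=2.0, lo=-6.0, hi=5.0, n=200_000):
    """Trapezoidal integration of ln|S| on a log-spaced frequency grid.

    Returns (attenuation_area, amplification_area); their sum
    approximates the Bode sensitivity integral.
    """
    neg = pos = 0.0
    w_prev = 10.0 ** lo
    f_prev = log_sensitivity(w_prev, k)
    for i in range(1, n + 1):
        w = 10.0 ** (lo + (hi - lo) * i / n)
        f = log_sensitivity(w, k)
        area = 0.5 * (f + f_prev) * (w - w_prev)
        if area < 0.0:
            neg += area
        else:
            pos += area
        w_prev, f_prev = w, f
    return neg, pos


attenuation, amplification = bode_areas()
```

Up to truncation and quadrature error, the two areas cancel: disturbance attenuation in one frequency range is paid for by amplification in another, which is why a shift of the wave-induced roll spectrum can turn roll reduction into roll amplification.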

It should be noted that non-minimum phase
dynamics also occur with fin stabilizers, when
the stabilizers are located aft of the center of
gravity. With the fins at this location, they behave
like a rudder and introduce non-minimum phase
dynamics and heading interference at low wave-excitation
frequencies. These aspects of fin location
were discussed by Lloyd (1989).

The above discussion highlights general design
constraints that apply to roll motion control
systems in terms of the dynamics of the vessel
and actuator. In addition to these constraints, one
also needs to account for limitations in actuator
slew rate and angle.

Control Techniques Used in Different
Roll Control Systems 

Fin Stabilizers 

In regard to fin stabilizers, the control design is
commonly addressed using the 1DOF model (1)
and (2). The main issues associated with control
design are the parametric uncertainty in the model
and the Bode integral constraint. This integral
constraint can lead to roll amplification due to
changes in the spectrum of the wave-induced
roll moment with sea state and sailing conditions
(speed and encounter angle). Fin machinery is
designed so that the rate of the fin motion is
fast enough, and actuator rate saturation is not an
issue in moderate sea states. The fins could be
used to correct heeling angles (steady-state roll)
when the ship makes speed, but this is avoided
due to added resistance. If they are used this way, integral
action needs to include anti-windup. In terms
of control strategies, PID, $H_\infty$, and LQR techniques
have been successfully applied in practice.
Highly nonlinear effects (dynamic stall) may
appear when operating in severe sea states and
heavy rolling conditions, and proposals for applications
of model predictive control have been put
forward to constrain the effective angle of attack
of the fins. In addition, if the fins are located
too far aft along the ship, the dynamic response
from fin angle to roll can exhibit non-minimum
phase dynamics, which can limit the performance
at low encounter frequencies. A thorough review
of the control literature can be found in Perez and
Blanke (2012).

Rudder-Roll Stabilization 

The problem of rudder-roll stabilization requires
the 4DOF model (3) and (4), which captures the
interaction between roll, sway, and yaw together
with the changes in the hydrodynamic forces
due to the forward speed. The response from
rudder to roll is non-minimum phase (NMP),
and the system is characterized by further constraints
due to the single-input-two-output nature
of the control problem: attenuate roll without
too much interference with the heading. Studies
of fundamental limitations due to NMP dynamics
have been approached using standard frequency-domain
tools by Hearns and Blanke (1998) and
Perez (2005). A characterization of the trade-off
between roll reduction and increase of interference
was part of the controller design in Stoustrup
et al. (1994). Perez (2005) determined the limits
obtainable using optimal control with full disturbance
information. The latter also incorporated
constraints due to the limiting authority of the
control action in rate and magnitude of rudder
machinery and stall conditions of the rudder.
The control design for rudder-roll stabilization
has been addressed in practice using PID, LQG,
$H_\infty$, and standard frequency-domain linear
control designs. Limited control authority
was handled by van Amerongen
et al. (1990) using automatic gain control.
In the literature, there have been proposals put
forward for the use of model predictive control,
QFT, sliding-mode nonlinear control, and autoregressive
stochastic control. Combined use of
fin and rudder has also been investigated. Grimble
et al. (1993) and later Roberts et al. (1997)
used $H_\infty$ control techniques. A thorough comparison
of controller performances for warships
was published in Crossland (2003). A thorough
review of the control literature can be found in
Perez and Blanke (2012).

Gyrostabilizers 

Using a single gimbal suspension gyrostabilizer
for roll damping control, the coupled vessel-roll-gyro
model can be written as follows:

$$\dot{\phi} = p, \tag{5}$$

$$K_{\dot{p}}\,\dot{p} + K_p\,p + K_\phi\,\phi = K_w - K_g\,\dot{\alpha}\cos\alpha, \tag{6}$$

$$I_p\,\ddot{\alpha} + B_p\,\dot{\alpha} + C_p \sin\alpha = K_g\,p\cos\alpha + T_p, \tag{7}$$

where (6) represents the 1DOF roll dynamics
and (7) represents the dynamics of the gyrostabilizer
about the axis of the gimbal suspension,
where $\alpha$ is the gimbal angle, equivalent to the
precession angle for a single gimbal suspension,
$I_p$ is the gimbal and wheel inertia about the gimbal
axis, $B_p$ is the damping, and $C_p$ is a restoring
term of the gyro about the precession axis due to
the location of the gyro center of mass relative to the
precession axis (Arnold and Maunder 1961). $T_p$
is the control torque applied to the gimbal. The
use of twin counter-spinning wheels prevents gyroscopic
coupling with other degrees of freedom.
Hence, the control design for gyrostabilizers can
be based on a linear single-degree-of-freedom
model for roll.

The wave-induced roll moment $K_w$ excites the
roll motion. As the roll motion develops, the roll
rate $p$ induces a torque along the precession axis
of the gyrostabilizer. As the precession angle $\alpha$
develops, a reaction torque acts on the
vessel that opposes the wave-induced moment.
The latter is the roll stabilizing torque,
$X_g = -K_g\,\dot{\alpha}\cos\alpha \approx -K_g\,\dot{\alpha}$. This roll torque can only
be controlled indirectly through the precession
dynamics in (7) via $T_p$. In the model above, the
spin angular velocity $\omega_{spin}$ is controlled to be
constant; hence the wheels’ angular momentum
$K_g = I_{spin}\,\omega_{spin}$ is constant.
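
The mechanism described above can be illustrated by simulating the coupled model (5) to (7) with and without the gyro. The following Python sketch is a rough illustration only: all coefficients are hypothetical normalized values, the roll-side coefficients are written as `J`, `b`, and `c`, the wave moment is a single sinusoid at the roll natural frequency, and the precession torque is taken as an assumed damping law `T_p = -tp_gain * alpha_dot` (zero by default, so only the passive gimbal damping acts).

```python
import math


def roll_rms(gyro_on, tp_gain=0.0, t_end=300.0, dt=0.02):
    """RMS roll angle over the final 100 s of a simulation of (5)-(7)."""
    J, b, c = 1.0, 0.05, 1.0       # assumed roll inertia, damping, restoring
    Ip, Bp, Cp = 0.1, 0.5, 0.1     # assumed gimbal inertia, damping, restoring
    Kg = 1.0 if gyro_on else 0.0   # assumed wheel angular momentum

    def deriv(t, x):
        phi, p, alpha, alpha_dot = x
        Kw = 0.01 * math.sin(t)     # wave moment at the roll natural frequency
        Tp = -tp_gain * alpha_dot   # assumed precession damping law
        p_dot = (Kw - Kg * alpha_dot * math.cos(alpha) - b * p - c * phi) / J
        alpha_ddot = (Kg * p * math.cos(alpha) + Tp
                      - Bp * alpha_dot - Cp * math.sin(alpha)) / Ip
        return (p, p_dot, alpha_dot, alpha_ddot)

    def shifted(x, k, h):
        return tuple(xi + h * ki for xi, ki in zip(x, k))

    x, t = (0.0, 0.0, 0.0, 0.0), 0.0
    acc, count = 0.0, 0
    for _ in range(int(round(t_end / dt))):
        # Classical fourth-order Runge-Kutta step.
        k1 = deriv(t, x)
        k2 = deriv(t + dt / 2, shifted(x, k1, dt / 2))
        k3 = deriv(t + dt / 2, shifted(x, k2, dt / 2))
        k4 = deriv(t + dt, shifted(x, k3, dt))
        x = tuple(xi + dt / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
                  for xi, s1, s2, s3, s4 in zip(x, k1, k2, k3, k4))
        t += dt
        if t > t_end - 100.0:
            acc += x[0] ** 2
            count += 1
    return math.sqrt(acc / count)


rms_without_gyro = roll_rms(gyro_on=False)
rms_with_gyro = roll_rms(gyro_on=True)
```

With these assumed values the gyro-equipped case shows a much smaller RMS roll angle, consistent with the gyro acting as an additional roll damper through the precession dynamics.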

The precession control torque $T_p$ is used
to control the gyro. As observed by Sperry
(Chalmers 1931), the intrinsic behavior of the
gyrostabilizer is to use roll rate to generate a roll
torque. Hence, one could design a precession
torque controller such that, from the point of
view of the vessel, the gyro behaves as a damper.
Depending on how precession torque is delivered,
it may be necessary to constrain precession
angle and rate. This problem has been recently
considered in Donaire and Perez (2013) using
passivity-based control.

U-tanks 

U-tanks can be passive or active. Roll reduction
is achieved by attempting to transfer energy from
the roll motion to motion of liquid within the tank
and using the weight of the liquid to counteract
the wave excitation moment. A key aspect of the
design is the dimension and geometry of the tank
to ensure that there is enough weight due to the
displaced liquid in the tank and that the oscillation
of the fluid in the tank matches the vessel
natural frequency in roll; see Holden and Fossen
(2012) and references therein. The design of the
U-tank can ensure a single-frequency matching,
at which the performance is optimized, and for
this frequency the roll natural frequency is used.
As the frequency of roll motion departs from this,
a degradation of roll reduction occurs. Active U-tanks
use valves to control the flow of air from
the top of the reservoirs to extend the frequency
matching in sailing conditions in which the roll
dominant frequency is lower than the roll natural
frequency: the flow of air is used to delay
the motion of the liquid from one reservoir to
the other. This control is achieved by detecting
the dominant roll frequency and using this information
to control the air flow from one reservoir
to the other. If the roll dominant frequency is
higher than the roll natural frequency, the U-tank
is used in passive mode, and the standard roll
reduction degradation occurs.


Summary and Future Directions 

This article provides a brief summary of control 
aspects for the most common ship roll motion 
control devices. These aspects include the type of 
mathematical models used to design and analyze 
the control problem, the inherent fundamental 
limitations and the constraints that some of the 
designs are subjected to, and the performance 
that can be expected from the different devices. 
As an outlook, one of the key issues in roll
motion control is the model uncertainty and the
adaptation to the changes in the environmental
conditions. As the vessel changes speed and
heading, or as the seas build up or abate, the dominant
frequency range of the wave-induced forces
changes significantly. Due to the fundamental
limitations discussed, a nonadaptive controller
may produce roll amplification rather than roll
reduction. This topic has received some attention
in the literature via multi-mode control switching,
but further work in this area could be beneficial.
In recent years, new devices have appeared
for stabilization at zero speed, like flapping fins
and rotating cylinders. Also the industry’s interest
in roll gyrostabilizers has been re-ignited. The
investigation of control designs for these devices 
has not yet received much attention within the 
control community. Hence, it is expected that this 
will create a potential for research activity in the 
future. 


Cross-References 

► Fundamental Limitation of Feedback Control 

► H-Infinity Control 
Abstract

Control structure selection deals with selecting 
what to control (outputs), what to measure and 
what to manipulate (inputs), and also how to split 
the controller in a hierarchical and decentralized 
manner. The most important issue is probably
the selection of the controlled variables (outputs),
CV = Hy, where y are the available measurements
and H is a degree of freedom that is
seldom treated in a systematic manner by control
engineers. This entry discusses how to find H
for both the upper (slower) economic layer
and the lower (faster) regulatory layer in the 
control hierarchy. Each layer may be split in a 
decentralized fashion. Systematic approaches for 
input/output (IO) selection are presented. 

Keywords 

Control configuration; Control hierarchy; 
Control structure design; Decentralized control; 
Economic control; Input-output controllability;
Input/output selection; Plantwide control; 
Regulatory control; Supervisory control 

Introduction 

Consider the generalized controller design problem
in Fig. 1 where P denotes the generalized
plant model. Here, the objective is to design the
controller K, which, based on the sensed outputs
v, computes the inputs (MVs) u such that the
variables z are kept small, in spite of variations in
the variables w, which include disturbances (d),
varying setpoints/references ($CV_s$), and measurement
noise (n),

$$w = [d;\ CV_s;\ n].$$

The variables z, which should be kept small,
typically include the control error for the selected
controlled variables (CV) plus the plant inputs
(u),

$$z = [CV - CV_s;\ u].$$

The variables v, which are the inputs to the
controller, include all known variables, including
measured outputs ($y_m$), measured disturbances
($d_m$), and setpoints,

$$v = [y_m;\ d_m;\ CV_s].$$

The cost function for designing the optimal controller
K is usually the weighted control error,
$J' = \|W' z\|$. The reason for using a prime on J
(J') is to distinguish it from the economic cost
J, which we later use for selecting the controlled
variables (CV).

Control Structure Selection, Fig. 1 General formulation
for designing the controller K. The plant P is
controlled by manipulating u, and is disturbed by the
signals w. The controller uses the measurements v, and the
control objective is to keep the outputs (weighted control
error) z as small as possible

Notice that it is assumed in Fig. 1 that we know
what to measure (v), manipulate (u), and, most
importantly, which variables in z we would like to
keep at setpoints (CV), that is, we have assumed a
given control structure. The terms “control structure
selection” (CSS) and its synonym “control
structure design” (CSD) are associated with the
overall control philosophy for the system with
emphasis on the structural decisions which are
a prerequisite for the controller design problem
in Fig. 1:

1. Selection of controlled variables (CVs,
“outputs,” included in z in Fig. 1)

2. Selection of manipulated variables (MVs,
“inputs,” u in Fig. 1)

3. Selection of measurements y (included in v in
Fig. 1)

4. Selection of control configuration (structure
of overall controller K that interconnects the
controlled, manipulated, and measured variables;
structure of K in Fig. 1)

5. Selection of type of controller K (PID, MPC,
LQG, H-infinity, etc.) and objective function
(norm) used to design and analyze it.

Decisions 2 and 3 (selection of u and y) are
sometimes referred to as the input/output (IO)
selection problem. In practice, the controller (K)
is usually divided into several layers, operating on
different time scales (see Fig. 2), which implies
that, in addition to selecting the (primary)
controlled variables ($CV_1 = CV$), we must also
select the (secondary) variables that interconnect
the layers ($CV_2$).

Control structure selection includes all the 
structural decisions that the engineer needs to 
make when designing a control system, but 
it does not involve the actual design of each 
individual controller block. Thus, it involves the 
decisions necessary to make a block diagram 
(Fig. 1; used by control engineers) or process 
& instrumentation diagram (used by process 
engineers) for the entire plant, and provides 
the starting point for a detailed controller 
design. 

Control Structure Selection, Fig. 2 Typical control hierarchy,
as illustrated for a process plant


The term “plantwide control,” which is a synonym
for “control structure selection,” is used
in the field of process control. Control structure
selection is particularly important for process
control because of the complexity of large processing
plants, but it applies to all control applications,
including vehicle control, aircraft control,
robotics, power systems, biological systems, social
systems, and so on.

It may be argued that control structure selection
is more important than the controller design
itself. Yet, control structure selection is hardly
covered in most control courses. This is probably
related to the complexity of the problem, which
requires knowledge from several engineering
fields. In the mathematical sense, the control
structure selection problem is a formidable combinatorial
problem which involves a large number
of discrete decision variables.


Overall Objectives for Control and 
Structure of the Control Layer 

The starting point for control system design
is to define clearly the operational objectives.
There are usually two main objectives for
control:

1. Longer-term economic operation (minimize
economic cost J subject to satisfying operational
constraints)

2. Stability and short-term regulatory control

The first objective is related to “making the system
operate as intended,” where economics are
an important issue. Traditionally, control engineers
have not been much involved in this step.
The second objective is related to “making sure
the system stays operational,” where stability
and robustness are important issues, and this
has traditionally been the main domain of control
engineers. In terms of designing the control
system, the second objective (stabilization)
is usually considered first. An example is bicycle
riding; we first need to learn how to stabilize
the bicycle (regulation), before trying to
use it for something useful (optimal operation),
like riding to work and selecting the shortest
path.

We use the term “economic cost,” because
usually the cost function J can be given a monetary
value, but more generally, the cost J could
be any scalar cost. For example, the cost J could
be the “environmental impact” and the economics 
could then be given as constraints. 

In theory, the optimal strategy is to combine 
the control tasks of optimal economic operation 
and stabilization/regulation in a single centralized 
controller K, which at each time step collects all 
the information and computes the optimal input 
changes. In practice, simpler controllers are used. 
The main reason for this is that in most cases one 
can obtain acceptable control performance with 
simple structures, where each controller block involves
only a few variables. Such control systems
can be designed and tuned with much less effort, 
especially when it comes to the modeling and 
tuning effort. 

So how are large-scale systems controlled in
practice? Usually, the controller K is decomposed
into several subcontrollers, using two main principles:

- Decentralized (local) control. This “horizontal
decomposition” of the control layer is usually
based on separation in space, for example,
by using local control of individual units.

- Hierarchical (cascade) control. This “vertical
decomposition” is usually based on time scale
separation, as illustrated for a process plant in
Fig. 2. The upper three layers in Fig. 2 deal
explicitly with economic optimization and are
not considered here. We are concerned with
the two lower control layers, where the main
objective is to track the setpoints specified by
the layer above.

In accordance with the two main objectives for
control, the control layer is in most cases divided
hierarchically in two layers (Fig. 2):

1. A “slow” supervisory (economic) layer

2. A “fast” regulatory (stabilization) layer

Another reason for the separation in two control
layers is that the tasks of economic operation
and regulation are fundamentally different.
Combining the two objectives in a single cost
function, which is required for designing a single
centralized controller K, is like trying to compare
apples and oranges. For example, how much is
an increased stability margin worth in monetary
units [$]? Only if there is a reasonable benefit in
combining the two layers, for example, because
there is limited time scale separation between
the tasks of regulation and optimal economics,
should one consider combining them into a single
controller.


Notation and Matrices H1 and H2 for
Controlled Variable Selection 

The most important notation is summarized in 
Table 1 and Fig. 3. To distinguish between the 
two control layers, we use “1” for the upper 
supervisory (economic) layer and “2” for the 
regulatory layer, which is “secondary” in terms 
of its place in the control hierarchy. 

There is often limited possibility to select the 
input set (u) as it is usually constrained by the 



Control Structure Selection, Fig. 3 Block diagram of
a typical control hierarchy, emphasizing the selection of
controlled variables for supervisory (economic) control
($CV_1 = H_1 y$) and regulatory control ($CV_2 = H_2 y$)


Control Structure Selection, Table 1 Important notation

$u = [u_1; u_2]$ = set of all available physical plant inputs
$u_1$ = inputs used directly by supervisory control layer
$u_2$ = inputs used by regulatory layer
$y_m$ = set of all measured outputs
$y = [y_m; u]$ = combined set of measurements and inputs
$y_2$ = controlled outputs in regulatory layer (subset or combination of $y$); $\dim(y_2) = \dim(u_2)$
$CV_1 = H_1 y$ = controlled variables in supervisory layer; $\dim(CV_1) = \dim(u)$
$CV_2 = [y_2; u_1] = H_2 y$ = controlled variables in regulatory layer; $\dim(CV_2) = \dim(u)$
$MV_1 = CV_{2s} = [y_{2s}; u_1]$ = manipulated variables in supervisory layer; $\dim(MV_1) = \dim(u)$
$MV_2 = u_2$ = manipulated variables in regulatory layer; $\dim(MV_2) = \dim(u_2) \le \dim(u)$
plant design. However, there may be a possibility
to add inputs or to move some to another location, 
for example, to avoid saturation or to reduce the 
time delay and thus improve the input-output 
controllability. 

There is much more flexibility in terms of output
selection, and the most important structural
decision is related to the selection of controlled
variables in the two control layers, as given by
the decision matrices $H_1$ and $H_2$ (see Fig. 3):

$$CV_1 = H_1 y$$

$$CV_2 = H_2 y$$

Note from the definition in Table 1 that $y = [y_m; u]$.
Thus, $y$ includes, in addition to the candidate
measured outputs ($y_m$), also the physical
inputs $u$. This allows for the possibility of selecting
an input $u$ as a “controlled” variable, which
means that this input is kept constant (or, more
precisely, the input is left “unused” for control in
this layer).

In general, $H_1$ and $H_2$ are “full” matrices,
allowing for measurement combinations as controlled
variables. However, for simplicity, especially
in the regulatory layer, we often prefer to
control individual measurements, that is, $H_2$ is
usually a “selection matrix,” where each row
in $H_2$ contains one 1-element (to identify the
selected variable) with the remaining elements set
to 0. In this case, we can write $CV_2 = H_2 y = [y_2; u_1]$,
where $y_2$ denotes the actual controlled
variables in the regulatory layer, whereas $u_1$ denotes
the “unused” inputs, which are left
as degrees of freedom for the supervisory layer.
Note that this indirectly determines the inputs $u_2$
used in the regulatory layer to control $y_2$, because
$u_2$ is what remains in the set $u$ after selecting $u_1$.
To have a simple control structure, with as few
regulatory loops as possible, it is desirable that
$H_2$ is selected such that there are many inputs ($u_1$)
left “unused” in the regulatory layer.

Example. Assume there are three candidate output
measurements (temperatures T) and two inputs
(flowrates q),

$$y_m = [T_a\ T_b\ T_c], \quad u = [q_a\ q_b]$$

and we have by definition $y = [y_m; u]$. Then the
choice

$$H_2 = [0\ 1\ 0\ 0\ 0;\ 0\ 0\ 0\ 0\ 1]$$

means that we have selected $CV_2 = H_2 y = [T_b; q_b]$.
Thus, $u_1 = q_b$ is an unused input for
regulatory control, and in the regulatory layer we
close one loop, using $u_2 = q_a$ to control $y_2 = T_b$.
If we instead select

$$H_2 = [1\ 0\ 0\ 0\ 0;\ 0\ 0\ 1\ 0\ 0]$$

then we have $CV_2 = [T_a; T_c]$. None of these are
inputs, so $u_1$ is an empty set in this case. This
means that we need to close two regulatory loops,
using $u_2 = [q_a; q_b]$ to control $y_2 = [T_a; T_c]$.
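
The bookkeeping in this example can be reproduced with a few lines of Python. The sketch below is only an illustration: the function names `controlled_variables` and `split_layers` and the string labels for the signals are hypothetical, and the code handles only selection matrices (one 1-element per row), not full measurement combinations.

```python
def controlled_variables(H, y_names):
    """Apply a selection matrix H (one 1 per row, else 0) to the named vector y."""
    selected = []
    for row in H:
        if sorted(row) != [0] * (len(row) - 1) + [1]:
            raise ValueError("each row must contain exactly one 1 and zeros elsewhere")
        selected.append(y_names[row.index(1)])
    return selected


def split_layers(H2, ym_names, u_names):
    """Return (CV2, y2, u1, u2) implied by the choice of H2, with y = [ym; u]."""
    y_names = ym_names + u_names
    cv2 = controlled_variables(H2, y_names)
    u1 = [name for name in cv2 if name in u_names]      # inputs left unused
    y2 = [name for name in cv2 if name not in u_names]  # regulatory outputs
    u2 = [name for name in u_names if name not in u1]   # inputs used for regulation
    return cv2, y2, u1, u2


ym = ["Ta", "Tb", "Tc"]
u = ["qa", "qb"]
first_choice = split_layers([[0, 1, 0, 0, 0], [0, 0, 0, 0, 1]], ym, u)
second_choice = split_layers([[1, 0, 0, 0, 0], [0, 0, 1, 0, 0]], ym, u)
```

The first choice leaves one input unused and requires one regulatory loop; the second uses both inputs and requires two loops. The pairing of individual inputs in `u2` with outputs in `y2` is a separate decision that this sketch does not address.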

Supervisory Control Layer and 
Selection of Economic Controlled 
Variables (CV1)

Some objectives for the supervisory control layer are given in Table 2.
The main structural issue for the supervisory control layer, and probably
the most important decision in the design of any control system, is the
selection of the primary (economic) controlled variable $\mathrm{CV}_1$. In
many cases, a good engineer can make a reasonable choice based on process
insight and experience. However, the control engineer must realize that
this is a critical decision. The main rules and issues for selecting
$\mathrm{CV}_1$ are
$\mathrm{CV}_1$-Rule 1. Control active constraints (almost always).

• Active constraints may often be identified by engineering insight, but
more generally their identification requires optimization based on a
detailed model.

For example, consider the problem of minimizing the driving time between
two cities (cost $J = T$). There is a single input ($u$ = fuel flow $f$
[l/s]) and the optimal solution is often constrained. When driving a fast
car, the active constraint may be the speed limit ($\mathrm{CV}_1 = v$
[km/h] with setpoint $v_{max}$, e.g., $v_{max} = 100$ km/h). When driving
an old car, the active constraint may be the maximum fuel flow
($\mathrm{CV}_1 = f$ [l/s] with setpoint $f_{max}$). The latter
corresponds to an input constraint ($u_{max} = f_{max}$), which is trivial
to implement (“full gas”); the former corresponds to an output constraint
($y_{max} = v_{max}$), which requires a controller (“cruise control”).

• For “hard” output constraints, which cannot be violated at any time, we
need to introduce a backoff (safety margin) to guarantee feasibility. The
backoff is defined as the difference between the optimal value and the
actual setpoint; for example, we need to back off from the speed limit
because of the possibility of measurement error and imperfect control:

$$\mathrm{CV}_{1s} = \mathrm{CV}_{1,max} - \text{backoff}$$

For example, to avoid exceeding the speed limit of 100 km/h, we may set
backoff = 5 km/h and use a setpoint $v_s = 95$ km/h rather than 100 km/h.

Control Structure Selection, Table 2 Objectives of supervisory control layer

O1. Control primary “economic” variables $\mathrm{CV}_1$ at setpoint using as degrees of freedom $\mathrm{MV}_1$, which includes the
setpoints to the regulatory layer ($y_{2s} = \mathrm{CV}_{2s}$) as well as any “unused” degrees of freedom ($u_1$)

O2. Switch controlled variables ($\mathrm{CV}_1$) depending on operating region, for example, because of change in active
constraints

O3. Supervise the regulatory layer, for example, to avoid input saturation ($u_2$), which may destabilize the system

O4. Coordinate control loops (multivariable control) and reduce effect of interactions (decoupling)

O5. Provide feedforward action from measured disturbances

O6. Make use of additional inputs, for example, to improve the dynamic performance (usually combined with input
midranging control) or to extend the steady-state operating range (split range control)

O7. Make use of extra measurements, for example, to estimate the primary variables $\mathrm{CV}_1$

$\mathrm{CV}_1$-Rule 2. For the remaining unconstrained degrees of
freedom, look for “self-optimizing” variables which, when held constant,
indirectly lead to close-to-optimal operation in spite of disturbances.

• Self-optimizing variables ($\mathrm{CV}_1 = H_1 y$) are variables
which, when kept constant, indirectly (through the action of the feedback
control system) lead to close-to-optimal adjustment of the inputs ($u$)
when there are disturbances ($d$).

• An ideal self-optimizing variable is the gradient of the cost function
with respect to the unconstrained input, $\mathrm{CV}_1 = dJ/du = J_u$.

• More generally, since we rarely can measure the gradient $J_u$, we
select $\mathrm{CV}_1 = H_1 y$. The selection of a good $H_1$ is a
nontrivial task, but some quantitative approaches are given below.

For example, consider again the problem of driving between two cities, but
assume that the objective is to minimize the total fuel, $J = V$ [liters].
Here, driving at maximum speed will consume too much fuel, and driving too
slowly is also nonoptimal. This is an unconstrained optimization problem,
and identifying a good $\mathrm{CV}_1$ is not obvious. One option is to
maintain a constant speed ($\mathrm{CV}_1 = v$), but the optimal value of
$v$ may vary depending on the slope of the road. A more “self-optimizing”
option could be to keep a constant fuel rate ($\mathrm{CV}_1 = f$ [l/s]),
which will imply that we drive slower uphill and faster downhill. More
generally, one can control combinations, $\mathrm{CV}_1 = H_1 y$, where
$H_1$ is a “full” matrix.

$\mathrm{CV}_1$-Rule 3. For the unconstrained degrees of freedom, one
should never control a variable that reaches its maximum or minimum value
at the optimum; for example, never try to control the cost $J$ directly.
Violation of this rule gives either infeasibility (if attempting to
control $J$ at a lower value than $J_{min}$) or nonuniqueness (if
attempting to control $J$ at a higher value than $J_{min}$).

Assume again that we want to minimize the total fuel needed to drive
between two cities, $J = V$ [l]. Then one should avoid fixing the total
fuel, $\mathrm{CV}_1 = V$ [l], or, alternatively, avoid fixing the fuel
consumption (“gas mileage”) in liters per km ($\mathrm{CV}_1 = f$ [l/km]).
Attempting to control the fuel consumption [l/km] below the car's minimum
value is obviously not possible (infeasible). Alternatively, attempting to
control the fuel consumption above its minimum value has two possible
solutions: driving slower or faster than the optimum. Note that the policy
of controlling the fuel rate $f$ [l/s] at a fixed value will never become
infeasible.
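
The infeasibility and nonuniqueness in Rule 3 can be made concrete with a small numerical sketch. The consumption model below (an idle term a/v plus a drag term b·v², in normalized units) and all parameter values are assumptions chosen only for illustration; they are not taken from any real vehicle.

```python
def consumption_per_km(v, a=2.0, b=1.0):
    """Hypothetical fuel consumption per distance at normalized speed v."""
    return a / v + b * v ** 2


def bisect(func, lo, hi, tol=1e-10):
    """Plain bisection; assumes func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def speeds_for_target(target, a=2.0, b=1.0, v_max=10.0):
    """Speeds at which the consumption equals the setpoint (empty if infeasible)."""
    v_star = (a / (2.0 * b)) ** (1.0 / 3.0)   # minimizer of a/v + b*v**2
    c_min = consumption_per_km(v_star, a, b)
    if target < c_min:
        return []                              # setpoint below the minimum
    if target == c_min:
        return [v_star]

    def residual(v):
        return consumption_per_km(v, a, b) - target

    slow = bisect(residual, 1e-6, v_star)      # branch slower than the optimum
    fast = bisect(residual, v_star, v_max)     # branch faster than the optimum
    return [slow, fast]


below_minimum = speeds_for_target(2.5)   # infeasible setpoint
above_minimum = speeds_for_target(4.0)   # two speeds give the same consumption
print(below_minimum, above_minimum)
```

With these assumed parameters the minimum consumption is 3 at v = 1, so a setpoint of 2.5 has no solution and a setpoint of 4 is met by one speed on each side of the optimum.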

For $\mathrm{CV}_1$-Rule 2, it is always possible to find good variable
combinations (i.e., $H_1$ is a “full” matrix), at least locally, but
whether or not it is possible to find good individual variables ($H_1$ is
a selection matrix) is not obvious. To help identify potential
“self-optimizing” variables ($\mathrm{CV}_1 = c$), the following
requirements may be used:

Requirement 1. The optimal value of $c$ is insensitive to disturbances,
that is, $dc^{opt}/dd = H_1 F$ is small. Here $F = dy^{opt}/dd$ is the
optimal sensitivity matrix (see below).

Requirement 2. The variable $c$ is easy to measure and control accurately.

Requirement 3. The value of $c$ is sensitive to changes in the manipulated
variable $u$; that is, the gain $G = H_1 G^y$ from $u$ to $c$ is large (so
that even a large error in the controlled variable $c$ results in only a
small variation in $u$). Equivalently, the optimum should be “flat” with
respect to the variable $c$. Here $G^y = dy/du$ is the measurement gain
matrix (see below).

Requirement 4. For cases with two or more controlled variables $c$, the
selected variables should not be closely correlated.

All four requirements should be satisfied. For example, for the operation
of a marathon runner, the heart rate may be a good “self-optimizing”
controlled variable $c$ (to keep at constant setpoint). Let us check this
against the four requirements. The optimal heart rate is weakly dependent
on the disturbances (requirement 1) and the heart rate is easy to measure
(requirement 2). The heart rate is quite sensitive to changes in power
input (requirement 3). Requirement 4 does not apply since this is a
problem with only one unconstrained input (the power). In summary, the
heart rate is a good candidate.

Regions and switching. If the optimal active constraints vary depending on
the disturbances, new controlled variables ($\mathrm{CV}_1$) must be
identified (offline) for each active constraint region, and online
switching is required to maintain optimality. In practice, it is easy to
identify when to switch into a constraint region, namely when one reaches
the constraint. It is less obvious when to switch out of a constraint, but
one simply has to monitor the value of the unconstrained CVs from the
neighbouring regions and switch out of the constraint region when the
unconstrained CV reaches its setpoint.
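
The switching logic described above can be sketched as a small state machine. This is an illustrative sketch only: the class name, the single max-type constraint, and in particular the way the code detects that the unconstrained CV has "reached" its setpoint (by remembering on which side of the setpoint it drifted) are assumptions made for this example.

```python
import math


class RegionSwitch:
    """Hypothetical switch between one max-type constraint and one unconstrained CV."""

    def __init__(self, limit, cv_setpoint, tol=1e-9):
        self.limit = limit                # constraint value, e.g. a speed limit
        self.cv_setpoint = cv_setpoint    # setpoint of the unconstrained CV
        self.tol = tol
        self.mode = "unconstrained"
        self._drift_sign = 0.0            # side to which the CV drifted (assumption)

    def update(self, constrained_var, unconstrained_cv):
        """Return which variable should be controlled after this sample."""
        error = unconstrained_cv - self.cv_setpoint
        if self.mode == "unconstrained":
            # Switching in: the constraint has been reached.
            if constrained_var >= self.limit:
                self.mode = "constraint"
                self._drift_sign = 0.0
        elif self._drift_sign == 0.0:
            # Record the direction in which the unconstrained CV moves away.
            if abs(error) > self.tol:
                self._drift_sign = math.copysign(1.0, error)
        elif error * self._drift_sign <= 0.0:
            # Switching out: the monitored CV is back at (or past) its setpoint.
            self.mode = "unconstrained"
            self._drift_sign = 0.0
        return self.mode


switch = RegionSwitch(limit=100.0, cv_setpoint=5.0)
samples = [(90.0, 5.0), (100.0, 5.0), (100.0, 4.0), (100.0, 4.5), (99.0, 5.0)]
history = [switch.update(v, cv) for v, cv in samples]
print(history)
```

In a real implementation the same effect is often obtained with selectors (min/max blocks) rather than explicit mode logic; the explicit version is used here only because it mirrors the wording of the text.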

In general, one would like to simplify the control structure and reduce
the need for switching. This may require using a suboptimal
$\mathrm{CV}_1$ in some regions of active constraints. In this case, the
setpoint for $\mathrm{CV}_1$ may not be its nominally optimal value (which
is the normal choice), but rather a “robust setpoint” (with backoff) which
reduces the loss when we are outside the nominal constraint region.

Structure of supervisory layer. The supervisory layer may either be
centralized, e.g., using model predictive control (MPC), or decomposed
into simpler subcontrollers using standard elements, like decentralized
control (PID), cascade control, selectors, decouplers, feedforward
elements, ratio control, split range control, and input midrange control
(also known as input resetting, valve position control, or habituating
control). In theory, the performance is better with the centralized
approach (e.g., MPC), but the difference can be small when the
decomposed structure is designed by a good engineer. The main reasons for
using simpler elements are that (1) the system can be implemented in the
existing “basic” control system, (2) it can be implemented with little
model information, and (3) it can be built up gradually. However, such
systems can quickly become complicated and difficult to understand for
anyone other than the engineer who designed them. Therefore, model-based
centralized solutions (MPC) are often preferred because the design is more
systematic and easier to modify.
Quantitative Approach for Selecting
Economic Controlled Variables, $\mathrm{CV}_1$

A quantitative approach for selecting economic controlled variables is to
consider the effect of the choice $\mathrm{CV}_1 = H_1 y$ on the economic
cost $J$ when disturbances $d$ occur. One should also include noise/errors
($n^y$) related to the measurements and inputs.

Step S1. Define operational objectives (economic cost function $J$ and
constraints)

We first quantify the operational objectives in terms of a scalar cost
function $J$ [$/s] that should be minimized (or equivalently, a scalar
profit function, $P = -J$, that should be maximized). For process control
applications, this is usually easy, and typically we have

$$J = \text{cost feed} + \text{cost utilities (energy)} - \text{value products} \quad [\$/\text{s}]$$

Note that the economic cost function $J$ is used to select the controlled
variables ($\mathrm{CV}_1$), and another cost function ($J'$), typically
involving the deviation in $\mathrm{CV}_1$ from their optimal setpoints
$\mathrm{CV}_{1s}$, is used for the actual controller design (e.g., using
MPC).

Step S2. Find optimal operation for expected disturbances

Mathematically, the optimization problem can be formulated as

$$\min_u J(u, x, d)$$

subject to:

Model equations: $dx/dt = f(u, x, d)$

Operational constraints: $g(u, x, d) \le 0$

In many cases, the economics are determined by the steady-state behavior,
so we can set $dx/dt = 0$. The optimization problem should be re-solved
for the expected disturbances ($d$) to find the truly optimal operation
policy, $u_{opt}(d)$. The nominal solution ($d_{nom}$) may be used to
obtain the setpoints ($\mathrm{CV}_{1s}$) for the selected controlled
variables. In practice, the optimum input $u_{opt}(d)$ cannot be realized
because of model error and unknown disturbances $d$, so we use a feedback
implementation where $u$ is adjusted to keep the selected variables
$\mathrm{CV}_1$ at their nominally optimal setpoints.

Together with obtaining the model, the optimization step S2 is often the
most time-consuming step in the entire plantwide control procedure.
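
As a rough illustration of Step S2, the sketch below re-solves a steady-state optimization for several disturbance values and reports which constraint, if any, is active. The scalar cost J = (u − d)², the input bounds, and the function names are all hypothetical stand-ins for a real process model; a ternary search is used only because this particular cost is convex in u.

```python
def cost(u, d):
    """Hypothetical steady-state economic cost J(u, d)."""
    return (u - d) ** 2


def optimize_input(d, u_min=0.0, u_max=1.0, tol=1e-9):
    """Minimize J over u_min <= u <= u_max by ternary search (J convex in u)."""
    lo, hi = u_min, u_max
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if cost(m1, d) < cost(m2, d):
            hi = m2      # the minimizer cannot lie to the right of m2
        else:
            lo = m1      # the minimizer cannot lie to the left of m1
    return 0.5 * (lo + hi)


def active_constraint(u_opt, u_min=0.0, u_max=1.0, tol=1e-6):
    """Name the input bound that is active at the optimum, or None."""
    if abs(u_opt - u_max) < tol:
        return "u_max"
    if abs(u_opt - u_min) < tol:
        return "u_min"
    return None


# Re-solve for a set of expected disturbances to map the constraint regions.
regions = {d: active_constraint(optimize_input(d)) for d in (-1.0, 0.5, 2.0)}
print(regions)
```

For a real plant the inner optimization would be a nonlinear program over a steady-state model; the point of the sketch is only the outer loop over disturbances and the resulting map from disturbance to active constraint set.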

Step S3. Select supervisory (economic) controlled variables,
$\mathrm{CV}_1$

$\mathrm{CV}_1$-Rule 1: Control Active Constraints

A primary goal for solving the optimization problem is to find the
expected regions of active constraints, and a constraint is said to be
“active” if $g = 0$ at the optimum. The optimally active constraints will
vary depending on disturbances ($d$) and market conditions (prices).

$\mathrm{CV}_1$-Rule 2: Control Self-Optimizing Variables

After having identified (and controlled) the active constraints, one
should consider the remaining lower-dimension unconstrained optimization
problem, and for the remaining unconstrained degrees of freedom one should
search for “self-optimizing” variables $c$ to control.

1. “Brute force” approach. Given a set of controlled variables
$\mathrm{CV}_1 = c = H_1 y$, one computes the cost $J(c, d)$ when we keep
$c$ constant ($c = c_s + H_1 n^y$) for various disturbances ($d$) and
measurement errors ($n^y$). In practice, this is done by running a large
number of steady-state simulations to try to cover the expected future
operation.

2. “Local” approaches based on a quadratic approximation of the cost $J$.
Linear models are used for the effect of $u$ and $d$ on $y$.

$$y = G^y u + G^y_d d$$

This is discussed in more detail in Alstad et al. (2009) and references
therein. The main local approaches are:

2A. Maximum gain rule: maximize the minimum singular value of
$G = H_1 G^y$. In other words, the maximum gain rule, which essentially is
a quantitative version of Requirements 1, 3, and 4 given above, says that
one should control “sensitive” variables, with a large scaled gain $G$
from the inputs ($u$) to $c = H_1 y$. This rule is good for pre-screening
and also yields good insight.

2B. Nullspace method. This method yields optimal measurement combinations
for the case with no noise, $n^y = 0$. One must first obtain the optimal
measurement sensitivity matrix $F$, defined as

$$F = dy^{opt}/dd.$$

Each column in $F$ expresses the optimal change in the $y$'s when the
independent variable ($u$) is adjusted so that the system remains optimal
with respect to the disturbance $d$. Usually, it is simplest to obtain $F$
numerically by optimizing the model. Alternatively, we can obtain $F$ from
a quadratic approximation of the cost function

$$F = G^y_d - G^y J_{uu}^{-1} J_{ud}$$

Then, assuming that we have at least as many (independent) measurements
$y$ as the sum of the number of (independent) inputs ($u$) and
disturbances ($d$), the optimal choice is to select $c = H_1 y$ such that

$$H_1 F = 0$$

Note that $H_1$ is a nonsquare matrix, so $H_1 F = 0$ does not require
that $H_1 = 0$ (which is a trivial, uninteresting solution), but rather
that the rows of $H_1$ lie in the nullspace of $F^T$ (the left nullspace
of $F$).

2C. Exact local method (loss method). This extends the nullspace method to
include noise ($n^y$) and allows for any number of measurements. The noise
and disturbances are normalized by introducing weighting matrices
$W_{n^y}$ and $W_d$ (which have the expected magnitudes along the
diagonal), and then the expected loss, $L = J - J_{opt}(d)$, is minimized
by selecting $H_1$ to solve the following problem

$$\min_{H_1} \|M(H_1)\|_2$$

where the subscript 2 denotes the Frobenius norm and

$$M(H_1) = J_{uu}^{1/2} (H_1 G^y)^{-1} H_1 Y, \quad Y = [F W_d \;\; W_{n^y}].$$

Note here that the optimal choice with $W_{n^y} = 0$ (no noise) is to
choose $H_1$ such that $H_1 F = 0$, which is the nullspace method. For the
general case, when $H_1$ is a “full” matrix, this is a convex problem and
the optimal solution is

$$H_1^T = (Y Y^T)^{-1} G^y Q$$

where $Q$ is any nonsingular matrix.
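
The brute-force, nullspace, and exact local approaches listed above can be compared on a small toy problem. Everything numerical in the sketch below is an assumption made for illustration: a scalar input and disturbance with cost J = (u − d)², three hypothetical measurements y1 = 0.1(u − d), y2 = 20u, y3 = 10u − 5d, and unit weights for the disturbance and noise magnitudes. The helper names are likewise hypothetical.

```python
import itertools
import math

G_Y = [0.1, 20.0, 10.0]      # dy/du for the three assumed measurements
G_YD = [-0.1, 0.0, -5.0]     # dy/dd
J_UU, J_UD = 2.0, -2.0       # second derivatives of J = (u - d)**2
W_D = 1.0                    # expected disturbance magnitude (assumed)
W_N = [1.0, 1.0, 1.0]        # expected noise magnitudes (assumed)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Optimal sensitivity F = G^y_d - G^y J_uu^{-1} J_ud (scalar input here).
F = [gd - g * J_UD / J_UU for g, gd in zip(G_Y, G_YD)]


def brute_force_average_loss(h):
    """Average loss over all +/- corner combinations of d and noise, c = h.y."""
    losses = []
    for signs in itertools.product((-1.0, 1.0), repeat=1 + len(W_N)):
        d = W_D * signs[0]
        noise = [w * s for w, s in zip(W_N, signs[1:])]
        # Holding the measured c at its nominal setpoint 0 fixes the input u.
        u = -(dot(h, G_YD) * d + dot(h, noise)) / dot(h, G_Y)
        losses.append((u - d) ** 2)        # J_opt = 0 for this toy cost
    return sum(losses) / len(losses)


def m_norm(h):
    """Frobenius norm of M(H) = J_uu^(1/2) (H G^y)^(-1) H Y for a single row h."""
    hy = [dot(h, F) * W_D] + [hi * w for hi, w in zip(h, W_N)]
    return math.sqrt(J_UU) * math.sqrt(dot(hy, hy)) / abs(dot(h, G_Y))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def nullspace_h(i, j):
    """Combine measurements i and j so that h.F = 0 (nullspace method)."""
    h = [0.0] * len(G_Y)
    h[i], h[j] = F[j], -F[i]
    return h


def exact_local_h():
    """H^T = (Y Y^T)^(-1) G^y with Q = 1 and Y = [F W_d, W_n]."""
    n = len(G_Y)
    yyt = [[W_D ** 2 * F[i] * F[j] + (W_N[i] ** 2 if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    return solve(yyt, G_Y)


candidates = {
    "y1 alone": [1.0, 0.0, 0.0],
    "y2 alone": [0.0, 1.0, 0.0],
    "y3 alone": [0.0, 0.0, 1.0],
    "nullspace (y2, y3)": nullspace_h(1, 2),
    "exact local": exact_local_h(),
}
for name, h in candidates.items():
    print(f"{name:20s} ||M|| = {m_norm(h):.4f}  "
          f"avg loss = {brute_force_average_loss(h):.4f}")
```

For this linear-quadratic toy problem the brute-force average over the ±1 corner points coincides with 0.5·‖M‖², which is why the two columns rank the candidates identically; for a nonlinear plant model the brute-force evaluation would instead require repeated steady-state simulations.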

Regulatory Control Layer 

The main purpose of the regulatory layer is to “stabilize” the plant,
preferably using a simple control structure (e.g., single-loop PID
controllers) which does not require changes during operation. “Stabilize”
is here used in a more extended sense to mean that the process does not
“drift” too far away from acceptable operation when there are
disturbances. The regulatory layer should make it possible to use a “slow”
supervisory control layer that does not require a detailed model of the
high-frequency dynamics. Therefore, in addition to tracking the setpoints
given by the supervisory layer (e.g., MPC), the regulatory layer may
directly control primary variables ($\mathrm{CV}_1$) that require fast and
tight control, like economically important active constraints.

In general, the design of the regulatory layer 
involves the following structural decisions: 

1. Selection of controlled outputs $y_2$ (among all candidate measurements
$y_m$).

2. Selection of inputs $\mathrm{MV}_2 = u_2$ (a subset of all available
inputs $u$) to control the outputs $y_2$.

3. Pairing of inputs $u_2$ and outputs $y_2$ (since decentralized control
is normally used).

Decisions 1 and 2 combined (IO selection) are equivalent to selecting
$H_2$ (Fig. 3). Note that we do not “use up” any degrees of freedom in the
regulatory layer because the setpoints ($y_{2s}$) become manipulated
variables ($\mathrm{MV}_1$) for the supervisory layer (see Fig. 3).
Furthermore, since the setpoints are set by the supervisory layer in a
cascade manner, the system eventually approaches the same steady state (as
defined by the choice of economic variables $\mathrm{CV}_1$) regardless of
the choice of controlled variables in the regulatory layer.

The inputs for the regulatory layer ($u_2$) are selected as a subset of
all the available inputs ($u$). For stability reasons, one should avoid
input saturation in the regulatory layer. In particular, one should avoid
using inputs (in the set $u_2$) that are optimally constrained in some
disturbance region. Otherwise, in order to avoid input saturation, one
needs to include a backoff for the input when entering this operational
region, and doing so will have an economic penalty.

In the regulatory layer, the outputs ($y_2$) are usually selected as
individual measurements, and they are often not important variables in
themselves. Rather, they are “extra outputs” that are controlled in order
to “stabilize” the system, and their setpoints ($y_{2s}$) are changed by
the layer above to obtain economically optimal operation. For example, in
a distillation column one may control a temperature somewhere in the
middle of the column ($y_2 = T$) in order to “stabilize” the column
profile. Its setpoint ($y_{2s} = T_s$) is adjusted by the supervisory
layer to obtain the desired product composition ($y_1 = c$).


Input-Output (IO) Selection for
Regulatory Control ($u_2$, $y_2$)

Finding the truly optimal control structure, including selecting inputs
and outputs for regulatory control, also requires finding the optimal
controller parameters. This is an extremely difficult mathematical
problem, at least if the controller $K$ is decomposed into smaller
controllers. In this section, we consider some approaches which do not
require that the controller parameters be found. This is done by making
assumptions related to achievable control performance (controllability) or
perfect control.


Before we look at the approaches, note again that the IO selection for
regulatory control may be combined into a single decision, by considering
the selection of

$$\mathrm{CV}_2 = [y_2; u_1] = H_2 y$$

Here $u_1$ denotes the inputs that are not used by the regulatory control
layer. This follows because we want to use all inputs $u$ for control, so
assuming that the set $u$ is given, “selection of inputs $u_2$” (decision
2) is by elimination equivalent to “selection of inputs $u_1$.” Note that
$\mathrm{CV}_2$ includes all variables that we keep at desired (constant)
values within the fast time horizon of the regulatory control layer,
including the “unused” inputs $u_1$.

Survey by van de Wal and de Jager

Van de Wal and de Jager (2001) provide an overview of methods for
input-output selection, some of which include:

1. “Accessibility” based on guaranteeing a cause-effect relationship
between the selected inputs ($u_2$) and outputs ($y_2$). Use of such
measures may eliminate unworkable control structures.

2. “State controllability and state observability” to ensure that any
unstable modes can be stabilized using the selected inputs and outputs.

3. “Input-output controllability” analysis to ensure that $y_2$ can be
acceptably controlled using $u_2$. This is based on scaling the system and
then analysing the transfer matrices $G_2(s)$ (from $u_2$ to $y_2$) and
$G_{d2}$ (from expected disturbances $d$ to $y_2$). Some important
controllability measures are right half plane zeros (unstable dynamics of
the inverse), condition number, singular values, relative gain array, etc.
One problem here is that there are many different measures, and it is not
clear which should be given most emphasis.

4. “Achievable robust performance.” This may be viewed as a more detailed
version of input-output controllability, where several relevant issues are
combined into a single measure. However, this requires that the control
problem can actually be formulated clearly, which may be very difficult,
as already mentioned. In addition, it requires finding the optimal robust
controller for the given problem, which may be very difficult.

Most of these methods are useful for analyzing a given structure
($u_2$, $y_2$) but less suitable for selection. The list of methods is
also incomplete, as disturbance rejection, which is probably the most
important issue for the regulatory layer, is hardly considered.
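
Of the controllability measures named in item 3 of the list above, the relative gain array (RGA) is the simplest to compute by hand. The sketch below evaluates it for a 2x2 steady-state gain matrix; the numerical gains are hypothetical, and the function name is introduced only for this illustration.

```python
def rga_2x2(g):
    """Relative gain array of a 2x2 gain matrix.

    The RGA is the elementwise product of G and the transpose of its inverse;
    for a 2x2 matrix it is fully determined by the (1,1) element.
    """
    (g11, g12), (g21, g22) = g
    det = g11 * g22 - g12 * g21
    if det == 0:
        raise ValueError("gain matrix is singular")
    lam11 = g11 * g22 / det
    return [[lam11, 1.0 - lam11], [1.0 - lam11, lam11]]


# Hypothetical steady-state gains from u2 = [qa, qb] to y2 = [Ta, Tc].
gain = [[2.0, 0.5], [0.4, 1.0]]
rga = rga_2x2(gain)
print(rga)
```

A common heuristic is to prefer pairings whose RGA elements are close to 1; with these assumed gains the diagonal elements are about 1.11, which would favour the diagonal pairing. As the text notes, this is an analysis tool for a given structure rather than a selection method.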

A Systematic Approach for IO-Selection
Based on Minimizing State Drift Caused by
Disturbances

The objectives of the regulatory control layer are many, and Yelchuru and
Skogestad (2013) list 13 partly conflicting objectives. To have a truly
systematic approach to regulatory control design, including IO-selection,
we would need to quantify all these partially conflicting objectives in
terms of a scalar cost function $J_2$. We here consider a fairly general
cost function,

$$J_2 = \|W x\|$$

which may be interpreted as the weighted state drift. One justification
for considering the state drift is that the regulatory layer should ensure
that the system, as measured by the weighted states $Wx$, does not drift
too far away from the desired state, and thus stays in the “linear region”
when there are disturbances. Note that the cost $J_2$ is used to select
controlled variables ($\mathrm{CV}_2$) and not to design the controller
(for which the cost may be the control error,
$J_2' = \|\mathrm{CV}_2 - \mathrm{CV}_{2s}\|$).

Within this framework, the IO-selection problem for the regulatory layer
is then to select the nonsquare matrix $H_2$,

$$\mathrm{CV}_2 = H_2 y$$

where $y = [y_m; u]$, such that the cost $J_2$ is minimized. The cause for
changes in $J_2$ are disturbances $d$, and we consider the linear model
(in deviation variables)

$$y = G^y u + G^y_d d$$
$$x = G^x u + G^x_d d$$


where the $G$-matrices are transfer matrices. Here, $G^x_d$ gives the
effect of the disturbances on the states with no control, and the idea is
to reduce the disturbance effect by closing the regulatory control loops.
Within the “slow” time scale of the supervisory layer, we can assume that
$\mathrm{CV}_2$ is perfectly controlled and thus constant, or
$\mathrm{CV}_2 = 0$ in terms of deviation variables. This gives

$$\mathrm{CV}_2 = H_2 G^y u + H_2 G^y_d d = 0$$

and solving with respect to $u$ gives

$$u = -(H_2 G^y)^{-1} H_2 G^y_d \, d$$

and we have

$$x = P^x_d(H_2) \, d$$

where

$$P^x_d(H_2) = G^x_d - G^x (H_2 G^y)^{-1} H_2 G^y_d$$

is the disturbance effect for the “partially” controlled system with only
the regulatory loops closed. Note that it is not generally possible to
make $P^x_d = 0$ because we have more states than we have available
inputs. To have a small “state drift,” we want $J_2 = \|W P^x_d d\|$ to be
small, and to have a simple regulatory control system we want to close as
few regulatory loops as possible. Assume that we have normalized the
disturbances so that the norm of $d$ is 1; then we can solve the following
problem:

For 0, 1, 2, ... loops closed, solve

$$\min_{H_2} \|M_2(H_2)\|$$

where $M_2 = W P^x_d$ and $\dim(u_2) = \dim(y_2)$ = number of loops
closed.

By comparing the value of $\|M_2(H_2)\|$ with different numbers of loops
closed (i.e., with different $H_2$), we can then decide on an appropriate
regulatory layer structure. For example, assume that we find that the
value of $J_2$ is 110 (0 loops closed), 0.2 (1 loop), and 0.02 (2 loops),
and assume we have scaled the disturbances and states such that a
$J_2$-value less than about 1 is acceptable; then closing 1 regulatory
loop is probably the best choice.
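
The comparison of candidate structures described above can be sketched for a small case with one input, one disturbance, two states, and two candidate measurements. All gains below are hypothetical steady-state values chosen for illustration, and y is taken as [ym1, ym2, u] so that selecting the input itself corresponds to closing no loop.

```python
import math

G_X = [1.0, 0.5]          # effect of u on the two states (assumed)
G_XD = [2.0, 1.2]         # effect of d on the two states (assumed)
G_Y = [1.0, 0.5, 1.0]     # effect of u on y = [ym1, ym2, u]
G_YD = [2.0, 1.2, 0.0]    # effect of d on y = [ym1, ym2, u]
W = [1.0, 1.0]            # state weights (assumed)

CANDIDATES = {
    "0 loops (keep u constant)": [0.0, 0.0, 1.0],
    "1 loop (control ym1)": [1.0, 0.0, 0.0],
    "1 loop (control ym2)": [0.0, 1.0, 0.0],
}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def state_drift(h2):
    """||W P_d|| with P_d = G^x_d - G^x (H2 G^y)^(-1) H2 G^y_d (scalar input)."""
    hg = dot(h2, G_Y)      # H2 G^y, a scalar because there is one input
    hgd = dot(h2, G_YD)    # H2 G^y_d
    p_d = [gxd - gx * hgd / hg for gx, gxd in zip(G_X, G_XD)]
    return math.sqrt(sum((w * p) ** 2 for w, p in zip(W, p_d)))


drift = {name: state_drift(h2) for name, h2 in CANDIDATES.items()}
best = min(drift, key=drift.get)
for name, value in drift.items():
    print(f"{name:28s} J2 = {value:.3f}")
print("smallest state drift:", best)
```

With these assumed numbers, leaving the input constant gives a drift of about 2.33, while controlling the first measurement reduces it to about 0.2. With several inputs the scalar division becomes a matrix inverse and the candidate set grows combinatorially.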
In principle, this is straightforward, but there are three remaining
issues: (1) we need to choose an appropriate norm, (2) we should include
measurement noise to avoid selecting insensitive measurements, and (3) the
problem must be solvable numerically.

Issue 1. The norm of $M_2$ should be evaluated in the frequency range
between the “slow” bandwidth of the supervisory control layer
($\omega_{B1}$) and the “fast” bandwidth of the regulatory control layer
($\omega_{B2}$). However, since it is likely that the system sometimes
operates without the supervisory layer, it is reasonable to evaluate the
norm of $M_2$ in the frequency range from 0 (steady state) to
$\omega_{B2}$. Since we want $H_2$ to be a constant (not
frequency-dependent) matrix, it is reasonable to choose $H_2$ to minimize
the norm of $M_2$ at the frequency where $\|M_2\|$ is expected to have its
peak. For some mechanical systems, this may be at some resonance
frequency, but for process control applications it is usually at steady
state ($\omega = 0$), that is, we can use the steady-state gain matrices
when computing $P^x_d$. In terms of the norm, we use the 2-norm (Frobenius
norm), mainly because it has good numerical properties, and also because
it has the interpretation of giving the expected variance in $x$ for
normally distributed disturbances.

Issues 2 and 3. If we also include measurement noise $n^y$, which we
should, then the expected value of $J_2$ is minimized by solving the
problem $\min_{H_2} \|M_2(H_2)\|_2$, where (Yelchuru and Skogestad 2013)

$$M_2(H_2) = J_{2uu}^{1/2} (H_2 G^y)^{-1} H_2 Y_2$$

$$Y_2 = [F_2 W_d \;\; W_{n^y}], \quad F_2 = \frac{\partial y^{opt}}{\partial d} = G^y_d - G^y J_{2uu}^{-1} J_{2ud}$$

where

$$J_{2uu} = \frac{\partial^2 J_2}{\partial u^2} = 2 G^{xT} W^T W G^x, \quad J_{2ud} = \frac{\partial^2 J_2}{\partial u \, \partial d} = 2 G^{xT} W^T W G^x_d.$$

Note that this is the same mathematical problem as the “exact local
method” presented for selecting $\mathrm{CV}_1 = H_1 y$ for minimizing the
economic cost $J$, but because of the specific simple form for the cost
$J_2$, it is possible to obtain analytical formulas for the optimal
sensitivity, $F_2$. Again, $W_d$ and $W_{n^y}$ are diagonal matrices,
expressing the expected magnitude of the disturbances ($d$) and noise (for
$y$).

For the case when $H_2$ is a “full” matrix, this can be reformulated as a
convex optimization problem and an explicit solution is

$$H_2^T = (Y_2 Y_2^T)^{-1} G^y \left( G^{yT} (Y_2 Y_2^T)^{-1} G^y \right)^{-1} J_{2uu}^{1/2}$$

and from this we can find the optimal value of $J_2$. It may seem
restrictive to assume that $H_2$ is a “full” matrix, because we usually
want to control individual measurements, and then $H_2$ should be a
selection matrix, with 1's and 0's. Fortunately, since we in this case
want to control as many measurements ($y_2$) as inputs ($u_2$), we have
that $H_2$ is square in the selected set, and the optimal value of $J_2$
when $H_2$ is a selection matrix is the same as when $H_2$ is a full
matrix. The reason for this is that specifying (controlling) any linear
combination of $y_2$ uniquely determines the individual $y_2$'s, since
$\dim(u_2) = \dim(y_2)$. Thus, we can find the optimal selection matrix
$H_2$ by searching through all the candidate square sets of $y$. This can
be effectively solved using the branch and bound approach of Kariwala and
Cao (2010), or alternatively it can be solved as a mixed-integer problem
with a quadratic program (QP) at each node (Yelchuru and Skogestad 2012).
The approach of Yelchuru and Skogestad can also be applied to the case
where we allow for disjunct sets of measurement combinations, which may
give a lower $J_2$ in some cases.

Comments on the state drift approach.

1. We have assumed that we perfectly control $y_2$ using $u_2$, at least
within the bandwidth of the regulatory control system. Once one has found
a candidate control structure ($H_2$), one should check that it is
possible to achieve acceptable control. This may be done by analyzing the
input-output controllability of the system $y_2 = G_2 u_2 + G_{2d} d$,
based on the transfer matrices $G_2 = H_2 G^y$ and $G_{2d} = H_2 G^y_d$.
If the controllability of this system is not acceptable, then one should
consider the second-best matrix $H_2$ (with the second-best value of the
state drift $J_2$) and so on.

2. The state drift cost $J_2 = \|Wx\|$ is in principle independent of the
economic cost ($J$). This is an advantage because we know that the
economically optimal operation (e.g., active constraints) may change,
whereas we would like the regulatory layer to remain unchanged. However,
it is also a disadvantage, because the regulatory layer determines the
initial response to disturbances, and we would like this initial response
to be in the right direction economically, so that the required correction
from the slower supervisory layer is as small as possible. This issue can
be addressed by extending the state vector $x$ to also include the
economic controlled variables, $\mathrm{CV}_1$, which are selected based
on the economic cost $J$. The weight matrix $W$ may then be used to adjust
the relative weights of avoiding drift in the internal states $x$ and in
the economic controlled variables $\mathrm{CV}_1$.

3. The above steady-state approach does not consider input-output pairing,
for which dynamics are usually the main issue. The main pairing rule is to
“pair close” in order to minimize the effective time delay between the
selected input and output. For a more detailed approach, decentralized
input-output controllability must be considered.


Summary and Future Directions 

Control structure design involves the structural 
decisions that must be made before designing 
the actual controller, and it is in most cases a 
much more important step than the controller 
design. In spite of this, the theoretical tools for 
making the structural decisions are much less 
developed than for controller design. This chapter 
summarizes some approaches, and it is expected, 
or at least hoped, that this important area will 
further develop in the years to come. 

The most important structural decision is usually related to selecting the
economic controlled variables, $\mathrm{CV}_1 = H_1 y$, and the
stabilizing controlled variables, $\mathrm{CV}_2 = H_2 y$. However,
control engineers have traditionally not used the degrees of freedom in
the matrices $H_1$ and $H_2$, and this chapter has summarized some
approaches.

There has been a belief that the use of “advanced control,” e.g., MPC,
makes control structure design less important. However, this is not
correct because for MPC, too, one must choose inputs
($\mathrm{MV}_1 = \mathrm{CV}_{2s}$) and outputs ($\mathrm{CV}_1$). The
selection of $\mathrm{CV}_1$ may to some extent be avoided by use of
“Dynamic Real-Time Optimization (DRTO)” or “Economic MPC,” but these
optimizing controllers usually operate on a slower time scale by sending
setpoints to the basic control layer
($\mathrm{MV}_1 = \mathrm{CV}_{2s}$), which means that selecting the
variables $\mathrm{CV}_2$ is critical for achieving (close to) optimality
on the fast time scale.


Cross-References 

► Control Hierarchy of Large Processing Plants: 
An Overview 

► Industrial MPC of Continuous Processes 

► PID Control 


Bibliography 

Alstad V, Skogestad S (2007) Null space method for selecting optimal
measurement combinations as controlled variables. Ind Eng Chem Res
46(3):846-853

Alstad V, Skogestad S, Hori ES (2009) Optimal measurement combinations as
controlled variables. J Process Control 19:138-148

Downs JJ, Skogestad S (2011) An industrial and academic perspective on
plantwide control. Ann Rev Control 17:99-110

Engell S (2007) Feedback control for optimal process operation. J Process
Control 17:203-219

Foss AS (1973) Critique of chemical process control theory. AIChE J
19(2):209-214

Kariwala V, Cao Y (2010) Bidirectional branch and bound for controlled
variable selection. Part III. Local average loss minimization. IEEE Trans
Ind Inform 6:54-61

Kookos IK, Perkins JD (2002) An algorithmic method for the selection of
multivariable process control structures. J Process Control 12:85-99





Morari M, Arkun Y, Stephanopoulos G (1973) Studies 
in the synthesis of control structures for chemical 
processes. Part I. AIChE J 26:209-214 
Narraway LT, Perkins JD (1993) Selection of control 
structure based on economics. Comput Chem Eng 
18:S511-S515

Skogestad S (2000) Plantwide control: the search for 
the self-optimizing control structure. J Proc Control 
10:487-507 

Skogestad S (2004) Control structure design for complete 
chemical plants. Comput Chem Eng 28(1-2):219-234
Skogestad S (2012) Economic plantwide control, chap¬ 
ter 11. In: Rangaiah GP, Kariwala V (eds) Plantwide 
control. Recent developments and applications. Wiley, 
Chichester, pp 229-251. ISBN:978-0-470-98014-9 
Skogestad S, Postlethwaite I (2005) Multivariable feed¬ 
back control, 2nd edn. Wiley, Chichester 
van de Wal M, de Jager B (2001) Review of methods for 
input/output selection. Automatica 37:487-510 
Yelchuru R, Skogestad S (2012) Convex formulations for 
optimal selection of controlled variables and measure¬ 
ments using Mixed Integer Quadratic Programming. 
J Process Control 22:995-1007 
Yelchuru R, Skogestad S (2013) Quantitative methods for 
regulatory layer selection. J Process Control 23:58-69 


Controllability and Observability 

H.L. Trentelman 

Johann Bernoulli Institute for Mathematics and 
Computer Science, University of Groningen, 
Groningen, AV, The Netherlands 

Abstract 

State controllability and observability are key 
properties in linear input-output systems in state- 
space form. In the state-space approach, the re¬ 
lation between inputs and outputs is represented 
using the state variables of the system. A natural 
question is then to what extent it is possible 
to manipulate the values of the state vector by 
means of an appropriate choice of the input func¬ 
tion. The concepts of controllability, reachability, 
and null controllability address this issue. An¬ 
other important question is whether it is possible 
to uniquely determine the values of the state 
vector from knowledge of the input and output 


signals over a given time interval. This question 
is dealt with using the concept of observability. 


Keywords 

Controllability; Duality; Indistinguishability; 
Input-output systems in state-space form; 
Observability; Reachability 


Introduction 

In the state-space approach to input-output 
systems, the relation between input signals 
and output signals is represented by means of 
two equations. In the continuous-time case, the 
first of these equations is a first-order vector 
differential equation driven by the input signal 
and is often called the state equation. The second 
equation is an algebraic equation, often called the 
output equation. The unknown in the differential 
equation is called the state vector of the system. 
Given a particular input signal and initial value 
of the state vector, the state equation generates 
a unique solution, called the state trajectory of 
the system. The output equation determines the 
corresponding output signal as a function of this 
state trajectory and the input signal. Thus, in the 
state space approach, the input-output behavior 
of the system is obtained using the state vector as 
an intermediate variable. 

In the context of input-output systems in state- 
space form, the properties of controllability and 
observability characterize the interaction between 
the input, the state, and the output. In particular, 
controllability describes the ability to manipulate 
the state vector of the system by applying ap¬ 
propriate input signals. Observability describes 
the ability to determine the values of the state 
vector from knowledge of the input and output 
over a certain time interval. The properties of 
controllability and observability are fundamental 
properties that play a major role in the analysis 
and control of linear input-output systems in 
state-space form. 
Systems with Inputs and Outputs

Consider a continuous-time, linear, time- 
invariant, input-output system in state-space 
form represented by the equations 

$$\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t) + Du(t). \tag{1}$$

This system is referred to as $\Sigma$. In Eq. (1), A, B, C,
and D are maps (or matrices), and the functions x, u, and y
are considered to be defined on the real axis $\mathbb{R}$
or on any subinterval of it. In particular, one often
assumes the domain of definition to be the nonnegative part
of $\mathbb{R}$, which is without loss of generality since
the system is time-invariant. The function u is called the
input, and its values are assumed to be given. The class of
admissible input functions is denoted by $\mathbf{U}$.
Often, $\mathbf{U}$ is the class of piecewise continuous or
locally integrable functions, but for most purposes, the
exact class from which the input functions are chosen is
not important. We assume that input functions take values
in an m-dimensional space $\mathcal{U}$, which we often
identify with $\mathbb{R}^m$. The first equation of
$\Sigma$ is an ordinary differential equation for the
variable x. For a given initial value of x and input
function u, the function x is completely determined by this
equation. The variable x is called the state variable and
it is assumed to take values in an n-dimensional space
$\mathcal{X}$. The space $\mathcal{X}$ is called the state
space. It is usually identified with $\mathbb{R}^n$.
Finally, y is called the output of the system and takes
values in a p-dimensional space $\mathcal{Y}$, which we
identify with $\mathbb{R}^p$. Since the system $\Sigma$ is
completely determined by the maps (or matrices) A, B, C,
and D, we identify $\Sigma$ with the quadruple (A, B, C, D).

The solution of the differential equation of $\Sigma$ with
initial value $x(0) = x_0$ is denoted as $x_u(t, x_0)$. It
can be given explicitly using the variation-of-constants
formula, namely,

$$x_u(t, x_0) = e^{At} x_0 + \int_0^t e^{A(t-\tau)} B u(\tau)\, d\tau. \tag{2}$$

The corresponding value of y is denoted by $y_u(t, x_0)$.
As a consequence of (2), we have

$$y_u(t, x_0) = C e^{At} x_0 + \int_0^t K(t-\tau) u(\tau)\, d\tau + D u(t), \tag{3}$$

where $K(t) := C e^{At} B$. In the case D = 0, it is
customary to call K(t) the impulse response. In the general
case, one would call the distribution $K(t) + D\delta(t)$
the impulse response.


Controllability 


Controllability is concerned with the ability to 
manipulate the state by choosing an appropriate 
input signal, thus steering the current state to a 
desired future state in a given finite time. In
particular, in the differential equation in (1), we
study the relation between u and x. We investi¬ 
gate to what extent one can influence the state x 
by a suitable choice of the input u. 
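
To make the dependence of the state on the input concrete, the following minimal sketch integrates the state equation numerically for a given input function. It uses a classical fourth-order Runge-Kutta scheme rather than a closed-form solution; the function name `simulate`, the step count, and the scalar test system are illustrative assumptions, not part of the original entry.

```python
import math

def simulate(A, B, u, x0, T, steps=1000):
    """Integrate x' = A x + B u(t) from x(0) = x0 up to time T with RK4.

    A and B are lists of rows, u maps a time t to a list of input values,
    and the returned list is the approximate state at time T.
    """
    n, m = len(A), len(B[0])

    def f(t, x):
        ut = u(t)
        return [
            sum(A[i][j] * x[j] for j in range(n))
            + sum(B[i][j] * ut[j] for j in range(m))
            for i in range(n)
        ]

    h = T / steps
    x, t = list(x0), 0.0
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
        k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
        k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t += h
    return x

# Hypothetical scalar example: x' = -x + u with a unit step input.
x_T = simulate([[-1.0]], [[1.0]], lambda t: [1.0], [0.0], 1.0)
print(x_T)
```

Running the same routine with different input functions u shows directly which end states can be produced from a fixed initial state, which is the question that controllability formalizes.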

For this purpose, we introduce the (at time T) reachable
space $W_T$, defined as the space of points $x_1$ for which
there exists an input u such that $x_u(T, 0) = x_1$, i.e.,
the set of points that can be reached from the origin at
time T. It follows from the linearity of the differential
equation that $W_T$ is a linear subspace of $\mathcal{X}$.
In fact, (2) implies

$$W_T = \left\{ \int_0^T e^{A(T-\tau)} B u(\tau)\, d\tau \;\middle|\; u \in \mathbf{U} \right\}. \tag{4}$$

We call system $\Sigma$ reachable at time T if every point
can be reached from the origin, i.e., if
$W_T = \mathcal{X}$. It follows from (2) that if the system
is reachable at time T, every point can be reached from
every point at time T, because the condition for the point
$x_1$ to be reachable from $x_0$ at time T is

$$x_1 - e^{AT} x_0 \in W_T.$$

The property that every point is reachable from any point
in a given time interval [0, T] is called controllability
(at T). Finally, we have the concept of null
controllability, i.e., the possibility to reach the origin
from an arbitrary initial point. According to (2), for a
point $x_0$ to be null controllable at T, we must have

$$e^{AT} x_0 + \int_0^T e^{A(T-\tau)} B u(\tau)\, d\tau = 0$$

for some $u \in \mathbf{U}$. We observe that $x_0$ is null
controllable at T (by the control u) if and only if
$-e^{AT} x_0$ is reachable at T (by the control u). Since
$e^{AT}$ is invertible, we see that $\Sigma$ is null
controllable at T if and only if $\Sigma$ is reachable at
T. Henceforth, we refer to the equivalent properties
reachability, controllability, and null controllability
simply as controllability (at T). It should be remarked
that the equivalence of these concepts does not hold in
other situations, e.g., for discrete-time systems. We
intend to obtain an explicit expression for the space
$W_T$ and, based on this, an explicit condition for
controllability. This is provided by the following result.

Theorem 1 Let $\eta$ be an n-dimensional row vector and
T > 0. Then the following statements are equivalent:

1. $\eta \perp W_T$ (i.e., $\eta x = 0$ for all $x \in W_T$).

2. $\eta e^{tA} B = 0$ for $0 \le t \le T$.

3. $\eta A^k B = 0$ for k = 0, 1, 2, ....

4. $\eta \, (B \;\; AB \;\; \cdots \;\; A^{n-1}B) = 0$.

Proof (i) $\Leftrightarrow$ (ii) If $\eta \perp W_T$, then
by Eq. (4):

$$\int_0^T \eta e^{A(T-\tau)} B u(\tau)\, d\tau = 0 \tag{5}$$

for every $u \in \mathbf{U}$. Choosing
$u(t) = B^T e^{A^T (T-t)} \eta^T$ for $0 \le t \le T$ yields

$$\int_0^T \left\| \eta e^{A(T-\tau)} B \right\|^2 d\tau = 0,$$

from which (ii) follows. Conversely, assume that (ii)
holds. Then (5) holds and hence (i) follows.

(ii) $\Leftrightarrow$ (iii) This is obtained by power
series expansion of
$e^{At} = \sum_{k=0}^{\infty} \frac{t^k}{k!} A^k$.

(iii) $\Rightarrow$ (iv) This follows immediately from the
evaluation of the vector-matrix product.

(iv) $\Rightarrow$ (iii) This implication is based on the
Cayley-Hamilton Theorem. According to this theorem, $A^n$
is a linear combination of $I, A, \ldots, A^{n-1}$. By
induction, it follows that $A^k$ (k > n) is a linear
combination of $I, A, \ldots, A^{n-1}$ as well. Therefore,
$\eta A^k B = 0$ for k = 0, 1, ..., n - 1 implies that
$\eta A^k B = 0$ for all $k \in \mathbb{N}$. □

As an immediate consequence of the previous theorem, we
find that the at time T reachable subspace $W_T$ can be
expressed in terms of the maps A and B as follows.

Corollary 1

$$W_T = \operatorname{im}\, (B \;\; AB \;\; \cdots \;\; A^{n-1}B).$$

This implies that, in fact, $W_T$ is independent of T, for
T > 0. Because of this, we often use $W$ instead of $W_T$
and call this subspace the reachable subspace of $\Sigma$.
This subspace of the state space has the following
geometric characterization in terms of the maps A and B.

Corollary 2 $W$ is the smallest A-invariant subspace
containing $\mathcal{B} := \operatorname{im} B$.
Explicitly, $W$ is A-invariant, $\mathcal{B} \subset W$,
and any A-invariant subspace $\mathcal{L}$ satisfying
$\mathcal{B} \subset \mathcal{L}$ also satisfies
$W \subset \mathcal{L}$. We denote the smallest A-invariant
subspace containing $\mathcal{B}$ by
$\langle A \mid \mathcal{B} \rangle$, so that we can write
$W = \langle A \mid \mathcal{B} \rangle$. For the space
$\langle A \mid \mathcal{B} \rangle$, we have the following
explicit formula

$$\langle A \mid \mathcal{B} \rangle = \mathcal{B} + A\mathcal{B} + \cdots + A^{n-1}\mathcal{B}.$$

Corollary 3 The following statements are equivalent.

1. There exists T > 0 such that system $\Sigma$ is
controllable at T.

2. $\langle A \mid \mathcal{B} \rangle = \mathcal{X}$.

3. $\operatorname{rank}\, (B \;\; AB \;\; \cdots \;\; A^{n-1}B) = n$.

4. The system $\Sigma$ is controllable at T for all T > 0.

We say that the matrix pair (A, B) is controllable if one
of these equivalent conditions is satisfied.

Example 1 Let A and B be defined by

A := (…), B := (…).

Then

(B AB) = (…), rank (B AB) = 1,

and consequently, (A, B) is not controllable. The
reachable subspace is the span of the columns of (B AB),
i.e., the line given by the equation $2x_1 + 3x_2 = 0$.
This can also be seen as follows. Let
$z := 2x_1 + 3x_2$; then $\dot{z} = z$. Hence, if
z(0) = 0, which is the case
if x(0) = 0, we must have z(t) = 0 for all t > 0. 
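
The rank test of Corollary 3 is straightforward to check numerically. The sketch below builds the matrix (B AB ... A^{n-1}B) and computes its rank by exact Gaussian elimination over the rationals. The pair (A, B) used here is an assumed illustration, chosen so that the reachable subspace is the line $2x_1 + 3x_2 = 0$ discussed in Example 1; it is not claimed to be the pair printed in the original example, and all function names are hypothetical.

```python
from fractions import Fraction

def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]

def controllability_matrix(A, B):
    """Return (B AB ... A^{n-1}B) as a list of rows."""
    n = len(A)
    blocks, block = [], B
    for _ in range(n):
        blocks.append(block)
        block = matmul(A, block)
    # Concatenate the n blocks column-wise, row by row.
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]

def rank(M):
    """Rank of M by exact Gaussian elimination with Fraction arithmetic."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

# Assumed illustrative pair (consistent with the line 2*x1 + 3*x2 = 0).
A = [[-2, -6], [2, 5]]
B = [[-3], [2]]
K = controllability_matrix(A, B)
print(K, rank(K))
```

Exact rational arithmetic is used because floating-point rank decisions depend on a tolerance; for large or ill-conditioned systems a singular value decomposition would be the more usual choice.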


Observability 

In this section, we include the second of equa¬ 
tions (1), y = Cx + Du, in our considerations. 
Specifically, we investigate to what extent it is 
possible to reconstruct the state x if the input 
u and the output y are known. The motivation 
is that we often can measure the output and 
prescribe (and hence know) the input, whereas 
the state variable is hidden. 

Definition 2 Two states $x_0$ and $x_1$ in $\mathcal{X}$
are called indistinguishable on the interval [0, T] if for
any input u we have $y_u(t, x_0) = y_u(t, x_1)$ for all
$0 \le t \le T$.

Hence, $x_0$ and $x_1$ are indistinguishable if they give
rise to the same output values for every input u.
According to (3), for $x_0$ and $x_1$ to be
indistinguishable on [0, T], we must have that

$$C e^{At} x_0 + \int_0^t K(t-\tau) u(\tau)\, d\tau + D u(t) = C e^{At} x_1 + \int_0^t K(t-\tau) u(\tau)\, d\tau + D u(t)$$

for $0 \le t \le T$ and for any input signal u. We note
that the input signal does not affect distinguishability,
i.e., if one u is able to distinguish between two states,
then any input is. In fact, $x_0$ and $x_1$ are
indistinguishable if and only if
$C e^{At} x_0 = C e^{At} x_1$ ($0 \le t \le T$).
Obviously, $x_0$ and $x_1$ are indistinguishable if and
only if $v := x_0 - x_1$ and 0 are indistinguishable. By
applying Theorem 1 with $\eta = v^T$ nonzero and
transposing the equations, it follows that
$C e^{At} x_0 = C e^{At} x_1$ ($0 \le t \le T$) if and
only if $C e^{At} v = 0$ ($0 \le t \le T$) and hence if
and only if $C A^k v = 0$ (k = 0, 1, 2, ...). The
Cayley-Hamilton Theorem implies that we need to consider
the first n terms only, i.e.,

$$\begin{pmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{n-1} \end{pmatrix} v = 0. \tag{6}$$

As a consequence, the distinguishability of two vectors
does not depend on T. The space of vectors v for which (6)
holds is denoted $\langle \ker C \mid A \rangle$ and called
the unobservable subspace. It is equivalently
characterized as the intersection of the spaces
$\ker CA^k$ for k = 0, ..., n - 1, i.e.,

$$\langle \ker C \mid A \rangle = \bigcap_{k=0}^{n-1} \ker CA^k.$$

Equivalently, $\langle \ker C \mid A \rangle$ is the
largest A-invariant subspace contained in ker C. Finally,
another characterization is
“$v \in \langle \ker C \mid A \rangle$ if and only if
$y_0(t, v)$ is identically zero,” where the subscript “0”
refers to the zero input.

Definition 3 System $\Sigma$ is called observable if any
two distinct states are not indistinguishable.

The previous considerations immediately lead to the
following result.

Theorem 2 The following statements are equivalent.

1. The system $\Sigma$ is observable.

2. Every nonzero state is not indistinguishable from the
origin.

3. $\langle \ker C \mid A \rangle = 0$.

4. $C e^{At} v = 0$ ($0 \le t \le T$) $\Rightarrow$ v = 0.

5. $\operatorname{rank} \begin{pmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{n-1} \end{pmatrix} = n$.

Since observability is completely determined by the matrix
pair (C, A), we will say “(C, A) is observable” instead of
“system $\Sigma$ is observable.”

There is a remarkable relation between the 
controllability and observability properties, 
which is referred to as duality. This property 
is most conspicuous from the conditions (3) in 
Corollary 3 and (5) in Theorem 2, respectively. 
Specifically, (C, A) is observable if and only if
$(A^T, C^T)$ is controllable. As a consequence of
duality, many theorems on controllability can 
be translated into theorems on observability and 
vice versa by mere transposition of matrices. 

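Example 2 Let

A := (…), C := (1 -1).

Then

$$\operatorname{rank} \begin{pmatrix} C \\ CA \end{pmatrix} = \operatorname{rank}\, (…) = 1,$$

hence, (C, A) is not observable. Notice that if
$v \in \langle \ker C \mid A \rangle$ and u = 0,
identically, then y = 0, identically. In this example,
$\langle \ker C \mid A \rangle$ is the span of
$(1, 1)^T$.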
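
The observability conditions can be checked in the same elementary way. The sketch below stacks C, CA, ..., CA^{n-1} and tests whether a given vector v satisfies Eq. (6), i.e., whether it lies in the unobservable subspace. The matrix A is an assumption, chosen to be consistent with C = (1 -1) and with an unobservable subspace spanned by $(1, 1)^T$; it is not claimed to be the matrix of the original example, and the function names are hypothetical.

```python
def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]

def observability_matrix(C, A):
    """Stack C, CA, ..., CA^{n-1} into one matrix (list of rows)."""
    rows, block = [], C
    for _ in range(len(A)):
        rows.extend(block)
        block = matmul(block, A)
    return rows

def in_unobservable_subspace(C, A, v):
    """True if every row of the observability matrix annihilates v."""
    return all(sum(o * x for o, x in zip(row, v)) == 0
               for row in observability_matrix(C, A))

# Assumed illustrative data: A is hypothetical, C as in Example 2.
A = [[-11, 3], [-3, -5]]
C = [[1, -1]]
O = observability_matrix(C, A)
print(O, in_unobservable_subspace(C, A, [1, 1]))
```

By duality, transposing the stacked matrix gives the controllability matrix of the pair $(A^T, C^T)$, so a rank routine written for the controllability test can be reused unchanged.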


Summary and Future Directions 

The property of controllability can be tested by 
means of a rank test on a matrix involving the 
maps A and B appearing in the state equation of 
the system. Alternatively, controllability is equiv¬ 
alent to the property that the reachable subspace 
of the system is equal to the state space. The prop¬ 
erty of observability allows a rank test on a matrix 
involving the maps A and C appearing in the 
system equations. An alternative characterization 
of this property is that the unobservable subspace 
of the system is equal to the zero subspace. Con¬ 
cepts of controllability and observability have 
also been defined for discrete-time systems and, 
more generally, for time-varying systems and 
for continuous-time and discrete-time nonlinear 
systems. 


Cross-References 

► Linear Systems: Continuous-Time, Time-In¬ 
variant State Variable Descriptions 


► Linear Systems: Continuous-Time, Time-Vary¬ 
ing State Variable Descriptions 

► Linear Systems: Discrete-Time, Time-Invariant 
State Variable Descriptions 

► Linear Systems: Discrete-Time, Time-Varying, 
State Variable Descriptions 

► Realizations in Linear Systems Theory 


Recommended Reading 

The description of linear systems in terms of 
a state space representation was particularly 
stressed by R. E. Kalman in the early 1960s (see
Kalman 1960a, b, 1963; Kalman et al. 1963).
See also Zadeh and Desoer (1963) and Gilbert
(1963). In particular, Kalman introduced the
concepts of controllability and observability and
gave the conditions expressed in Corollary 3,
item (3), and Theorem 2, item (5). Alternative
conditions for controllability and observability 
have been introduced in Hautus (1969) and 
independently by a number of authors; see Popov 
(1966) and Popov (1973). Other references are 
Belevitch (1968) and Rosenbrock (1970). 
Abstract

Process control performance is a cornerstone of 
operational excellence in a broad spectrum of 
industries such as refining, petrochemicals, pulp
and paper, mineral processing, power, and
wastewater treatment. Control performance assessment
and monitoring applications have become main¬ 
stream in these industries and are changing the 
maintenance methodology surrounding control 
assets from predictive to condition based. The 
large numbers of these assets on most sites com¬ 
pared to the number of maintenance and control 
personnel have made monitoring and diagnosing 
control problems challenging. For this reason, au¬ 
tomated controller performance monitoring tech¬ 
nologies have been readily embraced by these 
industries. 

This entry discusses the theory as well as 
practical application of controller performance 
monitoring tools as a requisite for monitoring and 
maintaining basic as well as advanced process 
control (APC) assets in the process industry. The 
section begins with the introduction to the theory 
of performance assessment as a technique for 
assessing the performance of the basic control 
loops in a plant. Performance assessment al¬ 
lows detection of performance degradation in the 


basic control loops in a plant by monitoring the 
variance in the process variable and compar¬ 
ing it to that of a minimum variance controller. 
Other metrics of controller performance are also 
reviewed. The resulting indices of performance
give an indication of the level of performance
of the controller and of the action required to
improve it. The diagnosis of poor performance may
lead one to consider remediation alternatives such
as retuning controller parameters, process
reengineering to reduce delays, or implementation
of feed-forward control, or to attribute poor
performance to faulty actuators or other process
nonlinearities.

Keywords 

Time series analysis; Minimum variance control; 
Control loop performance assessment; Perfor¬ 
mance monitoring; Fault detection and diagnosis 

Introduction 

A typical industrial process, as in a petroleum 
refinery or a petrochemical complex, includes 
thousands of control loops. Instrumentation tech¬ 
nicians and engineers maintain and service these 
loops, but rather infrequently. However, industrial 
studies have shown that as many as 60% of 
control loops may have poor tuning or config¬ 
uration or actuator problems and may therefore 
be responsible for suboptimal process perfor¬ 
mance. As a result, monitoring of such control 
strategies to detect and diagnose cause(s) of un¬ 
satisfactory performance has received increasing 
attention from industrial engineers. Specifically 
the methodology of data-based controller per¬ 
formance monitoring (CPM) is able to answer 
questions such as the following: Is the controller 
doing its job satisfactorily and if not, what is the 
cause of poor performance? 

The performance of process control assets is 
monitored on a daily basis and compared with in¬ 
dustry benchmarks. The monitoring system also 
provides diagnostic guidance for poorly perform¬ 
ing control assets. Many industrial sites have 
established reporting and remediation workflows
to ensure that improvement activities are carried 
out in an expedient manner. Plant-wide perfor¬ 
mance metrics can provide insight into company¬ 
wide process control performance. Closed-loop 
tuning and modeling tools can also be deployed 
to aid with the improvement activities. Survey ar¬ 
ticles by Thornhill and Horch (2007) and Shardt 
et al. (2012) provide a good overview of the over¬ 
all state of CPM and the related diagnosis issues. 
CPM software is now readily available from most 
DCS vendors and has already been implemented 
successfully at many large-scale industrial sites 
throughout the world. 

Univariate Control Loop Performance 
Assessment with Minimum Variance 
Control as Benchmark 

It has been shown by Harris (1989) that for a 
system with time delay d , a portion of the output 
variance is feedback control invariant and can be 
estimated from routine operating data. This is the 
so-called minimum variance output. Consider the 
closed-loop system shown in Fig. 1, where Q is 
the controller transfer function, T is the process 
transfer function, d is the process delay (in terms 
of sample periods), and N is the disturbance 
transfer function driven by a random white-noise
sequence, $a_t$.

In the regulatory mode (when the set point is constant),
the closed-loop transfer function relating the process
output and the disturbance is given by

$$y_t = \left( \frac{N}{1 + q^{-d} T Q} \right) a_t.$$

Note that all transfer functions are expressed for the
discrete-time case in terms of the backshift operator,
$q^{-1}$. N represents the disturbance transfer function
with numerator and denominator polynomials in $q^{-1}$.
The division of the numerator by the denominator can be
rewritten as $N = F + q^{-d} R$, where the quotient term
$F = F_0 + F_1 q^{-1} + \cdots + F_{d-1} q^{-(d-1)}$ is a
polynomial of order (d - 1) and the remainder R is a
transfer function. The closed-loop transfer function can
be reexpressed, after algebraic manipulation, as

$$y_t = \left( F + q^{-d}\, \frac{R - F T Q}{1 + q^{-d} T Q} \right) a_t.$$

The closed-loop output can then be expressed as


$$y_t = e_t + w_{t-d},$$

where $e_t = F a_t$ corresponds to the first d - 1 lags of
the closed-loop expression for the output, $y_t$, and,
more importantly, is independent of the controller Q,
i.e., it is controller invariant, while $w_{t-d}$ is
dependent on the controller. Because $e_t$ and $w_{t-d}$
involve non-overlapping segments of the white-noise
sequence, they are uncorrelated, and the variance of the
output is then given by

$$\mathrm{Var}(y_t) = \mathrm{Var}(e_t) + \mathrm{Var}(w_{t-d}) \ge \mathrm{Var}(e_t).$$

Since $e_t$ is controller invariant, it provides the lower
bound on the output variance. This is naturally achieved
if $w_{t-d} = 0$, that is, when $R = F T Q$ or when the
controller is a minimum variance controller with
$Q = R/(T F)$. If the total output variance is denoted as
$\mathrm{Var}(y_t) = \sigma_y^2$, then the lowest
achievable variance is $\mathrm{Var}(e_t) = \sigma_{mv}^2$.

Controller Performance Monitoring, Fig. 1 Block diagram of
a regulatory control loop

To obtain an estimate of the lowest achievable 
variance from the time series of the process 
output, one needs to model the closed-loop output 
data $y_t$ by a moving average process such as

$$y_t = \underbrace{f_0 a_t + f_1 a_{t-1} + \cdots + f_{d-1} a_{t-(d-1)}}_{e_t} + f_d a_{t-d} + f_{d+1} a_{t-(d+1)} + \cdots \tag{1}$$

The controller-invariant term $e_t$ can then be estimated
by time series analysis of routine closed-loop operating
data and subsequently used as a benchmark measure of the
theoretically achievable absolute lower bound of output
variance to assess control loop performance. Harris
(1989), Desborough and Harris (1992), and Huang and Shah
(1999) have derived algorithms for the calculation of this
minimum variance term.

Multiplying Eq. (1) by $a_t, a_{t-1}, \ldots, a_{t-d+1}$,
respectively, and then taking the expectation of both
sides of the equation yield the covariance terms:

$$\begin{aligned}
r_{ya}(0) &= E[y_t a_t] = f_0 \sigma_a^2 \\
r_{ya}(1) &= E[y_t a_{t-1}] = f_1 \sigma_a^2 \\
&\;\;\vdots \\
r_{ya}(d-1) &= E[y_t a_{t-d+1}] = f_{d-1} \sigma_a^2.
\end{aligned} \tag{2}$$

The minimum variance or the invariant portion of output
variance is

$$\sigma_{mv}^2 = \left( f_0^2 + f_1^2 + f_2^2 + \cdots + f_{d-1}^2 \right) \sigma_a^2. \tag{3}$$

A measure of controller performance index can then be
defined as

$$\eta(d) = \sigma_{mv}^2 / \sigma_y^2. \tag{4}$$

Substituting Eq. (3) into Eq. (4) yields

$$\eta(d) = \rho_{ya}^2(0) + \rho_{ya}^2(1) + \cdots + \rho_{ya}^2(d-1) = Z Z^T,$$

where Z is the vector of cross correlation coefficients
between $y_t$ and $a_t$ for lags 0 to d - 1 and is denoted
as

$$Z = [\rho_{ya}(0) \;\; \rho_{ya}(1) \;\; \rho_{ya}(2) \;\; \cdots \;\; \rho_{ya}(d-1)].$$

Although $a_t$ is unknown, it can be replaced by the
estimated innovation sequence $\hat{a}_t$. The estimate
$\hat{a}_t$ is obtained by whitening the process output
variable $y_t$ via time series analysis. This algorithm is
denoted as the FCOR algorithm for Filtering and
CORrelation analysis (Huang and Shah 1999). This
derivation assumes that the delay, d, be known a priori.
In practice,
however, a priori knowledge of time delays may 
not always be available. It is therefore useful to 
assume a range of time delays and then calculate 
performance indices over this range of the time 
delays. The indices over a range of time delays 
are also known as extended horizon performance 
indices (Thornhill et al. 1999). Through pattern 
recognition, one can tell the performance of the 
loop by visualizing the patterns of the perfor¬ 
mance indices versus time delays. There is a clear 
relation between performance indices curve and 
the impulse response curve of the control loop. 
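
The filtering-and-correlation idea can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the published FCOR implementation: the output is whitened with an autoregressive model fitted by the Yule-Walker equations (Levinson-Durbin recursion), the AR order of 10 is an arbitrary choice, and the function names are hypothetical. It returns the extended horizon indices, i.e., the sum of squared cross-correlations between the output and the estimated innovations for each assumed delay.

```python
import random

def autocov(x, max_lag):
    """Biased sample autocovariances r(0..max_lag) of a zero-mean series."""
    n = len(x)
    return [sum(x[t] * x[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for y_t = sum_i phi_i y_{t-i} + a_t."""
    phi, err = [], r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(phi[i] * r[m - 1 - i] for i in range(m - 1))
        k = acc / err
        phi = [phi[i] - k * phi[m - 2 - i] for i in range(m - 1)] + [k]
        err *= 1 - k * k
    return phi

def performance_indices(y, max_delay, order=10):
    """Return [eta(1), ..., eta(max_delay)] from routine output data y."""
    n = len(y)
    mean = sum(y) / n
    y = [v - mean for v in y]
    phi = levinson_durbin(autocov(y, order), order)
    # Whitening: a[t] = y[t] - sum_i phi_i y[t-i], defined for t >= order.
    a = [0.0] * n
    for t in range(order, n):
        a[t] = y[t] - sum(phi[i] * y[t - 1 - i] for i in range(order))
    first = order + max_delay  # all needed lags of a[] exist from here on
    m = n - first
    var_y = sum(y[t] ** 2 for t in range(first, n)) / m
    var_a = sum(a[t] ** 2 for t in range(first, n)) / m
    etas, total = [], 0.0
    for k in range(max_delay):
        r_ya = sum(y[t] * a[t - k] for t in range(first, n)) / m
        total += r_ya ** 2 / (var_y * var_a)  # adds rho_ya(k)^2
        etas.append(total)
    return etas

# Hypothetical data: a first-order autoregressive closed-loop output.
random.seed(0)
y, prev = [], 0.0
for _ in range(5000):
    prev = 0.8 * prev + random.gauss(0.0, 1.0)
    y.append(prev)
print(performance_indices(y, 5))
```

For this simulated output the theoretical index is $1 - 0.64^d$, so the estimates rise toward 1 as the assumed delay grows, which is the pattern the extended horizon plot is meant to reveal.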

Consider a simple case where the process is 
subject to random disturbances. Figure 2 is one 
example of performance evaluation for a control 
loop in the presence of disturbances. This fig¬ 
ure shows time-series of process variable data 
for both loops in the left column, closed-loop 
impulse responses (middle column) and corre¬ 
sponding performance indices (labeled as PI on 
the right column). From the impulse responses, 
one can see that the loop under the first set 
of tuning constants (denoted as TAG1.PV) has 
better performance; the loop under the second 
set of tuning constants (denoted as TAG5.PV) 
has oscillatory behavior, indicating a relatively 
poor control performance. With performance in¬ 
dex “1” indicating the best possible performance 
and index “0” indicating the worst performance, 
performance indices for the first controller tuning 
(shown on the upper-right plot) approach “1” 
within 4 time lags, while performance indices 
for the second controller tuning (shown on the 
bottom-right plot) take 10 time lags to approach 
“0.7.” In addition, performance indices for the 
second tuning show ripples as they approach an 
asymptotic limit, indicating a possible oscillation 
in the loop. 

Notice that one cannot rank performance of 
these two controller settings from the noisy time- 
series data. Instead, we can calculate performance 
indices over a range of time delays (from 1 to 10). 
The result is shown on the right column plots of 
Fig. 2. These simulations correspond to the same 
process with different controller tuning constants. 


It is clear from these plots that performance 
indices trajectory depends on dynamics of the 
disturbance and controller tuning. 

It is important to note that the minimum vari¬ 
ance is just one of several benchmarks for obtain¬ 
ing a controller performance metric. It is seldom 
practical to implement minimum variance control 
as it typically will require aggressive actuator ac¬ 
tion. However, the minimum variance benchmark 
serves to provide an indication of the opportunity 
in improving control performance; that is, should 
the performance index $\eta(d)$ be near or just above
zero, then it gives the user an idea of the benefits 
possible in improving the control performance of 
that loop. 

Performance Assessment and 
Diagnosis of Univariate Control Loop 
Using Alternative Performance 
Indicators 

In addition to the performance index for perfor¬ 
mance assessment, there are several alternative 
indicators of control loop performance. These are 
discussed next. 

Autocorrelation function: The autocorrelation 
function (ACF) of the output error, shown in 
Fig. 3, is an approximate measure of how close 
the existing controller is to minimum variance 
condition or how predictable the error is over the 
time horizon of interest. If the controller is under 
minimum variance condition, then the autocorrelation
function should decay to zero after “d - 1”
lags, where “d” is the delay of the process. In
other words, there should be no predictable infor¬ 
mation beyond time lag d — 1. The rate at which 
the autocorrelation decays to zero after “d — 1” 
lags indicates how close the existing controller 
is to the minimum variance condition. Since it is 
straightforward to calculate autocorrelation using 
process data, the autocorrelation function is often 
used as a first-pass test before carrying out further 
performance analysis. 
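
A first-pass autocorrelation check of this kind might look as follows. This is a sketch only: the confidence band of 2/sqrt(N) is the usual white-noise approximation and is an assumption here, and the function names are hypothetical. The second function lists the lags at or beyond the delay d where the sample autocorrelation is still significant, which under minimum variance control should be (nearly) empty.

```python
import math

def acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag (mean removed)."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    r0 = sum(v * v for v in c)
    return [sum(c[t] * c[t - k] for t in range(k, n)) / r0
            for k in range(max_lag + 1)]

def significant_lags_beyond_delay(x, d, max_lag=20):
    """Lags k >= d whose |ACF| exceeds the approximate band 2/sqrt(N)."""
    bound = 2.0 / math.sqrt(len(x))
    rho = acf(x, max_lag)
    return [k for k in range(d, max_lag + 1) if abs(rho[k]) > bound]

# Hypothetical error signal: a strongly alternating sequence.
error = [1.0, -1.0] * 50
print(acf(error, 3))
print(significant_lags_beyond_delay(error, 1, 3))
```

A long list of significant lags suggests that the error remains predictable beyond the delay, so there is room for the controller to remove more variance.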

Controller Performance Monitoring, Fig. 2 Time series of
process variable (top), corresponding impulse responses
(left column) and their performance indices (right
column). Panels: Impulse Response - TAG1.PV; PI(var) vs.
Delay: TAG1.PV; Impulse Response - TAG5.PV; PI(var) vs.
Delay: TAG5.PV; horizontal axes: Lag

Controller Performance Monitoring, Fig. 3 Autocorrelation
function of the controller error

Controller Performance Monitoring, Fig. 4 Impulse
responses estimated from routine operating data

Impulse response: An impulse response function curve
represents the closed-loop impulse response between the
whitened disturbance sequence and the process output.
This function is
a direct measure of how well the controller is 
performing in rejecting disturbances or tracking 
set-point changes. Under a stochastic framework,
this impulse response function may be calculated
using a time series model such as an Autoregressive
Moving Average (ARMA) or Autoregressive Integrated
Moving Average (ARIMA) model. Once an ARMA type of
time series model
is estimated, the infinite-order moving average 
representation of the model shown in Eq. (1) can 
be obtained through a long division of the ARMA 
model. As shown in Huang and Shah (1999), the 
coefficients of the moving average model, Eq. (1), 
are the closed-loop impulse response coefficients 
of the process between whitened disturbances 
and the process output. Figure 4 shows closed- 
loop impulse responses of a control loop with two 
different control tunings. Clearly they denote two 
different closed-loop dynamic responses: one is 
slow and smooth, and the other one is relatively 
fast and slightly oscillatory. The sum of squares of
the impulse response coefficients, scaled by the
variance of the whitened disturbance, is the variance
of the data.
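
The long division mentioned above reduces to a short recursion. The sketch below assumes an ARMA model written as $A(q^{-1}) y_t = C(q^{-1}) a_t$ with coefficient lists `ar` and `ma` in increasing powers of $q^{-1}$; these names and the example model are illustrative assumptions.

```python
def impulse_response(ar, ma, n_terms):
    """First n_terms coefficients f_k of the expansion ma(q^-1)/ar(q^-1).

    ar = [a0, a1, ...] and ma = [c0, c1, ...] are polynomial coefficients
    in increasing powers of the backshift operator q^-1.
    """
    f = []
    for k in range(n_terms):
        c_k = ma[k] if k < len(ma) else 0.0
        acc = sum(ar[i] * f[k - i] for i in range(1, min(k, len(ar) - 1) + 1))
        f.append((c_k - acc) / ar[0])
    return f

def variance_from_impulse(f, sigma_a2=1.0, n_terms=None):
    """sigma_a^2 times the sum of squares of the first n_terms coefficients."""
    terms = f if n_terms is None else f[:n_terms]
    return sigma_a2 * sum(c * c for c in terms)

# Hypothetical model: y_t = 0.5 y_{t-1} + a_t.
f = impulse_response([1, -0.5], [1], 4)
print(f)
```

Truncating the sum of squares at d terms gives the minimum variance portion, while a long expansion approximates the total output variance, so their ratio is the performance index for delay d.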


Spectral analysis: The closed-loop frequency 
response of the process is an alternative way to 
assess control loop performance. Spectral analy¬ 
sis of output data easily allows one to detect oscil¬ 
lations, offsets, and measurement noise present in 
the process. The closed-loop frequency response 
is often plotted together with the closed-loop fre¬ 
quency response under minimum variance con¬ 
trol. This is to check the possibility of perfor¬ 
mance improvement through controller tunings. 
The comparison gives a measure of how close 
the existing controller is to the minimum vari¬ 
ance condition. In addition, it also provides the 
frequency range in which the controller signif¬ 
icantly deviates from minimum variance condi¬ 
tion. Large deviation in the low-frequency range 
typically indicates lack of integral action or weak 
proportional gain. Large peaks in the medium- 
frequency range typically indicate an overtuned 
controller or presence of oscillatory disturbances. 
Large deviation in the high-frequency range typ¬ 
ically indicates significant measurement noise. 
As an illustrative example, frequency responses
of two control loops are shown in Fig. 5. The
left graph of the figure shows that the closed-loop
frequency response of the existing controller is
almost the same as the frequency response under
minimum variance control. A peak in the mid-frequency
range indicates a possibly overtuned controller.
The right graph of Fig. 5 shows that the frequency
response of the existing controller is oscillatory,
indicating a possibly overtuned controller or the
presence of an oscillatory disturbance at the peak
frequency; otherwise the controller is close to
the minimum variance condition.

Controller Performance Monitoring, Fig. 5 Frequency response estimated from routine operating data (panel title: Closed-Loop vs. Min. Variance Output Response)
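As a rough illustration of how a spectral peak reveals an oscillation in routine output data, the sketch below computes a raw periodogram and reports the dominant frequency. It is only a sketch: the function names are hypothetical, the direct DFT is chosen for clarity rather than speed, and the comparison against a minimum variance spectrum discussed above is not reproduced.

```python
import cmath
import math


def periodogram(y):
    """Raw periodogram of a mean-removed sequence (frequencies in cycles/sample)."""
    n = len(y)
    mean = sum(y) / n
    centered = [v - mean for v in y]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        s = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                for t, v in enumerate(centered))
        freqs.append(k / n)
        power.append(abs(s) ** 2 / n)
    return freqs, power


def dominant_frequency(y):
    """Return the peak frequency and its share of the non-DC power."""
    freqs, power = periodogram(y)
    k = max(range(1, len(power)), key=power.__getitem__)
    return freqs[k], power[k] / sum(power[1:])


# Example: a pure oscillation with a period of ten samples.
signal = [math.sin(2 * math.pi * 0.1 * t) for t in range(200)]
peak_frequency, power_share = dominant_frequency(signal)
```

A share close to one suggests a single dominant oscillation; whether it stems from tuning or from an external disturbance still has to be diagnosed separately.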

Segmentation of performance indices: Most
process data exhibit time-varying dynamics; i.e.,
the process transfer function or the disturbance
transfer function is time variant. Performance
assessment with a non-overlapping sliding data
window that can track time-varying dynamics
is therefore often desirable. For example, segmentation
of data may give insight into
cyclical variation in controller performance,
e.g., between day and night or across
shift changes. Figure 6 is an example of
performance segmentation over a 200 data point
window.
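The windowing idea can be sketched as follows. This is a minimal sketch under stated assumptions: the output variance is used as a stand-in for whichever performance index is of interest, and the names `segmented_index` and `population_variance` are illustrative.

```python
def population_variance(x):
    """Variance of one data segment (stand-in for a performance index)."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def segmented_index(data, window, index_fn=population_variance):
    """Evaluate index_fn on consecutive non-overlapping windows.

    A trailing partial window is dropped so that every index is
    computed from the same number of samples.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    starts = range(0, len(data) - window + 1, window)
    return [index_fn(data[s:s + window]) for s in starts]


# Example: the second segment is far more variable than the first.
data = [0.0, 1.0] * 4 + [0.0, 10.0] * 4
indices = segmented_index(data, window=8)
```

Plotting such a list against the window number gives a chart of the kind described for Fig. 6.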

Performance Assessment of 
Univariate Control Loops Using 
User-Specified Benchmarks 

The increasing level of global competitive¬ 
ness has pushed chemical plants into high- 
performance operating regions that require 
advanced process control technology. See the
articles ► Control Hierarchy of Large Processing
Plants: An Overview and ► Control Structure
Selection. Consequently, the industry has an
increasing need to upgrade the conventional 
PID controllers to advanced control systems. 
The most natural questions to ask for such an 
upgrading are as follows. Has the advanced 
controller improved performance as expected? 
If yes, where is the improvement and can it 
be justified? Has the advanced controller been 
tuned to its full capacity? Can this improvement 
also be achieved by simply retuning the existing 
traditional (e.g., PID) controllers? (see ►PID 
Control). In other words, what is the cost 
versus benefit of implementing an advanced 
controller? Unlike performance assessment using 
minimum variance control as benchmark, the 
solution to this problem does not require a 
priori knowledge of time delays. Two possible 
relative benchmarks may be chosen: one is the 
historical data benchmark or reference data set 
benchmark, and the other is a user-specified 
benchmark. 

The purpose of reference data set benchmark¬ 
ing is to compare performance of the existing 
controller with the previous controller during the 
“normal” operation of the process. This reference 
data set may represent the process when the 
controller performance is considered satisfactory 
with respect to meeting the performance objec¬ 
tives. The reference data set should be represen¬ 
tative of the normal conditions that the process is 
expected to operate at; i.e., the disturbances and 
set-point changes entering into the process should 
not be unusually different. This analysis provides 
the user with a relative performance index (RPI) 
which compares the existing control loop perfor¬ 
mance with a reference control loop benchmark 
chosen by the user. The RPI is bounded by
0 < RPI < ∞, with “<1” indicating deteriorated
performance, “1” indicating no change
of performance, and “>1” indicating improved
performance. Figure 7 shows a result of reference
data set benchmarking. The impulse response
of the benchmark or reference data smoothly
decays to zero, indicating good performance of
the controller. After one increases the proportional
gain of the controller, the impulse response
shows oscillatory behavior, with an RPI = 0.4,
indicating deteriorated performance due to the
oscillation.
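The text does not state how the RPI is computed. Purely as an illustration, the sketch below assumes one plausible definition: the ratio of the sum of squared impulse response coefficients of the reference loop to that of the current loop. Both the formula and the function name are assumptions and should not be read as the definition used by the authors or by any commercial package.

```python
def relative_performance_index(reference_impulse, current_impulse):
    """Assumed RPI: reference 'energy' divided by current 'energy'.

    Values below 1 mean the current loop responds with more variance
    than the reference; values above 1 mean it responds with less.
    """
    reference = sum(c * c for c in reference_impulse)
    current = sum(c * c for c in current_impulse)
    if current == 0:
        raise ValueError("current impulse response must not be all zeros")
    return reference / current


# Example: a smooth reference response versus an oscillatory one.
smooth = [0.5 ** i for i in range(20)]
oscillatory = [(-0.9) ** i for i in range(20)]
rpi = relative_performance_index(smooth, oscillatory)
```

Under this assumed definition the index is positive and unbounded above, which is consistent with the range quoted in the text.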

In some cases one may wish to specify cer¬ 
tain desired closed-loop dynamics and carry out 
performance analysis with respect to such desired 
dynamics. One such desired dynamic benchmark 
is the closed-loop settling time. As an illustrative 
example, Fig. 8 shows a system where a settling 
time of ten sampling units is desired for a process 
with a delay of five sampling units. The impulse 
responses show that the existing loop is close to 
the desired performance, and the value of RPI = 
0.9918 confirms this. Thus no further tuning of 
the loop is necessary. 

Diagnosis of Poorly Performing 
Loops 

Whereas detection of poorly performing loops 
is now relatively simple, the task of diagnos¬ 
ing reason(s) for poor performance and how to 
“mend” the loop is generally not straightforward. 
The reasons for poor performance could be any
of the following: interactions between various control loops,
overtuned or undertuned controller settings, process
nonlinearity, poor controller configuration
(meaning the choice of pairing a process (or
controlled) variable with a manipulated variable),
actuator problems such as stiction, large
delays, and severe disturbances. Several studies
have focused on the diagnosis issues related to
actuator problems (Hägglund 2002; Choudhury
et al. 2008; Srinivasan and Rengaswamy 2008;
Xiang and Lakshminarayanan 2009; de Souza
et al. 2012). Shardt et al. (2012) have given an
overview of the overall state of CPM and the
related diagnosis issues.

Industrial Applications of CPM 
Technology 

As remarked earlier, CPM software is now readily
available from most DCS vendors and has
already been implemented successfully at several
large-scale industrial sites. A summary of just
two of many large-scale industrial implementations
of CPM technology appears below. It gives
clear evidence of the impact of this control
technology and how readily it has been embraced
by industry (Shah et al. 2014).

Controller Performance Monitoring, Fig. 6 Performance indices for segmented data (each of window length 200 points)

Controller Performance Monitoring, Fig. 7 Reference benchmarking based on impulse responses

Controller Performance Monitoring, Fig. 8 User-specified benchmark based on impulse responses

BASF Controller Performance Monitoring 
Application 

As part of its excellence initiative OPAL 21
(Optimization of Production Antwerp and Ludwigshafen),
BASF has implemented the CPM
strategy on more than 30,000 control loops at 
its Ludwigshafen site in Germany and on over 
10,000 loops at its Antwerp production facility 
in Belgium. The key factor in using this technol¬ 
ogy effectively is to combine process knowledge, 
basic chemical engineering, and control expertise 
to develop solutions for the indicated control 
problems that are diagnosed in the CPM software 
(Wolff et al. 2012). 

Saudi Aramco Controller Performance 
Monitoring Practice 

As part of its process control improvement ini¬ 
tiative, Saudi Aramco has deployed CPM on 
approximately 15,000 PID loops, 50 MPC appli¬ 
cations, and 500 smart positioners across multiple 
operating facilities. 

The operational philosophy of the CPM en¬ 
gine is incorporated in the continuous improve¬ 
ment process at BASF and Aramco, whereby all 
loops are monitored in real-time and a holistic 
performance picture is obtained for the entire 
plant. Unit-wide performance metrics are displayed
in color-coded graphic forms to convey the
analytics information of the process effectively.

Concluding Remarks 

In summary, industrial control systems are de¬ 
signed and implemented or upgraded with a par¬ 
ticular objective in mind. The controller perfor¬ 
mance monitoring methodology discussed here 
will permit automated and repeated reviews of 
the design, tuning, and upgrading of the control 
loops. Poor design, tuning, or upgrading of the
control loops can be detected, and repeated performance
monitoring will indicate which loops
should be retuned or which loops have not been 
effectively upgraded when changes in the dis¬ 
turbances, in the process, or in the controller 
itself occur. Obviously better design, tuning, and 
upgrading will mean that the process will operate 
at a point close to the economic optimum, leading 
to energy savings, improved safety, efficient uti¬ 
lization of raw materials, higher product yields, 
and more consistent product qualities. This entry 
has summarized the major features available in 
recent commercial software packages for control 
loop performance assessment. The illustrative ex¬ 
amples have demonstrated the applicability of 
this new technique when applied to process data. 

This entry has also illustrated how controllers, 
whether in hardware or software form, should 
be treated like “capital assets” and how there 
should be routine monitoring to ensure that they 
perform close to the economic optimum and that 
the benefits of good regulatory control will be 
achieved. 


Cross-References 

► Control Hierarchy of Large Processing Plants: 
An Overview 

► Control Structure Selection 

► Fault Detection and Diagnosis 

► PID Control 

► Statistical Process Control in Manufacturing

Abstract

This chapter presents an overview of the main 
issues related to modeling and control of coop¬ 
erative robotic manipulators. A historical path 
is followed to present the main research results 


on cooperative manipulation. Kinematics and dy¬ 
namics of robotic arms cooperatively manipu¬ 
lating a tightly grasped rigid object are briefly 
discussed. Then, this entry presents the main 
strategies for force/motion control of the cooper¬ 
ative system. 


Keywords 

Cooperative task space; Coordinated motion; 
Force/motion control; Grasping; Manipulation; 
Multi-arm systems 


Introduction 

Since the early 1970s, it has been recognized that 
many tasks, which are difficult or even impossi¬ 
ble to execute by a single robotic manipulator, 
become feasible when two or more manipulators 
work in a cooperative way. Examples of typical 
cooperative tasks are the manipulation of heavy 
and/or large payloads, assembly of multiple parts, 
and handling of flexible and articulated objects 
(Fig. 1). 

In the 1980s, research achieved several theoretical
results related to modeling and control
of single-arm robots; this further fostered research
on multi-arm robotic systems. Dynamics
and control as well as force control issues were
widely explored throughout the decade.

Cooperative Manipulators, Fig. 1 An example of a
cooperative robotic work cell composed of two industrial
robot arms

In the 1990s, parameterization of the
constraint forces/moments acting on the object
was recognized as a key to solving control
problems and was studied in several
papers (e.g., Sang et al. 1995; Uchiyama and 
Dauchez 1993; Walker et al. 1991; Williams 
and Khatib 1993). Several control schemes for 
cooperative manipulators based on the sought 
parameterizations have been designed, including 
force/motion control (Wen and Kreutz-Delgado 
1992) and impedance control (Bonitz and Hsia 
1996; Schneider and Cannon 1992). Other 
approaches are adaptive control (Hu et al. 1995), 
kinematic control (Chiacchio et al. 1996), task- 
space regulation (Caccavale et al. 2000), and 
model-based coordinated control (Hsu 1993). 
Other important topics investigated in the 1990s 
were the definition of user-oriented task-space 
variables for coordinated control (Caccavale et al. 
2000; Chiacchio et al. 1996), the development of 
meaningful performance measures (Chiacchio 
et al. 1991a,b) for multi-arm systems, and the 
problem of load sharing (Walker et al. 1989). 

Most of the abovementioned works assume 
that the cooperatively manipulated object is 
rigid and tightly grasped. However, since 
the 1990s, several research efforts have 
been focused on the control of cooperative 
flexible manipulators (Yamano et al. 2004), 
since flexible-arm robot merits (lightweight 
structure, intrinsic compliance, and hence safety) 
can be conveniently exploited in cooperative 
manipulation. Other research efforts have been 
focused on the control of cooperative systems 
for the manipulation of flexible objects (Yukawa 
et al. 1996) as well. 


Modeling, Load Sharing, 
and Performance Evaluation 

The first modeling goal is the definition of 
suitable variables describing the kinetostatics of 
a cooperative system. Hereafter, the main results
available are summarized for a dual-arm system
composed of two cooperative manipulators
grasping a common object.

The kinetostatic formulation proposed by 
Uchiyama and Dauchez (1993), i.e., the so-called 
symmetric formulation , is based on kinematic 
and static relationships between generalized 
forces/velocities acting at the object and their
counterparts acting at the manipulators' end
effectors. To this aim, the concept of virtual
stick is defined as the vector which determines 
the position of an object-fixed coordinate frame 
with respect to the frame attached to each robot 
end effector (Fig. 2). When the object grasped 
by the two manipulators can be considered rigid 
and tightly attached to each end effector, then the 
virtual stick behaves as a rigid stick fixed to each 
end effector. 

According to the symmetric formulation, the
vector $h$, collecting the generalized forces (i.e.,
forces and moments) acting at each end effector,
is given by

$$h = W^{\dagger} h_E + V h_I, \tag{1}$$

where $W$ is the so-called grasp matrix, $W^{\dagger}$
denotes its pseudoinverse, the
columns of $V$ span the null space of the
grasp matrix, and $h_I$ is the generalized force
vector which does not contribute to the object’s
motion, i.e., it represents internal loading of the
object (mechanical stresses) and is termed
internal forces, while $h_E$ represents the vector of
external forces, i.e., forces and moments causing
the object’s motion. Later, a task-oriented
formulation was proposed (Chiacchio et al.
1996), aimed at defining a cooperative task space
in terms of absolute and relative motion of
the cooperative system, which can be directly
computed from the position and orientation of
the end-effector coordinate frames.

Cooperative Manipulators, Fig. 2 Grasp geometry
for a two-manipulator cooperative system manipulating
a common object. The vectors $r_1$ and $r_2$ are the virtual
sticks, $\mathcal{T}_c$ is the coordinate frame attached to the object,
and $\mathcal{T}_1$ and $\mathcal{T}_2$ are the coordinate frames attached to each
end effector
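The split between external and internal forces can be illustrated on a deliberately simplified case. The sketch below assumes that only translational forces are considered (no moments, virtual sticks ignored), so that the grasp matrix reduces to W = [I I], with V = [I; -I] taken as one possible null-space basis. These simplifications and the function names are assumptions made for illustration; they are not the general formulation of the cited papers.

```python
def decompose(f1, f2):
    """Split two end-effector forces into external and internal parts.

    With W = [I I], the external force is h_E = W h = f1 + f2.
    With V = [I; -I], the internal force is h_I = V^+ h = (f1 - f2) / 2.
    """
    h_e = [a + b for a, b in zip(f1, f2)]
    h_i = [0.5 * (a - b) for a, b in zip(f1, f2)]
    return h_e, h_i


def recompose(h_e, h_i):
    """Evaluate h = W^+ h_E + V h_I, where W^+ = [I; I] / 2."""
    f1 = [0.5 * e + i for e, i in zip(h_e, h_i)]
    f2 = [0.5 * e - i for e, i in zip(h_e, h_i)]
    return f1, f2


# Example: the arms squeeze the object along x while sharing a lift along z.
f1 = [10.0, 0.0, 2.0]
f2 = [-6.0, 0.0, 2.0]
h_e, h_i = decompose(f1, f2)        # h_e moves the object, h_i only stresses it
f1_back, f2_back = recompose(h_e, h_i)
```

In this example the opposing x components show up in the internal force, while only their net sum and the shared z components appear in the external force.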

The dynamics of a cooperative multi-arm sys¬ 
tem can be written as the dynamics of the single 
manipulators together with the closed-chain con¬ 
straints imposed by the grasped object. By elimi¬ 
nating the constraints, a reduced-order model can 
be obtained (Koivo and Unseren 1991). 

Strongly related to kinetostatics and dynamics 
of cooperative manipulators is the load sharing 
problem, i.e., distributing the load among the 
arms composing the system, which has been 
solved, e.g., in Walker et al. (1989). A very relevant
problem related to load sharing is that of
robust holding, i.e., the problem of determining
the forces/moments applied to the object by the arms in
order to maintain the grasp even in the presence of
disturbing forces/moments.

A major issue in robotic manipulation is the 
performance evaluation via suitably defined 
indexes (e.g., manipulability ellipsoids). These 
concepts have been extended to multi-arm robotic 
systems in Chiacchio et al. (1991a,b). Namely, by 
exploiting the kinetostatic formulations described 
above, velocity and force manipulability 
ellipsoids can be defined, by regarding the whole 
cooperative system as a mechanical transformer 
from the joint space to the cooperative task space. 
The manipulability ellipsoids can be seen as
performance measures aimed at determining the
aptitude of the system to cooperate in a given
configuration.

Finally, it is worth mentioning the strict re¬ 
lationship between problems related to grasping 
of objects by fingers/hands and those related 
to cooperative manipulation. In fact, in both
cases, multiple manipulation structures grasp
a commonly manipulated object. In multifingered
hands, only some motion components are 
transmitted through the contact point to the 
manipulated object (unilateral constraints), while 
cooperative manipulation via robotic arms is 
achieved by rigid (or near-rigid) grasp points and 
interaction takes place by transmitting all the 
motion components through the grasping points 
(bilateral constraints). While many common 
problems between the two fields can be tackled 
in a conceptually similar way (e.g., kinetostatic 
modeling, force control), many others are specific
to each of the two application fields (e.g., form
and force closure for multifingered hands).

Control 

When a cooperative multi-arm system is em¬ 
ployed for the manipulation of a common object, 
it is important to control both the absolute motion 
of the object and the internal stresses applied 
to it. Hence, most of the control approaches to 
cooperative robotic systems can be classified as 
force/motion control schemes. 

Early approaches to the control of cooperative 
systems were based on the master/slave concept. 
Namely, the cooperative system is decomposed 
in a position-controlled master arm, in charge 
of imposing the absolute motion of the object, 
and the force-controlled slave arms, which are 
to follow (as smoothly as possible) the motion 
imposed by the master. A natural evolution of the 
above-described concept has been the so-called 
leader/follower approach, where the follower 
arm reference motion is computed via closed- 
chain constraints. However, such approaches 
suffered from implementation issues, mainly due 
to the fact that the compliance of the slave arms 
has to be very large, so as to smoothly follow the 
motion imposed by the master arm. Moreover, 
the roles of the master and slave (leader and 
follower) may need to be changed during the task 
execution. 

Due to the abovementioned limitations, more
natural non-master/slave approaches were
pursued later, where the cooperative system is
seen as a whole. Namely, the reference motion
of the object is used to determine the motion of
all the arms in the system and the interaction
forces are measured and fed back so as to be
directly controlled. To this aim, the mappings
between forces and velocities at the end effector 
of each manipulator and their counterparts at the 
manipulated object are considered in the design 
of the control laws. 

An approach, based on the classical hybrid 
force/position control scheme, has been proposed 
in Uchiyama and Dauchez (1993), by exploiting 
the symmetric formulation described in the pre¬ 
vious section. 

In Wen and Kreutz-Delgado (1992) a 
Lyapunov-based approach is pursued to devise 
force/position PD-type control laws. This 
approach has been extended in Caccavale et al. 
(2000), where kinetostatic filtering of the control 
action is performed, so as to eliminate all the 
components of the control input which contribute 
to internal stresses at the object. 

A further improvement of the PD plus
gravity compensation control approach has
been achieved by introducing a full model
compensation, so as to achieve feedback
linearization of the closed-loop system. The
feedback linearization approach formulated at
the operational space level is the basis of the so-called
augmented object approach (Sang et al.
1995). In this approach, the system is modeled
in the operational space as a whole, by suitably
expressing its inertial properties via a single
augmented inertia matrix $M_O$, i.e.,

$$M_O(x_E)\,\ddot{x}_E + c_O(x_E, \dot{x}_E) + g_O(x_E) = h_E, \tag{2}$$

where $M_O$, $c_O$, and $g_O$ are the operational space
terms modeling, respectively, the inertial properties
of the whole system (manipulators and object),
the Coriolis/centrifugal/friction terms, and
the gravity terms, while $x_E$ is the operational
space vector describing the position and orientation
of the coordinate frame attached to the
grasped object. In the framework of feedback linearization
(formulated in the operational space),
the problem of controlling the internal forces can
be solved, e.g., by resorting to the virtual linkage
model (Williams and Khatib 1993) or according
to the scheme proposed in Hsu (1993).

An alternative control approach is based on
the well-known impedance concept (Bonitz and
Hsia 1996; Schneider and Cannon 1992). In fact,
when a manipulation system interacts with an
external environment and/or other manipulators,
large values of the contact forces and moments
can be avoided by enforcing a compliant behavior
with suitable dynamic features. In detail, the following
mechanical impedance behavior between
the object displacements and the forces due to the
object-environment interaction can be enforced
(external impedance):

$$M_E a_E + D_E v_E + K_E e_E = h_{env}, \tag{3}$$

where $e_E$ represents the vector of displacements
between the object’s desired and actual pose, $v_E$ is
the difference between the object’s desired and
actual generalized velocities, $a_E$ is the difference
between the object’s desired and actual generalized
accelerations, and $h_{env}$ is the generalized
force acting on the object, due to the interaction
with the environment. The impedance dynamics
is characterized in terms of given positive definite
mass, damping, and stiffness matrices ($M_E$, $D_E$,
$K_E$). A mechanical impedance behavior between
the $i$th end-effector displacements and the internal
forces can be imposed as well (internal
impedance):

$$M_{I,i}\, a_i + D_{I,i}\, v_i + K_{I,i}\, e_i = h_{I,i}, \tag{4}$$

where $e_i$ is the vector expressing the displacement
between the commanded and the actual
pose of the $i$th end effector, $v_i$ is the vector
expressing the difference between commanded
and actual generalized velocities of the $i$th end
effector, $a_i$ is the vector expressing the difference
between commanded and actual generalized accelerations
of the $i$th end effector, and $h_{I,i}$ is the
contribution of the $i$th end effector to the internal
force. Again, the impedance dynamics is characterized
in terms of given positive definite mass,
damping, and stiffness matrices ($M_{I,i}$, $D_{I,i}$, $K_{I,i}$).
More recently, an impedance scheme for control
of both external forces and internal forces has
been proposed (Caccavale et al. 2008).
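To give a feel for what an impedance relation of this kind enforces, the sketch below integrates a scalar version of it, m·a + d·v + k·e = h_env, for a constant environment force. It is a one-dimensional illustration only: the scalar parameters, the semi-implicit Euler integration, and the function name are assumptions and are not taken from the cited control schemes.

```python
def simulate_impedance(mass, damping, stiffness, h_env, dt=1e-3, steps=5000):
    """Integrate m*a + d*v + k*e = h_env from rest; return the final e.

    Semi-implicit Euler: the velocity is updated first and the new
    velocity is then used to update the displacement.
    """
    e, v = 0.0, 0.0
    for _ in range(steps):
        a = (h_env - damping * v - stiffness * e) / mass
        v += a * dt
        e += v * dt
    return e


# Example: a critically damped impedance (d = 2*sqrt(k*m)) under a 5 N load.
final_displacement = simulate_impedance(mass=1.0, damping=20.0,
                                        stiffness=100.0, h_env=5.0)
```

At steady state the displacement settles at h_env divided by the stiffness, so a softer stiffness trades positioning accuracy for lower contact forces.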


Summary and Future Directions 

This entry has provided a brief survey of the 
main issues related to cooperative robots, with 
special emphasis on modeling and control prob¬ 
lems. Among several open research topics in 
cooperative manipulation, it is worth mentioning 
the problem of cooperative transportation and 
manipulation of objects via multiple mobile ma¬ 
nipulators. In fact, although notable results have 
been already devised in Khatib et al. (1996), 
the foreseen use of robotic teams in industrial 
settings (hyperflexible robotic work cells) and/or 
in collaboration with humans (robotic coworker 
concept) raises new challenges related to auton¬ 
omy and safety of multiple mobile manipulators. 
Also, an emerging application field is given by
cooperative systems composed of multiple aerial
vehicle-manipulator systems (see, e.g., Fink et al.
2011).


Cross-References 

► Force Control in Robotics 

► Robot Grasp Control 

► Robot Motion Control 


Recommended Reading 

An overview of the field of cooperative ma¬ 
nipulation can be found also in Caccavale and 
Uchiyama (2008), where a more extended lit¬ 
erature review and further technical details are 
provided. Seminal contributions to control of co¬ 
operative manipulators can be found in Chiacchio 
et al. (1991a), Koivo and Unseren (1991), Sang 
et al. (1995), Uchiyama and Dauchez (1993), 
Walker et al. (1989), Wen and Kreutz-Delgado 
(1992), and Williams and Khatib (1993).

Abstract

This article presents the fundamental elements of 
the theory of cooperative games in the context 
of dynamic systems. The concepts of Pareto op¬ 
timality, Nash bargaining solution, characteristic 
function, cores, and C-optimality are discussed, 
and some fundamental results are recalled.

Introduction

Solution concepts in game theory are grouped
in two main categories called noncooperative
and cooperative solutions, respectively. In the
seminal book of von Neumann and Morgenstern
(1944) this categorization is already made.
These authors discuss zero-sum (matrix) games
in normal form, where the noncooperative solution
concept of saddle-point was defined and
characterized, and games in characteristic function
form, where solution concepts for games
of coalitions were introduced. In this article we
present the fundamental solution concepts of the
theory of cooperative games in the context of
dynamical systems. The article is organized as
follows: we first recall the papers which mark
the origin of the development of a theory of dynamic
games; then we recall the basic concept of
Pareto optimality proposed as a cooperative solution
concept; we present the scalarization technique
and the necessary or sufficient optimality
conditions for Pareto optimality in mathematical
programming and optimal control settings; we
then explore the difficulties encountered when
one tries to extend the Nash bargaining solution,
characteristic function, and core concepts to dynamic
games; we show the links that exist with
the theory of reachability for perturbed dynamic
systems.

The Origins 

One may consider that the first introduction of 
a cooperative game solution concept in systems 
and control science is due to L.A. Zadeh (1963). 
Two-player zero-sum dynamic games have been 
studied by R. Isaacs (1954) in a deterministic 
continuous time setting and by L. Shapley (1953) 
in a discrete time stochastic setting. Nonzero-sum 
and m player differential games were introduced 
by Y.C. Ho and A.W. Starr (1969) and J.H. Case 
(1969). For these games cooperative solutions 
can be looked for to complement the noncoop¬ 
erative Nash equilibrium concept. 

Cooperation Solution Concept 

In cooperative games one is interested in nondominated
solutions. This solution type is related
to a concept introduced by the well-known
economist V. Pareto (1896) in the context of
welfare economics. Consider a system with decision
variables $x \in X \subset \mathbb{R}^n$ and $m$ performance
criteria $x \mapsto \psi_j(x) \in \mathbb{R}$, $j = 1, \ldots, m$, that one
tries to maximize.

Definition 1 The decision $x^* \in X$ is nondominated
or Pareto optimal if the following condition
holds:

$$\psi_j(x) \ge \psi_j(x^*) \;\; \forall j = 1, \ldots, m \;\Longrightarrow\; \psi_j(x) = \psi_j(x^*) \;\; \forall j = 1, \ldots, m.$$

In other words it is impossible to give one criterion
$j$ a value greater than $\psi_j(x^*)$ without
decreasing the value of another criterion, say $\ell$,
which then takes a value lower than $\psi_\ell(x^*)$.

This vector-valued optimization framework corresponds
to a situation where $m$ players are engaged
in a game, described in its normal form,
where the strategies of the $m$ players constitute
the decision vector $x$ and their respective payoffs
are given by the $m$ performance criteria $\psi_j(x)$,
$j = 1, \ldots, m$. One assumes that these players
jointly take a decision that is cooperatively optimal,
in the sense that no player can improve
his/her payoff without deteriorating the payoff of
at least one other player.
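For a finite set of candidate decisions, Definition 1 can be checked directly. The following is a minimal sketch, assuming each decision is represented by its tuple of payoffs (all to be maximized); the helper names `dominates` and `pareto_front` are illustrative.

```python
def dominates(a, b):
    """True if payoff vector a is at least as good as b everywhere
    and strictly better in at least one criterion."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(payoffs):
    """Keep the payoff vectors that no other vector dominates."""
    return [p for p in payoffs
            if not any(dominates(q, p) for q in payoffs)]


# Example with two criteria: (2, 2) and (0, 5) are dominated.
payoffs = [(1, 5), (2, 4), (3, 3), (2, 2), (0, 5)]
front = pareto_front(payoffs)
```

The pairwise comparison costs a number of checks quadratic in the number of candidates, which is acceptable for small illustrative sets.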

The Scalarization Technique 

Let $r = (r_1, r_2, \ldots, r_m)$ be a given $m$-vector composed
of normalized weights that satisfy $r_j > 0$,
$j = 1, \ldots, m$, and $\sum_{j=1}^{m} r_j = 1$.

Lemma 1 Let $x^* \in X$ be a maximum in
$X$ for the scalarized criterion $\Psi(x; r) = \sum_{j=1}^{m} r_j \psi_j(x)$.
Then $x^*$ is a nondominated
solution for the multi-objective problem.

The proof is very simple. Suppose $x^*$ is
dominated; then there exists $x^{\circ} \in X$ such
that $\psi_j(x^{\circ}) \ge \psi_j(x^*)$, $\forall j = 1, \ldots, m$, and
$\psi_i(x^{\circ}) > \psi_i(x^*)$ for one $i \in \{1, \ldots, m\}$. Since
all the $r_j$ are $> 0$, this yields $\sum_{j=1}^{m} r_j \psi_j(x^{\circ}) >
\sum_{j=1}^{m} r_j \psi_j(x^*)$, which contradicts the maximizing
property of $x^*$. This result shows that it
will be very easy to find many Pareto optimal
solutions by varying a strictly positive weighting
of the criteria. But this procedure will not find all
of the nondominated solutions.
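Both halves of this remark can be seen on a small finite example. The sketch below assumes two criteria and a finite list of payoff vectors; the data and the function names are hypothetical. The point (1.5, 1.5) is nondominated, yet no strictly positive weighting ever selects it, because it lies in a non-convex part of the payoff set.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def scalarized_best(payoffs, weights):
    """Maximize the weighted sum of the criteria over a finite set."""
    return max(payoffs, key=lambda p: sum(r * v for r, v in zip(weights, p)))


def sweep_weights(payoffs, n_grid=99):
    """Collect the maximizers found for weights (w, 1 - w), 0 < w < 1."""
    found = set()
    for i in range(1, n_grid + 1):
        w = i / (n_grid + 1)
        found.add(scalarized_best(payoffs, (w, 1.0 - w)))
    return found


payoffs = [(4.0, 0.0), (0.0, 4.0), (1.5, 1.5)]
found = sweep_weights(payoffs)
missed = [p for p in payoffs
          if p not in found and not any(dominates(q, p) for q in payoffs)]
```

Every point in `found` is nondominated, as Lemma 1 guarantees, while `missed` lists nondominated points that the weighted sum cannot reach.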

Conditions for Pareto Optimality 
in Mathematical Programming 

N.O. Da Cunha and E. Polak (1967b) have obtained
the first necessary conditions for multi-objective
optimization. The problem they consider
is

$$\text{Pareto Opt.} \quad \psi_j(x), \quad j = 1, \ldots, m$$
$$\text{s.t.} \quad \varphi_k(x) \le 0, \quad k = 1, \ldots, p,$$

where the functions $x \in \mathbb{R}^n \mapsto \psi_j(x) \in \mathbb{R}$,
$j = 1, \ldots, m$, and $x \mapsto \varphi_k(x) \in \mathbb{R}$, $k =
1, \ldots, p$, are continuously differentiable ($C^1$) and
where we assume that the constraint qualification
conditions of mathematical programming hold
for this problem too. They proved the following
theorem.

Theorem 1 Let $x^*$ be a Pareto optimal solution
of the problem defined above. Then there exists a
vector $\lambda$ of $p$ multipliers $\lambda_k$, $k = 1, \ldots, p$, and a
vector $r \ne 0$ of $m$ weights $r_j \ge 0$, such that the
following conditions hold

$$\frac{\partial}{\partial x} \mathcal{L}(x^*; r; \lambda) = 0$$
$$\varphi_k(x^*) \le 0$$
$$\lambda_k \varphi_k(x^*) = 0$$
$$\lambda_k \ge 0,$$

where $\mathcal{L}(x^*; r; \lambda)$ is the weighted Lagrangian
defined by

$$\mathcal{L}(x; r; \lambda) = \sum_{j=1}^{m} r_j \psi_j(x) + \sum_{k=1}^{p} \lambda_k \varphi_k(x).$$

So there is a local scalarization principle for
Pareto optimality.

Maximum Principle 

The extension of Pareto optimality concept to 
control systems was done by several authors 
(Basile and Vincent 1970; Bellassali and Jourani 
2004; Binmore et al. 1986; Blaquiere et al. 1972; 
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Leitmann et al. 1972; Salukvadze 1971; Vincent 
and Leitmann 1970; Zadeh 1963), the main result 
being an extension of the maximum principle of 
Pontryagin. Let a system be governed by state 
equations: 


$$\dot{x}(t) = f(x(t), u(t)) \tag{1}$$

$$u(t) \in U \tag{2}$$

$$x(0) = x^0 \tag{3}$$

$$t \in [0, T] \tag{4}$$

where $x \in \mathbb{R}^n$ is the state variable of the system, $u \in U$, with $U$ compact, is the control variable, and $[0, T]$ is the control horizon. The system is evaluated by $m$ performance criteria of the form

$$\psi_j(u(\cdot)) = \int_0^T g_j(x(t), u(t))\,dt + G_j(x(T)), \tag{5}$$

for $j = 1, \ldots, m$. Under the usual assumptions of control theory, i.e., $f(\cdot,\cdot)$ and $g_j(\cdot,\cdot)$, $j = 1, \ldots, m$, being $C^1$ in $x$ and continuous in $u$, and $G_j(\cdot)$ being $C^1$ in $x$, one can prove the following.

Theorem 2 Let $\{x^*(t) : t \in [0, T]\}$ be a Pareto optimal trajectory, generated at initial state $x^0$ by the Pareto optimal control $\{u^*(t) : t \in [0, T]\}$. Then there exist costate vectors $\{\lambda(t) : t \in [0, T]\}$ and a vector of weights $r \neq 0$ in $\mathbb{R}^m$, with components $r_j \ge 0$, $\sum_{j=1}^m r_j = 1$, such that the following relations hold:

$$\dot{x}^*(t) = \frac{\partial}{\partial \lambda} H(x^*(t), u^*(t); \lambda(t); r) \tag{6}$$

$$\dot{\lambda}(t) = -\frac{\partial}{\partial x} H(x^*(t), u^*(t); \lambda(t); r) \tag{7}$$

$$x^*(0) = x^0 \tag{8}$$

$$\lambda(T) = \sum_{j=1}^m r_j \frac{\partial}{\partial x} G_j(x^*(T)) \tag{9}$$

with

$$H(x^*(t), u^*(t); \lambda(t); r) = \max_{u \in U} H(x^*(t), u; \lambda(t); r),$$

where the weighted Hamiltonian is defined by

$$H(x, u; \lambda; r) = \sum_{j=1}^m r_j g_j(x, u) + \lambda^T f(x, u).$$

The proof of this result necessitates some additional regularity assumptions. Some of these conditions imply that there exist differentiable Bellman value functions (see, e.g., Blaquiere et al. 1972); others use the formalism of nonsmooth analysis (see, e.g., Bellassali and Jourani 2004).

The Nash Bargaining Solution

Since Pareto optimal solutions are numerous (actually, since a subset of Pareto outcomes is indexed over the weightings $r$, $r_j \ge 0$, $\sum_{j=1}^m r_j = 1$), one can expect to have, in the $m$-dimensional payoff space, a manifold of Pareto outcomes. Therefore, the problem that we must solve now is how to select the "best" Pareto outcome. "Best" is a misnomer here, because, by their very definition, two Pareto outcomes cannot be compared or gauged. The choice of a Pareto outcome that satisfies each player must be the result of some bargaining. J. Nash addressed this problem very early (Nash 1950), using a two-player game setting. He developed an axiomatic approach where he proposed four behavior axioms which, if accepted, would determine a unique choice for the bargaining solution. These axioms are called, respectively, (i) invariance to affine transformations of utility representations, (ii) Pareto optimality, (iii) independence of irrelevant alternatives, and (iv) symmetry. Then the bargaining point is the Pareto optimal solution that maximizes the product

$$x^* = \arg\max_x \,(\psi_1(x) - \psi_1(x^0))(\psi_2(x) - \psi_2(x^0))$$

where $x^0$ is the status quo decision, in case bargaining fails, and $\psi_j(x^0)$, $j = 1, 2$, are the payoffs associated with this no-accord decision (this defines the so-called threat point). It has been proved (Binmore et al. 1986) that this solution could also be obtained as the solution of an auxiliary dynamic game in which a sequence of claims and counterclaims is made by the two players when they bargain.
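
As a minimal numerical sketch of the product maximization above, assume a hypothetical two-player problem in which the decision $x$ is a share in $[0, 1]$, the payoffs are $\psi_1(x) = \sqrt{x}$ and $\psi_2(x) = 1 - x$, and the status quo yields zero to both players. The payoff functions, the grid search, and the function name are illustrative assumptions and are not taken from the cited references.

```python
import math


def nash_bargaining_point(psi1, psi2, threat, candidates):
    """Return the candidate decision that maximizes the Nash product.

    psi1, psi2 : payoff functions of the two players (hypothetical here)
    threat     : pair (psi1(x0), psi2(x0)) of status quo payoffs
    candidates : iterable of feasible decisions x
    """
    d1, d2 = threat
    best_x, best_val = None, -math.inf
    for x in candidates:
        gain1, gain2 = psi1(x) - d1, psi2(x) - d2
        if gain1 < 0 or gain2 < 0:
            continue  # a player would prefer the status quo
        val = gain1 * gain2
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val


# Illustrative data: share x in [0, 1] searched on a uniform grid.
grid = [k / 1000 for k in range(1001)]
x_star, product = nash_bargaining_point(
    math.sqrt, lambda x: 1.0 - x, (0.0, 0.0), grid
)
print(x_star, product)
```

For these assumed payoffs the first-order condition of $\sqrt{x}(1 - x)$ gives $x = 1/3$, so the grid search should return a point close to that value.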

When extended directly to the context of differential or multistage games, the Nash bargaining solution concept proved to lack the important property of time consistency. This was first noticed in Haurie (1976). Let a dynamic game be defined by Eqs. (1)-(5), with $j = 1, 2$. Suppose the status quo decision, if no agreement is reached at initial state $(t = 0, x(0) = x^0)$, consists in playing an open-loop Nash equilibrium, defined by the controls $u_j^N(\cdot) : [0, T] \to U_j$, $j = 1, 2$, and generating the trajectory $x^N(\cdot) : [0, T] \to \mathbb{R}^n$, with $x^N(0) = x^0$. Now applying the Nash bargaining solution scheme to the data of this differential game played at time $t = 0$ and state $x(0) = x^0$, one identifies a particular Pareto optimal solution, associated with the controls $u_j^*(\cdot) : [0, T] \to U_j$, $j = 1, 2$, and generating the trajectory $x^*(\cdot) : [0, T] \to \mathbb{R}^n$, with $x^*(0) = x^0$. Now assume the two players renegotiate the agreement to play $u_j^*(\cdot)$ at an intermediate point $(\tau, x^*(\tau))$, $\tau \in (0, T)$, of the Pareto optimal trajectory. When computed from that point, the status quo strategies are in general not the same as they were at $(0, x^0)$; furthermore, the shape of the Pareto frontier, when the game is played from $(\tau, x^*(\tau))$, is different from what it is when the game is played at $(0, x^0)$. For these two reasons the bargaining solution at $(\tau, x^*(\tau))$ will not coincide in general with the restriction to the interval $[\tau, T]$ of the bargaining solution from $(0, x^0)$. This implies that the solution concept is not time consistent. Using feedback strategies, instead of open-loop ones, does not help, as the same phenomena (change of status quo and change of Pareto frontier) occur in a feedback strategy context.

This shows that the cooperative game solutions proposed in the classical theory of games cannot be applied without precaution in a dynamic setting when players have the possibility to renegotiate agreements at any intermediate point $(t, x^*(t))$ of the bargained solution trajectory.


Cores and C-Optimality in Dynamic Games

Characteristic functions and the associated solution concept of core are important elements in the classical theory of cooperative games. In two papers (Haurie 1975; Haurie and Delfour 1974) the basic definitions and properties of the concept of core in dynamic cooperative games were presented. Consider the multistage system, controlled by a set $M$ of $m$ players and defined by

$$x(k+1) = f^k(x(k), u_M(k)), \quad k = 0, 1, \ldots, K-1,$$

$$x(i) = x^i, \quad i \in \{0, 1, \ldots, K-1\},$$

$$u_M(k) = (u_j(k))_{j \in M} \in U_M(k) = \prod_{j \in M} U_j(k).$$

From the initial point $(i, x^i)$ a control sequence $(u_M(i), \ldots, u_M(K-1))$ generates for each player $j$ a payoff defined as follows:

$$J_j(i, x^i; u_M(i), \ldots, u_M(K-1)) = \sum_{k=i}^{K-1} \Phi_j(x(k), u_M(k)) + \Upsilon_j(x(K)).$$

A subset $S$ of $M$ is called a coalition. Let $\mu_S^k : x(k) \mapsto u_S(k) \in \prod_{j \in S} U_j(k)$ be a feedback control for the coalition defined at each stage $k$. A player $j \in S$ then considers, from any initial point $(i, x^i)$, his guaranteed payoff:

$$\Psi_j(i, x^i; \mu_S^i, \ldots, \mu_S^{K-1}) = \inf_{u_{M-S}(i) \in U_{M-S}(i), \ldots, u_{M-S}(K-1) \in U_{M-S}(K-1)} \left\{ \sum_{k=i}^{K-1} \Phi_j\big(x(k), [\mu_S^k(x(k)), u_{M-S}(k)]\big) + \Upsilon_j(x(K)) \right\}.$$

Definition 2 The characteristic function at stage $i$ for coalition $S \subset M$ is the mapping $v^i : (S, x^i) \mapsto v^i(S, x^i) \subset \mathbb{R}^S$ defined by

$$\omega_S = (\omega_j)_{j \in S} \in v^i(S, x^i) \iff \exists\, \mu_S^i, \ldots, \mu_S^{K-1} : \forall j \in S, \quad \Psi_j(i, x^i; \mu_S^i, \ldots, \mu_S^{K-1}) \ge \omega_j.$$

In other words, there is a feedback law for the coalition $S$ which guarantees at least $\omega_j$ to each player $j$ in the coalition.

Suppose that in a cooperative agreement, at point $(i, x^i)$, the coalition $S$ is proposed a gain vector $\omega_S$ which is interior to $v^i(S, x^i)$. Then coalition $S$ will block this agreement, because, using an appropriate feedback, the coalition can guarantee a better payoff to each of its members. We can now extend the definition of the core of a cooperative game to the context of dynamic games, as the set of agreement gains that cannot be blocked by any coalition.

Definition 3 The core $\Omega(i, x^i)$ at point $(i, x^i)$ is the set of gain vectors $\omega_M = (\omega_j)_{j \in M}$ such that:

1. There exists a Pareto optimal control $u_M^*(i), \ldots, u_M^*(K-1)$ for which $\omega_j = J_j(i, x^i; u_M^*(i), \ldots, u_M^*(K-1))$,

2. $\forall S \subset M$ the projection of $\omega_M$ in $\mathbb{R}^S$ is not interior to $v^i(S, x^i)$.

Playing a cooperative game, one would be interested in finding a solution where the gain-to-go remains in the core at each point of the trajectory. This leads us to define the following.

Definition 4 A control $u^0 = (u_M^0(0), \ldots, u_M^0(K-1))$ is C-optimal at $(0, x^0)$ if $u^0$ is Pareto optimal, generating a state trajectory

$$\{x^0(0) = x^0, x^0(1), \ldots, x^0(K)\}$$

and a sequence of gain-to-go values

$$\omega_j^0(i) = J_j(i, x^0(i); u_M^0(i), \ldots, u_M^0(K-1)), \quad i = 0, \ldots, K-1,$$

such that $\forall i = 0, 1, \ldots, K-1$, the $m$-vector $\omega_M^0(i)$ is an element of the core $\Omega(i, x^0(i))$.

A C-optimal control generates an agreement which cannot be blocked by any coalition along the Pareto optimal trajectory. It can be shown on examples that a Pareto optimal trajectory which has the gain-to-go vector in the core at the initial point $(0, x^0)$ is not necessarily C-optimal.


Links with Reachability Theory for Perturbed Systems

The computation of characteristic functions can be made using the techniques developed to study reachability of dynamic systems with set-constrained disturbances (see Bertsekas and Rhodes 1971). Consider the particular case of a linear system

$$x(k+1) = A^k x(k) + \sum_{j \in M} B_j^k u_j(k) \tag{10}$$

where $x \in \mathbb{R}^n$, $u_j \in U_j^k$, where $U_j^k$ is a convex bounded set and $A^k$, $B_j^k$ are matrices of appropriate dimensions. Let the payoff to player $j$ be defined by:

$$J_j(i, x^i; u_M(i), \ldots, u_M(K-1)) = \sum_{k=i}^{K-1} \left[ \phi_j^k(x(k)) + \gamma_j^k(u_j(k)) \right] + \Upsilon_j(x(K)).$$

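Algorithm Here we use the notations $\phi_S^k = (\phi_j^k)_{j \in S}$ and $B_S^k u_S = \sum_{j \in S} B_j^k u_j$. Also we denote by $\{u + V\}$, where $u$ is a vector in $\mathbb{R}^m$ and $V \subset \mathbb{R}^m$, the set of vectors $u + v$, $\forall v \in V$. Then

1. $\forall x^K \quad v^K(S, x^K) = \{\omega_S \in \mathbb{R}^S : \Upsilon_S(x^K) \ge \omega_S\}$

2. $\forall x \quad \mathcal{E}^{k+1}(S, x) = \bigcap_{v \in U_{M-S}^k} v^{k+1}(S, x + B_{M-S}^k v)$

3. $\forall x^k \quad \mathcal{H}^k(S, x^k) = \bigcup_{u \in U_S^k} \{\gamma_S^k(u) + \mathcal{E}^{k+1}(S, A^k x^k + B_S^k u)\}$

4. $\forall x^k \quad v^k(S, x^k) = \{\phi_S^k(x^k) + \mathcal{H}^k(S, x^k)\}$.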

In an open-loop control setting, the calculation of the characteristic function can be done using the concept of Pareto optimal solution for a system with set-constrained disturbances, as shown in Goffin and Haurie (1973, 1976) and Haurie (1973).

Conclusion

Since the foundations of a theory of cooperative solutions to dynamic games, recalled in this article, were laid, research has evolved toward the search for cooperative solutions that could also be equilibrium solutions, using for that purpose a class of memory strategies (Haurie and Towinski 1985), and has found a very important domain of application in the assessment of environmental agreements, in particular those related to the climate change issue. For example, the sustainability of solutions in the core of a dynamic game modeling international environmental negotiations is studied in Germain et al. (2003). A more encompassing model of dynamic formation of coalitions and stabilization of solutions through the use of threats is proposed in Breton et al. (2010). These references are indicative of the trend of research in this field.

Cross-References 

► Dynamic Noncooperative Games 

► Game Theory: Historical Overview 

► Strategic Form Games and Nash Equilibrium 

Bibliography 

Basile G, Vincent TL (1970) Absolutely cooperative solution for a linear, multiplayer differential game. J Optim Theory Appl 6:41-46
Bellassali S, Jourani A (2004) Necessary optimality conditions in multiobjective dynamic optimization. SIAM J Control Optim 42:2043-2061
Bertsekas DP, Rhodes IB (1971) On the minimax reachability of target sets and target tubes. Automatica 7:233-247
Binmore K, Rubinstein A, Wolinsky A (1986) The Nash bargaining solution in economic modelling. Rand J Econ 17(2):176-188
Blaquiere A, Juricek L, Wiese KE (1972) Geometry of Pareto equilibria and maximum principle in n-person differential games. J Optim Theory Appl 38:223-243
Breton M, Sbragia L, Zaccour G (2010) A dynamic model for international environmental agreements. Environ Resour Econ 45:25-48
Case JH (1969) Toward a theory of many player differential games. SIAM J Control 7(2):179-197
Da Cunha NO, Polak E (1967a) Constrained minimization under vector-valued criteria in linear topological spaces. In: Balakrishnan AV, Neustadt LW (eds) Mathematical theory of control. Academic, New York, pp 96-108
Da Cunha NO, Polak E (1967b) Constrained minimization under vector-valued criteria in finite dimensional spaces. J Math Anal Appl 19:103-124
Germain M, Toint P, Tulkens H, Zeeuw A (2003) Transfers to sustain dynamic core-theoretic cooperation in international stock pollutant control. J Econ Dyn Control 28:79-99
Goffin JL, Haurie A (1973) Necessary conditions and sufficient conditions for Pareto optimality in a multicriterion perturbed system. In: Conti R, Ruberti A (eds) 5th conference on optimization techniques, Rome. Lecture notes in computer science, vol 4. Springer
Goffin JL, Haurie A (1976) Pareto optimality with non-differentiable cost functions. In: Thiriez H, Zionts S (eds) Multiple criteria decision making. Lecture notes in economics and mathematical systems, vol 130. Springer, Berlin/New York, pp 232-246
Haurie A (1973) On Pareto optimal decisions for a coalition of a subset of players. IEEE Trans Autom Control 18:144-149
Haurie A (1975) On some properties of the characteristic function and the core of a multistage game of coalition. IEEE Trans Autom Control 20(2):238-241
Haurie A (1976) A note on nonzero-sum differential games with bargaining solutions. J Optim Theory Appl 13:31-39
Haurie A, Delfour MC (1974) Individual and collective rationality in a dynamic Pareto equilibrium. J Optim Theory Appl 13(3):290-302
Haurie A, Towinski B (1985) Definition and properties of cooperative equilibria in a two-player game of infinite duration. J Optim Theory Appl 46(4):525-534
Isaacs R (1954) Differential games I: introduction. Rand Research Memorandum, RM-1391-30. Rand Corporation, Santa Monica
Leitmann G, Rocklin S, Vincent TL (1972) A note on control space properties of cooperative games. J Optim Theory Appl 9:379-390
Nash J (1950) The bargaining problem. Econometrica 18(2):155-162
Pareto V (1896) Cours d’Economie Politique. Rouge, Lausanne
Salukvadze ME (1971) On the optimization of control systems with vector criteria. In: Proceedings of the 11th all-union conference on control, Part 2. Nauka
Shapley LS (1953) Stochastic games. PNAS 39(10):1095-1100
Starr AW, Ho YC (1969) Nonzero-sum differential games. J Optim Theory Appl 3(3):184-206
Vincent TL, Leitmann G (1970) Control space properties of cooperative games. J Optim Theory Appl 6(2):91-113
von Neumann J, Morgenstern O (1944) Theory of games and economic behavior. Princeton University Press
Zadeh LA (1963) Optimality and non-scalar-valued performance criteria. IEEE Trans Autom Control AC-8:59-60





Coordination of Distributed Energy 
Resources for Provision of Ancillary 
Services: Architectures and 
Algorithms 

Alejandro D. Dominguez-Garcia 1 and Christoforos N. Hadjicostis 2
1 University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
2 University of Cyprus, Nicosia, Cyprus

Abstract 

We discuss the utilization of distributed energy resources (DERs) to provide active and reactive power support for ancillary services. Though the amount of active and/or reactive power provided individually by each of these resources can be very small, their presence in large numbers in power distribution networks implies that, under proper coordination mechanisms, they can collectively provide substantial active and reactive power regulation capacity. In this entry, we provide a simple formulation of the DER coordination problem for enabling their utilization to provide ancillary services. We also provide specific architectures and algorithmic solutions to solve the DER coordination problem, with focus on decentralized solutions.

Keywords 

Ancillary services; Consensus; Distributed algorithms; Distributed energy resources (DERs)

Introduction 

On the distribution side of a power system, 
there are many distributed energy resources 
(DERs), e.g., photovoltaic (PV) installations, 
plug-in hybrid electric vehicles (PHEVs), and 
thermostatically controlled loads (TCLs), that 
can be potentially used to provide ancillary 
services, e.g., reactive power support for voltage 
control (see, e.g., Turitsyn et al. (2011) and 
the references therein) and active power up and down regulation for frequency control (see, e.g.,
Callaway and Hiskens (2011) and the references 
therein). To enable DERs to provide ancillary 
services, it is necessary to develop appropriate 
control and coordination mechanisms. One 
potential solution relies on a centralized control 
architecture in which each DER is directly 
coordinated by (and communicates with) a 
central decision maker. An alternative approach 
is to distribute the decision making, which 
obviates the need for a central decision maker 
to coordinate the DERs. In both cases, the 
decision making involves solving a resource 
allocation problem for coordinating the DERs 
to collectively provide a certain amount of a 
resource (e.g., active or reactive power). 

In a practical setting, whether a centralized or 
a distributed architecture is adopted, the control 
of DERs for ancillary services provision will 
involve some aggregating entity that will gather 
together and coordinate a set of DERs, which will provide a certain amount of active or reactive power in exchange for monetary benefits. In
general, these aggregating entities are the ones 
that interact with the ancillary services market, 
and through some market-clearing mechanism, 
they enter a contract to provide some amount of 
resource, e.g., active and/or reactive power over a 
period of time. The goal of the aggregating entity 
is to provide this amount of resource by properly 
coordinating and controlling the DERs, while 
ensuring that the total monetary compensation 
to the DERs for providing the resource is below 
the monetary benefit that the aggregating entity 
obtains by selling the resource in the ancillary 
services market. 

In the context above, a household with a solar PV rooftop installation and a PHEV might choose to offer the PV installation to a renewable aggregator so it is utilized to provide reactive power support (this can be achieved as long as the PV installation power electronics-based grid interface has the correct topology; Dominguez-Garcia et al. 2011). Additionally, the household could offer its PHEV to a battery vehicle aggregator to be used as a controllable load for energy peak shaving during peak hours and load leveling at night (Guille and Gross 2009). Finally, the household might choose to enroll in a demand response program in which it allows a demand response provider to control its TCLs to provide frequency regulation services (Callaway and Hiskens 2011). In general, the renewable aggregator, the battery vehicle aggregator, and the demand response provider can be either separate entities or the same entity. In this entry, we will refer to these aggregating entities as aggregators.

The Problem of DER Coordination 

Without loss of generality, denote by $x_i$ the amount of resource provided by DER $i$ without specifying whether it is active or reactive power. [However, it is understood that each DER provides (or consumes) the same type of resource, i.e., all the $x_i$'s are either active or reactive power.] Let $0 < \underline{x}_i < \overline{x}_i$, for $i = 1, 2, \ldots, n$, denote the minimum ($\underline{x}_i$) and maximum ($\overline{x}_i$) capacity limits on the amount of resource $x_i$ that node $i$ can provide. Denote by $X$ the total amount of resource that the DERs must collectively provide to satisfy the aggregator request. Let $\pi_i(x_i)$ denote the price that the aggregator pays DER $i$ per unit of resource $x_i$ that it provides. Then, the objective of the aggregator in the DER coordination problem is to minimize the total monetary amount to be paid to the DERs for providing the total amount of resource $X$ while satisfying the individual capacity constraints of the DERs. Thus, the DER coordination problem can be formulated as follows:

$$\begin{aligned} \text{minimize} \quad & \sum_{i=1}^n x_i \pi_i(x_i) \\ \text{subject to} \quad & \sum_{i=1}^n x_i = X \\ & 0 < \underline{x}_i \le x_i \le \overline{x}_i, \ \forall i. \end{aligned} \tag{1}$$

By allowing heterogeneity in the price per unit of resource that the aggregator offers to each DER, we can take into account the fact that the aggregator might value classes of DERs differently. For example, the downregulation capacity provided by a residential PV installation (which is achieved by curtailing its power) might be valued differently from the downregulation capacity provided by a TCL or a PHEV (both would need to absorb additional power in order to provide downregulation).

It is not difficult to see that if the price functions $\pi_i(\cdot)$, $i = 1, 2, \ldots, n$, are convex and nondecreasing, then the cost function $\sum_{i=1}^n x_i \pi_i(x_i)$ is convex; thus, if the problem in (1) is feasible, then there exists a globally optimal solution. Additionally, if the price per unit of resource is linear with the amount of resource, i.e., $\pi_i(x_i) = c_i x_i$, $i = 1, 2, \ldots, n$, then $x_i \pi_i(x_i) = c_i x_i^2$, $i = 1, 2, \ldots, n$, and the problem in (1) reduces to a quadratic program. Also, if the price per unit of resource is constant, i.e., $\pi_i(x_i) = c_i$, $i = 1, 2, \ldots, n$, then $x_i \pi_i(x_i) = c_i x_i$, $i = 1, 2, \ldots, n$, and the problem in (1) reduces to a linear program. Finally, if $\pi_i(x_i) = \pi(x_i) = c$, $i = 1, 2, \ldots, n$, for some constant $c > 0$, i.e., the price offered by the aggregator is constant and the same for all DERs, then the optimization problem in (1) becomes a feasibility problem of the form

$$\begin{aligned} \text{find} \quad & x_1, x_2, \ldots, x_n \\ \text{subject to} \quad & \sum_{i=1}^n x_i = X \\ & 0 < \underline{x}_i \le x_i \le \overline{x}_i, \ \forall i. \end{aligned} \tag{2}$$
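
For the quadratic case, in which the cost is $\sum_i c_i x_i^2$ with $c_i > 0$, a centralized solver can be sketched as follows. The sketch relies on a standard optimality argument that is not spelled out in this entry: at the optimum, each $x_i$ equals $\lambda / (2 c_i)$ clipped to its capacity interval, where the multiplier $\lambda$ is chosen so that the allocations sum to $X$, and $\lambda$ can be found by bisection. The function name, tolerance, and numerical data are hypothetical.

```python
def solve_quadratic_der(c, lo, hi, X, tol=1e-10):
    """Minimize sum(c[i] * x[i]**2) s.t. sum(x) == X and lo[i] <= x[i] <= hi[i].

    Assumes c[i] > 0, lo[i] >= 0, and sum(lo) <= X <= sum(hi).
    """
    if not (sum(lo) <= X <= sum(hi)):
        raise ValueError("request X is outside the aggregate capacity range")

    def allocation(lam):
        # Stationarity 2 * c_i * x_i = lam, clipped to the capacity limits.
        return [min(max(lam / (2.0 * ci), l), h) for ci, l, h in zip(c, lo, hi)]

    lam_lo = 0.0                                         # allocation(0) == lo
    lam_hi = max(2.0 * ci * h for ci, h in zip(c, hi))   # allocation == hi
    while lam_hi - lam_lo > tol:
        mid = 0.5 * (lam_lo + lam_hi)
        if sum(allocation(mid)) < X:
            lam_lo = mid
        else:
            lam_hi = mid
    return allocation(lam_hi)


# Illustrative data: three DERs, the first one hits its upper limit.
x_opt = solve_quadratic_der(c=[1.0, 2.0, 4.0], lo=[0.1, 0.1, 0.1],
                            hi=[1.5, 2.0, 2.0], X=3.5)
print(x_opt)
```

In this example the cheapest DER saturates at its upper limit of 1.5, and the remaining 2.0 units are split between the other two in inverse proportion to their cost coefficients.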

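If the problem in (2) is indeed feasible (i.e., $\sum_{l=1}^n \underline{x}_l \le X \le \sum_{l=1}^n \overline{x}_l$), then there is, in general, an infinite number of solutions. One such solution, which we refer to as fair splitting, is given by

$$x_i = \underline{x}_i + \frac{X - \sum_{l=1}^n \underline{x}_l}{\sum_{l=1}^n (\overline{x}_l - \underline{x}_l)} (\overline{x}_i - \underline{x}_i), \quad \forall i. \tag{3}$$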
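
The fair-splitting rule can be written directly as a short function: every DER uses the same fraction of its available range between its minimum and maximum capacity. This is a minimal centralized sketch; the function name and the numerical data are hypothetical.

```python
def fair_split(lo, hi, X):
    """Fair-splitting allocation: each DER uses the same fraction of its range.

    lo, hi : lists of minimum and maximum capacity limits
    X      : total amount of resource requested by the aggregator
    """
    total_lo, total_hi = sum(lo), sum(hi)
    if not (total_lo <= X <= total_hi):
        raise ValueError("request X is outside the aggregate capacity range")
    span = total_hi - total_lo
    if span == 0:
        return list(lo)  # degenerate case: every DER has a fixed output
    fraction = (X - total_lo) / span
    return [l + fraction * (h - l) for l, h in zip(lo, hi)]


# Illustrative data: the request sits halfway between the aggregate limits.
x_fair = fair_split(lo=[1.0, 2.0, 0.5], hi=[3.0, 5.0, 1.5], X=6.5)
print(x_fair)
```

With these numbers the common fraction is 0.5, so each DER sits at the midpoint of its own capacity interval and the allocations sum to the requested 6.5.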

The formulation of the DER coordination problem provided in (2) is not the only possible one. In this regard, and in the context of PHEVs, several recent works have proposed game-theoretic formulations of the problem (Gharesifard et al. 2013; Ma et al. 2013; Tushar et al. 2012). For example, in Gharesifard et al. (2013), the authors assume that each PHEV is a decision maker and can freely choose to participate after receiving a request from the aggregator. The decision that each PHEV is faced with depends on its own utility function, along with some pricing strategy designed by the aggregator. The PHEVs are assumed to be price anticipating in the sense that they are aware of the fact that the pricing is designed by the aggregator with respect to the average energy available. Another alternative is to formulate the DER coordination problem as a scheduling problem (Chen et al. 2012; Subramanian et al. 2012), where the DERs are treated as tasks. Then, the problem is to develop real-time scheduling policies to service these tasks.

Coordination of Distributed Energy Resources for Provision of Ancillary Services: Architectures and Algorithms, Fig. 1 Control architecture alternatives. (a) Centralized architecture. (b) Decentralized architecture


Architectures 

Next, we describe two possible architectures that can be utilized to implement the proper algorithms for solving the DER coordination problem as formulated in (1). Specifically, we describe a centralized architecture that requires the aggregator to communicate bidirectionally with each DER and a distributed architecture that requires the aggregator to only unidirectionally communicate with a limited number of DERs but requires some additional exchange of information (not necessarily through bidirectional communication links) among the DERs.

Centralized Architecture 

A solution can be achieved through the completely centralized architecture of Fig. 1a, where the aggregator can exchange information with each available DER. In this scenario, each DER can inform the aggregator about its active and/or reactive capacity limits and other operational constraints, e.g., maintenance schedule. After gathering all this information, the aggregator solves the optimization program in (1), the solution of which will determine how to allocate among the resources the total amount of active power $P_s^r$ and/or reactive power $Q_s^r$ that it needs to provide. Then, the aggregator sends individual commands to each DER so they modify their active and/or reactive power generation according to the solution of (1) computed by the aggregator. In this centralized solution, however, it is necessary to overlay a communication network connecting the aggregator with each resource and to maintain knowledge of the resources that are available at any given time.

Decentralized Architecture 

An alternative is to use the decentralized control architecture of Fig. 1b, where the aggregator relays information to a limited number of DERs that it can directly communicate with and each DER is able to exchange information with a number of other close-by DERs. For example, the aggregator might broadcast the prices to be paid to each type of DER. Then, through some distributed protocol that adheres to the communication network interconnecting the DERs, the information relayed by the aggregator to this limited number of DERs is disseminated to all other available DERs. This dissemination process may rely on flooding algorithms, message-passing protocols, or linear-iterative algorithms as proposed in Dominguez-Garcia and Hadjicostis (2010, 2011). After the dissemination process is complete and through a distributed computation over the communication network, the DERs can solve the optimization program in (1) and each determine their active and/or reactive power contribution.

A decentralized architecture like the one in Fig. 1b may offer several advantages over the centralized one in Fig. 1a, including the following. First, a decentralized architecture may be more economical because it does not require a direct communication link between the aggregator and each of the various DERs. Also, a decentralized architecture does not require the aggregator to have complete knowledge of the DERs available. Additionally, a decentralized architecture can be more resilient to faults and/or unpredictable behavioral patterns by the DERs. Finally, the practical implementation of such a decentralized architecture can rely on inexpensive and simple hardware. For example, the testbed described in Dominguez-Garcia et al. (2012a), which is used to solve a particular instance of the problem in (1), uses Arduino microcontrollers (see Arduino for a description) outfitted with wireless transceivers implementing a ZigBee protocol (see ZigBee for a description).


Algorithms 

Ultimately, whether a centralized or a decentralized architecture is adopted, it is necessary to solve the optimization problem in (1). If a centralized architecture is adopted, then solving (1) is relatively straightforward using, e.g., standard gradient-descent algorithms (see, e.g., Bertsekas and Tsitsiklis 1997). Beyond the DER coordination problem and the specific formulation in (1), solving an optimization problem is challenging if a decentralized architecture is adopted (especially if the communication links between DERs are not bidirectional); this has spurred significant research in the last few years (see, e.g., Bertsekas and Tsitsiklis 1997, Xiao et al. 2006, Nedic et al. 2010, Zanella et al. 2011, Gharesifard and Cortes 2012, and the references therein).

In the specific context of the DER coordination problem as formulated in (1), when the cost functions are assumed to be quadratic and the communication between DERs is not bidirectional, an algorithm amenable to implementation in a decentralized architecture like the one in Fig. 1b has been proposed in Dominguez-Garcia et al. (2012a). Also, in the context of Fig. 1b, when the communication links between DERs are bidirectional, the DER coordination problem, as formulated in (1), can be solved using an algorithm proposed in Kar and Hug (2012).

As mentioned earlier, when the price offered by the aggregator is constant and identical for all DERs, the problem in (1) reduces to the feasibility problem in (2). One possible solution to this feasibility problem is the fair-splitting solution in (3). Next, we describe a linear-iterative algorithm, originally proposed in Dominguez-Garcia and Hadjicostis (2010, 2011) and referred to as ratio consensus, that allows each DER to individually determine its contribution so that the fair-splitting solution is achieved.

Ratio Consensus: A Distributed Algorithm for Fair Splitting

We assume that each DER is equipped with a processor that can perform simple computations and can exchange information with neighboring DERs. In particular, the information exchange between DERs can be described by a directed graph $\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$, where $\mathcal{V} = \{1, 2, \ldots, n\}$ is the vertex set (each vertex, or node, corresponds to a DER) and $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ is the set of edges, where $(i, j) \in \mathcal{E}$ if node $i$ can receive information from node $j$. We require $\mathcal{G}$ to be strongly connected, i.e., for any pair of vertices $l$ and $l'$, there exists a path that starts in $l$ and ends in $l'$. Let $\mathcal{L}^+ \subseteq \mathcal{V}$, $\mathcal{L}^+ \neq \emptyset$, denote the set of nodes that the aggregator is able to directly communicate with.

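The processor of each DER $i$ maintains two values $y_i$ and $z_i$, which we refer to as internal states, and updates them (independently of each other) to be, respectively, a linear combination of DER $i$'s own previous internal states and the previous internal states of all nodes that can possibly transmit information to node $i$ (including itself). In particular, for all $k \ge 0$, each node $i$ updates its two internal states as follows:

$$y_i[k+1] = \sum_{j \in \mathcal{N}_i^-} \frac{1}{\mathcal{D}_j^+} y_j[k] \tag{4}$$

$$z_i[k+1] = \sum_{j \in \mathcal{N}_i^-} \frac{1}{\mathcal{D}_j^+} z_j[k] \tag{5}$$

where $\mathcal{N}_i^- = \{j \in \mathcal{V} : (i, j) \in \mathcal{E}\}$, i.e., all nodes that can possibly transmit information to node $i$ (including itself); and $\mathcal{D}_i^+$ is the out-degree of node $i$, i.e., the number of nodes to which node $i$ can possibly transmit information (including itself). The initial conditions in (4) are set to $y_i[0] = X/m - \underline{x}_i$ if $i \in \mathcal{L}^+$, and $y_i[0] = -\underline{x}_i$ otherwise (here $m$ is the number of nodes in $\mathcal{L}^+$, so that the initial values $y_i[0]$ sum to $X - \sum_{l=1}^n \underline{x}_l$), and the initial conditions in (5) are set to $z_i[0] = \overline{x}_i - \underline{x}_i$. Then, as shown in Dominguez-Garcia and Hadjicostis (2011), as long as $\sum_{l=1}^n \underline{x}_l \le X \le \sum_{l=1}^n \overline{x}_l$, each DER $i$ can asymptotically calculate its contribution as

$$x_i = \underline{x}_i + \gamma (\overline{x}_i - \underline{x}_i) \tag{6}$$

where for all $i$

$$\gamma = \lim_{k \to \infty} \frac{y_i[k]}{z_i[k]} = \frac{X - \sum_{l=1}^n \underline{x}_l}{\sum_{l=1}^n (\overline{x}_l - \underline{x}_l)}. \tag{7}$$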


It is important to note that the algorithm in (4)-(7) also serves as a primitive for the algorithm proposed in Dominguez-Garcia et al. (2012a), which solves the problem in (1) when the cost function is quadratic. Also, the algorithm in (4)-(7) is not resilient to packet-dropping communication links or imperfect synchronization among the DERs, which makes it difficult to implement in practice; however, there are robustified variants of this algorithm that address these issues (Dominguez-Garcia et al. 2012b) and have been demonstrated to work in practice (Dominguez-Garcia et al. 2012a).
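
The ratio-consensus iteration can be simulated in a few lines. The sketch below assumes the usual form of the update, in which each node sums the states of its in-neighbors (itself included), each weighted by the inverse of the sender's out-degree; the ratio $y_i[k] / z_i[k]$ then gives the common fraction used in the fair-splitting rule. It assumes an ideal network (no packet drops, synchronous updates). The four-node graph, the capacity limits, and all names are hypothetical.

```python
def ratio_consensus(receives_from, lo, hi, X, leaders, iters=200):
    """Simulate ratio consensus on a strongly connected directed graph.

    receives_from[i] : nodes whose values node i receives (must include i)
    leaders          : nodes the aggregator communicates with directly
    Returns each node's locally computed fair-splitting contribution.
    """
    n = len(lo)
    out_deg = [0] * n
    for i in range(n):
        for j in receives_from[i]:
            out_deg[j] += 1  # node j transmits to node i

    m = len(leaders)
    y = [(X / m if i in leaders else 0.0) - lo[i] for i in range(n)]
    z = [hi[i] - lo[i] for i in range(n)]

    for _ in range(iters):
        # Both states use the same weights 1 / out_deg[j].
        y = [sum(y[j] / out_deg[j] for j in receives_from[i]) for i in range(n)]
        z = [sum(z[j] / out_deg[j] for j in receives_from[i]) for i in range(n)]

    return [lo[i] + (y[i] / z[i]) * (hi[i] - lo[i]) for i in range(n)]


# Illustrative directed ring 0 -> 1 -> 2 -> 3 -> 0 with an extra edge 2 -> 0.
graph = {0: [0, 3, 2], 1: [1, 0], 2: [2, 1], 3: [3, 2]}
x_rc = ratio_consensus(graph, lo=[1.0, 2.0, 0.5, 0.5],
                       hi=[3.0, 5.0, 1.5, 2.5], X=8.0, leaders={0})
print(x_rc)
```

Here the aggregate limits are 4 and 12, so a request of 8 corresponds to a fraction of 0.5, and every node should converge to the midpoint of its own capacity interval even though only node 0 knows the value of X.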


Cross-References 
Abstract

Modeling of credit risk is concerned with constructing and studying formal models of time evolution of credit ratings (credit migrations) in a pool of credit names, and with studying various properties of such models. In particular, this involves modeling and studying default times and their functionals.


Keywords 
Introduction

Modeling of credit risk is concerned with constructing and studying formal models of time evolution of credit ratings (credit migrations) in a pool of $N$ credit names (obligors), and with studying various properties of such models. In particular, this involves modeling and studying default times and their functionals. In many ways, modeling techniques used in credit risk are similar to modeling techniques used in reliability theory. Here, we focus on modeling in continuous time.

Models of credit risk are used for the purpose of valuation and hedging of credit derivatives, for valuation and hedging of counter-party risk, for assessment of systemic risk in an economy, or for constructing optimal trading strategies involving credit-sensitive financial instruments, among other uses.

Evolution of credit ratings for a single obligor, labeled as $i$, where $i \in \{1, \ldots, N\}$, can be modeled in many possible ways. One popular possibility is to model credit migrations in terms of a jump process, say $C^i = (C^i_t)_{t \ge 0}$, taking values in a finite set, say $\mathcal{K}^i := \{0, 1, 2, \ldots, K^i - 1, K^i\}$, representing credit ratings assigned to obligor $i$. Typically, the rating state $K^i$ represents the state of default of the $i$-th obligor, and typically it is assumed that process $C^i$ is absorbed at state $K^i$.

Frequently, the case when $K^i = 1$, that is $\mathcal{K}^i := \{0, 1\}$, is considered. In this case, one is only concerned with the jump from the pre-default state 0 to the default state 1, which is usually assumed to be absorbing, the assumption made here as well. It is assumed that process $C^i$ starts from state 0. The (random) time of jump of process $C^i$ from state 0 to state 1 is called the default time, and is denoted as $\tau^i$. Process $C^i$ is now the same as the indicator process of $\tau^i$, which is denoted as $H^i$ and defined as $H^i_t = \mathbb{1}_{\{\tau^i \le t\}}$, for $t \ge 0$. Consequently, modeling of the process $C^i$ is equivalent to modeling of the default time $\tau^i$.

The ultimate goal of credit risk modeling is to provide a feasible
mathematical and computational methodology for modeling the evolution
of the multivariate credit migration process $C := (C^1, \ldots, C^N)$,
so that relevant functionals of such processes can be computed
efficiently. The simplest example of such a functional is
$P(C_{t_j} \in A_j,\ j = 1, 2, \ldots, J \mid \mathcal{G}_s)$,
representing the conditional probability, given the information
$\mathcal{G}_s$ at time $s \ge 0$, that process $C$ takes values in
the set $A_j$ at time $t_j \ge 0$, $j = 1, 2, \ldots, J$. In
particular, in case of modeling of the default times $\tau^i$,
$i = 1, 2, \ldots, N$, one is concerned with computing conditional
survival probabilities
$P(\tau^1 > t_1, \ldots, \tau^N > t_N \mid \mathcal{G}_s)$, which are
the same as probabilities
$P(H^i_{t_i} = 0,\ i = 1, 2, \ldots, N \mid \mathcal{G}_s)$.

Based on that, one can compute more complicated functionals that
naturally occur in the context of valuation and hedging of credit
risk-sensitive financial instruments, such as corporate (defaultable)
bonds, credit default swaps, credit spread options, collateralized
bond obligations, and asset-based securities, for example.


Modeling of Single Default Time 
Using Conditional Density 

Traditionally, there were two main approaches to 
modeling default times: the structural approach 
and the reduced approach, also known as the 
hazard process approach. The main features of 
both these approaches are presented in Bielecki 
and Rutkowski (2004). 

We focus here on modeling a single default time, denoted as $\tau$,
using the so-called conditional density approach of El Karoui et al.
(2010). This approach allows for extension of results that can be
derived using the reduced approach.

The default time $\tau$ is a strictly positive random variable defined
on the underlying probability space $(\Omega, \mathcal{F}, P)$, which
is endowed with a reference filtration, say
$\mathbb{F} = (\mathcal{F}_t)_{t \ge 0}$, representing the flow of all
(relevant) market information available in the model, not including
information about occurrence of $\tau$. The information about
occurrence of $\tau$ is carried by the (right continuous) filtration
$\mathbb{H}$ generated by the indicator process
$H := (H_t = 1_{\{\tau \le t\}})_{t \ge 0}$. The full information in
the model is represented by filtration
$\mathbb{G} := \mathbb{F} \vee \mathbb{H}$. It is postulated that

$$P(\tau \in d\theta \mid \mathcal{F}_t) = \alpha_t(\theta)\, d\theta,$$

for some random field $\alpha_\cdot(\cdot)$, such that
$\alpha_t(\cdot)$ is $\mathcal{F}_t \otimes \mathcal{B}(\mathbb{R}_+)$
measurable for each $t$. The family $\alpha_t(\cdot)$ is called the
$\mathbb{F}$-conditional density of $\tau$. In particular,
$P(\tau > \theta) = \int_\theta^\infty \alpha_0(u)\, du$. The
following survival processes are associated with $\tau$:

• $S_t(\theta) := P(\tau > \theta \mid \mathcal{F}_t) = \int_\theta^\infty \alpha_t(u)\, du$,
which is an $\mathbb{F}$-martingale,

• $S_t := S_t(t) = P(\tau > t \mid \mathcal{F}_t)$, which is an
$\mathbb{F}$-supermartingale (Azéma supermartingale).

In particular, $S_0(\theta) = P(\tau > \theta) = \int_\theta^\infty \alpha_0(u)\, du$,
and $S_t(0) = S_0 = 1$.
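The relation between the initial density and the initial survival function can be checked numerically. The following is a minimal sketch, assuming a hypothetical exponential density $\alpha_0(u) = \lambda e^{-\lambda u}$ with $\lambda = 0.5$ and a truncated upper integration limit; neither choice comes from the text.

```python
import math

def survival_from_density(alpha0, theta, upper=60.0, steps=60_000):
    """Approximate S_0(theta) = integral over [theta, inf) of alpha0(u) du.

    The trapezoid rule is used and the infinite upper limit is truncated
    at `upper` (an assumption of this sketch).
    """
    if theta >= upper:
        return 0.0
    h = (upper - theta) / steps
    total = 0.5 * (alpha0(theta) + alpha0(upper))
    for n in range(1, steps):
        total += alpha0(theta + n * h)
    return total * h

rate = 0.5  # hypothetical constant default intensity

def alpha0(u):
    # Hypothetical initial density of the default time.
    return rate * math.exp(-rate * u)

s0_at_zero = survival_from_density(alpha0, 0.0)  # should be close to S_0 = 1
s0_at_two = survival_from_density(alpha0, 2.0)   # should be close to exp(-1)
```

For this density the exact survival function is $e^{-\lambda \theta}$, so the two values can be compared against 1 and $e^{-1}$.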

As an example of computations that can be 
done using the conditional density approach 
we give the following result, in which notation 
“bd” and “ad” stand for before default and 
at-or-after default, respectively. 
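Theorem 1 Let $Y_T(\tau)$ be an $\mathcal{F}_T \vee \sigma(\tau)$
measurable and bounded random variable. Then

$$E(Y_T(\tau) \mid \mathcal{G}_t) = Y_t^{bd}\, 1_{\{t < \tau\}} + Y_t^{ad}(T, \tau)\, 1_{\{t \ge \tau\}},$$

where

$$Y_t^{bd} = \frac{E\left(\int_t^\infty Y_T(\theta)\, \alpha_T(\theta)\, d\theta \,\middle|\, \mathcal{F}_t\right)}{S_t}\, 1_{\{S_t > 0\}},$$

and

$$Y_t^{ad}(T, \theta) = \frac{E(Y_T(\theta)\, \alpha_T(\theta) \mid \mathcal{F}_t)}{\alpha_t(\theta)}\, 1_{\{\alpha_t(\theta) > 0\}}.$$

There is an interesting connection between the conditional density
process and the so-called default intensity processes, which are among
the main objects used in the reduced approach. This connection starts
with the following result.

Theorem 2 (i) The Doob-Meyer (additive) decomposition of the survival
process $S$ is given as

$$S_t = 1 + M_t^{\mathbb{F}} - \int_0^t \alpha_u(u)\, du,$$

where
$M_t^{\mathbb{F}} = -\int_0^t (\alpha_t(u) - \alpha_u(u))\, du = E\left(\int_0^\infty \alpha_u(u)\, du \,\middle|\, \mathcal{F}_t\right) - 1$.

(ii) Let $\xi := \inf\{t \ge 0 : S_{t-} = 0\}$. Define
$\lambda_t^{\mathbb{F}} = \alpha_t(t)/S_{t-}$ for $t < \xi$ and
$\lambda_t^{\mathbb{F}} = \lambda_{t \wedge \xi}^{\mathbb{F}}$ for
$t \ge \xi$. Then, the multiplicative decomposition of $S$ is given as

$$S_t = L_t^{\mathbb{F}}\, e^{-\int_0^t \lambda_u^{\mathbb{F}}\, du},$$

where

$$dL_t^{\mathbb{F}} = e^{\int_0^t \lambda_u^{\mathbb{F}}\, du}\, dM_t^{\mathbb{F}}, \qquad L_0^{\mathbb{F}} = 1.$$

The process $\lambda^{\mathbb{F}}$ is called the $\mathbb{F}$-intensity
of $\tau$.

The $\mathbb{G}$-compensator of $\tau$ is the $\mathbb{G}$-predictable
increasing process $\Lambda^{\mathbb{G}}$ such that the process

$$M_t^{\mathbb{G}} = H_t - \Lambda_t^{\mathbb{G}}$$

is a $\mathbb{G}$-martingale. If $\Lambda^{\mathbb{G}}$ is absolutely
continuous, the $\mathbb{G}$-adapted process $\lambda^{\mathbb{G}}$
such that

$$\Lambda_t^{\mathbb{G}} = \int_0^t \lambda_u^{\mathbb{G}}\, du$$

is called the $\mathbb{G}$-intensity of $\tau$. The
$\mathbb{G}$-compensator is stopped at $\tau$, i.e.,
$\Lambda_t^{\mathbb{G}} = \Lambda_{t \wedge \tau}^{\mathbb{G}}$. Hence,
$\lambda_t^{\mathbb{G}} = 0$ when $t > \tau$. In particular, we have

$$\lambda_t^{\mathbb{G}} = 1_{\{t < \tau\}}\, \lambda_t^{\mathbb{F}} = (1 - H_t)\, \lambda_t^{\mathbb{F}}.$$

The conditional density process and the $\mathbb{G}$-intensity of
$\tau$ are related as follows: For any $t < \xi$ and $\theta \ge t$ we
have

$$\alpha_t(\theta) = E(\lambda_\theta^{\mathbb{G}} \mid \mathcal{F}_t).$$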

Example 1 This is a structural-model-like example.

• Suppose $\mathbb{F} = \mathbb{F}^X$ is a filtration of a default
driver process, say $X$, and $\Theta$ is the default barrier assumed
to be independent of $X$. Denote $G(t) = P(\Theta > t)$.

• Define

$$\tau := \inf\{t \ge 0 : \Gamma_t \ge \Theta\},$$

with $\Gamma_t := \sup_{s \le t} X_s$. We then have
$S_t(\theta) = G(\Gamma_\theta)$ if $\theta \le t$ and
$S_t(\theta) = E(G(\Gamma_\theta) \mid \mathcal{F}_t^X)$ if
$\theta > t$.

• Assume that $F = 1 - G$ and $\Gamma$ are absolutely continuous
w.r.t. Lebesgue measure, with respective densities $f$ and $\gamma$.
We then have

$$\alpha_t(\theta) = f(\Gamma_\theta)\, \gamma_\theta = \alpha_\theta(\theta), \qquad t \ge \theta,$$

and the $\mathbb{F}$-intensity of $\tau$ is

$$\lambda_t = \frac{\alpha_t(t)}{G(\Gamma_t)} = \frac{\alpha_t(t)}{S_t}.$$

• In particular, if $\Theta$ is a unit exponential r.v., that is, if
$G(t) = e^{-t}$ for $t \ge 0$, then we have that

$$\lambda_t = \gamma_t = \frac{\alpha_t(t)}{S_t}.$$
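Example 2 This is a reduced-form-like example.

• Suppose $S$ is a strictly positive process. Then, the
$\mathbb{F}$-hazard process of $\tau$ is denoted by
$\Gamma^{\mathbb{F}}$ and is given as

$$\Gamma_t^{\mathbb{F}} = -\ln S_t, \qquad t \ge 0.$$

In other words,

$$S_t = e^{-\Gamma_t^{\mathbb{F}}}, \qquad t \ge 0.$$

• In particular, if $\Gamma^{\mathbb{F}}$ is absolutely continuous,
that is, $\Gamma_t^{\mathbb{F}} = \int_0^t \gamma_u^{\mathbb{F}}\, du$,
then

$$S_t = e^{-\int_0^t \gamma_u^{\mathbb{F}}\, du}, \qquad t \ge 0,$$

and

$$\alpha_t(\theta) = \gamma_\theta^{\mathbb{F}}\, S_\theta, \qquad t \ge \theta.$$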
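The relations of Example 2 can be illustrated with a deterministic hazard rate, which corresponds to a trivial reference filtration. The sketch below assumes a hypothetical rate $\gamma_u = 0.1 + 0.05u$ (not from the text), builds $S_t$ and the density $\gamma_\theta S_\theta$, and integrates the density over $[0, T]$ so it can be compared with $1 - S_T$.

```python
import math

def hazard_rate(u):
    # Hypothetical deterministic hazard rate gamma_u.
    return 0.1 + 0.05 * u

def trapezoid(func, lower, upper, steps):
    """Composite trapezoid rule for the integral of func over [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for n in range(1, steps):
        total += func(lower + n * h)
    return total * h

def survival(t):
    # S_t = exp(-Gamma_t), with Gamma_t the integrated hazard rate.
    return math.exp(-trapezoid(hazard_rate, 0.0, t, 200))

def density(theta):
    # Density of the default time at theta: gamma_theta * S_theta.
    return hazard_rate(theta) * survival(theta)

horizon = 5.0
default_probability = trapezoid(density, 0.0, horizon, 500)
survival_at_horizon = survival(horizon)
```

Since the density is minus the derivative of the survival function, `default_probability` should agree with `1 - survival_at_horizon` up to discretization error.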

Modeling Evolution of Credit Ratings 
Using Markov Copulae 

The key goal in modeling of the joint migration process $C$ is that
the distributional laws of the individual migration components $C^i$,
$i \in \{1, \ldots, N\}$, agree with given (predetermined) laws. The
reason for this is that the marginal laws of $C$, that is, the laws of
$C^i$, $i \in \{1, \ldots, N\}$, can be calibrated from market quotes
for prices of individual (as opposed to basket) credit derivatives,
such as the credit default swaps, and thus, the marginals of $C$
should have laws agreeing with the market data.

One way of achieving this goal is to model $C$ as a Markov chain
satisfying the so-called Markov copula property. For brevity we
present here the simplest such model, in which the reference
filtration $\mathbb{F}$ is trivial, assuming additionally, but without
loss of generality, that $N = 2$ and that
$\mathcal{K}^1 = \mathcal{K}^2 = \mathcal{K} := \{0, 1, \ldots, K\}$.

Here we focus on the case of the so-called 
strong Markov copula property, which is reflected 
in Theorem 3. 

Let us consider two Markov chains $Z^1$ and $Z^2$ on
$(\Omega, \mathcal{F}, P)$, taking values in a finite state space
$\mathcal{K}$, and with the infinitesimal generators
$A^1 := [a^1_{ij}]$ and $A^2 := [a^2_{hk}]$, respectively.

Consider the system of linear algebraic equations in unknowns
$a^C_{ih,jk}$:

$$\sum_{k \in \mathcal{K}} a^C_{ih,jk} = a^1_{ij}, \qquad \forall\, i, j, h \in \mathcal{K},\ i \ne j, \qquad (1)$$

$$\sum_{j \in \mathcal{K}} a^C_{ih,jk} = a^2_{hk}, \qquad \forall\, i, h, k \in \mathcal{K},\ h \ne k. \qquad (2)$$

It can be shown that this system admits at least one positive
solution.

Theorem 3 Consider an arbitrary positive solution of the system
(1)-(2). Then the matrix
$A^C = [a^C_{ih,jk}]_{i,h,j,k \in \mathcal{K}}$ (where diagonal
elements are defined appropriately) satisfies the conditions for a
generator matrix of a bivariate time-homogeneous Markov chain, say
$C = (C^1, C^2)$, whose components are Markov chains in the filtration
of $C$ and with the same laws as $Z^1$ and $Z^2$.

Consequently, the system (1)-(2) serves as a Markov copula between the
Markovian margins $C^1$, $C^2$ and the bivariate Markov chain $C$.

Note that the system (1)-(2) can contain more unknowns than the number
of equations, therefore being underdetermined, which is a crucial
feature for the ability to calibrate the joint migration process $C$
to marginal market data.

Example 3 This example illustrates modeling 
joint defaults using strong Markov copula theory. 

Let us consider two processes, $Z^1$ and $Z^2$, that are
time-homogeneous Markov chains, each taking values in the state space
$\{0, 1\}$, with respective generators (rows and columns indexed by the
states 0 and 1)

$$A^1 = \begin{pmatrix} -(a + c) & a + c \\ 0 & 0 \end{pmatrix} \qquad (3)$$

and

$$A^2 = \begin{pmatrix} -(b + c) & b + c \\ 0 & 0 \end{pmatrix} \qquad (4)$$

for $a, b, c \ge 0$.

The off-diagonal elements of the matrix $A^C$ below satisfy the system
(1)-(2),

$$A^C = \begin{array}{c|cccc}
 & (0,0) & (0,1) & (1,0) & (1,1) \\ \hline
(0,0) & -(a + b + c) & b & a & c \\
(0,1) & 0 & -(a + c) & 0 & a + c \\
(1,0) & 0 & 0 & -(b + c) & b + c \\
(1,1) & 0 & 0 & 0 & 0
\end{array}$$

Thus, matrix $A^C$ generates a Markovian joint migration process
$C = (C^1, C^2)$, whose components $C^1$ and $C^2$ model individual
default with prescribed default intensities $a + c$ and $b + c$,
respectively.
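The claim that the off-diagonal entries of $A^C$ satisfy the system (1)-(2) can be verified mechanically. The following is a minimal sketch; the function names and the numerical values of $a$, $b$, $c$ are illustrative assumptions, not part of the source.

```python
from itertools import product

def joint_generator(a, b, c):
    """Generator A^C of Example 3, keyed by (from_state, to_state) pairs."""
    states = list(product((0, 1), repeat=2))
    rates = {
        ((0, 0), (0, 1)): b,      # only obligor 2 defaults
        ((0, 0), (1, 0)): a,      # only obligor 1 defaults
        ((0, 0), (1, 1)): c,      # both default simultaneously
        ((0, 1), (1, 1)): a + c,
        ((1, 0), (1, 1)): b + c,
    }
    gen = {(s, t): rates.get((s, t), 0.0)
           for s in states for t in states if s != t}
    for s in states:
        # Diagonal entries make every row sum to zero.
        gen[(s, s)] = -sum(gen[(s, t)] for t in states if t != s)
    return gen

def satisfies_copula_system(gen, gen1, gen2, levels=(0, 1), tol=1e-12):
    """Check equations (1) and (2) for the joint generator `gen`."""
    for i, j, h in product(levels, repeat=3):
        if i != j:  # equation (1): sum over k of a^C_{ih,jk} = a^1_{ij}
            total = sum(gen[((i, h), (j, k))] for k in levels)
            if abs(total - gen1[i][j]) > tol:
                return False
    for i, h, k in product(levels, repeat=3):
        if h != k:  # equation (2): sum over j of a^C_{ih,jk} = a^2_{hk}
            total = sum(gen[((i, h), (j, k))] for j in levels)
            if abs(total - gen2[h][k]) > tol:
                return False
    return True

a, b, c = 0.02, 0.03, 0.01  # hypothetical intensities
gen1 = [[-(a + c), a + c], [0.0, 0.0]]
gen2 = [[-(b + c), b + c], [0.0, 0.0]]
joint = joint_generator(a, b, c)
copula_ok = satisfies_copula_system(joint, gen1, gen2)
```

The parameter $c$ controls the rate of simultaneous default; the check confirms that it enters both marginal intensities without altering the marginal laws.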

For more information about Markov copulae and about their applications
in credit risk, we refer to Bielecki et al. (2013).

Summary and Future Directions 

The future directions in development and applications of credit risk
models are comprehensively laid out in the recent volume Bielecki
et al. (2011). One additional future direction is modeling of systemic
risk.
Abstract

In tracking applications, following the signal 
detection process that yields measurements, there 
is a procedure that selects the measurement(s) 
to be incorporated into the state estimator 
- this is called data association (DA). In 
multitarget-multisensor tracking systems, there 
are generally three classes of data association: 
specifically, measurement-to-track association 
(M2TA), track-to-track association (T2TA), 
and measurement-to-measurement association 
(M2MA). M2TA is the process of associating 
each measurement from a list (originating from 
one or more sensors) to a new or existing track. 
T2TA is the process of associating multiple 
existing tracks (from multiple sensors or from 
different periods in time), generally with the 
intent of fusing them afterward. M2MA is 
the process of associating measurements from 
different sensors in order to form “composite 
measurements” and/or do track initialization. The 
processes of M2TA and T2TA will be discussed 
in more detail here, while details on M2MA can 
be found in Bar-Shalom et al. (2011). 


Keywords 

Clutter; Measurement origin uncertainty; 
Measurement validation; Persistent interference; 
Tracking 

Introduction 

In a radar the “return” from the target of interest 
is sought within a time interval determined by the 
anticipated range of the target when it reflects the 
energy transmitted by the radar: a “range gate” is 
set up and the detection(s) within this gate can be 
associated with the target of interest. 

In general the measurements have a higher 
dimension: 

• Range, azimuth (bearing), elevation, or direction sines for radar,
possibly also range rate

• Bearing and frequency (when the signal is narrow band) or time
difference of arrival and frequency difference in passive sonar

• Two line-of-sight angles or direction sines for optical or passive
electromagnetic sensors

Then a multidimensional gate is set up for detecting the signal from
the target. This is done to avoid searching for the signal from the
target of interest in the entire measurement space. A measurement in
the gate, while not guaranteed to have originated from the target the
gate pertains to, is a valid association candidate; thus, the name
validation region or association region. If there is more than one
detection (measurement) in the gate, this leads to an association
uncertainty.

In the discussion to follow, it will be assumed that one has point
measurements rather than measurements distributed over several
resolution cells of the sensor, as in the case of an extended target.

Similar validation has to be carried out in 
T2TA. 


Validation Region 

In view of the variety of variables that can be measured, a generic
gating (or validation or association) procedure for continuous-valued
measurements is discussed.

Consider a target that is in track, i.e., its filter has been
initialized. Then, according to Sect. 5.2.3 of Bar-Shalom et al.
(2001), one has the predicted value (mean) of the measurement
$\hat{z}(k+1|k)$ and the associated covariance $S(k+1)$.

Assumption. The true measurement conditioned on the past is normally
(Gaussian) distributed with its probability density function (pdf)
given by

$$p[z(k+1) \mid Z^k] = \mathcal{N}[z(k+1); \hat{z}(k+1|k), S(k+1)] \qquad (1)$$

where $S(k+1)$ is the innovation (residual) covariance matrix and $z$
is the true measurement. (The notation $\mathcal{N}(x; \mu, S)$ stands
for the normal (Gaussian) pdf with the argument (vector) random
variable $x$, mean $\mu$, and covariance matrix $S$. The reason for
the use of the designation "normal" is to distinguish this omnipresent
pdf from all the others (abnormal).)

Then the true measurement will be in the following region:

$$\mathcal{V}(k+1, \gamma) = \{z : d^2 \le \gamma\} \qquad (2)$$

with probability determined by the gate threshold $\gamma$ and

$$d^2 = [z - \hat{z}(k+1|k)]'\, S(k+1)^{-1}\, [z - \hat{z}(k+1|k)] \qquad (3)$$

This distance metric, $d^2$, is referred to in the literature as the
normalized innovation squared (NIS), statistical distance squared,
Mahalanobis distance, or chi-square distance.

The region defined by (2) is called the gate or validation region
(hence, the notation $\mathcal{V}$) or association region. It is also
known as the ellipse (or ellipsoid) of probability concentration, the
region of minimum volume that contains a given probability mass under
the Gaussian assumption. The semiaxes of the ellipsoid (2) are the
square roots of the eigenvalues of $\gamma S$. The threshold $\gamma$
is obtained from tables of the chi-square distribution since the
quadratic form (3) that defines the validation region in (2) is
chi-square distributed with number of degrees of freedom equal to the
dimension $n_z$ of the measurement.
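The gating rule in (2)-(3) can be sketched for a two-dimensional measurement as follows. This is a minimal illustration, not a full tracker: the function names, the predicted measurement, the innovation covariance, and the candidate measurements are all hypothetical values chosen for the example.

```python
def nis(z, z_pred, S):
    """Normalized innovation squared d^2 of (3) for a 2-D measurement.

    S is the 2x2 innovation covariance given as nested sequences.
    """
    (s11, s12), (s21, s22) = S
    det = s11 * s22 - s12 * s21
    if det <= 0.0:
        raise ValueError("S must be positive definite")
    v0 = z[0] - z_pred[0]
    v1 = z[1] - z_pred[1]
    # S^{-1} = (1/det) * [[s22, -s12], [-s21, s11]]
    return (v0 * (s22 * v0 - s12 * v1) + v1 * (-s21 * v0 + s11 * v1)) / det

def validate(measurements, z_pred, S, gamma):
    """Return the measurements that fall inside the gate (2)."""
    return [z for z in measurements if nis(z, z_pred, S) <= gamma]

z_pred = (0.0, 0.0)                  # hypothetical predicted measurement
S = ((4.0, 0.0), (0.0, 1.0))         # hypothetical innovation covariance
gamma = 9.2                          # threshold for a 0.99 gate when n_z = 2
candidates = [(1.0, 1.0), (5.0, 0.0), (0.0, 3.5), (6.0, 2.0)]
validated = validate(candidates, z_pred, S, gamma)
```

With these numbers the first two candidates have $d^2 = 1.25$ and $6.25$ and are validated, while the last two have $d^2 = 12.25$ and $13$ and fall outside the gate.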

Table 1 gives the gate probability (the notation $P\{\cdot\}$ is used
to denote the probability of event $\{\cdot\}$)

$$P_G = P\{z(k+1) \in \mathcal{V}(k+1, \gamma)\} \qquad (4)$$

or the "probability that the (true) measurement will fall in the gate"
for various values $\gamma$ and dimensions $n_z$ of the measurement.
The square root $g = \sqrt{\gamma}$ is sometimes referred to as the
"number of sigmas" (standard deviations) of the gate. This, however,
does not fully define the probability mass in the gate as can be seen
from Table 1.
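The gate probabilities can also be computed directly instead of being read from a table. The sketch below uses the closed-form chi-square distribution functions for $n_z = 1, 2, 3$ only; the function name is an assumption of this example, and higher dimensions would need a general incomplete gamma routine.

```python
import math

def gate_probability(gamma, n_z):
    """P_G of (4): probability that a chi-square variable with n_z
    degrees of freedom does not exceed the gate threshold gamma."""
    if gamma < 0.0:
        raise ValueError("gamma must be nonnegative")
    if n_z == 1:
        return math.erf(math.sqrt(gamma / 2.0))
    if n_z == 2:
        return 1.0 - math.exp(-gamma / 2.0)
    if n_z == 3:
        return (math.erf(math.sqrt(gamma / 2.0))
                - math.sqrt(2.0 * gamma / math.pi) * math.exp(-gamma / 2.0))
    raise ValueError("this sketch covers n_z = 1, 2, 3 only")

# The same "number of sigmas" g = 3 (gamma = 9) gives different masses.
p_three_sigma = [gate_probability(9.0, n_z) for n_z in (1, 2, 3)]
```

Evaluating the list shows why $g$ alone does not determine the probability mass: for $\gamma = 9$ the gate probability decreases as the measurement dimension grows.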

Remark 1 It should be pointed out that thresholding in a detector is
also a form of gating: only a signal above a certain intensity level
(at the end of the signal processing chain) is accepted as a detection
and then one has a measurement. In this case the "gate" is the
interval $[\tau, \infty)$ in the signal intensity space, where $\tau$
is the detection threshold.

A Single Target in Clutter 

The validation procedure limits the region in 
the measurement space where the information 
processor will “look” to find the measurement 
from the target of interest. In spite of this, it can 
happen that more than one detection , i.e., several 
measurements, will be found in the validation 
region. 
Table 1 Gate thresholds and the probability mass $P_G$ in the gate

| $\gamma$  | 1     | 4     | 6.6  | 9     | 9.2  | 11.4 | 16      | 25      |
| $g$       | 1     | 2     | 2.57 | 3     | 3.03 | 3.38 | 4       | 5       |
| $n_z = 1$ | 0.683 | 0.954 | 0.99 | 0.997 |      |      | 0.99994 | 1       |
| $n_z = 2$ | 0.393 | 0.865 |      | 0.989 | 0.99 |      | 0.9997  | 1       |
| $n_z = 3$ | 0.199 | 0.739 |      | 0.971 |      | 0.99 | 0.9989  | 0.99998 |


Measurements outside the validation region 
can be ignored: they are “too far” and thus very 
unlikely to have originated from the target of 
interest. This holds if the gate probability is close 
to unity and the model used to obtain the gate is 
correct. 

The problem of tracking a single target in 
clutter considers the situation where there are 
possibly several measurements in the validation 
region (gate) of a target. The set of validated 
measurements consists of: 

• The correct measurement (if detected and it 
fell in the gate) 

• The undesirable measurements: clutter or false 
alarm originated 

In practice detections are obtained by thresholding the signal
received by the sensor after processing it. This is the simplest
(binary) way of using a target feature, its intensity. More
sophisticated ways of using such feature information can be found in
Bar-Shalom et al. (2011).

It is assumed that the measurement contains 
all the information that could be used to discard 
the undesirable measurements. Therefore, any 
measurement that has been validated could have 
originated from the target of interest. 

A situation with a single-target track and several validated
measurements is depicted in Fig. 1. The (two-dimensional) validation
region is an ellipse centered at the predicted measurement
$\hat{z}^1$. The parameters of the ellipse are determined by the
covariance matrix $S$ of the innovation, which is assumed to be
Gaussian.

All the measurements in the validation region can be said to be not
too unlikely to have originated from the target of interest, even
though only one is assumed to be the true one.

Data Association, Fig. 1 Several measurements in the validation region
of a single track

The implication of the assumption that there is a single target is
that the undesirable measurements constitute a random interference.
The common mathematical model for such false measurements is that they
are:

• Uniformly spatially distributed

• Independent across time

This corresponds to what is known as residual clutter; the constant
clutter, if any, is assumed to have been removed.

Multiple Targets in Clutter 

The situation where there are several target tracks in the same
neighborhood as well as clutter (or false alarms) is more complicated.
Figure 2 illustrates such a case for a given time, with the predicted
measurements for the two targets considered denoted as $\hat{z}^1$ and
$\hat{z}^2$. In this figure the following measurement origins are
possible:

• $z_1$ from target 1 or clutter

• $z_2$ from either target 1 or target 2 or clutter

• $z_3$ and $z_4$ from target 2 or clutter

However, if $z_2$ originated from target 2, then it is quite likely
that $z_1$ originated from target 1. This illustrates the
interdependence of the associations in a situation where a persistent
interference (neighboring target) is present in addition to random
interference (clutter).
Data Association, Fig. 2 Two tracks with a measurement in the
intersection of their validation regions

Up to this point, it was assumed that a measurement could have
originated from one of the targets or from clutter. However, in view
of the fact that any signal processing system has an inherent finite
resolution capability, an additional possibility has to be considered:

$z_2$ could be the result of the merging of the detections from the
two targets; it is an unresolved measurement.

This constitutes a fourth origin hypothesis for a measurement that
lies in the intersection of two validation regions. Most tracking
algorithms ignore the possibility that a measurement is an unresolved
one.

This illustrates only the difficulty of association of measurements to
tracks at one point in time. The full problem, as will be discussed
later, consists of associating measurements across time.

Approaches to Tracking and Data 
Association 

The problem of tracking and data association is a hybrid problem
because it is characterized by:

(1) Continuous uncertainties - state estimation in the presence of
continuous noises

(2) Discrete uncertainties - which measurement(s) should be used in
the estimation process

Assuming the goal is to obtain the MMSE estimate of the target state,
i.e., its conditional mean, one can distinguish the following
approaches.

Pure MMSE Approach 

The Pure MMSE Approach to tracking and data association is obtained
using the smoothing property of expectations (see, e.g., Bar-Shalom
et al. 2001, Sect. 1.4.12), as follows:

$$\hat{x}^{\mathrm{MMSE}} = E[x \mid Z] = E\{E[x \mid A, Z] \mid Z\} = \sum_{A_i \in \mathcal{A}} E[x \mid A_i, Z]\, P\{A_i \mid Z\} \qquad (5)$$

where $A$ is an association event (assuming a Bayesian model, with
prior probabilities from which one can calculate posterior
probabilities), and the summation is over all events $A_i$ in the set
$\mathcal{A}$ of mutually exclusive and exhaustive association events.

The above, which requires the evaluation of all the conditional
(posterior) probabilities $P\{A_i \mid Z\}$, is a direct consequence
of the total probability theorem (see, e.g., Bar-Shalom et al. 2001,
Sect. 1.4.10), which yields the conditional pdf of the state as the
following mixture

$$p(x \mid Z) = \sum_{A_i \in \mathcal{A}} p(x \mid A_i, Z)\, P\{A_i \mid Z\} \qquad (6)$$

In the linear-Gaussian case the above becomes a Gaussian mixture.
Algorithms that fall in this category are PDAF and JPDAF, see
Bar-Shalom et al. (2011).
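For a scalar state, the mean and variance of the mixture (6) follow from (5) and the standard moment-matching formulas. The sketch below is illustrative only: the event-conditioned means, variances, and posterior probabilities are hypothetical inputs that a real filter would supply.

```python
def mixture_moments(means, variances, probs):
    """Mean and variance of a scalar mixture sum_i probs[i] * pdf_i.

    means[i] and variances[i] are the moments conditioned on association
    event A_i; probs[i] is the posterior probability P{A_i | Z}.
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("event probabilities must sum to one")
    mean = sum(p * m for p, m in zip(probs, means))
    # Within-event variance plus the spread of the event-conditioned means.
    variance = sum(p * (v + (m - mean) ** 2)
                   for p, m, v in zip(probs, means, variances))
    return mean, variance

# Two equally likely association events with well-separated estimates.
soft_mean, soft_variance = mixture_moments(
    means=[0.0, 2.0], variances=[1.0, 1.0], probs=[0.5, 0.5])
```

Here the combined estimate is 1 with variance 2: the spread-of-the-means term adds the association uncertainty on top of the event-conditioned variance of 1.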

MMSE-MAP Approach 

The MMSE-MAP Approach, instead of enumerating and summing over all the
association events, selects the one with highest posterior
probability, namely,

$$A^{\mathrm{MAP}} = \arg\max_i P\{A_i \mid Z\} \qquad (7)$$

and then

$$\hat{x}^{\mathrm{MMSE\text{-}MAP}} = E[x \mid A^{\mathrm{MAP}}, Z] \qquad (8)$$

The HOMHT as proposed by Reid (1979) falls in this category, see
Bar-Shalom et al. (2011).

MMSE-ML Approach 

The MMSE-ML Approach does not assume priors for the association events
and relies on the maximum likelihood approach to select the event,
that is,

$$A^{\mathrm{ML}} = \arg\max_i p\{Z \mid A_i\} \qquad (9)$$

and then

$$\hat{x}^{\mathrm{MMSE\text{-}ML}} = E[x \mid A^{\mathrm{ML}}, Z] \qquad (10)$$

The TOMHT falls into this category and S-D assignment (or MDA) is an
implementation of this, see Bar-Shalom et al. (2011).

Heuristic Approaches 

There are numerous simpler/heuristic approaches. The most common one
relies on the distance metric (3) and makes the selection of which
measurement is associated with which track based on the "nearest
neighbor" rule. The same criterion can be used in a global cost
function.
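For a single track, the nearest neighbor rule reduces to picking the validated measurement with the smallest distance (3). The following sketch assumes the distances have already been computed; the function name and the sample values are hypothetical.

```python
def nearest_neighbor(nis_values, gamma):
    """Index of the measurement with the smallest d^2 inside the gate.

    nis_values[i] is the distance (3) of measurement i to the predicted
    measurement of one track. Returns None if no measurement is validated.
    """
    best_index = None
    best_value = gamma
    for index, d2 in enumerate(nis_values):
        if d2 <= best_value:
            best_index, best_value = index, d2
    return best_index

chosen = nearest_neighbor([7.1, 2.3, 12.0], gamma=9.2)
nothing_validated = nearest_neighbor([10.0, 12.0], gamma=9.2)
```

This per-track rule makes a hard decision and ignores conflicts between neighboring tracks; a global cost function built on the same distances would resolve such conflicts jointly.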

Remarks 

It should be noted that the MMSE-MAP estimate (8) and the MMSE-ML
estimate (10) are obtained assuming that the selected association is
correct, a hard decision. This hard decision is sometimes correct,
sometimes wrong. On the other hand, the pure MMSE estimate (5) yields
a soft decision: it averages over all the possibilities. This soft
decision is never totally correct, never totally wrong.

The uncertainties (covariances) associated with the MMSE-MAP and
MMSE-ML estimates might be optimistic in view of the above
observation. The uncertainty associated with the pure MMSE estimate
will be increased (realistically) in view of the fact that it includes
the data association uncertainty.

Estimation and Data Association 
in Nonlinear Stochastic Systems 

The Model 

Consider the discrete time stochastic system

$$x(k+1) = f[k, x(k), u(k), v(k)] \qquad (11)$$

where $x \in \mathbb{R}^n$ is the stacked state vector of the targets
under consideration, $u(k)$ is a known input (included here for the
sake of generality), and $v(k)$ is the process noise with a known pdf.
The measurements at time $k+1$ are described by the stacked vector

$$z(k+1) = h[k+1, x(k+1), A(k+1), w(k+1)] \qquad (12)$$

where $A(k+1)$ is the data association event at $k+1$ that specifies
(i) which measurement component originated from which components of
$x(k+1)$, namely, from which target, and (ii) which measurements are
false, that is, originated from the clutter process. The vector $w(k)$
is the observation noise, consisting of the error in the true
measurement and the false measurements. The pdf of the false
measurements and the probability mass function (pmf) of their number
are also assumed to be known.

The noise sequences and false measurements are assumed to be white
with known pdf and mutually independent. The initial state is assumed
to have a known pdf and to be independent of the noises. Additional
assumptions are given below for the optimal estimator, which evaluates
the pdf of the state conditioned on the observations.

The optimal state estimator in the presence of data association
uncertainty consists of the computation of the conditional pdf of the
state $x(k)$ given all the information available at time $k$, namely,
the prior information about the initial state, the intervening inputs,
and the sets of measurements through time $k$. The conditions under
which the optimal state estimator consists of the computation of this
pdf are presented in detail.
The Optimal Estimator for the Pure MMSE Approach

The information set available at $k$ is

$$I^k = \{Z^k, U^{k-1}\} \qquad (13)$$

where

$$Z^k = \{z(j)\}_{j=1}^{k} \qquad (14)$$

is the cumulative set of observations through time $k$, which subsumes
the initial information $Z^0$, and $U^{k-1}$ is the set of known
inputs prior to time $k$.

For a stochastic system, an information state 
(Striebel 1965) is a function of the available 
information set that summarizes the past of the 
system in a probabilistic sense. 

It can be shown that the conditional pdf of the state

$$p_k = p[x(k) \mid I^k] \qquad (15)$$

is an information state if (i) the two noise sequences (process and
measurement) are white and mutually independent and (ii) the target
detection and clutter/false measurement processes are white. Once the
conditional pdf (15) is available, the pure MMSE estimator, i.e., the
conditional mean, as well as the conditional variance, or covariance
matrix, can be obtained.

The optimal estimator, which consists of the recursive functional
relationship between the information states $p_{k+1}$ and $p_k$, is
given by

$$p_{k+1} = \psi[k+1, p_k, z(k+1), u(k)] \qquad (16)$$

where

$$\psi[k+1, p_k, z(k+1), u(k)] = \frac{1}{c} \sum_{i=1}^{M(k+1)} p[z(k+1) \mid x(k+1), A_i(k+1)] \cdot \int p[x(k+1) \mid x(k), u(k)]\, p_k\, dx(k)\; P\{A_i(k+1)\} \qquad (17)$$

is the transformation that maps $p_k$ into $p_{k+1}$; the integration
in (17) is over the range of $x(k)$ and $c$ is the normalization
constant.
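The structure of the recursion (17) is easiest to see in a finite-state analog, where the integral becomes a sum over a transition matrix. The sketch below is such an analog under stated assumptions: the state space is discretized into two cells, there is no input, and the likelihoods and event priors are hypothetical numbers.

```python
def update_information_state(p_k, transition, likelihoods, event_priors):
    """One step of a finite-state analog of the recursion (17).

    p_k[x]             : probability of state cell x at time k
    transition[x][y]   : probability of moving from cell x to cell y
    likelihoods[i][y]  : p[z(k+1) | x(k+1) = y, A_i(k+1)] for event A_i
    event_priors[i]    : P{A_i(k+1)}
    """
    n = len(p_k)
    # Prediction: the sum replaces the integral over x(k) in (17).
    predicted = [sum(transition[x][y] * p_k[x] for x in range(n))
                 for y in range(n)]
    # Weighted sum over the current association events.
    unnormalized = [
        predicted[y] * sum(prior * lik[y]
                           for lik, prior in zip(likelihoods, event_priors))
        for y in range(n)
    ]
    c = sum(unnormalized)  # normalization constant
    return [u / c for u in unnormalized]

p_k = [0.5, 0.5]
transition = [[1.0, 0.0], [0.0, 1.0]]    # hypothetical static target
likelihoods = [[0.9, 0.1],               # A_1: measurement is target-originated
               [0.5, 0.5]]               # A_2: measurement is clutter (flat)
event_priors = [0.8, 0.2]
p_next = update_information_state(p_k, transition, likelihoods, event_priors)
```

With these numbers the updated pmf is [0.82, 0.18]: the clutter hypothesis dilutes the information carried by the measurement compared with the value 0.9 that a certain association would give.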


The recursion (17) shows that the optimal MMSE estimator in the
presence of data association uncertainty has the following properties:

P1. The pdf $p_{k+1}$ is a weighted sum of pdfs, conditioned on the
current time association events $A_i(k+1)$, $i = 1, \ldots, M(k+1)$,
where $M(k+1)$ is the number of mutually exclusive and exhaustive
association events at time $k+1$.

P2. If the exact previous pdf, which is the sufficient statistic, is
available, then only the most recent association event probabilities
are needed at each time.

However, the number of terms of the mixture in the right-hand side of
(17) is, by time $k+1$, given by the product

$$M^{k+1} = \prod_{i=1}^{k+1} M(i) \qquad (18)$$

which amounts to an exponential increase in time. This increase is
similar to the increase in the number of the branches of the MHT
hypothesis tree.
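The growth described by (18) can be made concrete with a few lines of code. In this sketch the per-scan event counts are hypothetical.

```python
from math import prod

def mixture_terms(events_per_scan):
    """Number of mixture terms after the given scans, as in (18)."""
    return prod(events_per_scan)

# Hypothetical case: three association events at each of ten scans.
terms_after_ten_scans = mixture_terms([3] * 10)
growth = [mixture_terms([3] * k) for k in range(1, 6)]
```

Even with only three events per scan, the exact mixture has 59049 terms after ten scans, which is why practical algorithms merge or prune hypotheses.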

A detailed derivation of the recursion for the optimal estimator can
be found in Bar-Shalom et al. (2011).


Track-to-Track Association 

In addition to measurement-to-track association (M2TA), an additional
class of data association is track-to-track association (T2TA).
Following T2TA, track-to-track fusion (T2TF) may be performed to
(hopefully) improve the overall tracking accuracy. For more details on
track fusion, see Bar-Shalom et al. (2011).

It is desired first to test the hypothesis that 
two tracks pertain to the same target. The optimal 
test would require using the entire data base (the 
sequences of measurements that form the tracks) 
through the present time k and is not practical. 
In view of this, the test to be presented is based 
only on the most recent estimates from the tracks. 
The test based on the state estimates within a time 
window is discussed in Tian and Bar-Shalom 
(2009). 
Association of Tracks with Independent Errors

Let $\hat{x}^i(k)$ be the estimated state of a target by sensor $i$
with its own information processor. Assume that one has an estimate
$\hat{x}^j(k)$ from sensor $j$, corresponding to the same time. Both
can be current estimates or one can be a prediction as long as they
pertain to the same time (the second time argument has been omitted
for simplicity).

The corresponding covariances are denoted as $P^m(k)$, $m = i, j$. The
state estimation errors at different sensors (local trackers),

$$\tilde{x}^i(k) = x^i(k) - \hat{x}^i(k) \qquad (19)$$

$$\tilde{x}^j(k) = x^j(k) - \hat{x}^j(k) \qquad (20)$$

where $x^i$ and $x^j$ are the corresponding true states, are assumed
to be independent. This is the state estimation error independence
assumption.

Remark 2 As shown in the sequel, for independent sensors, the state
estimation errors for the same target are dependent in the presence of
process noise.

Denote the difference of the two estimates as

$$\hat{\Delta}^{ij}(k) = \hat{x}^i(k) - \hat{x}^j(k) \qquad (21)$$

This is the estimate of the difference of the true states

$$\Delta^{ij}(k) = x^i(k) - x^j(k) \qquad (22)$$

The same target hypothesis is that the true states are equal,

$$H_0 : \Delta^{ij}(k) = 0 \qquad (23)$$

while the different target alternative is

$$H_1 : \Delta^{ij}(k) \ne 0 \qquad (24)$$

The difference (21) is the appropriate statistic to test whether (22)
is zero or not; the rigorous proof of this fact is presented in
Bar-Shalom et al. (2011).

The error in the difference between the state estimates

$$\tilde{\Delta}^{ij}(k) = \Delta^{ij}(k) - \hat{\Delta}^{ij}(k) \qquad (25)$$

is zero mean and has covariance

$$T^{ij}(k) = E\{\tilde{\Delta}^{ij}(k)\, \tilde{\Delta}^{ij}(k)'\} = E\{[\tilde{x}^i(k) - \tilde{x}^j(k)][\tilde{x}^i(k) - \tilde{x}^j(k)]'\} \qquad (26)$$

given, under the error independence assumption, by

$$T^{ij}(k) = P^i(k) + P^j(k) \qquad (27)$$

Assuming the estimation errors to be Gaussian, the test of $H_0$ vs.
$H_1$ (the T2TA test) is

Accept $H_0$ if

$$D = \hat{\Delta}^{ij}(k)'\, [T^{ij}(k)]^{-1}\, \hat{\Delta}^{ij}(k) \le D_\alpha \qquad (28)$$

The threshold $D_\alpha$ is such that

$$P\{D > D_\alpha \mid H_0\} = \alpha \qquad (29)$$

where, e.g., $\alpha = 0.01$. From the Gaussian assumption, the
threshold is the $1 - \alpha$ point of the chi-square distribution
with $n_x$ degrees of freedom, where $n_x$ is the dimension of the
state vector (Bar-Shalom et al. 2001)

$$D_\alpha = \chi^2_{n_x}(1 - \alpha) \qquad (30)$$

Association of Tracks with Dependent Errors

In the previous section, the association testing was done under the
assumption that the estimation errors in these tracks are independent.
However, as shown in Bar-Shalom et al. (2011), whenever there is
process noise (or, in general, motion uncertainty), the track errors
based on data from independent sensors are dependent.

The dependence between the estimation errors $\tilde{x}^i(k|k)$ and
$\tilde{x}^j(k|k)$ from the two tracks arises from the common process
noise which contributes to both errors. This is due to the fact that
there is a common motion equation for both trackers.

The testing of the hypothesis that the two tracks under consideration
originated from the same target is done in the same manner as before,
except for the following modification to account for the dependence of
their state estimation errors.

The covariance associated with the difference of the estimates

$$\hat{\Delta}^{ij}(k) = \hat{x}^i(k|k) - \hat{x}^j(k|k) \qquad (31)$$

is, accounting for the dependence,

$$T^{ij}(k) = E\{\tilde{\Delta}^{ij}(k)\, \tilde{\Delta}^{ij}(k)'\} = E\{[\tilde{x}^i(k|k) - \tilde{x}^j(k|k)][\tilde{x}^i(k|k) - \tilde{x}^j(k|k)]'\} \qquad (32)$$

and, with the known cross-covariance $P^{ij}$, is given by the
expression

$$T^{ij}(k) = P^i(k|k) + P^j(k|k) - P^{ij}(k|k) - P^{ji}(k|k) \qquad (33)$$

Note the difference between the above and (27).

Effect of the Dependence 

The effect of the dependence between the estimation errors is to
reduce the covariance of the difference (31) of the estimates. This is
due to the fact that the cross-covariance term reflects a positive
correlation between the estimation errors (this is always the case for
linear systems).

The Test 

The hypothesis testing for the track-to-track association with the
dependence accounted for is done in the same manner as before in (28),
except that the "smaller" covariance from (33) is used in the test
statistic, which is, as before, the normalized distance squared between the
estimates

$$D = \hat{\Delta}^{ij}(k)' \, [T^{ij}(k)]^{-1} \, \hat{\Delta}^{ij}(k) \qquad (34)$$
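As a minimal numerical sketch of the test (34), assume a scalar state ($n_x = 1$), so that all covariances are plain numbers and $P^{ij} = P^{ji}$. The function name `t2ta_test` and the sample values are illustrative assumptions and do not come from the cited references. For one degree of freedom the chi-square quantile equals the square of a standard normal quantile, which keeps the example within the Python standard library.

```python
from statistics import NormalDist


def t2ta_test(x_i, x_j, P_i, P_j, P_ij=0.0, alpha=0.01):
    """Scalar track-to-track association test.

    Returns (D, threshold, accept), where accept is True when the
    common-origin hypothesis H0 is accepted.
    """
    delta = x_i - x_j                  # difference of the estimates, cf. (31)
    T = P_i + P_j - 2.0 * P_ij         # scalar form of (33), using P_ij == P_ji
    D = delta * delta / T              # normalized distance squared, cf. (34)
    # Chi-square quantile with 1 degree of freedom, cf. (30):
    # it equals the squared two-sided standard normal quantile.
    threshold = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    return D, threshold, D < threshold


# Hypothetical estimates and variances of two tracks.
independent = t2ta_test(10.0, 12.0, 1.0, 1.0, P_ij=0.0)
dependent = t2ta_test(10.0, 12.0, 1.0, 1.0, P_ij=0.8)
print("independent errors:", independent)
print("dependent errors:  ", dependent)
```

With these illustrative numbers the pair is accepted when the errors are treated as independent and rejected once a positive cross-covariance shrinks $T^{ij}$, which is the effect of the dependence described in the text.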

The Cross-Covariance of the Estimation 
Errors 

The cross-covariance recursion for synchronized sensors can be shown to be
(see Bar-Shalom et al. 2011)

$$P^{ij}(k|k) = E[\tilde{x}^i(k|k)\,\tilde{x}^j(k|k)']
= [I - W^i(k)H^i(k)] \cdot [F(k-1)P^{ij}(k-1|k-1)F(k-1)' + Q(k-1)] \, [I - W^j(k)H^j(k)]' \qquad (35)$$

This is a linear recursion (a Lyapunov-type equation), and its initial
condition is, assuming the initial errors to be uncorrelated,

$$P^{ij}(0|0) = 0 \qquad (36)$$

This is a reasonable assumption in view of the 
fact that the initial estimates are usually based on 
the initial measurements, which were assumed to 
have independent errors. 
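The recursion (35) with the initial condition (36) can be illustrated for a scalar state. The sketch below assumes constant (e.g., steady-state) filter gains `W_i`, `W_j` and constant `F`, `Q`, `H_i`, `H_j`; the function names and all numerical values are hypothetical and chosen only for illustration.

```python
def cross_covariance_step(P_ij_prev, F, Q, W_i, H_i, W_j, H_j):
    """One step of the scalar form of recursion (35)."""
    predicted = F * P_ij_prev * F + Q          # common process noise enters here
    return (1.0 - W_i * H_i) * predicted * (1.0 - W_j * H_j)


def cross_covariance_history(steps, F, Q, W_i, H_i, W_j, H_j):
    """Iterate (35) starting from the uncorrelated initial condition (36)."""
    P_ij = 0.0
    history = [P_ij]
    for _ in range(steps):
        P_ij = cross_covariance_step(P_ij, F, Q, W_i, H_i, W_j, H_j)
        history.append(P_ij)
    return history


# Illustrative values: random-walk target, identical sensors and gains.
history = cross_covariance_history(50, F=1.0, Q=1.0,
                                   W_i=0.5, H_i=1.0, W_j=0.5, H_j=1.0)
print(history[:4], "...", history[-1])
```

In this example the cross-covariance starts at zero, becomes positive after the first step because of the shared process noise $Q$, and settles at the fixed point of the recursion.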

The cross-covariance for the case of asynchronous sensors can be found in
Bar-Shalom et al. (2011).


Summary and Future Directions 

This entry surveyed the issues involved in data association (specifically
M2TA and T2TA) with regard to multitarget-multisensor tracking systems.

Future developments in this topic will concern the use of new feature
variables and classification in data association (some preliminary results
are in Bar-Shalom et al. (2011)).


Cross-References 

► Estimation for Random Sets 

► Estimation, Survey on 
Abstract

Topological feedback entropy is a measure for the smallest information rate
in a digital communication channel between the coder and the controller of
a control system, above which the control task of rendering a subset of the
state space invariant can be solved. It is defined purely in terms of the
open-loop system without making reference to a particular coding and
control scheme and can also be regarded as a measure for the inherent rate
at which the system generates "invariance information."

Keywords 

Communication constraints; Controlled invariance; Invariance entropy;
Minimal data rates; Stabilization

Introduction 

In the theory of networked control systems, the assumption of classical
control theory that information can be transmitted within control loops
instantaneously, losslessly, and with arbitrary precision is no longer
satisfied. Realistic mathematical models of many important real-world
communication and control networks have to take into account general data
rate constraints in the communication channels, time delays, partial loss
of information, and variable network topologies. This raises the question
of the smallest possible information rate above which a given control task
can be solved. Though networked control systems can have a complicated
topology, consisting of multiple sensors, controllers, and actuators, a
first step towards understanding the problem of minimal data rates is to
analyze the simplest possible network topology, consisting of one
controller and one dynamical system connected by a digital channel with a
certain rate in bits per unit time. There is a wealth of literature
concerned with the problem of stabilizing a system in this context under
different assumptions about the specific coding and control scheme.
However, with few exceptions, mainly linear systems (both deterministic and
stochastic) have been considered. A comprehensive and detailed overview of
this literature until 2007 can be found in the survey Nair et al. (2007).
The first systematic approach to the problem of minimal data rates for set
invariance and stabilization of (deterministic, nonlinear) control systems
was presented in Nair et al. (2004), where the notion of topological
feedback entropy was introduced. This quantity, defined in terms of the
open-loop control system, is a measure for the smallest data rate a
communication channel may have if the system is supposed to solve the
control task of rendering a subset of the state space invariant. Other
challenges that come along with digital communication channels are not yet
taken into account here.

Feedback entropy was first introduced in Nair et al. (2004), using an
approach via open covers similar to the definition of topological entropy
for classical dynamical systems in Adler et al. (1965). In Colonius and
Kawan (2009), a quantity named invariance entropy was defined which later
turned out to be equivalent to the feedback entropy of Nair et al. (cf.
Colonius et al. 2013). The notion of invariance entropy has been further
studied in the papers Kawan (2011a), Kawan (2011b), and Kawan (2011c).
Several variations and generalizations have been introduced in Colonius
(2010), Colonius (2012), Colonius and Kawan (2011), Da Silva (2013), and
Hagihara and Nair (2013). The research monograph Kawan (2013) provides a
comprehensive presentation of the results obtained so far in the
deterministic case.

Definition 

Topological feedback entropy is a nonnegative real-valued quantity which
serves as a measure for the smallest possible data rate in a digital
channel, connecting a coder to a controller, above which the controller is
able to generate inputs which guarantee invariance of a given subset of the
state space. In the literature, one finds several slightly differing
versions. The original definition given in Nair et al. (2004) is (with
minor modifications) as follows. Consider a discrete-time control system

$$x_{k+1} = F(x_k, u_k) = F_{u_k}(x_k), \quad k \ge 0,$$

with $F : X \times U \to X$, where $X$ is a topological space and $U$ a
nonempty set such that $F_u : X \to X$ is continuous for every $u \in U$.
The transition map associated to this system is

$$\varphi : \mathbb{N}_0 \times X \times U^{\mathbb{N}_0} \to X, \quad
\varphi(k, x, (u_i)_{i \ge 0}) := F_{u_{k-1}} \circ \cdots \circ F_{u_1} \circ F_{u_0}(x).$$

A compact subset $K \subset X$ with nonempty interior is (strongly)
controlled invariant if for every $x \in K$ there is an input $u \in U$
such that $F_u(x) \in \mathrm{int}\,K$. A triple $(\mathcal{A}, \tau, G)$
is called an invariant open cover of $K$ if $\mathcal{A}$ is an open cover
of $K$, $\tau$ is a positive integer, and $G : \mathcal{A} \to U^\tau$ is a
map with components $G_0, G_1, \ldots, G_{\tau-1}$ which assign control
values to all sets in $\mathcal{A}$ such that for every
$A \in \mathcal{A}$ the finite sequence of controls $G(A)$ yields
$\varphi(k, A, G(A)) \subset \mathrm{int}\,K$ for $k = 1, 2, \ldots, \tau$.
The entropy of $(\mathcal{A}, \tau, G)$ is defined as follows. For every
sequence $\alpha = (A_i)_{i \ge 0}$ of sets in $\mathcal{A}$ one defines an
associated sequence of controls by

$$u(\alpha) = (u_0, u_1, u_2, \ldots) \quad \text{with} \quad
(u_l)_{l=(i-1)\tau}^{i\tau-1} = G(A_{i-1}) \text{ for all } i \ge 1,$$

and for every $j \ge 1$ a set

$$B_j(\alpha) := \{x \in X : \varphi(i\tau, x, u(\alpha)) \in A_i
\text{ for } i = 0, 1, \ldots, j-1\}.$$

The family $\mathcal{B}_j := \{B_j(\alpha) : \alpha \in \mathcal{A}^{\mathbb{N}_0}\}$
is an open cover of $K$. Letting $N(\mathcal{B}_j \mid K)$ denote the
minimal cardinality of a finite subcover, the entropy of
$(\mathcal{A}, \tau, G)$ is

$$h(\mathcal{A}, \tau, G) := \lim_{j \to \infty} \frac{1}{j\tau} \log_2 N(\mathcal{B}_j \mid K)
= \inf_{j \ge 1} \frac{1}{j\tau} \log_2 N(\mathcal{B}_j \mid K).$$

Finally, the topological feedback entropy (TFE) of $K$ is given by

$$h_{fb}(K) := \inf_{(\mathcal{A}, \tau, G)} h(\mathcal{A}, \tau, G),$$

where the infimum is taken over all invariant open covers of $K$.

A conceptually simpler but equivalent definition, introduced in Colonius
and Kawan (2009), is the following. A subset $S \subset U^\tau$ is called
$(\tau, K)$-spanning if for every $x \in K$ there is $u \in S$ with
$\varphi(k, x, u) \in \mathrm{int}\,K$ for $k = 1, \ldots, \tau$. Writing
$r_{inv}(\tau, K)$ for the minimal cardinality of such a set, it can be
shown that

$$h_{fb}(K) = \lim_{\tau \to \infty} \frac{1}{\tau} \log_2 r_{inv}(\tau, K)
= \inf_{\tau \ge 1} \frac{1}{\tau} \log_2 r_{inv}(\tau, K).$$

This definition and several variations of it are mostly referred to by the
name invariance entropy instead of feedback entropy. The intuition behind
this definition is that a controller which receives a certain amount of
information about the state, say $n$ bits, can generate at most $2^n$
different control sequences to steer the system on a finite time interval
and hence, the number of control sequences needed to accomplish the control
task on this interval is a measure for the necessary amount of information.

The different variations of feedback or invariance entropy which can be
found in the literature are briefly summarized as follows. For simplicity,
in this entry we refer to several of these variations by the name
(topological) feedback entropy:

(i) Instead of requiring that trajectories enter the interior of $K$ after
one step of time, one can allow for a waiting time $\tau_0$ before entering
$\mathrm{int}\,K$.

(ii) One can require that trajectories stay in $K$ instead of
$\mathrm{int}\,K$ or that they stay in an arbitrarily small neighborhood of
$K$, respectively.

(iii) One can restrict the set of initial states to a subset
$K' \subset K$. In this case, a set $S$ of control sequences is
$(\tau, K', K)$-spanning if for every $x \in K'$ there is $u \in S$ with
$\varphi(k, x, u) \in \mathrm{int}\,K$ for $k = 1, \ldots, \tau$.

(iv) Feedback entropy can be defined for other classes of systems, e.g.,
continuous-time deterministic systems, random control systems, or systems
with piecewise continuous right-hand sides.

There is also a local version of topological 
feedback entropy (LTFE) which measures the 
smallest data rate for local uniform asymptotic 
stabilization at an equilibrium. Also for other 
control tasks there have been attempts to define 
corresponding versions of feedback or invariance 
entropy. 

Comparison to Topological Entropy 

Though there are similarities in the definitions of TFE and topological
entropy of dynamical systems, which are also reflected in similar
properties, there is no direct relation between these two quantities.
Topological entropy detects exponential complexity in the orbit structure
of a dynamical system. In contrast, TFE measures the complexity of the
control task to keep a system in a subset of the state space by applying
appropriate inputs. If no escape from this subset is possible, the TFE is
zero, no matter how complicated the orbit structure is. Hence, topological
entropy is sensitive to the local behavior of the system, while TFE in
general is not. Interpreted in terms of information rates, topological
entropy is a measure for the largest average rate of information about the
initial state a dynamical system can generate. TFE measures the smallest
rate of information about the state of the system above which a controller
is able to render the set invariant. It should also be mentioned that
topological entropy was first introduced as a topological counterpart of
the measure-theoretic entropy defined by Kolmogorov and Sinai, and that the
two notions are related by the variational principle, which asserts that
the topological entropy is the supremum of the measure-theoretic entropies
with respect to all invariant probability measures of the given system. For
TFE, so far no convincing measure-theoretic approach exists. An excellent
survey on the entropy theory of dynamical systems can be found in Katok
(2007).

The Data Rate Theorem 

The data rate theorem for the TFE confirms that the infimal data rate in a
coding and control loop which guarantees strong controlled invariance of a
set $K$ is equal to $h_{fb}(K)$. More precisely, suppose that a sensor
which measures the state of the system at discrete times $\tau_k = k\tau$,
$k = 0, 1, 2, \ldots$, is connected to a coder which at time $\tau_k$ has a
finite alphabet $S_k$ of symbols available. The measured state is coded by
use of this alphabet and the corresponding symbol is sent via a noiseless
digital channel to a controller which generates an input sequence of length
$\tau$. This sequence is used to steer the system until the next symbol
arrives at time $\tau_{k+1}$. The associated asymptotic average bit rate,
which depends on the sequence $S = (S_k)_{k \ge 0}$ of coding alphabets, is
given by

$$R(S) = \lim_{k \to \infty} \frac{1}{k\tau} \sum_{i=0}^{k-1} \log_2 |S_i|.$$

If the limit does not exist, one may replace it with $\liminf$ or
$\limsup$. The data rate theorem establishes the equality

$$h_{fb}(K) = \inf R(S),$$

where the infimum is taken over all coding and control loops which
guarantee strong controlled invariance of $K$, i.e., for initial states in
$K$ they generate trajectories $(x_k)_{k \ge 0}$ with
$x_k \in \mathrm{int}\,K$ for $k = 1, 2, \ldots$. Similar data rate
theorems can be proved for other variants of feedback entropy. In
particular, the data rate theorem for the LTFE asserts that the infimal bit
rate for local uniform asymptotic stabilization at an equilibrium is given
by the LTFE. Proofs of different data rate theorems can be found in Nair
et al. (2004), Hagihara and Nair (2013), and Kawan (2013).
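A coding and control loop of the kind described above can be sketched for a scalar example. The code assumes the system $x_{k+1} = a x_k + u_k$ with $|a| > 1$, the set $K = [-1, 1]$, a block length $\tau = 1$, and a uniform quantizer with a fixed number of cells as the coder; the function names and the numerical values are hypothetical choices for illustration, not a scheme taken from the cited references.

```python
import math


def stays_invariant(a, n_cells, x0, steps=1000):
    """Run the quantized loop on K = [-1, 1]; True if x_k stays in int K."""
    x = x0
    width = 2.0 / n_cells
    for _ in range(steps):
        # Coder: index of the quantization cell containing x (one symbol).
        idx = min(int((x + 1.0) / width), n_cells - 1)
        center = -1.0 + (idx + 0.5) * width
        # Controller: cancel the image of the cell center.
        u = -a * center
        x = a * x + u            # equals a * (x - center)
        if not (-1.0 < x < 1.0):
            return False
    return True


def average_bit_rate(alphabet_sizes, tau=1.0):
    """Finite-horizon version of R(S): mean of log2|S_i| per unit time."""
    k = len(alphabet_sizes)
    return sum(math.log2(size) for size in alphabet_sizes) / (k * tau)


a = 2.5
print(stays_invariant(a, n_cells=3, x0=0.95))   # 3 symbols per step suffice here
print(stays_invariant(a, n_cells=2, x0=0.95))   # 2 symbols per step do not
print(average_bit_rate([3] * 10), math.log2(a))
```

After one step the state satisfies $|x| \le |a| / n$ for $n$ cells, so any $n > |a|$ keeps it in the interior of $K$. This simple scheme uses $\log_2 3$ bits per step, which is above $\log_2 2.5$; the theorem concerns the infimum over all loops, so a gap for one particular scheme is expected.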

Estimates and Formulas 

Linear Systems 

For linear systems, under reasonable assumptions, the feedback entropy is
given by the sum of the logarithms of the moduli of the unstable
eigenvalues of the dynamical matrix, counted with multiplicity, i.e., if
the system is given by $x_{k+1} = A x_k + B u_k$, then

$$h_{fb}(K) = \sum_{\lambda \in \mathrm{Sp}(A)} \max\{0, n_\lambda \log_2 |\lambda|\}, \qquad (1)$$

where $\mathrm{Sp}(A)$ denotes the spectrum of $A$ and $n_\lambda$ is the
algebraic multiplicity of the eigenvalue $\lambda$ (cf. Colonius and Kawan
2009). It is worth mentioning that the TFE therefore coincides with the
topological entropy of the uncontrolled system $x_{k+1} = A x_k$, as
defined by Bowen for maps on non-compact metric spaces (cf. Bowen 1971).
However, this is a special property of linear systems and is related to the
facts that (i) there is no difference between the local and the global
dynamical behavior of uncontrolled linear systems and that (ii) the control
sequence does not affect the exponential complexity of the dynamics, since
it only appears as an additive term in the transition map of the system.
Formula (1) is in correspondence with several former results on minimal
data rates for stabilization of linear systems. Thinking of the definition
of feedback entropy via spanning sets of control sequences, its
interpretation is that in order to guarantee invariance of a bounded set,
the only reason for exponential growth of the number of necessary inputs as
time increases is the volume expansion of the open-loop system in the
unstable subspace.
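Formula (1) is straightforward to evaluate once the spectrum is known. The sketch below assumes that the eigenvalues of $A$ are supplied as a list in which each eigenvalue is repeated according to its algebraic multiplicity; the function name `linear_feedback_entropy` and the sample spectra are illustrative assumptions.

```python
import math


def linear_feedback_entropy(eigenvalues):
    """Evaluate formula (1) in bits per time step.

    `eigenvalues` lists the (real or complex) eigenvalues of A, each
    repeated according to its algebraic multiplicity. Eigenvalues with
    modulus at most 1 contribute nothing.
    """
    total = 0.0
    for lam in eigenvalues:
        modulus = abs(lam)
        if modulus > 1.0:
            total += math.log2(modulus)
    return total


# Hypothetical spectra.
print(linear_feedback_entropy([2.0, 0.5, 3.0]))    # log2(2) + log2(3)
print(linear_feedback_entropy([1 + 1j, 1 - 1j]))   # complex pair of modulus sqrt(2)
print(linear_feedback_entropy([0.9, -0.3]))        # stable system: zero entropy
```

Listing eigenvalues with multiplicity is equivalent to the factor $n_\lambda$ in (1) and avoids having to group equal eigenvalues.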

Upper Bounds Under Controllability 
Assumptions 

If the state space of the control system is a differentiable manifold and
the right-hand side is continuously differentiable, under certain
controllability assumptions upper bounds for the feedback entropy can be
formulated in terms of the Lyapunov exponents of periodic trajectories (for
the concept of Lyapunov exponents, see, e.g., Barreira and Valls 2008) (cf.
Kawan 2011b, 2013; Nair et al. 2004). More precisely, if there is a
periodic trajectory in the interior of the given set $K$ such that the
linearization along this trajectory is controllable, and if complete
approximate controllability holds on the interior of $K$ (cf. Colonius and
Kliemann 2000), then

$$h_{fb}(K) \le \sum_{\lambda} \max\{0, n_\lambda \lambda\}, \qquad (2)$$

where the sum is taken over all Lyapunov exponents $\lambda$ of the
periodic trajectory and $n_\lambda$ denotes the multiplicity of $\lambda$.
Using the definition of feedback entropy in terms of $(\tau, K)$-spanning
sets, one can prove this by constructing spanning sets of control functions
which first steer all initial states in $K$ into a small neighborhood of a
point on the given periodic orbit and then, by use of local
controllability, keep the corresponding trajectories in a neighborhood of
the periodic trajectory for arbitrary future times. Similar ideas were
first used in Nair et al. (2004) to prove that the LTFE at an equilibrium
is given by the sum of the logarithms of the moduli of the unstable
eigenvalues of the linearization about this equilibrium. For systems given
by differential equations the upper estimate (2) can be improved under
additional regularity assumptions. Assuming that the system is smooth and
satisfies the strong jet accessibility rank condition (cf. Coron 1994), one
can show that both assumptions, controllability of the linearization and
periodicity, can be omitted. The only restriction that remains is that the
trajectory must not leave a compact subset of the interior of $K$. However,
in the case of nonperiodic trajectories, the sum of the positive Lyapunov
exponents has to be replaced by the maximal Lyapunov exponent of the
induced linear flow on the exterior bundle of the manifold. For
control-affine systems, strong jet accessibility can be weakened to local
accessibility. In general, it is unlikely that such upper bounds are tight,
since they are related to very specific control strategies for making the
given set invariant.

Lower Bounds, Volume Growth Rates, 
and Escape Rates 

A general approach to obtain lower bounds of feedback entropy is via a
volume growth argument, which in its simplest form works as follows. Every
$(\tau, K)$-spanning set $S$ defines a cover of $K$, consisting of the sets
(cf. Kawan 2011a, c)

$$K_{\tau,u} = \{x \in K : \varphi(k, x, u) \in \mathrm{int}\,K,\ 1 \le k \le \tau\}, \quad u \in S.$$

It follows that $\varphi_{\tau,u}(K_{\tau,u}) \subset K$ and hence, since
$K$ is bounded, the volume expansion under
$\varphi_{\tau,u} = \varphi(\tau, \cdot, u)$ gives upper bounds for the
volumes of the sets $K_{\tau,u}$, which result in a lower bound for the
number of these sets. For instance, the lower estimate in (1) can be
established by applying this argument to the system which arises by
projection of the given linear system to the unstable subspace of the
uncontrolled part $x_{k+1} = A x_k$. A refinement of this idea also leads
to lower estimates of feedback entropy for inhomogeneous bilinear systems
in terms of volume growth rates or Lyapunov exponents on unstable bundles,
respectively. For nonlinear systems, in general only very rough estimates
can be obtained by this method. However, a variation of the volume growth
argument leads to a lower bound of the form

$$h_{fb}(K) \ge -\liminf_{\tau \to \infty} \frac{1}{\tau} \log \sup_{u} \mu(K_{\tau,u}),$$

where $\mu$ denotes a reference measure on the state space. The right-hand
side of this inequality can be considered as a uniform escape rate from the
set $K$, which under sufficiently strong hyperbolicity assumptions can be
estimated in terms of other quantities such as Lyapunov exponents and
dynamical entropies. Key references for escape rates in the classical
theory of dynamical systems are Young (1990) and Demers and Young (2006).
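The volume growth argument can be written out in the simplest case. The following sketch assumes a linear system $x_{k+1} = A x_k + B u_k$ on $\mathbb{R}^d$ whose eigenvalues all satisfy $|\lambda| > 1$ (so that no projection to the unstable subspace is needed) and uses Lebesgue measure, written $\mathrm{vol}$; these simplifications are assumptions made for illustration.

```latex
% For a fixed control sequence u, the map \varphi_{\tau,u} is affine with
% linear part A^\tau, so it scales volumes by |\det A|^\tau:
\mathrm{vol}\bigl(\varphi_{\tau,u}(K_{\tau,u})\bigr)
  = |\det A|^{\tau}\,\mathrm{vol}(K_{\tau,u})
  \le \mathrm{vol}(K)
  \quad\Longrightarrow\quad
  \mathrm{vol}(K_{\tau,u}) \le \frac{\mathrm{vol}(K)}{|\det A|^{\tau}} .

% The sets K_{\tau,u}, u \in S, cover K, and vol(K) > 0 because K has
% nonempty interior:
\mathrm{vol}(K) \le \sum_{u \in S} \mathrm{vol}(K_{\tau,u})
  \le |S|\,\frac{\mathrm{vol}(K)}{|\det A|^{\tau}}
  \quad\Longrightarrow\quad
  |S| \ge |\det A|^{\tau} .

% Taking S of minimal cardinality, then logarithms and the limit:
\frac{1}{\tau}\log_2 r_{inv}(\tau, K) \ge \log_2 |\det A|
  = \sum_{\lambda \in \mathrm{Sp}(A)} n_\lambda \log_2 |\lambda| .
```

Under the stated assumption every eigenvalue is unstable, so the right-hand side agrees with the sum in (1).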


Summary and Future Directions 

The theory of feedback entropy for finite-dimensional deterministic systems
is very far from being complete. The currently available results only give
valuable information in very regular situations, and even those are not
fully understood. For the further development of this theory, it will be
necessary to combine control-theoretic methods with techniques from
different fields such as classical, random, and nonautonomous dynamical
systems. Some of the main focuses of future research will probably be the
following:

• The generalization of feedback entropy to more complex network topologies

• The formulation of a feedback entropy theory for stochastic systems

• The development of a probabilistic (resp. measure-theoretic) version of
feedback entropy for both deterministic and stochastic systems, which is
related to the topological version via a variational principle

• The numerical computation of feedback entropy
Abstract

Mathematical models of living systems are often based on formal
representations of the underlying reaction networks. Here, we present the
basic concepts for the deterministic nonspatial treatment of such networks.
We describe the most prominent approaches for steady-state and dynamic
analysis using systems of ordinary differential equations.

Keywords 

Michaelis-Menten kinetics; Reaction networks; 
Stoichiometry 

Introduction 

A biochemical network describes the interconversion of biochemical species
such as proteins or metabolites by chemical reactions. Such networks are
ubiquitous in living cells, where they are involved in a variety of
cellular functions such as conversion of metabolites into energy or
building material of the cell, detection and processing of external and
internal signals of nutrient availability or environmental stress, and
regulation of genetic programs for development.

Reaction Networks 

A biochemical network can be modeled as a dynamic system with the chemical
concentration of each species taken as the states and dynamics described by
the changes in species concentrations as they are converted by reactions.
Assuming that species are homogeneously distributed in the reaction volume
and that copy numbers are sufficiently high, we may ignore spatial and
stochastic effects and derive a system of ordinary differential equations
(ODEs) to model the dynamics.

Formally, a biochemical network is given by $r$ reactions
$R_1, \ldots, R_r$ acting on $n$ different chemical species
$S_1, \ldots, S_n$. Reaction $R_j$ is given by

$$R_j : \alpha_{1,j} S_1 + \cdots + \alpha_{n,j} S_n \to \beta_{1,j} S_1 + \cdots + \beta_{n,j} S_n,$$

where $\alpha_{i,j}, \beta_{i,j} \in \mathbb{N}$ are called the
molecularities of the species in the reaction. Their differences form the
stoichiometric matrix
$N = (\beta_{i,j} - \alpha_{i,j})_{i=1 \ldots n,\, j=1 \ldots r}$, with
$N_{i,j}$ describing the net effect of one turnover of $R_j$ on the copy
number of species $S_i$. The $j$th column is also called the stoichiometry
of reaction $R_j$. The system can be opened to an un-modeled environment by
introducing inflow reactions $\emptyset \to S_i$ and outflow reactions
$S_i \to \emptyset$.

For example, consider the following reaction network:

$$R_1 : E + S \to E \cdot S$$

$$R_2 : E \cdot S \to E + S$$

$$R_3 : E \cdot S \to E + P$$

Here, an enzyme $E$ (a protein that acts as a catalyst for biochemical
reactions) binds to a substrate species $S$, forming an intermediate
complex $E \cdot S$ and subsequently converting $S$ into a product $P$.
Note that the enzyme-substrate binding is reversible, while the conversion
to a product is irreversible. This network contains $r = 3$ reactions
interconverting $n = 4$ chemical species $S_1 = S$, $S_2 = P$, $S_3 = E$,
and $S_4 = E \cdot S$. The stoichiometric matrix is

$$N = \begin{pmatrix} -1 & +1 & 0 \\ 0 & 0 & +1 \\ -1 & +1 & +1 \\ +1 & -1 & -1 \end{pmatrix}$$
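The construction of the stoichiometric matrix from the molecularities can be sketched in a few lines. The dictionary-based encoding of reactions and the name `stoichiometric_matrix` are assumptions made for this illustration; the species order S, P, E, E·S follows the text, with `ES` standing for the complex.

```python
# Species order as in the text: S, P, E, E.S (written ES here).
species = ["S", "P", "E", "ES"]

# Each reaction is a pair (educts, products) mapping species to molecularity.
reactions = [
    ({"E": 1, "S": 1}, {"ES": 1}),   # R1: E + S -> E.S
    ({"ES": 1}, {"E": 1, "S": 1}),   # R2: E.S -> E + S
    ({"ES": 1}, {"E": 1, "P": 1}),   # R3: E.S -> E + P
]


def stoichiometric_matrix(species, reactions):
    """Return N as a list of rows with N[i][j] = beta_ij - alpha_ij."""
    return [
        [products.get(name, 0) - educts.get(name, 0)
         for educts, products in reactions]
        for name in species
    ]


N = stoichiometric_matrix(species, reactions)
for name, row in zip(species, N):
    print(f"{name:>2}: {row}")
```

Each column of the result is the stoichiometry of one reaction, and each row collects the net effect of all reactions on one species.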

Dynamics 

Let $x(t) = (x_1(t), \ldots, x_n(t))^T$ be the vector of concentrations,
that is, $x_i(t)$ is the concentration of $S_i$ at time $t$. Abbreviating
this state vector as $x$ by dropping the explicit dependence on time, its
dynamics is governed by a system of $n$ ordinary differential equations:

$$\frac{d}{dt} x = N \cdot v(x, p). \qquad (1)$$

Here, the reaction rate vector
$v(x, p) = (v_1(x, p_1), \ldots, v_r(x, p_r))^T$ gives the rate of
conversion of each reaction per unit time as a function of the current
system state and of a set of parameters $p$.

A typical reaction rate is given by the mass-action rate law

$$v_j(x, p_j) = k_j \cdot \prod_{i=1}^{n} x_i^{\alpha_{i,j}},$$

where the rate constant $k_j > 0$ is the only parameter and the rate is
proportional to the concentration of each species participating as an educt
(consumed component) in the respective reaction.

Equation (1) decomposes the system into a time-independent and linear part
described solely by the topology and stoichiometry of the reaction network
via $N$ and a dynamic and typically nonlinear part given by the reaction
rate laws $v(\cdot, \cdot)$. One can define a directed graph of the network
with one vertex per state and take $N$ as the (weighted) incidence matrix.
Reaction rates are then properties of the resulting edges. In essence, the
equation describes the change of each species' concentration as the sum of
the current reaction rates. Each rate is weighted by the molecularity of
the species in the corresponding reaction; it is negative if the species is
consumed by the reaction and positive if it is produced.

Using mass-action kinetics throughout, and using the species name instead
of the numeric subscript, the reaction rates and parameters of the example
network are given by

$$v_1(x, p_1) = k_1 \cdot x_E(t) \cdot x_S(t)$$

$$v_2(x, p_2) = k_2 \cdot x_{E \cdot S}(t)$$

$$v_3(x, p_3) = k_3 \cdot x_{E \cdot S}(t)$$

$$p = (k_1, k_2, k_3)$$

The system equations are then

$$\frac{d}{dt} x_S = -k_1 \cdot x_S \cdot x_E + k_2 \cdot x_{E \cdot S}$$

$$\frac{d}{dt} x_P = k_3 \cdot x_{E \cdot S}$$

$$\frac{d}{dt} x_E = -k_1 \cdot x_S \cdot x_E + k_2 \cdot x_{E \cdot S} + k_3 \cdot x_{E \cdot S}$$

$$\frac{d}{dt} x_{E \cdot S} = k_1 \cdot x_S \cdot x_E - k_2 \cdot x_{E \cdot S} - k_3 \cdot x_{E \cdot S}.$$
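The system equations of the example network can be integrated numerically. The sketch below uses a hand-written classical fourth-order Runge-Kutta step so that it needs no external packages; the initial concentrations, rate constants, step size, and function names are hypothetical choices for illustration only.

```python
# Stoichiometric matrix of the example; rows S, P, E, E.S; columns R1, R2, R3.
N = [[-1, 1, 0], [0, 0, 1], [-1, 1, 1], [1, -1, -1]]


def rates(x, k):
    """Mass-action rates (v1, v2, v3) for the state x = (S, P, E, ES)."""
    s, p, e, es = x
    k1, k2, k3 = k
    return (k1 * e * s, k2 * es, k3 * es)


def rhs(x, k):
    """Right-hand side N * v(x, p) of the system equation (1)."""
    v = rates(x, k)
    return [sum(N[i][j] * v[j] for j in range(3)) for i in range(4)]


def rk4_step(x, k, dt):
    """One classical Runge-Kutta step of size dt."""
    def shifted(base, slope, h):
        return [b + h * s for b, s in zip(base, slope)]
    f1 = rhs(x, k)
    f2 = rhs(shifted(x, f1, dt / 2), k)
    f3 = rhs(shifted(x, f2, dt / 2), k)
    f4 = rhs(shifted(x, f3, dt), k)
    return [xi + dt / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, f1, f2, f3, f4)]


def simulate(x0, k, dt=0.01, steps=5000):
    """Integrate from x0 over steps * dt time units and return the final state."""
    x = list(x0)
    for _ in range(steps):
        x = rk4_step(x, k, dt)
    return x


x_end = simulate(x0=(10.0, 0.0, 1.0, 0.0), k=(1.0, 1.0, 1.0))
print("S, P, E, ES at the final time:", x_end)
print("total enzyme:", x_end[2] + x_end[3])
```

In this run the total enzyme $x_E + x_{E \cdot S}$ and the total of substrate-derived material $x_S + x_P + x_{E \cdot S}$ stay at their initial values, and almost all substrate ends up as product by the final time.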

Steady-State Analysis 

A reaction network is in steady state if the production and consumption of
each species are balanced. Steady-state concentrations $x^*$ then satisfy
the equation

$$0 = N \cdot v(x^*, p).$$

Computing steady-state concentrations requires explicit knowledge of
reaction rates and their parameter values. For biochemical reaction
networks, these are often very difficult to obtain. An alternative is the
computation of steady-state fluxes $v^*$, which only requires solving the
system of homogeneous linear equations

$$0 = N \cdot v. \qquad (2)$$

Lower and upper bounds $v_i^l$, $v_i^u$ for each flux $v_i$ can be given
such that $v_i^l \le v_i^* \le v_i^u$; an example is an irreversible
reaction $R_i$, which implies $v_i^l = 0$. The set of all feasible
solutions then forms a pointed, convex, polyhedral flux cone in
$\mathbb{R}^r$. The rays spanning the flux cone correspond to elementary
flux modes (EFMs) or extreme pathways (EPs), minimal subnetworks that are
already balanced. Each feasible steady-state flux can be written as a
nonnegative combination

$$v^* = \sum_i \lambda_i \cdot e_i, \qquad \lambda_i \ge 0,$$

of EFMs $e_1, e_2, \ldots$, where the $\lambda_i$ are the corresponding
weights.

Even if in steady state, living cells grow and divide. Growth of a cell is
often described by a combination of fluxes $b^T \cdot v$, the total
production rate of relevant metabolites to form new biomass. The biomass
function given by $b \in \mathbb{R}^r$ is determined experimentally. The
technique of flux balance analysis (FBA) then solves the linear program

$$\max_{v} \; b^T \cdot v \quad \text{subject to} \quad 0 = N \cdot v, \quad v_i^l \le v_i \le v_i^u$$

to yield a feasible flux vector that balances the network while maximizing
growth. Alternative objective functions have been proposed, for instance,
for higher organisms that do not necessarily maximize the growth of each
cell.


Quasi-Steady-State Analysis 

In many reaction mechanisms, a quasi-steady-state assumption (QSSA) can be
made, postulating that the concentration of some of the involved species
does not change. This assumption is often justified if reaction rates
differ hugely, leading to a time scale separation, or if some
concentrations are very high, such that their change is negligible for the
mechanism. A typical example is the derivation of Michaelis-Menten
kinetics, which corresponds to our example network. There, we may assume
that the concentration of the intermediate species $E \cdot S$ stays
approximately constant on the time scale of the overall conversion of
substrate into product and that the substrate, at least initially, is in
much larger abundance than the enzyme. On the slower time scale, this leads
to the Michaelis-Menten rate law:

$$\frac{d}{dt} x_P(t) = \frac{v_{max} \cdot x_S(t)}{K_m + x_S(t)}$$

with a maximal rate $v_{max} = k_3 \cdot x_E^{tot}$, where $x_E^{tot}$ is
the total amount of enzyme, and the Michaelis-Menten constant
$K_m = (k_2 + k_3)/k_1$, as a direct relation between substrate
concentration and production rate. This approximation reduces the number of
states by two. Both parameters of the Michaelis-Menten rate law are also
better suited for experimental determination: $v_{max}$ is the highest rate
achievable and $K_m$ corresponds to the substrate concentration that yields
a rate of $v_{max}/2$.
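The step from the quasi-steady-state assumption to the rate law can be written out as follows. The derivation uses the mass-action equations of the example network with $k_1$ for binding, $k_2$ for dissociation, and $k_3$ for product formation, and the symbol $x_E^{tot}$ for the conserved total enzyme; it is a standard sketch under these assumptions rather than a reproduction of a derivation in the source.

```latex
% Quasi-steady state of the complex: its production and consumption balance.
0 \approx \frac{d}{dt} x_{E\cdot S}
  = k_1\, x_S\, x_E - (k_2 + k_3)\, x_{E\cdot S}

% Conservation of enzyme: x_E = x_E^{tot} - x_{E\cdot S}. Substituting,
k_1\, x_S \bigl(x_E^{tot} - x_{E\cdot S}\bigr) = (k_2 + k_3)\, x_{E\cdot S}
\quad\Longrightarrow\quad
x_{E\cdot S} = \frac{x_E^{tot}\, x_S}{K_m + x_S},
\qquad K_m = \frac{k_2 + k_3}{k_1}

% The production rate of P then follows from the rate of R_3:
\frac{d}{dt} x_P = k_3\, x_{E\cdot S}
  = \frac{v_{max}\, x_S}{K_m + x_S},
\qquad v_{max} = k_3\, x_E^{tot}
```

Setting $x_S = K_m$ in the last line gives the rate $v_{max}/2$, which is the experimental interpretation of $K_m$ mentioned in the text.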


Cooperativity and Ultra-sensitivity 

In the Michaelis-Menten mechanism, the production rate gradually increases
with increasing substrate concentration, until saturation (Fig. 1; $h = 1$).
A different behavior is achieved if the enzyme has several binding sites
for the substrate and these sites interact such that occupation of one site
alters the affinity of the other sites positively or negatively, phenomena
known as positive and negative cooperativity, respectively. With QSSA
arguments as before, the fraction of enzymes completely occupied by
substrate molecules at time $t$ is given by

$$v(t) = \frac{v_{max} \cdot x_S^h(t)}{K^h + x_S^h(t)}$$

where $K > 0$ is a constant and $h > 0$ is the Hill coefficient. The Hill
coefficient determines the shape of the response with increasing substrate
concentration: a coefficient of $h > 1$ ($h < 1$) indicates positive
(negative) cooperativity; $h = 1$ reduces to the Michaelis-Menten
mechanism. With increasing coefficient $h$, the response changes from
gradual to switch-like, such that the transition from low to high response
becomes more rapid as indicated in Fig. 1. This phenomenon is also known as
ultra-sensitivity.
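The change from a gradual to a switch-like response can be checked numerically. The sketch below evaluates the Hill-type expression for the coefficients $h = 1, 3, 10$ used in Fig. 1, with all other parameters set to 1 as in the figure caption; the function name `hill_response` and the sampled substrate concentrations are illustrative assumptions.

```python
def hill_response(x_s, h, K=1.0, v_max=1.0):
    """Hill-type response v_max * x_s^h / (K^h + x_s^h)."""
    return v_max * x_s ** h / (K ** h + x_s ** h)


concentrations = [0.5, 1.0, 2.0]
for h in (1, 3, 10):
    values = [round(hill_response(x, h), 4) for x in concentrations]
    print(f"h = {h:>2}: {values}")
```

For every $h$ the response at $x_S = K$ is half of $v_{max}$, while the values just below and just above $K$ move towards 0 and $v_{max}$ as $h$ grows, which is the switch-like behavior described above.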


Constrained Dynamics 

Due to the particular structure of the system equation (1), the
trajectories $x(t)$ of the network with $x_0 = x(0)$ are confined to the
stoichiometric subspace, the intersection of $x_0 + \mathrm{Img}\,N$ with
the positive orthant. Conservation relations that describe conservation of
mass are thus found as solutions to

$$c^T \cdot N = 0,$$

and two initial conditions $x_0$, $x_0'$ lead to the same stoichiometric
subspace if $c^T \cdot x_0 = c^T \cdot x_0'$. This allows for the analysis
of, for example, bistability using only the reaction network structure.
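For the enzyme example, conservation relations can be verified directly against the stoichiometric matrix. The candidate vectors below (total enzyme and total substrate-derived material) and the helper name `is_conservation_relation` are assumptions introduced for this illustration; the species order is S, P, E, E·S as in the text.

```python
# Stoichiometric matrix of the example; rows S, P, E, E.S; columns R1, R2, R3.
N = [[-1, 1, 0], [0, 0, 1], [-1, 1, 1], [1, -1, -1]]


def is_conservation_relation(c, N):
    """True if c^T N = 0, so that c^T x(t) is constant along trajectories."""
    n_species = len(N)
    n_reactions = len(N[0])
    return all(
        sum(c[i] * N[i][j] for i in range(n_species)) == 0
        for j in range(n_reactions)
    )


total_enzyme = [0, 0, 1, 1]        # x_E + x_ES
total_substrate = [1, 1, 0, 1]     # x_S + x_P + x_ES
substrate_only = [1, 0, 0, 0]      # x_S alone is not conserved

for name, c in [("total enzyme", total_enzyme),
                ("total substrate", total_substrate),
                ("substrate only", substrate_only)]:
    print(name, is_conservation_relation(c, N))
```

Because the check only involves $N$, it does not depend on the rate laws or their parameter values.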


Summary and Future Directions 

Reaction networks, even in simple cells, typically encompass thousands of
components and reactions, resulting in potentially high-dimensional
nonlinear dynamic systems. In contrast to engineered systems, biology is
characterized by a high degree of uncertainty of both model structure and
parameter values. Therefore, system identification is a central problem in
this domain. Specifically, advanced methods for model topology and
parameter identification as well as for uncertainty quantification need to
be developed that take into account the very limited observability of
biological systems. In addition, biological systems operate on multiple
time, length, and concentration scales. For example, genetic regulation
usually operates on the time scale of minutes and involves very few
molecules, whereas metabolism is significantly faster and states are well
approximated by species concentrations. Corresponding systematic frameworks
for multiscale modeling, however, are currently lacking.

Deterministic Description of Biochemical Networks, Fig. 1 Responses for cooperative enzyme reaction with Hill coefficient h = 1, 3, 10, respectively. All other parameters are set to 1

Cross-References 

► Dynamic Graphs, Connectivity of 

► Modeling of Dynamic Systems from First Principles

► Monotone Systems in Biology 

► Robustness Analysis of Biological Models 

► Spatial Description of Biochemical Networks 

► Stochastic Description of Biochemical 
Networks 

► Synthetic Biology 

Bibliography 

Craciun G, Tang Y, Feinberg M (2006) Understanding bistability in complex enzyme-driven reaction networks. Proc Natl Acad Sci USA 103(23):8697-8702
Higham DJ (2008) Modeling and simulating chemical reactions. SIAM Rev 50(2):347-368
LeDuc PR, Messner WC, Wikswo JP (2011) How do control-based approaches enter into biology? Annu Rev Biomed Eng 13:369-396
Sontag E (2005) Molecular systems biology and control. Eur J Control 11(4-5):396-435
Szallasi Z, Stelling J, Periwal V (eds) (2010) System modeling in cellular biology: from concepts to nuts and bolts. MIT, Cambridge
Tyson JJ, Chen KC, Novak B (2003) Sniffers, buzzers, toggles and blinkers: dynamics of regulatory and signaling pathways in the cell. Curr Opin Cell Biol 15(2):221-231


Diagnosis of Discrete Event Systems 

Stephane Lafortune 

Department of Electrical Engineering and 
Computer Science, University of Michigan, Ann 
Arbor, MI, USA 

Abstract 

We discuss the problem of event diagnosis in
partially observed discrete event systems. The
objective is to infer the past occurrence, if any, of
an unobservable event of interest based on the
observed system behavior and the complete model
of the system. Event diagnosis is performed by
diagnosers that are synthesized from the system
model and that observe the system behavior at
run-time. Diagnosability analysis is the off-line
task of determining which events of interest can
be diagnosed at run-time by diagnosers.

Keywords 

Diagnosability; Diagnoser; Fault diagnosis; Verifier

Introduction 

In this entry, we consider discrete event systems (DES)
that are partially observable and discuss the two
related problems of event diagnosis and diagnosability
analysis. Let the DES of interest be denoted by $M$
with event set $E$. Since $M$ is partially observable,
its set of events $E$ is the disjoint union of a set of
observable events, denoted by $E_o$, with a set of
unobservable events, denoted by $E_{uo}$:
$E = E_o \cup E_{uo}$. At this point, we do not specify
how $M$ is represented; it could be an automaton or a
Petri net. Let $\mathcal{L}_M$ be the set of all strings
of events in $E$ that the DES $M$ can execute, i.e.,
the (untimed) language model of the system; cf.
the related entries, ► Models for Discrete Event
Systems: An Overview, ► Supervisory Control of
Discrete-Event Systems, and ► Modeling, Analysis,
and Control with Petri Nets. The set $E_{uo}$
captures the fact that the set of sensors attached
to the DES $M$ is limited and may not cover all
the events of the system. Unobservable events
can be internal system events that are not directly
“seen” by the monitoring agent that observes
the behavior of $M$ for diagnosis purposes. They
can also be fault events that are included in the
system model but are not directly observable by
a dedicated sensor. For the purpose of diagnosis,
let us designate a specific unobservable event of
interest and denote it by $d \in E_{uo}$. Event $d$ could
be a fault event or some other significant event
that is unobservable.

Before we can state the problem of event
diagnosis, we need to introduce some notation.
$E^*$ is the set of all strings of any length $n \in \mathbb{N}$
that can be formed by concatenating elements
of $E$. The unique string of length $n = 0$ is
denoted by $\varepsilon$ and is the identity element of
concatenation. As in article ► Supervisory Control
of Discrete-Event Systems, section “Supervisory
Control Under Partial Observations,” we define
the projection function $P : E^* \to E_o^*$ that
“erases” the unobservable events in a string and
replaces them by $\varepsilon$. The function $P$ is naturally
extended to a set of strings by applying it to each
string in the set, resulting in a set of projected
strings. The observed behavior of $M$ is the
language $P(\mathcal{L}_M)$ over event set $E_o$.
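
The projection just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: strings are represented as tuples of event names, the empty tuple plays the role of the empty string, and the event names and the small finite language `L_M` are hypothetical (a real language model is usually infinite and is not stored as a set of strings).

```python
def project(string, observable):
    """Natural projection P: keep observable events, erase all others."""
    return tuple(event for event in string if event in observable)


def project_language(language, observable):
    """Extend P to a set of strings by applying it to each string."""
    return {project(s, observable) for s in language}


# Hypothetical example: a, b, c are observable; d is unobservable.
E_o = {"a", "b", "c"}
L_M = {(), ("a",), ("a", "d"), ("a", "b"), ("a", "d", "b")}

print(project(("a", "d", "b"), E_o))
print(sorted(project_language(L_M, E_o)))
```

Note that the strings ("a", "b") and ("a", "d", "b") project to the same observation, which is exactly the ambiguity that event diagnosis has to resolve.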

The problem of event diagnosis, or simply diagnosis,
is stated as follows: how to infer the past
occurrence of event $d$ when observing strings in
$P(\mathcal{L}_M)$ at run-time, i.e., during the operation of
the system? This is model-based inferencing, i.e.,
the monitoring agent knows $\mathcal{L}_M$ and the partition
$E = E_o \cup E_{uo}$, and it observes strings in $P(\mathcal{L}_M)$.
When there are multiple events of interest, $d_1$ to
$d_n$, and these events are fault events, we have
a problem of fault diagnosis. In this case, the
objective is not only to determine that a fault
has occurred (commonly referred to as “fault
detection”) but also to identify which fault has
occurred, namely, which event $d_i$ (commonly
referred to as “fault isolation and identification”).
Fault diagnosis requires that $\mathcal{L}_M$ contains not
only the nominal or fault-free behavior of the
system but also its behavior after the occurrence
of each fault event $d_i$ of interest, i.e., its faulty
behavior. Event $d_i$ is typically a fault of a
component that leads to degraded behavior on the
part of the system. It is not a catastrophic failure
that would cause the system to completely stop
operating, as such a failure would be immediately
observable. The decision on which fault events
$d_i$, along with their associated faulty behaviors, to
include in the complete model $\mathcal{L}_M$ is a design one
that is based on practical considerations related to
the diagnosis objectives.

A complementary problem to event diagnosis
is that of diagnosability analysis. Diagnosability
analysis is the off-line task of determining, on the
basis of $\mathcal{L}_M$ and of $E_o$ and $E_{uo}$, if any and all
occurrences of the given event of interest $d$ will
eventually be diagnosed by the monitoring agent
that observes the system behavior.

Event diagnosis and diagnosability analysis 
arise in numerous applications of systems 
that are modeled as DES. We mention a few 
application areas where DES diagnosis theory 
has been employed. In heating, ventilation, 
and air-conditioning systems, components such 
as valves, pumps, and controllers can fail in 
degraded modes of operation, such as when a valve
gets stuck open or stuck closed or a pump or
controller gets stuck on or stuck off. The available
sensors may not directly observe these faults, as 
the sensing abilities are limited. Fault diagnosis 
techniques are essential, since the components 
of the system are often not easily accessible. In 
monitoring communication networks, faults of 
certain transmitters or receivers are not directly 
observable and must be inferred from the set of 
successful communications and the topology of 
the network. In document processing systems, 
faults of internal components can lead to jams in 
the paper path or a decrease in image quality, and 
while the paper jam or the image quality is itself 
observable, the underlying fault may not be as 
the number of internal sensors is limited. 

Without loss of generality, we consider a single
event of interest to diagnose, $d$. When there
are multiple events to diagnose, the methodologies
that we describe in the remainder of this
entry can be applied to each event of interest $d_i$,
$i = 1, \ldots, n$, individually; in this case, the other
events of interest $d_j$, $j \neq i$, are treated the same
as the other unobservable events in the set $E_{uo}$ in
the process of model-based inferencing.

Problem Formulation 

Event Diagnosis 

We start with a general language-based formulation
of the event diagnosis problem. The information
available to the agent that monitors
the system behavior and performs the task of
event diagnosis is the language $\mathcal{L}_M$ and the set
of observable events $E_o$, along with the specific
string $t \in P(\mathcal{L}_M)$ that it observes at run-time.
The actual string generated by the system is
$s \in \mathcal{L}_M$ where $P(s) = t$. However, as far
as the monitoring agent is concerned, the actual
string that has occurred could be any string in
$P^{-1}(t) \cap \mathcal{L}_M$, where $P^{-1}$ is the inverse projection
operation, i.e., $P^{-1}(t)$ is the set of all strings
$s_t \in E^*$ such that $P(s_t) = t$. Let us denote this
estimate set by $\mathcal{E}(t) = P^{-1}(t) \cap \mathcal{L}_M$, where “$\mathcal{E}$”
stands for “estimate.” If a string $s \in \mathcal{L}_M$ contains
event $d$, we write that $d \in s$; otherwise, we
write that $d \notin s$.

The event diagnosis problem is to synthesize a
diagnostic engine that will automatically provide
the following answers from the observed $t$ and
from the knowledge of $\mathcal{L}_M$ and $E_o$:

1. Yes, if and only if $d \in s$ for all $s \in \mathcal{E}(t)$.

2. No, if and only if $d \notin s$ for all $s \in \mathcal{E}(t)$.

3. Maybe, if and only if there exist $s_Y, s_N \in \mathcal{E}(t)$
   such that $d \in s_Y$ and $d \notin s_N$.

As defined, $\mathcal{E}(t)$ is a string-based estimate. In
section “Diagnosis of Automata,” we discuss how
to build a finite-state structure that will encode the
desired answers for the above three cases when
the DES $M$ is modeled by a deterministic
finite-state automaton. The resulting structure is called
a diagnoser automaton.
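
Before turning to automata, the string-based estimate and the three answers can be illustrated by brute force. The sketch below assumes a hypothetical, finite fragment of a language model stored as a set of tuples; the event names are made up for illustration, and the truncated language is not live, so this is a toy and not a diagnostic engine.

```python
def project(string, observable):
    """Natural projection: erase unobservable events."""
    return tuple(e for e in string if e in observable)


def estimate(language, observable, t):
    """Estimate set E(t): all model strings whose projection equals t."""
    t = tuple(t)
    return {s for s in language if project(s, observable) == t}


def diagnose(language, observable, d, t):
    """Return 'Yes', 'No', or 'Maybe' for event d after observing t."""
    candidates = estimate(language, observable, t)
    if not candidates:
        raise ValueError("t is not an observed behavior of the model")
    occurred = {d in s for s in candidates}
    if occurred == {True}:
        return "Yes"
    if occurred == {False}:
        return "No"
    return "Maybe"


# Hypothetical finite language fragment; d is the unobservable event.
E_o = {"a", "b", "c"}
L_M = {(), ("a",), ("a", "d"), ("a", "c"),
       ("a", "d", "c"), ("a", "d", "c", "c")}

for t in [(), ("a",), ("a", "c"), ("a", "c", "c")]:
    print(t, diagnose(L_M, E_o, "d", t))
```

Enumerating strings does not scale to infinite languages, which is why the finite-state diagnoser construction of the next section is used in practice.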

Diagnosability Analysis 

Diagnosability analysis consists of determining, a
priori, whether any and all occurrences of event $d$ in
$\mathcal{L}_M$ will eventually be diagnosed, in the sense that
if event $d$ occurs, then the diagnostic engine is
guaranteed to eventually issue the decision “Yes.”
For the sake of technical simplicity, we assume
hereafter that $\mathcal{L}_M$ is a live language, i.e., any
trace in $\mathcal{L}_M$ can always be extended by one more
event. In this context, we would not want the
diagnostic engine to issue the decision “Maybe”
for an arbitrarily large number of event occurrences
after event $d$ occurs. When this outcome is
possible, we say that event $d$ is not diagnosable
in language $\mathcal{L}_M$.

The property of diagnosability of DES is defined
as follows. In view of the liveness assumption
on language $\mathcal{L}_M$, any string $s'_Y$ that contains
event $d$ can always be extended to a longer string,
meaning that it can be made “arbitrarily long”
after the occurrence of $d$. That is, for any $s'_Y$ in
$\mathcal{L}_M$ and for any $n \in \mathbb{N}$, there exists
$s_Y = s'_Y t \in \mathcal{L}_M$ where the length of $t$ is equal
to $n$. Event $d$ is not diagnosable in language
$\mathcal{L}_M$ if there exists such a string $s_Y$ together
with a second string $s_N$ that does not contain
event $d$, and such that $P(s_Y) = P(s_N)$. This means
that the monitoring agent is unable to distinguish
between $s_Y$ and $s_N$, yet the number of events after
an occurrence of $d$ can be made arbitrarily large
in $s_Y$, thereby preventing diagnosis of event $d$
within a finite number of events after its
occurrence. On the other hand, if no such pair of
strings $(s_Y, s_N)$ exists in $\mathcal{L}_M$, then event $d$ is
diagnosable in $\mathcal{L}_M$. (The mathematically precise
definition of diagnosability is available in the
literature cited at the end of this entry.)

Diagnosis of Automata 

We recall the definition of a deterministic
finite-state automaton, or simply automaton, from
article ► Models for Discrete Event Systems:
An Overview, with the addition of a set of
unobservable events as in section “Supervisory
Control Under Partial Observations” in article
► Supervisory Control of Discrete-Event
Systems. The automaton, denoted by $G$, is a
four-tuple $G = (X, E, f, x_0)$ where $X$ is the
finite set of states, $E$ is the finite set of events
partitioned into $E = E_o \cup E_{uo}$, $x_0$ is the initial
state, and $f$ is the deterministic partial transition
function $f : X \times E \to X$ that is immediately
extended to strings $f : X \times E^* \to X$. For a
DES $M$ represented by an automaton $G$, $\mathcal{L}_M$ is
the language generated by automaton $G$, denoted
by $\mathcal{L}(G)$ and formally defined as the set of all
strings for which the extended $f$ is defined. It
is an infinite set if the transition graph of $G$
has one or more cycles. In view of the liveness
assumption made on $\mathcal{L}_M$ in the preceding section,
$G$ has no reachable deadlocked state, i.e., for all
$s \in E^*$ such that $f(x_0, s)$ is defined, there
exists $\sigma \in E$ such that $f(x_0, s\sigma)$ is also defined.

To synthesize a diagnoser automaton that correctly
performs the diagnostic task formulated
in the preceding section, we proceed as follows.
First, we perform the parallel composition (denoted
by $\|$) of $G$ with the two-state label automaton
$A_{label}$ that is defined as follows:
$A_{label} = (\{N, Y\}, \{d\}, f_{label}, N)$, where $f_{label}$ has
two transitions defined: (i) $f_{label}(N, d) = Y$ and
(ii) $f_{label}(Y, d) = Y$. The purpose of $A_{label}$ is to
record the occurrence of event $d$, which causes
a transition to state $Y$. By forming
$G_{labeled} = G \| A_{label}$, we record in the states of
$G_{labeled}$, which are of the form $(x_G, x_A)$, whether
the first element of the pair, state $x_G \in X$, was
reached or not by executing event $d$ at some point
in the past: if $d$ was executed, then $x_A = Y$;
otherwise $x_A = N$. (We refer the reader to Chap. 2 in
Cassandras and Lafortune (2008) for the formal
definition of parallel composition of automata.)
By construction, $\mathcal{L}(G_{labeled}) = \mathcal{L}(G)$.
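
As an illustration of this first step, the following sketch builds the reachable part of the labeled automaton. It assumes a hypothetical encoding that is not from the cited textbook: the partial transition function is a dictionary mapping (state, event) pairs to next states. Because the event set of the label automaton is {d}, only d changes the label; every other event leaves it untouched. The automaton `G1` below is an assumed example chosen to be consistent with the verbal description of the first example later in this entry; the actual figure is not reproduced here.

```python
def label_composition(trans, x0, d):
    """Reachable part of G || A_label, as a dict over (state, label) pairs."""
    init = (x0, "N")
    labeled = {}
    seen = {init}
    stack = [init]
    while stack:
        x, label = stack.pop()
        for (src, event), dst in trans.items():
            if src != x:
                continue
            # The label switches to Y on d and stays Y forever after.
            new_label = "Y" if (event == d or label == "Y") else "N"
            nxt = (dst, new_label)
            labeled[((x, label), event)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return labeled, init


# Assumed example automaton: d is the unobservable event of interest.
G1 = {(1, "a"): 2, (2, "b"): 2, (2, "c"): 1,
      (2, "d"): 3, (3, "b"): 3, (3, "c"): 3}

labeled, init = label_composition(G1, 1, "d")
for key in sorted(labeled):
    print(key, "->", labeled[key])
```

Since the label component never blocks an event, the labeled automaton generates the same language as the original one, in line with the statement above.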

The second step of the construction of the
diagnoser automaton is to build the observer of
$G_{labeled}$, denoted by $\mathrm{Obs}(G_{labeled})$, with respect
to the set of observable events $E_o$. (We refer
the reader to Chap. 2 in Cassandras and Lafortune
(2008) for the definition of the observer
automaton and for its construction.) The
construction of the observer involves the standard
subset construction algorithm for nondeterministic
automata in automata theory; here, the
unobservable events are the source of nondeterminism,
since they effectively correspond to $\varepsilon$-transitions.
The diagnoser automaton of $G$ with respect to $E_o$
is defined as $\mathrm{Diag}(G) = \mathrm{Obs}(G \| A_{label})$. Its event
set is $E_o$.

The states of $\mathrm{Diag}(G)$ are sets of state pairs
of the form $(x_G, x_A)$ where $x_A$ is either $N$ or $Y$.
Examination of the state of $\mathrm{Diag}(G)$ reached by
string $t \in P[\mathcal{L}(G)]$ provides the answers to the
event diagnosis problem. Let us denote that state
by $x_{Diag}$. Then:

1. The diagnostic decision is Yes if all state pairs
   in $x_{Diag}$ have their second component equal
   to $Y$; we call such a state a “Yes-state” of
   $\mathrm{Diag}(G)$.

2. The diagnostic decision is No if all state pairs
   in $x_{Diag}$ have their second component equal
   to $N$; we call such a state a “No-state” of
   $\mathrm{Diag}(G)$.

3. The diagnostic decision is Maybe if there is
   at least one state pair in $x_{Diag}$ whose second
   component is equal to $Y$ and at least one state
   pair in $x_{Diag}$ whose second component is equal
   to $N$; we call such a state a “Maybe-state” of
   $\mathrm{Diag}(G)$.

To perform run-time diagnosis, it therefore suffices
to examine the current state of $\mathrm{Diag}(G)$.
Note that $\mathrm{Diag}(G)$ can be computed off-line
from $G$ and stored in memory, so that run-time
diagnosis requires only updating the new state of
$\mathrm{Diag}(G)$ on the basis of the most recent observed
event (which is necessarily in $E_o$). If storing
the entire structure of $\mathrm{Diag}(G)$ is impractical, its
current state can be computed on-the-fly on the
basis of the most recent observed event and of the
transition structure of $G_{labeled}$; this involves one
step of the subset construction algorithm.
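
The on-the-fly variant can be sketched as follows. The code is a minimal illustration under assumptions: transitions are stored in a dictionary keyed by (state, event), the labels are tracked directly on top of the original automaton instead of building the labeled automaton first, and each diagnoser state is closed under unobservable transitions after every observation (one common observer convention). The automaton `G1` and all function names are hypothetical.

```python
def unobservable_reach(pairs, trans, unobservable, d):
    """Close a set of (state, label) pairs under unobservable transitions."""
    reach = set(pairs)
    stack = list(pairs)
    while stack:
        x, label = stack.pop()
        for (src, event), dst in trans.items():
            if src == x and event in unobservable:
                nxt = (dst, "Y" if event == d else label)
                if nxt not in reach:
                    reach.add(nxt)
                    stack.append(nxt)
    return frozenset(reach)


def initial_state(x0, trans, unobservable, d):
    return unobservable_reach({(x0, "N")}, trans, unobservable, d)


def update(state, event, trans, unobservable, d):
    """One step of the subset construction on an observed event."""
    moved = {(trans[(x, event)], label)
             for x, label in state if (x, event) in trans}
    if not moved:
        raise ValueError("observed event is inconsistent with the model")
    return unobservable_reach(moved, trans, unobservable, d)


def decision(state):
    labels = {label for _, label in state}
    if labels == {"Y"}:
        return "Yes"
    if labels == {"N"}:
        return "No"
    return "Maybe"


def run(observations, trans, x0, unobservable, d):
    """Decisions issued after each observed event."""
    state = initial_state(x0, trans, unobservable, d)
    decisions = []
    for event in observations:
        state = update(state, event, trans, unobservable, d)
        decisions.append(decision(state))
    return decisions


# Assumed example automaton with unobservable event d.
G1 = {(1, "a"): 2, (2, "b"): 2, (2, "c"): 1,
      (2, "d"): 3, (3, "b"): 3, (3, "c"): 3}

print(run(["a", "c", "c"], G1, 1, {"d"}, "d"))
print(run(["a", "b", "b", "b"], G1, 1, {"d"}, "d"))
```

Only the current set of pairs is stored, so memory use is bounded by the number of (state, label) pairs rather than by the size of the full diagnoser.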

As a simple example, consider the automaton
$G_1$ shown in Fig. 1, where $E_{uo} = \{d\}$. The
occurrence of event $d$ changes the behavior of the
system such that event $c$ does not cause a return to
the initial state 1 (identified by incoming arrow);
instead, the system gets stuck in state 3 after $d$
occurs.

Its diagnoser is depicted in Fig. 2. It contains
one Yes-state, state $\{(3, Y)\}$ (abbreviated as “3Y”
in the figure), one No-state, and two Maybe-states
(similarly abbreviated). Two consecutive occurrences
of event $c$, or an occurrence of $b$ right after
$c$, both indicate that the system must be in state 3,
i.e., that event $d$ must have occurred; this is
captured by the two transitions from Maybe-state
$\{(1, N), (3, Y)\}$ to the Yes-state in $\mathrm{Diag}(G_1)$. As
a second example, consider the automaton $G_2$
shown in Fig. 3, where the self-loop $b$ at state 3
in $G_1$ has been removed. Its diagnoser is shown
in Fig. 4.

Fig. 2 Diagnoser automaton of $G_1$

Fig. 3 Automaton $G_2$

Diagnoser automata provide as much information
as can be inferred, from the available
observations and the automaton model of the system,
regarding the past occurrence of unobservable
event $d$. However, we may want to answer the
question: Can the occurrence of event $d$ always
be diagnosed? This is the realm of diagnosability
analysis discussed in the next section.

Fig. 4 Diagnoser automaton of $G_2$


Diagnosability Analysis of Automata 

As mentioned earlier, diagnosability analysis
consists of determining, a priori, whether any and all
occurrences of event $d$ in $\mathcal{L}_M$ will eventually be
diagnosed. In the case of diagnoser automata, we
do not want $\mathrm{Diag}(G)$ to loop forever in a cycle of
Maybe-states and never enter a Yes-state if event
$d$ has occurred, as happens in the diagnoser in
Fig. 2 for the string $adb^n$ when $n$ gets arbitrarily
large. In this case, $\mathrm{Diag}(G_1)$ loops in Maybe-state
$\{(2, N), (3, Y)\}$, and the occurrence of $d$
goes undetected. This shows that event $d$ is
not diagnosable in $G_1$; the counterexample is
provided by strings $s_Y = adb^n$ and $s_N = ab^n$.

For systems modeled as automata, diagnosability
can be tested with quadratic time complexity
in the size of the state space of $G$ by
forming a so-called twin-automaton (also called
“verifier”) where $G$ is parallel composed with
itself, but synchronization is only enforced on
observable events, allowing arbitrary interleavings
of unobservable events. The test for diagnosability
reduces to detection of cycles that occur after
event $d$ in the twin-automaton. It can be verified
that for automaton $G_2$ in our example, event $d$ is
diagnosable. Indeed, it is clear from the structure
of $G_2$ that $\mathrm{Diag}(G_2)$ in Fig. 4 will never loop in
Maybe-state $\{(2, N), (3, Y)\}$ if event $d$ occurs;
rather, after $d$ occurs, $G_2$ can only execute $c$
events, and after two such events, $\mathrm{Diag}(G_2)$
enters Yes-state $\{(3, Y)\}$.

Note that diagnosability will not hold in an 
automaton that contains a cycle of unobservable 
events after the occurrence of event d , although 
this is not the only instance where the property is 
violated, as we saw in our simple example. 
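
The twin-automaton idea can be sketched as below. This is an illustrative and deliberately unoptimized variant, not the quadratic-time algorithm from the literature. It assumes a dictionary encoding of the transition function, restricts the first copy to behavior without d, lets the second copy behave freely while recording whether it has executed d, and reports non-diagnosability when a reachable cycle after d contains at least one move of the second copy. The automata `G1`, `G2`, and `G3` are assumed examples consistent with the verbal descriptions in this entry.

```python
from collections import deque


def is_diagnosable(trans, x0, observable, d):
    """Search the twin-automaton for a cycle after d in which copy 2 moves."""
    events = {event for (_, event) in trans}

    def successors(state):
        x1, x2, label = state
        for e in events:
            if e in observable:
                # Observable events must be executed by both copies together.
                if (x1, e) in trans and (x2, e) in trans:
                    yield (trans[(x1, e)], trans[(x2, e)], label), True
            else:
                # Copy 1 models strings without d, so it never executes d.
                if e != d and (x1, e) in trans:
                    yield (trans[(x1, e)], x2, label), False
                if (x2, e) in trans:
                    new_label = "Y" if e == d else label
                    yield (x1, trans[(x2, e)], new_label), True

    def reachable_from(start):
        seen = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, _ in successors(u):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        return seen

    for u in reachable_from((x0, x0, "N")):
        if u[2] != "Y":
            continue
        for v, second_copy_moved in successors(u):
            # A cycle through u that extends the string containing d.
            if second_copy_moved and u in reachable_from(v):
                return False
    return True


OBS = {"a", "b", "c"}
G1 = {(1, "a"): 2, (2, "b"): 2, (2, "c"): 1,
      (2, "d"): 3, (3, "b"): 3, (3, "c"): 3}
G2 = {k: v for k, v in G1.items() if k != (3, "b")}
# Unobservable cycle (event u) after d, observable event a only.
G3 = {(1, "a"): 1, (1, "d"): 2, (2, "u"): 2}

print(is_diagnosable(G1, 1, OBS, "d"))
print(is_diagnosable(G2, 1, OBS, "d"))
print(is_diagnosable(G3, 1, {"a"}, "d"))
```

In this sketch the cycle check repeats a reachability search per edge, so it is slower than a single cycle detection pass over the verifier; that trade-off is made only to keep the example short.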

Diagnosis and Diagnosability 
Analysis of Petri Nets 

There are several approaches for diagnosis and
diagnosability analysis of DES modeled by Petri
nets, depending on the boundedness properties
of the net and on what is observable about its
behavior. Let $\mathcal{N}$ be the Petri net model of the
system, which consists of a Petri net structure
along with an initial marking of all places. If the
transitions of $\mathcal{N}$ are labeled by events in a set $E$,
some by observable events in $E_o$ and some by
unobservable events in $E_{uo}$, and if the contents
of the Petri net places are not observed except
for the initial marking of $\mathcal{N}$, then we have a
language-based diagnosis problem as considered
so far in this entry, for language $\mathcal{L}(\mathcal{N})$ and for
$E = E_o \cup E_{uo}$, with event of interest $d \in E_{uo}$.
In this case, if the set of reachable states of the
net is bounded, then we can use the reachability
graph as an equivalent automaton model of the
same system and build a diagnoser automaton
as described earlier. It is also possible to encode
diagnoser states into the original structure
of net $\mathcal{N}$ by keeping track of all possible net
markings following the observation of an event
in $E_o$, appending the appropriate label (“N” or
“Y”) to each marking in the state estimate. This
is reminiscent of the on-the-fly construction of
the current state of the diagnoser automaton
discussed earlier, except that the possible system
states are directly listed as Petri net markings
on the structure of $\mathcal{N}$. Regarding diagnosability
analysis, it can be performed using the
twin-automaton technique of the preceding section,
from the reachability graph of $\mathcal{N}$.
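
For a bounded net, the reachability graph mentioned above can be generated by a breadth-first exploration of markings. The sketch below uses an assumed encoding: a marking is a tuple of token counts, and each net transition is a triple (event label, tokens consumed per place, tokens produced per place). The `max_markings` guard is a hypothetical safeguard added for illustration, since the exploration does not terminate on unbounded nets.

```python
from collections import deque


def reachability_graph(initial, transitions, max_markings=10_000):
    """Explore reachable markings; edges map (marking, label) to a set."""
    initial = tuple(initial)
    edges = {}
    seen = {initial}
    queue = deque([initial])
    while queue:
        m = queue.popleft()
        for label, pre, post in transitions:
            # A transition is enabled if every place holds enough tokens.
            if all(tokens >= need for tokens, need in zip(m, pre)):
                nxt = tuple(t - p + q for t, p, q in zip(m, pre, post))
                edges.setdefault((m, label), set()).add(nxt)
                if nxt not in seen:
                    if len(seen) >= max_markings:
                        raise RuntimeError("marking limit exceeded; "
                                           "the net may be unbounded")
                    seen.add(nxt)
                    queue.append(nxt)
    return seen, edges


# Hypothetical three-place net in which a single token marks the state.
NET = [
    ("a", (1, 0, 0), (0, 1, 0)),
    ("b", (0, 1, 0), (0, 1, 0)),
    ("c", (0, 1, 0), (1, 0, 0)),
    ("d", (0, 1, 0), (0, 0, 1)),
    ("b", (0, 0, 1), (0, 0, 1)),
    ("c", (0, 0, 1), (0, 0, 1)),
]

markings, edges = reachability_graph((1, 0, 0), NET)
print(sorted(markings))
print(edges[((0, 1, 0), "d")])
```

Successors are stored as sets because two net transitions with the same event label may be enabled in the same marking; when every set is a singleton, the graph is a deterministic automaton and the constructions of the earlier sections apply directly.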

Another approach that is actively being pursued
in current literature is to exploit the structure
of the net model $\mathcal{N}$ for diagnosis and for
diagnosability analysis, instead of working with
the automaton model obtained from its reachability
graph. In addition to potential computational
gains from avoiding the explicit generation of the
entire set of reachable states, this approach is
motivated by the need to handle Petri nets whose sets
of reachable states are infinite and in particular
Petri nets that generate languages that are not
regular and hence cannot be represented by
finite-state automata. Moreover, in this approach, one
can incorporate potential observability of token
contents in the places of the Petri nets. We refer
the interested reader to the relevant chapters in
Campos et al. (2013) and Seatzu et al. (2013) for
coverage of these topics.

Current and Future Directions 

The basic methodologies described so far for 
diagnosis and diagnosability analysis have been 
extended in many different directions. We briefly 
discuss a few of these directions, which are active 
research areas. Detailed coverage of these topics 
is beyond the scope of this entry and is available 
in the references listed at the end. 

Diagnosis of timed models of DES has been
considered, for classes of timed automata and
timed Petri nets, where the objective is to ensure
that each occurrence of event $d$ is detected within
a bounded time delay. Diagnosis of stochastic
models of DES has been considered, in particular
stochastic automata, where the hard diagnosability
constraints are relaxed and detection of
each occurrence of event $d$ must be guaranteed
with some probability $1 - \epsilon$, for some small
$\epsilon > 0$. Stochastic models also allow handling of
unreliable sensors or noisy environments where
event observations may be corrupted with some
probability, such as when an occurrence of event
$a$ is observed as event $a$ 80% of the time and as
some other event $a'$ 20% of the time.

Decentralized diagnosis is concerned with
DES that are observed by several monitoring
agents $i = 1, \ldots, n$, each with its own set of
observable events $E_{o,i}$ and each having access
to the entire set of system behaviors, $\mathcal{L}_M$. The
task is to design a set of individual diagnosers,
one for each set $E_{o,i}$, such that the $n$ diagnosers
together diagnose all the occurrences of event $d$.
In other words, for each occurrence of event
$d$ in any string of $\mathcal{L}_M$, there exists at least
one diagnoser that will detect it (i.e., answer
“Yes”). The individual diagnosers may or may
not communicate with each other at run-time,
or they may communicate with a coordinating
diagnoser that will fuse their information; several
decentralized diagnosis architectures have been
studied and their properties characterized. The
focus in these works is the decentralized nature
of the information available about the strings in
$\mathcal{L}_M$, as captured by the individual observable
event sets $E_{o,i}$, $i = 1, \ldots, n$.

Distributed diagnosis is closely related to
decentralized diagnosis, except that it normally
refers to situations where each individual
diagnoser uses only part of the entire system
model. Let $M$ be an automaton $G$ obtained
by parallel composition of subsystem models:
$G = \|_{i=1,\ldots,n}\, G_i$. In distributed diagnosis, one
would want to design each individual diagnoser
$\mathrm{Diag}_i$ on the basis of $G_i$ alone or on the basis
of $G_i$ and of an abstraction of the rest of the
system, $\|_{j=1,\ldots,n;\, j \neq i}\, G_j$. Here, the emphasis is on
the distributed nature of the system, as captured
by the parallel composition operation. In the case
where $M$ is a Petri net $\mathcal{N}$, the distributed nature
of the system may be captured by individual net
models $\mathcal{N}_i$, $i = 1, \ldots, n$, that are coupled by
common places, i.e., place-bordered Petri nets.

Robust diagnosis generally refers to decentralized
or distributed diagnosis, but where one or
more of the individual diagnosers may fail. Thus,
there must be built-in redundancy in the set of
individual diagnosers so that they together may
still detect every occurrence of event $d$ even if
one or more of them cease to operate.

So far we have considered a fixed and static
set of observable events, $E_o \subseteq E$, where every
occurrence of each event in $E_o$ is always
observed by the monitoring agent. However, there
are many instances where one would want the
monitoring agent to dynamically activate or
deactivate the observability properties of a subset of
the events in $E_o$; this arises in situations where
event monitoring is “costly” in terms of energy,
bandwidth, or security reasons. This is referred to
as the case of dynamic observations, and the goal
is to synthesize sensor activation policies that
minimize a given cost function while preserving
the diagnosability properties of the system.

Cross-References 

► Models for Discrete Event Systems: An 
Overview 

► Modeling, Analysis, and Control with Petri 
Nets 

► Supervisory Control of Discrete-Event Systems 


Recommended Reading 

There is a very large amount of literature on
diagnosis and diagnosability analysis of DES
that has been published in control engineering,
computer science, and artificial intelligence
journals and conference proceedings. We mention
a few recent books or survey articles that are
a good starting point for readers interested in
learning more about this active area of research.
In the DES literature, the study of fault diagnosis
and the formalization of diagnosability properties
started in Lin (1994) and Sampath et al.
(1995). Chapter 2 of the textbook Cassandras
and Lafortune (2008) contains basic results about
diagnoser automata and diagnosability analysis
of DES, following the approach introduced in
Sampath et al. (1995). The research monograph
Lamperti and Zanella (2003) presents DES
diagnostic methodologies developed in the artificial
intelligence literature. The survey paper Zaytoon
and Lafortune (2013) presents a detailed
overview of fault diagnosis research in the
control engineering literature. The two edited books
Campos et al. (2013) and Seatzu et al. (2013)
contain chapters specifically devoted to diagnosis
of automata and Petri nets, with an emphasis
on automated manufacturing applications for the
latter. Specifically, Chaps. 5, 14, 15, 17, and 19 in
Campos et al. (2013) and Chaps. 22-25 in Seatzu
et al. (2013) are recommended for further reading
on several aspects of DES diagnosis. Zaytoon
and Lafortune (2013) and the cited chapters in
Campos et al. (2013) and Seatzu et al. (2013)
contain extensive bibliographies.


Differential Geometric Methods in Nonlinear Control

Abstract

In the early 1970s, concepts from differential
geometry were introduced to study nonlinear
control systems. The leading researchers in this
effort were Roger Brockett, Robert Hermann,
Henry Hermes, Alberto Isidori, Velimir Jurdjevic,
Arthur Krener, Claude Lobry, and Hector
Sussmann. These concepts revolutionized our
knowledge of the analytic properties of control
systems, e.g., controllability, observability,
minimality, and decoupling. With these concepts,
a theory of nonlinear control systems emerged
that generalized the linear theory. This theory of
nonlinear systems is largely parallel to the linear
theory, but of course it is considerably more
complicated.

Keywords 

Codistribution; Distribution; Frobenius theorem; 
Involutive distribution; Lie jet 

Introduction 

This is a brief survey of the influence of differential
geometric concepts on the development of
nonlinear systems theory. Section “A Primer on
Differential Geometry” reviews some concepts
and theorems of differential geometry. Nonlinear
controllability and nonlinear observability are
discussed in sections “Controllability of Nonlinear
Systems” and “Observability for Nonlinear
Systems”. Section “Minimal Realizations” discusses
minimal realizations of nonlinear systems,
and section “Disturbance Decoupling” discusses
the disturbance decoupling problem.

A Primer on Differential Geometry 

Perhaps a better title might be “A Primer on
Differential Topology” since we will not treat
Riemannian or other metrics. An $n$-dimensional
manifold $\mathcal{M}$ is a topological space that is locally
homeomorphic to a subset of $\mathbb{R}^n$. For simplicity,
we shall restrict our attention to smooth ($C^\infty$)
manifolds and smooth objects on them. Around
each point $p \in \mathcal{M}$, there is at least one coordinate
chart that is a neighborhood $\mathcal{N}_p \subset \mathcal{M}$ and a
homeomorphism $x : \mathcal{N}_p \to \mathcal{U}$ where $\mathcal{U}$ is
an open subset of $\mathbb{R}^n$. When two coordinate
charts overlap, the change of coordinates should
be smooth. For simplicity, we restrict our attention
to differential geometric objects described in
local coordinates.

In local coordinates, a vector field is just an
ODE of the form

$$\dot{x} = f(x) \qquad (1)$$

where $f(x)$ is a smooth $\mathbb{R}^{n \times 1}$ valued function of
$x$. In a different coordinate chart with local
coordinates $z$, this vector field would be represented
by a different formula:

$$\dot{z} = g(z)$$

If the charts overlap, then on the overlap they are
related:

$$f(x(z)) = \frac{\partial x}{\partial z}(z)\, g(z), \qquad g(z(x)) = \frac{\partial z}{\partial x}(x)\, f(x)$$

Since $f(x)$ is smooth, it generates a smooth
flow $\phi(t, x^0)$ where for each $t$, the mapping
$x \mapsto \phi(t, x)$ is a local diffeomorphism and for
each $x^0$ the mapping $t \mapsto \phi(t, x^0)$ is a solution
of the ODE (1) satisfying the initial condition
$\phi(0, x^0) = x^0$. We assume that all the flows
are complete, i.e., defined for all $t \in \mathbb{R}$,
$x \in \mathcal{M}$. The flows are one parameter groups, i.e.,
$\phi(t, \phi(s, x^0)) = \phi(t + s, x^0) = \phi(s, \phi(t, x^0))$.
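
The one parameter group property can be checked numerically on a simple case. The sketch below assumes the planar vector field $f(x) = (x_2, -x_1)'$, an example chosen here for illustration and not taken from the entry, whose flow is a rotation and is therefore complete and known in closed form.

```python
import math


def flow(t, x):
    """Flow of the assumed field f(x) = (x2, -x1): a rotation by angle t."""
    c, s = math.cos(t), math.sin(t)
    return (c * x[0] + s * x[1], -s * x[0] + c * x[1])


def close(u, v, tol=1e-12):
    """Componentwise comparison up to a small tolerance."""
    return all(abs(a - b) <= tol for a, b in zip(u, v))


x0 = (1.0, 2.0)
t, s = 0.7, -1.3

print(close(flow(0.0, x0), x0))                      # initial condition
print(close(flow(t, flow(s, x0)), flow(t + s, x0)))  # group property
print(close(flow(s, flow(t, x0)), flow(t + s, x0)))  # order does not matter
```

For vector fields without a closed-form flow, the same checks could be run against a numerical ODE solver, at the cost of a looser tolerance.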

If $f(x) = b$, a constant vector, then locally
the flow looks like translation, $\phi(t, x^1) = x^1 + tb$.
If $f(x^0) \neq 0$, then we can always choose
local coordinates $z$ so that in these coordinates
the vector field is constant. Without loss of
generality, we can assume that $x^0 = 0$ and that the
first component of $f$ is $f_1(0) \neq 0$. Define the
local change of coordinates: $x(z) = \phi(z_1, x^1(z))$
where $x^1(z) = (0, z_2, \ldots, z_n)'$. It is not hard to
see that this is a local diffeomorphism and that, in $z$
coordinates, the vector field is the first unit vector.

If $f(x^0) = 0$, let $F = \frac{\partial f}{\partial x}(x^0)$; then if all the
eigenvalues of $F$ are off the imaginary axis,
the integral curves of $f(x)$ and

$$\dot{z} = Fz \qquad (2)$$

are locally topologically equivalent (Arnol’d
1983). This is the Grobman-Hartman theorem.
There exists a local homeomorphism $z = h(x)$
that carries $x(t)$ trajectories into $z(s)$ trajectories
in some neighborhood of $x^0 = 0$. This
homeomorphism need not preserve time, $t \neq s$,
but it does preserve the direction of time. Whether
these flows are locally diffeomorphic is a more
difficult question that was explored by Poincaré.
See the section on feedback linearization (Krener
2013).

If all the eigenvalues of $F$ are in the open
left half plane, then the linear dynamics (2) is
globally asymptotically stable around $z^0 = 0$, i.e.,
if the flow of (2) is $\psi(t, z)$, then $\psi(t, z^1) \to 0$
as $t \to \infty$. Then it can be shown that the
nonlinear dynamics is locally asymptotically stable,
$\phi(t, x^1) \to x^0$ as $t \to \infty$ for all $x^1$ in some
neighborhood of $x^0$.

One forms, $\omega(x)$, are dual to vector fields.
The simplest example of a one form (also called
a covector field) is the differential $dh(x)$ of a
scalar-valued smooth function $h(x)$. This is the
$\mathbb{R}^{1 \times n}$ covector field

$$\omega(x) = \left[ \frac{\partial h}{\partial x_1}(x) \;\; \cdots \;\; \frac{\partial h}{\partial x_n}(x) \right]$$

Sometimes this is written as

$$\omega(x) = \sum_{i=1}^{n} \frac{\partial h}{\partial x_i}(x)\, dx_i$$

The most general smooth one form is of the form

$$\omega(x) = \left[\, \omega_1(x) \;\; \cdots \;\; \omega_n(x) \,\right] = \sum_{i=1}^{n} \omega_i(x)\, dx_i$$

where the $\omega_i(x)$ are smooth functions. The
duality between one forms and vector fields is the
bilinear pairing

$$\langle \omega(x), f(x) \rangle = \omega(f)(x) = \omega(x) f(x) = \sum_{i=1}^{n} \omega_i(x) f_i(x)$$
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Just as a vector field can be thought of as a 
first-order ODE, a one form can be thought of 
as a first-order PDE. Given oo(x), find h(x) such 
that dh(x) = co(x). A one form co(x) is said to 
be exact if there exists such an h(x). Of course 
if there is one solution, then there are many all 
differing by a constant of integration which we 
can take as the value of h at some point x°. 

Unlike smooth first-order ODEs, smooth first-order PDEs do not always have a solution. There are integrability conditions and topological conditions that must be satisfied. Suppose $dh(x) = \omega(x)$, then $\frac{\partial h}{\partial x_i}(x) = \omega_i(x)$ so

$$\frac{\partial \omega_i}{\partial x_j}(x) = \frac{\partial^2 h}{\partial x_j \partial x_i}(x) = \frac{\partial^2 h}{\partial x_i \partial x_j}(x) = \frac{\partial \omega_j}{\partial x_i}(x)$$

Therefore, for the PDE to have a solution, the integrability conditions

$$\frac{\partial \omega_i}{\partial x_j}(x) - \frac{\partial \omega_j}{\partial x_i}(x) = 0$$

must be satisfied. The exterior derivative of a one form is a skew-symmetric matrix field

$$d\omega(x) = \sum_{i<j} \left( \frac{\partial \omega_j}{\partial x_i}(x) - \frac{\partial \omega_i}{\partial x_j}(x) \right) dx_i \wedge dx_j$$

A one form $\omega(x)$ is said to be closed if $d\omega(x) = 0$. This is locally sufficient for there to exist an $h(x)$ such that $dh(x) = \omega(x)$.

Every exact form is closed but not every closed form is exact. A counterexample on $\mathbb{R}^2 \setminus \{0\}$ is

$$\omega(x) = \begin{bmatrix} \dfrac{-x_2}{x_1^2 + x_2^2} & \dfrac{x_1}{x_1^2 + x_2^2} \end{bmatrix}$$

This is closed but not exact. The line integral of this covector field around any circle centered at the origin is $2\pi$. If it were exact, the line integral would have been zero because the curve ends where it begins.

The Lie derivative of a scalar-valued function $h(x)$ by a vector field $f(x)$ is defined to be the scalar-valued function

$$L_f(h)(x) = \frac{\partial h}{\partial x}(x) f(x) = \langle dh(x), f(x) \rangle$$

This can be iterated

$$L_f^k(h)(x) = \frac{\partial L_f^{k-1}(h)}{\partial x}(x) f(x)$$

If $h(x) \in \mathbb{R}^{p \times 1}$, then $L_f(h)(x) \in \mathbb{R}^{p \times 1}$.

The Lie bracket of two vector fields $f^1(x)$ and $f^2(x)$ is another vector field

$$[f^1, f^2](x) = \frac{\partial f^2}{\partial x}(x) f^1(x) - \frac{\partial f^1}{\partial x}(x) f^2(x)$$

Clearly, the Lie bracket is skew symmetric, $[f^1, f^2](x) = -[f^2, f^1](x)$. It also satisfies the Jacobi identity

$$[f^1, [f^2, f^3]](x) + [f^2, [f^3, f^1]](x) + [f^3, [f^1, f^2]](x) = 0$$

Repeated Lie brackets are often expressed inductively as

$$\mathrm{ad}_f^0(g)(x) = g(x)$$

$$\mathrm{ad}_f^k(g)(x) = \left[ f, \mathrm{ad}_f^{k-1}(g) \right](x)$$

The geometric interpretation of the Lie bracket $[f, g](x)$ is the infinitesimal commutator of their flows $\phi(t,x)$ and $\psi(t,x)$, i.e.,

$$\psi(t, \phi(t,x)) - \phi(t, \psi(t,x)) = [f, g](x)\, t^2 + O(t^3)$$

Another interpretation of the Lie bracket is given by the Lie series expansion

$$g(\phi(t,x)) = \sum_{k=0}^{\infty} (-1)^k \frac{t^k}{k!}\, \mathrm{ad}_f^k(g)(x).$$

This is a Taylor series expansion which is convergent for small $|t|$ if $f, g$ are real analytic vector fields. Another Lie series is

$$h(\phi(t,x)) = \sum_{k=0}^{\infty} \frac{t^k}{k!}\, L_f^k(h)(x).$$

Given a smooth mapping $z = \Phi(x)$ from an open subset of $\mathbb{R}^n$ to an open subset of $\mathbb{R}^m$ and vector fields $f(x)$ and $g(z)$ on these open subsets, we say $f(x)$ is $\Phi$-related to $g(z)$ if

$$g(\Phi(x)) = \frac{\partial \Phi}{\partial x}(x) f(x)$$

It is not hard to see that if $f^1(x), f^2(x)$ are $\Phi$-related to $g^1(z), g^2(z)$, then $[f^1, f^2](x)$ is $\Phi$-related to $[g^1, g^2](z)$. For this reason, we say that the Lie bracket is an intrinsic differentiation. The other intrinsic differentiation is the exterior derivative operation $d$.
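
The bracket convention, skew symmetry, and the Jacobi identity can be checked by hand in a simple special case. The sketch below assumes linear vector fields $f^i(x) = A_i x$, a restriction made here only for illustration. For such fields the Jacobian of $f^i$ is $A_i$, so the bracket defined above reduces to $(A_2 A_1 - A_1 A_2)x$. The matrices and helper names are hypothetical.

```python
# Lie bracket of linear vector fields f_i(x) = A_i x, using the convention
# [f1, f2](x) = (df2/dx) f1(x) - (df1/dx) f2(x) = (A2 A1 - A1 A2) x.
# The matrices below are illustrative choices, not taken from the text.

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def neg(a):
    """Entrywise negation of a matrix."""
    return [[-x for x in row] for row in a]

def bracket(a1, a2):
    """Matrix representing [f1, f2] when f_i(x) = A_i x."""
    return add(matmul(a2, a1), neg(matmul(a1, a2)))

A1 = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
A2 = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
A3 = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]

# Skew symmetry: [f1, f2] = -[f2, f1]
skew_ok = bracket(A1, A2) == neg(bracket(A2, A1))

# Jacobi identity: [f1,[f2,f3]] + [f2,[f3,f1]] + [f3,[f1,f2]] = 0
jacobi_sum = add(add(bracket(A1, bracket(A2, A3)),
                     bracket(A2, bracket(A3, A1))),
                 bracket(A3, bracket(A1, A2)))
zero = [[0] * 3 for _ in range(3)]
jacobi_ok = jacobi_sum == zero

print(skew_ok, jacobi_ok)
```

Integer matrices keep the arithmetic exact, so the identities hold with equality rather than up to rounding.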

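The Lie derivative of a one form $\omega(x)$ by a vector field $f(x)$ is given by

$$L_f(\omega)(x) = \sum_{i,j} \left( \frac{\partial \omega_i}{\partial x_j}(x) f_j(x) + \omega_j(x) \frac{\partial f_j}{\partial x_i}(x) \right) dx_i$$

It is not hard to see that

$$L_f(\langle \omega, g \rangle)(x) = \langle L_f(\omega), g \rangle(x) + \langle \omega, [f, g] \rangle(x)$$

and

$$L_f(dh)(x) = d(L_f(h))(x) \qquad (3)$$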

Control systems involve multiple vector fields. A distribution $\mathcal{D}$ is a set of vector fields on $\mathcal{M}$ that is closed under addition of vector fields and under multiplication by scalar functions. A distribution defines at each $x \in \mathcal{M}$ a subspace of the tangent space

$$D(x) = \{ f(x) : f \in \mathcal{D} \}$$

These subspaces form a subbundle $D$ of the tangent bundle. If the subspaces are all of the same dimension, then the distribution is said to be nonsingular. We will restrict our attention to nonsingular distributions.

A codistribution (or Pfaffian system) $\mathcal{E}$ is a set of one forms on $\mathcal{M}$ that is closed under addition and multiplication by scalar functions. A codistribution defines at each $x \in \mathcal{M}$ a subspace of the cotangent space

$$E(x) = \{ \omega(x) : \omega \in \mathcal{E} \}$$

These subspaces form a subbundle $E$ of the cotangent bundle. If the subspaces are all of the same dimension, then the codistribution is said to be nonsingular. Again, we will restrict our attention to nonsingular codistributions.

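Every distribution $\mathcal{D}$ defines a dual codistribution

$$\mathcal{D}^* = \{ \omega(x) : \omega(x) f(x) = 0, \text{ for all } f(x) \in \mathcal{D} \}$$

and vice versa

$$\mathcal{E}^* = \{ f(x) : \omega(x) f(x) = 0, \text{ for all } \omega(x) \in \mathcal{E} \}$$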

A $k$ dimensional distribution $\mathcal{D}$ (or its dual codistribution $\mathcal{D}^*$) can be thought of as a system of PDEs on $\mathcal{M}$. Find $n-k$ independent functions $h_1(x), \ldots, h_{n-k}(x)$ such that

$$dh_i(x) f(x) = 0 \quad \text{for all } f(x) \in \mathcal{D}$$

The functions $h_1(x), \ldots, h_{n-k}(x)$ are said to be independent if $dh_1(x), \ldots, dh_{n-k}(x)$ are linearly independent at every $x \in \mathcal{M}$. In other words, $dh_1(x), \ldots, dh_{n-k}(x)$ span $\mathcal{D}^*$ over the space of smooth functions.

The Frobenius theorem gives the integrability conditions for these functions to exist locally. The distribution $\mathcal{D}$ must be involutive, i.e., closed under the Lie bracket,

$$[\mathcal{D}, \mathcal{D}] = \{ [f, g] : f, g \in \mathcal{D} \} \subset \mathcal{D}$$

When the functions exist, their joint level sets $\{ x : h_i(x) = c_i, i = 1, \ldots, n-k \}$ are the leaves of a local foliation. Through each $x^0$ in a convex local coordinate chart $\mathcal{N}$, there exists locally a $k$-dimensional submanifold $\{ x \in \mathcal{N} : h_i(x) = h_i(x^0) \}$. At each $x^1$ in this submanifold, its tangent space is $D(x^1)$. Whether these $h_i(x)$ exist globally to define a global foliation, a partition of $\mathcal{M}$ into smooth submanifolds, is a delicate question. Consider a distribution on $\mathbb{R}^2$ generated by a constant vector field $f(x) = b$ of irrational slope, i.e., $b_2/b_1$ is irrational. Construct the torus $T^2$ as the quotient of $\mathbb{R}^2$ by the integer lattice $\mathbb{Z}^2$. The distribution passes to the quotient and since it is one dimensional, it is clearly involutive. The leaves of the quotient distribution are curves that wind around the torus indefinitely, and each curve is dense in $T^2$. Therefore, any smooth function $h(x)$ that is constant on such a leaf is constant on all of $T^2$. Hence, the local foliation does not extend to a global foliation.
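
A rough numerical illustration of the density claim is to follow the leaf through the origin, $x(t) = t\,b \bmod 1$, and record how close it comes to some target point on the torus. The direction $b = (1, \sqrt{2})$, the target point, the time horizons, and the step size are all assumptions chosen for illustration. A finite computation can only suggest density; it does not prove it.

```python
import math

def torus_distance(p, q):
    """Distance on T^2 = R^2 / Z^2 between points given by representatives."""
    total = 0.0
    for a, b in zip(p, q):
        delta = abs(a - b) % 1.0
        delta = min(delta, 1.0 - delta)   # shortest way around each circle
        total += delta * delta
    return math.sqrt(total)

def closest_approach(b, target, t_max, dt):
    """Smallest sampled distance from the leaf x(t) = t*b mod 1 to target."""
    best = float("inf")
    steps = int(t_max / dt)
    for i in range(steps + 1):
        t = i * dt
        point = ((t * b[0]) % 1.0, (t * b[1]) % 1.0)
        best = min(best, torus_distance(point, target))
    return best

b = (1.0, math.sqrt(2.0))      # irrational slope b2/b1 (illustrative)
target = (0.3, 0.7)            # arbitrary point on the torus
short_run = closest_approach(b, target, t_max=5.0, dt=0.001)
long_run = closest_approach(b, target, t_max=500.0, dt=0.001)
print(short_run, long_run)
```

Following the leaf for longer brings it closer to the target, which is what density of the leaf predicts.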

Another delicate question is whether the quotient space of $\mathcal{M}$ by a foliation induced by an involutive distribution is a smooth manifold (Sussmann 1975). This is always true locally, but it may not hold globally. Think of the foliation of $T^2$ discussed above.

Given $k < n$ vector fields $f^1(x), \ldots, f^k(x)$ that are linearly independent at each $x$ and that commute, $[f^i, f^j](x) = 0$, there exists a local change of coordinates $z = z(x)$ so that in the new coordinates the vector fields are the first $k$ unit vectors.

The involutive closure $\bar{\mathcal{D}}$ of $\mathcal{D}$ is the smallest involutive distribution containing $\mathcal{D}$. As with all distributions, we always assume implicitly that $\bar{\mathcal{D}}$ is nonsingular. A point $x^1$ is $\mathcal{D}$-accessible from $x^0$ if there exists a continuous and piecewise smooth curve joining $x^0$ to $x^1$ whose left and right tangent vectors are always in $D$. Obviously $\mathcal{D}$-accessibility is an equivalence relation. Chow's theorem (1939) asserts that its equivalence classes are the leaves of the foliation induced by $\bar{\mathcal{D}}$. Chow's theorem goes a step further. Suppose $f^1(x), \ldots, f^k(x)$ span $D(x)$ at each $x \in \mathcal{M}$; then given any two points $x^0, x^1$ in a leaf of $\bar{\mathcal{D}}$, there is a continuous and piecewise smooth curve joining $x^0$ to $x^1$ whose left and right tangent vectors are always one of the $f^i(x)$.

Controllability of Nonlinear Systems 

An initialized nonlinear system that is affine in the control is of the form

$$\dot{x} = f(x) + g(x)u = f(x) + \sum_{j=1}^{m} g^j(x) u_j$$

$$y = h(x)$$

$$x(0) = x^0 \qquad (4)$$

where the state $x$ are local coordinates on an $n$-dimensional manifold $\mathcal{M}$, the control $u$ is restricted to lie in some set $U \subset \mathbb{R}^m$, and the output $y$ takes values in $\mathbb{R}^p$. We shall only consider such systems.

A particular case is a linear system of the form

$$\dot{x} = Fx + Gu$$

$$y = Hx \qquad (5)$$

$$x(0) = x^0$$

where $\mathcal{M} = \mathbb{R}^n$ and $U = \mathbb{R}^m$.

The set $\mathcal{A}_t(x^0)$ of points accessible at time $t > 0$ from $x^0$ is the set of all $x^1 \in \mathcal{M}$ such that there exists a bounded, measurable control trajectory $u(s) \in U$, $0 \le s \le t$, so that the solution of (4) satisfies $x(t) = x^1$. We define $\mathcal{A}(x^0)$ as the union of $\mathcal{A}_t(x^0)$ for all $t > 0$. The system (4) is said to be controllable at time $t > 0$ if $\mathcal{A}_t(x^0) = \mathcal{M}$ and controllable in forward time if $\mathcal{A}(x^0) = \mathcal{M}$.

For linear systems, controllability is a rather straightforward matter, but for nonlinear systems it is more subtle with numerous variations. The variation of constants formula gives the solution of the linear system (5) as

$$x(t) = e^{Ft} x^0 + \int_0^t e^{F(t-s)} G u(s)\, ds$$

so $\mathcal{A}_t(x^0)$ is an affine subspace of $\mathbb{R}^n$ for any $t > 0$. It is not hard to see that the columns of

$$\begin{bmatrix} G & FG & \cdots & F^{n-1}G \end{bmatrix} \qquad (6)$$

are tangent to this affine subspace, so if this matrix is of rank $n$, then $\mathcal{A}_t(x^0) = \mathbb{R}^n$ for any $t > 0$. This is the so-called controllability rank condition for linear systems.
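
The rank condition on (6) is easy to test numerically. The following Python sketch builds the matrix (6) and computes its rank with exact rational arithmetic. The double-integrator example and all function names are assumptions made for illustration.

```python
from fractions import Fraction

def matmul(a, b):
    """Matrix product of a (p x n) and b (n x m), both lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def rank(m):
    """Rank by Gauss-Jordan elimination over exact rationals."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def controllability_matrix(F, G):
    """Horizontally stack G, FG, ..., F^(n-1) G as in (6)."""
    n = len(F)
    block = G
    blocks = []
    for _ in range(n):
        blocks.append(block)
        block = matmul(F, block)
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]

# Double integrator x1' = x2, x2' = u (illustrative example).
F = [[0, 1], [0, 0]]
G = [[0], [1]]
C = controllability_matrix(F, G)
print(C, rank(C))            # full rank: the rank condition holds

# Same F, but the input only drives x1: the rank drops below n.
G_bad = [[1], [0]]
print(rank(controllability_matrix(F, G_bad)))
```

Exact fractions avoid the tolerance choice that a floating-point rank computation would need.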

Turning to the nonlinear system (4), let $\mathcal{D}$ be the distribution spanned by the vector fields $f(x), g^1(x), \ldots, g^m(x)$, and let $\bar{\mathcal{D}}$ be its involutive closure. It is clear that $\mathcal{A}(x^0)$ is contained in the leaf of $\bar{\mathcal{D}}$ through $x^0$. Krener (1971) showed that $\mathcal{A}(x^0)$ has nonempty interior in this leaf. More precisely, $\mathcal{A}(x^0)$ is between an open set and its closure in the relative topology of the leaf.





Differential Geometric Methods in Nonlinear Control 


Let $\mathcal{D}_0$ be the smallest distribution containing the vector fields $g^1(x), \ldots, g^m(x)$ and invariant under bracketing by $f(x)$, i.e.,

$$[f, \mathcal{D}_0] \subset \mathcal{D}_0$$

and let $\bar{\mathcal{D}}_0$ be its involutive closure. Sussmann and Jurdjevic (1972) showed that $\mathcal{A}_t(x^0)$ is in the leaf of $\bar{\mathcal{D}}_0$ through $x^0$, and it is between an open set and its closure in the topology of this leaf.

For linear systems (5) where $f(x) = Fx$ and $g^j(x) = G^j$, the $j$th column of $G$, it is not hard to see that

$$\mathrm{ad}_f^k(g^j)(x) = (-1)^k F^k G^j$$

so the nonlinear generalization of the controllability rank condition is that at each $x$ the dimension of $\bar{D}_0(x)$ is $n$. This guarantees that $\mathcal{A}_t(x^0)$ is between an open set and its closure in the topology of $\mathcal{M}$.
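
The identity for repeated brackets of a linear system can be checked numerically from the definition of the Lie bracket, $[f, g](x) = \frac{\partial g}{\partial x}(x) f(x) - \frac{\partial f}{\partial x}(x) g(x)$. The sketch below approximates the Jacobians by central differences, which are exact up to rounding for the affine fields used here. The matrix `F`, the column `G1`, the evaluation point, and the step size are illustrative assumptions.

```python
def jacobian(func, x, h=1e-3):
    """Central-difference Jacobian of func: R^n -> R^n at the point x."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        cols.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    # cols[j][i] approximates d func_i / d x_j; return it as rows indexed by i
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def lie_bracket(f, g):
    """Return the vector field [f, g](x) = Dg(x) f(x) - Df(x) g(x)."""
    def bracket(x):
        dg_f = matvec(jacobian(g, x), f(x))
        df_g = matvec(jacobian(f, x), g(x))
        return [a - b for a, b in zip(dg_f, df_g)]
    return bracket

F = [[0.0, 1.0], [-2.0, -3.0]]   # illustrative system matrix
G1 = [1.0, 1.0]                  # illustrative input column

f = lambda x: matvec(F, x)       # f(x) = F x
g = lambda x: list(G1)           # constant vector field g(x) = G1

x = [0.7, -0.4]                  # arbitrary evaluation point
ad1 = lie_bracket(f, g)          # ad_f^1(g)
ad2 = lie_bracket(f, ad1)        # ad_f^2(g)

FG = matvec(F, G1)
FFG = matvec(F, FG)
print(ad1(x), FG)                # ad1 should equal -F G1
print(ad2(x), FFG)               # ad2 should equal  F^2 G1
```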

The condition that

$$\text{dimension } \bar{D}(x) = n \qquad (7)$$

is referred to as the nonlinear controllability rank condition. This guarantees that $\mathcal{A}(x^0)$ is between an open set and its closure in the topology of $\mathcal{M}$.

There are stronger interpretations of controllability for nonlinear systems. One is short time local controllability (STLC). The definition of this is that the set of accessible points from $x^0$ in any small $t > 0$ with state trajectories restricted to an arbitrarily small neighborhood of $x^0$ should contain $x^0$ in its interior. Hermes (1994) and others have done work on this.


Observability for Nonlinear Systems 

Two possible initial states $x^0, x^1$ for the nonlinear system are distinguishable if there exists a control $u(\cdot)$ such that the corresponding outputs $y^0(t), y^1(t)$ are not equal. They are short time distinguishable if there is a $u(\cdot)$ such that $y^0(t) \neq y^1(t)$ for all small $t > 0$. They are locally short time distinguishable if in addition the corresponding state trajectories do not leave an arbitrarily small open set containing $x^0, x^1$. The open set need not be connected. A nonlinear system is (short time, locally short time) observable if every pair of initial states is (short time, locally short time) distinguishable. Finally, a nonlinear system is (short time, locally short time) locally observable if every $x^0$ has a neighborhood such that every other point $x^1$ in the neighborhood is (short time, locally short time) distinguishable from $x^0$.

For a linear system (5), all these definitions coalesce into a single concept of observability which can be checked by the observability rank condition, which is that the rank of

$$\begin{bmatrix} H \\ HF \\ \vdots \\ HF^{n-1} \end{bmatrix} \qquad (8)$$

equals $n$.
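
As with controllability, the observability rank condition can be tested directly. The sketch below stacks the blocks of (8) and computes the rank exactly. The example matrices and helper names are assumptions made for illustration.

```python
from fractions import Fraction

def matmul(a, b):
    """Matrix product of a (p x n) and b (n x m), both lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def rank(m):
    """Rank by Gauss-Jordan elimination over exact rationals."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def observability_matrix(F, H):
    """Vertically stack H, HF, ..., H F^(n-1) as in (8)."""
    n = len(F)
    block = H
    rows = []
    for _ in range(n):
        rows.extend(block)
        block = matmul(block, F)
    return rows

# Double integrator x1' = x2, x2' = u (illustrative example).
F = [[0, 1], [0, 0]]

H_pos = [[1, 0]]     # measure position x1: observable
H_vel = [[0, 1]]     # measure velocity x2 only: position cannot be recovered

O_pos = observability_matrix(F, H_pos)
O_vel = observability_matrix(F, H_vel)
print(O_pos, rank(O_pos))
print(O_vel, rank(O_vel))
```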

The corresponding concept for nonlinear systems involves $\mathcal{E}$, the smallest codistribution containing $dh_1(x), \ldots, dh_p(x)$ that is invariant under repeated Lie differentiation by the vector fields $f(x), g^1(x), \ldots, g^m(x)$. Let

$$E(x) = \{ \omega(x) : \omega \in \mathcal{E} \}$$

The nonlinear observability rank condition is

$$\text{dimension } E(x) = n \qquad (9)$$

for all $x \in \mathcal{M}$. This condition guarantees that the nonlinear system is locally short time, locally observable. It follows from (3) that $\mathcal{E}$ is spanned by a set of exact one forms, so its dual distribution $\mathcal{E}^*$ is involutive.

For a linear system (5), the input-output mapping from $x(0) = x^i$, $i = 0, 1$ is

$$y^i(t) = He^{Ft} x^i + \int_0^t He^{F(t-s)} G u(s)\, ds$$

The difference is

$$y^1(t) - y^0(t) = He^{Ft}(x^1 - x^0)$$

So if one input $u(\cdot)$ distinguishes $x^0$ from $x^1$, then so does every input.

For nonlinear systems, this is not necessarily true. That prompted Gauthier et al. (1992) to introduce a stronger concept of observability for nonlinear systems. For simplicity, we describe it for scalar input and scalar output systems. A nonlinear system is uniformly observable for any input if there exist local coordinates so that it is of the form

$$y = x_1 + h(u)$$

$$\dot{x}_1 = x_2 + f_1(x_1, u)$$

$$\vdots$$

$$\dot{x}_{n-1} = x_n + f_{n-1}(x_1, \ldots, x_{n-1}, u)$$

$$\dot{x}_n = f_n(x, u)$$

Clearly if we know $u(\cdot), y(\cdot)$, then by repeated differentiation of $y(\cdot)$, we can reconstruct $x(\cdot)$. It has been shown that for nonlinear systems that are uniformly observable for any input, the extended Kalman filter is a locally convergent observer (Krener 2002a), and the minimum energy estimator is globally convergent (Krener 2002b).

Minimal Realizations 

The initialized nonlinear system (4) can be viewed as defining an input-output mapping from input trajectories $u(\cdot)$ to output trajectories $y(\cdot)$. Is it a minimal realization of this mapping, or does there exist an initialized nonlinear system on a smaller dimensional state space that realizes the same input-output mapping?

Kalman showed that (5) initialized at $x^0 = 0$ is minimal iff the controllability rank condition and the observability rank condition hold. He also showed how to reduce a linear system to a minimal one.

If the controllability rank condition does not hold, then the span of (6) has dimension $k < n$. This subspace contains the columns of $G$ and is invariant under multiplication by $F$. In fact, it is the minimal subspace with these properties. So the linear system can be restricted to this $k$-dimensional subspace, and it realizes the same input-output mapping from $x^0 = 0$. The restricted system satisfies the controllability rank condition.

If the observability rank condition does not hold, then the kernel of (8) is a subspace of $\mathbb{R}^{n \times 1}$ of dimension $n - l > 0$. This subspace is in the kernel of $H$ and is invariant under multiplication by $F$. In fact, it is the maximal subspace with these properties. Therefore, there is a quotient linear system on $\mathbb{R}^{n \times 1}$ modulo the kernel of (8) which has the same input-output mapping. The quotient is of dimension $l < n$, and the quotient system satisfies the observability rank condition.

By employing these two steps in either order, we pass to a minimal realization of the input-output map of (5) from $x^0 = 0$. Kalman also showed that two linear minimal realizations differ by a linear change of state coordinates.

An initialized nonlinear system is a realization of minimal dimension of its input-output mapping if the nonlinear controllability rank condition (7) and the nonlinear observability rank condition (9) hold.

If the nonlinear controllability rank condition (7) fails to hold because the dimension of $\bar{D}(x)$ is $k < n$, then by replacing the state space $\mathcal{M}$ with the $k$ dimensional leaf through $x^0$ of the foliation induced by $\bar{\mathcal{D}}$, we obtain a smaller state space on which the nonlinear controllability rank condition (7) holds. The input-output mapping is unchanged by this restriction.

Suppose the nonlinear observability rank condition (9) fails to hold because the dimension of $E(x)$ is $l < n$. Then consider a convex neighborhood $\mathcal{N}$ of $x^0$. The distribution $\mathcal{E}^*$ induces a local foliation of $\mathcal{N}$ into leaves of dimension $n - l > 0$. The nonlinear system leaves this local foliation invariant in the following sense. Suppose $x^0$ and $x^1$ are on the same leaf. If $x^i(t)$ is the trajectory starting at $x^i$, then $x^0(t)$ and $x^1(t)$ are on the same leaf as long as the trajectories remain in $\mathcal{N}$. Furthermore, $h(x)$ is constant on leaves so $y^0(t) = y^1(t)$. Hence, there exists locally a nonlinear system whose state space is the leaf space. On this leaf space, the nonlinear observability rank condition (9) holds, and the projected system has the same input-output map as the original locally around $x^0$. If the leaf space of the foliation induced by $\mathcal{E}^*$ admits the structure of a manifold, then the reduced system can be defined globally on it. Sussmann (1973) and Sussmann (1977) studied minimal realizations of analytic nonlinear systems.

The state spaces of two minimal nonlinear systems need not be diffeomorphic. Consider the system

$$\dot{x} = u, \qquad y = \sin x$$

where $x, u$ are scalars. We can take the state space to be either $\mathcal{M} = \mathbb{R}$ or $\mathcal{M} = S^1$, and we will realize the same input-output mapping. These two state spaces are certainly not diffeomorphic but one is a covering space of the other.
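
The two realizations can be compared in a small simulation. The sketch below integrates $\dot{x} = u$ by forward Euler, once on the real line and once on the circle (by reducing the state modulo $2\pi$ after each step), and compares the outputs $y = \sin x$. The input signal, initial state, step size, and function names are assumptions made for illustration.

```python
import math

def simulate(u, x0, dt, steps, wrap):
    """Forward-Euler simulation of x' = u, y = sin(x).

    If wrap is True the state is kept on the circle S^1 by reducing it
    modulo 2*pi after every step; otherwise it evolves on the real line.
    """
    x = x0
    outputs = [math.sin(x)]
    for k in range(steps):
        x += dt * u(k * dt)
        if wrap:
            x = math.fmod(x, 2.0 * math.pi)
        outputs.append(math.sin(x))
    return x, outputs

u = lambda t: 3.0 + math.cos(t)   # arbitrary test input
x_line, y_line = simulate(u, 0.5, 0.01, 2000, wrap=False)
x_circle, y_circle = simulate(u, 0.5, 0.01, 2000, wrap=True)

# Largest difference between the two output trajectories.
max_gap = max(abs(a - b) for a, b in zip(y_line, y_circle))
print(max_gap, x_line, x_circle)
```

The final states differ by a multiple of $2\pi$, yet the outputs agree up to rounding, so the input-output mapping cannot tell the two state spaces apart.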


Lie Jet and Approximations 

Consider two initialized nonlinear controlled dynamics

$$\dot{x} = f^0(x) + \sum_{j=1}^{m} f^j(x) u_j, \qquad x(0) = x^0 \qquad (10)$$

$$\dot{z} = g^0(z) + \sum_{j=1}^{m} g^j(z) u_j, \qquad z(0) = z^0 \qquad (11)$$

Suppose that (10) satisfies the nonlinear controllability rank condition (7). Further, suppose that there is a smooth mapping $\Phi(x) = z$ and constants $M > 0$, $\epsilon > 0$ such that for any $\|u(t)\| < 1$, the corresponding trajectories $x(t)$ and $z(t)$ satisfy

$$\|\Phi(x(t)) - z(t)\| < M t^{k+1} \qquad (12)$$

for $0 \le t < \epsilon$.

Then it is not hard to show that the linear map $L = \frac{\partial \Phi}{\partial x}(x^0)$ takes brackets up to order $k$ of the vector fields $f^j$ evaluated at $x^0$ into the corresponding brackets of the vector fields $g^j$ evaluated at $z^0$,

$$L\,[f^{j_l}, [\ldots [f^{j_2}, f^{j_1}] \ldots ]](x^0) = [g^{j_l}, [\ldots [g^{j_2}, g^{j_1}] \ldots ]](z^0) \qquad (13)$$

for $1 \le l \le k$.

On the other hand, if there is a linear map $L$ such that (13) holds for $1 \le l \le k$, then there exists a smooth mapping $\Phi(x) = z$ and constants $M > 0$, $\epsilon > 0$ such that for any $\|u(t)\| < 1$, the corresponding trajectories $x(t)$ and $z(t)$ satisfy (12).

The $k$-Lie jet of (10) at $x^0$ is the tree of brackets $[f^{j_l}, [\ldots [f^{j_2}, f^{j_1}] \ldots ]](x^0)$ for $1 \le l \le k$. In some sense these are the coordinate-free Taylor series coefficients of (10) at $x^0$.

The dynamics (10) is free-nilpotent of degree $k$ if all these brackets are as linearly independent as possible consistent with skew symmetry and the Jacobi identity and all higher degree brackets are zero. If it is free to degree $k$, the controlled dynamics (10) can be used to approximate any other controlled dynamics (11) to degree $k$. Because it is nilpotent, integrating (10) reduces to repeated quadratures. If all brackets with two or more $f^i$, $1 \le i \le m$ are zero at $x^0$, then (10) is linear in appropriate coordinates. If all brackets with three or more $f^i$, $1 \le i \le m$ are zero at $x^0$, then (10) is quadratic in appropriate coordinates. The references Krener and Schaettler (1988) and Krener (2010a,b) discuss the structure of the reachable sets for such systems.

Disturbance Decoupling


Consider a control system affected by a disturbance input $w(t)$

$$\dot{x} = f(x) + g(x)u + b(x)w$$

$$y = h(x)$$

The disturbance decoupling problem is to find a feedback $u = k(x)$, so that in the closed-loop system, the output $y(t)$ is not affected by the disturbance $w(t)$. Wonham and Morse (1970) solved this problem for a linear system

$$\dot{x} = Fx + Gu + Bw$$

$$y = Hx \qquad (15)$$

To do so, they introduced the concept of an $F, G$ invariant subspace. A subspace $\mathcal{V} \subset \mathbb{R}^n$ is $F, G$ invariant if

$$F\mathcal{V} \subset \mathcal{V} + \mathcal{G} \qquad (16)$$

where $\mathcal{G}$ is the span of the columns of $G$. It is easy to see that $\mathcal{V}$ is $F, G$ invariant iff there exists a $K \in \mathbb{R}^{m \times n}$ such that

$$(F + GK)\mathcal{V} \subset \mathcal{V} \qquad (17)$$

The feedback gain $K$ is called a friend of $\mathcal{V}$.

It is easy to see from (16) that if $\mathcal{V}^i$ is $F, G$ invariant for $i = 1, 2$, then $\mathcal{V}^1 + \mathcal{V}^2$ is also. So there exists a maximal $F, G$ invariant subspace $\mathcal{V}^{\max}$ in the kernel of $H$. Wonham and Morse showed that the linear disturbance decoupling problem is solvable iff $\mathcal{B} \subset \mathcal{V}^{\max}$ where $\mathcal{B}$ is the span of the columns of $B$.

Isidori et al. (1981a) and independently Hirschorn (1981) solved the nonlinear disturbance decoupling problem. A distribution $\mathcal{D}$ is locally $f, g$ invariant if

$$[f, \mathcal{D}] \subset \mathcal{D} + \Gamma$$

$$[g^j, \mathcal{D}] \subset \mathcal{D} + \Gamma \qquad (18)$$

for $j = 1, \ldots, m$ where $\Gamma$ is the distribution spanned by the columns of $g$. In Isidori et al. (1981b) it is shown that if $\mathcal{D}$ is locally $f, g$ invariant, then so is its involutive closure.

A distribution $\mathcal{D}$ is $f, g$ invariant if there exists $\alpha(x) \in \mathbb{R}^{m \times 1}$ and invertible $\beta(x) \in \mathbb{R}^{m \times m}$ such that

$$[f + g\alpha, \mathcal{D}] \subset \mathcal{D}$$

$$\left[ \sum_j g^j \beta_{jk}, \mathcal{D} \right] \subset \mathcal{D} \qquad (19)$$

for $k = 1, \ldots, m$. It is not hard to see that an $f, g$ invariant distribution is locally $f, g$ invariant. It is shown in Isidori et al. (1981b) that if $\mathcal{D}$ is a locally $f, g$ invariant distribution, then locally there exist $\alpha(x)$ and $\beta(x)$ so that (19) holds. Furthermore, if the state space is simply connected, then $\alpha(x)$ and $\beta(x)$ exist globally, but the matrix field $\beta(x)$ may fail to be invertible at some $x$.

From (18), it is clear that if $\mathcal{D}^i$ is locally $f, g$ invariant for $i = 1, 2$, then so is $\mathcal{D}^1 + \mathcal{D}^2$. Hence, there exists a maximal locally $f, g$ invariant distribution $\mathcal{D}^{\max}$ in the kernel of $dh$. Moreover, this distribution is involutive. The disturbance decoupling problem is locally solvable iff the columns of $b(x)$ are contained in $\mathcal{D}^{\max}$. If $\mathcal{M}$ is simply connected, then the disturbance decoupling problem is globally solvable iff the columns of $b(x)$ are contained in $\mathcal{D}^{\max}$.

Conclusion

We have briefly described the role that differential geometric concepts played in the development of controllability, observability, minimality, approximation, and decoupling of nonlinear systems.

Cross-References

► Feedback Linearization of Nonlinear Systems

► Lie Algebraic Methods in Nonlinear Control

► Nonlinear Zero Dynamics


Abstract

Disaster response robots are robotic systems used to prevent disaster damage from worsening in emergency situations. Robots for natural disasters (water disasters, volcanic eruptions, earthquakes, landslides, and fires) and man-made disasters (explosive ordnance disposal, CBRNE disasters, the Fukushima Daiichi nuclear power plant accident) are introduced. Technical challenges are described on the basis of a generalized data flow.

Keywords 

Rescue robot; Response robot 

Introduction 

Disaster response robots are robotic systems used to prevent disaster damage from worsening in emergency situations, for example in search and rescue and in recovery construction.

A disaster changes its state as time passes. The state starts as an unforeseen occurrence and proceeds to the prevention phase, emergency response phase, recovery phase, and revival phase. Although a disaster response robot usually means a system for disaster response and recovery in a narrow sense, a system used in any phase of a disaster can be called a disaster response robot in a broad sense.

When parties of firefighters and military personnel respond to disasters, robots are among the technical equipment used. The purposes of robots are (1) to perform tasks that are impossible or difficult for humans and conventional equipment, (2) to reduce responders' risk of secondary damage, and (3) to improve the rapidity and efficiency of tasks, by using remote or automatic robot equipment.

Response Robots for Natural Disasters

Water Disaster 

Underwater robots (ROVs, remotely operated vehicles) are deployed to responder organizations in preparation for water damage such as that caused by tsunamis, floods, cataracts, and accidents at sea and in rivers. They are equipped with cameras and sonars and are remotely controlled by crews via a tether from land or shipboard, within an area of several tens of meters, for victim search and damage investigation. After the Great East Japan Earthquake in 2011, the Self Defense Force and volunteers of the International Rescue System Institute (IRS) and the Center for Robot-Assisted Search and Rescue (CRASAR) used various types of ROVs such as SARbot shown in Fig. 1 for victim search and debris investigation in the port.

Volcano Eruption 

In order to reduce risk in monitoring and recovery construction at volcanic eruptions, application of robotics and remote systems is highly desired. Various types of UAVs (unmanned aerial vehicles) such as small-sized robot helicopters and airplanes have been used for this purpose.

An unmanned construction system consists of teleoperated robot backhoes, trucks, and bulldozers with wireless relaying cars and camera vehicles as shown in Fig. 2 and is remotely controlled from an operator vehicle. It has been used since the 1990s for remote civil engineering works from a distance of a few kilometers.

Structural Collapse by Earthquakes, 
Landslides, etc. 

Small-sized UGVs (unmanned ground vehicles) were developed for victim search and monitoring in confined spaces of collapsed buildings and underground structures. VGTV X-treme shown in Fig. 3 is a tracked vehicle remotely operated via a tether. It was used for victim search at mine accidents and the 9/11 terror attack. The Active Scope Camera shown in Fig. 4 is a serpentine robot resembling a fiberscope and has been used for forensic investigation of structural collapse accidents.

Fire 

Large-scale fires in chemical plants and forests are sometimes so hazardous that firefighters cannot approach them. Remote-controlled robots with firefighting nozzles for water and chemical extinguishing agents are deployed.



Disaster Response Robot, Fig. 1 SARbot (Courtesy of SeaBotix Inc.) http://www.seabotix.com/products/sarbot.htm

Disaster Response Robot, Fig. 2 Unmanned construction system (Courtesy of Society for Unmanned Construction Systems) http://www.kenmukyou.gr.jp/f_souti.htm



Disaster Response Robot, Fig. 3 VGTV X-treme, shown in lowered position, in raised position, and among many other configurations (Courtesy of Recce Robotics) http://www.recce-robotics.com/vgtv.html


Disaster Response Robot, Fig. 4 Active scope camera (Courtesy of International Rescue System Institute)

Large-sized robots can discharge large volumes of fluid with water cannons, whereas small-sized robots have better mobility.

Response Robots for Man-Made Disasters

Explosive Ordnance Disposal (EOD) 

Detection and disposal of explosive ordnance is one of the most dangerous tasks. TALON, PackBot, and Telemax are widely used by military and explosive ordnance disposal teams worldwide. Telemax has an arm with seven degrees of freedom on a tracked vehicle with four sub-tracks as shown in Fig. 5. It can observe narrow spaces such as the overhead lockers of airplanes and the underside of automobiles with cameras, manipulate objects with the arm, and deactivate explosives with a disrupter.

CBRNE Disasters 

CBRNE (chemical, biological, radiological, nuclear, and explosive) disasters carry a high risk and can cause large-scale damage because humans cannot detect contamination by hazardous materials (Hazmat). Robotic systems are therefore expected to play a large role in this type of disaster. PackBot has optional sensors for toxic industrial chemicals (TIC), blood agents, blister agents, volatile organic compounds (VOCs), radiation, etc., and can measure the Hazmat in dangerous confined spaces (Fig. 6). Quince was developed for research into technical issues of UGVs at CBRNE disasters and has high mobility on rough terrain (Fig. 7).


Fukushima Daiichi Nuclear Power Plant Accident

At the Fukushima Daiichi nuclear power plant accident caused by the tsunami in 2011, various disaster response robots were applied. They contributed to the cold shutdown and decommissioning of the plant. For example, PackBot and Quince provided essential data for task planning by capturing images and measuring radiation in the nuclear reactor buildings. Unmanned construction systems removed outdoor debris that was contaminated by radiological materials and significantly reduced the radiation rate there.

Group INTRA in France and KHG in Germany are organizations for responding to nuclear plant accidents. They are equipped with robots and remote-controlled construction machines for radiation measurement, decontamination, and construction in emergencies. In Japan, the Assist Center for Nuclear Emergencies was established after the Fukushima accident.



Disaster Response Robot, Fig. 5 Telemax (Courtesy of Cobham Mission Equipment) http://www.cobham.com/about-cobham/mission-systems/about-us/mission-equipment/unmanned-systems/products-and-services/remote-controlled-robotic-solutions/telemax-explosive-ordnance-(eod)-robot.aspx

Disaster Response Robot, Fig. 6 PackBot (Courtesy of iRobot) http://www.irobot.com/us/learn/defense/packbot/Specifications.aspx



Disaster Response Robot, Fig. 7 Quince (Courtesy of International Rescue System Institute) 


Summary and Future Directions 

Data flow of disaster response robots is generally described by a feedback system as shown in Fig. 8. Robots change the states of objects and the environment by movement and task execution. Sensors measure and recognize them, and their feedback enables the robots' autonomous motion and work. The sensed data are shown to operators via communication, data processing, and the human interface. The operators give commands of motion and work to the system via the human interface. The system recognizes and transmits them to the robot.

Disaster Response Robot, Fig. 8 Data flow of remotely controlled disaster response robots (figure label: Extreme Disaster Conditions)


Each functional block has its own technical challenges that must be met in the extreme environments of disaster spaces, depending on the objectives and the conditions. They must be solved in order to improve robot performance. They include insufficient mobility in disaster spaces (steps, gaps, slippage, narrow space, obstacles, etc.), deficient workability (dexterity, accuracy, speed, force, work space, etc.), poor sensors and sensor data processing (image, recognition, etc.), lack of reliability and performance of autonomy (robot intelligence, multiagent collaboration, etc.), issues of wireless and wired communication (instability, delay, capacity, tether handling, etc.), operators' limitations (situation awareness, decision ability, fatigue, mistakes, etc.), basic performance (explosion proofing, weight, durability, portability, etc.), and system integration that combines the components into a solution. Mission-critical planning and execution, including human factors, training, role sharing, logistics, etc., have to be considered at the same time.

Research into systems and control is expected to solve the abovementioned challenges of components and systems. For example, intelligent control is essential for mobility and workability under extreme conditions; control of feedback systems including long delay and dynamic instability, control of human-in-the-loop systems, and system integration of heterogeneous systems are important research topics of systems and control.

In the research field of disaster robotics, various competitions of practical robots have been held, e.g., RoboCupRescue targeting CBRNE disasters, ELROB and euRathlon for field activities, MAGIC for multi-robot autonomy, and the DARPA Robotics Challenge for humanoid robots in nuclear disasters. These competitions seek to stimulate solutions to the abovementioned technical issues in different environments by providing practical test beds for advanced technology developments.


Cross-References 

► Robot Teleoperation 

► Walking Robots 

► Wheeled Robots 
Abstract 

The causes of the complex behavior typical of hybrid systems are multifarious and are commonly explained in the literature using paradigms that are mainly focused on the connections between time-driven and hybrid systems. In this entry, we recall some of these paradigms and further explore the connections between discrete event and hybrid systems from other perspectives. In particular, the role of abstraction in passing from a hybrid model to a discrete event one and vice versa is discussed. 


Keywords 

Hybrid system; Logical discrete event system; 
Timed discrete event system 


Introduction 

Hybrid systems combine the dynamics of both time-driven systems and discrete event systems. 

The evolution of a time-driven system can be described by a differential equation (in continuous time) or by a difference equation (in discrete time). An example of such a system is the tank shown in Fig. 1 whose behavior, assuming the tank is not full, is ruled in continuous time $t$ by the differential equation 

\[ \frac{d}{dt} V(t) = q_1(t) - q_2(t) \]

where $V$ is the volume of liquid and $q_1$ and $q_2$ are, respectively, the input and output flow. 


A discrete event system (Lafortune and Cassandras 2007; Seatzu et al. 2012) evolves in accordance with the abrupt occurrence, at possibly unknown irregular intervals, of physical events. Its states may have logical or symbolic, rather than numerical, values that change in response to events which may also be described in non-numerical terms. An example of such a system is a robot that loads parts on a conveyor, whose behavior is described by the automaton in Fig. 2. The robot can be “idle,” “loading” a part, or in an “error” state when a part is incorrectly positioned. The events that drive its evolution are a (grasp a part), b (part correctly loaded), c (part incorrectly positioned), and d (part repositioned). In a logical discrete event system (► Supervisory Control of Discrete-Event Systems), the timing of event occurrences is ignored, while in a timed discrete event system (► Models for Discrete Event Systems: An Overview), it is described by means of a suitable timing structure. 
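
The logical automaton of the loading robot can be sketched in a few lines of Python. The state and event names come from the description above; the transition targets (in particular that event d returns the robot to "loading") are an assumption, since Fig. 2 is not reproduced here, and the function name `run` is purely illustrative.

```python
# Logical discrete event model of the loading robot.
# Transition targets are assumed from the textual description of the events.
TRANSITIONS = {
    ("idle", "a"): "loading",   # a: grasp a part
    ("loading", "b"): "idle",   # b: part correctly loaded
    ("loading", "c"): "error",  # c: part incorrectly positioned
    ("error", "d"): "loading",  # d: part repositioned (assumed to resume loading)
}


def run(events, state="idle"):
    """Return the sequence of states visited while processing `events`.

    Raises ValueError if an event is not enabled in the current state.
    """
    trace = [state]
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not enabled in state {state!r}")
        state = TRANSITIONS[key]
        trace.append(state)
    return trace
```

Only the order of events matters in this model: no timing information is stored, which is what makes it a logical rather than a timed discrete event model.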

In a hybrid system (► Hybrid Dynamical Systems, Feedback Control of), time-driven and event-driven evolutions are simultaneously present and mutually dependent. As an example, consider a room where a thermostat maintains the temperature $x(t)$ between $x_a = 20$ °C and $x_b = 22$ °C by turning a heat pump on and off. Due to the exchange with the external environment at temperature $x_e$, when the pump is off, the room temperature derivative is 

\[ \frac{d}{dt} x(t) = -k\,[x(t) - x_e] \]

where $k$ is a suitable coefficient, while when the pump is on, the room temperature derivative is 

\[ \frac{d}{dt} x(t) = h(t) - k\,[x(t) - x_e] \]

where the positive term $h(t)$ is due to the heat pump. The hybrid automaton that describes this system is shown in Fig. 3. 
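
A minimal simulation sketch of the thermostat may help to see how the two kinds of evolution interact. It assumes a constant heating term h, forward-Euler integration, an initial OFF mode, and illustrative values for k, x_e, and the step size; none of these come from the entry, only the two thresholds do.

```python
def simulate_thermostat(x0=21.0, x_e=10.0, k=0.1, h=2.0,
                        x_a=20.0, x_b=22.0, dt=0.01, t_end=100.0):
    """Forward-Euler simulation of the thermostat hybrid automaton.

    Discrete state: heat pump "ON" or "OFF". Continuous state: temperature x.
    x_e, k, h (taken constant), dt and the initial mode are assumptions.
    Returns a list of (time, mode, temperature) samples.
    """
    mode, x, t = "OFF", x0, 0.0
    history = [(t, mode, x)]
    while t < t_end:
        # Time-driven evolution inside the current discrete state.
        dx = -k * (x - x_e) + (h if mode == "ON" else 0.0)
        x += dt * dx
        t += dt
        # Event-driven evolution: a threshold crossing switches the mode.
        if mode == "OFF" and x <= x_a:
            mode = "ON"
        elif mode == "ON" and x >= x_b:
            mode = "OFF"
        history.append((t, mode, x))
    return history
```

With these values the temperature oscillates between the two thresholds, overshooting each by at most one integration step.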

The causes of the complex behavior typical 
of hybrid systems are multifarious, and among 
the paradigms commonly used in the literature to 
describe them, we mention three. 
Fig. 1 A tank 

Fig. 2 A machine with failures 

Fig. 3 Hybrid automaton of the thermostat 
• Logically controlled systems. Often, a physical system with a time-driven evolution is controlled in a feedback loop by means of a controller that implements discrete computations and event-based logic. This is the case of the thermostat mentioned above. Classes of systems that can be described by this paradigm are embedded systems or, when the feedback loop is closed through a communication network, cyber-physical systems. 

• State-dependent mode of operation. A time-driven system can have different modes of evolution depending on its current state. As an example, consider a bouncing ball. While the ball is above the ground (vertical position h > 0), its behavior is that of a falling body subject to a constant gravitational force. However, when the ball collides with the ground (vertical position h = 0), its behavior is that of a (partially) elastic body that bounces up. Classes of systems that can be described by this paradigm are piecewise affine systems and linear complementarity systems. 

• Variable structure systems. Some systems may change their structure, assuming different configurations, each characterized by a different behavior. As an example, consider a multicell voltage converter composed of a cascade of elementary commutation cells: by controlling some switches, it is possible to insert or remove cells so as to produce a desired output voltage signal. Classes of systems that can be described by this paradigm are switched systems. 

While these are certainly appropriate and meaningful paradigms, they are mainly focused on the connections between time-driven and hybrid systems. In the rest of this entry, we will discuss the connections between discrete event and hybrid systems from other perspectives. The focus is strictly on modeling; thus, approaches for analysis or control will not be discussed. 

From Hybrid Systems to Discrete 
Event System by Modeling 
Abstraction 

A system is a physical object, while a model is a (more or less accurate) mathematical description of its behavior that captures those features that are deemed most significant. In the previous pages, we have introduced different classes of systems, such as “time-driven systems,” “discrete event systems,” and “hybrid systems,” but properly speaking, this taxonomy pertains to the models because the terms “time driven,” “discrete event,” or “hybrid” should be used to classify the mathematical description and not the physical object. 

According to this view, a discrete event model 
is often perceived as a high-level description of a 
physical system where the time-driven dynamics 
are ignored or, at best, approximated by a timing 
structure. This procedure to derive a simpler 
model in a way that preserves the properties being 
analyzed while hiding the details that are of no 
interest is called abstraction (Alur et al. 2000). 

Consider, as an example, the thermostat in Fig. 3. In such a system, the time-driven evolution determines a change in the temperature, which in turn, upon reaching a threshold, triggers the occurrence of an event that changes the discrete state. Assume one does not care about the exact form this triggering mechanism takes and is only interested in determining if the heat pump is turned on or off. In such a case, we can completely abstract the time-driven evolution, obtaining a logical discrete event model such as the automaton in Fig. 4, where label a denotes the event “the temperature drops below 20 °C” and label b denotes the event “the temperature rises above 22 °C.” 

Fig. 4 Logical discrete event model of the thermostat 

For some purposes, e.g., to determine the utilization rate of the heat pump and thus its operating cost, the model in Fig. 4 is inadequate. In such a case, one can consider a less coarse abstraction of the hybrid model in Fig. 3, obtaining a timed discrete event model such as the automaton in Fig. 5. Here, a firing delay is associated with each event: as an example, $\delta_a$ represents the time it takes, when the pump is off, to cool down until the lower temperature threshold is reached and event a occurs. The delay may be a deterministic value or even a random one to take into account the uncertainty due to non-modeled time-varying parameters such as the temperature of the external environment. Note that a new state (START) and a new event b' have now been introduced to capture the transient phase in which the room temperature, from the initial value x(0) = 15 °C, reaches the higher temperature threshold: in fact, event b' has a delay greater than the delay of event b. 

Timed Discrete Event Systems 
Are Hybrid Systems 

Properly speaking, all timed discrete event systems may also be seen as hybrid systems if one considers the dynamics of the timers that specify the event occurrence as elementary time-driven evolutions. In fact, the simplest model of hybrid systems is the timed automaton introduced by Alur and Dill (1994), whose main feature is the fact that each continuous variable $x(t)$ has a constant derivative $\dot{x}(t) = 1$ and thus can only describe the passage of time. Incidentally, we note that the term “timed automaton” is also used in the area of discrete event systems (Lafortune and Cassandras 2007) to denote an automaton in which a timing structure is associated with the events: such an example was shown in Fig. 5. To avoid any confusion, in the following, we call the former model an Alur-Dill automaton and the latter model a timed DES automaton. 
Figure 6 shows an Alur-Dill automaton that describes the thermostat, where the time-driven dynamics have been abstracted and only the timing of event occurrence is modeled as in the timed DES automaton in Fig. 5. The only continuous variable is the value of a timer $\delta$: when it goes beyond a certain threshold (e.g., $\delta > \delta_a$), an event occurs (e.g., event a), changing the discrete state (e.g., from OFF to ON) and resetting the timer to zero. 
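
As an illustration of how such a timer-based model answers the utilization question raised earlier, the sketch below steps through the automaton and accumulates the time the pump is on. The transition structure (START to OFF on b', OFF to ON on a, ON to OFF on b), the assumption that the pump is on in START, and the numeric delays in the usage note are hypothetical; Figs. 5 and 6 are not reproduced here.

```python
def pump_on_time(delays, t_end):
    """Total time the heat pump is on in [0, t_end] for the timed thermostat model.

    `delays` maps each event name ("a", "b", "b'") to its deterministic firing
    delay. The pump is assumed to be on in states START and ON.
    """
    # state -> (event that leaves the state, next state); assumed structure
    transitions = {"START": ("b'", "OFF"), "OFF": ("a", "ON"), "ON": ("b", "OFF")}
    state, t, on_time = "START", 0.0, 0.0
    while t < t_end:
        event, next_state = transitions[state]
        # The timer grows at unit rate from zero until it reaches the delay
        # of the enabled event (or the observation horizon ends).
        dwell = min(delays[event], t_end - t)
        if state in ("START", "ON"):
            on_time += dwell
        t += dwell
        state = next_state  # the timer is reset to zero on each transition
    return on_time
```

For example, with delays of 5, 1, and 2 time units for b', a, and b, `pump_on_time({"b'": 5.0, "a": 1.0, "b": 2.0}, 35.0)` gives 25.0, i.e., the transient plus ten heating phases.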

It is rather obvious that the behavior of the Alur-Dill automaton in Fig. 6 is equivalent to the behavior of the timed DES automaton in Fig. 5. In the former model, the notion of time is encoded by means of an explicit continuous variable $\delta$. In the latter model, the notion of time is implicitly encoded by the timer that during an evolution will be associated with each event. In both cases, however, the overall state of the system is described by a pair $(l(t), x(t))$ where the first element $l$ takes value in a discrete set {START, ON, OFF} and the second element is a vector (in this particular case with a single component) of timer valuations. 

It should be pointed out that an Alur-Dill automaton may have a more complex structure than that shown in Fig. 6: as an example, the guard associated with a transition, i.e., the values of the timer that enable it, can be an arbitrary rectangular set. However, the same is also true for a timed discrete event system: several policies can be used to define the time intervals enabling an event (enabling policy) or to specify when a timer is reset (memory policy) (Ajmone Marsan et al. 1995). Furthermore, timed discrete event systems can have arbitrary stochastic timing structures (e.g., semi-Markovian processes, Markov chains, and queuing networks (Lafortune and Cassandras 2007)), not to mention the possibility of having an infinite discrete state space (e.g., timed Petri nets (Ajmone Marsan et al. 1995; David and Alla 2004)). As a result, we can say that timed DES automata are far more general than Alur-Dill automata and represent a meaningful subclass of hybrid systems. 

Fig. 5 Timed discrete event model of the thermostat 


From Discrete Event System to Hybrid 
Systems by Fluidization 

The computational complexity involved in the analysis and optimization of real-scale problems often becomes intractable with discrete event models due to the very large number of reachable states, and a technique that has been shown to be effective in reducing this complexity is called fluidization (► Applications of Discrete-Event Systems). It should be noted that the derivation of a fluid (i.e., hybrid) model from a discrete event one is yet another example of abstraction, albeit going in the opposite direction with respect to the examples discussed in the section “From Hybrid Systems to Discrete Event System by Modeling Abstraction” above. 

The main motivation for the fluidization approach derives from the observation that some discrete event systems are “heavily populated” in the sense that there are many identical items in some component (e.g., clients in a queue). Fluidization consists in replacing the integer counter of the number of items by a real number and in approximating the “fast” discrete event dynamics that describe how the counter changes by a continuous dynamics. This approach has been successfully used to study the performance optimization of fluid-queuing networks (Cassandras and Lygeros 2006) or Petri net models (Balduzzi et al. 2000; David and Alla 2004; Silva and Recalde 2004) with applications in domains such as manufacturing systems and communication networks. We also remark that in general, different fluid approximations are necessary to describe the same system, depending on its discrete state, e.g., in the manufacturing domain, machines working or down, buffers full or empty, and so on. Thus, the resulting model can be better described as a hybrid model, where different time-driven dynamics are associated with different discrete states. 

There are many advantages in using fluid approximations. First, there is the possibility of considerable increase in computational efficiency because the simulation of a fluid model can often be performed much faster than that of its discrete event counterpart. Second, fluid approximations provide an aggregate formulation to deal with complex systems, thus reducing the dimension of the state space. Third, the resulting simple structures often allow explicit computation of performance measures. Finally, some design parameters in fluid models are continuous; hence, it is possible to use gradient information to speed up optimization and to perform sensitivity analysis (Balduzzi et al. 2000): in many cases, it has also been shown that fluid approximations do not introduce significant errors when carrying out performance analysis via simulation (► Perturbation Analysis of Discrete Event Systems). 

Cross-References 

► Applications of Discrete-Event Systems 

► Hybrid Dynamical Systems, Feedback Control 
of 

► Models for Discrete Event Systems: An 
Overview 

► Perturbation Analysis of Discrete Event Systems 

► Supervisory Control of Discrete-Event Systems 
Abstract 

Discrete optimal control is a branch of mathematics which studies optimization procedures for controlled discrete-time models, that is, the optimization of a performance index associated with a discrete-time control system. This entry gives an introduction to the topic. The formulation of a general discrete optimal control problem is described, and applications to mechanical systems are discussed. 
Definition 

Discrete optimal control is a branch of mathematics which studies optimization procedures for controlled discrete-time models, that is, the optimization of a performance index associated with a discrete-time control system. 

Motivation 

Optimal control theory is a mathematical discipline with innumerable applications in both science and engineering. Discrete optimal control is concerned with control optimization for discrete-time models. Recently, in discrete optimal control theory, great interest has arisen in developing numerical methods to optimally control real mechanical systems, for instance, autonomous robotic systems operating in natural environments, such as robotic arms, spacecraft, or underwater vehicles. 

In recent years, a huge effort has been made to understand the fundamental geometric structures appearing in dynamical systems, including control systems and optimal control systems. This new geometric understanding of those systems has made possible the construction of suitable numerical techniques for integration. A collection of ad hoc numerical methods is available for both dynamical and control systems. These methods have grown in accordance with the research needs of different fields such as physics and engineering. However, a new breed of ideas in numerical analysis has emerged recently. They incorporate the geometry of the systems into the analysis, which allows faster and more accurate algorithms with fewer spurious effects than the traditional ones. All this has given birth to a new field called Geometric Integration (Hairer et al. 2002). For instance, numerical integrators for Hamiltonian systems should preserve the symplectic structure underlying the geometry of the system. If so, they are called symplectic integrators. 

Another approach used by more and more authors is based on the theory of discrete mechanics and variational integrators to obtain geometric integrators preserving some of the geometry of the original system (Hussein et al. 2006; Marsden and West 2001; Wendlandt and Marsden 1997a,b) (see also the section “Discrete Mechanics”). These geometric integrators are easily adapted and applied to a wide range of mechanical systems: forced or dissipative systems, holonomically constrained systems, explicitly time-dependent systems, reduced systems with frictional contact, nonholonomic dynamics, and multisymplectic field theories, among others. 

In optimal control theory, it is necessary to distinguish two kinds of numerical methods: the so-called direct and indirect methods. If we use direct methods, we first discretize the state and control variables, control equations, and cost functional, and then we solve a nonlinear optimization problem with constraints given by the discrete control equations, additional constraints, and boundary conditions (Bock and Plitt 1984; Bonnans and Laurent-Varin 2006; Hager 2001; Pytlak 1999). In this case, we typically need to solve a system of the type (see the section “Formulation of a General Discrete Optimal Control Problem”) 

\[ \begin{cases} \text{minimize } F(X), \quad X = (q_0, \ldots, q_N, u_1, \ldots, u_N) \\ \text{with } \Psi(X) = 0, \\ \phantom{\text{with }} \Phi(X) \geq 0. \end{cases} \]

On the other hand, indirect methods consist of numerically solving the boundary value problem obtained from the equations after applying Pontryagin’s Maximum Principle. 

The combination of direct methods and discrete mechanics makes it possible to obtain numerical control algorithms which preserve the geometric structure and exhibit good long-time behavior (Bloch et al. 2013; Jimenez et al. 2013; Junge and Ober-Blobaum 2005; Junge et al. 2006; Kobilarov 2008; Leyendecker et al. 2007; Ober-Blobaum 2008; Ober-Blobaum et al. 2011). Furthermore, it is possible to adapt many of the techniques used for continuous control mechanical systems to the design of quantitatively and qualitatively accurate numerical methods for optimal control (reduction by symmetries, preservation of geometric structures, Lie group methods, etc.). 

Formulation of a General Discrete 
Optimal Control Problem 

Let $M$ be an $n$-dimensional manifold, $x$ denote the state variables in $M$ for an agent’s environment, and $u \in U \subset \mathbb{R}^m$ be the control or action that the agent chooses to accomplish a task or objective. Let $f_d(x, u) \in M$ be the resulting state after applying the control $u$ to the state $x$. For instance, $x$ may be the configuration of a vehicle at time $t$ and $u$ its fuel consumption, and then $f_d(x, u)$ is the new configuration of the vehicle at time $t + h$, with $t, h > 0$. Of course, we want to minimize the fuel consumption. Hence, the optimal control problem consists of finding the cheapest way to move the system from a given initial position to a final state. The problem can be mathematically described as follows: find a sequence of controls $(u_0, u_1, \ldots, u_{N-1})$ and a sequence of states $(x_0, x_1, \ldots, x_N)$ such that 

\[ x_{k+1} = f_d(k, x_k, u_k), \tag{1} \]

where $x_k \in M$, $u_k \in U$, and the total cost 

\[ \mathcal{C}_d = \sum_{k=0}^{N-1} C_d(k, x_k, u_k) + \Phi_d(N, x_N) \tag{2} \]

is minimized, where $\Phi_d$ is a function of the final time and state at the final time (the terminal payoff) and $C_d$ is a function depending on the discrete time, the state, and the control at each intermediate discrete time $k$ (the running payoff). 

To solve the discrete optimal control problem determined by Eqs. (1) and (2), it is possible to use the classical Lagrangian multiplier approach. In this case, we consider the control equations (1) as constraint equations, associating a Lagrange multiplier to each constraint. Assume for simplicity that $M = \mathbb{R}^n$. Then, we construct the augmented cost function 

\[ \widetilde{\mathcal{C}}_d = \sum_{k=0}^{N-1} \Big[ p_{k+1} \big( f_d(k, x_k, u_k) - x_{k+1} \big) - C_d(k, x_k, u_k) \Big] - \Phi_d(N, x_N) \tag{3} \]

where $p_k \in \mathbb{R}^n$, $k = 1, \ldots, N$, are considered as the Lagrange multipliers. The notation $xy$ is used for the scalar (inner) product $x \cdot y$ of two vectors in $\mathbb{R}^n$. 

From the pseudo-Hamiltonian function 

\[ H_d(k, x_k, p_{k+1}, u_k) = p_{k+1} f_d(k, x_k, u_k) - C_d(k, x_k, u_k), \]

we deduce the necessary conditions for a constrained minimum: 

\[ x_{k+1} = \frac{\partial H_d}{\partial p}(k, x_k, p_{k+1}, u_k) = f_d(k, x_k, u_k) \tag{4} \]

\[ p_k = \frac{\partial H_d}{\partial q}(k, x_k, p_{k+1}, u_k) = p_{k+1} \frac{\partial f_d}{\partial q}(k, x_k, u_k) - \frac{\partial C_d}{\partial q}(k, x_k, u_k) \tag{5} \]

\[ 0 = \frac{\partial H_d}{\partial u}(k, x_k, p_{k+1}, u_k) = p_{k+1} \frac{\partial f_d}{\partial u}(k, x_k, u_k) - \frac{\partial C_d}{\partial u}(k, x_k, u_k) \tag{6} \]


where $0 \leq k \leq N - 1$. Moreover, we have some boundary conditions 

\[ x_0 \text{ is given and } p_N = -\frac{\partial \Phi_d}{\partial q}(N, x_N). \tag{7} \]


The variable p k is called the costate of the system 
and Eq. (5) is called the adjoint equation. Observe 
that the recursion of x k given by Eq. (4) develops 
forward in the discrete time, but the recursion of 
the costate variable is backward in the discrete 
time. 
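
The forward-backward structure can be checked on a scalar linear-quadratic example. The sketch below is an illustration only: the model x_{k+1} = a x_k + b u_k, the quadratic costs, and the function names are assumptions, and the controls are obtained from the standard Riccati recursion rather than from a method described in this entry. The costate is then recovered as p_k = -P_k x_k, with the sign convention that matches H_d = p f_d - C_d, and the residuals of the necessary conditions are evaluated.

```python
def solve_scalar_lq(a, b, q, r, s, x0, N):
    """Scalar LQ problem (illustrative assumption):
    dynamics x_{k+1} = a x_k + b u_k,
    running cost C_d = (q x_k^2 + r u_k^2) / 2, terminal cost Phi_d = s x_N^2 / 2.
    Returns states x[0..N], controls u[0..N-1], costates p[0..N].
    """
    # Backward Riccati recursion for the cost-to-go weights P_k.
    P = [0.0] * (N + 1)
    P[N] = s
    for k in range(N - 1, -1, -1):
        P[k] = q + a * a * P[k + 1] - (a * b * P[k + 1]) ** 2 / (r + b * b * P[k + 1])
    # Forward pass with the optimal state feedback.
    x, u = [x0], []
    for k in range(N):
        gain = a * b * P[k + 1] / (r + b * b * P[k + 1])
        u.append(-gain * x[k])
        x.append(a * x[k] + b * u[k])
    # Costates in the sign convention of H_d = p f_d - C_d.
    p = [-P[k] * x[k] for k in range(N + 1)]
    return x, u, p


def max_residual(a, b, q, r, s, x, u, p):
    """Largest violation of the necessary conditions for the scalar LQ problem."""
    N = len(u)
    res = []
    for k in range(N):
        res.append(x[k + 1] - (a * x[k] + b * u[k]))   # state equation, Eq. (4)
        res.append(p[k] - (a * p[k + 1] - q * x[k]))   # adjoint equation, Eq. (5)
        res.append(b * p[k + 1] - r * u[k])            # stationarity in u, Eq. (6)
    res.append(p[N] + s * x[N])                        # terminal condition p_N = -dPhi_d/dx
    return max(abs(v) for v in res)
```

The state is propagated forward from x_0, while the weights P_k (and hence the costates) are propagated backward from the terminal condition, mirroring the two recursions.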
In the sequel, the following regularity condition is assumed: 

\[ \det \left( \frac{\partial^2 H_d}{\partial u^a \partial u^b} \right) \neq 0, \]

where $1 \leq a, b \leq m$, and $(u^a) \in U \subset \mathbb{R}^m$. Applying the implicit function theorem, we obtain from Eq. (6) that locally $u_k = g(k, x_k, p_{k+1})$. Defining the function 

\[ \widetilde{H}_d : \mathbb{Z} \times \mathbb{R}^{2n} \longrightarrow \mathbb{R}, \qquad (k, q_k, p_{k+1}) \longmapsto H_d(k, q_k, p_{k+1}, u_k), \]

Equations (4) and (5) are rewritten as the following discrete Hamiltonian system: 

\[ x_{k+1} = \frac{\partial \widetilde{H}_d}{\partial p}(k, x_k, p_{k+1}) \tag{8} \]

\[ p_k = \frac{\partial \widetilde{H}_d}{\partial q}(k, x_k, p_{k+1}) \tag{9} \]

The expression of the solutions of the optimal control problem as a discrete Hamiltonian system (under some regularity properties) is important since it indicates that the discrete evolution preserves symplecticity. A simple proof of this fact is the following (de Leon et al. 2007). Construct the function 

\[ G_k(x_k, x_{k+1}, p_{k+1}) = \widetilde{H}_d(k, x_k, p_{k+1}) - p_{k+1} x_{k+1}, \]

with $0 \leq k \leq N - 1$. For each fixed $k$: 

\[ dG_k = \frac{\partial \widetilde{H}_d}{\partial q}(k, x_k, p_{k+1})\, dx_k + \frac{\partial \widetilde{H}_d}{\partial p}(k, x_k, p_{k+1})\, dp_{k+1} - x_{k+1}\, dp_{k+1} - p_{k+1}\, dx_{k+1}. \]

Thus, along solutions of Eqs. (8) and (9), we have that $dG_k|_{\text{solutions}} = p_k\, dx_k - p_{k+1}\, dx_{k+1}$, which implies $dx_k \wedge dp_k = dx_{k+1} \wedge dp_{k+1}$. 

In the next section, we will study the case of discrete optimal control of mechanical systems. First, we will need an introduction to discrete mechanics and variational integrators. 

Discrete Mechanics 

Let $Q$ be an $n$-dimensional differentiable manifold with local coordinates $(q^i)$, $1 \leq i \leq n$. We denote by $TQ$ its tangent bundle with induced coordinates $(q^i, \dot{q}^i)$. Let $L : TQ \to \mathbb{R}$ be a Lagrangian function; the associated Euler-Lagrange equations are given by 

\[ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}^i} \right) - \frac{\partial L}{\partial q^i} = 0, \qquad 1 \leq i \leq n. \tag{10} \]

These equations are a system of implicit second-order differential equations. Assume that the Lagrangian is regular, that is, the matrix 

\[ \left( \frac{\partial^2 L}{\partial \dot{q}^i \partial \dot{q}^j} \right) \]

is non-singular. It is well known that the origin of these equations is variational (see Marsden and West 2001). Variational integrators retain this variational character and also some of the key geometric properties of the continuous system, such as symplecticity and momentum conservation (see Hairer et al. 2002 and references therein). In the following, we summarize the main features of this type of numerical integrators (Marsden and West 2001). A discrete Lagrangian is a map $L_d : Q \times Q \to \mathbb{R}$, which may be considered as an approximation of the action integral defined by a continuous Lagrangian $L : TQ \to \mathbb{R}$: 

\[ L_d(q_0, q_1) \approx \int_0^h L(q(t), \dot{q}(t))\, dt \]

where $q(t)$ is a solution of the Euler-Lagrange equations for $L$ with $q(0) = q_0$, $q(h) = q_1$, and $h > 0$ sufficiently small. 


Remark 1 The Cartesian product $Q \times Q$ is equipped with an interesting differential structure, called a Lie groupoid, which allows the extension of variational calculus to more general settings (see Marrero et al. 2006, 2010 for more details). 

Define the action sum $S_d : Q^{N+1} \to \mathbb{R}$, corresponding to the Lagrangian $L_d$, by $S_d = \sum_{k=1}^{N} L_d(q_{k-1}, q_k)$, where $q_k \in Q$ for $0 \leq k \leq N$ and $N$ is the number of steps. The discrete variational principle states that the solutions of the discrete system determined by $L_d$ must extremize the action sum given fixed endpoints $q_0$ and $q_N$. By extremizing $S_d$ over $q_k$, $1 \leq k \leq N - 1$, we obtain the system of difference equations 

\[ D_1 L_d(q_k, q_{k+1}) + D_2 L_d(q_{k-1}, q_k) = 0, \tag{11} \]

or, in coordinates, 

\[ \frac{\partial L_d}{\partial x^i}(q_k, q_{k+1}) + \frac{\partial L_d}{\partial y^i}(q_{k-1}, q_k) = 0, \]

where $1 \leq i \leq n$, $1 \leq k \leq N - 1$, and $x, y$ denote the first $n$ and second $n$ variables of the function $L_d$, respectively. 
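
As a small illustration of how Eq. (11) defines an integrator, the sketch below takes the rectangle-rule discrete Lagrangian L_d(q_0, q_1) = h [ m ((q_1 - q_0)/h)^2 / 2 - V(q_0) ] for a one-dimensional particle. This choice of L_d and the function names are assumptions made for the example. With it, D_1 L_d(q_k, q_{k+1}) = -m (q_{k+1} - q_k)/h - h V'(q_k) and D_2 L_d(q_{k-1}, q_k) = m (q_k - q_{k-1})/h, so Eq. (11) can be solved explicitly for q_{k+1}.

```python
def del_step(q_prev, q_curr, h, grad_V, m=1.0):
    """Solve D1 L_d(q_k, q_{k+1}) + D2 L_d(q_{k-1}, q_k) = 0 for q_{k+1}.

    Assumed discrete Lagrangian (rectangle rule, one degree of freedom):
        L_d(q0, q1) = h * (0.5 * m * ((q1 - q0) / h) ** 2 - V(q0)).
    """
    return 2.0 * q_curr - q_prev - h * h * grad_V(q_curr) / m


def integrate(q0, q1, h, steps, grad_V, m=1.0):
    """Iterate the discrete flow (q_{k-1}, q_k) -> (q_k, q_{k+1})."""
    qs = [q0, q1]
    for _ in range(steps):
        qs.append(del_step(qs[-2], qs[-1], h, grad_V, m))
    return qs
```

For the harmonic potential V(q) = q^2 / 2, for instance `integrate(1.0, 0.995, 0.1, 10000, lambda q: q)`, the amplitude of the computed trajectory stays bounded over long runs, which is the qualitative behavior expected from a symplectic scheme.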

These equations are usually called the discrete Euler-Lagrange equations. Under some regularity hypotheses (the matrix $D_{12} L_d(q_k, q_{k+1})$ is regular), it is possible to define a (local) discrete flow $\Upsilon_{L_d} : Q \times Q \to Q \times Q$ by $\Upsilon_{L_d}(q_{k-1}, q_k) = (q_k, q_{k+1})$ from (11). Define the discrete Legendre transformations associated with $L_d$ as 

\[ \mathbb{F}^- L_d : Q \times Q \to T^*Q, \qquad (q_0, q_1) \longmapsto (q_0, -D_1 L_d(q_0, q_1)), \]

\[ \mathbb{F}^+ L_d : Q \times Q \to T^*Q, \qquad (q_0, q_1) \longmapsto (q_1, D_2 L_d(q_0, q_1)), \]

and the discrete Poincaré-Cartan 2-form $\omega_d = (\mathbb{F}^+ L_d)^* \omega_Q = (\mathbb{F}^- L_d)^* \omega_Q$, where $\omega_Q$ is the canonical symplectic form on $T^*Q$. The discrete algorithm determined by $\Upsilon_{L_d}$ preserves the symplectic form $\omega_d$, i.e., $\Upsilon_{L_d}^* \omega_d = \omega_d$. Moreover, if the discrete Lagrangian is invariant under the diagonal action of a Lie group $G$, then the discrete momentum map $J_d : Q \times Q \to \mathfrak{g}^*$ defined by 

\[ \langle J_d(q_k, q_{k+1}), \xi \rangle = \langle D_2 L_d(q_k, q_{k+1}), \xi_Q(q_{k+1}) \rangle \]

is preserved by the discrete flow. Therefore, these integrators are symplectic-momentum preserving. Here, $\xi_Q$ denotes the fundamental vector field determined by $\xi \in \mathfrak{g}$, where $\mathfrak{g}$ is the Lie algebra of $G$. As stated in Marsden and West (2001), discrete mechanics is inspired by discrete formulations of optimal control problems (see Cadzow 1970; Hwang and Fan 1967; Jordan and Polak 1964). 

Discrete Optimal Control 
of Mechanical Systems 

Consider a mechanical system whose configuration space is an $n$-dimensional differentiable manifold $Q$ and whose dynamics is determined by a Lagrangian $L : TQ \to \mathbb{R}$. The control forces are modeled as a mapping $f : TQ \times U \to T^*Q$, where $f(v_q, u) \in T_q^*Q$, $v_q \in T_qQ$, and $u \in U$, with $U$ the control space. Observe that this last definition also covers configuration- and velocity-dependent forces such as dissipation or friction (see Ober-Blobaum et al. 2011). 

The motion of the mechanical system is described by applying the principle of Lagrange-d’Alembert, which requires that the solutions $q(t) \in Q$ must satisfy 

\[ \delta \int_0^T L(q(t), \dot{q}(t))\, dt + \int_0^T f(q(t), \dot{q}(t), u(t))\, \delta q(t)\, dt = 0, \tag{12} \]

where $(q, \dot{q})$ are the local coordinates of $TQ$ and where we consider arbitrary variations $\delta q(t) \in T_{q(t)}Q$ with $\delta q(0) = 0$ and $\delta q(T) = 0$ (since we are prescribing fixed initial and final conditions $(q(0), \dot{q}(0))$ and $(q(T), \dot{q}(T))$). 

As we consider an optimal control problem, the forces $f$ must be chosen, if they exist, as the ones that extremize the cost functional: 

\[ \int_0^T C(q(t), \dot{q}(t), u(t))\, dt + \Phi(q(T), \dot{q}(T), u(T)), \tag{13} \]

where $C : TQ \times U \to \mathbb{R}$. 

The optimal equations of motion can now be derived using Pontryagin’s Maximum Principle. In general, it is not possible to explicitly integrate these equations. Then, it is necessary to apply a numerical method. In this work, using discrete variational techniques, we first discretize the Lagrange-d’Alembert principle and then the cost functional. We obtain a numerical method that preserves some geometric features of the original continuous system as described in the sequel. 

Discretization of the Lagrangian 
and Control Forces 

To discretize this problem, we replace the tangent space $TQ$ by the Cartesian product $Q \times Q$ and the continuous curves by sequences $q_0, q_1, \ldots, q_N$ (we are using $N$ steps, with time step $h$ fixed, in such a way that $t_k = kh$ and $Nh = T$). The discrete Lagrangian $L_d : Q \times Q \to \mathbb{R}$ is constructed as an approximation of the action integral in a single time step (see Marsden and West 2001), that is, 

\[ L_d(q_k, q_{k+1}) \approx \int_{kh}^{(k+1)h} L(q(t), \dot{q}(t))\, dt. \]

We choose the following discretization for the external forces: $f_d^{\pm} : Q \times Q \times U \to T^*Q$, where $U \subset \mathbb{R}^m$, $m \leq n$, such that 

\[ f_d^-(q_k, q_{k+1}, u_k) \in T_{q_k}^*Q, \qquad f_d^+(q_k, q_{k+1}, u_k) \in T_{q_{k+1}}^*Q. \]

$f_d^+$ and $f_d^-$ are the right and left discrete forces (see Ober-Blobaum et al. 2011). 

Discrete Lagrange-d’Alembert Principle 

Given such forces, we define the discrete Lagrange-d’Alembert principle, which seeks sequences $\{q_k\}_{k=0}^{N}$ that satisfy 

\[ \delta \sum_{k=0}^{N-1} L_d(q_k, q_{k+1}) + \sum_{k=0}^{N-1} \left( f_d^-(q_k, q_{k+1}, u_k)\, \delta q_k + f_d^+(q_k, q_{k+1}, u_k)\, \delta q_{k+1} \right) = 0, \]

for arbitrary variations $\{\delta q_k\}_{k=0}^{N}$ with $\delta q_0 = \delta q_N = 0$. After some straightforward manipulations, we arrive at the forced discrete Euler-Lagrange equations 

\[ D_2 L_d(q_{k-1}, q_k) + D_1 L_d(q_k, q_{k+1}) + f_d^+(q_{k-1}, q_k, u_{k-1}) + f_d^-(q_k, q_{k+1}, u_k) = 0, \tag{14} \]

with $k = 1, \ldots, N - 1$. 

Boundary Conditions 

For simplicity, we assume that the boundary conditions of the continuous optimal control problem are given by $q(0) = x_0$, $\dot{q}(0) = v_0$, $q(T) = x_T$, $\dot{q}(T) = v_T$. To incorporate these conditions into the discrete setting, we use both the continuous and discrete Legendre transformations. From Marsden and West (2001), given a forced system, we can define the discrete momenta 

\[ p_k = -D_1 L_d(q_k, q_{k+1}) - f_d^-(q_k, q_{k+1}, u_k), \]
\[ p_{k+1} = D_2 L_d(q_k, q_{k+1}) + f_d^+(q_k, q_{k+1}, u_k). \]

From the continuous Lagrangian, we have the momenta 

\[ p_0 = \mathbb{F}L(x_0, v_0) = \left( x_0, \frac{\partial L}{\partial v}(x_0, v_0) \right), \]
\[ p_T = \mathbb{F}L(x_T, v_T) = \left( x_T, \frac{\partial L}{\partial v}(x_T, v_T) \right). \]

Therefore, the natural choice of boundary conditions is 

\[ x_0 = q_0, \qquad x_T = q_N, \]
\[ \mathbb{F}L(x_0, v_0) = -D_1 L_d(q_0, q_1) - f_d^-(q_0, q_1, u_0), \]
\[ \mathbb{F}L(x_T, v_T) = D_2 L_d(q_{N-1}, q_N) + f_d^+(q_{N-1}, q_N, u_{N-1}), \]

that we add to the discrete optimal control problem. 
Discrete Cost Function 

We can also approximate the cost functional (13) in a single time step $h$ by 

\[ C_d(q_k, q_{k+1}, u_k) \approx \int_{kh}^{(k+1)h} C(q(t), \dot{q}(t), u(t))\, dt, \]

yielding the discrete cost functional: 

\[ \sum_{k=0}^{N-1} C_d(q_k, q_{k+1}, u_k) + \Phi_d(q_{N-1}, q_N, u_{N-1}). \]


Discrete Optimal Control Problem 

With all these elements, we have the following discrete optimal control problem: 

\[ \min \sum_{k=0}^{N-1} C_d(q_k, q_{k+1}, u_k) + \Phi_d(q_{N-1}, q_N, u_{N-1}) \]

subject to 

\[ D_2 L_d(q_{k-1}, q_k) + D_1 L_d(q_k, q_{k+1}) + f_d^+(q_{k-1}, q_k, u_{k-1}) + f_d^-(q_k, q_{k+1}, u_k) = 0, \]
\[ x_0 = q_0, \qquad x_T = q_N, \]
\[ \frac{\partial L}{\partial v}(x_0, v_0) = -D_1 L_d(q_0, q_1) - f_d^-(q_0, q_1, u_0), \]
\[ \frac{\partial L}{\partial v}(x_T, v_T) = D_2 L_d(q_{N-1}, q_N) + f_d^+(q_{N-1}, q_N, u_{N-1}), \]

with $k = 1, \ldots, N - 1$ (see Jimenez and Martin de Diego 2010; Jimenez et al. 2013 for a modification of these equations admitting piecewise controls). 

The system now is a constrained nonlinear 
optimization problem, that is, it corresponds 
to the minimization of a function subject to 
algebraic constraints. The necessary conditions
for optimality are derived by applying nonlinear
programming theory. For the concrete
implementation, it is possible to use sequential
quadratic programming (SQP) methods to
numerically solve the nonlinear optimization
problem (Ober-Blöbaum et al. 2011).
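
As a minimal sketch of what a nonlinear programming solver would be handed, the Python fragment below evaluates the equality constraints (the forced discrete Euler-Lagrange equations (14) plus the boundary conditions) and the discrete cost for one candidate trajectory. Everything specific in it is an assumption made for illustration: a one-dimensional particle with Lagrangian L = (1/2) M v^2, the discrete Lagrangian L_d(q_k, q_{k+1}) = H (1/2) M ((q_{k+1} - q_k)/H)^2, an even split of the impulse H u_k between f_d^- and f_d^+, a control-effort cost, and all function names.

```python
M = 1.0   # particle mass (assumed value)
H = 0.1   # time step h (assumed value)


def d1_ld(q0, q1):
    """D_1 L_d for L_d(q0, q1) = H * 0.5 * M * ((q1 - q0) / H) ** 2."""
    return -M * (q1 - q0) / H


def d2_ld(q0, q1):
    """D_2 L_d for the same discrete Lagrangian."""
    return M * (q1 - q0) / H


def f_minus(u):
    """Left discrete force f_d^-: half of the impulse H * u (assumed split)."""
    return 0.5 * H * u


def f_plus(u):
    """Right discrete force f_d^+: the other half of the impulse."""
    return 0.5 * H * u


def del_residuals(q, u):
    """Left-hand side of Eq. (14) for k = 1, ..., N-1."""
    n = len(q) - 1
    return [
        d2_ld(q[k - 1], q[k]) + d1_ld(q[k], q[k + 1])
        + f_plus(u[k - 1]) + f_minus(u[k])
        for k in range(1, n)
    ]


def boundary_residuals(q, u, x0, v0, x_t, v_t):
    """Boundary conditions written as residuals; dL/dv = M * v here."""
    n = len(q) - 1
    return [
        q[0] - x0,
        q[n] - x_t,
        M * v0 - (-d1_ld(q[0], q[1]) - f_minus(u[0])),
        M * v_t - (d2_ld(q[n - 1], q[n]) + f_plus(u[n - 1])),
    ]


def discrete_cost(u):
    """Sum of C_d = H * 0.5 * u_k ** 2 (control effort, no terminal term)."""
    return sum(0.5 * H * uk * uk for uk in u)


# Candidate trajectory: constant force u = 1 starting from rest.
N = 10
u_seq = [1.0] * N
q_seq = [0.5 * (k * H) ** 2 for k in range(N + 1)]
constraints = del_residuals(q_seq, u_seq) + boundary_residuals(
    q_seq, u_seq, 0.0, 0.0, 0.5, 1.0
)
```

For this particular candidate all residuals vanish up to rounding, so the trajectory is feasible; an SQP method would search over `q_seq` and `u_seq` to reduce `discrete_cost` while keeping these residuals at zero.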


Optimal Control Systems 
with Symmetries 

In many interesting cases, the continuous optimal
control of a mechanical system is defined on a Lie
group, and the Lagrangian, cost function, and control
forces are invariant under the group action. The
goal is again the same as in the previous section, 
that is, to move the system from its current state 
to a desired state in an optimal way. In this 
particular case, it is possible to adapt the contents 
of the section “Discrete Mechanics” in a similar 
way to the continuous case (from the standard 
Euler-Lagrange equations to the Euler-Poincare 
equations) and to produce the so-called Lie group 
variational integrators. These methods preserve 
the Lie group structure avoiding the use of local 
charts, projections, or constraints. Based on these 
methods, for the case of controlled mechanical 
systems, we produce the discrete Euler-Poincare 
equations with controls and the discrete cost 
function. Consequently, it is possible to deduce 
necessary optimality conditions for this class of 
invariant systems (see Bloch et al. 2009; Bou-Rabee
and Marsden 2009; Hussein et al. 2006;
Kobilarov and Marsden 2011). 

Cross-References 

► Differential Geometric Methods in Nonlinear 
Control 

► Numerical Methods for Nonlinear Optimal 
Control Problems 

► Optimal Control and Mechanics 

► Optimal Control and Pontryagin’s Maximum
Principle

Distributed Model Predictive Control

Abstract

Distributed model predictive control refers to a 
class of predictive control architectures in which 
a number of local controllers manipulate a subset 
of inputs to control a subset of outputs (states) 
composing the overall system. Different levels 
of communication and (non)cooperation exist, although
in general the most compelling properties
can be established only for cooperative schemes,
those in which all local controllers optimize local
inputs to minimize the same plantwide objective
function. Starting from state-feedback algorithms
for constrained linear systems, extensions are discussed
to cover output feedback, reference target
tracking, and nonlinear systems. An outlook on
future directions is finally presented.

Keywords 

Constrained large-scale systems; Cooperative 
control systems; Interacting dynamical systems 

Introduction and Motivations 

Large-scale systems (e.g., industrial processing 
plants, power generation networks, etc.) 
usually comprise several interconnected units 
which may exchange material, energy, and 
information streams. The overall effectiveness 
and profitability of such large-scale systems 
depend strongly on the level of local effectiveness 
and profitability of each unit but also on the level 
of interactions among the different units. An 
overall optimization goal can be achieved by 
adopting a single centralized model predictive 
control (MPC) system (Rawlings and Mayne 
2009) in which all control input trajectories are 
optimized simultaneously to minimize a common 
objective. 

This choice is often avoided for several reasons.
When the overall number of inputs and
states is very large, a single optimization problem
may require computational resources (CPU time,
memory, etc.) that are not available and/or compatible
with the system’s dynamics. Even if these
limitations do not hold, it is often the case that
organizational reasons require the use of smaller,
local controllers, which are easier to coordinate
and maintain.

Thus, industrial control systems are often decentralized,
i.e., the overall system is divided into
(possibly mildly coupled) subsystems and a local
controller is designed for each unit disregarding
the interactions from/to other subsystems.
Depending on the extent of dynamic coupling,
it is well known that the performance of such
decentralized systems may be poor, and stability
properties may even be lost. Distributed predictive
control architectures arise to meet performance
specifications (stability at minimum) similar
to centralized predictive control systems, still
retaining the modularity and local character of the
optimization problems solved by each controller.

Definitions and Architectures 
for Constrained Linear Systems 

Subsystem Dynamics, Constraints, 
and Objectives 

We start the description of distributed MPC algorithms
by considering an overall discrete-time
linear time-invariant system in the form:

$$x^+ = Ax + Bu, \qquad y = Cx \qquad (1)$$

in which $x \in \mathbb{R}^n$ and $x^+ \in \mathbb{R}^n$ are, respectively,
the system state at a given time and at a successor
time; $u \in \mathbb{R}^m$ is the input; and $y \in \mathbb{R}^p$ is the
output.

We consider that the overall system (1) is
divided into $M$ subsystems, $S_i$, defined by (disjoint)
sets of inputs and outputs (states), and each
$S_i$ is regulated by a local MPC. For each $S_i$, we
denote by $y_i \in \mathbb{R}^{p_i}$ its output, by $x_i \in \mathbb{R}^{n_i}$
its state, and by $u_i \in \mathbb{R}^{m_i}$ the control input
computed by the $i$th MPC. Due to interactions
among subsystems, the local output $y_i$ (and state
$x_i$) is affected by control inputs computed by
(some) other MPCs. Hence, the dynamics of $S_i$
can be written as

$$x_i^+ = A_i x_i + B_i u_i + \sum_{j \in \mathcal{N}_i} B_{ij} u_j, \qquad y_i = C_i x_i \qquad (2)$$

in which $\mathcal{N}_i$ denotes the indices of neighbors
of $S_i$, i.e., the subsystems whose inputs have
an influence on the states of $S_i$. To clarify the
notation, we depict in Fig. 1 the case of three
subsystems, with neighbors $\mathcal{N}_1 = \{2, 3\}$, $\mathcal{N}_2 = \{1\}$,
and $\mathcal{N}_3 = \{2\}$.

Without loss of generality, we assume that
each pair $(A_i, B_i)$ is stabilizable. Moreover, the
state of each subsystem $x_i$ is assumed known (to
the $i$th MPC) at each decision time. For each
subsystem $S_i$, inputs are required to fulfill (hard)
constraints:

$$u_i \in \mathbb{U}_i, \qquad i = 1, \ldots, M \qquad (3)$$

in which $\mathbb{U}_i$ are polyhedrons containing the
origin in their interior. Moreover, we consider
a quadratic stage cost function $\ell_i(x, u) = \frac{1}{2}(x' Q_i x + u' R_i u)$
and a terminal cost function $V_{fi}(x) = \frac{1}{2} x' P_i x$,
with $Q_i \in \mathbb{R}^{n_i \times n_i}$, $R_i \in \mathbb{R}^{m_i \times m_i}$,
and $P_i \in \mathbb{R}^{n_i \times n_i}$ positive definite.

Distributed Model Predictive Control, Fig. 1 Interconnected systems and neighbors definition

Without loss of generality, let $x_i(0)$ be the state
of $S_i$ at the current decision time. Consequently,
the finite-horizon cost function associated with
$S_i$ is given by:

$$V_i\left(x_i(0), \mathbf{u}_i, \{\mathbf{u}_j\}_{j \in \mathcal{N}_i}\right) = \sum_{k=0}^{N-1} \ell_i(x_i(k), u_i(k)) + V_{fi}(x_i(N)) \qquad (4)$$

in which $\mathbf{u}_i = (u_i(0), u_i(1), \ldots, u_i(N-1))$ is
a finite-horizon sequence of control inputs of
$S_i$, and $\mathbf{u}_j$ is similarly defined as a sequence
of control inputs of each neighbor $j \in \mathcal{N}_i$.
Notice that $V_i(\cdot)$ is a function of neighbors’ input
sequences, $\{\mathbf{u}_j\}_{j \in \mathcal{N}_i}$, due to the dynamics (2).
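
To make the dependence of the local cost on the neighbors’ inputs concrete, the following Python sketch evaluates the cost (4) by simulating the dynamics (2). It assumes a scalar subsystem for brevity; the function name `local_cost` and the argument names are illustrative and not part of the original formulation.

```python
def local_cost(x0, u_i, u_nbrs, a, b, b_nbrs, q, r, p):
    """Finite-horizon cost V_i of Eq. (4) for a scalar subsystem.

    u_i    : list of the N local inputs u_i(0), ..., u_i(N-1)
    u_nbrs : one input sequence (list of length N) per neighbor
    b_nbrs : coupling gains B_ij, one per neighbor
    """
    x = x0
    cost = 0.0
    for k, u in enumerate(u_i):
        cost += 0.5 * (q * x * x + r * u * u)        # stage cost l_i
        coupling = sum(bj * uj[k] for bj, uj in zip(b_nbrs, u_nbrs))
        x = a * x + b * u + coupling                  # dynamics (2)
    return cost + 0.5 * p * x * x                     # terminal cost V_fi


# Same local inputs, two different neighbor sequences.
cost_idle_neighbor = local_cost(1.0, [0.0, 0.0], [[0.0, 0.0]],
                                1.0, 1.0, [1.0], 1.0, 1.0, 1.0)
cost_helpful_neighbor = local_cost(1.0, [0.0, 0.0], [[-1.0, 0.0]],
                                   1.0, 1.0, [1.0], 1.0, 1.0, 1.0)
```

With identical local inputs, the second call yields a lower cost because the neighbor’s input drives the local state to the origin, which is exactly the coupling that decentralized schemes ignore.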

Decentralized, Noncooperative, and 
Cooperative Predictive Control 
Architectures 

Several levels of communications and (non)cooperation
can exist among the controllers, as
depicted in Fig. 2 for the case of two subsystems.

In decentralized MPC architectures, interactions
among subsystems are neglected by forcing
$\mathcal{N}_i = \emptyset$ for all $i$ even if this is not true.
That is, the subsystem model used in each local
controller, instead of (2), is simply

$$x_i^+ = A_i x_i + B_i u_i, \qquad y_i = C_i x_i \qquad (5)$$

Therefore, an inherent mismatch exists between
the model used by the local controllers (5) and
the actual subsystem dynamics (2). Each local
MPC solves the following finite-horizon optimal
control problem (FHOCP):

$$\mathbb{P}_i^{\mathrm{De}}: \quad \min_{\mathbf{u}_i} V_i(\cdot) \quad \text{s.t.} \quad \mathbf{u}_i \in \mathbb{U}_i^N, \quad \mathcal{N}_i = \emptyset \qquad (6)$$

We observe that in this case, $V_i(\cdot)$ depends only
on local inputs, $\mathbf{u}_i$, because it is assumed that
$\mathcal{N}_i = \emptyset$. Hence, each $\mathbb{P}_i^{\mathrm{De}}$ is solved independently
of the neighbors’ computations, and no iterations
are performed. Clearly, depending on the actual
level of interactions among subsystems, decentralized
MPC architectures can perform poorly
and may even be non-stabilizing. Performance
certification is still possible by resorting to robust
stability theory, i.e., by treating the neglected dynamics
$\sum_{j \in \mathcal{N}_i} B_{ij} u_j$ as (bounded) disturbances
(Riverso et al. 2013).

In noncooperative MPC architectures, the
existing interactions among the subsystems are
fully taken into account through (2). Given a
known value of the neighbors’ control input
sequences, $\{\mathbf{u}_j\}_{j \in \mathcal{N}_i}$, each local MPC solves the
following FHOCP:

$$\mathbb{P}_i^{\mathrm{NCDi}}: \quad \min_{\mathbf{u}_i} V_i(\cdot) \quad \text{s.t.} \quad \mathbf{u}_i \in \mathbb{U}_i^N \qquad (7)$$

The obtained solution can be exchanged with
the other local controllers to update the assumed
neighbors’ control input sequences, and iterations
can be performed. We observe that this approach
is noncooperative because local controllers try
to optimize different, possibly competing, objectives.
In general, no convergence is guaranteed in
noncooperative iterations, and when this scheme
converges, it leads to a so-called Nash equilibrium.
However, the achieved local control inputs
do not have proven stability properties (Rawlings
and Mayne 2009, §6.2.3). To ensure closed-loop
stability, variants can be formulated by including
a sequential solution of local MPC problems,
exploiting the notion (if any) of an auxiliary
stabilizing decentralized control law. Non-iterative
noncooperative schemes are also proposed, in
which stability guarantees are provided by ensuring
a decrease of a centralized Lyapunov function
at each decision time.

Distributed Model Predictive Control, Fig. 2 Three distributed control architectures: decentralized MPC (no communication, local objectives), noncooperative MPC (communication, local objectives), and cooperative MPC (communication, global objective)
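
The gap between a Nash equilibrium and the plantwide optimum can be illustrated with a small numerical sketch. The Python fragment below assumes two scalar subsystems, horizon N = 1, no input constraints, and arbitrary numerical values; all names are hypothetical. Each controller repeatedly minimizes its own cost given the other’s latest input, and the resulting fixed point is compared with the minimizer of the summed cost.

```python
# Illustrative two-subsystem model (all numbers are assumptions):
#   x1+ = A1*x1 + B1*u1 + B12*u2,   x2+ = A2*x2 + B2*u2 + B21*u1
A1, B1, B12 = 0.9, 1.0, 0.3
A2, B2, B21 = 0.8, 1.0, 0.2
R, P = 0.1, 1.0          # input weight and terminal weight (horizon N = 1)
X1, X2 = 1.0, -1.0       # current subsystem states


def local_costs(u1, u2):
    """Input-dependent parts of V_1 and V_2 for horizon N = 1."""
    v1 = 0.5 * (R * u1 ** 2 + P * (A1 * X1 + B1 * u1 + B12 * u2) ** 2)
    v2 = 0.5 * (R * u2 ** 2 + P * (A2 * X2 + B2 * u2 + B21 * u1) ** 2)
    return v1, v2


def best_response_1(u2):
    """Minimizer of V_1 over u1 for a fixed neighbor input u2."""
    return -P * B1 * (A1 * X1 + B12 * u2) / (R + P * B1 ** 2)


def best_response_2(u1):
    """Minimizer of V_2 over u2 for a fixed neighbor input u1."""
    return -P * B2 * (A2 * X2 + B21 * u1) / (R + P * B2 ** 2)


def nash_iteration(iters=50):
    """Noncooperative iterations: both controllers update simultaneously."""
    u1 = u2 = 0.0
    for _ in range(iters):
        u1, u2 = best_response_1(u2), best_response_2(u1)
    return u1, u2


def centralized_optimum():
    """Minimizer of V_1 + V_2 from the 2x2 stationarity conditions."""
    h11 = R + P * (B1 ** 2 + B21 ** 2)
    h22 = R + P * (B2 ** 2 + B12 ** 2)
    h12 = P * (B1 * B12 + B21 * B2)
    g1 = P * (B1 * A1 * X1 + B21 * A2 * X2)
    g2 = P * (B12 * A1 * X1 + B2 * A2 * X2)
    det = h11 * h22 - h12 ** 2
    return (-(h22 * g1 - h12 * g2) / det, -(h11 * g2 - h12 * g1) / det)


u_nash = nash_iteration()
u_central = centralized_optimum()
nash_cost = sum(local_costs(*u_nash))
central_cost = sum(local_costs(*u_central))
```

For these weakly coupled values the iterations do converge, but `nash_cost` exceeds `central_cost`: each controller ignores the effect of its input on the other subsystem’s objective. With stronger coupling the same iteration need not converge at all.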

Finally, in cooperative MPC architectures,
each local controller optimizes a common
(plantwide) objective:

$$V(x(0), \mathbf{u}) = \sum_{i=1}^{M} \rho_i V_i\left(x_i(0), \mathbf{u}_i, \{\mathbf{u}_j\}_{j \in \mathcal{N}_i}\right) \qquad (8)$$

in which $\rho_i > 0$, for all $i$, are given scalar weights
and $\mathbf{u} = (\mathbf{u}_1, \ldots, \mathbf{u}_M)$ is the overall control
sequence. In particular, given a known value
of other subsystems’ control input sequences,
$\{\mathbf{u}_j\}_{j \neq i}$, each local MPC solves the following
FHOCP:

$$\mathbb{P}_i^{\mathrm{CDi}}: \quad \min_{\mathbf{u}_i} V(\cdot) \quad \text{s.t.} \quad \mathbf{u}_i \in \mathbb{U}_i^N \qquad (9)$$

As in noncooperative schemes, the obtained solution
can be exchanged with the other local controllers,
and further iterations can be performed.
Notice that in $\mathbb{P}_i^{\mathrm{CDi}}$, the (possible) implications
of the local control sequence $\mathbf{u}_i$ to all other
subsystems’ objectives, $V_j(\cdot)$ with $j \neq i$, are
taken into account, as well as the effect of the
neighbors’ sequences $\{\mathbf{u}_j\}_{j \in \mathcal{N}_i}$ on the local state
evolution through (2). Clearly, this approach is
termed cooperative because all controllers compute
local inputs to minimize a global objective.
Convergence of cooperative iterations is guaranteed,
and under suitable assumptions the converged
solution is the centralized Pareto-optimal
solution (Rawlings and Mayne 2009, §6.2.4).
Furthermore, the achieved local control inputs
have proven stabilizing properties (Stewart et al.
2010). Variants are also proposed in which each
controller still optimizes a local objective, but
cooperative iterations are performed to ensure a
decrease of the global objective at each decision
time (Maestre et al. 2011).

Cooperative Distributed MPC 

Cooperative schemes are preferable over noncooperative
schemes from many points of view,
namely, in terms of superior theoretical guarantees
and no larger computational requirements. In
this section we focus on a prototype cooperative
distributed MPC algorithm adapted from Stewart
et al. (2010), highlighting the required computations
and discussing the associated theoretical
properties and guarantees.

Basic Algorithm

We present in Algorithm 1 a streamlined description
of a cooperative distributed MPC algorithm,
in which each local controller solves $\mathbb{P}_i^{\mathrm{CDi}}$, given
a previously computed value of all other subsystems’
input sequences. For each local controller,
the new iterate is defined as a convex combination
of the newly computed solution with the previous
iteration. A relative tolerance is defined, so that
cooperative iterations stop when all local controllers
have computed a new iterate sufficiently
close to the previous one. A maximum number
of cooperative iterations can also be defined, so
that a finite bound on the execution time can be
established.


We observe that Step 8 implicitly defines the
new overall iterate as a convex combination of the
overall solutions achieved by each controller, that
is,

$$\mathbf{u}^c = \sum_{i=1}^{M} w_i \left( \mathbf{u}_1^{c-1}, \ldots, \mathbf{u}_i^{*}, \ldots, \mathbf{u}_M^{c-1} \right) \qquad (10)$$

It is also important to observe that Steps 5, 8, and
9 are performed separately by each controller.

Algorithm 1 (Cooperative MPC)

Require: Overall warm start $\mathbf{u}^0 = (\mathbf{u}_1^0, \ldots, \mathbf{u}_M^0)$, convex
step weights $w_i > 0$, s.t. $\sum_{i=1}^{M} w_i = 1$, relative
tolerance parameter $\epsilon > 0$, maximum cooperative
iterations $c_{\max}$

 1: Initialize: $c \leftarrow 0$ and $e_i \leftarrow 2\epsilon$ for $i = 1, \ldots, M$.
 2: while ($c < c_{\max}$) and ($\exists i \mid e_i > \epsilon$) do
 3:   $c \leftarrow c + 1$.
 4:   for $i = 1$ to $M$ do
 5:     Solve $\mathbb{P}_i^{\mathrm{CDi}}$ in (9) obtaining $\mathbf{u}_i^*$.
 6:   end for
 7:   for $i = 1$ to $M$ do
 8:     Define new iterate: $\mathbf{u}_i^c = w_i \mathbf{u}_i^* + (1 - w_i)\, \mathbf{u}_i^{c-1}$.
 9:     Compute convergence error: $e_i = \dfrac{\|\mathbf{u}_i^c - \mathbf{u}_i^{c-1}\|}{1 + \|\mathbf{u}_i^c\|}$.
10:   end for
11: end while
12: return Overall solution: $\mathbf{u}^c = (\mathbf{u}_1^c, \ldots, \mathbf{u}_M^c)$.

Properties

The basic cooperative MPC described in Algorithm 1
enjoys several nice theoretical and practical
properties, as detailed in Rawlings and Mayne
(2009, §6.3.1):

1. Feasibility of each iterate: $\mathbf{u}_i^{c-1} \in \mathbb{U}_i^N$ implies
$\mathbf{u}_i^{c} \in \mathbb{U}_i^N$, for all $i = 1, \ldots, M$ and $c \in \mathbb{I}_{>0}$.

2. Cost decrease at each iteration: $V(x(0), \mathbf{u}^c) \le V(x(0), \mathbf{u}^{c-1})$
for all $c \in \mathbb{I}_{>0}$.

3. Cost convergence to the centralized optimum:
$\lim_{c \to \infty} V(x(0), \mathbf{u}^c) = \min_{\mathbf{u} \in \mathbb{U}^N} V(x(0), \mathbf{u})$,
in which $\mathbb{U} = \mathbb{U}_1 \times \cdots \times \mathbb{U}_M$.

Resorting to suboptimal MPC theory, the
above properties (1) and (2) can be exploited
to show that the origin of the closed-loop system

$$x^+ = Ax + B\kappa^c(x), \quad \text{with } \kappa^c(x) = u^c(0) \qquad (11)$$

is exponentially stable for any finite $c \in \mathbb{I}_{>0}$. This
result is of paramount (practical and theoretical)
importance because it ensures closed-loop
stability using cooperative distributed MPC
with any finite number of cooperative iterations.
As in centralized MPC based on the
solution of a FHOCP (Rawlings and Mayne
2009, §2.4.3), particular care of the terminal
cost function $V_{fi}(\cdot)$ is necessary, possibly
in conjunction with a terminal constraint
$x_i(N) \in \mathbb{X}_{fi}$. Several options can be adopted
as discussed, e.g., in Stewart et al. (2010,
2011).
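
The steps of Algorithm 1 can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions: two scalar subsystems, horizon N = 1 (so each input sequence is a single number), weights ρ_i = 1, box input constraints, and arbitrary numerical values. The names `plant_cost`, `solve_local`, and `cooperative_mpc` are hypothetical, and the normalization used for the convergence error is likewise an assumption of this sketch.

```python
# Two scalar subsystems, horizon N = 1 (all numbers are assumptions):
#   x1+ = A1*x1 + B11*u1 + B12*u2,   x2+ = A2*x2 + B21*u1 + B22*u2
A = (0.9, 0.8)                 # A_1, A_2
B = ((1.0, 0.3), (0.2, 1.0))   # B[s][j]: effect of input j on state s
R, P = 0.1, 1.0                # input and terminal weights
U_MAX = 2.0                    # box constraint |u_i| <= U_MAX


def plant_cost(x0, u):
    """Plantwide objective V with rho_i = 1 (constant state term omitted)."""
    cost = 0.0
    for i in range(2):
        x_next = A[i] * x0[i] + B[i][0] * u[0] + B[i][1] * u[1]
        cost += 0.5 * (R * u[i] ** 2 + P * x_next ** 2)
    return cost


def solve_local(i, x0, u):
    """Step 5: minimize V over u_i with the other input held fixed."""
    j = 1 - i
    # V is a convex parabola in u_i: curv * u_i**2 / 2 + lin * u_i + const.
    curv = R + P * (B[0][i] ** 2 + B[1][i] ** 2)
    lin = sum(P * B[s][i] * (A[s] * x0[s] + B[s][j] * u[j]) for s in range(2))
    u_star = -lin / curv
    return max(-U_MAX, min(U_MAX, u_star))   # clipping is exact in one dimension


def cooperative_mpc(x0, u0, w=(0.5, 0.5), eps=1e-9, c_max=200):
    """Cooperative iterations; returns the final inputs and the cost history."""
    u = list(u0)
    err = [2 * eps, 2 * eps]
    costs = [plant_cost(x0, u)]
    c = 0
    while c < c_max and any(e > eps for e in err):
        c += 1
        u_star = [solve_local(i, x0, u) for i in range(2)]                    # Steps 4-6
        u_new = [w[i] * u_star[i] + (1 - w[i]) * u[i] for i in range(2)]      # Step 8
        err = [abs(u_new[i] - u[i]) / (1 + abs(u_new[i])) for i in range(2)]  # Step 9
        u = u_new
        costs.append(plant_cost(x0, u))
    return u, costs


u_final, cost_history = cooperative_mpc((1.0, -1.0), (0.0, 0.0))
```

In this sketch `cost_history` is nonincreasing, matching property 2, and every intermediate iterate is feasible, so the loop could be stopped early (small `c_max`) and still return a usable input.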

Moreover, the results in Pannocchia et al.
(2011) can be used to show inherent robust
stability to system’s disturbances and measurement
errors. Therefore, we can confidently
state that well-designed distributed cooperative
MPC and centralized MPC algorithms share
the same guarantees in terms of stability and
robustness.


Complementary Aspects 

We discuss in this section a number of complementary
aspects of distributed MPC algorithms,
omitting technical details for the sake of space. 

Coupled Input Constraints and State 
Constraints 

Convergence of the solution of cooperative 
distributed MPC towards the centralized (global) 
optimum holds when input constraints are in the 
form of (3), i.e., when no constraints involve 
inputs of different subsystems. Sometimes this 
assumption fails to hold, e.g., when several 
units share a common utility resource, that is, in 
addition to (3) some constraints involve inputs of 
more than one unit. In this situation, it is possible 
that Algorithm 1 remains stuck at a fixed point, 
without improving the cost, even if it is still away 
from the centralized optimum (Rawlings and 
Mayne 2009, §6.3.2). It is important to point out 
that this situation is harmless from a closed- 
loop stability and robustness point of view. 
However, the degree of suboptimality in 
comparison with centralized MPC could be 
undesired from a performance point of view. 
To overcome this situation, a slightly different 
partitioning of the overall inputs into non-disjoint 
sets can be adopted (Stewart et al. 2010). 
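
A two-input toy problem shows how a coupled constraint can stall the iterations. The Python sketch below is an assumed example, not taken from the cited references: the cost is (u1 - 1)^2 + (u2 - 1)^2, the shared-resource constraint is u1 + u2 <= 1, and each controller optimizes only its own input with equal weights 1/2.

```python
def coordinate_minimizer(target, upper):
    """Minimize (u - target) ** 2 subject to u <= upper."""
    return min(target, upper)


def cost(u1, u2):
    """Plantwide cost of the toy problem."""
    return (u1 - 1.0) ** 2 + (u2 - 1.0) ** 2


BUDGET = 1.0   # coupled constraint u1 + u2 <= BUDGET (assumed shared resource)

u1, u2 = 1.0, 0.0          # feasible start that already uses the whole budget
for _ in range(20):
    u1_star = coordinate_minimizer(1.0, BUDGET - u2)   # controller 1, u2 fixed
    u2_star = coordinate_minimizer(1.0, BUDGET - u1)   # controller 2, u1 fixed
    u1, u2 = 0.5 * u1_star + 0.5 * u1, 0.5 * u2_star + 0.5 * u2

stuck_cost = cost(u1, u2)          # cost at the point where iterations stall
central_cost = cost(0.5, 0.5)      # centralized optimum splits the budget
```

Neither controller can improve on its own from (1, 0), because any increase of u2 requires a simultaneous decrease of u1. The iterate stays feasible, but its cost remains above the centralized value.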

Similarly, the presence of state constraints,
even in decentralized form $x_i \in \mathbb{X}_i$ (with $i = 1, \ldots, M$),
can prevent convergence of a cooperative
algorithm towards the centralized optimum.
It is also important to point out that the local MPC
controlling $S_i$ needs to consider in the optimal
control problem, besides local state constraints
$x_i \in \mathbb{X}_i$, also state constraints of all other
subsystems $S_j$ such that $i \in \mathcal{N}_j$. This ensures
feasibility of each iterate and cost reduction,
hence closed-loop stability (and robustness) can
be established.


Output Feedback and Offset-Free Tracking 

When the subsystem state cannot be directly measured,
each local controller can use a local state
estimator, namely, a Kalman filter (or Luenberger
observer). Assuming that the pair $(A_i, C_i)$ is
detectable, the subsystem state estimate evolves
as follows:

$$\hat{x}_i^+ = A_i \hat{x}_i + B_i u_i + \sum_{j \in \mathcal{N}_i} B_{ij} u_j + L_i (y_i - C_i \hat{x}_i) \qquad (12)$$

in which $L_i \in \mathbb{R}^{n_i \times p_i}$ is the local Kalman predictor
gain, chosen such that the matrix $(A_i - L_i C_i)$
is Schur. Stability of the closed-loop origin can
still be established using minor variations (Rawlings
and Mayne 2009, §6.3.3).
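
The estimator update (12) can be sketched for a scalar subsystem as follows. The plant values, the gain, and the input signals are assumptions chosen only so that A_i - L_i C_i = 0.3 lies inside the unit circle; `observer_step` is a hypothetical helper name.

```python
def observer_step(x_hat, u_i, u_nbrs, y_i, a, b, b_nbrs, c, gain):
    """One update of the local estimator (12) for a scalar subsystem."""
    coupling = sum(bj * uj for bj, uj in zip(b_nbrs, u_nbrs))
    return a * x_hat + b * u_i + coupling + gain * (y_i - c * x_hat)


A_I, B_I, B_IJ, C_I = 1.2, 1.0, [0.5], 1.0   # open-loop unstable subsystem
GAIN = 0.9                                    # A_I - GAIN * C_I = 0.3 (Schur)

x_true, x_hat = 1.0, 0.0                      # initial estimation error of 1
for _ in range(10):
    u_i, u_nbrs = -0.5 * x_hat, [0.1]         # arbitrary known inputs
    y_i = C_I * x_true                        # noise-free measurement
    x_hat = observer_step(x_hat, u_i, u_nbrs, y_i, A_I, B_I, B_IJ, C_I, GAIN)
    x_true = A_I * x_true + B_I * u_i + B_IJ[0] * u_nbrs[0]
estimation_error = x_true - x_hat
```

Because the local and neighbor inputs enter the plant and the estimator identically, the error obeys e+ = (A_i - L_i C_i) e and shrinks by a factor 0.3 per step regardless of the inputs applied.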

When offset-free control is sought, each local
MPC can be equipped with an integrating
disturbance model similarly to centralized offset-free
MPC algorithms (Pannocchia and Rawlings
2003). Given the current estimate of the subsystem
state and disturbance, a target calculation
problem is solved to compute the state and input 
equilibrium pair such that (a subset of) output 
variables correspond to given set points. Such a 
target calculation problem can be performed in 
a centralized fashion or in a distributed manner, 
although in the latter case several issues arise 
and associated precautions should be taken into 
account (Rawlings and Mayne 2009, §6.3.4). 

Distributed Control for Nonlinear Systems 

Several nonlinear distributed MPC algorithms 
have been recently proposed (Liu et al. 2009; 
Stewart et al. 2011). Some schemes require the 
presence of a coordinator, thus introducing a 
hierarchical structure (Scattolini 2009). In Stewart
et al. (2011), instead, a cooperative distributed
MPC architecture similar to the one discussed in
the previous section has been proposed for nonlinear
systems. Each local controller considers
the following subsystem model:

$$x_i^+ = f_i(x_i, u_i, u_j), \quad \text{with } j \in \mathcal{N}_i$$
$$y_i = h_i(x_i)$$

A problem (formally) identical to $\mathbb{P}_i^{\mathrm{CDi}}$ in (9)
is solved by each controller and cooperative
iterations are performed. However, non-convexity
of $\mathbb{P}_i^{\mathrm{CDi}}$ can make a convex combination
step similar to Step 8 in Algorithm 1 not
necessarily a cost improvement. As a workaround
in such cases, Stewart et al. (2011) propose
deleting the least effective control sequence
computed by a local controller (repeating this
deletion if necessary). In this way it is possible to
show a monotonic decrease of the cost function
at each cooperative iteration.

Summary and Future Directions 

We presented the basics and foundations of 
distributed model predictive control (DMPC) 
schemes, which prove useful and effective in the 
control of large-scale systems for which a single 
centralized predictive controller is not regarded 
as a possible or desirable solution, e.g., due to organizational
requirements and/or computational
limitations. In DMPCs, the overall controlled 
system is organized into a number of subsystems, 
in general featuring some dynamic couplings, and 
for each subsystem a local MPC is implemented. 

Different flavors of communication and cooperation
among the local controllers can be chosen
by the designer, ranging from decentralized to
cooperative schemes. In cooperative DMPC
algorithms, the dynamic interactions among
the subsystems are fully taken into account,
with limited communication overheads, and the
same overall objective can be optimized by each
local controller. When cooperative iterations
are performed until convergence, such DMPC
algorithms achieve the same global minimum
control sequence as that of the centralized MPC.
Termination prior to convergence does not hinder
stability and robustness guarantees.

In this contribution, after giving an
overview of possible communication and
cooperation schemes, we addressed the design of
a state-feedback distributed MPC algorithm
for linear systems subject to input constraints,
with convergence and stability guarantees. Then,
we discussed various extensions to coupled
input constraints and state constraints, output
feedback, reference target tracking, and nonlinear
systems.

The research on DMPC algorithms has been
extensive during the last decade, and some excellent
review papers have been recently made available
(Christofides et al. 2013; Scattolini 2009).
Still, we expect DMPC to attract research efforts
in various directions, as briefly discussed:

• Nonlinear DMPC algorithms (Liu et al. 2009;
Stewart et al. 2011) will require improvements
in terms of global optimum goals.

• Economic DMPC and tracking DMPC (Ferramosca
et al. 2013) will replace current formulations
designed for regulation around the
origin, especially for nonlinear systems.

• Reconfigurability, e.g., addition/deletion of
local controllers, is an ongoing topic, and
preliminary results available for decentralized
architectures (Riverso et al. 2013) may be
extended to cooperative and noncooperative
schemes. It is also desirable to improve
the resilience of DMPC to communication
disruptions (Alessio et al. 2011).

• Preliminary results on constrained distributed
estimation (Farina et al. 2012) will draw attention
and require further insights to bridge
the gap between constrained estimation and
control algorithms.

• Specific optimization algorithms tailored to
DMPC local problems (Doan et al. 2011)
will increase the effectiveness of DMPC algorithms,
and distributed optimization
approaches will be exploited even for dynamically
uncoupled systems.


Cross-References 

► Cooperative Solutions to Dynamic Games 

► Nominal Model-Predictive Control 

► Optimization Algorithms for Model Predictive 
Control 

► Tracking Model Predictive Control

Recommended Reading

General overviews on DMPC can be found in
Christofides et al. (2013), Rawlings and Mayne
(2009), and Scattolini (2009). DMPC algorithms
for linear systems are discussed in Alessio et al.
(2011), Ferramosca et al. (2013), Riverso et al.
(2013), Stewart et al. (2010), and Maestre et al.
(2011), and for nonlinear systems in Farina et al.
(2012), Liu et al. (2009), and Stewart et al.
(2011). Supporting results for implementation
and robustness theory can be found in Doan et al.
(2011), Pannocchia and Rawlings (2003), and
Pannocchia et al. (2011).

Abstract

The paper provides an overview of the distributed
first-order optimization methods for solving a
constrained convex minimization problem, where
the objective function is the sum of local objective
functions of the agents in a network. This
problem has gained a lot of interest due to its
emergence in many applications in distributed
control and coordination of autonomous agents
and distributed estimation and signal processing
in wireless networks.

Introduction

There has been much recent interest in distributed 
optimization pertinent to optimization aspects 
arising in control and coordination of networks 
consisting of multiple (possibly mobile) agents 
and in estimation and signal processing in sensor 
networks (Bullo et al. 2009; Hendrickx 2008; 
Kar and Moura 2011; Martinoli et al. 2013; 
Mesbahi and Egerstedt 2010; Olshevsky 2010). 
In many of these applications, the network
system goal is to optimize a global objective
through local agent-based computations and
local information exchange with immediate
neighbors in the underlying communication
network. This is motivated mainly by the
emergence of large-scale data and/or large-scale
networks and new networking applications
such as mobile ad hoc networks and wireless
sensor networks, characterized by the lack of
centralized access to information and time-varying
connectivity. Control and optimization
algorithms deployed in such networks should
be completely distributed (relying only on local
observations and information), robust against
unexpected changes in topology (i.e., link or node
failures) and against unreliable communication
(noisy links or quantized data) (see ► Networked
Systems). Furthermore, it is desired that the
algorithms be scalable in the size of the
network.

Generally speaking, the problem of distributed 
optimization consists of three main components: 

1. The optimization problem that the network of 
agents wants to solve collectively (specifying 
an objective function and constraints) 

2. The local information structure, which describes
what information is locally known or
observable by each agent in the system (who
knows what and when)

3. The communication structure, which specifies
the connectivity topology of the underlying
communication network and other
features of the communication environment

The algorithms for solving such global 
network problems need to comply with the 
distributed knowledge about the problem among 
the agents and obey the local connectivity 
structure of the communication network 
(►Networked Systems; ►Graphs for Modeling 
Networked Interactions). 


Networked System Problem 

Given a set $N = \{1, 2, \ldots, n\}$ of agents (also
referred to as nodes), the global system problem
has the following form:

$$\text{minimize } \sum_{i=1}^{n} f_i(x) \quad \text{subject to } x \in X. \qquad (1)$$

Each $f_i : \mathbb{R}^d \to \mathbb{R}$ is a convex function which
represents the local objective of agent $i$, while
$X \subseteq \mathbb{R}^d$ is a closed convex set. The function
$f_i$ is a private function known only to agent $i$,
while the set $X$ is commonly known by all agents
$i \in N$. The vector $x \in X$ represents a global
decision vector which the agents want to optimize
using local information. The problem is a simple
constrained convex optimization problem, where
the global objective function is given by the sum
of the individual objective functions $f_i(x)$ of the
agents in the system. As such, the objective function
is the sum of non-separable convex functions
corresponding to multiple agents connected over
a network.

As an example, consider the problem arising
in support vector machines (SVMs), which are
a popular tool for classification problems. Each
agent $i$ has a set $S_i = \{(a_j^i, b_j^i)\}_{j=1}^{m_i}$ of
sample-label pairs, where $a_j^i \in \mathbb{R}^d$ is a data
point and $b_j^i \in \{+1, -1\}$ is its corresponding
(correct) label. The number $m_i$ of data points for
every agent $i$ is typically very large (hundreds
of thousands). Without sharing the data points,
the agents want to collectively find a hyperplane
that separates all the data, i.e., a hyperplane that
separates (with a maximal separation distance)
the data with label 1 from the data with label $-1$
in the global data set $\bigcup_{i=1}^{n} S_i$. Thus, the agents
need to solve an unconstrained version of the
problem (1), where the decision variable $x \in \mathbb{R}^d$
is a hyperplane normal and the objective function
$f_i$ of agent $i$ is given by

$$f_i(x) = \frac{\lambda}{2} \|x\|^2 + \sum_{j=1}^{m_i} \max\left\{0,\, 1 - b_j^i \left(x' a_j^i\right)\right\},$$

where $\lambda$ is a regularization parameter (common
to all agents).
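
For concreteness, the local objective above and one of its subgradients can be evaluated as in the following Python sketch. The function names and the two-point data set are illustrative assumptions; the hinge term is not differentiable at margin 1, so the second function returns a subgradient (it takes the zero element there) rather than a gradient.

```python
def svm_local_objective(x, samples, lam):
    """f_i(x) = lam/2 * ||x||^2 + sum_j max(0, 1 - b_j * <x, a_j>)."""
    reg = 0.5 * lam * sum(xi * xi for xi in x)
    hinge = 0.0
    for a, b in samples:
        margin = b * sum(xi * ai for xi, ai in zip(x, a))
        hinge += max(0.0, 1.0 - margin)
    return reg + hinge


def svm_local_subgradient(x, samples, lam):
    """A subgradient of f_i at x; only samples with margin < 1 contribute."""
    g = [lam * xi for xi in x]
    for a, b in samples:
        margin = b * sum(xi * ai for xi, ai in zip(x, a))
        if margin < 1.0:
            g = [gi - b * ai for gi, ai in zip(g, a)]
    return g


# Tiny private data set of one agent: (data point, label) pairs.
samples_i = [((2.0, 0.0), 1), ((-2.0, 0.0), -1)]
value_at_origin = svm_local_objective((0.0, 0.0), samples_i, 1.0)
value_separating = svm_local_objective((1.0, 0.0), samples_i, 1.0)
subgrad_at_origin = svm_local_subgradient((0.0, 0.0), samples_i, 1.0)
```

Only `samples_i` is private to the agent; the decision vector `x` is what would be exchanged with neighbors in the distributed algorithms discussed in this entry.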

The network communication structure is represented
by a directed (or undirected) graph $G = (N, E)$,
with the vertex set $N$ and the edge set
$E$. The network is used as a medium to diffuse
the information from an agent to every other
agent through local agent interactions over time
(► Graphs for Modeling Networked Interactions).
To accommodate the information spread through
the entire network, it is typically assumed that the
network communication graph $G = (N, E)$ is
strongly connected (► Dynamic Graphs, Connectivity
of). In the graph, a link $(i, j)$ means that
agent $i \in N$ receives the relevant information
from agent $j \in N$.

Distributed Algorithms 

The algorithms for solving problem (1) are constructed
by using standard optimization techniques
in combination with a mechanism for
information diffusion through local agent interactions.
A control point of view for the design
of distributed algorithms has a nice exposition in
Wang and Elia (2011).

One of the existing optimization techniques
is the so-called incremental method, where the
information is processed along a directed cycle
in the graph. In this approach the estimate is
passed from an agent to its neighbor (along the
cycle), and only one agent updates at a time
(Bertsekas 1997; Blatt et al. 2007; Johansson
2008; Johansson et al. 2009; Nedic and Bertsekas
2000, 2001; Nedic et al. 2001; Rabbat and Nowak
2004; Ram et al. 2009; Tseng 1998); for a detailed
literature on incremental methods, see the
textbooks Bertsekas (1999) and Bertsekas et al.
(2003).
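
A minimal sketch of the incremental idea, under assumed quadratic local objectives f_i(x) = (1/2)(x - c_i)^2 and a diminishing stepsize, is given below in Python. The single estimate `x` is passed around the cycle and updated by one agent at a time; the function name and the data are hypothetical.

```python
def incremental_gradient(targets, cycles=2000):
    """Cyclic incremental gradient for the sum over i of 0.5 * (x - c_i) ** 2."""
    x = 0.0
    for k in range(cycles):
        step = 1.0 / (k + 1)          # diminishing stepsize, one value per cycle
        for c in targets:             # agents update one at a time along the cycle
            x -= step * (x - c)       # gradient step on the current agent's f_i
    return x


estimate = incremental_gradient([1.0, 2.0, 6.0])
```

The minimizer of the sum is the mean of the `targets` (3 here). With a constant stepsize the estimate would keep oscillating within a cycle, which is why a diminishing stepsize is assumed.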

More recently, one of the techniques that
gained popularity as a mechanism for information
diffusion is a consensus protocol, in which
the agents diffuse the information through the
network by locally weighted averaging of
their incoming data (►Averaging Algorithms
and Consensus). The problem of reaching
a consensus on a particular scalar value, or
computing exact averages of the initial values of
the agents, has gained an unprecedented interest
as a central problem inherent to cooperative
behavior in networked systems (Blondel et al.
2005; Boyd et al. 2005; Cao et al. 2005, 2008a,b;
Jadbabaie et al. 2003; Olfati-Saber and Murray
2004; Olshevsky 2010; Olshevsky and Tsitsiklis
2006, 2009; Touri 2011; Vicsek et al. 1995; Wan
and Lemmon 2009).

Using the consensus technique, a class 
of distributed algorithms has emerged, as a 
combination of the consensus protocols and the 
gradient-type methods. The gradient-based 
approaches are particularly suitable, as they 
have a small overhead per iteration and are, in 
general, robust to various sources of errors and 
uncertainties. 

The technique of using the network as a
medium to propagate the relevant information
for optimization purposes has its origins in the
work by Tsitsiklis (1984), Tsitsiklis et al. (1986), 
and Bertsekas and Tsitsiklis (1997), where the 
network has been used to decompose the vector x 
components across different agents, while all 
agents share the same objective function. 

Algorithms Using Weighted Averaging 

The technique has recently been employed in
Nedic and Ozdaglar (2009) (see also Nedic and
Ozdaglar 2007, 2010) to deal with problems
of the form (1) when the agents have different
objective functions $f_i$, but their decisions are
fully coupled through the common vector variable
$x$. In a series of recent works (to be detailed
later), the following distributed algorithm has
emerged. Letting $x_i(k) \in X$ be an estimate
(of the optimal decision) at agent $i$ and time
$k$, the next iterate is constructed through two
updates. The first update is a consensus-like
iteration, whereby, upon receiving the estimates
$x_j(k)$ from its (in)neighbors $j$, the agent $i$ aligns
its estimate with its neighbors through averaging,
formally given by

$$v_i(k) = \sum_{j \in N_i} w_{ij} x_j(k), \qquad (2)$$

where $N_i$ is the neighbor set

$$N_i = \{ j \in N \mid (j,i) \in E \} \cup \{i\}.$$

The neighbor set $N_i$ includes agent $i$ itself, since
the agent always has access to its own information.
The scalar $w_{ij}$ is a nonnegative weight that
agent $i$ places on the incoming information from
neighbor $j \in N_i$. These weights sum to 1, i.e.,
$\sum_{j \in N_i} w_{ij} = 1$, thus yielding $v_i(k)$ as a (local)
convex combination of $x_j(k)$, $j \in N_i$, obtained
by agent $i$.

After computing $v_i(k)$, agent $i$ computes a
new iterate $x_i(k+1)$ by performing a gradient-projection
step, aimed at minimizing its own
objective $f_i$, of the following form:

$$x_i(k+1) = \Pi_X\big[v_i(k) - \alpha(k) \nabla f_i(v_i(k))\big], \qquad (3)$$

where $\Pi_X[x]$ is the projection of a point $x$ on the
set $X$ (in the Euclidean norm), $\alpha(k) > 0$ is the
stepsize at time $k$, and $\nabla f_i(z)$ is the gradient of
$f_i$ at a point $z$.

When all functions $f_i$ are zero and $X$ is the entire
space $\mathbb{R}^d$, the distributed algorithm (2) and (3)
reduces to the linear-iteration method:

$$x_i(k+1) = \sum_{j \in N_i} w_{ij} x_j(k), \qquad (4)$$

which is known as the consensus or agreement protocol.
This protocol is employed when the agents
in the network wish to align their decision vectors
$x_i(k)$ to a common vector $x$. The alignment is
attained asymptotically (as $k \to \infty$).
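The linear iteration (4) can be illustrated with a minimal sketch. The four-agent ring, the particular weights, and the function name `consensus_step` are assumptions made for illustration only; they are not taken from the entry. The weight matrix is chosen to be doubly stochastic, so the estimates should approach the average of the initial values.

```python
def consensus_step(W, x):
    """One round of x_i(k+1) = sum_j w_ij * x_j(k) for scalar estimates."""
    n = len(x)
    return [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]


# Hypothetical 4-agent undirected ring with self-loops. Every row and every
# column of W sums to 1 (doubly stochastic); w_ij = 0 for non-neighbors.
W = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]

x = [1.0, 5.0, -2.0, 8.0]  # initial values x_i(0); their average is 3.0
for _ in range(200):
    x = consensus_step(W, x)

print(x)  # every entry is numerically close to the initial average 3.0
```

Because each agent only multiplies the values of its neighbors by local weights, the same loop applies to any graph once `W` respects the neighbor structure.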

In the presence of objective functions and constraints,
the distributed algorithm in (2) and (3)
corresponds to “forced alignment” guided by the
gradient forces $\sum_{i=1}^{n} \nabla f_i(x)$. Under appropriate
conditions, the alignment is forced to a common
vector $x^*$ that minimizes the network objective
$\sum_{i=1}^{n} f_i(x)$ over the set $X$. This corresponds to
the convergence of the iterates $x_i(k)$ to a common
solution $x^* \in X$ as $k \to \infty$ for all agents $i \in N$.

The conditions under which the convergence
of $\{x_i(k)\}$ to a common solution $x^* \in X$ occurs
are a combination of the conditions needed for
the consensus protocol to converge and the conditions
imposed on the functions $f_i$ and the stepsize
$\alpha(k)$ to ensure the convergence of standard
gradient-projection methods. For the consensus
part, the conditions should guarantee that the
pure consensus protocol in (4) converges to the
average $\frac{1}{n} \sum_{i=1}^{n} x_i(0)$ of the initial agent values.
This requirement translates to the condition that
the weights $w_{ij}$ give rise to a doubly stochastic
weight matrix $W$, whose entries are $w_{ij}$ defined
by the weights in (4) and augmented by $w_{ij} = 0$
for $j \notin N_i$.

Intuitively, the requirement that the weight
matrix $W$ is doubly stochastic ensures that each
agent has the same influence on the system behavior
in the long run. More specifically, the doubly
stochastic weights ensure that the system as a
whole minimizes $\sum_{i=1}^{n} \frac{1}{n} f_i(x)$, where the factor
$1/n$ is seen as the portion of the influence of
agent $i$. When the weight matrix $W$ is only row
stochastic, the consensus protocol converges to
a weighted average $\sum_{i=1}^{n} \pi_i x_i(0)$ of the agent
initial values, where $\pi$ is the left eigenvector of
$W$ associated with the eigenvalue 1. In general,
the values $\pi_i$ can be different for different indices
$i$, and the distributed algorithm in (2) and (3)
results in minimizing the function $\sum_{i=1}^{n} \pi_i f_i(x)$
over $X$, and thus not solving problem (1).
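The effect of a weight matrix that is row stochastic but not doubly stochastic can be seen in a small numerical sketch. The three-agent matrix below and the helper names `left_eigenvector` and `run_consensus` are hypothetical; the left eigenvector is normalized so that its entries sum to 1, and it is approximated by a simple power iteration rather than an exact eigen-solver.

```python
def left_eigenvector(W, iters=500):
    """Approximate pi with pi W = pi and sum(pi) = 1 by iterating p <- p W."""
    n = len(W)
    p = [1.0 / n] * n
    for _ in range(iters):
        p = [sum(p[i] * W[i][j] for i in range(n)) for j in range(n)]
    return p


def run_consensus(W, x, iters=500):
    """Iterate x_i(k+1) = sum_j w_ij x_j(k) for scalar estimates."""
    n = len(x)
    for _ in range(iters):
        x = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


# Hypothetical weights: rows sum to 1, columns do not.
W = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
x0 = [4.0, 0.0, 0.0]

pi = left_eigenvector(W)                    # close to [0.25, 0.5, 0.25]
limit = run_consensus(W, x0)                # all entries agree
weighted = sum(p * v for p, v in zip(pi, x0))
print(pi, limit, weighted, sum(x0) / len(x0))
```

In this example the agents agree on the weighted value 1.0 rather than on the plain average 4/3, which is the bias that the doubly stochastic requirement removes.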

The distributed algorithm (2) and (3) has
been proposed and analyzed in Ram et al.
(2010a, 2012), where the convergence has
been established for a diminishing stepsize rule
(i.e., $\sum_k \alpha(k) = \infty$ and $\sum_k \alpha^2(k) < \infty$). As
seen from the convergence analysis (see, e.g.,
Nedic and Ozdaglar 2009; Ram et al. 2012),
for the convergence of the method, it is critical
that the iterate disagreements $\|x_i(k) - x_j(k)\|$
converge linearly in time, for all $i \neq j$. This fast
disagreement decay seems to be indispensable
for ensuring the stability of the iterative process
(2) and (3).
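A minimal sketch of the two-step iteration (2) and (3) follows, under assumptions chosen only to keep it short: scalar decisions, hypothetical quadratic objectives $f_i(x) = \frac{1}{2}(x - a_i)^2$ (so $\nabla f_i(x) = x - a_i$), an interval constraint set $X = [lo, hi]$, a doubly stochastic ring weight matrix, and the stepsize $\alpha(k) = 1/(k+1)$, which is non-summable but square-summable. The function names are illustrative.

```python
def project_interval(z, lo, hi):
    """Euclidean projection of a scalar z onto the interval [lo, hi]."""
    return min(max(z, lo), hi)


def distributed_gradient_projection(W, a, lo, hi, x0, iters=2000):
    """Run (2) and (3) for f_i(x) = 0.5 * (x - a_i)**2 over X = [lo, hi]."""
    n = len(a)
    x = list(x0)
    for k in range(iters):
        alpha = 1.0 / (k + 1)  # diminishing stepsize (an assumed choice)
        # Step (2): weighted averaging of neighbor estimates.
        v = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Step (3): local gradient step followed by projection onto X.
        x = [project_interval(v[i] - alpha * (v[i] - a[i]), lo, hi)
             for i in range(n)]
    return x


W = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]
a = [1.0, 5.0, -2.0, 8.0]  # sum_i f_i is minimized at mean(a) = 3.0
x = distributed_gradient_projection(W, a, -10.0, 10.0, [0.0] * 4)
print(x)  # all agents end up near 3.0, with a small residual disagreement
```

With a diminishing stepsize the residual disagreement shrinks with $\alpha(k)$, so more iterations give estimates that are closer to each other and to the minimizer.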

According to the distributed algorithm (2) and
(3), each agent first aligns its estimate $x_i(k)$
with the estimates $x_j(k)$ that are received from
its neighbors and then updates based on its local
objective $f_i$, which is to be minimized over $x \in X$.
Alternatively, the distributed method can be
constructed by interchanging the alignment step
and the gradient-projection step. Such a method,
while having the same asymptotic performance
as the method in (2) and (3), exhibits a somewhat
slower (transient) convergence behavior due to
a larger misalignment resulting from taking the
gradient-based updates first. This alternative
was initially proposed independently in
Lopes and Sayed (2006) (where simulation
results have been reported), in Nedic and
Ozdaglar (2007, 2009) (where the convergence
analysis for time-varying networks and a
constant stepsize is given), and in Nedic et al.
(2008) (with the quantization effects) and further
investigated in Lopes and Sayed (2008), Nedic
et al. (2010), and Cattivelli and Sayed (2010).
More recently, it has been considered in Lobel
et al. (2011) for state-dependent weights and in
Tu and Sayed (2012), where the performance is
compared with that of an algorithm of the form
(2) and (3) for estimation problems.

Algorithm Extensions: Over the past years,
many extensions of the distributed algorithm in 
(2) and (3) have been developed, including the 
following: 

(a) Time-varying communication graphs: The
algorithm naturally extends to the case of
time-varying connectivity graphs $\{G(k)\}$,
with $G(k) = (N, E(k))$ defined over
the node set $N$ and time-varying links
$E(k)$. In this case, the weights $w_{ij}$ in (2)
are replaced with $w_{ij}(k)$ and, similarly,
the neighbor set $N_i$ is replaced with the
corresponding time-dependent neighbor set
$N_i(k)$ specified by the graph $G(k)$. The
convergence of the algorithm (2) and (3)
with these modifications typically requires
some additional assumptions on the network
connectivity over time and on the entries
in the corresponding weight
matrix sequence $\{W(k)\}$, where $w_{ij}(k) = 0$
for $j \notin N_i(k)$. These conditions are the same
as those that guarantee the convergence of
the (row stochastic) matrix sequence $\{W(k)\}$
to a rank-one row-stochastic matrix, such
as connectivity over some fixed period
(of a sliding-time window), nonzero
diagonal entries in $W(k)$, and the existence of
a uniform lower bound on the positive entries in
$W(k)$; see, for example, Cao et al. (2008b),
Touri (2011), Tsitsiklis (1984), Nedic and
Ozdaglar (2010), Moreau (2005), and Ren
and Beard (2005).

(b) Noisy gradients: The algorithm in (2) and (3)
also works when the gradient computations
$\nabla f_i(x)$ in update (3) are erroneous
with random errors. This corresponds to
using a stochastic gradient $\tilde{\nabla} f_i(x)$ instead of
$\nabla f_i(x)$, resulting in the following stochastic
gradient-projection step:

$$x_i(k+1) = \Pi_X\big[v_i(k) - \alpha(k) \tilde{\nabla} f_i(v_i(k))\big]$$

instead of (3). The convergence of these
methods is established for the cases when the
stochastic gradients are consistent estimates
of the actual gradient, i.e.,

$$\mathbb{E}\big[\tilde{\nabla} f_i(v_i(k)) \mid v_i(k)\big] = \nabla f_i(v_i(k)).$$

The convergence of these methods typically
requires the use of a non-summable but square-summable
stepsize sequence $\{\alpha(k)\}$ (i.e.,
$\sum_k \alpha(k) = \infty$ and $\sum_k \alpha^2(k) < \infty$); see, e.g.,
Ram et al. (2010a).

(c) Noisy or unreliable communication links:
The communication medium is not always
perfect and, often, the communication links
are characterized by some random noise
process. In this case, while agent $j \in N_i$
sends its estimate $x_j(k)$ to agent $i$, the
agent does not receive the intended message.
Rather, it receives $x_j(k)$ with some random
link-dependent noise $\xi_{ij}(k)$, i.e., it receives
$x_j(k) + \xi_{ij}(k)$ instead of $x_j(k)$ (see Kar and
Moura 2011; Patterson et al. 2009; Touri and
Nedic 2009 for the influence of noise and
link failure on consensus). In such cases, the
distributed optimization algorithm needs to
be modified to include a stepsize for noise
attenuation and the standard stepsize for
the gradient scaling. These stepsizes are
coupled through appropriate relative-growth
conditions which ensure that the
gradient information is maintained at the
right level and, at the same time, link noise
is attenuated appropriately (Srivastava and
Nedic 2011; Srivastava et al. 2010). Other
imperfections of the communication links
can also be modeled and incorporated into
the optimization method, such as link failures
and quantization effects, which can be built
using the existing results for the consensus
protocol (e.g., Carli et al. 2007; Kar and
Moura 2010, 2011; Nedic et al. 2008).

(d) Asynchronous implementations: The method
in (2) and (3) has simultaneous updates,
evident in all agents exchanging and updating
information in synchronous time steps
indexed by $k$. In some communication
settings, the synchronization of agents is
impractical, and the agents use their
own clocks, which are not synchronized
but do tick according to a common time
interval. Such communications result in
random weights $w_{ij}(k)$ and random neighbor
sets $N_i(k) \subset N$ in (2), which are typically
independent and identically distributed (over
time). The most common models are random
gossip and random broadcast. In the gossip
model, at any time, two randomly selected
agents $i$ and $j$ communicate and update,
while the other agents sleep (Boyd et al.
2005; Kashyap et al. 2007; Ram et al.
2010b; Srivastava 2011; Srivastava and Nedic
2011). In the broadcast model, a random
agent $i$ wakes up and broadcasts its estimate
$x_i(k)$. Its neighbors that receive the estimate
update their iterates, while the other agents
(including the agent that broadcast) do not
update (Aysal et al. 2008; Nedic 2011).

(e) Distributed constraints: One of the more
challenging aspects is the extension of the
algorithm to the case when the constraint
set $X$ in (1) is given as an intersection of
closed convex sets $X_i$, one set per agent.
Specifically, the set $X$ in (1) is defined by

$$X = \bigcap_{i=1}^{n} X_i,$$

where the set $X_i$ is known to agent $i$ only. In
this case, the algorithm has a slight modification
at the update of $x_i(k+1)$ in (3), where
the projection is on the local set $X_i$ instead of
$X$, i.e., the update in (3) is replaced with the
following update:

$$x_i(k+1) = \Pi_{X_i}\big[v_i(k) - \alpha(k) \nabla f_i(v_i(k))\big]. \qquad (5)$$

The resulting method (2), (5) converges under
some additional assumptions on the sets $X_i$,
such as the nonempty interior assumption
(i.e., the set $\bigcap_{i=1}^{n} X_i$ has a nonempty interior),
a linear-intersection assumption (each
$X_i$ is an intersection of finitely many linear
equality and/or inequality constraints), or the
Slater condition (Lee and Nedic 2012; Nedic
et al. 2010; Srivastava 2011; Srivastava and
Nedic 2011; Zhu and Martinez 2012).
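The gossip model mentioned under asynchronous implementations can be sketched for the pure averaging case (all objectives zero). This is an illustrative assumption-laden example: agents hold scalars, any two agents may communicate (a complete graph), the pair is drawn uniformly at random, and the function name `gossip_average` is hypothetical.

```python
import random


def gossip_average(x0, rounds, seed=0):
    """Randomized pairwise gossip: two agents wake up and average their values."""
    rng = random.Random(seed)       # fixed seed so runs are repeatable
    x = list(x0)
    n = len(x)
    for _ in range(rounds):
        i, j = rng.sample(range(n), 2)   # two distinct agents; the rest sleep
        x[i] = x[j] = 0.5 * (x[i] + x[j])
    return x


x = gossip_average([1.0, 5.0, -2.0, 8.0], rounds=2000)
print(x, sum(x))  # estimates cluster around 3.0; the sum stays at 12.0
```

Each pairwise exchange corresponds to a random doubly stochastic weight matrix, which is why the sum of the estimates is preserved from round to round.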

In principle, most of the simple first-order 
methods that solve a centralized problem of 
the form (1) can also be distributed among the 
agents (through the use of consensus protocols) 
to solve distributed problem (1). For example, 
the Nesterov dual-averaging subgradient method 
(Nesterov 2005) can be distributed as proposed 
in Duchi et al. (2012), a distributed Newton- 
Raphson method has been proposed and studied 
in Zanella et al. (2011), while a distributed 
simplex algorithm has been constructed and 
analyzed in Burger et al. (2012). An interesting 
method based on finding a zero of the gradient
$\nabla f = \sum_{i=1}^{n} \nabla f_i$, distributedly, has been
proposed and analyzed in Lu and Tang (2012).
Some other distributed algorithms and their
implementations can be found in Johansson et al.
(2007), Tsianos et al. (2012a, b), Dominguez-Garcia
and Hadjicostis (2011), Tsianos (2013),
Gharesifard and Cortes (2012a), Jakovetic et al.
(2011a, b), and Zargham et al. (2012).

Summary and Future Directions 

The distributed optimization algorithms have
been developed mainly using consensus protocols
that are based on weighted averaging, also known
as linear-iterative methods. The convergence
behavior and convergence rate analysis of these
methods combine tools from optimization
theory, graph theory, and matrix analysis. The
main drawback of these algorithms is that they
require (at least theoretically) the use of a doubly
stochastic weight matrix $W$ (or $W(k)$ in the
time-varying case) in order to solve problem (1). This
requirement can be accommodated by allowing
agents to exchange locally some additional
information on the weights that they intend to
use or their degree knowledge. However, in
general, constructing such doubly stochastic
weights distributedly on directed graphs is a
rather complex problem (Gharesifard and Cortes
2012b).
An alternative, which seems a promising direction
for future research, is the use of the so-called
push-sum protocol (or sum-ratio algorithm) for
the consensus problem (Benezit et al. 2010; Kempe
et al. 2003). This direction was pioneered in Tsianos
et al. (2012b), Tsianos (2013), and Tsianos and
Rabbat (2011) for static graphs and recently extended
to directed graphs in Nedic and Olshevsky
(2013) for an unconstrained version of problem (1).
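The idea behind push-sum averaging can be sketched as follows. This is a generic illustration of the sum-ratio mechanism on a fixed directed graph, not the optimization algorithms of the cited works; the graph, the equal splitting rule, and the name `push_sum` are assumptions. Each agent splits its running sum and running weight equally among its out-neighbors (including itself), which requires only knowledge of its own out-degree, and estimates the average by the ratio of the two quantities.

```python
def push_sum(out_neighbors, x0, iters=500):
    """Push-sum averaging on a directed graph given as {node: [out-neighbors]}."""
    n = len(x0)
    s = list(x0)        # running sums, initialized with the agents' values
    w = [1.0] * n       # running weights, initialized to 1
    for _ in range(iters):
        s_new = [0.0] * n
        w_new = [0.0] * n
        for i in range(n):
            share = 1.0 / len(out_neighbors[i])   # column-stochastic split
            for j in out_neighbors[i]:
                s_new[j] += share * s[i]
                w_new[j] += share * w[i]
        s, w = s_new, w_new
    return [s[i] / w[i] for i in range(n)]


# Hypothetical strongly connected digraph with self-loops; the induced
# weight matrix is column stochastic but not row stochastic.
out_neighbors = {0: [0, 1], 1: [1, 2], 2: [2, 3, 0], 3: [3, 0]}
estimates = push_sum(out_neighbors, [1.0, 5.0, -2.0, 8.0])
print(estimates)  # each ratio is close to the true average 3.0
```

The ratio corrects for the unequal long-run influence that a merely column-stochastic matrix would otherwise introduce, so no doubly stochastic weights need to be constructed.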

Another promising direction lies in the use
of the alternating direction method of multipliers
(ADMM) in combination with the graph-Laplacian
formulation of consensus constraints
$|N_i| x_i = \sum_{j \in N_i} x_j$. A nice exposition of the ADMM
method is given in Boyd et al. (2010). The first
work to address the development of distributed
ADMM over a network is Wei and Ozdaglar
(2012), where a static network is considered.
Its distributed implementation over time-varying
graphs will be an important and challenging task.

Cross-References 
► Graphs for Modeling Networked Interactions
Abstract

Dynamic networks have recently emerged as 
an efficient way to model various forms of 
interaction within teams of mobile agents, such as
sensing and communication. This article focuses
on the use of graphs as models of wireless 
communications. In this context, graphs have 
been used widely in the study of robotic and 
sensor networks and have provided an invaluable 
modeling framework to address a number of 
coordinated tasks ranging from exploration, 
surveillance, and reconnaissance to cooperative 
construction and manipulation. In fact, the 
success of these stories has almost always 
relied on efficient information exchange and 
coordination between the members of the team, 
as seen, e.g., in the case of distributed state 
agreement where multi-hop communication 
has been proven necessary for convergence and 
performance guarantees. 

Keywords 

Algebraic graph theory; Convex optimization;
Distributed and hybrid control; Graph connectivity

Introduction 

Communication in networked dynamical systems
has typically relied on constructs from graph
theory, with disc-based and weighted-proximity
graphs gaining the most popularity; see Fig. 1a, b.
Besides their simplicity, these models owe their
popularity to their resemblance to radio signal
strength models, where the signals attenuate with
the distance (Neskovic et al. 2000; Pahlavan and
Levesque 1995; Parsons 2000). In this context,
multi-hop communication becomes equivalent to
network connectivity, defined as the property of
a graph to transmit information between any pair
of its nodes; see Fig. 1c.

Dynamic Graphs, Connectivity of. Fig. 1 (a) Disc-based model of communication; (b) Weighted, proximity-based
model of communication; (c) Connected network of mobile robots

Specifically, let $\mathcal{G}(t) = \{\mathcal{V}, \mathcal{E}(t), \mathcal{W}(t)\}$ denote
a graph on $n$ nodes that can be robots or
mobile sensors, so that $\mathcal{V} = \{1, \ldots, n\}$ is the set
of vertices, $\mathcal{E}(t) \subseteq \mathcal{V} \times \mathcal{V}$ is the set of edges at
time $t$, and $\mathcal{W}(t) = \{w_{ij}(t) \mid (i,j) \in \mathcal{V} \times \mathcal{V}\}$ is a
set of weights so that $w_{ij}(t) = 0$ if $(i,j) \notin \mathcal{E}(t)$
and $w_{ij}(t) > 0$ otherwise. If $w_{ij}(t) = w_{ji}(t)$ for
all pairs of nodes $i, j$, then the graph is called
undirected; otherwise it is called directed. The
weights in $\mathcal{W}(t)$ typically model signal strength
or channel reliability, as per the disc-based and
weighted-proximity models in Fig. 1a, b. In these
models communication between nodes is related
to their pairwise distance, giving rise to the dynamic
or time-varying nature of the graph $\mathcal{G}(t)$
due to node mobility. Given an undirected dynamic
graph $\mathcal{G}(t)$, we say that this graph is
connected at time $t$ if there exists a path, i.e., a sequence
of distinct vertices such that consecutive
vertices are adjacent, between any two vertices
in $\mathcal{G}(t)$. In the case of directed graphs, two
notions of connectivity are defined. A directed
graph $\mathcal{G}(t)$ is called strongly connected if there
exists a directed path between any two of its
vertices or equivalently, if every vertex is reachable
from any other vertex. On the other hand,
a directed graph is called weakly connected if
replacing all directed edges by undirected edges
produces a connected undirected graph. Finally,
a collection of graphs $\{\mathcal{G}(t) \mid t = t_0, \ldots, t_k\}$
is called jointly connected over time if the union
graph $\bigcup_{t=t_0}^{t_k} \mathcal{G}(t) = \{\mathcal{V}, \bigcup_{t=t_0}^{t_k} \mathcal{E}(t)\}$ is connected.
Clearly, checking for the existence of paths between
all pairs of nodes in a graph is difficult,
especially so as the number of nodes in the graph
increases. For this reason, equivalent, algebraic
representations of graphs are employed that allow
for efficient algebraic ways to check for connectivity,
as we discuss in the following section.
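The definitions of connectivity and joint connectivity above can be made concrete with a small sketch for undirected, unweighted graphs. It uses a breadth-first search rather than the algebraic tests discussed next; nodes are labeled 0, ..., n-1 with n at least 1, and the function names are hypothetical.

```python
from collections import deque


def is_connected(n, edges):
    """Check whether an undirected graph on nodes 0..n-1 is connected (BFS)."""
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == n


def is_jointly_connected(n, edge_sets):
    """Check connectivity of the union graph over a collection of edge sets."""
    union = set()
    for edges in edge_sets:
        union |= set(edges)
    return is_connected(n, union)


# Two snapshots of a 4-node network; neither is connected on its own.
g_t0 = [(0, 1)]
g_t1 = [(1, 2), (2, 3)]
print(is_connected(4, g_t0), is_connected(4, g_t1))   # False False
print(is_jointly_connected(4, [g_t0, g_t1]))          # True
```

The example shows why joint connectivity is a weaker requirement: information can still traverse the whole network over the time window even though no single snapshot is connected.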

While connectivity is necessary for information
propagation in a network, it is also relevant
to the performance of many networked dynamical
processes, such as synchronization and gossiping,
via its relation to the network eigenvalue spectra
(Preciado 2008). For example, the spectrum of
the Laplacian matrix of a network plays a key role
in the analysis of synchronization in networks of
nonlinear oscillators (Pecora and Carroll 1998;
Preciado and Verghese 2005), distributed algorithms
(Lynch 1997), and decentralized control
problems (Fax and Murray 2004; Olfati-Saber
and Murray 2004). Similarly, the spectrum of
the adjacency matrix determines the speed of
viral information spreading in a network (Van
Mieghem et al. 2009). Additionally, more robust
versions of connectivity, such as k-node or k-edge
connectivity, can be used to introduce robustness
of a network to node or link failures, respectively
(Zavlanos and Pappas 2005, 2008).


Graph-Theoretic Connectivity Control 

Connectivity Using the Graph Laplacian 
Matrix 

A metric that is typically employed to capture
connectivity of dynamic networks is the second
smallest eigenvalue $\lambda_2(L)$ of the Laplacian
matrix $L \in \mathbb{R}^{n \times n}$ of the graph, also known as
the algebraic connectivity or Fiedler value of the
graph. For a weighted graph $\mathcal{G} = \{\mathcal{V}, \mathcal{E}, \mathcal{W}\}$,
the entries of the Laplacian matrix are typically
related to the weights in $\mathcal{W}$ so that the $i, j$
entry of $L$ is given by $[L]_{ij} = \sum_{k=1}^{n} w_{ik}$ if
$i = j$ and $[L]_{ij} = -w_{ij}$ if $i \neq j$. The
Laplacian matrix of an undirected graph is always
a symmetric, positive semidefinite matrix whose
smallest eigenvalue $\lambda_1(L)$ is identically zero
with corresponding eigenvector the vector of all
entries equal to one. Additionally, the algebraic
connectivity $\lambda_2(L)$ is a concave function of the
Laplacian matrix that is positive if and only if
the graph is connected (Fiedler 1973; Godsil and
Royle 2001; Merris 1994; Mohar 1991).
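As a centralized illustration of these properties, the sketch below builds the Laplacian of an undirected weighted graph and estimates its second smallest eigenvalue by power iteration on the shifted matrix $cI - L$, restricted to the subspace orthogonal to the all-ones vector. The shift $c$, the starting vector, the iteration count, and the function names are assumptions for this example (at least two nodes are assumed); it is not one of the decentralized estimators cited in this entry.

```python
import math


def laplacian(n, weights):
    """Build L from symmetric weights {(i, j): w_ij}, one entry per edge, i != j."""
    L = [[0.0] * n for _ in range(n)]
    for (i, j), w in weights.items():
        L[i][j] -= w
        L[j][i] -= w
        L[i][i] += w
        L[j][j] += w
    return L


def algebraic_connectivity(L, iters=2000):
    """Estimate lambda_2(L) by power iteration on cI - L, orthogonal to ones."""
    n = len(L)
    c = 2.0 * max(L[i][i] for i in range(n)) + 1.0  # c exceeds lambda_max(L)
    v = [float(i) for i in range(n)]                 # assumed starting vector

    def deflate_and_normalize(u):
        mean = sum(u) / n
        u = [ui - mean for ui in u]                  # remove all-ones component
        norm = math.sqrt(sum(ui * ui for ui in u))
        return [ui / norm for ui in u]

    for _ in range(iters):
        v = deflate_and_normalize(v)
        v = [c * v[i] - sum(L[i][j] * v[j] for j in range(n)) for i in range(n)]
    v = deflate_and_normalize(v)
    # Rayleigh quotient v^T L v with ||v|| = 1
    return sum(v[i] * L[i][j] * v[j] for i in range(n) for j in range(n))


path = laplacian(4, {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0})
split = laplacian(4, {(0, 1): 1.0, (2, 3): 1.0})
print(algebraic_connectivity(path))   # positive: the path graph is connected
print(algebraic_connectivity(split))  # numerically zero: two components
```

For the unit-weight path on four nodes the exact value is $2 - \sqrt{2}$, which gives a convenient check of the estimate.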

As the algebraic connectivity $\lambda_2(L)$ plays
a critical role in determining whether a graph is
connected or not, a number of methods have been
proposed for its decentralized estimation and
control. These range from methods that employ
market-based control to underestimate the
algebraic connectivity and accordingly control
the network structure (Zavlanos and Pappas
2008) to methods that enforce the states of the
nodes to oscillate at frequencies that correspond
to the Laplacian eigenvalues and then use the fast
Fourier transform to estimate these eigenvalues
(Franceschelli et al. 2013), to methods that
iteratively update the interval where the algebraic
connectivity is supposed to lie (Montijano et al.
2011), and to methods that rely on the power
iteration method and its variants (DeGennaro
and Jadbabaie 2006; Kempe and McSherry 2008;
Knorn et al. 2009; Oreshkin et al. 2010; Sabattini
et al. 2011; Yang et al. 2010). All the above
techniques are often integrated with appropriate
controllers to regulate mobility of the nodes while
ensuring connectivity of the network. Another
way that $\lambda_2(L)$ can be used to ensure connectivity
of dynamic graphs is via optimization-based
methods that maximize it away from its zero
value. Such approaches were initially centralized,
as connectivity is a global property of a graph
(Kim and Mesbahi 2006), although recently
distributed subgradient algorithms (DeGennaro
and Jadbabaie 2006) as well as non-iterative
decomposition techniques (Simonetto et al.
2013) have also been proposed. As the algebraic
connectivity is a non-differentiable function
of the Laplacian matrix, designing continuous
feedback controllers to keep it positive
is a challenging task. This problem
was overcome in Zavlanos and Pappas (2007)
via the use of gradient flows that maintain
positivity of the determinant of the
Laplacian matrix projected onto the space that is
perpendicular to the eigenvector of ones.


Connectivity Using the Graph Adjacency 
Matrix 

Alternatively, connectivity can be captured by the
sum of powers $\sum_{k=0}^{K} A^k$ of the adjacency matrix
$A \in \mathbb{R}^{n \times n}$ of the network for $K \leq n-1$. The entries
of the adjacency matrix are typically related
to the weights in $\mathcal{W}$ as $[A]_{ij} = w_{ij}$. For disc-based
graphs as in Fig. 1a, the $i, j$ entry of the
$k$th power of the adjacency matrix $[A^k]_{ij}$ captures
the number of paths of length $k$ between nodes
$i$ and $j$; for weighted graphs, $[A^k]_{ij}$ captures
a weighted sum of those paths. Therefore, the
entries of $\sum_{k=0}^{K} A^k$ represent the number of paths
up to length $K$ between every pair of nodes in
the graph (Godsil and Royle 2001). By definition
of graph connectivity, if all entries of $\sum_{k=0}^{K} A^k$
are positive for $K = n-1$, then the network
is connected. Clearly, for $K < n-1$, not all
entries of $\sum_{k=0}^{K} A^k$ are necessarily positive, even
if the graph is connected. Maintaining positivity
of the positive entries of $\sum_{k=0}^{K} A^k$
of an initially connected graph maintains paths
of length $K$ between the corresponding nodes
and, as shown in Zavlanos and Pappas (2005),
is sufficient to maintain connectivity of the graph
throughout.
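The adjacency-based test can be sketched directly from the sum of powers. The example below uses a 0/1 adjacency matrix (a disc-based graph) and plain nested lists; the function names are illustrative and the computation is centralized.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power_sum(A, K):
    """Return S = A^0 + A^1 + ... + A^K."""
    n = len(A)
    P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # A^0 = I
    S = [row[:] for row in P]
    for _ in range(K):
        P = mat_mul(P, A)
        S = [[S[i][j] + P[i][j] for j in range(n)] for i in range(n)]
    return S


def connected_by_powers(A):
    """Connectivity test: all entries of the power sum positive for K = n - 1."""
    n = len(A)
    S = power_sum(A, n - 1)
    return all(S[i][j] > 0 for i in range(n) for j in range(n))


# Path graph 0 - 1 - 2 - 3.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(power_sum(A, 2)[0][3])   # 0: no path of length <= 2 from node 0 to node 3
print(power_sum(A, 3)[0][3])   # 1: exactly one path of length 3
print(connected_by_powers(A))  # True
```

The first two outputs illustrate the remark that for K smaller than n - 1 some entries may be zero even though the graph is connected.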

The ability to capture graph connectivity
using the adjacency matrix has given rise
to optimization-based connectivity controllers
(Srivastava and Spong 2008; Zavlanos and
Pappas 2005) that are often centralized because of the
multi-hop dependencies between nodes introduced by the
powers of the adjacency matrix. Since smaller
powers correspond to shorter dependencies
(paths), decentralization is possible as $K$
decreases. If $K = 1$, connectivity maintenance
reduces to preserving the pairwise links between
the nodes in an initially connected network. Since
the adjacency matrix of weighted graphs is often
a differentiable function, this approach can result
in continuous feedback solution techniques.
Discrete-time approaches are discussed in Ando
et al. (1999), Notarstefano et al. (2006), and
Bullo et al. (2009), while Spanos and Murray
(2004), Dimarogonas and Kyriakopoulos (2008),
Cornejo and Lynch (2008), Yao and Gupta
(2009), Zavlanos et al. (2007), and Ji and
Egerstedt (2007) rely on local gradients that
may also incorporate switching in the case
of link additions. Switching between arbitrary
spanning topologies has also been studied in the
literature, with the spanning subgraphs being
updated by local auctions (Zavlanos and Pappas
2008), distributed spanning tree algorithms
(Wagenpfeil et al. 2009), combination of
information dissemination algorithms and graph
picking games (Schuresko and Cortes 2009b), or
intermediate rendezvous (Schuresko and Cortes
2009a; Spanos and Murray 2005). This class
of approaches is typically hybrid, combining
continuous link maintenance and discrete
topology control. The algebraic connectivity
$\lambda_2(L)$ and number of paths $\sum_{k=0}^{K} A^k$ metrics
can also be combined to give controllers that
maintain connectivity, while enforcing desired
multi-hop neighborhoods for all agents (Stump
et al. 2008).

A recent, comprehensive survey on graph-theoretic
approaches for connectivity control of
dynamic graphs can be found in Zavlanos et al.
(2011).

Applications in Mobile Robot 
Network Control 

Methods to control connectivity of dynamic
graphs have been successfully applied to multiple
scenarios that require network connectivity to
achieve a global coordinated objective. Indicative
of the impact of this work is recent literature
on connectivity preserving rendezvous (Ando
et al. 1999; Cortes et al. 2006; Dimarogonas and
Kyriakopoulos 2008; Ganguli et al. 2009; Ji and
Egerstedt 2007), flocking (Zavlanos et al. 2007,
2009), and formation control (Ji and Egerstedt
2007; Schuresko and Cortes 2009a), where
so far connectivity had been an assumption.
Further extensions and contributions involve
connectivity control for double integrator agents
(Notarstefano et al. 2006), agents with bounded
inputs (Ajorlou and Aghdam 2010; Ajorlou et al.
2010; Dimarogonas and Johansson 2008), and
indoor navigation (Stump et al. 2008), as well
as for communication based on radio signal
strength (Hsieh et al. 2008; Mostofi 2009;
Powers and Balch 2004; Wagner and Arkin
2004) and visibility constraints (Anderson et al.
2003; Ando et al. 1999; Arkin and Diaz 2002;
Flocchini et al. 2005; Ganguli et al. 2009).
Periodic connectivity for robot teams that need to
occasionally split in order to achieve individual
objectives (Hollinger and Singh 2010; Zavlanos
2010) and sufficient conditions for connectivity
in leader-follower networks (Gustavi et al. 2010)
also add to the list. Early experimental results
have demonstrated the efficiency of these algorithms
in practice as well (Hollinger and Singh 2010;
Michael et al. 2009; Tardioli et al. 2010).

Summary and Future Directions 

Although graphs provide a simple abstraction of
inter-robot communications, it has long been recognized
that since links in a wireless network do
not entail tangible connections, associating links
with arcs on a graph can be somewhat arbitrary.
Indeed, topological definitions of connectivity
start by setting target signal strengths to draw
the corresponding graph. Even small differences
in target strengths might result in dramatic differences
in network topology (Lundgren et al.
2002). As a result, graph connectivity is necessary
but not nearly sufficient to guarantee communication
integrity, interpreted as the ability
of a network to support desired communication
rates.

To address these challenges, a new body of
work has recently been appearing that departs from
traditional graph-based models of communication.
Specifically, Zavlanos et al. (2013) employs a
simple, yet effective, modification that relies on
weighted graph models with weights that capture
the packet error probability of each link (DeCouto
et al. 2006). When using reliabilities as
link metrics, it is possible to model routing and
scheduling problems as optimization problems
that accept link reliabilities as inputs (Ribeiro
et al. 2007, 2008). The key idea proposed in
Zavlanos et al. (2013) is to define connectivity
in terms of communication rates and to use optimization
formulations to describe optimal operating
points of wireless networks. Then, the
communication variables are updated in discrete
time via a distributed gradient descent algorithm
on the dual function, while robot motion is regulated
in continuous time by means of appropriate
distributed barrier potentials that maintain
desired communication rates. Related approaches
consider optimal communications based on T-slot
time averages of the primal variables for general
mobility schemes (Neely 2010), as well as optimization
of mobility and communications based
on the end-to-end bit error rate between nodes
(Ghaffarkhah and Mostofi 2011; Yan and Mostofi
2012).


Cross-References 

► Flocking in Networked Systems 

► Graphs for Modeling Networked Interactions 
Abstract

In this entry, we present models of dynamic
noncooperative games, solution concepts, and algorithms
for finding game solutions. For the sake
of exposition, we focus mostly on finite games,
where the number of actions available to each
player is finite, and discuss briefly extensions to
infinite games.
Introduction

Dynamic noncooperative games allow multiple 
actions by individual players, and include explicit 
representations of the information available to 
each player for selecting its decision. Such games 
have a complex temporal order of play and an 
information structure that reflects uncertainty as 
to what individual players know when they have 
to make decisions. This temporal order and information
structure are not evident when the game
is represented as a static game between players
that select strategies. Dynamic games often
incorporate explicit uncertainty in outcomes, by 
representing such outcomes as actions taken by a 
random player (called chance or “Nature”) with 
known probability distributions for selecting its 
actions. 

We focus our exposition on models of finite 
games and discuss briefly extensions to infinite 
games at the end of the entry. 

Finite Games in Extensive Form 

The extensive form of a game was introduced
by von Neumann (1928) and later refined by
Kuhn (1953) to represent explicitly the order of
play, the information and actions available to
each player for making decisions at each of their
turns, and the payoffs that players receive after
a complete set of actions. Let $I = \{0, 1, \ldots, n\}$
denote the set of players in a game, where player
0 corresponds to Nature. The extensive form is
represented in terms of a game tree, consisting
of a rooted tree with nodes $\mathcal{N}$ and edges $\mathcal{E}$. The
root node represents the initial state of the game.
Nodes $x$ in this tree correspond to positions or
“states” of the game. Any non-root node with
more than one incident edge is an internal node
of the tree and is a decision node associated
with a player in the game; the root node is also
a decision node. Each decision node $x$ has a
player $p(x) \in I$ assigned to select a decision
from a finite set of admissible decisions $A(x)$.
Using distance from the root node to indicate
direction of play, each edge that follows node $x$
corresponds to an action in $A(x)$ taken by player
$p(x)$, which evolves the state of the game into a
subsequent node $x'$.

Non-root nodes with only one incident edge 
are terminal nodes, which indicate the end of 
the game; such nodes represent outcomes of the 
game. The unique path from the root node to a 
terminal node is called a play of the game. For
each outcome node x, there are payoff functions
$J_i(x)$ associated with each player $i \in \{1, \ldots, n\}$.

The above form represents the different players,
the order in which players select actions and
their possible actions, and the resulting payoffs 
to each player from a complete set of plays of 
the game. The last component of interest is to 
represent the information available for players to 
select decisions at each decision node. Due to the 
tree structure of the extensive form of a game, 
each decision node x contains exact information 
on all of the previous actions taken that led to 
state x. In games of perfect information, each
player knows exactly the decision node x at 
which he/she is selecting an action. To represent 
imperfect information, the extensive form uses 
the notion of an information set , which represents 
a group of decision nodes, associated with the 
same player, where the information available to 
that player is that the game is in one of the states 
in that information set. Formally, let $\mathcal{N}_i \subset \mathcal{N}$ be
the set of decision nodes associated with player
i, for $i \in I$. Let $H_i$ denote a partition of $\mathcal{N}_i$ so
that, for each set $h_i^k \in H_i$, we have the following
properties:

• If $x, x' \in h_i^k$, then they have the same admissible
actions: $A(x) = A(x')$.

• If $x, x' \in h_i^k$, then they cannot both belong
to a play of the game; that is, x and x'
cannot both be on a path from the root node to an
outcome.

The elements $h_i^k$ of $H_i$ are the information
sets of player i. Each decision node x belongs to one and
only one information set associated with player
p(x). The constraints above ensure that, for each
information set, there is a unique player identified
to select actions, and the set of admissible actions
is unambiguously defined. Denote by $A(h_i^k)$ the
set of admissible actions at information set $h_i^k$.
The last condition is a causality condition that 
ensures that a player who has selected a previous 
decision remembers that he/she has already made 
that previous decision. 

Figure 1 illustrates an extensive form for a
two-person game. Player 1 has two actions in any
play of the game, whereas player 2 has only one
action. Player 1 starts the game at the root node
a; the information set of player 2 shows that player 2 is
unaware of this action, as both nodes that descend
from node a are in this information set. After
player 2’s action, player 1 gets to select a second
action. However, the information sets of player 1 at
this stage indicate that player 1 recalls what earlier action
he/she selected, but he/she has not observed the
action of player 2. The terminal nodes indicate
the payoffs to player 1 and player 2 as an ordered
pair.


Strategies and Equilibrium Solutions 

A pure strategy $\gamma_i$ for player $i \in \{1, \ldots, n\}$ is a
function that maps each information set of player
i into an admissible decision. That is,

$$\gamma_i : H_i \to A$$

such that $\gamma_i(h_i^k) \in A(h_i^k)$. The set of pure
strategies for player i is denoted as $\Gamma_i$.

Note that pure strategies do not include actions 
selected by Nature. Nature’s actions are selected 
using probability distributions over the choices 
available for the information sets corresponding 
to Nature, where the choice at each information 
set is made independently of other choices. For 
finite games, the number of pure strategies for 
each player is also finite. Given a tuple of pure
strategies $\gamma = (\gamma_1, \ldots, \gamma_n)$, we can define the
probability of the play of the game resulting in
outcome node x as $\pi(x)$, computed as follows:
Each outcome node has a unique path (the play)
$p_{rx}$ from root node r. We initialize $\pi(r) = 1$
and node $n = r$. If the player at node n is
Nature and the next node in $p_{rx}$ is $n'$, then
$\pi(n') = \pi(n)\, p(n, n')$, where $p(n, n')$ is the
probability that Nature chooses the action that
leads to $n'$. Otherwise, let $i = p(n)$ be the player
at node n and let $h_i(n)$ denote the information
set containing node n. Then, if $\gamma_i(h_i(n)) = a(n, n')$,
let $\pi(n') = \pi(n)$, where $a(n, n')$ is
the action in $A(n)$ that leads to $n'$; otherwise, set
$\pi(n') = 0$. The above process is repeated, letting
$n = n'$, until n equals the terminal node x. Using
these probabilities, the resulting expected payoff
to player i is

$$J_i(\gamma) = \sum_{x \text{ terminal},\; x \in \mathcal{N}} \pi(x)\, J_i(x)$$
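
The recursion for the outcome probabilities and the expected payoff can be sketched in a few lines of Python. The tree below is a hypothetical example (it is not Fig. 1): Nature moves first with equal probabilities, and player 1 then chooses L or R without observing Nature's move, so nodes "a" and "b" share one information set. The dictionary layout and the function names are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical tree: Nature (player 0) moves first, then player 1 chooses
# without observing Nature's move (nodes "a" and "b" share information set "h1").
TREE = {
    "r": {"player": 0, "children": {"heads": "a", "tails": "b"},
          "probs": {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}},
    "a": {"player": 1, "info": "h1", "children": {"L": "t1", "R": "t2"}},
    "b": {"player": 1, "info": "h1", "children": {"L": "t3", "R": "t4"}},
    "t1": {"payoff": (2, -2)},
    "t2": {"payoff": (0, 0)},
    "t3": {"payoff": (-1, 1)},
    "t4": {"payoff": (3, -3)},
}


def find_path(tree, node, target):
    """Return the unique path from node to target as a list, or None."""
    if node == target:
        return [node]
    for child in tree[node].get("children", {}).values():
        tail = find_path(tree, child, target)
        if tail is not None:
            return [node] + tail
    return None


def outcome_probability(tree, root, strategies, x):
    """pi(x): probability that the play ends at terminal node x."""
    path = find_path(tree, root, x)
    pi = Fraction(1)
    for n, n_next in zip(path, path[1:]):
        node = tree[n]
        action = next(a for a, c in node["children"].items() if c == n_next)
        if node["player"] == 0:
            pi *= node["probs"][action]      # Nature's move
        elif strategies[node["player"]][node["info"]] != action:
            return Fraction(0)               # the pure strategy never takes this edge
    return pi


def expected_payoffs(tree, root, strategies):
    """J_i(gamma) = sum over terminal x of pi(x) * J_i(x), for every player i."""
    terminals = [name for name, node in tree.items() if "payoff" in node]
    totals = [Fraction(0)] * len(tree[terminals[0]]["payoff"])
    for x in terminals:
        pi = outcome_probability(tree, root, strategies, x)
        totals = [t + pi * j for t, j in zip(totals, tree[x]["payoff"])]
    return tuple(totals)
```

A pure strategy is passed as a map from information sets to actions, for example `expected_payoffs(TREE, "r", {1: {"h1": "L"}})`. Exact fractions are used so that the probabilities along a play multiply without rounding.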

This representation of payoffs in terms of 
strategies transforms the game from an extensive 
form representation to a strategic or normal form 
representation, where the concept of dynamics 
and information has been abstracted away. The 
resulting strategic form looks like a static game 
as discussed in the encyclopedia entry ► Strategic 
Form Games and Nash Equilibrium, where each 
player selects his/her strategy from a finite set, 
resulting in a vector of payoffs for the players. 
Using these payoffs, one can now define solution
concepts for the game. Let the notation
$\gamma_{-i} = (\gamma_1, \ldots, \gamma_{i-1}, \gamma_{i+1}, \ldots, \gamma_n)$ denote the set of
strategies in a tuple excluding the i-th strategy.

A Nash equilibrium solution is a tuple of feasible
strategies $\gamma^* = (\gamma_1^*, \ldots, \gamma_n^*)$ such that

$$J_i(\gamma^*) \ge J_i(\gamma^*_{-i}, \gamma_i) \quad \text{for all } \gamma_i \in \Gamma_i, \text{ for all } i \in \{1, \ldots, n\} \qquad (1)$$

Two-person games where
$J_1(\gamma) = -J_2(\gamma)$ are known as zero-sum games.

As discussed in the encyclopedia entry on 
static games (► Strategic Form Games and Nash 
Equilibrium), the existence of Nash equilibria 
or even saddle point strategies in terms of pure 
strategies is not guaranteed for finite games. 
Thus, one must consider the use of mixed 
strategies. A mixed strategy $\mu_i$ for player
$i \in \{1, \ldots, n\}$ is a probability distribution over
the set of pure strategies $\Gamma_i$. The definition of
payoffs can be extended to mixed strategies by
averaging the payoffs associated with the pure
strategies, as

$$J_i(\mu) = \sum_{\gamma_1 \in \Gamma_1} \cdots \sum_{\gamma_n \in \Gamma_n} J_i(\gamma_1, \ldots, \gamma_n)\, \mu_1(\gamma_1) \cdots \mu_n(\gamma_n)$$

Denote the set of probability distributions over
$\Gamma_i$ as $\Delta(\Gamma_i)$. An n-tuple $\mu^* = (\mu_1^*, \ldots, \mu_n^*)$ of
mixed strategies is said to be a Nash equilibrium
if

$$J_i(\mu^*) \ge J_i(\mu^*_{-i}, \mu_i) \quad \text{for all } \mu_i \in \Delta(\Gamma_i), \text{ for all } i \in \{1, \ldots, n\}$$

Theorem 1 (Nash 1950,1951) Every finite 
n-person game has at least one Nash equilibrium 
point in mixed strategies. 
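
As an illustration of why mixed strategies are needed, the following sketch uses matching pennies, a standard zero-sum example that is not taken from this entry. It enumerates pure-strategy profiles to check the Nash condition (1), evaluates the averaged payoff for mixed strategies, and tests a mixed profile against pure deviations. The function names and data layout are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

# Matching pennies: PAYOFF[(g1, g2)] = (J_1, J_2) for pure strategies in {"H", "T"}.
STRATEGIES = [("H", "T"), ("H", "T")]
PAYOFF = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
          ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}


def pure_nash_equilibria(strategies, payoff):
    """Return all pure profiles from which no player gains by deviating alone."""
    equilibria = []
    for profile in product(*strategies):
        stable = True
        for i, options in enumerate(strategies):
            for alt in options:
                deviated = profile[:i] + (alt,) + profile[i + 1:]
                if payoff[deviated][i] > payoff[profile][i]:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria


def mixed_payoff(strategies, payoff, mu, i):
    """J_i(mu): pure-strategy payoffs averaged under the product of the mu_j."""
    total = Fraction(0)
    for profile in product(*strategies):
        weight = Fraction(1)
        for j, g in enumerate(profile):
            weight *= mu[j][g]
        total += weight * payoff[profile][i]
    return total


def is_mixed_nash(strategies, payoff, mu):
    """Check the mixed Nash condition against pure deviations only.

    This suffices because J_i is linear in player i's own mixed strategy.
    """
    for i, options in enumerate(strategies):
        base = mixed_payoff(strategies, payoff, mu, i)
        for alt in options:
            deviation = list(mu)
            deviation[i] = {g: Fraction(int(g == alt)) for g in options}
            if mixed_payoff(strategies, payoff, deviation, i) > base:
                return False
    return True
```

For this game `pure_nash_equilibria(STRATEGIES, PAYOFF)` is empty, while the profile in which both players mix with probability 1/2 passes `is_mixed_nash`, consistent with Theorem 1.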

Mixed strategies suggest that each player’s 
randomization occurs before the game is played, 
by choosing a strategy at random from its choices 
of pure strategies. For games in extensive form, 
one can introduce a different class of strategies 
where a player makes a random choice of action 
at each information set, according to a probability
distribution that depends on the specific
information set. The choice of action is selected
independently at each information set according
to a selected probability distribution, in a manner
similar to how Nature’s actions are selected.
These random choice strategies are known as
behavior strategies. Let $\Delta(A(h_i^k))$ denote the
set of probability distributions over the decisions
for player i at information set $h_i^k$.

A behavior strategy for player i is an element
$x_i \in \prod_{h_i^k \in H_i} \Delta(A(h_i^k))$, where $x_i(h_i^k)$ denotes
the probability distribution of the behavior strategy
$x_i$ over the available decisions at information
set $h_i^k$.

Note that the space of admissible behavior
strategies is much smaller than the space of admissible
mixed strategies. To illustrate this, consider
a player with K information sets and two
possible decisions for each information set. The
number of possible pure strategies would be $2^K$,
and thus the space of mixed strategies would be
a probability simplex of dimension $2^K - 1$. In
contrast, behavior strategies would require specifying
probability distributions over two choices
for each of K information sets, so the space
of behavior strategies would be a product space
of K probability simplices of dimension 1, totaling
dimension K. One way of understanding
this difference is that mixed strategies introduce
correlated randomization across choices at different
information sets, whereas behavior strategies
introduce independent randomization across such
choices.
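
The dimension count above, and the way independent randomization induces a product distribution over pure strategies, can be sketched as follows. The function names and the dictionary representation of a behavior strategy are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def strategy_space_dimensions(num_info_sets, num_actions=2):
    """Return (number of pure strategies, mixed dimension, behavior dimension)."""
    num_pure = num_actions ** num_info_sets
    mixed_dim = num_pure - 1                           # one simplex over all pure strategies
    behavior_dim = num_info_sets * (num_actions - 1)   # one small simplex per information set
    return num_pure, mixed_dim, behavior_dim


def induced_mixed_strategy(behavior):
    """Map a behavior strategy to the mixed strategy it induces.

    behavior: dict info_set -> dict action -> probability.
    Returns: dict pure strategy -> probability, where a pure strategy is a
    tuple of (info_set, action) pairs. Choices are independent across
    information sets, so the probabilities multiply.
    """
    info_sets = sorted(behavior)
    mixed = {}
    for actions in product(*(sorted(behavior[h]) for h in info_sets)):
        prob = Fraction(1)
        for h, a in zip(info_sets, actions):
            prob *= behavior[h][a]
        mixed[tuple(zip(info_sets, actions))] = prob
    return mixed
```

With K = 10 binary information sets, `strategy_space_dimensions(10)` gives 1024 pure strategies, a mixed-strategy simplex of dimension 1023, and a behavior-strategy space of dimension 10.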

For every behavior strategy, one can find an 
equivalent mixed strategy by computing the probabilities
of every set of actions that result from
the behavior strategy. The converse is not true 
for general games in extensive form. However, 
there is a special class of games for which the 
converse is true. In this class of games, players 
recall what actions they have selected previously 
and what information they knew previously. A 
formal definition of perfect recall is beyond the 
scope of this exposition but can be found in 
Hart (1991) and Kuhn (1953). The implication of 
perfect recall is summarized below: 

Theorem 2 (Kuhn 1953) Given a finite n-person
game in which player i has perfect recall,
for each mixed strategy $\mu_i$ for player i, there
exists a corresponding behavior strategy $x_i$ that
is equivalent, where every player receives the
same payoffs under both strategies $\mu_i$ and $x_i$.

This equivalence was extended by Aumann 
to infinite games (Aumann 1964). In dynamic 
games, it is common to assume that each player 
has perfect recall and thus solutions can be found 
in the smaller space of behavior strategies. 

Computation of Equilibria 

Algorithms for the computation of mixed-strategy
Nash equilibria of static games can
be extended to compute mixed-strategy Nash 
equilibria for games in extensive form when 
the pure strategies are enumerated as above. 
However, the number of pure strategies grows 
exponentially with the size of the extensive form 
tree, making these methods hard to apply. For 
two-person games in extensive form with perfect 
recall by both players, one can search for Nash 
equilibria in the much smaller space of behavior 
strategies. This was exploited in Koller et al. 
(1996) to obtain efficient linear complementarity 
problems for nonzero-sum games and linear 
programs for zero-sum games where the number 
of variables involved is linear in the number of 
internal decision nodes of the extensive form 
of the game. A more detailed overview of 
computation algorithms for Nash equilibria can 
be found in McKelvey and McLennan (1996). 
The Gambit Web site (McKelvey et al. 2010) 
provides software implementations of several 
techniques for computation of Nash equilibria in 
two- and n-person games.

An alternative approach to computing Nash 
equilibria for games in extensive forms is based 
on subgame decomposition, discussed next. 

Subgames 

Consider a game G in extensive form. A node
c is a successor of a node n if there is a path
in the game tree from n to c. Let h be a node
in G that is not terminal and is the only node in
its information set. Assume that if a node c is a
successor of h, then every node in the information
set containing c is also a successor of h. In this
situation, one can define a subgame H of G
with root node h, which consists of node h and
its successors, connected by the same actions as
in the original game G, with the payoffs and
terminal vertices equal to those in G. This is
the subgame that would be encountered by the
players had the previous play of the game reached
the state at node h. Since the information set
containing node h contains no other node and all
the information sets in the subgame contain only
nodes in the subgame, every player in the
subgame knows that they are playing only in the
subgame once h has been reached.

Dynamic Noncooperative Games, Fig. 2 Simple game
with multiple Nash equilibria

Figure 2 illustrates a game where every nonterminal
node is contained in its own information
set. This game contains a subgame rooted at 
node c. Note that the full game has two Nash 
equilibria in pure strategies: strategies (L, L) and 
(R, R). However, strategy (L, L) is inconsistent 
with how player 2 would choose its decision if it 
were to find itself at node c. This inconsistency 
arises because the Nash equilibria are defined in 
terms of strategies announced before the game 
is played and may not be a reasonable way 
to select decisions if unanticipated plays occur. 
When player 1 chooses L, node c should not be 
reached in the play of the game, and thus player 
2 can choose L because it does not affect the 
expected payoff. One can define the concept of 
Nash equilibria that are sequentially consistent as 
follows. 

Let $x_i$ be behavior strategies for player i in
game G. Denote the restriction of these strategies
to the subgame H as $x_i^H$. This restriction
describes the probabilities for choices of actions
for the information sets in H. Suppose the game
G has a Nash equilibrium achieved by strategies
$(x_1, \ldots, x_n)$. This Nash equilibrium is called subgame
perfect (Selten 1975) if, for every subgame
H of G, the strategies $(x_1^H, \ldots, x_n^H)$ are a Nash
equilibrium for the subgame H. In the game in
Fig. 2, there is only one subgame perfect equilibrium,
which is (R, R). There are several other
refinements of the Nash equilibrium concept to
enforce sequential consistency, such as sequential
equilibria (Kreps and Wilson 1982).

An important application of subgames is to 
compute subgame perfect equilibria by backward 
induction. The idea is to start with a subgame 
root node as close as possible to an outcome node 
(e.g., node c in Fig. 2). This small subgame can 
be solved for its Nash equilibria, to compute the 
equilibrium payoffs for each player. Then, in the 
original game G, the root node of the subgame 
can be replaced by an outcome node, with payoffs 
equal to the equilibrium payoffs in the subgame. 
This results in a smaller game, and the process 
can be applied inductively until the full game is 
solved. The subgame perfect equilibrium strategies
can then be computed as the solution of
the different subgames solved in this backward
induction process. For Fig. 2, the subgame at c is
solved by player 2 selecting R, with payoffs (4,1).
The reduced new game has two choices for player 
1, with best decision R. This leads to the overall 
subgame perfect equilibrium (R, R). 
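
A minimal sketch of backward induction for a game of perfect information is given below. The tree is hypothetical and only in the spirit of Fig. 2: the payoff (4, 1) after (R, R) comes from the text, whereas the other two payoffs are assumptions chosen so that (L, L) is also a Nash equilibrium of the full game. Ties, which do not occur here, would be broken in favor of the first listed action.

```python
# Hypothetical perfect-information tree in the spirit of Fig. 2.
# Only the payoff (4, 1) is taken from the text; (2, 2) and (0, 0) are assumed.
TREE = {
    "root": {"player": 1, "children": {"L": "t_L", "R": "c"}},
    "c":    {"player": 2, "children": {"L": "t_RL", "R": "t_RR"}},
    "t_L":  {"payoff": (2, 2)},
    "t_RL": {"payoff": (0, 0)},
    "t_RR": {"payoff": (4, 1)},
}


def backward_induction(tree, node, plan=None):
    """Solve the subgame rooted at node.

    Returns (equilibrium payoffs, plan), where plan maps every decision node
    in the subgame to the action chosen there.
    """
    if plan is None:
        plan = {}
    data = tree[node]
    if "payoff" in data:                      # outcome node: nothing to decide
        return data["payoff"], plan
    i = data["player"] - 1                    # index of the mover's payoff
    best_action, best_payoff = None, None
    for action, child in data["children"].items():
        payoff, _ = backward_induction(tree, child, plan)
        if best_payoff is None or payoff[i] > best_payoff[i]:
            best_action, best_payoff = action, payoff
    plan[node] = best_action                  # replace the subgame by its value
    return best_payoff, plan
```

Calling `backward_induction(TREE, "root")` first solves the subgame at c and then the reduced game at the root, mirroring the two steps described above.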

This backward induction procedure is similar 
to the dynamic programming approach to solving 
control problems. Backward induction was first 
used by Zermelo (1912) to analyze zero-sum 
games of perfect information such as chess. An 
extension of Zermelo’s work by Kuhn (1953) 
establishes the following result: 

Theorem 3 Every finite game of perfect information
has a subgame perfect Nash equilibrium in
pure strategies.

This result follows because, at each step in the
backward induction process, the resulting subgame
consists of a choice among finite actions for
a single player, and thus a pure strategy achieves
the maximal payoff possible.

Infinite Noncooperative Games

When the number of options available to players 
is infinite, game trees are no longer appropriate 
representations for the evolution of the game. 
Instead, one uses state space models with explicit
models of actions and observations. A typical
multistage model for the dynamics of such
games is

$$x(t+1) = f(x(t), u_1(t), \ldots, u_n(t), w(t), t), \quad t = 0, \ldots, T-1 \qquad (2)$$

with initial condition $x(0) = w(0)$, where
$x(t)$ is the state of the game at stage t,
$u_1(t), \ldots, u_n(t)$ are actions selected by players
$1, \ldots, n$ at stage t, and $w(t)$ is an action selected
by Nature at stage t. The space of actions for each
player is restricted at each time to infinite sets
$A_i(t)$ with an appropriate topological structure,
and the admissible state $x(t)$ at each time belongs
to an infinite set X with a topological structure. In
terms of Nature’s actions, for each stage t, there is
a probability distribution that specifies the choice
of Nature’s action $w(t)$, selected independently
of other actions.

Equation (2) describes a play of the game,
in terms of how different actions by players
at the various stages evolve the state of the
game. A play of the game is thus a history
$h = (x(0), u_1(0), \ldots, u_n(0), x(1), \ldots, u_n(1), \ldots, x(T))$.
Associated with each play of
the game is a set of real-valued functions $J_i(h)$
that indicate the payoff to player i in this play.
These functions are often assumed to be separable
across the variables in each stage.
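
The multistage model (2) and a stagewise-additive payoff can be sketched as a simple rollout. Everything specific in the example is an assumption made for illustration: the scalar dynamics, the two linear feedback rules, the Gaussian choice for Nature, and the simplification that each player observes the state x(t) directly.

```python
import random


def simulate_play(f, strategies, nature, x0, T):
    """Roll out x(t+1) = f(x(t), u(t), w(t), t) for t = 0, ..., T-1."""
    states, actions = [x0], []
    for t in range(T):
        x = states[-1]
        u = tuple(gamma(x, t) for gamma in strategies)   # one action per player
        states.append(f(x, u, nature(t), t))
        actions.append(u)
    return states, actions


def additive_payoff(states, actions, stage_payoff, terminal_payoff):
    """Stagewise-additive payoff: sum of stage terms plus a terminal term."""
    total = sum(stage_payoff(x, u, t)
                for t, (x, u) in enumerate(zip(states, actions)))
    return total + terminal_payoff(states[-1])


# Hypothetical two-player scalar example.
def f(x, u, w, t):
    return x + u[0] - u[1] + w


strategies = (lambda x, t: -0.5 * x,    # player 1's state-feedback rule
              lambda x, t: 0.25 * x)    # player 2's state-feedback rule

rng = random.Random(0)                  # seeded so that a run is repeatable
states, actions = simulate_play(f, strategies, lambda t: rng.gauss(0.0, 0.1), 4.0, 5)
```

The pair `(states, actions)` is one play (history) of the game; averaging `additive_payoff` over many such plays would approximate the expected payoff induced by the chosen strategies.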

To complete the description of the extensive
form, one must now introduce the available information
to each player at each stage. Define
observation functions

$$y_i(t) = g_i(x(t), v(t), t), \quad i = 1, \ldots, n$$

where $y_i(t)$ takes values in observation
spaces which may be finite or infinite, and
$v(t)$ are selected by Nature given their
probability distributions, independent of other
selections. Define the information available
for player i at stage t to be $I_i(t)$, a subset
of $\{y_1(0), \ldots, y_n(0), \ldots, y_1(t), \ldots, y_n(t); u_1(0), \ldots, u_n(0), \ldots, u_1(t-1), \ldots, u_n(t-1)\}$. With
this notation, strategies $\gamma_i(t)$ for player i are
functions that map, at each stage, the available
information $I_i(t)$ into admissible decisions
$u_i(t) \in A_i(t)$. With appropriate measurability
conditions, specifying a full set of strategies
$\gamma = (\gamma_1, \ldots, \gamma_n)$ for each of the players induces a
probability distribution on the plays of the game,
which leads to the expected payoff $J_i(\gamma)$. Nash
equilibria are defined in identical fashion to (1).

Obtaining solutions of multistage games is a 
difficult task that depends on the ability to use 
the subgame decomposition techniques discussed 
previously. Such subgame decompositions are 
possible when games do not include actions by 
Nature and the payoff functions have a stagewise 
additive property. Under such cases, backward 
induction allows the construction of Nash equilibrium
strategies through the recursive solutions
of static infinite games, such as those discussed 
in the encyclopedia entry on static games. 

Generalizations of multistage games to continuous
time result in differential games, which are covered
in two articles in the encyclopedia, but only for the
zero-sum case. Additional details on infinite dynamic
noncooperative games, exploiting different
models and information structures and studying
the existence, uniqueness, or nonuniqueness of
equilibria, can be found in Basar and Olsder
(1982).

Conclusions 

In this entry, we reviewed models for dynamic
noncooperative games that incorporate temporal
order of play and uncertainty as to what individual
players know when they have to make
decisions. Using these models, we defined solution
concepts for the games and discussed algorithms
for determining solution strategies for
the players. Active directions of research include
development of new solution concepts for dynamic
games, new approaches to computation of
game solutions, the study of games with a large
number of players, evolutionary games where
players’ greedy behavior evolves toward equilibrium
strategies, and special classes of dynamic
games such as Markov games and differential
games. Several of these topics are discussed in
other entries in the encyclopedia.

Cross-References 

► Stochastic Dynamic Programming 

► Stochastic Games and Learning 

► Strategic Form Games and Nash Equilibrium

Abstract

In 2012 the fleet of dynamically positioned (DP)
ships and rigs was probably larger than 3,000
units, predominantly operating in the offshore oil
and gas industry. The complexity and functionality
vary subject to the targeted marine operation,
vessel concept, and risk level. DP systems
with advanced control functions and redundant
sensor, power, and thruster/propulsion configurations
are designed in order to provide high-precision
fault-tolerant control in safety-critical
marine operations. The DP system is customized
for the particular application with integration to
other control systems, e.g., power management,
propulsion, drilling, oil and gas production, off-loading,
crane operation, and pipe and cable laying.
For underwater vehicles such as remotely
operated vehicles (ROVs) and autonomous underwater
vehicles (AUVs), DP functionality, also
denoted as hovering, is implemented on several
vehicles.


Keywords 

Autonomous underwater vehicles (AUVs); Fault-tolerant
control; Remotely operated vehicles
(ROVs)


Introduction 

The offshore oil and gas industry is the 
dominating market for DP vessels. The various 
offshore applications include offshore service
vessels, drilling rigs (semisubmersibles) and
ships, shuttle tankers, cable and pipe layers, 
floating production, storage, and off-loading 
units (FPSOs), crane and heavy lift vessels, 
geological survey vessels, rescue vessels, and 
multipurpose construction vessels. DP systems 
are also installed on cruise ships, yachts, fishing 
boats, navy ships, tankers, and others. 

The International Maritime
Organization (IMO) and the maritime class societies
(DNV GL, ABS, LR, etc.) define a DP vessel as a
vessel that maintains its position and heading
(fixed location, denoted as stationkeeping, or predetermined
track) exclusively by means of active
thrusters. The DP system as defined by class
societies is not limited to the DP control
system including computers and cabling. Position
reference systems of various types measuring
North-East coordinates (satellites, hydroacoustic,
optics, taut wire, etc.), sensors (heading, roll,
pitch, wind speed and direction, etc.), the power
system, thruster and propulsion system, and independent
joystick system are also essential parts
of the DP system. In addition, the DP operator is
an important element securing safe and efficient
DP operations. Further development of human-machine
interfaces, alarm systems, and operator
decision support systems is regarded as a top priority
in bridging advanced and complex technology
to safe and efficient marine operations. Sufficient
DP operator training is a part of this.

The thruster and propulsion system controlled
by the DP control system is regarded as one of
the main power consumers on the DP vessel. An
important control system for successful integration
with the power plant and the other power
consumers such as drilling system, process system,
heating, and ventilation system is the power
and energy management system (PMS/EMS) balancing
safety requirements and energy efficiency.
The PMS/EMS controls the power generation
and distribution and the load control of heavy
power consumers. In this context both transient
and steady-state behaviors are of relevance. A
thorough understanding of the hydrodynamics,
dynamics between coupled systems, load characteristics
of the various power consumers, control
system architecture, control layers, power
system, propulsion system, and sensors is important
for successful design and operation of DP
systems and DP vessels.

Thruster-assisted position mooring is another
important stationkeeping application often used
for FPSOs, drilling rigs, and shuttle tanker operations
where the DP system has to be redesigned
accounting for the effect of the mooring system
dynamics. In thruster-assisted position mooring,
the DP system is renamed to position mooring
(PM) system. PM systems have been commercially
available since the 1980s. While for DP-operated
ships the thrusters are the sole source
of the stationkeeping, in PM systems the assistance
of thrusters is only complementary to the mooring system.
Here, most of the stationkeeping is provided by
a deployed anchor system. In severe environmental
conditions, the thrust assistance is used to
minimize the vessel excursions and line tension
by mainly increasing the damping in terms of
velocity feedback control and adding a bias force
minimizing the mean tensions of the most loaded
mooring lines. Modeling and control of turret-anchored
ships are treated in Strand et al. (1998)
and Nguyen and Sørensen (2009).

Overviews of DP systems including references
can be found in Fay (1989), Fossen (2011), and
Sørensen (2011). The scientific and industrial
contributions since the 1960s are vast, and many
research groups worldwide have provided important
results. In Sørensen et al. (2012), the development
of DP systems for ROVs is presented.

Mathematical Modeling of DP Vessels 

Depending on the operational conditions, the
vessel models may briefly be classified into
stationkeeping, low-velocity, and high-velocity
models. As shown in Sørensen (2011) and
Fossen (2011) and the references therein,
different model reduction techniques are
used for the various speed regimes. Vessel
motions in waves are defined as seakeeping
and will here apply both for stationkeeping
(zero speed) and forward speed. DP vessels
or PM vessels can in general be regarded as
stationkeeping and low-velocity or low Froude
number applications. This assumption will
particularly be used in the formulation of mathematical
models used in conjunction with the
controller design. It is common to use a two-time
scale formulation by separating the total model
into a low-frequency (LF) model and a wave-frequency
(WF) model (seakeeping) by superposition.
Hence, the total motion is a sum of the
corresponding LF and the WF components. The
WF motions are assumed to be caused by first-order
wave loads. Assuming small amplitudes,
these motions will be well represented by a linear
model. The LF motions are assumed to be caused
by second-order mean and slowly varying wave
loads, current loads, wind loads, mooring (if
any), and thrust and rudder forces and moments.

For underwater vehicles operating below the
wave zone, estimated to be deeper than half the
wavelength, the wave loads can be disregarded,
as can the effect of the wind loads.
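
The half-wavelength rule of thumb can be turned into a quick check. The sketch below assumes deep-water linear wave theory, in which the wavelength follows from the wave period T as $\lambda = g T^2 / (2\pi)$; this dispersion relation and the function names are assumptions added for the example and are not stated in the text.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def deep_water_wavelength(period_s):
    """Wavelength in meters from the deep-water dispersion relation."""
    return G * period_s ** 2 / (2.0 * math.pi)


def wave_zone_depth(period_s):
    """Depth in meters below which wave loads are neglected (half a wavelength)."""
    return 0.5 * deep_water_wavelength(period_s)


def below_wave_zone(depth_m, period_s):
    """True if a vehicle at depth_m is deeper than half the wavelength."""
    return depth_m > wave_zone_depth(period_s)
```

For a 10 s wave period this gives a wavelength of roughly 156 m, so a vehicle hovering at 100 m depth would be treated as below the wave zone, whereas one at 50 m would not.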

Modeling Issues 

The mathematical models may be formulated in 
two complexity levels: 

• Control plant model is a simplified mathematical
description containing only the main
physical properties of the process or plant.
This model may constitute a part of the controller.
The control plant model is also used
in analytical stability analysis based on, e.g.,
Lyapunov stability.

• Process plant model or simulation model is a
comprehensive description of the actual process
and should be as detailed as needed. The
main purpose of this model is to simulate
the real plant dynamics. The process plant
model is used in numerical performance and
robustness analysis and testing of the control
systems. As shown above, the process plant
models may be implemented for off-line or
real-time simulation (e.g., HIL testing; see Johansen
et al. 2007) purposes defining different
requirements for model fidelity.

Kinematics 

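The relationship between the Earth-fixed position
and orientation of a floating structure and its
body-fixed velocities is

$$\begin{bmatrix} \dot{\eta}_1 \\ \dot{\eta}_2 \end{bmatrix} = \begin{bmatrix} J_1(\eta_2) & 0_{3\times3} \\ 0_{3\times3} & J_2(\eta_2) \end{bmatrix} \begin{bmatrix} \nu_1 \\ \nu_2 \end{bmatrix} \qquad (1)$$

The vectors defining the Earth-fixed vessel position
($\eta_1$) and orientation ($\eta_2$) using Euler angles
and the body-fixed translation ($\nu_1$) and rotation
($\nu_2$) velocities are given by

$$\eta_1 = [x \;\; y \;\; z]^T, \quad \eta_2 = [\phi \;\; \theta \;\; \psi]^T, \quad \nu_1 = [u \;\; v \;\; w]^T, \quad \nu_2 = [p \;\; q \;\; r]^T. \qquad (2)$$

The rotation matrix $J_1(\eta_2) \in SO(3)$ and the
velocity transformation matrix $J_2(\eta_2) \in \mathbb{R}^{3\times3}$
are defined in Fossen (2011). For ships, if only
surge, sway, and yaw (3DOF) are considered, the
kinematics and the state vectors are reduced to

$$\dot{\eta} = R(\psi)\nu, \quad \text{or} \quad \begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{\psi} \end{bmatrix} = \begin{bmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} u \\ v \\ r \end{bmatrix}. \qquad (3)$$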
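
The 3DOF kinematics can be sketched as follows, assuming the standard yaw rotation matrix for $R(\psi)$ with $\eta = [x \; y \; \psi]^T$ and $\nu = [u \; v \; r]^T$. The function names and the forward-Euler integration are illustrative assumptions.

```python
import math


def rotation_matrix(psi):
    """Yaw rotation R(psi) from the body-fixed to the Earth-fixed frame."""
    c, s = math.cos(psi), math.sin(psi)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def eta_dot(psi, nu):
    """Earth-fixed velocities eta_dot = R(psi) nu for nu = [u, v, r]."""
    R = rotation_matrix(psi)
    return [sum(R[i][j] * nu[j] for j in range(3)) for i in range(3)]


def integrate_pose(eta, nu, dt, steps):
    """Forward-Euler integration of eta_dot = R(psi) nu with constant nu."""
    x, y, psi = eta
    for _ in range(steps):
        dx, dy, dpsi = eta_dot(psi, nu)
        x, y, psi = x + dt * dx, y + dt * dy, psi + dt * dpsi
    return [x, y, psi]
```

With a heading of 90 degrees, a pure surge velocity maps to motion along the Earth-fixed y axis (East), which is a quick sanity check on the sign convention.
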

Process Plant Model: Low-Frequency
Motion

The 6-DOF LF model formulation is based on
Fossen (2011) and Sørensen (2011). The equations
of motion for the nonlinear LF model of a
floating vessel are given by

$$M\dot{\nu} + C_{RB}(\nu)\nu + C_A(\nu_r)\nu_r + D(\nu_r) + G(\eta) = \tau_{wave2} + \tau_{wind} + \tau_{thr} + \tau_{moor}, \qquad (4)$$

where $M \in \mathbb{R}^{6\times6}$ is the system inertia matrix
including added mass; $C_{RB}(\nu) \in \mathbb{R}^{6\times6}$ and
$C_A(\nu_r) \in \mathbb{R}^{6\times6}$ are the skew-symmetric Coriolis
and centripetal matrices of the rigid body and
the added mass; $G(\eta) \in \mathbb{R}^6$ is the generalized
restoring vector caused by the mooring lines
(if any), buoyancy, and gravitation; $\tau_{thr} \in \mathbb{R}^6$
is the control vector consisting of forces and
moments produced by the thruster system; $\tau_{wind}$
and $\tau_{wave2} \in \mathbb{R}^6$ are the wind and second-order
wave load vectors, respectively.

The damping vector may be divided into linear
and nonlinear terms according to

$$D(\nu_r) = d_L(\nu_r, \kappa)\nu_r + d_{NL}(\nu_r, \gamma_r)\nu_r, \qquad (5)$$

where $\nu_r \in \mathbb{R}^6$ is the relative velocity vector
between the current and the vessel. The nonlinear
damping, $d_{NL}$, is assumed to be caused by
turbulent skin friction and viscous eddy-making,
also denoted as vortex shedding (Faltinsen 1990).
The strictly positive linear damping matrix $d_L \in \mathbb{R}^{6\times6}$
is caused by linear laminar skin friction
and is assumed to vanish for increasing speed
according to

$$d_L(\nu_r, \kappa) = \begin{bmatrix} X_{u_r} e^{-\kappa|u_r|} & \cdots & X_r e^{-\kappa|r|} \\ \vdots & \ddots & \vdots \\ N_{u_r} e^{-\kappa|u_r|} & \cdots & N_r e^{-\kappa|r|} \end{bmatrix}, \qquad (6)$$

where $\kappa$ is a positive scaling constant such that
$\kappa \in \mathbb{R}_+$.

Process Plant Model: Wave-Frequency
Motion

The coupled equations of the WF motions in
surge, sway, heave, roll, pitch, and yaw are assumed
to be linear and can be formulated as

$$M(\omega)\ddot{\eta}_{Rw} + D_p(\omega)\dot{\eta}_{Rw} + G\eta_{Rw} = \tau_{wave1}, \qquad \dot{\eta}_w = J(\eta_2)\dot{\eta}_{Rw}, \qquad (7)$$

where $\eta_{Rw} \in \mathbb{R}^6$ is the WF motion vector in the
hydrodynamic frame, $\eta_w \in \mathbb{R}^6$ is the WF motion
vector in the Earth-fixed frame, and $\tau_{wave1} \in \mathbb{R}^6$ is
the first-order wave excitation vector, which will
be modified for varying vessel headings relative
to the incident wave direction. $M(\omega) \in \mathbb{R}^{6\times6}$ is
the system inertia matrix containing frequency-dependent
added mass coefficients in addition to
the vessel’s mass and moment of inertia. $D_p(\omega) \in \mathbb{R}^{6\times6}$
is the wave radiation (potential) damping
matrix. The linearized restoring coefficient matrix
$G \in \mathbb{R}^{6\times6}$ is due to gravity and buoyancy
affecting heave, roll, and pitch only. For anchored
vessels, it is assumed that the mooring system
will not influence the WF motions.

Remark 1 Generally, a time domain equation
cannot be expressed with the frequency domain
coefficient $\omega$. However, this is a commonly
used formulation denoted as a pseudo-differential
equation. An important feature of the added mass
terms and the wave radiation damping terms
is the memory effects, which in particular are
important to consider for nonstationary cases,
e.g., rapid changes of heading angle. Memory
effects can be taken into account by introducing
a convolution integral or a so-called retardation
function (Newman 1977) or state space models
as suggested by Fossen (2011).

Control Plant Model

For the purpose of controller design and analysis,
it is convenient to apply model reduction and
derive a LF and WF control plant model in
surge, sway, and yaw about zero vessel velocity
according to

$$\dot{p}_w = A_{pw}(\omega_p)\, p_w + E_{pw} w_{pw}, \qquad (8)$$
$$\dot{\eta} = R(\psi)\nu, \qquad (9)$$
$$\dot{b} = -T_b^{-1} b + E_b w_b, \qquad (10)$$
$$M\dot{\nu} = -D_L \nu + R^T(\psi) b + \tau, \qquad (11)$$
$$\dot{\omega}_p = 0, \qquad (12)$$
$$y = \left[ (\eta + C_{pw} p_w)^T \right]^T, \qquad (13)$$

where $\omega_p \in \mathbb{R}$ is the peak frequency of the waves (PFW). The
estimated PFW can be calculated by spectral analysis of the pitch and roll
measurements, which are assumed to oscillate predominantly at the peak wave
frequency. In the spectral analysis, the discrete Fourier transforms of the
measured roll and pitch, collected over a period of time, are computed with an
n-point fast Fourier transform (FFT). The PFW is then taken to be the
frequency at which the power spectrum is maximal. The assumption
$\dot{\omega}_p = 0$ is valid for a slowly varying sea state. The wave
frequency can also be found using nonlinear observers, an extended Kalman
filter (EKF), or other signal processing techniques. It is assumed that a
second-order linear model is sufficient to describe the first-order
wave-induced motions, so that $p_w \in \mathbb{R}^6$ is the state of the WF
model. $A_{pw} \in \mathbb{R}^{6\times 6}$ is assumed Hurwitz and describes
the first-order wave-induced motion as a mass-damper-spring system,
$w_{pw} \in \mathbb{R}^3$ is a zero-mean Gaussian white noise vector, and $y$
is the measurement vector. The WF measurement matrix
$C_{pw} \in \mathbb{R}^{3\times 6}$ and the disturbance matrix
$E_{pw} \in \mathbb{R}^{6\times 3}$ are formulated as

$$C_{pw} = \begin{bmatrix} 0_{3\times 3} & I_{3\times 3} \end{bmatrix}, \qquad E_{pw} = \begin{bmatrix} 0_{3\times 3} & K \end{bmatrix}^{T}. \tag{14}$$
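The spectral estimate of the peak wave frequency described above can be sketched as follows. This is only an illustration under stated assumptions: the roll and pitch records are synthetic, the sampling rate and record length are hypothetical, and a direct discrete Fourier transform stands in for the FFT mentioned in the text (both give the same spectrum; the FFT is simply faster).

```python
import cmath
import math


def power_spectrum(samples):
    """One-sided power spectrum |X[k]|^2 of a mean-removed real signal.

    Uses a direct DFT (O(n^2)); an FFT would return the same values.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = 0j
        for i, value in enumerate(centered):
            acc += value * cmath.exp(-2j * math.pi * k * i / n)
        spectrum.append(abs(acc) ** 2)
    return spectrum


def estimate_peak_frequency(roll, pitch, sample_rate_hz):
    """Return the estimated peak wave frequency omega_p in rad/s."""
    n = len(roll)
    combined = [a + b for a, b in zip(power_spectrum(roll), power_spectrum(pitch))]
    # Skip bin 0 (the mean) and pick the bin with maximal power.
    k_peak = max(range(1, len(combined)), key=lambda k: combined[k])
    return 2.0 * math.pi * k_peak * sample_rate_hz / n


# Synthetic roll and pitch records (hypothetical values, for illustration only).
sample_rate_hz = 2.0
n_points = 256
true_frequency_hz = 0.1
times = [i / sample_rate_hz for i in range(n_points)]
roll = [0.05 * math.sin(2.0 * math.pi * true_frequency_hz * t) for t in times]
pitch = [0.02 * math.sin(2.0 * math.pi * true_frequency_hz * t + 0.7) for t in times]

omega_p = estimate_peak_frequency(roll, pitch, sample_rate_hz)
print(f"estimated omega_p = {omega_p:.3f} rad/s")
```

The resolution of the estimate is limited to one frequency bin, i.e., $2\pi f_s / n$ rad/s, so longer records give a finer estimate as long as the sea state can be regarded as stationary over the record.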

Here, a 3-DOF model is assumed, adopting the notation in (3), such that
$\eta \in \mathbb{R}^3$ and $\nu \in \mathbb{R}^3$ are the LF position vector
in the Earth-fixed frame and the LF velocity vector in the body-fixed frame,
respectively. $M \in \mathbb{R}^{3\times 3}$ and
$D_L \in \mathbb{R}^{3\times 3}$ are the mass matrix, including hydrodynamic
added mass, and the linear damping matrix, respectively. The bias term
$b \in \mathbb{R}^3$, accounting for unmodeled effects and slowly varying
disturbances, is modeled as a Markov process with a positive definite diagonal
matrix $T_b \in \mathbb{R}^{3\times 3}$ of time constants. If the term
containing $T_b$ is removed, a Wiener process is used.
$w_b \in \mathbb{R}^3$ is a bounded disturbance vector,
$E_b \in \mathbb{R}^{3\times 3}$ is a disturbance scaling matrix, and
$\tau \in \mathbb{R}^3$ is the control force. As mentioned later in the paper,
the proposed model reduction considering only horizontal motions may create
problems when conducting DP operations of structures with low waterplane area
such as semisubmersibles. More details can be found in Sørensen (2011) and
Fossen (2011) and the references therein.
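To make the structure of the LF part of the control plant model more concrete, the following minimal sketch integrates the kinematics (9), the bias model (10), and the kinetics (11) with an explicit Euler step. Several things here are assumptions made for illustration: the noise term $w_b$ is set to zero, $M$, $D_L$, and $T_b$ are taken as diagonal, and all numerical values (masses, damping, thrust, step size) are hypothetical.

```python
import math


def rotation(psi):
    """Rotation matrix R(psi) from the body-fixed to the Earth-fixed frame."""
    c, s = math.cos(psi), math.sin(psi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def mat_vec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def lf_step(eta, nu, b, tau, mass, damping, bias_time, dt):
    """One explicit Euler step of (9)-(11) with w_b = 0.

    mass, damping, bias_time hold the diagonal entries of M, D_L, T_b
    (diagonal matrices are an assumption of this sketch).
    """
    rot = rotation(eta[2])
    eta_dot = mat_vec(rot, nu)                            # eta_dot = R(psi) nu
    b_dot = [-bi / ti for bi, ti in zip(b, bias_time)]    # b_dot = -T_b^{-1} b
    b_body = mat_vec(transpose(rot), b)                   # R^T(psi) b
    nu_dot = [(-d * v + bb + t) / m
              for m, d, v, bb, t in zip(mass, damping, nu, b_body, tau)]
    eta_next = [x + dt * dx for x, dx in zip(eta, eta_dot)]
    nu_next = [x + dt * dx for x, dx in zip(nu, nu_dot)]
    b_next = [x + dt * dx for x, dx in zip(b, b_dot)]
    return eta_next, nu_next, b_next


# Hypothetical vessel data: surge, sway, yaw.
mass = [5.0e6, 5.0e6, 3.0e9]
damping = [5.0e4, 1.0e5, 1.0e8]
bias_time = [1000.0, 1000.0, 1000.0]

eta, nu, b = [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]
tau = [1.0e5, 0.0, 0.0]  # constant surge force
for _ in range(2000):
    eta, nu, b = lf_step(eta, nu, b, tau, mass, damping, bias_time, dt=0.5)

print(f"surge speed after 1000 s: {nu[0]:.3f} m/s")
```

With a constant surge force and linear damping, the surge speed settles at the ratio of force to damping, which gives a quick sanity check of the sign conventions in (11).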

For underwater vehicles, a 6-DOF model should be used. For underwater
vehicles with self-stabilizing roll and pitch, a 4-DOF model with surge, sway,
yaw, and heave may be used; see Sørensen et al. (2012).

Control Levels and Integration 
Aspects 

The real-time control hierarchy of a marine control system (Sørensen 2005)
may be divided into three levels: the guidance system and local optimization,
the high-level plant control (e.g., DP controller including thrust
allocation), and the low-level thruster control. The DP control system
consists of several modules as indicated in Fig. 1:
• Signal processing for analysis and testing of the individual signals,
including voting and weighting when redundant measurements are available.
Ensuring robust and fault-tolerant control by proper diagnostics and change
detection algorithms is regarded as perhaps one of the most important
research areas. For an overview of the field, see Basseville and Nikiforov
(1993) and Blanke et al. (2003).

• Vessel observer for state estimation and wave filtering. In case of lost
sensor signals, the predictor is used to provide dead reckoning, which is
required by class societies. The prediction error, i.e., the deviation between
the measurements and the estimated measurements, is also an important barrier
in failure detection.

• Feedback control law is often of multivariable PID type, where feedback is
produced from the estimated low-frequency (LF) position and heading deviations
and estimated LF velocities.

• Feedforward control law is normally the 
wind force and moment. For the different 
applications (pipe laying, ice operations, 
position mooring), tailor-made feedforward 
control functions are also used. 

• Guidance system with reference models is 
needed in achieving a smooth transition 
between setpoints. In the most basic case, 
the operator specifies a new desired position 
and heading, and a reference model generates 
smooth reference trajectories/paths for 
the vessel to follow. A more advanced 
guidance system involves way-point tracking 
functionality with optimal path planning. 

• Thrust allocation computes the force and 
direction commands to each thruster device 
based on input from the resulting feedback 
and feedforward controllers. The low-level 
thruster controllers will then control the 
propeller pitch, speed, torque, and power. 

• Model adaptation provides the necessary corrections of the vessel model and
the controller settings subject to changes in the vessel draft, wind area, and
variations in the sea state.

• Power management system performs diesel 
engine control, power generation management 
with frequency and voltage monitoring, active 
and passive load sharing, and load dependent 
start and stop of generator sets. 
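As an illustration of the guidance module in the list above, the following sketch generates a smooth reference trajectory between two setpoints with a second-order low-pass reference model. The filter structure, the natural frequency `omega`, the damping ratio `zeta`, and the step size are assumptions chosen for the example; they are not taken from a particular DP system.

```python
def reference_model_step(x_d, v_d, setpoint, omega, zeta, dt):
    """One explicit Euler step of
    x_d'' + 2*zeta*omega*x_d' + omega**2 * x_d = omega**2 * setpoint."""
    a_d = omega ** 2 * (setpoint - x_d) - 2.0 * zeta * omega * v_d
    return x_d + dt * v_d, v_d + dt * a_d


def generate_reference(start, setpoint, omega=0.5, zeta=1.0, dt=0.1, duration=60.0):
    """Return the sampled reference position from start to setpoint."""
    x_d, v_d = start, 0.0
    trajectory = [x_d]
    for _ in range(int(round(duration / dt))):
        x_d, v_d = reference_model_step(x_d, v_d, setpoint, omega, zeta, dt)
        trajectory.append(x_d)
    return trajectory


# Operator requests a 10 m change of the north position (hypothetical numbers).
north_reference = generate_reference(start=0.0, setpoint=10.0)
print(f"reference after 60 s: {north_reference[-1]:.4f} m")
```

With critical damping (`zeta = 1.0`) the reference approaches the new setpoint without overshoot, so the feedback controller is never asked to follow a step change.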
Dynamic Positioning Control Systems for Ships and Underwater Vehicles, Fig. 1 Controller structure


DP Controller 

In the 1960s the first DP system was introduced for horizontal modes of
motion (surge, sway, and yaw) using single-input single-output PID control
algorithms in combination with low-pass and/or notch filters. In the 1970s
more advanced output control methods based on multivariable optimal control
and Kalman filter theory were proposed by Balchen et al. (1976) and later
refined in Sælid et al. (1983), Grimble and Johnson (1988), and others as
referred to in Sørensen (2011). In the 1990s nonlinear DP controller designs
were proposed by several research groups; for an overview see Strand et al.
(1998), Fossen and Strand (1999), Pettersen and Fossen (2000), and Sørensen
(2011). Nguyen et al. (2007) proposed the design of a hybrid controller for DP
from calm to extreme sea conditions.

Plant Control 

By copying the control plant model (8)-(13) and adding an injection term, a
passive observer may be designed. A nonlinear horizontal-plane positioning
output feedback controller of PID type may be formulated as

$$\tau_{PID} = -R_e^{T} K_p\, e - R_e^{T} K_{p3}\, f(e) - K_d\, \tilde{\nu} - R_e^{T} K_i\, z, \tag{15}$$

where $e \in \mathbb{R}^3$ is the position and heading deviation vector,
$\tilde{\nu} \in \mathbb{R}^3$ is the velocity deviation vector,
$z \in \mathbb{R}^3$ is the vector of integrator states, and $f(e)$ is a
third-order stiffness term, defined as

$$
\begin{aligned}
e &= [e_1, e_2, e_3]^{T} = R^{T}(\psi_d)\,(\hat{\eta} - \eta_d), \\
\tilde{\nu} &= \hat{\nu} - R^{T}(\psi_d)\,\dot{\eta}_d, \\
R_e &= R(\psi - \psi_d) = R^{T}(\psi_d)\,R(\psi), \\
f(e) &= [e_1^3, e_2^3, e_3^3]^{T}.
\end{aligned}
$$

Experience from implementation and operations of real DP control systems has
shown that using $\psi_d$ in the calculation of the error vector $e$ generally
gives better performance with less noisy signals than using $\psi$. However,
this is only valid
Dynamic Positioning Control Systems for Ships and Underwater Vehicles, Fig. 2 Thrust allocation. The DP controller computes the desired force [N] in surge and sway and the moment [Nm] in yaw; the thrust allocation computes the desired force [N] each thruster must produce; in the thrust characteristics mapping, the desired speed, torque, or power for each propeller is calculated (speed mapping is shown)



under the assumption that the vessel maintains 
its desired heading with small deviations. As 
the DP capability for ships is sensitive to the 
heading angle, i.e., minimizing the environmental 
loads, heading control is prioritized in case of 
limitations in the thrust capacity. This feature is 
handled in the thrust allocation. 

An advantage of the third-order stiffness term is the possibility to reduce
the first-order proportional gain matrix, resulting in reduced dynamic
thruster action for smaller position and heading deviations. Moreover, the
third-order restoring term makes the thrusters work more aggressively for
larger deviations. $K_p$, $K_{p3}$, $K_d$, and
$K_i \in \mathbb{R}^{3\times 3}$ are the nonnegative controller gain matrices
for the proportional, third-order restoring, derivative, and integrator
controller terms, respectively, found by appropriate controller synthesis
methods.
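The PID-type law (15) with its third-order stiffness term can be sketched as below. The sketch assumes diagonal gain matrices, rotates the integral term with $R_e^{T}$ in the same way as the proportional terms, and uses hypothetical gain values; the function and variable names are illustrative only.

```python
import math


def rotation(psi):
    """Rotation matrix R(psi) in the horizontal plane (surge, sway, yaw)."""
    c, s = math.cos(psi), math.sin(psi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def mat_vec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def pid_control(eta_hat, nu_hat, eta_d, eta_d_dot, z, gains):
    """Evaluate the nonlinear PID law with diagonal gains (an assumption).

    gains: dict with lists "kp", "kp3", "kd", "ki" of diagonal entries.
    Returns the control vector tau_pid and the error vector e.
    """
    psi, psi_d = eta_hat[2], eta_d[2]
    rot_d_t = transpose(rotation(psi_d))
    e = mat_vec(rot_d_t, [a - b for a, b in zip(eta_hat, eta_d)])
    nu_tilde = [a - b for a, b in zip(nu_hat, mat_vec(rot_d_t, eta_d_dot))]
    f_e = [ei ** 3 for ei in e]                      # third-order stiffness term
    rot_e_t = transpose(rotation(psi - psi_d))       # R_e^T
    term_p = mat_vec(rot_e_t, [k * x for k, x in zip(gains["kp"], e)])
    term_p3 = mat_vec(rot_e_t, [k * x for k, x in zip(gains["kp3"], f_e)])
    term_i = mat_vec(rot_e_t, [k * x for k, x in zip(gains["ki"], z)])
    term_d = [k * x for k, x in zip(gains["kd"], nu_tilde)]
    tau_pid = [-(p + p3 + d + i)
               for p, p3, d, i in zip(term_p, term_p3, term_d, term_i)]
    return tau_pid, e


# Hypothetical gains and a pure north position error with aligned heading.
gains = {"kp": [2.0, 2.0, 2.0], "kp3": [1.0, 1.0, 1.0],
         "kd": [0.0, 0.0, 0.0], "ki": [0.0, 0.0, 0.0]}
zero = [0.0, 0.0, 0.0]
tau_small, _ = pid_control([1.0, 0.0, 0.0], zero, zero, zero, zero, gains)
tau_large, _ = pid_control([3.0, 0.0, 0.0], zero, zero, zero, zero, gains)
print(tau_small[0], tau_large[0])
```

Tripling the deviation multiplies the linear contribution by three but the cubic contribution by twenty-seven, which is the mechanism that allows a lower proportional gain near the setpoint while keeping a firm response to large deviations.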

For small-waterplane-area marine vessels such as semisubmersibles, often used
as drilling rigs, Sørensen and Strand (2000) proposed a DP control law with
the inclusion of roll and pitch damping according to

where $p$ and $q$ are the estimated roll and pitch angular velocities. The
resulting positioning control law is written as

$$\tau = \tau_{wFF} + \tau_{PID} + \tau_{rpd}, \tag{17}$$

where $\tau_{wFF} \in \mathbb{R}^3$ is the wind feedforward control law.

Thrust allocation or control allocation (Fig. 2) is the mapping between plant
and actuator control. It is assumed to be a part of the plant control. The DP
controller calculates the desired force in surge and sway and the moment in
yaw. Depending on the particular thrust configuration with installed and
enabled propellers, tunnel thrusters, azimuthing thrusters, and rudders, the
allocation is a nontrivial optimization problem calculating the desired thrust
and direction for each enabled thruster subject to various constraints such as
thruster ratings, forbidden sectors, and thrust efficiency. References on
thrust allocation are found in Johansen and Fossen (2013).
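The simplest instance of the allocation problem described above ignores all constraints and uses only fixed-direction thrusters, in which case the minimum-norm solution is given by a pseudo-inverse. The sketch below shows that special case only; it is not the constrained optimization used in practice, and the thruster layout and demanded forces are hypothetical.

```python
import math


def thruster_column(x, y, angle):
    """Column of the configuration matrix B for a fixed-direction thruster
    located at (x, y) in the body frame with thrust direction angle (rad)."""
    return [math.cos(angle), math.sin(angle),
            x * math.sin(angle) - y * math.cos(angle)]


def solve_linear(a, rhs):
    """Solve a * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("configuration matrix does not have full row rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        known = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


def allocate(columns, tau):
    """Minimum-norm thrusts u = B^T (B B^T)^{-1} tau (unconstrained case)."""
    bbt = [[sum(col[r] * col[c] for col in columns) for c in range(3)]
           for r in range(3)]
    multipliers = solve_linear(bbt, tau)
    return [sum(col[r] * multipliers[r] for r in range(3)) for col in columns]


# Hypothetical layout: two main propellers aft, one bow and one aft tunnel thruster.
columns = [
    thruster_column(-30.0, -5.0, 0.0),
    thruster_column(-30.0, 5.0, 0.0),
    thruster_column(30.0, 0.0, math.pi / 2.0),
    thruster_column(-25.0, 0.0, math.pi / 2.0),
]
tau_desired = [1.0e5, 2.0e4, 5.0e5]  # surge [N], sway [N], yaw [Nm]
thrusts = allocate(columns, tau_desired)
tau_achieved = [sum(col[r] * u for col, u in zip(columns, thrusts)) for r in range(3)]
print([round(u, 1) for u in thrusts])
```

Thruster ratings, forbidden sectors, and azimuth angles turn this linear-algebra step into the constrained optimization problem mentioned in the text.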

In Sørensen and Smogeli (2009), torque and power control of electrically
driven marine propellers are shown. Ruth et al. (2009) proposed anti-spin
thrust allocation, and Smogeli et al. (2008) and Smogeli and Sørensen (2009)
presented the concept of anti-spin thruster control.


Cross-References 

► Control of Ship Roll Motion 

► Fault-Tolerant Control 

► Mathematical Models of Ships and Underwater 
Vehicles 

► Motion Planning for Marine Control Systems 

► Underactuated Marine Control Systems 
Abstract

Economic model predictive control (EMPC) is a variant of model predictive
control aimed at maximizing a system's profitability. It allows one to deal
explicitly with hard and average constraints on the system's input and output
variables as well as with nonlinearity of the dynamics. We provide basic
definitions and concepts of the approach and highlight some promising research
directions.
Introduction

Most control tasks involve some kind of economic optimization. In classical
linear quadratic (LQ) control, for example, this is cast as a trade-off
between control effort and tracking performance. The designer is allowed to
settle such a trade-off by suitably tuning weighting parameters of an
otherwise automatic design procedure.

When the primary goal of a control system is profitability rather than
tracking performance, a suboptimal approach has often been devised, namely, a
hierarchical separation is enforced between the economic optimization layer
and the dynamic real-time control layer.

In practice, while set points are computed by optimizing economic revenue
among all equilibria fulfilling the prescribed constraints, the task of the
real-time control layer is simply to drive (basically as fast as possible) the
system's state to the desired set-point value.

Optimal control or LQ control may be used to achieve the latter task, possibly
in conjunction with model predictive control (MPC), but the actual economics
of the plant are normally neglected at this stage.

The main benefits of this approach are twofold:

1. Reduced computational complexity with respect to infinite-horizon dynamic
programming

2. Stability robustness in the face of uncertainty, normally achieved by using
some form of robust control in the real-time control layer

The hierarchical approach, however, is suboptimal in two respects:

1. First of all, given nonlinearity of the plant's dynamics and/or
nonconvexity of the functions characterizing the economic revenue, there is no
reason why the most profitable regime should be an equilibrium.

2. Even when systems are most profitably operated at equilibrium, transient
costs are totally disregarded by the hierarchical approach, and this may be
undesirable if the time constants of the plant are close enough to the time
scales at which set-point variations occur.

Economic model predictive control seeks to remove these limitations by
directly using the economic revenue in the stage cost and by formulating an
associated dynamic optimization problem to be solved online in a receding
horizon manner. It was originally developed by Rawlings and co-workers, in the
context of linear control systems subject to convex constraints, as an
effective technique to deal with infeasible set points (Rawlings et al. 2008),
in contrast to the classical approach of redesigning a suitable quadratic cost
that achieves its minimum at the closest feasible equilibrium. Preserving the
original cost has the advantage of slowing down convergence to such an
equilibrium when the transient evolution occurs in a region where the stage
cost is better than at steady state. Stability and convergence issues were at
first analyzed, thanks to convexity, for the special case of linear systems
only. Subsequently Diehl introduced the notion of rotated cost (see Diehl et
al. 2011), which allowed a Lyapunov interpretation of stability criteria and
paved the way for the extension to general dissipative nonlinear systems
(Angeli et al. 2012).

Economic MPC Formulation 

In order to describe the most common versions of economic MPC, assume that a
discrete-time finite-dimensional model of state evolution is available for the
system to be controlled:

$$x^{+} = f(x, u) \tag{1}$$

where $x \in X \subset \mathbb{R}^n$ is the state variable,
$u \in U \subset \mathbb{R}^m$ is the control input, and
$f : X \times U \to X$ is a continuous map which computes the next state
value, given the current one and the value of the input. We also assume that
$Z \subset X \times U$ is a compact set which defines the (possibly coupled)
state/input constraints that need to hold pointwise in time:

$$(x(t), u(t)) \in Z \quad \forall t \in \mathbb{N}. \tag{2}$$

In order to introduce a measure of economic performance, to each feasible
state/input pair $(x, u) \in Z$, we associate the instantaneous net cost of
operating the plant at that state when feeding the specified control input:

$$\ell(x, u) : Z \to \mathbb{R}. \tag{3}$$

The function $\ell$ (which we assume to be continuous) is normally referred to
as the stage cost and, together with actuation and/or inflow costs, should
also take into account the profits associated with possible outputs/outflows
of the system. Let $(x^*, u^*)$ denote the best equilibrium/control input pair
associated with (3) and (2), namely,

$$
\begin{aligned}
\ell(x^*, u^*) = \min_{x,u}\ & \ell(x, u) \\
\text{subject to}\ & (x, u) \in Z, \\
& x = f(x, u).
\end{aligned}
\tag{4}
$$

Notice that, unlike in tracking MPC, it is not assumed here that

$$\ell(x^*, u^*) \le \ell(x, u) \quad \forall (x, u) \in Z. \tag{5}$$

This is, technically speaking, the main point of departure between economic
MPC and tracking MPC.
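The difference between the best equilibrium in (4) and the unconstrained-by-dynamics minimizer of the stage cost can be seen on a small example. The scalar system, the constraint set, and the stage cost below are hypothetical choices made only for illustration; the equilibria are parametrized in closed form for this particular $f$ and searched on a grid.

```python
# Hypothetical toy problem (not from the cited literature).
X_MIN, X_MAX = 0.0, 2.0
U_MIN, U_MAX = 0.0, 1.0


def f(x, u):
    """Toy dynamics x+ = 0.5 x + u."""
    return 0.5 * x + u


def stage_cost(x, u):
    """Toy economic stage cost: reward for large x, penalty on input use."""
    return -x + 2.0 * u ** 2 + 0.5 * (x - 1.0) ** 2


def feasible(x, u):
    return X_MIN <= x <= X_MAX and U_MIN <= u <= U_MAX


def best_equilibrium(grid_points=1001):
    """Grid search of problem (4); for this f the equilibria satisfy x = 2 u."""
    best = None
    for i in range(grid_points):
        u = U_MIN + (U_MAX - U_MIN) * i / (grid_points - 1)
        x = 2.0 * u
        if feasible(x, u):
            cost = stage_cost(x, u)
            if best is None or cost < best[2]:
                best = (x, u, cost)
    return best


def global_minimum(grid_points=201):
    """Smallest stage cost over Z, ignoring the equilibrium condition."""
    best = None
    for i in range(grid_points):
        x = X_MIN + (X_MAX - X_MIN) * i / (grid_points - 1)
        for j in range(grid_points):
            u = U_MIN + (U_MAX - U_MIN) * j / (grid_points - 1)
            cost = stage_cost(x, u)
            if best is None or cost < best[2]:
                best = (x, u, cost)
    return best


x_star, u_star, l_star = best_equilibrium()
x_min, u_min, l_min = global_minimum()
print(f"best equilibrium: x*={x_star:.3f}, u*={u_star:.3f}, cost={l_star:.3f}")
print(f"cheapest point in Z: x={x_min:.3f}, u={u_min:.3f}, cost={l_min:.3f}")
```

In this example the cheapest point of $Z$ is not an equilibrium and has a strictly lower cost than $(x^*, u^*)$, so inequality (5) fails, which is exactly the situation that distinguishes economic from tracking MPC.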

As there is no natural termination time to the operation of a system, our goal
would be to optimize the infinite-horizon cost functional

$$\sum_{t \in \mathbb{N}} \ell(x(t), u(t)) \tag{6}$$

possibly in an average sense (or by introducing some discounting factor to
avoid infinite costs) and subject to the dynamic/operational constraints (1)
and (2).

To make the problem computationally more tractable and yet retain some of the
desirable economic benefits of dynamic programming, (6) is truncated to the
following cost functional:

$$J(z, v) = \sum_{k=0}^{N-1} \ell(z(k), v(k)) + V_f(z(N)) \tag{7}$$

where $z = [z(0), z(1), \ldots, z(N)] \in X^{N+1}$,
$v = [v(0), v(1), \ldots, v(N-1)] \in U^{N}$, and $V_f : X \to \mathbb{R}$
is a terminal weighting function whose properties will be specified later.

The virtual state/control pair $(z^*, v^*)$ at time $t$ is the solution (which
for the sake of simplicity we assume to be unique) of the following
optimization problem:

$$
\begin{aligned}
V(x(t)) = \min_{z,v}\ & J(z, v) \\
\text{subject to}\ & z(k+1) = f(z(k), v(k)), \\
& (z(k), v(k)) \in Z \quad \text{for } k \in \{0, 1, \ldots, N-1\}, \\
& z(0) = x(t), \quad z(N) \in X_f.
\end{aligned}
\tag{8}
$$

Notice that $z(0)$ is initialized at the value of the current state $x(t)$.
Thanks to this fact, $z^*$ and $v^*$ may be seen as functions of the current
state $x(t)$. At the same time, $z(N)$ is constrained to belong to the compact
set $X_f \subset X$, whose properties will be detailed in the next paragraph.

As customary in model predictive control, a state-feedback law is defined by
applying the first virtual control to the plant, that is, by letting
$u(t) = v^*(0)$ and restating, at the subsequent time instant, the same
optimization problem from the initial state $x(t+1)$, which, in the case of
exact match between plant and model, can be computed as $f(x(t), u(t))$.
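The receding horizon mechanism just described can be sketched on a toy problem. Everything specific in the sketch is an assumption made for illustration: the scalar system and stage cost are hypothetical, the terminal ingredients are $X_f = \{x^*\}$ and $V_f = 0$ with $(x^*, u^*) = (1, 0.5)$ taken as the best equilibrium of this toy problem, and problem (8) is solved by brute-force enumeration over a coarse input grid, with the last input computed so that the terminal equality holds. A real implementation would use a nonlinear programming solver instead.

```python
import itertools

# Hypothetical toy problem (not from the cited literature).
X_MIN, X_MAX, U_MIN, U_MAX = 0.0, 2.0, 0.0, 1.0
X_STAR, U_STAR = 1.0, 0.5           # best equilibrium of the toy problem
U_GRID = [i / 20.0 for i in range(21)]
HORIZON = 3
TOL = 1e-9


def f(x, u):
    return 0.5 * x + u


def stage_cost(x, u):
    return -x + 2.0 * u ** 2 + 0.5 * (x - 1.0) ** 2


def feasible(x, u):
    return (X_MIN - TOL <= x <= X_MAX + TOL) and (U_MIN - TOL <= u <= U_MAX + TOL)


def solve_empc(x, horizon=HORIZON):
    """Approximate solution of (8) with z(N) = x* and V_f = 0."""
    best_cost, best_inputs = None, None
    for free_inputs in itertools.product(U_GRID, repeat=horizon - 1):
        z, cost, ok = x, 0.0, True
        for v in free_inputs:
            if not feasible(z, v):
                ok = False
                break
            cost += stage_cost(z, v)
            z = f(z, v)
        if not ok:
            continue
        v_last = X_STAR - 0.5 * z   # enforces f(z, v_last) = x* for this f
        if not feasible(z, v_last):
            continue
        cost += stage_cost(z, v_last)
        if best_cost is None or cost < best_cost:
            best_cost, best_inputs = cost, list(free_inputs) + [v_last]
    if best_inputs is None:
        raise ValueError("problem (8) is infeasible from this state")
    return best_inputs, best_cost


def closed_loop(x0, steps):
    """Apply u(t) = v*(0) repeatedly; return visited (x, u) pairs and final x."""
    x, pairs = x0, []
    for _ in range(steps):
        inputs, _ = solve_empc(x)
        u = inputs[0]
        pairs.append((x, u))
        x = f(x, u)
    return pairs, x


pairs, x_final = closed_loop(x0=0.2, steps=25)
average_cost = sum(stage_cost(x, u) for x, u in pairs) / len(pairs)
print(f"final state {x_final:.4f}, average stage cost {average_cost:.4f}")
```

Only the first element of each optimized input sequence is ever applied; the remaining elements are discarded and recomputed at the next time instant from the new state.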

In the next paragraph, we provide details on how to design the "terminal
ingredients" (namely, $V_f$ and $X_f$) in order to endow the basic algorithm
(8) with important features such as recursive feasibility and a certain degree
of average performance and/or stability.

It is worth pointing out here that, in the context of economic MPC, it makes
sense to treat, together with pointwise-in-time constraints, asymptotic
average constraints on specified input/output variables. In tracking
applications, where the control algorithm guarantees asymptotic convergence of
the state to a feasible set point, the average asymptotic value of all
input/output variables necessarily matches that of the corresponding
equilibrium/control input pair. In economic MPC, the asymptotic regime
resulting in closed loop may, in general, fail to be an equilibrium;
therefore, it might be of interest to impose average constraints on the
system's inflows and outflows which are more stringent than those indirectly
implied by the fulfillment of (2). To this end, let the system's output be
defined as

$$y(t) = h(x(t), u(t)) \tag{9}$$

with $h(x, u) : Z \to \mathbb{R}^p$ a continuous map, and consider the convex
compact set $Y \subseteq \mathbb{R}^p$. We may define the set of asymptotic
averages of a bounded signal $y$ as follows:

$$
Av[y] = \left\{ \eta \in \mathbb{R}^p : \exists \{t_n\}_{n=1}^{\infty} :
t_n \to \infty \text{ as } n \to \infty \text{ and }
\eta = \lim_{n \to \infty} \frac{\sum_{k=0}^{t_n} y(k)}{t_n + 1} \right\}
$$

Notice that for converging signals, or even for periodic ones, $Av[y]$ always
is a singleton but may fail to be such for certain oscillatory regimes. An
asymptotic average constraint can be expressed as follows:

$$Av[y] \subseteq Y \tag{10}$$

where $y$ is the output signal as defined in (9).
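The remark that the set of asymptotic averages is a singleton for periodic signals but not for every bounded signal can be checked numerically. The two scalar signals below are hypothetical examples constructed for this purpose: a period-two signal, and a signal made of alternating blocks of zeros and ones whose lengths double each time.

```python
def running_averages(signal):
    """Return the averages (y(0) + ... + y(k)) / (k + 1) for every k."""
    total, averages = 0.0, []
    for count, value in enumerate(signal, start=1):
        total += value
        averages.append(total / count)
    return averages


def block_signal(n_blocks):
    """Blocks of length 1, 2, 4, ... alternating between the values 0 and 1."""
    signal = []
    for j in range(n_blocks):
        signal.extend([float(j % 2)] * 2 ** j)
    return signal


periodic = [float(k % 2) for k in range(2000)]
periodic_averages = running_averages(periodic)

blocks = block_signal(14)                     # 2**14 - 1 samples
block_averages = running_averages(blocks)
end_of_even_block = 2 ** 13 - 2               # last index of the block j = 12
end_of_odd_block = len(blocks) - 1            # last index of the block j = 13

print(f"periodic signal: {periodic_averages[-1]:.4f}")
print(f"block signal: {block_averages[end_of_even_block]:.4f} "
      f"and {block_averages[end_of_odd_block]:.4f}")
```

For the periodic signal the running average settles at a single value, whereas for the block signal different subsequences of times give averages near 1/3 and near 2/3, so both values belong to $Av[y]$.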

Basic Theory 

The main theoretical results in support of the 
approach discussed in the previous paragraph are 
discussed below. Three fundamental aspects are 
treated: 

• Recursive feasibility and constraint satisfac¬ 
tion 

• Asymptotic performance 

• Stability and convergence 
Feasibility and Constraints

The starting point of most model predictive control techniques is to ensure
recursive feasibility, namely, the fact that feasibility of the problem (8) at
time 0 implies feasibility at all subsequent times, provided there is no
mismatch between the true plant and its model (1). This is normally achieved
by making use of a suitable notion of control invariant set, which is used as
a terminal constraint in (8). Economic model predictive control is not
different in this respect, and either one of the following sets of assumptions
is sufficient to ensure recursive feasibility:

1. Assumption 1: Terminal constraint

$$X_f = \{x^*\}, \qquad V_f = 0$$

2. Assumption 2: Terminal penalty function

There exists a continuous map $\kappa : X_f \to U$ such that

$$(x, \kappa(x)) \in Z \quad \forall x \in X_f$$

$$f(x, \kappa(x)) \in X_f \quad \forall x \in X_f$$

The following holds:

Theorem 1 Let $x(0)$ be a feasible state for (8) and assume that either
Assumption 1 or Assumption 2 holds. Then, the closed-loop trajectory $x(t)$
resulting from receding horizon implementation of the feedback
$u(t) = v^*(0)$ is well defined for all $t \in \mathbb{N}$ (i.e., $x(t)$ is a
feasible initial state of (8) for all $t \in \mathbb{N}$) and the resulting
closed-loop variables $(x(t), u(t))$ fulfill the constraints in (2).

The proof of this theorem can be found in Angeli et al. (2012) and Amrit et
al. (2011), for instance. When constraints on asymptotic averages are of
interest, the optimization problem (8) can be augmented by the following
constraint:

$$\sum_{k=0}^{N-1} h(z(k), v(k)) \in Y_t, \tag{11}$$

provided $Y_t$ is recursively defined as

$$Y_{t+1} = Y_t \oplus Y \oplus \{-h(x(t), u(t))\} \tag{12}$$

where $\oplus$ denotes Pontryagin's set sum
($A \oplus B := \{c : \exists a \in A, \exists b \in B : c = a + b\}$). The
sequence is initialized as $Y_0 = NY \oplus Y_{00}$, where $Y_{00}$ is an
arbitrary compact set in $\mathbb{R}^p$ containing 0 in its interior. The
following result can be proved.

Theorem 2 Consider the optimization problem (8) with additional constraints
(11), and assume that $x(0)$ is a feasible initial state. Then, provided a
terminal equality constraint is adopted, the closed-loop solution $x(t)$ is
well defined and feasible for all $t \in \mathbb{N}$ and the resulting
closed-loop variable $y(t) = h(x(t), u(t))$ fulfills the constraint (10).

Extending average constraints to the case of economic MPC with terminal
penalty function is possible but outside the scope of this brief tutorial. It
is worth mentioning that the set $Y_{00}$ plays the role of an initial
allowance that is shrunk or expanded depending on how close the closed-loop
output signals are to the prescribed region. In particular, $Y_{00}$ can be
selected a posteriori (after computation of the optimal trajectory) just for
$t = 0$, so that the feasibility region of the algorithm is not affected by
the introduction of average asymptotic constraints.

Asymptotic Average Performance

Since economic MPC does not necessarily lead to converging solutions, it is
important to have bounds which estimate the asymptotic average performance of
the closed-loop plant. To this end, the following dissipation inequality is
needed for the approach with terminal penalty function:

$$V_f(f(x, \kappa(x))) \le V_f(x) - \ell(x, \kappa(x)) + \ell(x^*, u^*) \tag{13}$$

which shall hold for all $x \in X_f$. We are now ready to state the main bound
on the asymptotic performance:

Theorem 3 Let $x(0)$ be a feasible state for (8) and assume that either
Assumption 1 or Assumption 2 together with (13) holds. Then, the closed-loop
trajectory $x(t)$ resulting from receding horizon implementation of the
feedback $u(t) = v^*(0)$ is well defined for all $t \in \mathbb{N}$ and
fulfills

$$\limsup_{T \to +\infty} \frac{\sum_{t=0}^{T-1} \ell(x(t), u(t))}{T} \le \ell(x^*, u^*). \tag{14}$$



The proof of this fact can be found in Angeli et al. 
(2012) and Amrit et al. (2011). When periodic 
solutions are known to outperform, in an average 
sense, the best equilibrium/control pair, one may 
replace terminal equality constraints by periodic 
terminal constraints (see Angeli et al. 2012). This 
leads to an asymptotic performance at least as 
good as that of the solution adopted as a terminal 
constraint. 


Stability and Convergence 

It is well known that the cost-to-go $V(x)$ as defined in (8) is a natural
candidate Lyapunov function for the case of tracking MPC. In fact, the
following estimate holds along solutions of the closed-loop system:

$$V(x(t+1)) \le V(x(t)) - \ell(x(t), u(t)) + \ell(x^*, u^*). \tag{15}$$

This shows, thanks to inequality (5), that $V(x(t))$ is nonincreasing. Owing
to this, stability and convergence can be easily achieved under mild
additional technical assumptions. While property (15) holds for economic MPC,
both in the case of terminal equality constraint and terminal penalty
function, it is no longer true that (5) holds. As a matter of fact, $x^*$
might even fail to be an equilibrium of the closed-loop system, and hence,
convergence and stability cannot be expected in general.

Intuitively, however, when the most profitable operating regime is an
equilibrium, the average performance bound provided by Theorem 3 seems to
indicate that some form of stability or convergence to $x^*$ could be
expected. This is true under an additional dissipativity assumption which is
closely related to the property of optimal operation at steady state.

Definition 1 A system is strictly dissipative with respect to the supply
function $s(x, u)$ if there exist a continuous function
$\lambda : X \to \mathbb{R}$ and a function $\rho : X \to \mathbb{R}$,
positive definite with respect to $x^*$, such that for all $(x, u)$ in
$X \times U$, it holds:

$$\lambda(f(x, u)) \le \lambda(x) + s(x, u) - \rho(x). \tag{16}$$

The next result highlights the connection between 
dissipativity of the open-loop system and stability 
of closed-loop economic MPC. 

Theorem 4 Assume that either Assumption 1 or Assumption 2 together with (13)
holds. Let the system (1) be strictly dissipative with respect to the supply
function $s(x, u) = \ell(x, u) - \ell(x^*, u^*)$ as in Definition 1 and assume
there exists a neighborhood of feasible initial states containing $x^*$ in its
interior. Then, provided $V$ is continuous at $x^*$, $x^*$ is an
asymptotically stable equilibrium with basin of attraction equal to the set of
feasible initial states.

See Angeli et al. (2012) and Amrit et al. (2011) for proofs and discussions.
Convergence results are also possible for the case of economic MPC subject to
average constraints. Details can be found in Müller et al. (2013a).

It is worth mentioning here that finding a function satisfying (16) (should
one exist) is in general a hard task (especially for nonlinear systems and/or
nonconvex stage costs); it is akin to the problem of finding a Lyapunov
function, and therefore general construction methods do not exist. Let us
emphasize, however, that while existence of a storage function $\lambda$ is a
sufficient condition to ensure convergence of closed-loop economic MPC,
formulation and resolution of the optimization problem (8) can be performed
irrespective of any explicit knowledge of such a function. Also, we point out
that existence of $\lambda$ as in Definition 1 and Theorem 4 is only possible
if the optimal infinite-horizon regime of operation for the system is an
equilibrium.
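When a candidate storage function is available, inequality (16) can at least be checked numerically on a grid. The sketch below does this for a hypothetical scalar system and stage cost with best equilibrium $(x^*, u^*) = (1, 0.5)$, using the linear candidate $\lambda(x) = 2x$ and $\rho(x) = 0.5(x - 1)^2$; all of these are assumptions chosen so that the inequality can also be verified by hand, and a grid check is of course not a proof.

```python
# Hypothetical toy problem (not from the cited literature).
X_MIN, X_MAX, U_MIN, U_MAX = 0.0, 2.0, 0.0, 1.0
X_STAR, U_STAR = 1.0, 0.5


def f(x, u):
    return 0.5 * x + u


def stage_cost(x, u):
    return -x + 2.0 * u ** 2 + 0.5 * (x - 1.0) ** 2


def storage(x):
    """Candidate storage function lambda(x) (an assumption of this sketch)."""
    return 2.0 * x


def rho(x):
    """Candidate function, positive definite with respect to x*."""
    return 0.5 * (x - X_STAR) ** 2


def min_dissipation_slack(grid_points=101):
    """Smallest value of lambda(x) + s(x,u) - rho(x) - lambda(f(x,u)) on a grid."""
    l_star = stage_cost(X_STAR, U_STAR)
    worst = None
    for i in range(grid_points):
        x = X_MIN + (X_MAX - X_MIN) * i / (grid_points - 1)
        for j in range(grid_points):
            u = U_MIN + (U_MAX - U_MIN) * j / (grid_points - 1)
            supply = stage_cost(x, u) - l_star
            slack = storage(x) + supply - rho(x) - storage(f(x, u))
            if worst is None or slack < worst:
                worst = slack
    return worst


worst_slack = min_dissipation_slack()
print(f"smallest slack on the grid: {worst_slack:.2e}")
```

For this example the slack equals $2(u - 0.5)^2$, which is nonnegative, so (16) holds on the whole constraint set and the toy system is strictly dissipative with respect to the supply function of Theorem 4.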


Summary and Future Directions 

Economic model predictive control is a fairly recent and active area of
research with great potential in those engineering applications where economic
profitability, rather than tracking performance, is crucial.

The technical literature is rapidly growing in application areas such as
chemical engineering (see Heidarinejad 2012) or power systems engineering (see
Hovgaard et al. 2010; Müller et al. 2013a), where the system's outputs are in
fact physical outflows which can be stored with relative ease.

We only dealt with the basic theoretical de¬ 
velopments and would like to provide pointers to 
interesting recent and forthcoming developments 
in this field: 

• Generalized terminal constraints: possibility of enlarging the set of
feasible initial states by using arbitrary equilibria as terminal constraints,
possibly to be updated online in order to improve asymptotic performance (see
Fagiano and Teel 2012; Müller et al. 2013b).

• Economic MPC without terminal constraints: removing the need for terminal
constraints by taking a sufficiently long control horizon is an interesting
possibility offered by standard tracking MPC. This is also possible for
economic MPC, at least under suitable technical assumptions, as investigated
in Grüne (2012, 2013).

• The basic developments presented in the 
previous paragraph only deal with systems 
unaffected by uncertainty. This is a severe 
limitation of current approaches and it is to be 
expected that, as for the case of tracking MPC, 
a great deal of research in this area could be 
developed in the future. In particular, both 
deterministic and stochastic uncertainties are 
of interest. 

Cross-References 

► Model-Predictive Control in Practice 

► Optimization Algorithms for Model Predictive 
Control 

Recommended Reading 

The papers Amrit et al. (2011), Angeli et al. (2011, 2012), Diehl et al.
(2011), Müller et al. (2013a), and Rawlings et al. (2008) set out the basic
technical tools for performance and stability analysis of EMPC. Readers
interested in the general theme of optimization of a system's economic
performance and its relationship with classical turnpike theory in economics
may refer to Rawlings and Amrit (2009). Potential applications of EMPC are
described in Hovgaard et al. (2010), Heidarinejad (2012), and Ma et al.
(2011), while Rawlings et al. (2012) is an up-to-date survey on the topic.
Fagiano and Teel (2012) and Grüne (2012, 2013) deal with the issue of
relaxation or elimination of terminal constraints, while Müller et al. (2013b)
explore the possibility of adaptive terminal costs and generalized equality
constraints.
Abstract

Power electronics and their applications for electric energy transfer and
control are introduced. The fundamentals of power electronics are presented,
including the commonly used semiconductor devices and power converter
circuits. Different types of power electronic controllers for electric power
generation, transmission and distribution, and consumption are described. The
advantages of power electronics over traditional electromechanical or
electromagnetic controllers are explained. The future directions for power
electronic application in electric power systems are discussed.


Keywords 

Electric energy control; Electric energy transfer; 
Power electronics 


Introduction 

Modern society runs on electricity or electric energy. Electric energy
generally must be transferred before consumption since the energy sources,
such as thermal power plants, hydro dams, and wind farms, are often some
distance away from the loads. In addition, electric energy needs to be
controlled, since energy transfer and use often require electricity in a form
different from the raw form generated at the source. Examples are the voltage
magnitude and frequency: for long-distance transmission, the voltage needs to
be stepped up at the sending end to reduce the energy loss along the lines and
then stepped down at the receiving end for users; for many modern consumer
devices, DC voltage is needed and is obtained by converting the 50 or 60 Hz
utility power. Note that electric energy transfer and control is often used
interchangeably with electric power transfer and control. This is because
modern electric power systems have very limited energy storage, and the energy
generated must be consumed at the same time.
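The benefit of stepping up the voltage for transmission follows from the resistive loss $I^2 R$: for a given transferred power, the current is inversely proportional to the voltage. The short calculation below illustrates this with hypothetical numbers (a 10 MW transfer over a line with 5 ohm resistance) and a deliberately simplified single-conductor, unity-power-factor model.

```python
def line_loss_w(power_w, voltage_v, resistance_ohm):
    """Resistive loss I^2 * R of a line carrying power_w at voltage_v.

    Simplified model: single conductor, unity power factor.
    """
    current_a = power_w / voltage_v
    return current_a ** 2 * resistance_ohm


power_w = 10.0e6          # transferred power (hypothetical)
resistance_ohm = 5.0      # line resistance (hypothetical)

loss_at_11_kv = line_loss_w(power_w, 11.0e3, resistance_ohm)
loss_at_110_kv = line_loss_w(power_w, 110.0e3, resistance_ohm)

print(f"loss at 11 kV:  {loss_at_11_kv / 1e6:.2f} MW")
print(f"loss at 110 kV: {loss_at_110_kv / 1e3:.1f} kW")
```

Raising the voltage by a factor of ten reduces the loss by a factor of one hundred in this model, which is why the voltage is stepped up at the sending end and stepped down again near the loads.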

Since the beginning of the electricity era, electric energy transfer and
control technologies have been an essential part of electric power systems.
Many types of equipment were invented and applied for these purposes. The
commonly used equipment includes electric transmission and distribution lines,
generators, transformers, switchgear, inductors or reactors, and capacitor
banks. The traditional equipment has limited control capability. Many devices
cannot be controlled at all or can only be connected or disconnected with
mechanical switches; others can be controlled only over a limited range, such
as transformers with tap changers. Even with fully controllable equipment such
as generators, the control dynamics are relatively slow due to the
electromechanical or magnetic nature of the controller.

Power electronics are based on semiconductor devices. These devices are
derived from the transistors and diodes used in microelectronic circuits, with
the additional capability of handling large power. Due to their electronic
nature, power electronic devices are much more flexible and faster than their
electromechanical or electromagnetic counterparts for electric energy transfer
and control. Since the advent of power electronics in the 1950s, they have
steadily gained ground in power system applications. Today, power electronic
controllers are an important part of the equipment for electric energy
transfer and control. Their role is growing rapidly with the continuous
improvement of power electronic technologies.

Fundamentals of Power Electronics 

Unlike semiconductor devices in microelectronics,
power electronic devices act only as switches for
the desired control functions, so that they incur
minimum losses when they are either on (closed) or
off (open). As a result, power electronic
controllers are basically switching circuits. The
semiconductor switches are therefore the most
important elements of power electronic controllers.
Since the 1950s, many different types of power
semiconductor switches have been developed, and a
suitable type can be selected based on the
application.
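Why switching operation keeps losses low can be seen from a first-order loss estimate: an ideal switch carries no current when off and has no voltage drop when on, so the product of voltage and current is zero in both states. The sketch below is a rough model for a real device, assuming a constant on-state voltage and fixed turn-on and turn-off energies per switching cycle; the function name and the numbers are hypothetical.

```python
def switch_losses(v_on, i_load, duty, e_on_j, e_off_j, f_sw_hz):
    """Return (conduction_loss_w, switching_loss_w) for one switch.

    Assumptions: constant on-state voltage v_on, constant load current,
    and fixed turn-on/turn-off energies per switching cycle.
    """
    conduction = v_on * i_load * duty          # loss while the switch is closed
    switching = (e_on_j + e_off_j) * f_sw_hz   # loss from the on/off transitions
    return conduction, switching


# Hypothetical device: 2 V on-state drop, 100 A, 50 % duty,
# 10 mJ per turn-on and per turn-off, switched at 5 kHz.
conduction_w, switching_w = switch_losses(2.0, 100.0, 0.5, 10e-3, 10e-3, 5e3)
print(conduction_w, switching_w)
```

The estimate also shows the trade-off behind device selection: the switching loss grows in proportion to the switching frequency, while the conduction loss does not.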

The performance of the power semiconductor 
devices is mainly characterized by their voltage 
and current ratings, conduction or on-state loss, 
as well as the switching speed (or switching 
frequency capability) and associated switching 
loss. Main types of power semiconductor devices 
are listed with their symbols and state-of-the-art 
rating and frequency range shown in Table 1 : 

• Power diode - a two-terminal device with
characteristics similar to those of diodes used in
microelectronics but with higher voltage and power
ratings.

• Thyristor - also called SCR (silicon-controlled
rectifier). Unlike a diode, a thyristor
is a three-terminal device with an additional
gate terminal. It can be turned on by a current
pulse through the gate but can only be turned
off when the main current is brought to zero by
external means. The thyristor has low conduction
loss but slow switching speed.

• GTO - stands for gate-turn-off thyristor. A GTO
can be turned on like a regular thyristor
and can also be turned off with a large
negative gate current pulse. The GTO has been
largely replaced by the IGBT and IGCT due to its
complex gate driving needs and slow switching
speed.

• Power BJT - similar to the bipolar transistor
used in microelectronics; it requires a sustained
base current to remain on. It has been
replaced by the IGBT and power MOSFET, which have
simpler gate signals and faster switching
speed.

• Power MOSFET - similar to the metal-oxide
semiconductor field effect transistor used in
microelectronics; it can be turned on and
off with a gate voltage signal. It is the
fastest device available but has relatively
high conduction loss and relatively low
voltage and power ratings.

• IGBT - stands for insulated-gate bipolar
transistor. Unlike a regular BJT, it can be turned
on and off with a gate voltage like a MOSFET.
It has relatively low conduction loss and
fast switching speed. The IGBT is becoming the
workhorse of power electronics for high-power
applications.


Electric Energy Transfer and Control via Power Electronics, Table 1 Commonly used Si-based power semiconductor
devices and their ratings

Type           Voltage                       Current   Switching frequency
Power diode    Max 80 kV, typical < 10 kV    10 kA     Various
Thyristor      Max 8 kV                      4.5 kA    AC line frequency
GTO            Max 10 kV                     6.5 kA    < 500 Hz
Power MOSFET   Max 4.5 kV, typical < 600 V   1.6 kA    Tens of kHz to MHz
IGBT           Max 6.5 kV, typical > 600 V   2.4 kA    1 kHz to tens of kHz
IGCT           Max 10 kV, typical > 4.5 kV   6.5 kA    < 2 kHz


• IGCT - stands for integrated-gate-commutated 
thyristor. It is basically a GTO with an 
integrated gate drive circuit allowing a hard 
driven turnoff. It therefore has faster switching 
speed than regular GTO but slower than IGBT. 

Except for diodes, all devices above can
be turned on and/or off through a gate signal, so
they are called active switches, while diodes are
called passive switches.

With different types of power semiconduc¬ 
tors, many power electronics circuits have been 
developed. Based on their functions, they can be 
classified as: 

• Rectifier - rectifiers convert AC to DC.
Depending on the AC source, rectifiers can be three
phase or single phase; depending on the device
type, they can be passive (diode based), phase
controlled (thyristor controlled), or actively
switched.

• Inverter - inverters convert DC to AC. They
can again be three phase or single phase.
Inverters generally require active switching
devices.

• DC-DC converter - also called choppers, DC-DC
converters convert one DC voltage level
to another. Sometimes they also contain
magnetic isolation. DC-DC converters can
have unidirectional or bidirectional power
flow and generally require active switching
devices.

• AC-AC converter - directly converts one AC
waveform to another, changing either only the
voltage magnitude or both magnitude and frequency.
The former can also be called an AC switch, and the
latter a frequency changer. Active devices
are needed for these types of converters.
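The difference between a passive and a phase-controlled rectifier in the list above can be made concrete with the textbook expression for the average DC output of a single-phase full-bridge rectifier, V_dc = (2 V_peak / pi) cos(alpha). The sketch below assumes ideal devices and continuous load current; the function name and the 230 V example are hypothetical.

```python
import math


def full_bridge_average_dc(v_rms, firing_angle_deg=0.0):
    """Average DC output of an ideal single-phase full-bridge rectifier.

    Assumptions: ideal devices and continuous load current.
    A firing angle of 0 degrees corresponds to a passive diode bridge.
    """
    v_peak = math.sqrt(2.0) * v_rms
    return (2.0 * v_peak / math.pi) * math.cos(math.radians(firing_angle_deg))


diode_bridge = full_bridge_average_dc(230.0)            # passive rectifier
phase_controlled = full_bridge_average_dc(230.0, 60.0)  # thyristors fired at 60 degrees
print(diode_bridge, phase_controlled)
```

Delaying the firing angle of the thyristors lowers the average output voltage, which is the control handle that a diode bridge lacks.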

There are a variety of converter topologies 
for each type of the converters listed above. 
The most commonly used basic topologies for 
power system applications are shown in Fig. 1. 
These basic topologies can be expanded by
connecting devices and/or converters in parallel or
in series to achieve higher current and voltage ratings.
Other variations such as multilevel converters are 
also popular for high-voltage applications using 
lower-voltage rating devices. 

It should be noted that passive components, 
i.e., inductors and capacitors, are essential parts 
of power electronic converters. In fact, power
electronic converters transfer or control the
electric energy by storing it temporarily in
inductors or capacitors while reformatting the original
voltage or current waveform through switching 
actions. The other key function of the passives is 
filtering the harmonics caused by switching. 
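The role of the passives can be illustrated with the ideal buck (step-down) DC-DC converter, a standard textbook case that is not necessarily one of the topologies of Fig. 1. In continuous conduction the output voltage is V_out = D V_in, and the inductor that temporarily stores the energy carries a peak-to-peak ripple current of (V_in - V_out) D / (L f_sw). The sketch below assumes ideal components; all names and numbers are hypothetical.

```python
def buck_steady_state(v_in, duty, inductance_h, f_sw_hz):
    """Return (v_out, inductor_ripple_a) of an ideal buck converter.

    Assumptions: ideal switch and diode, continuous inductor current,
    and a negligible output voltage ripple.
    """
    v_out = duty * v_in
    on_time_s = duty / f_sw_hz
    # While the switch is on, the inductor sees (v_in - v_out) for on_time_s.
    ripple_a = (v_in - v_out) * on_time_s / inductance_h
    return v_out, ripple_a


# Hypothetical example: 48 V input, 25 % duty, 100 uH, 100 kHz.
v_out, ripple_a = buck_steady_state(48.0, 0.25, 100e-6, 100e3)
print(v_out, ripple_a)
```

A higher switching frequency or a larger inductance reduces the ripple, which is one reason why faster devices allow smaller passive components.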

Power Electronic Controller Types for 
Energy Transfer and Control 

For almost all traditional non-power-electronic
equipment for electric energy transfer and
control, there can be a corresponding power
electronics-based counterpart, often with better
controllability. However, power electronic
equipment can be more expensive and is therefore
used only when it provides better overall
performance and cost benefits. In other cases,
only power electronic equipment can achieve the
required control functions.

Electric Energy Transfer and Control via Power Electronics, Fig. 1 Commonly used basic power electronics
converter topologies (only one phase shown for the AC switch)

The power electronic controllers can be
categorized by their use in energy generation,
delivery, and consumption. For generation, thermal
and hydro generators both use synchronous machines
with excitation windings on the rotor, which
require DC current. A thyristor-based rectifier,
called an exciter, is generally used for this purpose.
Wind turbine generators usually use a back-to-back
voltage source inverter (VSI) to interface to the
AC grid, and PV solar sources use a DC-DC
converter cascaded with a VSI.

Power electronic controllers for transmission
and distribution include so-called
flexible AC transmission systems (FACTS) and
high-voltage DC transmission (HVDC). Some of
the more commonly used controllers and their
functions and circuit topologies are listed in
Table 2.

The main power electronic controllers for
loads include variable speed motor drives;
electronic ballasts for fluorescent lights and power
supplies for LEDs; various power supplies for
computer, IT, and other electronic loads; and
chargers for electric vehicles. The percentage
of power-electronics-controlled loads in power
systems has been steadily increasing. Power
electronics can generally result in improved
performance and efficiency.


Future Directions 

Power electronics have progressed steadily
since the invention of thyristors in the 1950s.
Progress has been made in all aspects, including
semiconductor devices, passives, circuits, control,
and system integration, leading to converter systems
with better performance, higher efficiency, higher
power density, higher reliability, and lower
cost. Because of this progress, power
electronics applications in power systems have
become more and more widespread. However,
in general, power electronic controllers are
still not sufficiently cost-effective, reliable,
or efficient. Many improvements are needed
and expected, especially in the following
areas:


• Semiconductor devices - Devices used today 
are almost exclusively based on silicon. The 
emerging devices based on wide-bandgap 
materials such as SiC and GaN are expected 
to revolutionize power electronics with their 
capabilities of higher voltage, lower loss, 
faster switching speed, higher temperature, 
and smaller size. 

• Power electronic converters - More cost-effective
and reliable converters will be
developed as a result of better devices,
passive components, and circuit structures.
Modular, distributed, and hybrid approaches (the
latter combining power electronics with
non-power-electronic equipment) are expected to
result in better overall benefits.

• Enhanced functions - Power electronic
controllers can be designed to have multiple
functions in the system. For example, wind and PV
solar inverters can provide reactive power to
the grid in addition to transferring real energy.
Today, power electronic controllers are mostly
locally controlled. With better measurement
and communication technologies, they may be
controlled over a wide area to support
system-level functions.

• New applications - New applications for
future power systems include DC grids based
on multiterminal HVDC and energy storage.
Critical technologies include cost-effective
and efficient DC transformers and DC circuit
breakers. Power electronics will play key roles
in these technologies.

Cross-References 

► Cascading Network Failure in Power Grid
► Coordination of Distributed Energy Resources
for Provision of Ancillary Services: Architec¬

Abstract

Engine control is the enabling technology for
the efficiency, performance, reliability, and
cleanliness of modern vehicles for a wide variety of
uses and users. It is also of paramount importance
for many other engine applications, like
power plants. Engines are essentially chemical
reactors, and the core task of engine control
consists in preparing and starting the reaction
(mixing the reactants and igniting the mixture),
while the reaction itself is not controlled. The
technical challenge derives from the combination
of high complexity, wide range of conditions of
use, performance requirements, significant time
delays, and constraints on the choice
of components. In practice, engine control is to a
large extent feed-forward control, feedback loops
being used either for low-level control or for
updating the feed-forward. Industrial engine control
is based on very complex structures calibrated
experimentally, but there is a growing interest
in model-based control with stronger feedback
action, supported by the breakthrough of new
computational and communication possibilities,
as well as the introduction of new sensors.

Introduction

Most vehicles are moved by internal combustion
engines (ICE), whose key function is the
conversion of chemical into mechanical energy,
basically by oxidation, e.g., in the case of propane

$$\mathrm{C_3H_8} + 5\,\mathrm{O_2} = 3\,\mathrm{CO_2} + 4\,\mathrm{H_2O} + 46.3\ \mathrm{MJ/kg} \qquad (1)$$

The chemical energy is first transformed into
heat and then converted by the ICE into
mechanical energy (Heywood 1988). The key task
of engine control (Guzzella and Onder 2010;
Kiencke and Nielsen 2005) is to make sure that
the reactants (fuel and oxygen) meet in the right
proportion (“mixture formation”) and that the
combustion is started (or “ignited”) to deliver the
required torque at the engine crankshaft. Several
combustion processes are known, the most common
ones being Otto and Diesel. In the first
kind (also called SI for spark ignited), the mixture
is prepared outside the combustion chamber
and combustion is ignited by a spark, while in
the second one fuel is injected directly into the
combustion chamber and combustion is ignited
by compression (CI, compression ignited). GDI
(gasoline direct injection) is a variant of SI
engines with direct fuel injection as in CI but spark
ignition as in SI.
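The “right proportion” of the reactants follows directly from Eq. (1): one mole of propane needs five moles of oxygen. The sketch below turns this into mass ratios. The atomic masses are standard values; the oxygen mole fraction of 21 % and the mean molar mass of air of 28.97 g/mol are assumed round figures, and the variable names are hypothetical.

```python
# Standard atomic masses in g/mol.
M_C, M_H, M_O = 12.011, 1.008, 15.999

M_PROPANE = 3 * M_C + 8 * M_H   # C3H8
M_O2 = 2 * M_O
O2_MOLES_PER_FUEL_MOLE = 5.0    # from Eq. (1)

# Assumed properties of ambient air.
O2_MOLE_FRACTION_AIR = 0.21
M_AIR = 28.97                   # g/mol, mean molar mass of dry air

# Mass of oxygen needed per unit mass of propane.
o2_per_fuel_mass = O2_MOLES_PER_FUEL_MOLE * M_O2 / M_PROPANE

# Mass of air needed per unit mass of propane (stoichiometric air-fuel ratio).
air_moles_per_fuel_mole = O2_MOLES_PER_FUEL_MOLE / O2_MOLE_FRACTION_AIR
stoich_air_fuel_ratio = air_moles_per_fuel_mole * M_AIR / M_PROPANE

print(round(o2_per_fuel_mass, 2), round(stoich_air_fuel_ratio, 1))
```

Under these assumptions, roughly 3.6 kg of oxygen, or somewhat more than 15 kg of air, must be supplied for every kilogram of propane.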

Unfortunately, the chemical equation (1) is not
the whole truth. Indeed, the way the mixture is
prepared and ignited affects not only the efficiency
of the conversion from thermal into mechanical energy
but also secondary reactions, like pollutant
formation, and other aspects, like noise, vibration,
and harshness (NVH), and mechanical fatigue
and thus life expectancy. Furthermore, driveability
requirements are primarily determined by the
ability of an ICE to change its operating
point quickly, and this sets additional requirements
for the engine control. These requirements have to
be met for all vehicles in spite of production
variability and under all relevant operating
conditions, including all driver, road, traffic, and
weather conditions.

As first-principles models are often not
available, or are very time-consuming to tune and
seldom precise enough, engine control is based on
very complex heuristic descriptions which can
be tuned experimentally and even automatically
(Schoggl et al. 2002); a modern engine control
unit (ECU) can include up to 40,000 labels
(parameters or maps). This structure is mainly
feed-forward, with feedback loops typically used for
the control of actuators. It is primarily calibrated
under laboratory conditions but includes adaptation
loops designed to correct parameters to take into
account production and wear effects. Figure 1 shows an
engine test bench setup with the engine control
unit (ECU) and a calibration system.


The Target System 

Figure 2 shows the basic setup of an ICE as CI
and SI. In both cases, the main components of an
ICE are the fuel path, air path, combustion chamber,
and exhaust aftertreatment system.

Roughly speaking, ICEs exhibit three time
scales. Changes in the setting of the fuel path,
which is responsible for delivering the fuel to the
combustion chamber, act very fast for CI and GDI
engines (e.g., 50 Hz) and rather fast for SI engines
(10 Hz or more). The same is not true for the
air path, which brings the gas mixture (fresh air
and possibly recirculated exhaust gas) into the
combustion chamber and is the slowest system
(typically in the range of 0.5-2 Hz). In SI and
GDI engines, spark timing can also be changed for
each combustion. Still faster dynamics are
associated with the combustion process itself.
Pressure sensors with the required dynamics to
monitor it are being introduced in a growing
number of applications, but until now no suitable
actuators are available for its closed-loop control.
The torque demand typically changes with the
vehicle dynamics, which are usually still slower
than the air path.


The Control Tasks 

The high-level control task can be defined as the
minimization of the average fuel consumption
while providing the required torque and
respecting the constraints on emissions (i.e., nitrogen
oxides (NOx) and particulate matter
(PM)), noise, temperature, etc. The legislators in
different countries have defined test procedures,
including a specified road profile and
corresponding emission limits. Figure 3 shows the
progressive reduction of the limits and the speed profile
used to assess these values.

Even if fuel consumption is not yet limited by
law, the associated control problem can be stated
as a constrained optimal control problem:



$$\min_{u(t)} \int_0^{1120} \dot{m}_f \, dt \qquad (2)$$

so that

$$v(t) = v_{dem}(t) \pm \Delta v \qquad (3)$$

and

$$\int_0^{1120} q_i \, dt < Q_i \qquad (4)$$

where $u(t)$ are all available control inputs, 1120
is the duration of the European cycle, $v_{dem}(t)$
the corresponding speed, $\Delta v$ the speed tolerance,
$\dot{m}_f$ the instantaneous fuel consumption, $q_i$ each
limited quantity (e.g., NOx), and $Q_i$ the
corresponding limit for the whole test. In practice,
other criteria must be considered as well, like
NVH. Even this simplified problem, however, is never
solved using the standard tools of optimal control,
partly because of the nonlinearity (and the resulting
nonconvexity) of the problem, but even more because of
the lack of explicit models of sufficient quality
relating the inputs to the target quantities, especially
combustion-dependent quantities like emissions.
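Although the problem of Eqs. (2)-(4) is not solved directly, its cost and constraints can at least be evaluated for a recorded test run. The sketch below does this for sampled signals using trapezoidal integration; the function names, the signal names, the units, and all numbers are hypothetical, and the code checks feasibility only, without attempting any optimization.

```python
def trapezoid(samples, dt):
    """Trapezoidal integral of equally spaced samples."""
    return dt * (sum(samples) - 0.5 * (samples[0] + samples[-1]))


def evaluate_cycle(fuel_rate, speed, speed_dem, emission_rates, limits, dt, dv):
    """Evaluate the cost (2) and the constraints (3) and (4) for one test run.

    Returns (total_fuel, feasible, emission_totals).
    """
    total_fuel = trapezoid(fuel_rate, dt)                            # cost (2)
    speed_ok = all(abs(v - vd) <= dv for v, vd in zip(speed, speed_dem))  # (3)
    totals = {name: trapezoid(q, dt) for name, q in emission_rates.items()}
    emissions_ok = all(totals[name] < limits[name] for name in limits)    # (4)
    return total_fuel, speed_ok and emissions_ok, totals


# Hypothetical run: 1120 s sampled at 1 s with constant rates.
n = 1121
fuel, feasible, totals = evaluate_cycle(
    fuel_rate=[1e-3] * n,                # kg/s
    speed=[10.0] * n,                    # measured speed
    speed_dem=[10.5] * n,                # demanded speed
    emission_rates={"NOx": [1e-4] * n},  # g/s
    limits={"NOx": 0.2},                 # g over the whole test
    dt=1.0,
    dv=2.0,
)
print(fuel, feasible, totals)
```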

In practice, several simpler subproblems are
solved separately and tuned so that the overall
result is also sufficient in terms of the general
problem. In the following, we concentrate on the
main high-level tasks, omitting many others, e.g.,
all the control loops required for the correct
operation of the individual actuators.

Engine Control, Fig. 1 Light duty engine test bench with ECU and calibration system

Engine Control, Fig. 2 Basic system scheme of CI (left)
and SI (right) engines: 1a Control of the injector opening,
1b Injection premixing with air; 2 Measurement of the
engine temperature; 3 Measurement of the engine rotational
speed; 4 Measurement of oxygen concentration in the
exhaust gases; 5a EGR valve, 5b throttle valve, 6a low-pressure
EGR valve, 7a Diesel exhaust aftertreatment
(DOC, SCR, DPF), 7b SI engine aftertreatment (three-way
catalyst), 8 SCR dosing control

Engine Control, Fig. 3 Left: different steps of limits of emissions per km as defined by the European Union (Euro 1
introduced in 1991 and Euro 6 from 2014). Right: New European Driving Cycle (NEDC)


Air Path Control 

The main source of oxygen for the reaction of
Eq. (1) is ambient air, which contains about 21 %
oxygen. The engine, essentially a volumetric air
pump, aspirates an air flow roughly proportional to
the cylinder volume and the rotational speed of
the engine. The amount of oxygen entering the
combustion chamber, however, also depends
on temperature, pressure, and moisture. This flow
can be reduced (as in standard SI engines)
by throttling, i.e., by adding a flow
resistance between the air intake and the
combustion chamber, or increased by compressing
the fresh air, most commonly by turbocharging
(especially in CI engines). A turbocharger
consists essentially of a turbine, which transforms
part of the enthalpy of the exhaust gas into
mechanical power, and a compressor, driven by this
power to compress the fresh air on its way to
the combustion chamber, thus increasing both
its density and temperature. Turbocharger
operation is typically controlled either directly (for
instance, with variable vane angles) or indirectly,
by bypass valves which divert part of the gas flow
around the turbine.
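The view of the engine as a volumetric pump leads to the widely used speed-density estimate of the aspirated air mass flow. The sketch below assumes a four-stroke engine (one intake stroke per cylinder every two revolutions), dry air treated as an ideal gas, and a constant volumetric efficiency; the function name, the efficiency value, and the example numbers are hypothetical.

```python
R_AIR = 287.05  # J/(kg K), specific gas constant of dry air


def aspirated_air_flow(displacement_m3, speed_rpm, pressure_pa,
                       temperature_k, vol_eff=0.9):
    """Speed-density estimate of the air mass flow in kg/s.

    Assumptions: four-stroke engine, ideal gas, constant volumetric
    efficiency vol_eff.
    """
    density = pressure_pa / (R_AIR * temperature_k)   # ideal gas law
    intake_cycles_per_s = speed_rpm / 60.0 / 2.0      # one filling per 2 revolutions
    return vol_eff * density * displacement_m3 * intake_cycles_per_s


# Hypothetical 2.0 l engine at 3000 rpm, ambient intake conditions.
flow = aspirated_air_flow(0.002, 3000.0, 101325.0, 293.15)
# Same engine with the intake pressure doubled by a turbocharger
# and the charge heated to 340 K by the compression.
boosted = aspirated_air_flow(0.002, 3000.0, 202650.0, 340.0)
print(flow, boosted)
```

The flow is proportional to displacement and speed, while the density term captures the dependence on pressure and temperature and explains why charge air is cooled after compression.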

If only ambient air is fed to the combustion
chamber, a proportional amount of the other
gases present in the atmosphere will enter the
combustion chamber and be available for
combustion side reactions as well. In the case of
nitrogen, these reactions lead to the undesired
formation of nitrogen oxides (NOx). Therefore, in
some engines, especially in CI engines, part of the
combusted gases is recirculated to the combustion
chamber (“exhaust gas recirculation”, EGR),
providing advantages in terms of NOx reduction.
While EGR is typically realized at high pressure
(path HP in Fig. 1), it is also realized at low
pressure (path LP), though less frequently.
Typically, the air path includes some coolers
designed to increase gas densities.

Air path control is designed to track dynamic
references, for instance, the total fresh air mass
(MAF) entering the cylinder and the corresponding
pressure (MAP), but other quantities are also
possible. The references are typically generated
by the calibration engineers on the basis of tests.
The control inputs of the air path are mostly the
turbine (and possibly compressor) steering angle,
the EGR, and, if available, the throttle setpoints.
The most commonly used sensors include a mass
flow meter (hot film sensor), which is rather slow
and dynamically not reliable, and pressure and
temperature sensors; sometimes the actual position of
the valves and the turbocharger speed are measured
as well.

Engine Control, Fig. 4 Pressure trace in a fired cylinder of a CI engine triggered by a pilot and a
main injection




Fuel Path Control 

The fuel path delivers the correct amount of fuel
for the reaction (1). In almost every ICE, a rail
is filled with fuel at a given pressure (from a few
bar for SI to about 2000 bar for CI), from
which the required amount of fuel is injected into
the cylinder. The injection can occur inside the
combustion chamber (as for CI and GDI engines)
or near the intake valve (“port injection”) for
standard SI engines.

The injection amount is always set taking into
account the available oxygen mass. In SI engines
with a three-way catalyst, the fuel injection is given
by the stoichiometric condition. λ control uses
an oxygen sensor in the exhaust to determine the
actual fuel/oxygen ratio and, if appropriate, correct
the injection tables. In CI and GDI engines, the maximum
fuel injection is limited to prevent smoke formation,
typically by tables, even though λ control
can be and partly is used (Amstutz and del Re
1995).

In SI engines with port injection, the liquid
fuel is injected near the inlet valve and is
expected to vaporize due to the local temperature
and pressure conditions. During load changes,
however, it can happen that part of the fuel
is not vaporized, remains on the duct wall
(“wall wetting”), and vaporizes at a later
time, leading in both cases to a deviation
from the expected values (Turin et al. 1995),
which must be compensated by the injection
control.
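The effect of wall wetting and its compensation can be sketched with a simple film model of the kind often used for this purpose: a fraction x of the injected fuel is deposited on the wall, and the film evaporates with a time constant tau. This model and its parameter values are assumptions made for illustration and are not necessarily the approach of the cited reference; the compensation shown is a plain model inversion that presumes the model is exactly known.

```python
def simulate_fuel_path(desired, x=0.3, tau=0.5, dt=0.01, compensate=False):
    """Simulate the fuel flow reaching the cylinder with wall wetting.

    desired: desired cylinder fuel flow per time step.
    x: fraction of the injected fuel deposited on the wall (assumed).
    tau: evaporation time constant of the wall film in s (assumed).
    """
    film = desired[0] * x * tau        # start from the steady-state film mass
    delivered = []
    for wanted in desired:
        evaporating = film / tau
        if compensate:
            # Invert the model so that direct plus evaporated fuel equals wanted.
            injected = max((wanted - evaporating) / (1.0 - x), 0.0)
        else:
            injected = wanted
        delivered.append((1.0 - x) * injected + evaporating)
        film += dt * (x * injected - evaporating)   # explicit Euler step
    return delivered


# Load step: the desired fuel flow doubles after 0.5 s.
demand = [1.0] * 50 + [2.0] * 150
plain = simulate_fuel_path(demand)
corrected = simulate_fuel_path(demand, compensate=True)
print(plain[50], corrected[50])
```

Without compensation, the cylinder initially receives less fuel than demanded after the step because part of the additional fuel goes into building up the film; with the inversion, the injector briefly overshoots to make up the difference.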

Injection in CI engines is typically split into
a main injection for torque, a pilot injection
for NVH control, and sometimes also a post-injection
for emission control or regeneration of
aftertreatment devices. Figure 4 shows the typical
effect of a pilot injection on the pressure trace of
a CI engine.

Differences between the injectors of different
cylinders are compensated by cylinder balancing
control (typically using irregularities in the
engine acceleration). Rail pressure is also an
important control variable for direct injection.

Ignition 

Once the combustion chamber is filled, the
combustion can be started. In SI and GDI engines,
combustion is started by a spark, leading to
a flame front which propagates through the
whole combustion chamber. Very few SI engines
have a second spark plug to better control the
combustion. Under some circumstances, e.g.,
high temperature, an undesired auto-ignition
(“knock”) can occur, with potentially catastrophic
consequences for engine durability as well as
unusual NVH behavior. To cope with this, SI
engines have vibration sensors whose output is
used to modify the engine operation, in particular
the spark timing, to prevent it.
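A common way to use the knock signal is to retard the spark quickly when knock is detected and to restore the calibrated timing slowly afterwards. The sketch below shows one such update rule; the step sizes, the limits, and the function name are hypothetical, and production strategies are considerably more elaborate.

```python
def update_spark_advance(advance_deg, knock_detected, base_deg,
                         retard_step=2.0, recover_step=0.1, max_retard=10.0):
    """Return the spark advance for the next combustion event.

    advance_deg: current spark advance in degrees before top dead center.
    base_deg: calibrated (feed-forward) spark advance.
    Retards quickly on knock, recovers slowly otherwise, and keeps the
    result between base_deg - max_retard and base_deg.
    """
    if knock_detected:
        advance_deg -= retard_step
    else:
        advance_deg += recover_step
    return min(base_deg, max(base_deg - max_retard, advance_deg))


# One knock event followed by knock-free combustions.
advance = 20.0
history = []
for knock in [True] + [False] * 5:
    advance = update_spark_advance(advance, knock, base_deg=20.0)
    history.append(advance)
print(history)
```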

In CI engines, the injection leads almost
immediately to the combustion, which has more the
character of an explosion and typically starts at
several undefined locations.

Additional control during the combustion is up
to now only theoretically feasible, because the
combustion takes place in an extremely short time
and because adequate actuators are not available.

Aftertreatment 

As the combustion mixture will always contain
more potential reactants than oxygen and fuel,
side reactions will always take place, yielding
toxic products, in particular NOx, incompletely
burnt fuel (HC), carbon monoxide (CO), and
particulate matter (PM). Even if much effort is spent
on reducing their formation, this is almost never
sufficient, so additional aftertreatment equipment
is used. Table 1 gives an overview of the most
common aftertreatment systems as well as of
their control aspects.

Thermal Management 

All main properties of an engine are strongly
affected by its temperature, which depends on the
varying load conditions. Engine operation is
typically optimal within a relatively narrow temperature
range; this is even more critical for the
exhaust aftertreatment system. Engine heat is also
required for other purposes (like defrosting of
windshields in cold climates).

Thus the engine control system has two main
tasks: bringing the engine and the exhaust
aftertreatment system as fast as possible into the
target temperature range and taking into account
deviations from this target. The first task is
performed both by control of the cooling circuit
and by specific combustion-related measures, the
second one by taking the measured or estimated
temperature as an input for the controllers.

Fast heating is especially important for SI
engines, because almost all toxic emissions are
produced when the three-way catalyst is cold. To
achieve faster heating, SI engines tend to operate
in a less fuel-efficient but “hotter” operation
mode during this warm-up phase, which is one of the
causes of the increased consumption of cold engines
and on short trips.


Cranking, Idle Speed, and Gear Shifting
Control

Initially, the engine is cranked by the starter up to
a relatively low speed, and then injection starts,
bringing the engine to the minimum operational
speed. If the injected fuel is not immediately
burnt, very high emissions will arise. At cranking, 
the cylinder walls are typically very cold and 
combustion of a stoichiometric mixture is hardly 
possible. So engine control has the task to inject 
as little as possible but as much as needed only in 
the cylinder which is going to fire. 

Normally an ICE is expected to provide a 
torque to the driveline, speed being the result 
of the balance between it and the load. In idle 
control, no torque is transmitted to the driveline, 
but the engine speed is expected to remain stable 
in spite of possible changes of local loads (like 
cabin climate control). This boils down to a 
robust control problem (Hrovat and Sun 1997). 


Engine Control, Table 1 Main exhaust aftertreatment systems

System                          Purpose                                Control targets
Three-way catalyst              Reduction of HC, CO, and NOx by        Achieve fast and maintain operating
                                more than 98 %                         temperature and keep λ = 1
Oxidation catalyst              Reduction of HC and CO, partly of PM   Achieve fast and maintain operating
                                                                       temperature and keep λ > 1
Particulate filter              Traps PM                               Check trap state and regenerate by
                                                                       increasing exhaust temperature for
                                                                       short time if needed
NOx lean trap                   Traps NOx                              Estimate trap state and shift
                                                                       combustion to CO rich when required
Selective catalytic reduction   Reduces NOx                            Estimate required quantity of
                                                                       additional reactant (urea) and dose it

Gear shifting requires several steps. The
smoothness and speed of the shifting depend on the
coordination of the engine operating point change.
Current hardware developments (double clutches,
automated gear boxes) enable better operation
but require precise control.

New Trends 

The utilization environment of engine control is
changing. On the one hand, customer and legislator
expectations continue to produce pressure, but
there is a shift in priority from emissions to fuel
efficiency and safety. Driver support systems, for
instance, automated parking, are becoming
increasingly pervasive, and many of their functions
must be included in or immediately affect the ECU,
even though they are frequently hosted on their own
control hardware. Hybrid vehicles are gaining
popularity, and this implies a different operation
mode for the engine; for instance, thermal
management becomes much more complex for range
extender vehicles with long “cold” phases.

Maybe even more important is the diffusion
of new devices and communication possibilities,
so that, for instance, fuel-saving preview-based
gear shifting can be easily implemented using
infrastructure-to-vehicle information, or even just
navigation data. Further extensions, like
cooperative adaptive cruise control (CACC), plan to use
vehicle-to-vehicle information to increase both
safety and efficiency.

Against this background, there is a growing
awareness that the current industrial approach
based on huge calibration work is becoming
less and less viable and bears a steadily
increasing risk of wasting potential performance. Some
model-based controls have already found their
way into the ECU, and academia has shown
on several occasions that model-based control is
able to achieve better performance, but it has not
yet been shown how this could comply with other
industrialization requirements.

Currently, new faster sensors (e.g., pressure
sensors in the combustion chambers) are being
introduced, and the interest in model-based control
(Alberer et al. 2012) and in system identification
techniques (del Re et al. 2010) is increasing, but
they are not yet widespread.


Cross-References 

► Powertrain Control for Hybrid-Electric and 
Electric Vehicles 

► Transmission

Abstract

Estimation and control of systems when
data is being transmitted across nonideal
communication channels has now become an
important research topic. While much progress
has been made in the area over the last few
years, many open problems still remain. This
entry summarizes some results available for such
systems and points out a few open research
directions. Two popular channel models are
considered: the analog erasure channel model
and the digital noiseless model. Results are
presented for both the multichannel and multisensor
settings.

Keywords 

Analog erasure channel; Digital noiseless channel;
Networked control systems; Sensor fusion

Introduction 

Networked control systems refer to systems in
which estimation and control are done across
communication channels. In other words, these
systems feature data transmission among the
various components (sensors, estimators,
controllers, and actuators) across communication
channels that may delay, erase, or otherwise
corrupt the data. It has been known for a long time
that the presence of communication channels
has deep and subtle effects. For instance, an
asymptotically stable linear system may display
chaotic behavior if the data transmitted from
the sensor to the controller and from the controller
to the actuator is quantized. Accordingly, the
impact of communication channels on the
estimation/control performance and the design of
estimation/control algorithms to counter any
performance loss due to such channels have both
become areas of active research.

Preliminaries 

It is not possible to provide a detailed overview
of all the work in the area. This entry attempts to
summarize the flavor of the results that are
available today. We focus on two specific
communication channel models: the analog erasure channel
and the digital noiseless channel. Although other
channel models, e.g., channels that introduce
delays or additive noise, have been considered in
the literature, these models are among the ones
that have been studied the most. Moreover, the
richness of the field can be illustrated by
concentrating on these models.

An analog erasure channel model is defined as follows. At every
time step $k$, the channel supports as its input a real vector
$i(k) \in \mathbb{R}^t$ with a bounded dimension $t$. The output
$o(k)$ of the channel is determined stochastically. The simplest
model of the channel is when the output is determined by a
Bernoulli process with probability $p$. In this case, the output
is given by

$$o(k) = \begin{cases} i(k-1) & \text{with probability } 1 - p, \\ \phi & \text{otherwise,} \end{cases}$$

where the symbol $\phi$ denotes the fact that the receiver does
not obtain any data at that time step and, importantly, recognizes
that the channel has not transmitted any data. The probability $p$
is termed the erasure probability of the channel. More intricate
models in which the erasure process is governed by a Markov chain,
or by a deterministic process, have also been proposed and
analyzed. In our subsequent development, we will assume that the
erasure process is governed by a Bernoulli process.
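
The Bernoulli erasure model above can be illustrated with a short simulation. This is only a sketch: the function names, the use of `None` to stand in for the erasure symbol, and the example input sequence are assumptions made for illustration, not part of the entry.

```python
import random

ERASED = None  # stands in for the erasure symbol (an illustrative choice)


def erasure_channel(inputs, p, rng):
    """Return the outputs o(1), ..., o(K) for inputs i(0), ..., i(K-1).

    Each input is delivered one step later with probability 1 - p and
    replaced by ERASED with probability p, independently across steps.
    """
    outputs = []
    for value in inputs:
        if rng.random() < p:
            outputs.append(ERASED)  # the receiver knows nothing arrived
        else:
            outputs.append(value)
    return outputs


def empirical_erasure_rate(outputs):
    """Fraction of time steps at which the receiver saw an erasure."""
    return sum(o is ERASED for o in outputs) / len(outputs)


rng = random.Random(0)
inputs = [(float(k), -float(k)) for k in range(10_000)]
outputs = erasure_channel(inputs, p=0.3, rng=rng)
print(round(empirical_erasure_rate(outputs), 2))
```

Over a long run the empirical erasure rate should be close to $p$. A Markov-chain erasure process would replace the independent draw with a state-dependent one.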

A digital noiseless channel model is defined as follows. At every
time step $k$, the channel supports at its input one out of $2^m$
symbols. The output of the channel is equal to the input. The
symbol that is transmitted may be generated arbitrarily; however,
it is natural to consider the channel as supporting $m$ bits at
every time step and the specific symbol transmitted as being
generated by an appropriately designed quantizer. Once again,
additional complications such as delays introduced by the channel
have been considered in the literature.
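
As a concrete illustration of "m bits generated by a quantizer," the sketch below uses a uniform quantizer on a known interval. The uniform cell layout, the interval bounds `lo` and `hi`, and the function names are assumptions for illustration; the entry does not prescribe a particular quantizer.

```python
def encode(x, m, lo, hi):
    """Map x in [lo, hi] to one of 2**m integer symbols (uniform cells)."""
    levels = 2 ** m
    width = (hi - lo) / levels
    index = int((x - lo) / width)
    return min(max(index, 0), levels - 1)  # clamp so that x == hi is legal


def decode(symbol, m, lo, hi):
    """Return the midpoint of the cell that the symbol identifies."""
    width = (hi - lo) / 2 ** m
    return lo + (symbol + 0.5) * width


m, lo, hi = 3, -1.0, 1.0
symbol = encode(0.3, m, lo, hi)        # the only thing sent over the channel
estimate = decode(symbol, m, lo, hi)   # the receiver's reconstruction
print(symbol, estimate)
```

Because the channel is noiseless, the receiver recovers the symbol exactly, and the reconstruction error is at most half a cell width, $(hi - lo)/2^{m+1}$.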

A general networked control problem consists of a process whose
states are being measured by multiple sensors that transmit data
to multiple controllers. The controllers generate control inputs
that are applied by different actuators. All the data is
transmitted across communication channels. Design of control
inputs when multiple controllers are present, even without the
presence of communication channels, is known to be hard since the
control inputs in this case have a dual effect. It is, thus, not
surprising that not many results are available for networked
control systems with multiple controllers. We will thus
concentrate on the case when only one controller and one actuator
are present. However, we will review the known results for the
analog erasure channel and the digital noiseless channel models
when (i) multiple sensors observe the same process and transmit
information to the controller and (ii) the sensor transmits
information to the controller over a network of communication
channels with an arbitrary topology.

An important distinction in the networked 
control system literature is that of one-block 
versus two-block designs. Intuitively, the 
one-block design arises from viewing the 
communication channel as a perturbation to 
a control system designed without a channel. 
In this paradigm, the only block that needs 
to be designed is the receiver. Thus, for 
instance, if an analog erasure channel is present 
between the sensor and the estimator, the sensor 
continues to transmit the measurements as if 
no channel is present. However, the estimator 
present at the output of the channel is now 
designed to compensate for any imperfections 
introduced by the communication channel. 
On the other hand, in the two-block design 
paradigm, both the transmitter and the receiver 
are designed to optimize the estimation or control 
performance. Thus, if an analog erasure channel 
is present between the sensor and the estimator, 
the sensor can now transmit an appropriate 
function of the information it has access to. 
The transmitted quantity needs to satisfy the 
constraints introduced by the channel in terms of 
the dimensions, bit rate, power constraints, and 
so on. It is worth remembering that while the 
two-block design paradigm follows in spirit from 
communication theory where both the transmitter 
and the receiver are design blocks, the specific 
design of these blocks is usually much more 
involved than in communication theory. It is
not surprising that, in general, performance with
two-block designs is better than with one-block
designs.

Analog Erasure Channel Model 

Consider the usual LQG formulation. A linear process of the form

$$x(k+1) = A x(k) + B u(k) + w(k),$$

with state $x(k) \in \mathbb{R}^d$ and process noise $w(k)$ is
controlled using a control input $u(k) \in \mathbb{R}^m$. The
process noise is assumed to be white, Gaussian, zero mean, with
covariance $\Sigma_w$. The initial condition $x(0)$ is also
assumed to be Gaussian and zero mean with covariance $\Pi_0$. The
process is observed by $n$ sensors, with the $i$-th sensor
generating measurements of the form

$$y_i(k) = C_i x(k) + v_i(k),$$

with the measurement noise $v_i(k)$ assumed to be white, Gaussian,
zero mean, with covariance $\Sigma_v^i$. All the random variables
in the system are assumed to be mutually independent. We consider
two cases:

• If $n = 1$, the sensor communicates with the controller across
a network consisting of multiple communication channels connected
according to an arbitrary topology. Every communication channel is
modeled as an analog erasure channel with possibly a different
erasure probability. The erasure events on the channels are
assumed to be independent of each other, for simplicity. The
sensor and the controller then form two nodes of a network, each
edge of which represents a communication channel.

• If $n > 1$, then every sensor communicates with the controller
across an individual communication channel that is modeled as an
analog erasure channel with possibly a different erasure
probability. The erasure events on the channels are assumed to be
independent of each other, for simplicity.

The controller calculates the control input to optimize a
quadratic cost function of the form

$$J_K = E\left[\sum_{k=0}^{K-1}\left(x^T(k) Q x(k) + u^T(k) R u(k)\right) + x^T(K) P_K x(K)\right].$$

All the covariance matrices and the cost matrices $Q$, $R$, and
$P_K$ are assumed to be positive definite. The pair $(A, B)$ is
controllable and the pair $(A, C)$ is observable, where $C$ is
formed by stacking the matrices $C_i$. The system is said to be
stabilizable if there exists a design (within the specified
one-block or two-block design framework) such that the cost
$\lim_{K \to \infty} \frac{1}{K} J_K$ is bounded.

A Network of Communication Channels 

We begin with the case when $n = 1$ as mentioned above. The
one-block design problem in the presence of a network of
communication channels is identical to the one-block design
problem with a single channel. This is because the network can be
replaced by an “equivalent” communication channel whose erasure
probability is some function of the reliability of the network.
This can lead to poor performance, since the reliability may
decrease quickly as the network size increases. For this reason,
we will concentrate on the two-block design paradigm.

The two-block design paradigm permits the nodes of the network to
process the data prior to transmission and hence achieve much
better performance. The only constraint imposed on the transmitter
is that the quantity that is transmitted is a causal function of
the information that the node has access to, with a bounded
dimension. The design problem can be solved using the following
steps. The first step is to prove that a separation principle
holds if the controller knows the control input applied by the
actuator at every time step. This can be the case if the
controller transmits the control input to the actuator across a
perfect channel or if the control input is transmitted across an
analog erasure channel but the actuator can transmit an
acknowledgment to the controller. For simplicity, we assume that
the controller transmits the control input to the actuator across
a perfect channel. The separation principle states that the
optimal performance is achieved if the control input is calculated
using the usual LQR control law, but the process state is replaced
by the minimum mean squared error (MMSE) estimate of the state.
Thus, the two-block design problem now needs to be solved for an
optimal estimation problem.

The next step is to realize that for any allowed two-block design,
an upper bound on estimation performance is provided by the
strategy of every node transmitting every measurement it has
access to at each time step. Notice that this strategy is not in
the set of allowed two-block designs since the dimension of the
transmitted quantity is not bounded with time. However, the same
estimate is calculated at the decoder if the sensor transmits an
estimate of the state at every time step and every other node
(including the decoder) transmits the latest estimate it has
access to from either its neighbors or its memory. This algorithm
is recursive and involves every node transmitting a quantity with
bounded dimension; since it leads to calculation of the same
estimate at the decoder, it is, thus, optimal. It is worth
remarking that the intermediate nodes do not require access to the
control inputs. This is because the estimate at the decoder is a
linear function of the control inputs and the measurements: thus,
the effect of control inputs in the estimate can be separated from
the effect of the measurements and included at the controller.
Moreover, as long as the closed loop system is stable, the
quantities transmitted by various nodes are also bounded. Thus,
the two-block design problem can be solved.

The stability and performance analysis with the optimal design can
also be performed. As an example, a necessary and sufficient
stabilizability condition is that the inequality

$$p_{\text{maxcut}}\, \rho(A)^2 < 1$$

holds, where $\rho(A)$ is the spectral radius of $A$ and
$p_{\text{maxcut}}$ is the max-cut probability evaluated as
follows. Generate cut-sets from the network by dividing the nodes
into two sets - a source set containing the sensor and a sink set
containing the controller. For each cut-set, obtain the cut-set
probability by multiplying the erasure probabilities of the
channels from the source set to the sink set. The max-cut
probability is the maximum such cut-set probability. The necessity
of the condition follows by recognizing that the channels from the
source set to the sink set need to transmit data at a high enough
rate even if the channels within each set are assumed not to erase
any data. The sufficiency of the condition follows by using the
Ford-Fulkerson algorithm to reduce the network into a collection
of parallel paths from the sensor to the controller such that each
path has links with equal erasure probability and the product of
these probabilities for all paths is the max-cut probability. More
details can be found in Gupta et al. (2009a).
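
The cut-set procedure described above can be sketched by brute-force enumeration for a small network. The edge-list representation, the function names, and the three-node example are assumptions made for illustration; the enumeration is exponential in the number of nodes and is not the method of the cited paper.

```python
from itertools import combinations
from math import prod


def max_cut_probability(nodes, edges, source, sink):
    """Largest cut-set probability over all source/sink partitions.

    edges is a list of (tail, head, erasure_probability) triples; parallel
    channels between the same pair of nodes are listed separately.
    """
    others = [v for v in nodes if v not in (source, sink)]
    best = 0.0
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            source_set = {source, *extra}
            # Channels that cross from the source set to the sink set.
            crossing = [p for (u, v, p) in edges
                        if u in source_set and v not in source_set]
            best = max(best, prod(crossing))
    return best


def is_stabilizable(p_maxcut, spectral_radius):
    """Check the condition p_maxcut * rho(A)**2 < 1."""
    return p_maxcut * spectral_radius ** 2 < 1


# Hypothetical network: sensor "s", relay "r", controller "c".
nodes = ["s", "r", "c"]
edges = [("s", "r", 0.2), ("r", "c", 0.3), ("s", "c", 0.5)]
p_maxcut = max_cut_probability(nodes, edges, "s", "c")
print(p_maxcut, is_stabilizable(p_maxcut, spectral_radius=2.0))
```

In this example the cut {s} has probability 0.2 · 0.5 and the cut {s, r} has probability 0.3 · 0.5, so the second cut is the bottleneck.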

Multiple Sensors 

Let us now consider the case when the process is observed using
multiple sensors, each of which transmits data to the controller
across an individual analog erasure channel. A separation
principle to reduce the control design problem into the
combination of an LQR control law and an estimation problem can
once again be proven. Thus, the two-block design for the
estimation problem asks the following question: what quantity
should the sensors transmit such that the decoder is able to
generate the optimal MMSE estimate of the state at every time
step, given all the information the decoder has received till that
time step? This problem is similar to the track-to-track fusion
problem that has been studied since the 1980s and is still open
for general cases (Chang et al. 1997). Suppose that at time $k$,
the last successful transmission from sensor $i$ happened at time
$k_i < k$. The optimal estimate that the decoder can ever hope to
achieve is the estimate of the state $x(k)$ based on all
measurements from sensor 1 till time $k_1$, from sensor 2 till
time $k_2$, and so on. However, it is not known whether this
estimate is achievable if the sensors are constrained to transmit
real vectors with a bounded dimension. A fairly intuitive encoding
scheme is for the sensors to transmit the local estimates of the
state based on their own measurements. However, it is known that
the global estimate cannot, in general, be obtained from local
estimates because of the correlation introduced by the process
noise. If erasure probabilities are zero, or if the process noise
is not present, then the optimal encoding schemes are known.
Another case for which the optimal encoding schemes are known is
when the estimator sends back acknowledgments to the encoders.

Transmitting local estimates does, however, achieve optimal
stability conditions as compared to the conditions obtained from
the optimal (unknown) two-block design (Gupta et al. 2009b). As an
example, the necessary and sufficient stability conditions for the
two-sensor case are given by

$$p_1\, \rho(A_1)^2 < 1,$$
$$p_2\, \rho(A_2)^2 < 1,$$
$$p_1 p_2\, \rho(A_3)^2 < 1,$$

where $p_1$ and $p_2$ are erasure probabilities from sensors 1 and
2, respectively, $\rho(A_1)$ is the spectral radius of the
unobservable part of the matrix $A$ from the second sensor,
$\rho(A_2)$ is the spectral radius of the unobservable part of the
matrix $A$ from the first sensor, and $\rho(A_3)$ is the spectral
radius of the observable part of the matrix $A$ from both the
sensors. The conditions are fairly intuitive. For instance, the
first condition provides a bound on the rate of increase of modes
for which only sensor 1 can provide information to the controller,
in terms of how reliable the communication channel from sensor 1
is.


Digital Noiseless Channels 

Results similar to those above can be derived for the digital
noiseless channel model. For the digital noiseless channel model,
it is easier to consider the system without either measurement or
process noises (although results with such noises are available).
Moreover, since quantization is inherently highly nonlinear,
results such as separation between estimation and control are not
available. Thus, encoders and controllers that optimize a cost
function such as a quadratic performance metric are not available
even for the single sensor or channel case. Most available results
thus discuss stabilizability conditions for a given data rate that
the channels can support.

While early works used the one-block design framework to model the
digital noiseless channel as introducing an additive white
quantization noise, that framework obscures several crucial
features of the channel. For instance, such an additive noise
model suggests that at any bit rate, the process can be stabilized
by a suitable controller. However, a simple argument shows that
this is not true. Consider a scalar process in which at time $k$,
the controller knows that the state is within a set of length
$l(k)$. Then, stabilization is possible only if $l(k)$ remains
bounded as $k \to \infty$. Now, the evolution of $l(k)$ is
governed by two processes: at every time step, this uncertainty
can be (i) decreased by a factor of at most $2^m$ due to the data
transmission across the channel and (ii) increased by a factor of
$a$ (where $a$ is the process matrix governing the evolution of
the state) due to the process evolution. This implies that for
stabilization to be possible, the inequality $m > \log_2(a)$ must
hold. Thus, the additive noise model is inherently wrong. Most
results in the literature formalize the basic intuition above
(Nair et al. 2007).
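
The counting argument above amounts to the best-case recursion $l(k+1) = a\, l(k) / 2^m$. The sketch below simply iterates it; the function name and the numerical values ($a = 3$, so $\log_2 a \approx 1.58$) are illustrative assumptions.

```python
def uncertainty_lengths(a, m, l0, steps):
    """Best-case evolution l(k+1) = a * l(k) / 2**m of the uncertainty."""
    lengths = [l0]
    for _ in range(steps):
        # Growth by a from the dynamics, shrinkage by 2**m from m bits.
        lengths.append(a * lengths[-1] / 2 ** m)
    return lengths


too_few_bits = uncertainty_lengths(a=3.0, m=1, l0=1.0, steps=20)
enough_bits = uncertainty_lengths(a=3.0, m=2, l0=1.0, steps=20)
print(too_few_bits[-1] > too_few_bits[0])   # uncertainty grows with m = 1
print(enough_bits[-1] < enough_bits[0])     # uncertainty shrinks with m = 2
```

With one bit per step the interval grows by a factor of 1.5 each step, whereas with two bits it contracts by a factor of 0.75, in line with the data-rate threshold.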

A Network of Communication Channels 

For the case when there is only one sensor that transmits
information to the controller across a network of communication
channels connected in an arbitrary topology, an analysis similar
to that done for analog erasure channels can be performed
(Tatikonda 2003). A max-flow min-cut-like theorem again holds. The
stability condition now becomes that, for any cut-set,

$$\sum_j R_j > \sum_{\text{all unstable eigenvalues}} \log_2(\lambda_i),$$

where $\sum_j R_j$ is the sum of data rates supported by the
channels joining the source set to the sink set for the cut-set
and $\lambda_i$ are the eigenvalues of the process matrix $A$.
Note that the summation on the right-hand side is only over the
unstable eigenvalues, since no information needs to be transmitted
about the modes that are stable in open loop.

Multiple Sensors 

The case when multiple sensors each transmit information across an
individual digital noiseless channel to a controller can also be
considered. For every sensor $i$, define a rate vector
$\{R_{i,1}, R_{i,2}, \ldots, R_{i,d}\}$ corresponding to the $d$
modes of the system. If a mode $j$ cannot be observed from the
sensor $i$, set $R_{i,j} = 0$. For stability, the condition

$$\sum_i R_{i,j} \ge \max(0, \log_2 \lambda_j)$$

must be satisfied for every mode $j$. All such rate vectors
stabilize the system.
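
A per-mode rate check of this kind can be sketched as follows. The non-strict inequality, the use of the eigenvalue magnitude inside the logarithm, the function name, and the example numbers are all assumptions made for illustration rather than statements from the entry.

```python
from math import log2


def rates_satisfy_condition(rate_vectors, eigenvalues):
    """Check sum_i R[i][j] >= max(0, log2|lambda_j|) for every mode j.

    rate_vectors[i][j] is the rate sensor i devotes to mode j, with 0 for
    modes that sensor i cannot observe.
    """
    for j, lam in enumerate(eigenvalues):
        total = sum(vector[j] for vector in rate_vectors)
        # Stable modes (|lambda| <= 1) need no rate at all.
        required = log2(abs(lam)) if abs(lam) > 1 else 0.0
        if total < required:
            return False
    return True


eigenvalues = [2.0, 0.5, 4.0]          # hypothetical modes of the process
sensor_1 = [1.5, 0.0, 0.0]             # observes only mode 0
sensor_2 = [0.0, 0.0, 2.5]             # observes only mode 2
print(rates_satisfy_condition([sensor_1, sensor_2], eigenvalues))
```

Here mode 1 is stable and receives no rate, while modes 0 and 2 need at least 1 and 2 bits per step, respectively.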

Summary and Future Directions 

This entry provided a brief overview of some 
results available in the field of networked 
control systems. Although the area is seeing 
intense research activity, many problems 
remain open. For control across analog erasure 
channels, most existing results break down 
if a separation principle cannot be proved. 
Thus, for example, if control packets are also 
transmitted to the actuator across an analog 
erasure channel, the LQG optimal two-block 
design is unknown. There is some recent 
work on analyzing the stabilizability under 
such conditions (Gupta and Martins 2010), 
but the problem remains open in general. 
For digital noiseless channels, controllers that 
optimize some performance metric are largely 
unknown. Considering more general channel 
models is also an important research direction 
(Martins and Dahleh 2008; Sahai and Mitter 
2006). 

Cross-References 

► Averaging Algorithms and Consensus 

► Oscillator Synchronization 
Abstract

The random set (RS) concept generalizes that of a random vector.
It permits the mathematical modeling of random systems that can be
interpreted as random patterns. Algorithms based on RSs have been
extensively employed in image processing. More recently, they have
found application in multitarget detection and tracking and in the
modeling and processing of human-mediated information sources. The
purpose of this entry is to briefly summarize the concepts,
theory, and practical application of RSs.


Keywords 

Image processing; Multitarget processing; Random finite sets;
Stochastic geometry


Introduction 

In ordinary signal processing, one models 
physical phenomena as “sources,” which generate 
“signals” obscured by random “noise.” The 
sources are to be extracted from the noise 
using optimal-estimation algorithms. Random 
set (RS) theory was devised about 40 years 
ago by mathematicians who also wanted to 
construct optimal-estimation algorithms. The 
“signals” and “noise” that they had in mind, 
however, were geometric patterns in images. 
The resulting theory, stochastic geometry , is 
the basis of the “morphological operators” 
commonly employed today in image-processing 
applications. It is also the basis for the theory of 
RSs. An important special case of RS theory, the 
theory of random finite sets (RFSs), addresses 
problems in which the patterns of interest consist 
of a finite number of points. It is the theoretical 
basis of many modern medical and other image- 
processing algorithms. In recent years, RFS 
theory has found application to the problem 
of detecting, localizing, and tracking unknown 
numbers of unknown, evasive point targets. 
Most recently and perhaps most surprisingly, 
RS theory provides a theoretically rigorous 
way of addressing “signals” that are human- 
mediated, such as natural-language statements 
and inference rules. The breadth of RS theory 
is suggested in the various chapters of Goutsias
et al. (1997).

The purpose of this entry is to summarize the RS and RFS theories
and their applications. It is divided into the following sections:
A Simple Example, Mathematics of Random Sets, Random Sets and
Image Processing, Random Sets and Multitarget Processing, Random
Sets and Human-Mediated Data, Summary and Future Directions,
Cross-References, and Recommended Reading.
A Simple Example

To illustrate the concept of an RS, let us begin by examining a
simple example: locating stars in the nighttime sky. We will
proceed in successively more illustrative steps:

Locating a single non-dim star (estimating a random point). When
we try to locate a star, we are trying to estimate its actual
position - its “state” $x = (\alpha_0, \theta_0)$ - in terms of
its azimuth angle $\alpha_0$ and elevation angle $\theta_0$. When
the star is dim but not too dim, its apparent position will vary
slightly. We can estimate its position by averaging many
measurements - i.e., by applying a point estimator.

Locating a very dim star (estimating an RS with at most one
element). Assume that the star is so dim that, when we see it, it
might be just a momentary visual illusion. Before we can estimate
its position, we must first estimate whether or not it exists. We
must record not only its apparent position $z = (\alpha, \theta)$
(if we see it) but its apparent existence $\varepsilon$, with
$\varepsilon = 1$ (we saw it) or $\varepsilon = 0$ (we did not).
Averaging $\varepsilon$ over many observations, we get a number
$q$ between 0 and 1. If $q > \frac{1}{2}$ (say), we could declare
that the star probably actually is a star; and then we could
average the non-null observations to estimate its position.
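
The two-stage procedure for the very dim star (first decide existence, then average positions) can be sketched as below. The representation of a missed observation as `None`, the function name, and the sample data are illustrative assumptions.

```python
def estimate_dim_star(observations, threshold=0.5):
    """Estimate existence and position of a possibly illusory star.

    observations is a list whose entries are either None (the star was
    not seen) or an (azimuth, elevation) pair.
    """
    seen = [z for z in observations if z is not None]
    q = len(seen) / len(observations)   # average of the existence flags
    if q <= threshold:
        return q, None                  # declare that there is no star
    azimuth = sum(z[0] for z in seen) / len(seen)
    elevation = sum(z[1] for z in seen) / len(seen)
    return q, (azimuth, elevation)


observations = [(10.0, 20.0), None, (12.0, 22.0), (11.0, 21.0)]
q, position = estimate_dim_star(observations)
print(q, position)
```

The returned pair is an estimate of a random set with at most one element: either the empty set (no position) or a single point.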

Locating multiple stars (estimating an RFS). Suppose that we are
trying to locate all of the stars in some patch of sky. In some
cases, two dim stars may be so close that they are difficult to
distinguish. We will then collect three kinds of measurements from
them: $Z = \emptyset$ (did not see either star),
$Z = \{(\alpha, \theta)\}$ (we saw one or the other), or
$Z = \{(\alpha_1, \theta_1), (\alpha_2, \theta_2)\}$ (saw both).
The total collected measurement in the patch of sky is a finite
set $Z = \{z_1, \ldots, z_m\}$ of point measurements with
$z_j = (\alpha_j, \theta_j)$, where each $z_j$ is random, where
$m$ is random, and where $m = 0$ corresponds to the null
measurement $Z = \emptyset$.

Locating multiple stars in a quantized sky (estimation using
imprecise measurements). Suppose that, for computational reasons,
the patch of sky must be quantized into a finite number of
hexagonal-shaped cells, $c_1, \ldots, c_M$. Then, the measurement
from any star is not a specific point $z$, but instead the cell
$c$ that contains $z$. The measurement $c$ is imprecise - a
randomly varying hexagonal cell $c$. There are two ways of
thinking about the total measurement collection. First, it is a
finite set $Z = \{c'_1, \ldots, c'_m\} \subseteq \{c_1, \ldots, c_M\}$
of cells. Second, it is the union $Z = c'_1 \cup \ldots \cup c'_m$
of all of the observed cells - i.e., it is a geometrical pattern.

Locating multiple stars over an extended period of time
(estimating multiple moving targets). As the night progresses, we
must continually redetermine the existence and positions of each
star - a process called multitarget tracking. We must also account
for appearances and disappearances of the stars in the patch -
i.e., for target death and birth.


Mathematics of Random Sets 

The purpose of this section is to sketch the elements of the
theory of random sets. It is organized as follows: General Theory
of Random Sets, Random Finite Sets (Random Point Processes), and
Stochastic Geometry. Of necessity, the material is less elementary
than in later sections.

General Theory of Random Sets 

Let $\mathfrak{Y}$ be a topological space - for example, an
$N$-dimensional Euclidean space $\mathbb{R}^N$. The power set
$2^{\mathfrak{Y}}$ of $\mathfrak{Y}$ is the class of all possible
subsets $S \subseteq \mathfrak{Y}$. Any subclass of
$2^{\mathfrak{Y}}$ is called a “hyperspace.” The “elements” or
“points” of a hyperspace are thus actually subsets of some other
space. For a hyperspace to be of interest, one must extend the
topology on $\mathfrak{Y}$ to it. There are many possible
topologies for hyperspaces (Michael 1950). The most well studied
is the Fell-Matheron topology, also called the “hit-and-miss”
topology (Matheron 1975). It is applicable when $\mathfrak{Y}$ is
Hausdorff, locally compact, and completely separable. It
topologizes only the hyperspace $\mathfrak{c}(2^{\mathfrak{Y}})$
of all closed subsets $C$ of $\mathfrak{Y}$. In this case, a
random (closed) set $\Theta$ is a measurable mapping from some
probability space into $\mathfrak{c}(2^{\mathfrak{Y}})$.

The Fell-Matheron topology’s major strength is its relative
simplicity. Let “$\Pr(\mathcal{E})$” denote the probability of a
probabilistic event $\mathcal{E}$. Then, normally, the probability
law of $\Theta$ would be described by a very abstract probability
measure $p_\Theta(O) = \Pr(\Theta \in O)$. This measure must be
defined on the Borel-measurable subsets
$O \subseteq \mathfrak{c}(2^{\mathfrak{Y}})$, with respect to the
Fell-Matheron topology, where $O$ is itself a class of subsets of
$\mathfrak{Y}$. However, define the Choquet capacity functional by
$c_\Theta(G) = \Pr(\Theta \cap G \neq \emptyset)$ for all open
subsets $G \subseteq \mathfrak{Y}$. Then, the Choquet-Matheron
theorem states that the probability law of $\Theta$ is completely
described by the simpler, albeit nonadditive, measure
$c_\Theta(G)$.

The theory of random sets has evolved into a substantial subgenre
of statistical theory (Molchanov 2005). For estimation theory, the
concept of the expected value $E[\Theta]$ of a random set $\Theta$
is of particular interest. Most definitions of $E[\Theta]$ are
very abstract (Molchanov 2005, Chap. 2). In certain circumstances,
however, more conventional-looking definitions are possible.
Suppose that $\mathfrak{Y}$ is a Euclidean space and that
$\mathfrak{c}(2^{\mathfrak{Y}})$ is restricted to
$\mathfrak{b}(2^{\mathfrak{Y}})$, the bounded, convex, closed
subsets of $\mathfrak{Y}$. If $C, C'$ are two such subsets, their
Minkowski sum is
$C + C' = \{c + c' \mid c \in C,\ c' \in C'\}$. Endowed with this
definition of addition, $\mathfrak{b}(2^{\mathfrak{Y}})$ can be
homeomorphically and homomorphically embedded into a certain space
of functions (Molchanov 2005, pp. 199-200). Denote this embedding
by $C \mapsto \phi_C$. Then, the expected value $E[\Theta]$ of
$\Theta$, defined in terms of Minkowski addition, corresponds to
the conventional expected value $E[\phi_\Theta]$ of the random
function $\phi_\Theta$.


Random Finite Sets (Random Point 
Processes) 

Suppose that $\mathfrak{c}(2^{\mathfrak{Y}})$ is restricted to
$\mathfrak{f}(2^{\mathfrak{Y}})$, the class of finite subsets of
$\mathfrak{Y}$. (In many formulations,
$\mathfrak{f}(2^{\mathfrak{Y}})$ is taken to be the class of
locally finite subsets of $\mathfrak{Y}$ - i.e., those whose
intersection with compact subsets is finite.) A random finite set
(RFS) is a measurable mapping from a probability space into
$\mathfrak{f}(2^{\mathfrak{Y}})$. An example: the field of
twinkling stars in some patch of a night sky. RFS theory is a
particular mathematical formulation of point process theory (Daley
and Vere-Jones 1998; Snyder and Miller 1991; Stoyan et al. 1995).


A Poisson RFS $\Psi$ is perhaps the simplest nontrivial example of
a random point pattern. It is specified by a spatial distribution
$s(y)$ and an intensity $\mu$. At any given instant, the
probability that there will be $n$ points in the pattern is
$p(n) = e^{-\mu} \mu^n / n!$ (the value of the Poisson
distribution). The probability that one of these $n$ points will
be $y$ is $s(y)$. The function $D_\Psi(y) = \mu \cdot s(y)$ is
called the intensity function of $\Psi$.

At any moment, the point pattern produced by $\Psi$ is a finite
set $Y = \{y_1, \ldots, y_n\}$ of points $y_1, \ldots, y_n$ in
$\mathfrak{Y}$, where $n = 0, 1, \ldots$ and where
$Y = \emptyset$ if $n = 0$. If $n = 0$ then $Y$ represents the
hypothesis that no objects at all are present. If $n = 1$ then
$Y = \{y_1\}$ represents the hypothesis that a single object $y_1$
is present. If $n = 2$ then $Y = \{y_1, y_2\}$ represents the
hypothesis that there are two distinct objects $y_1 \neq y_2$. And
so on.

The probability distribution of $\Psi$ - i.e., the probability
that $\Psi$ will have $Y = \{y_1, \ldots, y_n\}$ as an
instantiation - is entirely determined by its intensity function
$D_\Psi(y)$:

$$f_\Psi(Y) = f_\Psi(\{y_1, \ldots, y_n\}) = e^{-\mu} \cdot D_\Psi(y_1) \cdots D_\Psi(y_n)$$
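
A Poisson RFS can be sampled and its density evaluated in a few lines. This is a sketch under stated assumptions: the spatial distribution is taken to be uniform on the unit square, the Poisson count is drawn with Knuth's multiplication method, and all function names are hypothetical.

```python
import math
import random


def sample_poisson_count(mu, rng):
    """Draw n with probability exp(-mu) * mu**n / n! (Knuth's method)."""
    limit = math.exp(-mu)
    n, product = 0, rng.random()
    while product > limit:
        n += 1
        product *= rng.random()
    return n


def sample_poisson_rfs(mu, sample_point, rng):
    """Draw a finite point pattern: Poisson count, then i.i.d. points."""
    n = sample_poisson_count(mu, rng)
    return [sample_point(rng) for _ in range(n)]


def poisson_rfs_density(points, mu, s):
    """f(Y) = exp(-mu) times the product of D(y) = mu * s(y) over y in Y."""
    value = math.exp(-mu)
    for y in points:
        value *= mu * s(y)
    return value


def uniform_point(rng):
    return (rng.random(), rng.random())


def uniform_density(y):
    return 1.0 if 0.0 <= y[0] <= 1.0 and 0.0 <= y[1] <= 1.0 else 0.0


rng = random.Random(0)
pattern = sample_poisson_rfs(2.0, uniform_point, rng)
print(len(pattern), poisson_rfs_density(pattern, 2.0, uniform_density))
```

For the empty pattern the density reduces to $e^{-\mu}$, the probability that no points are present.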

Every suitably well-behaved RFS $\Psi$ has a probability
distribution $f_\Psi(Y)$ and an intensity function $D_\Psi(y)$
(a.k.a. first-moment density). A Poisson RFS $\Psi$ is unique in
that $f_\Psi(Y)$ is completely determined by $D_\Psi(y)$.

Conventional signal processing is often concerned with
single-object random systems that have the form

$$Z = \eta(x) + V$$

where $x$ is the state of the system; $\eta(x)$ is the “signal”
generated by the system; the zero-mean random vector $V$ is the
random “noise” associated with the sensor; and $Z$ is the random
measurement that is observed. The purpose of signal processing is
to construct an estimate $\hat{x}(z_1, \ldots, z_k)$ of $x$, using
the information contained in one or more draws $z_1, \ldots, z_k$
from the random variable $Z$.

RFS theory is analogously concerned with random systems that have
the form

$$\Sigma = \Upsilon(X) \cup \Omega$$

where a random finite point pattern $\Upsilon(X)$ is the “signal”
generated by the point pattern $X$ (which is an instantiation of a
random point pattern $\Xi$); $\Omega$ is a random finite point
“noise” pattern; $\Sigma$ is the total random finite point pattern
that has been observed; and “$\cup$” denotes set-theoretic union.
One goal of RFS theory is to devise algorithms that can construct
an estimate $\hat{X}(Z_1, \ldots, Z_k)$ of $X$, using multiple
point patterns $Z_1, \ldots, Z_k \subseteq \mathfrak{Y}$ drawn
from $\Sigma$. One approximate approach is that of estimating only
the first-moment density $D_\Xi(x)$ of $\Xi$.

Stochastic Geometry 

Stochastic geometry addresses more complicated random patterns. An
example: the field of twinkling stars in a quantized patch of the
night sky, in which case the measurement is the union
$c_1 \cup \ldots \cup c_m$ of a finite number of hexagonally
shaped cells.

This is one instance of a germ-grain process (Stoyan et al. 1995,
pp. 59-64). Such a process is specified by two items: an RFS
$\Psi$ and a function $c_y$ that associates with each $y$ in
$\mathfrak{Y}$ a closed subset $c_y \subseteq \mathfrak{Y}$. For
example, if $\mathfrak{Y} = \mathbb{R}^2$ is the real-valued
plane, then $c_y$ could be the disk of radius $r$ centered at
$y = (x, y)$. Let $Y = \{y_1, \ldots, y_n\}$ be a particular
random draw from $\Psi$. The points $y_1, \ldots, y_n$ are the
“germs,” and $c_{y_1}, \ldots, c_{y_n}$ are the “grains” of this
random draw from the germ-grain process $\Theta$. The total
pattern in $\mathfrak{Y}$ is the union
$c_{y_1} \cup \ldots \cup c_{y_n}$ of the grains - a random draw
from $\Theta$. Germ-grain processes can be used to model many
kinds of natural processes. One example is the distribution of
graphite particles in a two-dimensional section of a piece of
iron, in which case the $c_y$ could be chosen to be line segments
rather than disks.

Stochastic geometry is concerned with random binary images that
have observation structures such as

$$\Theta = (S \cap A) \cup \Omega$$

where $S$ is a “signal” pattern; $A$ is a random pattern that
models obscurations; $\Omega$ is a random pattern that models
clutter; and $\Theta$ is the total pattern that has been observed.
A common simplifying assumption is that $\Omega$ and $A^c$ are
germ-grain processes. One goal of stochastic geometry is to devise
algorithms that can construct an optimal estimate
$\hat{S}(T_1, \ldots, T_k)$ of $S$, using multiple patterns
$T_1, \ldots, T_k \subseteq \mathfrak{Y}$ drawn from $\Theta$.

Random Sets and Image Processing 

Both point process theory and stochastic 
geometry have found extensive application 
to image-processing applications. These are 
considered briefly in turn. 

Stochastic Geometry and Image Processing. 

Stochastic geometry methods are based on the 
use of a “structuring element” B (a geometrical 
shape, such as a disk, sphere, or more complex 
structure) to modify an image. 

The dilation of a set $S$ by $B$ is $S \oplus B$ where “$\oplus$”
is Minkowski addition (Stoyan et al. 1995). Dilation tends to fill
in cavities and fissures in images. The erosion of $S$ is
$S \ominus B = (S^c \oplus B^c)^c$ where “$^c$” indicates
set-theoretic complement. Erosion tends to create and increase the
size of cavities and fissures. Morphological filters are
constructed from various combinations of dilation and erosion
operators.

Suppose that a binary image $S$ has been
degraded by some measurement process - for
example, the process $\Theta = (S \cap A) \cup \Omega$. Then,
image restoration refers to the construction of an
estimate $\hat{S}(T)$ of the original image $S$ from a
single degraded image $\Theta = T$. The restoration
operator $\hat{S}(T)$ is optimal if it can be shown to
be optimally close to $S$, given some concept of
closeness. The symmetric difference

$$T_1 \,\Delta\, T_2 = (T_1 \cup T_2) - (T_1 \cap T_2)$$

is a commonly used measure of the
dissimilarity of binary images. It can be used to
construct measures of distance between random
images. One such distance is

$$d(\Theta_1, \Theta_2) = E[\,|\Theta_1 \,\Delta\, \Theta_2|\,]$$
where $|S|$ denotes the size of the set $S$ and $E[A]$
is the expected value of the random number $A$.
Other distances require some definition of the
expected value $E[\Theta]$ of a random set $\Theta$. It has
been shown that, under certain circumstances,
certain morphological operators can be viewed as
consistent maximum a posteriori (MAP) estimators
of $S$ (Goutsias et al. 1997, p. 97).
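
As a small illustration of the symmetric-difference distance, binary images can be represented as sets of pixel coordinates and the expectation approximated by an average over paired draws. This is only a sketch under that assumption; the function names and the sample patterns are hypothetical.

```python
def symmetric_difference(t1, t2):
    """Pixels that belong to exactly one of the two binary images."""
    return (t1 | t2) - (t1 & t2)


def mean_symmetric_difference(pairs):
    """Sample average of |T1 symmetric-difference T2| over paired draws (T1, T2)."""
    pairs = list(pairs)
    return sum(len(symmetric_difference(a, b)) for a, b in pairs) / len(pairs)


# Two hypothetical three-pixel patterns that share two pixels.
T1 = {(0, 0), (0, 1), (1, 1)}
T2 = {(0, 1), (1, 1), (2, 2)}

difference = symmetric_difference(T1, T2)
average = mean_symmetric_difference([(T1, T2), (T1, T1)])
```

With many paired draws from two random images, the sample average approximates the expected size of their symmetric difference.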

RFS Theory and Image Processing. Positron- 
emission tomography (PET) is one example of 
the application of RFS theory. In PET, tissues 
of interest are suffused with a positron-emitting 
radioactive isotope. When a positron annihilates 
an electron in a suitable fashion, two photons 
are emitted in opposite directions. These photons 
are detected by sensors in a ring surrounding the 
radiating tissue. The location of the annihilation 
on the line joining the two detections can be
estimated from the time difference of arrival.

Because of the physics of radioactive decay, 
the annihilations can be accurately modeled 
as a Poisson RFS $\Psi$. Since a Poisson RFS is
completely determined by its intensity function
$D_\Psi(x)$, it is natural to try to estimate $D_\Psi(x)$.
This yields the spatial distribution $s_\Psi(x)$ of
annihilations - which, in turn, is the basis of the
PET image (Snyder and Miller 1991, pp. 115-119).
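
To make the role of the intensity function concrete, the sketch below draws samples from a one-dimensional Poisson RFS and recovers a histogram estimate of its intensity. It relies on the standard construction of a Poisson RFS (a Poisson-distributed number of points, each drawn independently from the normalized intensity); the function names, the uniform spatial density on [0, 10), and the rate of 5 are illustrative assumptions, not values from the text.

```python
import math
import random


def sample_poisson(rate, rng):
    """Poisson variate by Knuth's multiplication method (adequate for small rates)."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def sample_poisson_rfs(rate, draw_point, rng):
    """One draw of the RFS: a Poisson number of i.i.d. points."""
    return [draw_point(rng) for _ in range(sample_poisson(rate, rng))]


def estimate_intensity(draws, edges):
    """Histogram estimate: mean number of points per draw in each bin, per unit length."""
    counts = [0] * (len(edges) - 1)
    for points in draws:
        for x in points:
            for b in range(len(counts)):
                if edges[b] <= x < edges[b + 1]:
                    counts[b] += 1
                    break
    return [c / (len(draws) * (edges[b + 1] - edges[b])) for b, c in enumerate(counts)]


rng = random.Random(1)
draws = [sample_poisson_rfs(5.0, lambda r: r.uniform(0.0, 10.0), rng) for _ in range(4000)]
intensity = estimate_intensity(draws, [0.0, 5.0, 10.0])
mean_count = sum(len(points) for points in draws) / len(draws)
```

Integrating the estimated intensity over the whole region returns the average number of points per draw, which is the defining property of a first-moment density.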

Random Sets and Multitarget 
Processing 

The purpose of this section is to summarize 
the application of RFS theory to multitarget de¬ 
tection, tracking, and localization. An example: 
tracking the positions of stars in the night sky 
over an extended period of time. 

Suppose that at time $t_k$ there are an unknown
number $n$ of targets with unknown states
$x_1, \ldots, x_n$. The state of the entire multitarget
system is a finite set $X = \{x_1, \ldots, x_n\}$ with
$n \geq 0$. When interrogating a scene, many
sensors (such as radars) produce a measurement
of the form $Z = \{z_1, \ldots, z_m\}$ - i.e., a
finite set of measurements. Some of these
measurements are generated by background
clutter $C_k$. Others are generated by the targets,
with some targets possibly not having generated
any. Mathematically speaking, $Z$ is a random
draw from an RFS $\Sigma_k$ that can be decomposed
as $\Sigma_k = \Upsilon(X_k) \cup C_k$, where $\Upsilon(X_k)$ is the set of
target-generated measurements.

Conventional Multitarget Detection and
Tracking. This is based on a “divide and
conquer” strategy with three basic steps: time
update, data association, and measurement
update. At time $t_k$ we have $n$ “tracks” $x_1, \ldots, x_n$
(hypothesized targets). In the time update, an
extended Kalman filter (EKF) is used to time-
predict the tracks $x_i$ to predicted tracks $x_i^+$
at the time $t_{k+1}$ of the next measurement set
$Z_{k+1} = \{z_1, \ldots, z_m\}$.

Given $Z_{k+1}$, we can construct the following
data-association hypothesis $H$: for each
$i = 1, \ldots, n$, the predicted track $x_i^+$ generated the
detection $z_{j_i}$, for some index $j_i$, or, alternatively,
this track was not detected at all. If we remove
from $Z_{k+1}$ all of the $z_{j_1}, \ldots, z_{j_n}$, the remaining
measurements are interpreted either as being clutter
or as having been generated by new targets.
Enumerating all possible association hypotheses
(which is a combinatorially complex procedure),
we end up with a “hypothesis table” $H_1, \ldots, H_\nu$.

Given $H_\ell$, let $z_{j_i}$ be the measurement that is
hypothesized to have been generated by predicted
track $x_i^+$. Then, the measurement-update step of
an EKF is used to construct a measurement-
updated track $x_{i,j_i}$ from $x_i^+$ and $z_{j_i}$. Attached
to each $H_\ell$ is a hypothesis probability $p_\ell$ - the
probability that the particular hypothesis $H_\ell$ is the
correct one. The hypothesis with largest $p_\ell$ yields
the multitarget estimate $\hat{X} = \{\hat{x}_1, \ldots, \hat{x}_{\hat{n}}\}$.
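
The combinatorial growth of the hypothesis table can be seen by enumerating the hypotheses directly. The sketch below is illustrative only: it assumes that each track is either missed or assigned one measurement and that no measurement is shared between tracks, and the function names are hypothetical. It does not compute hypothesis probabilities.

```python
from itertools import combinations, permutations


def association_hypotheses(n_tracks, n_meas):
    """Yield tuples h where h[i] is the measurement index assigned to track i, or None."""
    for k in range(min(n_tracks, n_meas) + 1):
        # Choose which k tracks were detected, then which measurements they produced.
        for detected in combinations(range(n_tracks), k):
            for assigned in permutations(range(n_meas), k):
                hypothesis = [None] * n_tracks
                for track, meas in zip(detected, assigned):
                    hypothesis[track] = meas
                yield tuple(hypothesis)


def leftover_measurements(hypothesis, n_meas):
    """Measurements not claimed by any track: clutter or new targets."""
    used = {j for j in hypothesis if j is not None}
    return [j for j in range(n_meas) if j not in used]


table = list(association_hypotheses(3, 3))
```

Even three tracks and three measurements already produce a table with dozens of rows, which is why practical trackers prune or gate hypotheses rather than enumerate them all.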

RFS Multitarget Detection and Tracking. In
the place of tracks and hypothesis tables, this uses
multitarget state sets and multitarget probability
distributions. In place of the conventional time
update, data association, and measurement
update, it uses a recursive Bayes filter. A random
multitarget state set is an RFS $\Xi_{k|k}$ whose
points are target states. A multitarget probability
distribution is the probability distribution
$f(X_k|Z_{1:k}) = f_{\Xi_{k|k}}(X)$ of the RFS $\Xi_{k|k}$,
where $Z_{1:k}: Z_1, \ldots, Z_k$ is the time sequence
of measurement sets at time $t_k$.

RFS Time Update. The Bayes filter time-
update step $f(X_k|Z_{1:k}) \to f(X_{k+1}|Z_{1:k})$
requires a multitarget Markov transition function
$f(X_{k+1}|X_k)$. It is the probability that the
multitarget system will have multitarget state set
$X_{k+1}$ at time $t_{k+1}$, if it had multitarget state set
$X_k$ at time $t_k$. It takes into account all pertinent
characteristics of the targets: individual target 
motion, target appearance, target disappearance, 
environmental constraints, etc. It is explicitly 
constructed from an RFS multitarget motion 
model using a multitarget integrodifferential 
calculus. 

RFS Measurement Update. The Bayes filter
measurement-update step $f(X_{k+1}|Z_{1:k}) \to f(X_{k+1}|Z_{1:k+1})$
is just Bayes' rule. It requires
a multitarget likelihood function $f_{k+1}(Z|X)$ -
the likelihood that a measurement set Z will be 
generated, if a system of targets with state set 
X is present. It takes into account all pertinent 
characteristics of the sensor(s): sensor noise, 
fields of view and obscurations, probabilities 
of detection, false alarms, and/or clutter. It is 
explicitly constructed from an RFS measurement 
model using multitarget calculus. 
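
For orientation, the time-update and measurement-update steps are commonly written as the recursion below. This is a sketch in the notation of this section; the set integral $\int \cdot \, \delta X$ and the normalizing constant $f(Z_{k+1}|Z_{1:k})$ are assumed here from the standard multitarget calculus, since the text does not display them.

```latex
\begin{align}
f(X_{k+1}|Z_{1:k}) &= \int f(X_{k+1}|X_k)\, f(X_k|Z_{1:k})\, \delta X_k \\
f(X_{k+1}|Z_{1:k+1}) &= \frac{f_{k+1}(Z_{k+1}|X_{k+1})\, f(X_{k+1}|Z_{1:k})}{f(Z_{k+1}|Z_{1:k})} \\
f(Z_{k+1}|Z_{1:k}) &= \int f_{k+1}(Z_{k+1}|X)\, f(X|Z_{1:k})\, \delta X
\end{align}
```

The first line is the prediction through the Markov transition function, the second is Bayes' rule with the multitarget likelihood, and the third is the normalization that makes the posterior integrate to one.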

RFS State Estimation. Determination of the
number $n$ and states $x_1, \ldots, x_n$ of the targets is
accomplished using a Bayes-optimal multitarget
state estimator. The idea is to determine the $X_{k+1}$
that maximizes $f(X_{k+1}|Z_{1:k+1})$ in some sense.

Approximate Multitarget RFS Filters. 

The multitarget Bayes filter is, in general, 
computationally intractable. Central to the RFS 
approach is a toolbox of techniques - including 
the multitarget calculus - designed to produce 
statistically principled approximate multitarget 
filters. The two most well studied are the
probability hypothesis density (PHD) filter and
its generalization, the cardinalized PHD (CPHD)
filter. In such filters, $f(X_k|Z_{1:k})$ is replaced by
the first-moment density $D(x_k|Z_{1:k})$ of $\Xi_{k|k}$.
These filters have been shown to be faster and
to perform better than conventional approaches in
some applications.
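
The defining property of a first-moment density may help explain why it is a useful replacement for the full multitarget distribution. As a sketch in the notation of this section (the region symbol $S$ is an assumption introduced for illustration):

```latex
\int_S D(x|Z_{1:k})\, dx \;=\; E\big[\, |\Xi_{k|k} \cap S| \,\big]
```

That is, integrating the density over any region $S$ of the single-target state space gives the expected number of targets in $S$; peaks of $D$ therefore indicate likely target states, and its integral over the whole space estimates the target count.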

Random Sets and Human-Mediated Data 



Natural-language statements and inference rules 
have already been mentioned as examples of 
human-mediated information. Expert-systems 
theory was introduced in part to address 
situations - such as this - that involve 
uncertainties other than randomness. Expert- 
system methodologies include fuzzy set theory,
the Dempster-Shafer (D-S) theory of uncertain
evidence, and rule-based inference. RS theory
provides solid Bayesian foundations for them 
and allows human-mediated data to be processed 
using standard Bayesian estimation techniques. 
The purpose of this section is to briefly 
summarize this aspect of the RS approach. 

The relationships between expert-systems 
theory and random set theory were first 
established by researchers such as Orlov (1978), 
Höhle (1982), Nguyen (1978), and Goodman and
Nguyen (1985). At a relatively early stage, it 
was recognized that random set theory provided 
a potential means of unifying much of expert- 
systems theory (Goodman and Nguyen 1985; 
Kruse et al. 1991). 

A conventional sensor measurement at time $t_k$
is typically represented as $Z_k = \eta(x_k) + V_k$ -
equivalently formulated as a likelihood function
$f(z_k|x_k)$. It is conventional to think of $z_k$ as
the actual “measurement” and of $f(z_k|x_k)$ as the
full description of the uncertainty associated with
it. In actuality, $z_k$ is just a mathematical model
$z_{\zeta_k}$ of some real-world measurement $\zeta_k$. Thus,
the likelihood actually has the form $f(\zeta_k|x_k) = f(z_{\zeta_k}|x_k)$.

This observation assumes crucial importance
when one considers human-mediated data. Consider
the simple natural-language statement

$\zeta$ = “The target is near the tower”

where the tower is a landmark, located at a known
position $(x_0, y_0)$, and where the term “near” is
assumed to have the following specific meaning:
$(x, y)$ is near $(x_0, y_0)$ means that $(x, y) \in T_5$,
where $T_5$ is a disk of radius 5 m, centered at
$(x_0, y_0)$. If $z = (x, y)$ is the actual measurement
of the target’s position, then $\zeta$ is equivalent to the
formula $z \in T_5$. Since $z$ is just one possible draw
from $Z_k$, we can say that $\zeta$ - or, equivalently,
$T_5$ - is actually a constraint on the underlying
measurement process: $Z_k \in T_5$.

Because the word “near” is rather vague, we
could just as well say that $z \in T_5$ is the best
choice, with confidence $w_5 = 0.7$; that $z \in T_4$ is
the next best choice, with confidence $w_4 = 0.2$;
and that $z \in T_6$ is the least best, with confidence
$w_6 = 0.1$. Let $\Theta$ be the random subset of $\mathfrak{Z}$
defined by $\Pr(\Theta = T_i) = w_i$ for $i = 4, 5, 6$. In
this case, $\zeta$ is equivalent to the random constraint

$Z_k \in \Theta$.

The probability

$$\rho_k(\Theta|x_k) = \Pr(\eta(x_k) + V_k \in \Theta) = \Pr(Z_k \in \Theta \mid X_k = x_k)$$

is called a generalized likelihood function (GLF). 
GLFs can be constructed for more complex 
natural-language statements, for inference rules, 
and more. Using their GLF representations, 
such “nontraditional measurements” can be 
processed using single- and multi-object 
recursive Bayes filters and their approximations. 
As a consequence, it can be shown that fuzzy 
logic, the D-S theory, and rule-based inference 
can be subsumed within a single Bayesian- 
probabilistic paradigm. 
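
The “near the tower” example can be turned into a numerical generalized likelihood. The sketch below makes several assumptions that are not in the text: the tower sits at the origin, the disks have radii 4, 5, and 6 m, the measurement function is the identity (the measurement is the target position plus noise), the noise is isotropic Gaussian, and the random set is independent of the noise. All names are hypothetical.

```python
import math
import random

TOWER = (0.0, 0.0)                        # assumed landmark position (x0, y0)
DISKS = {4.0: 0.2, 5.0: 0.7, 6.0: 0.1}    # disk radius in m -> confidence w_i


def membership(z, tower=TOWER, disks=DISKS):
    """Pr(z in Theta): total weight of the disks T_i that contain the point z."""
    distance = math.hypot(z[0] - tower[0], z[1] - tower[1])
    return sum(w for radius, w in disks.items() if distance <= radius)


def generalized_likelihood(x, sigma=1.0, n_samples=20000, seed=0):
    """Monte Carlo estimate of Pr(x + V in Theta) for V ~ N(0, sigma^2 I)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        z = (x[0] + rng.gauss(0.0, sigma), x[1] + rng.gauss(0.0, sigma))
        total += membership(z)
    return total / n_samples
```

Evaluated on a grid of candidate target positions, `generalized_likelihood` plays the same role in a Bayes update as an ordinary sensor likelihood: positions well inside the smallest disk score close to 1, and positions far from the tower score close to 0.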

Summary and Future Directions 

In the engineering world, the theory of random
sets has been associated primarily with certain
specialized image-processing applications, such
as morphological filters and tomographic imaging.
It has more recently found application in
fields such as multitarget tracking and in expert-
systems theory. All of these fields of application
remain areas of active research.

Cross-References 

► Estimation, Survey on 

► Extended Kalman Filters 

► Nonlinear Filters 

Recommended Reading 

Molchanov (2005) provides a definitive exposition
of the general theory of random sets. Two
excellent references for stochastic geometry are
Stoyan et al. (1995) and Barndorff-Nielsen and
van Lieshout (1999). The books by Kingman
(1993) and Daley and Vere-Jones (1998) are
good introductions to point process theory. The
application of point process theory and stochastic
geometry to image processing is addressed
in, respectively, Snyder and Miller (1991) and
Stoyan et al. (1995). The application of RFSs to
multitarget estimation is addressed in the tutorials
Mahler (2004, 2013) and the book Mahler (2007).
Introductions to the application of random sets to
expert systems can be found in Kruse et al. (1991)
and Mahler (2007), Chaps. 3-6.
Abstract

This entry discusses the history and describes 
the multitude of methods and applications of this 
important branch of stochastic process theory. 


Keywords 

Linear stochastic filtering; Markov step processes;
Maximum likelihood estimation; Riccati
equation; Stratonovich-Kushner equation

Estimation is the process of inferring the value of
an unknown quantity of interest from noisy,
direct or indirect, observations of that quantity.
Due to its great practical relevance, estimation
has a long history and an enormous variety
of applications in all fields of engineering and
science. A certainly incomplete list of possible
application domains of estimation includes the
following: statistics (Bard 1974; Ghosh et al. 
1997; Koch 1999; Lehmann and Casella 1998; 
Tsybakov 2009; Wertz 1978), telecommunication 
systems (Sage and Melsa 1971; Schonhoff 
and Giordano 2006; Snyder 1968; Van Trees 
1971), signal and image processing (Barkat 
2005; Biemond et al. 1983; Elliott et al. 2008; 
Itakura 1971; Kay 1993; Kim and Woods 1998; 
Levy 2008; Najim 2008; Poor 1994; Tuncer 
and Friedlander 2009; Wakita 1973; Woods and 
Radewan 1977), aerospace engineering (McGee 
and Schmidt 1985), tracking (Bar-Shalom and 
Fortmann 1988; Bar-Shalom et al. 2001, 2013; 
Blackman and Popoli 1999; Farina and Studer 
1985, 1986), navigation (Dissanayake et al.
2001; Durrant-Whyte and Bailey 2006a,b; Farrell
and Barth 1999; Grewal et al. 2001; Mullane 
et al. 2011; Schmidt 1966; Smith et al. 1986; 
Thrun et al. 2006), control systems (Anderson 
and Moore 1979; Athans 1971; Goodwin et al. 
2005; Joseph and Tou 1961; Kalman 1960a; 
Maybeck 1979, 1982; Soderstrom 1994; Stengel 
1994), econometrics (Aoki 1987; Pindyck 
and Roberts 1974; Zellner 1971), geophysics 
(e.g., seismic deconvolution) (Bayless and
Brigham 1970; Flinn et al. 1967; Mendel 
1977, 1983, 1990), oceanography (Evensen 
1994a; Ghil and Malanotte-Rizzoli 1991), 
weather forecasting (Evensen 1994b, 2007; 
McGarty 1971), environmental engineering 
(Dochain and Vanrolleghem 2001; Heemink 
and Segers 2002; Nachazel 1993), demographic 
systems (Leibungudt et al. 1983), automotive 
systems (Barbarisi et al. 2006; Stephant et al. 
2004), failure detection (Chen and Patton 
1999; Mangoubi 1998; Willsky 1976), power 
systems (Abur and Gomez Esposito 2004; 
Debs and Larson 1970; Miller and Lewis 
1971; Monticelli 1999; Toyoda et al. 1970), 
nuclear engineering (Robinson 1963; Roman 
et al. 1971; Sage and Masters 1967; Venerus 
and Bullock 1970), biomedical engineering 
(Bekey 1973; Snyder 1970; Stark 1968), pattern 
recognition (Andrews 1972; Ho and Agrawala 
1968; Lainiotis 1972), social networks (Snijders 
et al. 2012), etc. 
Chapter Organization

The rest of the chapter is organized as follows. 
Section “Historical Overview on Estimation” 
will provide a historical overview on estimation. 
The next section will discuss applications of 
estimation. Connections between estimation and 
information theories will be explored in the subsequent
section. Finally, the section “Conclusions
and Future Trends” will conclude the chapter 
by discussing future trends in estimation. An 
extensive list of references is also provided. 


Historical Overview on Estimation 

A possibly incomplete list of the major achievements
in estimation theory and applications is
reported in Table 1. The entries of the table,
sorted in chronological order, provide for each
contribution the name of the inventor (or inventors),
the date, and a short description with main
bibliographical references.

Probably the first important application of
estimation dates back to the beginning of
the nineteenth century, when least-squares
estimation (LSE), invented by Gauss in 1795
(Gauss 1995; Legendre 1810), was successfully
exploited in astronomy for predicting planet
orbits (Gauss 1806). Least-squares estimation
follows a deterministic approach by minimizing
the sum of squares of residuals, defined as
differences between observed data and model-
predicted estimates. A subsequently introduced
statistical approach is maximum likelihood
estimation (MLE), popularized by R. A. Fisher
between 1912 and 1922 (Fisher 1912, 1922,
1925). MLE consists of finding the estimate of
the unknown quantity of interest as the value
that maximizes the so-called likelihood function,
defined as the conditional probability density
function of the observed data given the quantity to
be estimated. In intuitive terms, MLE maximizes
the agreement of the estimate with the observed
data. Whenever the observation noise is assumed
Gaussian (Kim and Shevlyakov 2008; Park et al.
2013), MLE coincides with LSE.
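
The coincidence of the two estimators under Gaussian noise can be checked numerically on a straight-line model. The following is a minimal sketch, assuming a hypothetical model y = slope * x + intercept with independent Gaussian noise of known standard deviation; the function names and the data are illustrative only.

```python
import math


def least_squares_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def gaussian_neg_log_likelihood(slope, intercept, xs, ys, sigma=1.0):
    """Negative log-likelihood of the data under i.i.d. N(0, sigma^2) noise."""
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 0.5 * rss / sigma ** 2 + len(xs) * math.log(sigma * math.sqrt(2.0 * math.pi))


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
slope, intercept = least_squares_line(xs, ys)
best = gaussian_neg_log_likelihood(slope, intercept, xs, ys)
```

Because the negative log-likelihood is the residual sum of squares scaled by a positive constant plus a term that does not depend on the parameters, any perturbation of the least-squares solution can only increase it.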


While estimation problems had been 
addressed for several centuries, it was not 
until the 1940s that a systematic theory of 
estimation started to be established, mainly 
relying on the foundations of the modern theory 
of probability (Kolmogorov 1933). Actually, the 
roots of probability theory can be traced back to
combinatorics (the Stomachion
puzzle invented by Archimedes (Netz and Noel
2011)) in the third century B.C. and to
gambling theory (work of Cardano, Pascal, de
Fermat, Huygens) in the sixteenth and seventeenth
centuries.

Unlike the previous work, which was devoted to
the estimation of constant parameters, in the
period 1940-1960 attention shifted
mainly toward the estimation of signals. In
particular, Wiener in 1940 (Wiener 1949)
and Kolmogorov in 1941 (Kolmogorov 1941)
formulated and solved the problem of linear
minimum mean-square error (MMSE) estimation
of continuous-time and, respectively, discrete-time
stationary random signals. In the late 1940s
and in the 1950s, Wiener-Kolmogorov’s theory 
was extended and generalized in many directions 
exploiting both time-domain and frequency- 
domain approaches. At the beginning of the 
1960s Rudolf E. Kalman made pioneering 
contributions to estimation by providing the 
mathematical foundations of the modern theory 
based on state-variable representations. In 
particular, Kalman solved the linear MMSE 
filtering and prediction problems both in discrete
time (Kalman 1960b) and in continuous time
(Kalman and Bucy 1961); the resulting optimal
estimator was named the Kalman filter
(KF) after him. As a further contribution, Kalman also
singled out the key technical conditions, i.e., 
observability and controllability, for which the 
resulting optimal estimator turns out to be 
stable. Kalman’s work went well beyond earlier 
contributions of A. Kolmogorov, N. Wiener, and 
their followers (“frequency-domain” approach) 
by means of a general state-space approach. From 
the theoretical viewpoint, the KF is an optimal 
estimator, in a wide sense, of the state of a linear 
dynamical system from noisy measurements; 
specifically it is the optimal MMSE estimator in 
Estimation, Survey on. Table 1 Major developments on estimation

Archimedes 

Third century B.C. 

Combinatorics (Netz and Noel 2011) as the basis of 
probability 

G. Cardano, B. Pascal, P. de 
Fermat, C. Huygens 

Sixteenth-seventeenth centuries

Roots of the theory of probability (Devlin 2008) 

J. F. Riccati 

1722-1723 

Differential Riccati equation (Riccati 1722, 1723),
subsequently exploited in the theory of linear stochastic
filtering

T. Bayes 

1763 

Bayes’ formula on conditional probability (Bayes 
1763; McGrayne 2011) 

C. F. Gauss, A. M. Legendre 

1795-1810 

Least-squares estimation and its applications to the
prediction of planet orbits (Gauss 1806, 1995; Legendre
1810)

P. S. Laplace

1814 

Theory of probability (Laplace 1814) 

R. A. Fisher 

1912-1922 

Maximum likelihood estimation (Fisher 1912, 1922, 
1925) 

A. N. Kolmogorov 

1933 

Modern theory of probability (Kolmogorov 1933)

N. Wiener 

1940 

Minimum mean-square error estimation of continuous-time
stationary random signals (Wiener 1949)

A. N. Kolmogorov 

1941 

Minimum mean-square error estimation of discrete-time
stationary random signals (Kolmogorov 1941)

H. Cramer, C. R. Rao 

1945 

Theoretical lower bound on the covariance of estimators
(Cramer 1946; Rao 1945)

S. Ulam, J. von Neumann, 
N. Metropolis, E. Fermi 

1946-1949 

Monte Carlo method (Los Alamos Scientific Laboratory
1966; Metropolis and Ulam 1949; Ulam 1952;
Ulam et al. 1947)

J. Sklansky, T. R. Benedict,
G. W. Bordner, H. R. Simpson,
S. R. Neal

1957-1967 

α-β and α-β-γ filters (Benedict and Bordner
1962; Neal 1967; Painter et al. 1990; Simpson 1963; 
Sklansky 1957) 


R. L. Stratonovich, H. J. Kushner

1959-1964

Bayesian approach to stochastic nonlinear filtering of
continuous-time systems, i.e., Stratonovich-Kushner
equation for the evolution of the state conditional
probability density (Jazwinski 1970; Kushner 1962,
1967; Stratonovich 1959, 1960)


R. E. Kalman 

1960 

Linear filtering and prediction for discrete-time systems
(Kalman 1960b)

R. E. Kalman 

1961 

Observability of linear dynamical systems (Kalman 
1960a) 

R. E. Kalman, R. S. Bucy 

1961 

Linear filtering and prediction for continuous-time systems
(Kalman and Bucy 1961)

A. E. Bryson, M. Frazier, 
H. E. Rauch, F. Tung, C. T. 
Striebel, D. Q. Mayne, J. S. 
Meditch, D. C. Fraser, L. E. 
Zachrisson, B. D. O. Anderson,
etc.

Since 1963 

Smoothing of linear and nonlinear systems (Anderson 
and Chirarattananon 1972; Bryson and Frazier 1963; 
Mayne 1966; Meditch 1967; Rauch 1963; Rauch et al. 
1965; Zachrisson 1969) 

D. G. Luenberger 

1964 

State observer for a linear system (Luenberger 1964) 

Y. C. Ho, R. C. K. Lee 

1964 

Bayesian approach to recursive nonlinear estimation 
for discrete-time systems (Ho and Lee 1964) 

W. M. Wonham 

1965 

Optimal filtering for Markov step processes (Wonham 
1965) 

A. H. Jazwinski 

1966 

Bayesian approach to stochastic nonlinear filtering for 
continuous-time stochastic systems with discrete-time 
observations (Jazwinski 1966) 



S. F. Schmidt 

1966 

Extended Kalman filter and its application for the 
manned lunar missions (Schmidt 1966) 

P. L. Falb, A. V. Balakrishnan,
J. L. Lions, S. G.
Tzafestas, J. M. Nightingale,
H. J. Kushner, J. S.
Meditch, etc.

Since 1967 

State estimation for infinite-dimensional (e.g., distributed
parameter, partial differential equation (PDE),
delay) systems (Balakrishnan and Lions 1967; Falb
1967; Kushner 1970; Kwakernaak 1967; Meditch
1971; Tzafestas and Nightingale 1968)

T. Kailath 

1968 

Principle of orthogonality and innovation approach 
to estimation (Frost and Kailath 1971; Kailath 1968, 
1970; Kailath and Frost 1968; Kailath et al. 2000) 

A. H. Jazwinski, B. Rawlings,
etc.

Since 1968 

Limited memory (receding-horizon, moving-horizon) 
state estimation with constraints (Alessandri et al. 
2005, 2008; Jazwinski 1968; Rao et al. 2001, 2003) 

F. C. Schweppe, D. P. Bertsekas,
I. B. Rhodes, M. Milanese,
etc.

Since 1968 

Set-membership recursive state estimation for systems
with unknown but bounded noises (Alamo et al.
2005; Bertsekas and Rhodes 1971; Chisci et al. 1996;
Combettes 1993; Milanese and Belforte 1982; Milanese
and Vicino 1993; Schweppe 1968; Vicino and
Zappa 1996)

J. E. Potter, G. Golub, S.
F. Schmidt, P. G. Kaminski,
A. E. Bryson, A. Andrews,
G. J. Bierman, M. Morf, T.
Kailath, etc.

1968-1975 

Square-root filtering (Andrews 1968; Bierman 1974, 
1977; Golub 1965; Kaminski and Bryson 1972; Morf 
and Kailath 1975; Potter and Stern 1963; Schmidt
1970) 

C. W. Helstrom 

1969 

Quantum estimation (Helstrom 1969, 1976) 

D. L. Alspach, H. W. Sorenson

1970-1972 

Gaussian-sum filters for nonlinear and/or non- 
Gaussian systems (Alspach and Sorenson 1972; 
Sorenson and Alspach 1970, 1971) 

T. Kailath, M. Morf, G. S. 
Sidhu 

1973-1974 

Fast Chandrasekhar-type algorithms for recursive state 
estimation of stationary linear systems (Kailath 1973; 
Morf et al. 1974)

A. Segall 

1976 

Recursive estimation from point processes (Segall 
1976) 

J. W. Woods and C. Radewan

1977 

Kalman filter in two dimensions (Woods and Radewan 
1977) for image processing 

J. H. Taylor 

1979 

Cramer-Rao lower bound (CRLB) for recursive state 
estimation with no process noise (Taylor 1979) 

D. Reid 

1979 

Multiple Hypothesis Tracking (MHT) filter for multitarget
tracking (Reid 1979)

L. Servi, Y. Ho 

1981 

Optimal filtering for linear systems with uniformly 
distributed measurement noise (Servi and Ho 1981) 

V. E. Benes 

1981 

Exact finite-dimensional optimal MMSE filter for a 
class of nonlinear systems (Benes 1981) 

H. V. Poor, D. Looze, J. Darragh,
S. Verdu, M. J. Grimble,
etc.

1981-1988 

Robust (e.g., H∞) filtering (Darragh and Looze 1984;
Grimble 1988; Hassibi et al. 1999; Poor and Looze 
1981; Simon 2006; Verdu and Poor 1984) 

V. J. Aidala, S. E. Hammel 

1983 

Bearings-only tracking (Aidala and Hammel 1983; 
Farina 1999) 

F. E. Daum 

1986 

Extension of the Benes filter to a more general class of 
nonlinear systems (Daum 1986) 



L. Dai and others 

Since 1987 

State estimation for linear descriptor (singular, implicit)
stochastic systems (Chisci and Zappa 1992; Dai
1987, 1989; Nikoukhah et al. 1992) 

N. J. Gordon, D. J. Salmond, 
A. M. F. Smith 

1993 

Particle (sequential Monte Carlo) filter (Doucet et al. 
2001; Gordon et al. 1993; Ristic et al. 2004) 

K. C. Chou, A. S. Willsky, 
A. Benveniste 

1994 

Multiscale Kalman filter (Chou et al. 1994) 

G. Evensen 

1994 

Ensemble Kalman filter for data assimilation in meteorology
and oceanography (Evensen 1994b, 2007)

R. P. S. Mahler 

1994 

Random set filtering (Mahler 1994, 2007a; Ristic et al. 
2013) 

S. J. Julier, J. K. Uhlmann, 
H. Durrant-Whyte 

1995 

Unscented Kalman filter (Julier and Uhlmann 2004; 
Julier et al. 1995) 

A. Germani et al. 

Since 1996 

Polynomial extended Kalman filter for nonlinear 
and/or non-Gaussian systems (Carravetta et al. 1996; 
Germani et al. 2005) 

P. Tichavsky, C. H. Muravchik,
A. Nehorai

1998 

Posterior Cramer-Rao lower bound (PCRLB) for recursive
state estimation (Tichavsky et al. 1998; van Trees
and Bell 2007)

R. Mahler 

2003, 2007 

Probability hypothesis density (PHD) and cardinalized 
PHD (CPHD) filters (Mahler 2003, 2007b; Ristic 2013; 
Vo and Ma 1996; Vo et al. 2007) 

A.G. Ramm 

2005 

Estimation of random fields (Ramm 2005) 

M. Hernandez, A. Farina, B. 
Ristic 

2006 

PCRLB for tracking in the case of detection probability 
less than one and false alarm probability greater than 
zero (Hernandez et al. 2006) 

Olfati-Saber and others 

Since 2007 

Consensus filters (Olfati-Saber et al. 2007; Calafiore 
and Abrate 2009; Xiao et al. 2005; Alriksson and 
Rantzer 2006; Olfati-Saber 2007; Kamgarpour and 
Tomlin 2007; Stankovic et al. 2009; Battistelli et al. 
2011, 2012, 2013; Battistelli and Chisci 2014) for 
networked estimation 


the Gaussian case (e.g., for normally distributed 
noises and initial state) and the best linear 
unbiased estimator irrespective of the noise 
and initial state distributions. From the practical 
viewpoint, the KF enjoys the desirable properties 
of being linear and acting recursively, step-by- 
step, on a noise-contaminated data stream. This 
allows for cheap real-time implementation on 
digital computers. Further, the universality of 
“state-variable representations” allows almost 
any estimation problem to be included in the 
KF framework. For these reasons, the KF has been,
and continues to be, an extremely effective and
easy-to-implement tool for a great variety of
practical tasks, e.g., to detect signals in noise
or to estimate unmeasurable quantities from
accessible observables. Due to the generality
of the state estimation problem, which actually
encompasses parameter and signal estimation as
special cases, the literature on estimation from
1960 to the present has mostly concentrated
on extensions and generalizations of Kalman’s
work in several directions. Considerable efforts,
motivated by the ubiquitous presence of 
nonlinearities in practical estimation problems, 
have been devoted to nonlinear and/or non- 
Gaussian filtering, starting from the seminal 
papers of Stratonovich (1959,1960) and Kushner 
(1962, 1967) for continuous-time systems, 
Ho and Lee (1964) for discrete-time systems, 
and Jazwinski (1966) for continuous-time 
systems with discrete-time observations. In these 
papers, state estimation is cast in a probabilistic
(Bayesian) framework as the problem of evolving 
in time the state conditional probability density 
given observations (Jazwinski 1970). Work on 
nonlinear filtering has produced over the years 
several nonlinear state estimation algorithms, 
e.g., the extended Kalman filter (EKF) (Schmidt 
1966), the unscented Kalman filter (UKF) (Julier 
and Uhlmann 2004; Julier et al. 1995), the 
Gaussian-sum filter (Alspach and Sorenson 
1972; Sorenson and Alspach 1970, 1971), the 
sequential Monte Carlo (also called particle) filter 
(SMCF) (Doucet et al. 2001; Gordon et al. 1993; 
Ristic et al. 2004), and the ensemble Kalman filter 
(EnKF) (Evensen 1994a,b, 2007), which have
been, and still are, successfully employed
in various application domains. In particular,
the SMCF and EnKF are stochastic simulation
algorithms taking inspiration from the work in the
1940s on the Monte Carlo method (Metropolis
and Ulam 1949), which has recently attracted renewed
interest thanks to the tremendous advances in
computing technology. A thorough review of
nonlinear filtering can be found, e.g., in Daum
(2005) and Crisan and Rozovskii (2011).
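
The recursive, step-by-step character of the KF is easiest to see in the scalar case. The sketch below is a minimal illustration, assuming a hypothetical scalar model x_{k+1} = a x_k + w_k, z_k = h x_k + v_k with process-noise variance q and measurement-noise variance r; the function and parameter names are not taken from the cited references.

```python
def kalman_step(x, p, z, a=1.0, h=1.0, q=0.0, r=1.0):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : previous state estimate and its error variance
    z    : new measurement
    """
    # Time update (prediction through the state model).
    x_pred = a * x
    p_pred = a * p * a + q
    # Measurement update (correction with the Kalman gain).
    gain = p_pred * h / (h * p_pred * h + r)
    x_new = x_pred + gain * (z - h * x_pred)
    p_new = (1.0 - gain * h) * p_pred
    return x_new, p_new


def run_filter(measurements, x0=0.0, p0=1e6, **model):
    """Process a data stream one measurement at a time; return all (estimate, variance) pairs."""
    x, p = x0, p0
    history = []
    for z in measurements:
        x, p = kalman_step(x, p, z, **model)
        history.append((x, p))
    return history


# Estimating a constant (a = 1, q = 0) from four noisy readings.
history = run_filter([4.0, 6.0, 5.0, 5.0])
```

With a static state and a very uncertain prior, the filter reproduces the running sample mean and its variance r/n, which connects the recursive form back to classical least-squares estimation of a constant.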

Other interesting areas of investigation have
concerned smoothing (Bryson and Frazier 1963),
robust filtering for systems subject to modeling
uncertainties (Poor and Looze 1981), and
state estimation for infinite-dimensional (i.e., distributed
parameter and/or delay) systems (Balakrishnan
and Lions 1967). Further, a lot of
attention has been devoted to the implementation
of the KF, specifically square-root filtering
(Potter and Stern 1963) for improved numerical
robustness and fast KF algorithms (Kailath 1973;
Morf et al. 1974) for enhancing computational
efficiency. Worth mentioning is the work over
the years on theoretical bounds on the estimation
performance, originating from the seminal papers
of Rao (1945) and Cramer (1946) on the lower
bound of the MSE for parameter estimation and
subsequently extended in Tichavsky et al. (1998)
to nonlinear filtering and in Hernandez et al.
(2006) to more realistic estimation problems with
possible missed and/or false measurements. An
extensive review of this work on Bayesian bounds
for estimation, nonlinear filtering, and tracking
can be found in van Trees and Bell (2007). A brief
review of the earlier (until 1974) state of the art in
estimation can be found in Lainiotis (1974).
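
As a reminder of what the lower bound of Rao and Cramer states in its simplest form, consider an unbiased estimator $\hat{\theta}$ of a scalar parameter $\theta$ from data $z$ with density $f(z|\theta)$. The symbols below are introduced here for illustration and are not taken from the cited papers.

```latex
\operatorname{var}(\hat{\theta}) \;\geq\; \frac{1}{I(\theta)},
\qquad
I(\theta) = E\!\left[\left(\frac{\partial}{\partial\theta}\,\log f(z|\theta)\right)^{2}\right]
```

For example, for $N$ independent Gaussian samples with unknown mean $\theta$ and known variance $\sigma^2$, the Fisher information is $I(\theta) = N/\sigma^2$, so no unbiased estimator can have variance below $\sigma^2/N$, a value attained by the sample mean.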

Applications 

Astronomy 

The problem of making estimates and predictions
on the basis of noisy observations originally attracted
attention many centuries ago in the
field of astronomy. In particular, the first attempt
to provide an optimal estimate, i.e., such that a
certain measure of the estimation error be minimized,
was due to Galileo Galilei who, in his
Dialogue Concerning the Two Chief World Systems (1632)
(Galilei 1632), suggested, as a possible criterion
for estimating the position of Tycho Brahe’s supernova,
the estimate that required the “minimum
amendments and smallest corrections” to
the data. Later, C. F. Gauss mathematically specified
this criterion by introducing in 1795 the least-
squares method (Gauss 1806, 1995; Legendre
1810), which was successfully applied in 1801
to predict the location of the asteroid Ceres.
This asteroid, originally discovered by the Italian
astronomer Giuseppe Piazzi on January 1, 1801,
and then lost in the glare of the sun, was in
fact recovered 1 year later by the Hungarian
astronomer F. X. von Zach exploiting the least-
squares predictions of Ceres’ position provided
by Gauss.

Statistics 

Starting from the work of Fisher in the 1920s 
(Fisher 1912, 1922, 1925), maximum likelihood 
estimation has been extensively employed 
in statistics for estimating the parameters of 
statistical models (Bard 1974; Ghosh et al. 
1997; Koch 1999; Lehmann and Casella 1998;
Tsybakov 2009; Wertz 1978). 

Telecommunications and Signal/Image
Processing 

The Wiener-Kolmogorov theory of signal estimation,
developed in the period 1940-1960
and originally conceived by Wiener during
the Second World War for predicting aircraft
trajectories in order to direct antiaircraft fire,
subsequently gave rise to many applications in
telecommunications and signal/image processing
(Barkat 2005; Biemond et al. 1983; Elliott
et al. 2008; Itakura 1971; Kay 1993; Kim
and Woods 1998; Levy 2008; Najim 2008;
Poor 1994; Tuncer and Friedlander 2009;
Van Trees 1971; Wakita 1973; Woods and
Radewan 1977). For instance, Wiener filters have
been successfully applied to linear prediction,
acoustic echo cancellation, signal restoration, and
image/video de-noising. But it was the discovery
of the Kalman filter in 1960 that revolutionized
estimation by providing an effective and powerful
tool for the solution of any linear estimation problem,
whether static or dynamic, stationary or adaptive.
A recently conducted, and probably non-exhaustive,
search found over 16,000 patents related to the “Kalman
filter,” spreading over all areas of engineering
and over a period of more than 50 years. What
is astonishing is that even nowadays, more than
50 years after its discovery, new patents and
scientific papers continue to appear, presenting novel applications
and/or novel extensions in many directions (e.g.,
to nonlinear filtering) of the KF. Since 1992
the number of patents registered every year and
related to the KF has followed an exponential law.

Space Navigation and Aerospace 
Applications 

The first important application of the Kalman
filter was in the NASA (National Aeronautics
and Space Administration) space program. As
reported in a NASA technical report (McGee and
Schmidt 1985), Kalman presented his new ideas
while visiting Stanley F. Schmidt at the NASA
Ames Research Center in 1960, and this meeting
stimulated the use of the KF during the Apollo
program (in particular, in the guidance system of
Saturn V during the Apollo 11 flight to the Moon),
and, furthermore, in the NASA Space Shuttle
and in Navy submarines and unmanned aerospace
vehicles and weapons, such as cruise missiles.
Further, to cope with the nonlinearity of the space
navigation problem and the small word length
of the onboard computer, the extended Kalman
filter for nonlinear systems and square-root filter
implementations for enhanced numerical robustness
were developed as part of NASA’s
Apollo program. The aerospace field was only
the first of a long and continuously expanding
list of application domains where the Kalman
filter and its nonlinear generalizations have found
widespread and beneficial use.

Control Systems and System Identification 

The work on Kalman filtering (Kalman 1960b;
Kalman and Bucy 1961) also had a significant
impact on control system design and implementation.
In Kalman (1960a) the duality between estimation
and control was pointed out, in that for a
certain class of control and estimation problems
one can solve the control (estimation) problem
for a given dynamical system by resorting to a
corresponding estimation (control) problem for
a suitably defined dual system. In particular, the
Kalman filter has been shown to be the dual of
the linear-quadratic (LQ) regulator, and the two
dual techniques constitute the linear-quadratic-Gaussian
(LQG) (Joseph and Tou 1961) regulator.
The latter consists of an LQ regulator feeding
back in a linear way the state estimate provided 
by a Kalman filter, which can be independently 
designed in view of the separation principle. 
The KF as well as LSE and MLE techniques 
are also widely used in system identification 
(Ljung 1999; Soderstrom and Stoica 1989) for 
both parameter estimation and output prediction 
purposes. 

Tracking 

One of the major application areas for estimation
is tracking (Bar-Shalom and Fortmann 1988;
Bar-Shalom et al. 2001, 2013; Blackman and Popoli
1999; Farina and Studer 1985, 1986), i.e., the
task of following the motion of moving objects
(e.g., aircraft, ships, ground vehicles, persons,
animals) given noisy measurements of kinematic
variables from remote sensors (e.g., radar, sonar,
video cameras, wireless sensors, etc.). The development
of the Wiener filter in the 1940s was
actually motivated by radar tracking of aircraft
for automatic control of antiaircraft guns. Such
filters began to be used in the 1950s when
computers were integrated with radar systems,
and then in the 1960s more advanced and better
performing Kalman filters came into use. Still
today it can be said that the Kalman filter and
its nonlinear generalizations (e.g., EKF (Schmidt
1966), UKF (Julier and Uhlmann 2004), and
particle filter (Gordon et al. 1993)) represent
the workhorses of tracking and sensor fusion.
Tracking, however, is usually much more complicated
than a simple state estimation problem due
to the presence of false measurements (clutter)
and multiple objects in the surveillance region
of interest, as well as the uncertainty about
the origin of measurements. This requires using,
besides filtering algorithms, smart techniques for
object detection as well as for association between
detected objects and measurements. The
problem of joint target tracking and classification
has also been formulated as a hybrid state
estimation problem and addressed in a number of
papers (see, e.g., Smets and Ristic (2004) and the
references therein).

Econometrics 

State and parameter estimation have been widely
used in econometrics (Aoki 1987) for analyzing
and/or predicting financial time series (e.g.,
stock prices, interest rates, unemployment rates,
volatility, etc.).

Geophysics 

Wiener and Kalman filtering techniques are employed
in reflection seismology for estimating the
unknown earth reflectivity function given noisy
measurements of the seismic wavelet’s echoes
recorded by a geophone. This estimation problem,
known as seismic deconvolution (Mendel
1977, 1983, 1990), has been successfully exploited,
e.g., for oil exploration.

Data Assimilation for Weather Forecasting 
and Oceanography 

Another interesting application of estimation
theory is data assimilation (Ghil and Malanotte-Rizzoli
1991), which consists of incorporating
noisy observations into a computer simulation
model of a real system. Data assimilation is
widely used, especially in weather forecasting
and oceanography. A large-scale state-space
model is typically obtained from the physical
system model, expressed in terms of partial
differential equations (PDEs), by means of
a suitable spatial discretization technique so
that data assimilation is cast into a state
estimation problem. To deal with the huge
dimensionality of the resulting state vector,
appropriate filtering techniques with reduced
computational load have been developed
(Evensen 2007).

Global Navigation Satellite Systems 

Global Navigation Satellite Systems (GNSSs),
such as GPS, put into service in 1993 by the
US Department of Defense, nowadays provide a
commercially widespread technology exploited by
millions of users all over the world for navigation
purposes, wherein the Kalman filter plays a key
role (Bar-Shalom et al. 2001). In fact, not only is
the Kalman filter employed in the core of the
GNSS to estimate the trajectories of all the satellites,
the drifts and rates of all system clocks, and
hundreds of parameters related to atmospheric
propagation delay, but any GNSS receiver also
uses a nonlinear Kalman filter, e.g., EKF, in order
to estimate its own position and velocity along
with the bias and drift of its own clock with
respect to the GNSS time.

Robotic Navigation (SLAM) 

Recursive state estimation is commonly employed
in mobile robotics (Thrun et al. 2006) to
estimate online the robot pose, location
and velocity, and, sometimes, also the location
and features of the surrounding objects in the
environment, exploiting measurements provided
by onboard sensors; the overall joint estimation
problem is referred to as SLAM (simultaneous
localization and mapping) (Dissanayake et al.
2001; Durrant-Whyte and Bailey 2006a,b;
Mullane et al. 2011; Smith et al. 1986; Thrun
et al. 2006).

Automotive Systems 

Several automotive applications of the Kalman
filter, or of its nonlinear variants, are reported
in the literature for the estimation of various
quantities of interest that cannot be directly measured,
e.g., roll angle, sideslip angle, road-tire
forces, heading direction, vehicle mass, state of
charge of the battery (Barbarisi et al. 2006),
etc. In general, one of the major applications of
state estimation is the development of virtual sensors,
i.e., estimation algorithms for physical variables
of interest that cannot be directly measured
for technical and/or economic reasons (Stephant
et al. 2004).


Miscellaneous Applications 

Other areas where estimation has found 
numerous applications include electric power 
systems (Abur and Gomez Esposito 2004; Debs 
and Larson 1970; Miller and Lewis 1971; 
Monticelli 1999; Toyoda et al. 1970), nuclear 
reactors (Robinson 1963; Roman et al. 1971; 
Sage and Masters 1967; Venerus and Bullock 
1970), biomedical engineering (Bekey 1973; 
Snyder 1970; Stark 1968), pattern recognition 
(Andrews 1972; Ho and Agrawala 1968; 
Lainiotis 1972), and many others. 


Connection Between Information and 
Estimation Theories 

In this section, the link between two fundamental
quantities in information theory and estimation
theory, namely the mutual information (MI) and
the minimum mean-square error (MMSE),
respectively, is investigated. In particular, a
strikingly simple but very general relationship 
can be established between the MI of the input 
and the output of an additive Gaussian channel 
and the MMSE in estimating the input given the 
output, regardless of the input distribution (Guo 
et al. 2005). Although this functional relation 
holds for general settings of the Gaussian channel 
(e.g., both discrete-time and continuous-time, 
possibly vector, channels), in order to avoid the 
heavy mathematical preliminaries needed to treat 
rigorously the general problem, two simple scalar 
cases, a static and a (continuous-time) dynamic 
one, will be discussed just to highlight the main 
concept. 


Static Scalar Case 

Consider two scalar real-valued random variables,
$x$ and $y$, related by

$$y = \sqrt{\sigma}\, x + v \qquad (1)$$

where $v$, the measurement noise, is a standard
Gaussian random variable independent of $x$ and
$\sigma$ can be regarded as the gain in the output
signal-to-noise ratio (SNR) due to the channel.
By considering the MI between $x$ and $y$ as a
function of $\sigma$, i.e., $I(\sigma) = I(x, \sqrt{\sigma}\, x + v)$, it
can be shown that the following relation holds
(Guo et al. 2005):

$$\frac{d}{d\sigma} I(\sigma) = \frac{1}{2}\, E\left[ \left( x - \hat{x}(\sigma) \right)^2 \right] \qquad (2)$$

where $\hat{x}(\sigma) = E[x \mid \sqrt{\sigma}\, x + v]$ is the minimum
mean-square error estimate of $x$ given $y$. Figure 1
displays the behavior of both MI, in natural logarithmic
units of information (nats), and MMSE
versus SNR.
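
As a quick sanity check of the relation between MI and MMSE in the static case, the following minimal sketch considers the special case of a standard Gaussian input $x$ (an assumption made here only for illustration; the relation itself does not require it). In that case the MI is $\frac{1}{2}\ln(1+\sigma)$ nats and the MMSE is $1/(1+\sigma)$, so the derivative of the MI can be compared numerically with half the MMSE. The function names are hypothetical.

```python
import math

def mutual_information(sigma):
    """MI in nats between x ~ N(0,1) and y = sqrt(sigma)*x + v, with v ~ N(0,1)."""
    return 0.5 * math.log(1.0 + sigma)

def mmse(sigma):
    """MMSE of estimating a unit-variance Gaussian x from y at SNR gain sigma."""
    return 1.0 / (1.0 + sigma)

def numerical_derivative(f, s, h=1e-6):
    """Central finite-difference approximation of f'(s)."""
    return (f(s + h) - f(s - h)) / (2.0 * h)

for sigma in (0.1, 1.0, 10.0):
    lhs = numerical_derivative(mutual_information, sigma)  # dI/dsigma
    rhs = 0.5 * mmse(sigma)                                # (1/2) * MMSE
    print(f"sigma={sigma}: dI/dsigma={lhs:.6f}, MMSE/2={rhs:.6f}")
```

For non-Gaussian inputs neither quantity has such a simple closed form in general, which is part of what makes the relation useful.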

As mentioned in Guo et al. (2005), the above 
information-estimation relationship (2) has found 
a number of applications, e.g., in nonlinear filtering,
in multiuser detection, in power allocation
over parallel Gaussian channels, in the proof 
of Shannon’s entropy power inequality and its 
generalizations, as well as in the treatment of the 
capacity region of several multiuser channels. 


Linear Dynamic Continuous-Time Case 

While in the static case the MI is assumed to be 
a function of the SNR, in the dynamic case it 
is of great interest to investigate the relationship 
between the MI and the MMSE as a function of 
time. 

Consider the following first-order (scalar) linear
Gaussian continuous-time stochastic dynamical
system:

$$dx_t = a\, x_t\, dt + dw_t$$
$$dy_t = \sqrt{\sigma}\, x_t\, dt + dv_t \qquad (3)$$

where $a$ is a real-valued constant while $w_t$
and $v_t$ are independent standard Brownian
motion processes that represent the process and
measurement noises, respectively. Defining
by $x_0^t = \{x_s, 0 \le s \le t\}$ the collection
of all states up to time $t$ and analogously
$y_0^t = \{y_s, 0 \le s \le t\}$ for the channel outputs
(i.e., measurements) and considering the MI
between $x_0^t$ and $y_0^t$ as a function of time $t$, i.e.,
$I(t) = I(x_0^t, y_0^t)$, it can be shown that (Duncan
1970; Mayer-Wolf and Zakai 1983)

$$\frac{d}{dt} I(t) = \frac{\sigma}{2}\, E\left[ \left( x_t - \hat{x}_t \right)^2 \right] \qquad (4)$$

where $\hat{x}_t = E[x_t \mid y_0^t]$ is the minimum mean-square
error estimate of the state $x_t$ given all the
channel outputs up to time $t$, i.e., $y_0^t$. Figure 2
depicts the time behavior of both MI and MMSE
for several values of $\sigma$ and $a = 1$.
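
The time behavior of MI and MMSE for this scalar system can be reproduced numerically with a short sketch. It assumes a Gaussian initial state with variance `p0` (a hypothetical parameter not given in the text), so that the Kalman-Bucy filter yields the MMSE estimate and the error variance $p(t)$ obeys the scalar Riccati equation $\dot{p} = 2ap + 1 - \sigma p^2$. The MI is then accumulated using Duncan's relation, i.e., at rate $\sigma/2$ times the causal MMSE. Plain Euler integration is used for simplicity.

```python
import math

def riccati_mi(a=1.0, sigma=1.0, p0=1.0, horizon=10.0, dt=1e-3):
    """Euler-integrate the error variance p(t) and the mutual information I(t).

    Assumes x_0 ~ N(0, p0), so that p(t) = E[(x_t - xhat_t)^2] satisfies
    dp/dt = 2*a*p + 1 - sigma*p**2 (scalar Kalman-Bucy Riccati equation).
    """
    p, mi = p0, 0.0
    for _ in range(int(round(horizon / dt))):
        mi += dt * 0.5 * sigma * p                      # dI/dt = (sigma/2) * MMSE(t)
        p += dt * (2.0 * a * p + 1.0 - sigma * p * p)   # Riccati step
    return p, mi

def steady_state_mmse(a, sigma):
    """Positive root of sigma*p^2 - 2*a*p - 1 = 0 (stationary error variance)."""
    return (a + math.sqrt(a * a + sigma)) / sigma

p_end, mi_end = riccati_mi()
print(p_end, steady_state_mmse(1.0, 1.0), mi_end)
```

In this sketch the MMSE settles at the stationary Riccati solution, after which the MI grows linearly in time.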


Conclusions and Future Trends 

Despite the long history of estimation and the
huge amount of work on several theoretical and
practical aspects of estimation, there is still a lot
of research to be done in several
directions. Among the many future trends,
networked estimation and quantum estimation
(briefly overviewed in the subsequent parts of this
section) certainly deserve special attention due to
the growing interest in wireless sensor networks
and quantum computing, respectively.

Networked Information Fusion and 
Estimation 

Information or data fusion is about combining, 
or fusing, information or data from multiple 
sources to provide knowledge that is not 
evident from a single source (Bar-Shalom 
et al. 2013; Farina and Studer 1986). In 
1986, an effort to standardize the terminology
related to data fusion began and the
JDL (Joint Directors of Laboratories) data 
fusion working group was established. The 
result of that effort was the conception of 
a process model for data fusion and a data 
fusion lexicon (Blasch et al. 2012; Hall and 
Llinas 1997). Information and data fusion are 
mainly supported by sensor networks which 
present the following advantages over a single 
sensor:

• Can be deployed over wide regions

• Provide diverse characteristics/viewing angles 
of the observed phenomenon 

• Are more robust to failures 

• Gather more data that, once fused, provide a
more complete picture of the observed phenomenon

• Allow better geographical coverage, i.e.,
wider area and fewer terrain obstructions.

Sensor network architectures can be centralized, 
hierarchical (with or without feedback), and 
distributed (peer-to-peer). Today’s trend for 
many monitoring and decision-making tasks is 
to exploit large-scale networks of low-cost and 
low-energy consumption devices with sensing, 
communication, and processing capabilities. 
For scalability reasons, such networks should
operate in a fully distributed (peer-to-peer) 
fashion, i.e., with no centralized coordination, 
so as to achieve in each node a global 
estimation/decision objective through localized 
processing only. 

The attainment of this goal actually requires
addressing several issues, such as:


• Spatial and temporal sensor alignment 

• Scalable fusion 

• Robustness with respect to data incest (or double
counting), i.e., repeated use of the same
information

• Handling data latency (e.g., out-of-sequence 
measurements/estimates) 

• Communication bandwidth limitations 

In particular, to counteract data incest the 
so-called covariance intersection (Julier and 
Uhlmann 1997) robust fusion approach has 
been proposed to guarantee, at the price of 
some conservatism, consistency of the fused 
estimate when combining estimates from 
different nodes with unknown correlations. For 
scalable fusion, a consensus approach (Olfati-Saber
et al. 2007) can be adopted. This
allows a global (i.e., over the
whole network) processing task to be carried out by iterating
local processing steps among neighboring
nodes.
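
The idea of reaching a network-wide result through purely local iterations can be illustrated with a minimal average-consensus sketch. The ring topology, the step size, and the local values below are all hypothetical choices made for illustration; the update rule is the standard linear consensus iteration in which each node moves toward its neighbors' values, not a specific algorithm from the cited references.

```python
def consensus_average(values, neighbors, step, iterations):
    """Run linear average consensus: x_i <- x_i + step * sum_j (x_j - x_i).

    values    : initial local estimates, one per node
    neighbors : dict mapping node index -> list of neighbor indices (undirected)
    step      : consensus gain (should be below 1 / max node degree)
    """
    x = list(values)
    for _ in range(iterations):
        # Every node uses only its own value and those of its neighbors.
        x = [xi + step * sum(x[j] - xi for j in neighbors[i])
             for i, xi in enumerate(x)]
    return x

# Hypothetical example: 5 nodes on a ring, each holding a local estimate.
n = 5
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
local_estimates = [1.0, 4.0, 2.0, 8.0, 5.0]
fused = consensus_average(local_estimates, ring, step=0.3, iterations=200)
print(fused)  # every entry approaches the network-wide average
```

No node ever sees the whole data set, yet each one ends up with (approximately) the global average, which is the kind of global objective achieved "through localized processing only" described above.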

Several consensus algorithms have been
proposed for distributed parameter (Calafiore
and Abrate 2009) or state (Alriksson and
Rantzer 2006; Kamgarpour and Tomlin 2007;
Olfati-Saber 2007; Stankovic et al. 2009; Xiao
et al. 2005) estimation. Recently, Battistelli and
Chisci (2014) introduced a generalized consensus
on probability densities which opens up the
possibility of performing any Bayesian estimation
task over a sensor network in a fully distributed
and scalable way. As by-products,
this approach made it possible to derive consensus
Kalman filters with guaranteed stability under
minimal requirements of system observability
and network connectivity (Battistelli et al. 2011,
2012; Battistelli and Chisci 2014), consensus
nonlinear filters (Battistelli et al. 2012), and a
consensus CPHD filter for distributed multitarget
tracking (Battistelli et al. 2013). Despite these
interesting preliminary results, networked
estimation is still a very active research area with
many open problems related to energy efficiency,
estimation performance optimality, robustness
with respect to delays and/or data losses, etc.

Quantum Estimation 

Quantum estimation theory is a generalization
of classical estimation theory in terms
of quantum mechanics. As a matter of fact, the
statistical theory can be seen as a particular case
of the more general quantum theory (Helstrom
1969, 1976). Quantum mechanics has practical
applications in several fields of technology
(Personick 1971), such as the use of quantum
random number generators in place of classical
random number generators. Moreover, by manipulating
the energy states of cesium atoms, it is
possible to suppress the quantum noise levels
and consequently improve the accuracy of
atomic clocks. Quantum mechanics can also be
exploited to solve optimization problems, sometimes
yielding optimization algorithms that are faster
than conventional ones. For instance, McGeoch
and Wang (2013) provided an experimental
study of algorithms based on quantum annealing.
Interestingly, their results
showed that this approach can
obtain better solutions than those found
with conventional software solvers. The Kalman
filter has also found its proper form in quantum
mechanics, as the quantum Kalman filter.
In Iida et al. (2010) the quantum Kalman filter is
applied to an optical cavity composed of mirrors
and crystals inside, which interacts with a probe
laser. In particular, a form of a quantum stochastic
differential equation can be written for such a
system so as to design the algorithm that updates
the estimates of the system variables on the basis
of the measurement outcome of the system.

Cross-References 

► Averaging Algorithms and Consensus 

► Bounds on Estimation 

► Consensus of Complex Multi-agent Systems 

► Data Association 

► Estimation and Control over Networks 

► Estimation for Random Sets 

► Extended Kalman Filters 

► Kalman Filters 

► Moving Horizon Estimation 

► Networked Control Systems: Estimation and 
Control over Lossy Networks 

► Nonlinear Filters 

► Observers for Nonlinear Systems 

► Observers in Linear Systems Theory 

► Particle Filters

Abstract

Recent developments in computer and communication
technologies have led to a new type
of large-scale resource-constrained wireless
embedded control systems. It is desirable in
these systems to limit the sensor and control 
computation and/or communication to instances 
when the system needs attention. However, 
classical sampled-data control is based on 
performing sensing and actuation periodically 
rather than when the system needs attention. 
This article discusses event- and self-triggered
control systems where sensing and actuation are
performed when needed. Event-triggered control
is reactive and generates sensor sampling and 
control actuation when, for instance, the plant 
state deviates more than a certain threshold from 
a desired value. Self-triggered control, on the 
other hand, is proactive and computes the next 
sampling or actuation instance ahead of time. The 
basics of these control strategies are introduced 
together with references for further reading. 

Keywords 

Event-triggered control; Hybrid systems; Real-time
control; Resource-constrained embedded
control; Sampled-data systems; Self-triggered
control

Introduction 

In standard control textbooks, e.g., Åström and
Wittenmark (1997) and Franklin et al. (2010), periodic
control is presented as the only choice for
implementing feedback control laws on digital
platforms. Although this time-triggered control
paradigm has proven to be extremely successful
in many digital control applications, recent
developments in computer and communication
technologies have led to a new type of large-scale
resource-constrained (wireless) control systems
that call for a reconsideration of this traditional
paradigm. In particular, the increasing popularity
of (shared) wired and wireless networked control
systems raises the importance of explicitly
addressing energy, computation, and communication
constraints when designing feedback control
loops. Aperiodic control strategies that allow
the inter-execution times of control tasks to
vary over time offer potential advantages with
respect to periodic control when handling these
constraints, but they also introduce many new
interesting theoretical and practical challenges.

Although the discussions regarding periodic 
vs. aperiodic implementation of feedback control 
loops date back to the beginning of computer-controlled
systems, e.g., Gupta (1963), in the
late 1990s two influential papers (Årzén 1999;
Åström and Bernhardsson 1999) highlighted the
advantages of event-based feedback control.
These two papers spurred the development
of the first systematic designs of event-based
implementations of stabilizing feedback control
laws, e.g., Yook et al. (2002), Tabuada (2007),
Heemels et al. (2008), and Henningsson et al.
(2008). Since then, several researchers have
improved and generalized these results and
alternative approaches have appeared. In the
meantime, so-called self-triggered control
(Velasco et al. 2003) also emerged. Event-triggered
and self-triggered control systems consist of 
two elements, namely, a feedback controller 
that computes the control input and a triggering 
mechanism that determines when the control 
input has to be updated again. The difference 
between event-triggered control and self- 
triggered control is that the former is reactive, 
while the latter is proactive. Indeed, in event- 
triggered control, a triggering condition based on 
current measurements is continuously monitored 
and when the condition holds, an event is 
triggered. In self-triggered control the next 
update time is precomputed at a control update 
time based on predictions using previously 
received data and knowledge of the plant 
dynamics. In some cases, it is advantageous 
to combine event-triggered and self-triggered 
control resulting in a control system reactive 
to unpredictable disturbances and proactive by 
predicting future use of resources. 

Time-Triggered, Event-Triggered and 
Self-Triggered Control 

To indicate the differences between various digital
implementations of feedback control laws,
consider the control of the nonlinear plant

$$\dot{x} = f(x, u) \qquad (1)$$

with $x \in \mathbb{R}^{n_x}$ the state variable and $u \in \mathbb{R}^{n_u}$
the input variable. The system is controlled by a
nonlinear state feedback law

$$u = h(x) \qquad (2)$$

where $h : \mathbb{R}^{n_x} \to \mathbb{R}^{n_u}$ is an appropriate mapping
that has to be implemented on a digital platform.
Recomputing the control value and updating the
actuator signals will occur at times denoted by
$t_0, t_1, t_2, \ldots$ with $t_0 = 0$. If we assume the inputs
to be held constant in between the successive recomputations
of the control law (referred to as
sample-and-hold or zero-order-hold), we have

$$u(t) = u(t_k) = h(x(t_k)) \quad \forall t \in [t_k, t_{k+1}),\ k \in \mathbb{N}. \qquad (3)$$

We refer to the instants $\{t_k\}_{k \in \mathbb{N}}$ as the triggering
times or execution times. Based on these times
we can easily explain the difference between
time-triggered control, event-triggered control,
and self-triggered control.

In time-triggered control we have the equality
$t_k = k T_s$ with $T_s > 0$ being the sampling period.
Hence, the updates take place equidistantly in
time irrespective of how the system behaves.
There is no “feedback mechanism” in determining
the execution times; they are determined a
priori and in “open loop.” Another way of writing
the triggering mechanism in time-triggered control
is

$$t_{k+1} = t_k + T_s, \quad k \in \mathbb{N} \qquad (4)$$

with $t_0 = 0$.

In event-triggered control the next execution
time of the controller is determined by an event-triggering
mechanism that continuously verifies
if a certain condition based on the actual state
variable becomes true. This condition often also
includes information on the state variable $x(t_k)$
at the previous execution time $t_k$ and can be
written, for instance, as $C(x(t), x(t_k)) \ge 0$. Formally,
the execution times are then determined by

$$t_{k+1} = \inf\{t > t_k \mid C(x(t), x(t_k)) \ge 0\} \qquad (5)$$

with $t_0 = 0$. Hence, it is clear from (5) that
there is a feedback mechanism present in the
determination of the next execution time as it is
based on the measured state variable. In this sense
event-triggered control is reactive.

Finally, in self-triggered control the next execution
time is determined proactively based on
the measured state $x(t_k)$ at the previous execution
time. In particular, there is a function $M : \mathbb{R}^{n_x} \to \mathbb{R}_{>0}$
that specifies the next execution time as

$$t_{k+1} = t_k + M(x(t_k)) \qquad (6)$$

with $t_0 = 0$. As a consequence, in self-triggered
control both the control value $u(t_k)$ and the next
execution time $t_{k+1}$ are computed at execution
time $t_k$. In between $t_k$ and $t_{k+1}$, no further actions
are required from the controller. Note that the
time-triggered implementation can be seen as a
special case of the self-triggered implementation
by taking $M(x) = T_s$ for all $x \in \mathbb{R}^{n_x}$.

Clearly, in all the three implementation
schemes $T_s$, $C$ and $M$ are chosen together with
the feedback law given through $h$ to provide
stability and performance guarantees and to
realize a certain utilization of computer and
communication resources.
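
The self-triggered update rule, and the fact that time-triggered control is the special case of a constant $M$, can be sketched in a few lines. Everything specific below is a hypothetical illustration rather than part of the article: the scalar plant $\dot{x} = x + u$, the feedback $h(x) = -2x$, and the state-dependent rule `state_dependent_M`, which simply samples faster when the state is large. Between execution times the input is held constant and the plant is propagated exactly.

```python
import math

def zoh_step(x, u, tau, a=1.0, b=1.0):
    """Exact state after holding input u for tau seconds on dx/dt = a*x + b*u."""
    g = math.exp(a * tau)
    return g * x + (g - 1.0) / a * b * u

def run_self_triggered(x0, M, horizon, k=2.0):
    """Simulate t_{k+1} = t_k + M(x(t_k)) with u(t_k) = -k * x(t_k)."""
    t, x = 0.0, x0
    times, states = [t], [x]
    while t < horizon:
        u = -k * x          # control value computed at execution time t_k
        tau = M(x)          # next inter-execution time, also computed at t_k
        x = zoh_step(x, u, tau)
        t += tau
        times.append(t)
        states.append(x)
    return times, states

def state_dependent_M(x):
    """Hypothetical rule: shorter intervals for large |x|, longer near the origin."""
    return 0.05 + 0.2 / (1.0 + abs(x))

# Time-triggered control is recovered with a constant M(x) = T_s.
tt_times, tt_states = run_self_triggered(1.0, lambda x: 0.1, horizon=1.0)
st_times, st_states = run_self_triggered(1.0, state_dependent_M, horizon=1.0)
print(len(tt_times), len(st_times))
```

Note that the rule `state_dependent_M` comes with no stability guarantee in general; it happens to keep this particular loop stable because all its intervals are short.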

Lyapunov-Based Analysis 

Much work on event-triggered control used one 
of the following two modeling and analysis 
frameworks: the perturbation approach and the
hybrid system approach. 

Perturbation Approach 

In the perturbation approach one adopts
perturbed models that describe how the event-triggered
implementation of the control law perturbs
the ideal continuous-time implementation
$u(t) = h(x(t))$, $t \in \mathbb{R}_{\ge 0}$. In order to do so,
consider the error $e$ given by

$$e(t) = x(t_k) - x(t) \quad \text{for } t \in [t_k, t_{k+1}),\ k \in \mathbb{N}. \qquad (7)$$

Using this error variable we can write the closed-loop
system based on (1) and (3) as

$$\dot{x} = f(x, h(x + e)). \qquad (8)$$

Essentially, the three implementations discussed
above have their own way of indicating when
an execution takes place and the error $e$ is reset
to zero. Equation (8) clearly shows how the
ideal closed-loop system is perturbed by using a
time-triggered, event-triggered, or self-triggered
implementation of the feedback law in (2). Indeed,
when $e = 0$ we obtain the ideal closed loop

$$\dot{x} = f(x, h(x)). \qquad (9)$$

The control law in (2) is typically chosen so
as to guarantee that the system in (9) has certain
global asymptotic stability (GAS) properties. In
particular, it is often assumed that there exists a
Lyapunov function $V : \mathbb{R}^{n_x} \to \mathbb{R}_{\ge 0}$ in the sense
that $V$ is positive definite and for all $x \in \mathbb{R}^{n_x}$ we
have

$$\frac{\partial V}{\partial x} f(x, h(x)) \le -\|x\|^2. \qquad (10)$$

Note that this inequality is stronger than strictly
needed (at least for nonlinear systems), but for
pedagogical reasons we choose this simpler formulation.
For the perturbed model, the inequality
in (10) can in certain cases (including linear
systems) be modified to

$$\frac{\partial V}{\partial x} f(x, h(x + e)) \le -\|x\|^2 + \beta \|e\|^2 \qquad (11)$$

in which $\beta > 0$ is a constant used to indicate
how the presence of the implementation error $e$
affects the decrease of the Lyapunov function.
Based on (11) one can now choose the function
$C$ in (5) to preserve GAS of the event-triggered
implementation. For instance, $C(x(t), x(t_k)) = \|x(t_k) - x(t)\| - \sigma \|x(t)\|$, i.e.,

$$t_{k+1} = \inf\{t > t_k \mid \|e(t)\| \ge \sigma \|x(t)\|\}, \qquad (12)$$

assures that

$$\|e\| \le \sigma \|x\| \qquad (13)$$

holds. When $\sigma < 1/\sqrt{\beta}$, we obtain from (11) and
(13) that $\frac{\partial V}{\partial x} f(x, h(x + e)) \le -(1 - \beta \sigma^2) \|x\|^2$,
so that GAS properties are preserved for the
event-triggered implementation. Besides, under
certain conditions provided in Tabuada (2007), a
global positive lower bound exists on the inter-execution
times, i.e., there exists a $\tau_{\min} > 0$ such
that $t_{k+1} - t_k \ge \tau_{\min}$ for all $k \in \mathbb{N}$ and all initial
states $x_0$.

Also self-triggered controllers can be derived
using the perturbation approach. In this case, stability
properties can be guaranteed by choosing
$M$ in (6) ensuring that $C(x(t), x(t_k)) \le 0$ holds
for all times $t \in [t_k, t_{k+1})$ and all $k \in \mathbb{N}$.
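
To make the relative triggering rule concrete, here is a minimal simulation sketch. It assumes a hypothetical scalar plant $\dot{x} = x + u$ with feedback $h(x) = -2x$ and threshold $\sigma = 0.2$; none of these values come from the article, and they were chosen only so that the sampled loop remains stable. The plant is integrated with a simple Euler scheme, and an event is generated whenever $|e| \ge \sigma |x|$, at which point the held state is refreshed and the error is reset to zero.

```python
def simulate_event_triggered(x0=1.0, sigma=0.2, a=1.0, b=1.0, k=2.0,
                             dt=1e-3, horizon=5.0):
    """Event-triggered control of dx/dt = a*x + b*u with u = -k * x(t_k).

    An event occurs when |e| >= sigma * |x|, where e = x(t_k) - x(t).
    Returns the final state and the list of execution times.
    """
    x = x0
    x_held = x0                  # x(t_k): last state sent to the controller
    event_steps = [0]            # t_0 = 0 is the first execution time
    n_steps = int(round(horizon / dt))
    for n in range(1, n_steps + 1):
        u = -k * x_held                      # zero-order hold on the input
        x += dt * (a * x + b * u)            # Euler step of the plant
        e = x_held - x                       # implementation error
        if abs(e) >= sigma * abs(x):         # relative triggering condition
            x_held = x                       # execute: error is reset to zero
            event_steps.append(n)
    return x, [n * dt for n in event_steps]

x_final, event_times = simulate_event_triggered()
gaps = [t2 - t1 for t1, t2 in zip(event_times, event_times[1:])]
print(x_final, len(event_times), min(gaps))
```

In this sketch the state converges toward the origin while the controller is executed only a few dozen times over several thousand integration steps, and the inter-execution times stay bounded away from zero.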


Hybrid System Approach 

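By taking as a state variable $\xi = (x, e)$, one
can write the closed-loop event-triggered control
system given by (1), (3), and (5) as the hybrid
impulsive system (Goebel et al. 2009)

$$\dot{\xi} = \begin{pmatrix} f(x, h(x + e)) \\ -f(x, h(x + e)) \end{pmatrix} \quad \text{when } C(x, x + e) \le 0 \qquad (14a)$$

$$\xi^{+} = \begin{pmatrix} x \\ 0 \end{pmatrix} \quad \text{when } C(x, x + e) \ge 0. \qquad (14b)$$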


This observation was made in Donkers and 
Heemels (2010, 2012) and Postoyan et al. (2011).
Tools from hybrid system theory can be used to 
analyze this model, which is more accurate as it 
includes the error dynamics of the event-triggered 
closed-loop system. In fact, the stability bounds 
obtained via the hybrid system approach can be 
proven to be never worse than ones obtained 
using the perturbation approach in many cases, 
see, e.g., Donkers and Heemels (2012), and 
typically the hybrid system approach provides 
(strictly) better results in practice. However, 
in general an analysis via the hybrid system 
approach is more complicated than using a 
perturbation approach. 

Note that by including a time variable $\tau$,
one can also write the closed-loop system corresponding
to self-triggered control (1), (3), and (6)
as a hybrid system using the state variable $\chi = (x, e, \tau)$.
This leads to the model

$$\dot{\chi} = \begin{pmatrix} f(x, h(x + e)) \\ -f(x, h(x + e)) \\ 1 \end{pmatrix} \quad \text{when } 0 \le \tau \le M(x + e) \qquad (15a)$$

$$\chi^{+} = \begin{pmatrix} x \\ 0 \\ 0 \end{pmatrix} \quad \text{when } \tau = M(x + e), \qquad (15b)$$

which can be used for analysis based on hybrid
tools as well.


Alternative Event-Triggering 
Mechanisms 

There are various alternative event-triggering 
mechanisms. A few of them are described in this 
section. 

Relative, Absolute, and Mixed Triggering 
Conditions 

Above we discussed a very basic event-triggering
condition in the form given in (12), which is
sometimes called relative triggering as the next
control task is executed at the instant when the
ratio of the norms of the error $\|e\|$ and the
measured state $\|x\|$ is larger than or equal to $\sigma$.
Also absolute triggering of the form

\[ t_{k+1} = \inf\{ t > t_k \mid \|e(t)\| \geq \delta \} \tag{16} \]

can be considered. Here $\delta > 0$ is an absolute
threshold, which has given this scheme the
name send-on-delta (Miskowicz 2006). Recently,
a mixed triggering mechanism of the form

\[ t_{k+1} = \inf\{ t > t_k \mid \|e(t)\| \geq \sigma \|x(t)\| + \delta \}, \tag{17} \]

combining an absolute and a relative threshold,
was proposed (Donkers and Heemels 2012). It
is particularly effective in the context of output-based
control.
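
The three conditions differ only in the threshold against which the
error norm is compared. The following is a minimal sketch in Python,
assuming the Euclidean norm; the function names are illustrative and
do not come from the cited papers.

```python
import math

def norm(v):
    """Euclidean norm of a sequence of floats (assumed norm choice)."""
    return math.sqrt(sum(c * c for c in v))

def relative_trigger(e, x, sigma):
    """Relative rule: trigger when ||e|| >= sigma * ||x||."""
    return norm(e) >= sigma * norm(x)

def absolute_trigger(e, delta):
    """Absolute (send-on-delta) rule (16): trigger when ||e|| >= delta."""
    return norm(e) >= delta

def mixed_trigger(e, x, sigma, delta):
    """Mixed rule (17): trigger when ||e|| >= sigma * ||x|| + delta."""
    return norm(e) >= sigma * norm(x) + delta

# Hypothetical data close to the origin: a small error and a small state.
e, x = [0.001, 0.0], [0.002, 0.0]
print(relative_trigger(e, x, sigma=0.1))
print(absolute_trigger(e, delta=0.01))
print(mixed_trigger(e, x, sigma=0.1, delta=0.01))
```

With these hypothetical numbers the relative test fires although the
error is tiny, whereas the absolute and mixed tests do not, which
illustrates why an absolute offset can help near the origin.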

Model-Based Triggering 

In the triggering conditions discussed so far, essentially
the current control value $u(t)$ is based
on a held value $x(t_k)$ of the state variable, as
specified in (3). However, if good model-based
information regarding the plant is available, one
can use better model-based predictions of the
actuator signal. For instance, in the linear context,
Lunze and Lehmann (2010) proposed to use a
control input generator instead of a plain zero-order
hold function. In fact, the plant model was
described by

\[ \dot{x} = Ax + Bu + Ew \tag{18} \]

with $x \in \mathbb{R}^{n_x}$ the state variable, $u \in \mathbb{R}^{n_u}$ the input
variable, and $w \in \mathbb{R}^{n_w}$ a bounded disturbance
input. It was assumed that a well-functioning state
feedback controller $u = Kx$ was available. The
control input generator was then based on the
model-based predictions given for $[t_k, t_{k+1})$ by

\[ \dot{x}_s(t) = (A + BK)\,x_s(t) + E\hat{w}(t_k) \quad \text{with } x_s(t_k) = x(t_k), \tag{19} \]

where $\hat{w}(t_k)$ is an estimate for the (average) disturbance
value, which is determined at execution
time $t_k$, $k \in \mathbb{N}$. The applied input to the actuator
is then given by $u(t) = Kx_s(t)$ for $t \in [t_k, t_{k+1})$,
$k \in \mathbb{N}$. Note that (19) provides a prediction of
the closed-loop state evolution using the latest
received value of the state $x(t_k)$ and the estimate
$\hat{w}(t_k)$ of the disturbances. Also the event-triggering
condition is based on this model-based
prediction of the state as it is given by

\[ t_{k+1} = \inf\{ t > t_k \mid \|x_s(t) - x(t)\| \geq \delta \}. \tag{20} \]

Hence, when the prediction $x_s(t)$ deviates too far
from the measured state $x(t)$, the next event is
triggered so that updates of the state are sent
to the actuator. These model-based triggering
schemes can enhance the communication savings
as they reduce the number of events by using
model-based knowledge.
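
The effect of the control input generator can be illustrated with a
small simulation. The sketch below is not the implementation of Lunze
and Lehmann (2010); it assumes a scalar plant, a constant disturbance,
a zero disturbance estimate, forward-Euler integration, and
hypothetical parameter values, and it compares the predictor (19)
with trigger (20) against a zero-order hold with the absolute trigger.

```python
def simulate(model_based, a=1.0, b=1.0, K=-3.0, w=0.05, w_hat=0.0,
             delta=0.05, x0=1.0, dt=1e-3, t_end=5.0):
    """Forward-Euler simulation of the scalar plant x' = a*x + b*u + w.

    model_based=True : the actuator runs the predictor (19), u = K*x_s.
    model_based=False: zero-order hold, u = K*x(t_k).
    Returns (number of events, final state).
    """
    x = x0
    x_s = x0          # predictor state, or held sample in the ZOH case
    events = 0
    for _ in range(int(round(t_end / dt))):
        u = K * x_s
        x += dt * (a * x + b * u + w)
        if model_based:
            # Closed-loop model prediction with disturbance estimate w_hat.
            x_s += dt * ((a + b * K) * x_s + w_hat)
        # Absolute-threshold test, cf. (20): compare the predicted or held
        # value with the measured state and resynchronise when exceeded.
        if abs(x_s - x) >= delta:
            x_s = x
            events += 1
    return events, x

events_model, _ = simulate(model_based=True)
events_zoh, _ = simulate(model_based=False)
print(events_model, events_zoh)
```

For these illustrative values the predictor-based scheme needs fewer
events than the zero-order hold, because its prediction error is driven
only by the mismatch between the disturbance and its estimate.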

Other model-based event-triggered control 
schemes are proposed, for instance, in Yook 
et al. (2002), Garcia and Antsaklis (2013), and 
Heemels and Donkers (2013). 

Triggering with Time-Regularization
Time-regularization was proposed for output-based
triggering to avoid the occurrence of
accumulations in the execution times (Zeno
behavior) that would obstruct the existence of
a positive lower bound on the inter-execution
times $t_{k+1} - t_k$, $k \in \mathbb{N}$. In Tallapragada and
Chopra (2012a,b), the triggering update

\[ t_{k+1} = \inf\{ t > t_k + T \mid \|e(t)\| \geq \sigma \|x(t)\| \} \tag{21} \]

was proposed, where $T > 0$ is a built-in lower
bound on the minimal inter-execution times. The
authors discussed how $T$ and $\sigma$ can be designed
to guarantee closed-loop stability. In Heemels
et al. (2008) a similar triggering was proposed
using an absolute type of triggering.

An alternative to exploiting a built-in lower
bound $T$ is combining ideas from time-triggered
control and event-triggered control. Essentially,
the idea is to only verify a specific event-triggering
condition at certain equidistant time
instants $kT_s$, $k \in \mathbb{N}$, where $T_s > 0$ is the
sampling period. Such proposals were mentioned
in, for instance, Årzén (1999), Yook et al. (2002),
Henningsson et al. (2008), and Heemels et al.
(2008, 2013). In this case the execution times are
given by

\[ t_{k+1} = \inf\{ t > t_k \mid t = kT_s,\ k \in \mathbb{N},\ \text{and } \|e(t)\| \geq \sigma \|x(t)\| \} \tag{22} \]

in case a relative triggering is used. In Heemels
et al. (2013) the term periodic event-triggered
control was coined for this type of control.
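
A minimal sketch of periodic event-triggered control for a scalar
plant is given below. The plant, gain, threshold, sampling period, and
the forward-Euler integration are illustrative assumptions; the point
is only that the relative test is evaluated at the instants $kT_s$ and
nowhere else.

```python
def periodic_etc(a=1.0, b=1.0, K=-3.0, sigma=0.2, T_s=0.05,
                 x0=1.0, dt=1e-3, t_end=5.0):
    """Scalar plant x' = a*x + b*u with u = K*x(t_k); the relative test
    |e| >= sigma*|x| is evaluated only at the sampling instants k*T_s.
    Returns (events, number of checks, final state)."""
    steps_per_sample = int(round(T_s / dt))
    x, x_held = x0, x0
    events, checks = 0, 0
    for n in range(1, int(round(t_end / dt)) + 1):
        x += dt * (a * x + b * K * x_held)       # forward-Euler step
        if n % steps_per_sample == 0:            # t = k*T_s
            checks += 1
            if abs(x_held - x) >= sigma * abs(x):
                x_held = x                       # transmit, update the input
                events += 1
    return events, checks, x

events, checks, x_final = periodic_etc()
print(events, checks, x_final)
```

By construction the inter-execution times are multiples of $T_s$, so a
positive lower bound on them holds without a separate regularization
constant.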

Decentralized Triggering Conditions 
Another important extension of the mentioned
event-triggered controllers, especially in large-scale
networked systems, is the decentralization
of the event-triggered control. Indeed, if one
focuses on any of the abovementioned event-triggering
conditions (take, e.g., (5)), it is obvious
that the full state variable $x(t)$ has to be
continuously available in a central coordinator
to determine if an event is triggered or not. If
the sensors that measure the state are physically
distributed over a wide area, this assumption is
prohibitive for its implementation. In such cases,
it is of high practical importance that the event-triggering
mechanism can be decentralized and
that control tasks can be executed
based on local information. One first idea could
be to use local event-triggering mechanisms for
the $i$-th sensor that measures $x_i$. One could “decentralize”
the condition (5) into

\[ t^i_{k_i+1} = \inf\{ t > t^i_{k_i} \mid \|e_i(t)\| \geq \sigma \|x_i(t)\| \}, \tag{23} \]

in which $e_i(t) = x_i(t^i_{k_i}) - x_i(t)$ for $t \in
[t^i_{k_i}, t^i_{k_i+1})$, $k_i \in \mathbb{N}$. Note that each sensor
now has its own execution times $t^i_{k_i}$, $k_i \in \mathbb{N}$, at
which the information $x_i(t)$ is transmitted. More
importantly, the triggering condition (23) is based
on local data only and does not need a central
coordinator having access to the complete state
information. Moreover, since (23) still guarantees
that (13) holds, stability properties can still be
guaranteed; see Mazo and Tabuada (2011).
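
The local character of (23) can be made concrete with a short sketch.
The data layout (one list of floats per sensor) and the function names
are assumptions made for illustration only.

```python
import math

def norm(v):
    """Euclidean norm of a sequence of floats."""
    return math.sqrt(sum(c * c for c in v))

def local_events(x_parts, held_parts, sigma):
    """Evaluate condition (23) independently for every sensor.

    x_parts[i]    : current local measurement x_i(t) (list of floats)
    held_parts[i] : value last transmitted by sensor i
    Returns the indices of the sensors that transmit; held_parts is
    updated in place for those sensors only.
    """
    fired = []
    for i, (x_i, h_i) in enumerate(zip(x_parts, held_parts)):
        e_i = [h - x for h, x in zip(h_i, x_i)]      # local error e_i(t)
        if norm(e_i) >= sigma * norm(x_i):
            held_parts[i] = list(x_i)                # sensor i transmits
            fired.append(i)
    return fired

held = [[1.0], [2.0, 0.0]]
current = [[0.7], [1.95, 0.0]]
print(local_events(current, held, sigma=0.1), held)
```

Each test uses only the data of one sensor, so no sensor needs to know
the measurements or the transmission times of the others.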

Several other proposals for decentralized
event-triggered control schemes were made,
e.g., in Persis et al. (2013), Wang and Lemmon
(2011), Garcia and Antsaklis (2013), Yook et al.
(2002), and Donkers and Heemels (2012).

Triggering for Multi-agent Systems 
Event-triggered control strategies are suitable 
for cooperative control of multi-agent systems. 
In multi-agent systems, local control actions of 
individual agents should lead to a desirable global 
behavior of the overall system. A prototype 
problem for control of multi-agent systems is the 
agreement problem (also called the consensus 
or rendezvous problem), where the states of 
all agents should converge to a common value 
(sometimes the average of the agents’ initial 
conditions). The agreement problem has been 
shown to be solvable for certain low-order 
dynamical agents in both continuous and discrete 
time, e.g., Olfati-Saber et al. (2007). It was 
recently shown in Dimarogonas et al. (2012), Shi 
and Johansson (2011), and Seyboth et al. (2013) 
that the agreement problem can be solved using 
event-triggered control. In Seyboth et al. (2013) 
the triggering times for agent $i$ are determined
by

\[ t^i_{k_i+1} = \inf\{ t > t^i_{k_i} \mid C_i(x_i(t), x_i(t^i_{k_i})) > 0 \}, \tag{24} \]

which should be compared to the triggering
times as specified through (5). The triggering
condition compares the current state value with
the one previously communicated, similarly to
the previously discussed decentralized event-triggered
control (see (23)), but now the
communication is only to the agent’s neighbors.
Using such event-triggered communication,
the convergence rate to agreement (i.e.,
$\|x_i(t) - x_j(t)\| \to 0$ as $t \to \infty$ for all
$i, j$) can be maintained with a much lower
communication rate than for time-triggered
communication.
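
A small simulation can illustrate event-triggered communication in
the agreement problem. The sketch below is not the scheme of Seyboth
et al. (2013): it assumes single-integrator agents, an undirected path
graph, forward-Euler integration, and a constant broadcast threshold
`c`, which yields only approximate agreement (the states end up close
to, not exactly at, a common value).

```python
def event_triggered_consensus(x0, neighbors, c=0.01, dt=0.01, t_end=20.0):
    """Agents x_i' = -sum_j (xb_i - xb_j), where xb_i is the value agent i
    last broadcast to its neighbours. Agent i broadcasts again when
    |xb_i - x_i| >= c (constant threshold, an illustrative assumption)."""
    x = list(x0)
    xb = list(x0)                     # last broadcast values
    broadcasts = 0
    for _ in range(int(round(t_end / dt))):
        u = [-sum(xb[i] - xb[j] for j in neighbors[i])
             for i in range(len(x))]
        x = [xi + dt * ui for xi, ui in zip(x, u)]
        for i in range(len(x)):
            if abs(xb[i] - x[i]) >= c:   # test uses agent i's own data only
                xb[i] = x[i]
                broadcasts += 1
    return x, broadcasts

# Undirected path graph 0 - 1 - 2.
x_final, n_broadcasts = event_triggered_consensus(
    [0.0, 1.0, 2.0], {0: [1], 1: [0, 2], 2: [1]})
print(x_final, n_broadcasts)
```

Because the graph is undirected, the average of the states is
preserved, and the number of broadcasts stays below the number of
messages that transmitting at every integration step would require.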


Outlook 

Many simulation and experimental results show 
that event-triggered and self-triggered control 
strategies are capable of reducing the number 
of control task executions, while retaining a 
satisfactory closed-loop performance. In spite 
of these results, the actual deployment of these 
novel control paradigms in relevant applications 
is still rather marginal. Some exceptions include 
recent event-triggered control applications in 
underwater vehicles (Teixeira et al. 2010), 
process control (Lehmann et al. 2012), and 
control over wireless networks (Araujo et al. 
2014). To foster the further development of 
event-triggered and self-triggered controllers in 
the future, it is therefore important to validate 
these strategies in practice, next to building up 
a complete system theory for them. Regarding 
the latter, it is fair to say that, even though 
many interesting results are currently available, 
the system theory for event-triggered and self- 
triggered control is far from being mature, 
certainly compared to the vast literature on time- 
triggered (periodic) sampled-data control. As 
such, many theoretical and practical challenges 
are ahead of us in this appealing research field. 

Abstract

Evolutionary games constitute the most recent 
major mathematical tool for understanding, 
modelling and predicting evolution in biology 
and other fields. They complement other well 
established tools such as branching processes 
and the Lotka-Volterra (1910) equations (e.g. 
for the predator-prey dynamics or for epidemics
evolution). Evolutionary games also bring novel
features to game theory. First, they focus on the
dynamics of competition rather than restricting
attention to the equilibrium. In particular, they
try to explain how an equilibrium emerges.
Second, they bring new definitions of stability
that are more adapted to the context of large 
populations. Finally, in contrast to standard 
game theory, players are not assumed to be
“rational” or “knowledgeable” enough to anticipate
the other players’ choices. The objective of this
article is to present foundations as well as recent
advances in evolutionary games, highlight the
novel concepts that they introduce with respect
to game theory as formulated by John Nash, and
describe through several examples their huge 
potential as tools for modeling interactions in 
complex systems. 

Keywords 

Evolutionary stable strategies; Fitness; Replicator 
dynamics 

Introduction 

Evolutionary game theory is the youngest of 
several mathematical tools used in describing and 
modeling evolution. It was preceded by the theory
of branching processes (Watson and Francis
Galton 1875) and its extensions (Altman 2008),
which were introduced in order to explain
the evolution of family names in the English
population of the second half of the nineteenth
century. This theory makes use of the probabilistic
distribution of the number of offspring of an
individual in order to predict the probability with
which the whole population would eventually
become extinct. It describes the evolution of the
number of offspring of a given individual. The
Lotka-Volterra equations (Lotka-Volterra 1910) 
and their extensions are differential equations that 
describe the population size of each of several 
species that have a predator-prey type relation. 
One of the foundations in evolutionary games 
(and its extension to population games) which is 
often used as the starting point in their definition 
is the replicator dynamics which, similarly to the 
Lotka-Volterra equations, describe the evolution 
of the size of various species that interact with 
each other (or of various behaviors within a given 
population). In both the Lotka-Volterra equations 
and in replicator dynamics, the evolution of the 
size of one type of population may depend on 
the sizes of all other populations. Yet, unlike
in the Lotka-Volterra equations, the object of the
modeling is the normalized sizes of populations
rather than the sizes themselves. By normalized size
of some type, we mean the fraction of that type
within the whole population. A basic feature of
evolutionary games is, thus, that the evolution
of the fraction of a given type in the population
depends on the sizes of other types only through
their normalized sizes rather than through their
actual ones.

The relative rate of the decrease or increase 
of the normalized population size of some type 
in the replicator dynamics is what we call fitness 
and is to be understood in the Darwinian sense. 
If some type or some behavior increases more
than another one, then it has a larger fitness.
The evolution of the fitness as described by the
replicator dynamics is a central object of study in 
evolutionary games. 

So far we did not actually consider any 
game and just discussed ways of modeling 
evolution. The relation to game theory is due 
to the fact that under some conditions, the fitness 
converges to some fixed limit, which can be 
identified as an equilibrium of a matrix game 
in which the utilities of the players are the 
fitnesses. This limit is then called an ESS
(evolutionary stable strategy), as defined by
Maynard Smith and Price (1973). It can be
computed using
elementary tools in matrix games and then used 
for predicting the (long term) distribution of 
behaviors within a population. Note that an 
equilibrium in a matrix game can be obtained 
only when the players of the matrix game are 
rational (each one maximizing its expected 
utility, being aware of the utilities of other players 
and of the fact that these players maximize 
their utilities, etc.). A central contribution of 
evolutionary games is thus to show that evolution 
of possibly nonrational populations converges 
under some conditions to the equilibrium of a 
game played by rational players. This surprising 
relationship between the equilibrium of a 
noncooperative matrix game and the limit points 
of the fitness dynamics has been supported by a 
rich body of experimental results; see Friedman 
(1996). 

On the importance of the ESS for understanding
the evolution of species, Dawkins writes in
his book “The Selfish Gene” (Dawkins 1976):
“we may come to look back on the invention of
the ESS concept as one of the most important
advances in evolutionary theory since Darwin.”
He further specifies: “Maynard Smith’s concept 
of the ESS will enable us, for the first time, to see 
clearly how a collection of independent selfish 
entities can come to resemble a single organized 
whole.” 

Here we shall follow a nontraditional approach
to describing evolutionary games: we shall
first introduce the replicator dynamics and then
introduce the game theoretic interpretation related
to it.


Replicator Dynamics 

In the biological context, the replicator dynamics 
is a differential equation that describes the way 
in which the usage of strategies changes in time. 
It is based on the idea that the average
growth rate per individual that uses a given strategy
is proportional to the excess of fitness of that
strategy with respect to the average fitness. 

In engineering, the replicator dynamics could 
be viewed as a rule for updating mixed strategies 
by individuals. It is a decentralized rule since 
it only requires knowing the average utility of 
the population rather than the strategy of each 
individual. 

Replicator dynamics is one of the most studied 
dynamics in evolutionary game theory. It was
introduced by Taylor and Jonker (1978).
The replicator dynamics has been used for describing
the evolution of road traffic congestion in
which the fitness is determined by the strategies 
chosen by all drivers (Sandholm 2009). It has 
also been studied in the context of the association 
problem in wireless communications (Shakkottai 
et al. 2007). 

Consider a set of $N$ strategies and let $p_j(t)$
be the fraction of the whole population that uses
strategy $j$ at time $t$. Let $p(t)$ be the corresponding
$N$-dimensional vector. A function $f_j$ is associated
with the growth rate of strategy $j$, and it is
assumed to depend on the fraction of each of the
$N$ strategies in the population. There are various
forms of replicator dynamics (Sandholm 2009)
and we describe here the one most commonly
used. It is given by

\[ \dot{p}_j(t) = \mu\, p_j(t) \left[ f_j(p(t)) - \sum_{k=1}^{N} p_k(t)\, f_k(p(t)) \right], \tag{1} \]

where $\mu$ is some positive constant and the payoff
function $f_k$ is called the fitness of strategy $k$.

In evolutionary games, evolution is assumed to
be due to pairwise interactions between players,
as will be described in the next section. Therefore,
$f_k$ has the form $f_k(p) = \sum_{i=1}^{N} J(k, i)\, p(i)$,
where $J(k, i)$ is the fitness of an individual playing
$k$ if it interacts with an individual that plays
strategy $i$.

Within quite general settings (Weibull 1995),
the above replicator dynamics is known to converge
to an ESS (which we introduce in the next
section).
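
The replicator dynamics with pairwise-interaction fitness can be
integrated numerically in a few lines. The sketch below uses a
forward-Euler discretization, $\mu = 1$, and an illustrative $2 \times 2$
fitness matrix `J`; the step size, horizon, and matrix entries are
assumptions chosen for the example.

```python
def replicator_step(p, J, mu, dt):
    """One forward-Euler step of the replicator dynamics (1) with
    fitness f_k(p) = sum_i J[k][i] * p[i]."""
    n = len(p)
    f = [sum(J[k][i] * p[i] for i in range(n)) for k in range(n)]
    avg = sum(pk * fk for pk, fk in zip(p, f))   # population-average fitness
    return [pj + dt * mu * pj * (fj - avg) for pj, fj in zip(p, f)]

def simulate(p0, J, mu=1.0, dt=0.01, steps=10000):
    """Iterate replicator_step from the initial distribution p0."""
    p = list(p0)
    for _ in range(steps):
        p = replicator_step(p, J, mu, dt)
    return p

# Illustrative fitness matrix: each strategy does better when it is rare.
J = [[-0.5, 1.0],
     [0.0, 0.5]]
print(simulate([0.1, 0.9], J))
```

For this matrix both strategies have equal fitness when each is used
by half of the population, and the simulated fractions approach that
point; the entries of `p` keep summing to one up to rounding.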


Evolutionary Games: Fitnesses 

Consider an infinite population of players. Each
individual $i$ plays at times $t^i_n$, $n = 1, 2, 3, \ldots$
(assumed to constitute an independent Poisson
process with some rate $\lambda$) a matrix game against
some player $j(n)$ randomly selected within the
population. The choices $j(n)$ of the other players
at different times are independent. All players have
the same finite space of pure strategies (also
called actions) $K$. Each time it plays, a player
may use a mixed strategy $p$, i.e., a probability
measure over the set of pure strategies. We consider
$J(k, i)$ (defined in the previous section) to
be the payoff for a tagged individual if it uses
a strategy $k$ and interacts with an individual
using strategy $i$. With some abuse of notation,
one denotes by $J(p, q)$ the expected payoff for a
player who uses a mixed strategy $p$ when meeting
another individual who adopts the mixed strategy
$q$. If we define a payoff matrix $A$ and consider
$p$ and $q$ to be column vectors, then $J(p, q) =
p' A q$. The payoff function $J$ is indeed linear in
$p$ and $q$. A strategy $q$ is called a Nash equilibrium
if

\[ \forall p \in \Delta(K), \quad J(q, q) \geq J(p, q), \tag{2} \]

where $\Delta(K)$ is the set of probabilities over the set
$K$.

Suppose that the whole population uses a
strategy $q$ and that a small fraction $\epsilon$ (called “mutations”)
adopts another strategy $p$. Evolutionary
forces are expected to select against $p$ if

\[ J(q, \epsilon p + (1 - \epsilon) q) > J(p, \epsilon p + (1 - \epsilon) q). \tag{3} \]

Evolutionary Stable Strategies: ESS 
Definition 1 $q$ is said to be an evolutionary stable
strategy (ESS) if for every $p \neq q$ there
exists some $\epsilon_p > 0$ such that (3) holds for all
$\epsilon \in (0, \epsilon_p)$.

The definition of ESS is thus related to a 
robustness property against deviations by a whole 
(possibly small) fraction of the population. This 
is an important difference that distinguishes the 
equilibrium in populations as seen by biologists 
and the standard Nash equilibrium often used in
economic contexts, in which robustness is defined
against the possible deviation of a single user.
Why do we need the stronger type of robustness?
Since we deal with large populations, it
is likely to be expected that from time to time, 
some group of individuals may deviate. Thus 
robustness against deviations by a single user is 
not sufficient to ensure that deviations will not 
develop and end up being used by a growing 
portion of the population. 

Often ESS is defined through the following 
equivalent definition. 

Theorem 1 (Weibull 1995, Proposition 2.1 or
Hofbauer and Sigmund 1998, Theorem 6.4.1,
p. 63) A strategy $q$ is an evolutionary
stable strategy if and only if, for all $p \neq q$, one of the
following conditions holds:

\[ J(q, q) > J(p, q), \tag{4} \]

or

\[ J(q, q) = J(p, q) \ \text{and} \ J(q, p) > J(p, p). \tag{5} \]

In fact, if condition (4) is satisfied, then the 
fraction of mutations in the population will tend 
to decrease (as it has a lower fitness, meaning a 
lower growth rate). Thus, the strategy q is then 
immune to mutations. If (4) does not hold but
condition (5) does, then a population using
$q$ is “weakly” immune against a mutation using
p. Indeed, if the mutant’s population grows, then 
we shall frequently have individuals with strategy 
q competing with mutants. In such cases, the 
condition J(q,p) > J(p,p) ensures that the 
growth rate of the original population exceeds 
that of the mutants. 
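
Conditions (4) and (5) are straightforward to evaluate numerically for
a given mutant. The sketch below does so for a two-strategy game and
tests a candidate $q$ against mutants on a finite grid; this is a
numerical check under an assumed tolerance, not a proof of
evolutionary stability, and the payoff matrix is an illustrative
example.

```python
def payoff(p, A, q):
    """Expected payoff J(p, q) = p' A q for mixed strategies p and q."""
    return sum(p[k] * A[k][i] * q[i]
               for k in range(len(p)) for i in range(len(q)))

def resists(q, p, A, tol=1e-12):
    """True if q satisfies condition (4) or condition (5) against p."""
    jqq, jpq = payoff(q, A, q), payoff(p, A, q)
    if jqq > jpq + tol:                               # condition (4)
        return True
    tie = abs(jqq - jpq) <= tol
    return tie and payoff(q, A, p) > payoff(p, A, p) + tol   # condition (5)

def looks_like_ess(q, A, grid=20):
    """Test q against mutants p = (x, 1 - x) on a grid (two strategies)."""
    mutants = [[i / grid, 1.0 - i / grid] for i in range(grid + 1)]
    return all(resists(q, p, A) for p in mutants
               if max(abs(a - b) for a, b in zip(p, q)) > 1e-9)

A = [[-0.5, 1.0],
     [0.0, 0.5]]
print(looks_like_ess([0.5, 0.5], A))   # interior mixed strategy
print(looks_like_ess([1.0, 0.0], A))   # first pure strategy
```

In this example the mixed strategy ties with every mutant in the sense
of (5) and passes through the second inequality, whereas the first
pure strategy already fails against the second pure strategy.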

A mixed strategy $q$ that satisfies (4) for all $p \neq
q$ is called a strict Nash equilibrium. Recall that a
mixed strategy $q$ that satisfies (2) for all $p \neq q$ is
a Nash equilibrium. We conclude from the above
theorem that being a strict Nash equilibrium implies
being an ESS, and being an ESS implies
being a Nash equilibrium. Note that whereas a
mixed Nash equilibrium is known to exist in a
matrix game, an ESS may not exist. However,
an ESS is known to exist in evolutionary games
where the number of strategies available to each
player is 2 (Weibull 1995).

Proposition 1 In a symmetric game with two 
strategies for each player and no pure Nash 
equilibrium, there exists a unique mixed Nash 
equilibrium which is an ESS. 


Example: The Hawk and Dove Game 

We briefly describe the hawk and dove game 
(Maynard Smith and Price 1973). A bird
that searches for food finds itself competing with
another bird over food and has to decide
whether to adopt a peaceful behavior (dove) 
or an aggressive one (hawk). The advantage of 
behaving aggressively is that in an interaction 
with a peaceful bird, the aggressive one gets 
access to all the food. This advantage comes 
at a cost: a hawk which meets another hawk 
ends up fighting with it and thus takes a risk 
of getting wounded. In contrast, two doves that 
meet in a contest over food share it without 
fighting. The fitnesses for player 1 (who chooses 
a row) are summarized in Table 1, in which the
cost for fighting is taken to be some parameter
$\delta > 1/2$.

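This game has a unique mixed Nash equilibrium
(and thus a unique ESS) in which the
fraction $p$ of aggressive birds is given by

\[ p = \frac{1}{2\delta}, \]

which follows from equating the expected fitness of a hawk,
$p(1/2 - \delta) + (1 - p)$, with that of a dove, $(1 - p)/2$,
under the payoffs in Table 1.

Evolutionary Games, Table 1 The hawk-dove game

         H                  D
H    $1/2 - \delta$         1
D        0                 1/2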
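
The mixed equilibrium of a symmetric two-strategy game can be computed
from the indifference condition between the two pure strategies. The
sketch below applies this to the payoffs of Table 1; the function
names are illustrative, and the closed-form value $1/(2\delta)$ is what
the indifference condition yields for these payoffs.

```python
def mixed_equilibrium_2x2(A):
    """Fraction x playing the first strategy that equalises the fitness of
    both strategies:
    A[0][0]*x + A[0][1]*(1-x) = A[1][0]*x + A[1][1]*(1-x)."""
    (a, b), (c, d) = A
    denom = a - b - c + d
    if denom == 0:
        raise ValueError("no isolated interior equilibrium")
    return (d - b) / denom

def hawk_dove(delta):
    """Payoff matrix of Table 1 (rows and columns ordered hawk, dove)."""
    return [[0.5 - delta, 1.0], [0.0, 0.5]]

for delta in (0.75, 1.0, 2.0):
    print(delta, mixed_equilibrium_2x2(hawk_dove(delta)))
```

A larger fighting cost $\delta$ lowers the equilibrium fraction of
hawks, and the requirement $\delta > 1/2$ keeps that fraction below one.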


Extension: Evolutionary Stable Sets 

Assume that there are two mixed strategies $p_i$ and
$p_j$ that have the same performance against each
other, i.e., $J(p_i, p_j) = J(p_j, p_j)$. Then neither
one of them can be an ESS, even if they are
quite robust against other strategies. Now assume
that when excluding one of them from the set
of mixed strategies, the other one is an ESS.
This could imply that different combinations of
these two ESS’s could coexist and would together
be robust to any other mutations. This motivates
the following definition of an ESSet (Cressman
2003):

Definition 2 A set $E$ of symmetric Nash equilibria
is an evolutionarily stable set (ESSet) if, for all
$q \in E$, we have $J(q, p) > J(p, p)$ for all $p \notin E$
such that $J(p, q) = J(q, q)$.

Properties of ESSet: 

(i) For all $p$ and $p'$ in an ESSet $E$, we have
$J(p', p) = J(p, p)$.

(ii) If a mixed strategy is an ESS, then the 
singleton containing that mixed strategy is 
an ESSet. 

(iii) If the ESSet is not a singleton, then there is 
no ESS. 

(iv) If a mixed strategy is in an ESSet, then it is 
a Nash equilibrium (see Weibull 1995, p. 48, 
Example 2.7). 

(v) Every ESSet is a disjoint union of Nash 
equilibria. 

(vi) A perturbation of a mixed strategy which is 
in the ESSet can move the system to another 
mixed strategy in the ESSet. In particular, 
every ESSet is asymptotically stable for the 
replicator dynamics (Cressman 2003). 


Summary and Future Directions 

The entry has provided an overview of the foundations
of evolutionary games, which include the
ESS (evolutionary stable strategy) equilibrium 
concept that is stronger than the standard Nash 
equilibrium and the modeling of the dynamics of 
the competition through the replicator dynamics. 
The evolutionary game framework is a first step in
linking game theory to evolutionary processes. 
The payoff of a player is identified as its fitness, 
i.e., the rate of reproduction. Further development
of this mathematical tool is needed for
handling hierarchical fitness, i.e., the cases where 
the individual that interacts cannot be directly 
identified with the reproduction as it is part of a 
larger body. For example, the behavior of a blood 
cell in the human body when interacting with a 
virus cannot be modeled as directly related to 
the fitness of the blood cell but rather to that of 
the human body. A further development of the 
theory of evolutionary games is needed to define 
meaningful equilibrium notions and relate them 
to replication in such contexts. 

Cross-References 

► Dynamic Noncooperative Games 

► Game Theory: Historical Overview 

Recommended Reading 

Several books cover evolutionary game theory 
well. These include Cressman (2003), Hofbauer 
and Sigmund (1998), Sandholm (2009), Vincent 
and Brown (2005), and Weibull (1995). In addition,
the book The Selfish Gene by Dawkins
presents an excellent background on evolution in 
biology. 

Abstract

Understanding the effect of the experiment on the
estimation result is a crucial part of system
identification. If the experiment is constrained
or otherwise fixed, then the implied limitations
need to be understood. If the experiment
can be designed, then, given its fundamental
importance, that design parameter should be fully
exploited; this entry gives an understanding of
how it can be exploited. We also briefly discuss
the particulars of identification for model-based
control, one of the main applications of system
identification.


Keywords 
Introduction

The accuracy of an identified model is governed 
by: 

(i) Information content in the data used for estimation

(ii) The complexity of the model structure 

The former is related to the noise properties and 
the “energy” of the external excitation of the 
system and how it is distributed. In regard to (ii), 
a model structure which is not flexible enough to 
capture the true system dynamics will give rise to 
a systematic error, while an overly flexible model 
will be overly sensitive to noise (so-called overfitting).
The model complexity is closely associated
with the number of parameters used. For a linear 
model structure with n parameters modeling the 
dynamics, it follows from the invariance result 
in Rojas et al. (2009) that to obtain a model 
for which the variance of the frequency function 
estimate is less than $1/\gamma$ over all frequencies, the
signal-to-noise ratio, as measured by input energy
over noise variance, must be at least $n\gamma$. With
energy being power × time and as input power
is limited in physical systems, this indicates that 
the experiment time grows at least linearly with 
the number of model parameters. When the input 
energy budget is limited, the only way around 
this problem is to sacrifice accuracy over certain 
frequency intervals. The methodology to achieve 
this in a systematic way is known as experiment 
design. 
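
The linear growth of the experiment time can be made explicit with a
one-line calculation. The sketch below assumes that input energy equals
input power times duration in consistent units, and simply solves
energy / noise variance $= n\gamma$ for the duration; the function name
and the numerical values are illustrative.

```python
def min_experiment_time(n_params, gamma, noise_var, input_power):
    """Smallest duration T for which the input energy (input_power * T)
    divided by the noise variance reaches n_params * gamma."""
    return n_params * gamma * noise_var / input_power

# Doubling the number of parameters doubles the required duration.
for n in (2, 4, 8):
    print(n, min_experiment_time(n, gamma=100.0, noise_var=0.5,
                                 input_power=2.0))
```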

Model Quality Measures

The Cramér-Rao bound provides a lower bound
on the covariance matrix of the estimation error
for an unbiased estimator. With $\hat{\theta}_N \in \mathbb{R}^n$ denoting
the parameter estimate (based on $N$ input-output
samples) and $\theta_o$ the true parameters,

\[ N\, \mathrm{E}\left[ (\hat{\theta}_N - \theta_o)(\hat{\theta}_N - \theta_o)^T \right] \geq N\, I_F^{-1}(\theta_o, N), \tag{1} \]

where $I_F(\theta_o, N) \in \mathbb{R}^{n \times n}$ appearing in the lower
bound is the so-called Fisher information matrix
(Ljung 1999). For consistent estimators, i.e.,
when $\hat{\theta}_N \to \theta_o$ as $N \to \infty$, the inequality (1)
typically holds asymptotically as the sample
size $N$ grows to infinity. The right-hand
side in (1) is then replaced by the inverse of
the per sample Fisher information $I_F(\theta_o) :=
\lim_{N \to \infty} I_F(\theta_o, N)/N$. An estimator is said to
be asymptotically efficient if equality is reached
in (1) as $N \to \infty$.

Even though it is possible to reduce the mean- 
square error by constraining the model flexibility 
appropriately, it is customary to use consistent 
estimators since the theory for biased estimators 
is still not well understood. For such estimators, 
using some function of the Fisher information as 
performance measure is natural. 


General-Purpose Quality Measures 

Over the years a number of “general-purpose” 
quality measures have been proposed. Perhaps 
the most frequently used is the determinant of the 
inverse Fisher information. This represents the 
volume of confidence ellipsoids for the parameter 
estimates and minimizing this measure is known 
as D-optimal design. Two other criteria relating 
to confidence ellipsoids are E-optimal design,
which uses the length of the longest principal
axis (determined by the minimum eigenvalue of $I_F$) as quality
measure, and A-optimal design, which uses the
sum of the squared lengths of the principal axes
(the trace of $I_F^{-1}$).
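
The three criteria can be computed directly from a Fisher information
matrix. The sketch below handles the $2 \times 2$ symmetric positive
definite case in closed form; it reports the largest eigenvalue of
$I_F^{-1}$ for the E-criterion, which is the reciprocal of the minimum
eigenvalue of $I_F$. The function names and the example matrix are
illustrative assumptions.

```python
import math

def inverse_2x2(M):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def design_criteria(fisher):
    """D-, E- and A-optimality measures for a symmetric positive definite
    2x2 Fisher information matrix (smaller is better for all three)."""
    P = inverse_2x2(fisher)                       # inverse Fisher information
    det_P = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    trace_P = P[0][0] + P[1][1]
    # Largest eigenvalue of the symmetric matrix P in closed form.
    lam_max = 0.5 * (trace_P + math.sqrt((P[0][0] - P[1][1]) ** 2
                                         + 4.0 * P[0][1] ** 2))
    return {"D": det_P, "E": lam_max, "A": trace_P}

print(design_criteria([[4.0, 0.0], [0.0, 1.0]]))
```

An experiment that increases the information only in a direction that
is already well excited improves the D- and A-measures but leaves the
E-measure unchanged, which is one way the criteria can rank designs
differently.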


Application-Oriented Quality Measures 

When demands are high and/or experimentation 
resources are limited, it is necessary to tailor the 
experiment carefully according to the intended 
use of the model. Below we will discuss a couple 
of closely related application-oriented measures. 

Average Performance Degradation 
Let $V_{\mathrm{app}}(\theta) \geq 0$ be a measure of how well the
model corresponding to parameter $\theta$ performs
when used in the application. In finance, $V_{\mathrm{app}}$
can, e.g., represent the ability to predict the stock
market. In process industry, $V_{\mathrm{app}}$ can represent the
profit gained using a feedback controller based
on the model corresponding to $\theta$. Let us assume
that $V_{\mathrm{app}}$ is normalized such that $\min_\theta V_{\mathrm{app}}(\theta) =
V_{\mathrm{app}}(\theta_o) = 0$. That $V_{\mathrm{app}}$ has its minimum at
the parameters of the true system is
quite natural. We will call $V_{\mathrm{app}}$ the application
cost. Assuming that the estimator is asymptotically
efficient, using a second-order Taylor approximation
gives that the average application
cost can be expressed as (the first-order term
vanishes since $\theta_o$ is the minimizer of $V_{\mathrm{app}}$)

\[ \mathrm{E}\left[ V_{\mathrm{app}}(\hat{\theta}_N) \right] \approx \frac{1}{2}\, \mathrm{E}\left[ (\hat{\theta}_N - \theta_o)^T V''_{\mathrm{app}}(\theta_o) (\hat{\theta}_N - \theta_o) \right] = \frac{1}{2N}\, \mathrm{Tr}\left\{ V''_{\mathrm{app}}(\theta_o)\, I_F^{-1}(\theta_o) \right\}. \tag{2} \]

This is a generalization of the A-optimal design
measure and its minimization is known as
L-optimal design.
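
The trace expression for the average application cost,
$\frac{1}{2N} \mathrm{Tr}\{V''_{\mathrm{app}}(\theta_o) I_F^{-1}(\theta_o)\}$, is simple to
evaluate once the Hessian of the application cost and the per sample
Fisher information are available. The sketch below does this for
$2 \times 2$ matrices; the function name and the numerical values are
hypothetical.

```python
def average_application_cost(hessian, fisher_per_sample, N):
    """Approximate E[V_app] as Tr(V''_app(theta_o) * I_F^{-1}) / (2N)
    for 2x2 matrices given as nested lists."""
    (a, b), (c, d) = fisher_per_sample
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    # Trace of the product hessian @ inv without forming the full product.
    trace = sum(hessian[i][j] * inv[j][i]
                for i in range(2) for j in range(2))
    return trace / (2.0 * N)

hessian = [[2.0, 0.0], [0.0, 8.0]]     # hypothetical V''_app(theta_o)
fisher = [[4.0, 0.0], [0.0, 1.0]]      # hypothetical per sample I_F
print(average_application_cost(hessian, fisher, N=100))
```

In this example the second parameter is both poorly excited and
heavily weighted by the application, so it dominates the cost; an
L-optimal design would shift excitation toward it.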

Acceptable Performance 

Alternatively, one may define a set of acceptable
models, i.e., a set of models which will give
acceptable performance when used in the application.
With a performance degradation measure
of the type $V_{\mathrm{app}}$ defined above, this would be a
level set

\[ \mathcal{E}_{\mathrm{app}} = \left\{ \theta : V_{\mathrm{app}}(\theta) \leq \frac{1}{\gamma} \right\} \tag{3} \]

for some constant $\gamma > 0$. The objective of
the experiment design is then to ensure that the
resulting estimate ends up in $\mathcal{E}_{\mathrm{app}}$ with high probability.


Design Variables 

In an identification experiment there are a number 
of design variables at the user’s disposal. Below 
we discuss three of the most important ones. 

Sampling Interval 

For the sampling interval, the general advice from
an information theoretic point of view is to sample
as fast as possible (Ljung 1999). However,
sampling much faster than the time constants of 
the system may lead to numerical issues when 
estimating discrete time models as there will be 
poles close to the unit circle. Downsampling may 
thus be required. 

Feedback 

Generally speaking, feedback has three effects 
from an identification and experiment design 
point of view: 

(i) Not all the power in the input can be used to
estimate the system dynamics when a noise
model is estimated, as a part of the input
signal has to be used for the latter task; see
Section 8.1 in Forssell and Ljung (1999). 
When a very flexible noise model is used, 
the estimate of the system dynamics then has 
to rely almost entirely on external excitation. 

(ii) Feedback can reduce the effect of disturbances
and noise at the output. When there
are constraints on the outputs, this allows for 
larger (input) excitation and therefore more 
informative experiments. 

(iii) The cross-correlation between input and 
noise/disturbances requires good noise 
models to avoid biased estimates (Ljung 
1999). 

Strictly speaking, (i) is only valid when the system
and noise models are parametrized separately.
Items (i) and (ii) imply that when there
are constraints on the input only, then the optimal
design is always in open loop, whereas for
output constrained only problems, the experiment
should be conducted in closed loop (Agüero and
Goodwin 2007).

External Excitation Signals 

The most important design variable is the external 
excitation, including the length of the experiment. 
Even for moderate experiment lengths, 
solving optimal experiment design problems with 
respect to the entire excitation sequence can be a 
formidable task. Fortunately, for experiments of 
reasonable length, the design can be split up in 
two steps: 

(i) First, optimization of the probability density 
function of the excitation 

(ii) Generation of the actual sequence from the 
obtained density function through a stochastic 
simulation procedure 

More details are provided in section “Computational 
Issues.” 


Experimental Constraints 

An experiment is always subject to constraints, 
physical as well as economical. Such constraints 
are typically translated into constraints on the 
following signal properties: 

(i) Variability. For example, too high level of 
excitation may cause the end product to 
go off-spec, resulting in product waste and 
associated high costs. 

(ii) Frequency content. Often, too harsh movements 
of the inputs may damage equipment. 

(iii) Amplitudes. For example, actuators have 
limited range, restricting input amplitudes. 

(iv) Waveforms. In the process industry, it is not 
uncommon that control equipment limits the 
type of signals that can be applied. In other 
applications, it may be physically possible to 
realize only certain types of excitation. See 
section “Waveform Generation” for further 
discussion. 

It is also often desired to limit the experiment 
time so that the process may go back to normal 
operation, reducing, e.g., cost of personnel. The 
latter is especially important in the process 
industry where dynamics are slow. The above type 
of constraints can be formulated as constraints on 
the design variables in section “Design Variables” 
and associated variables. 


Experiment Design Criteria 

There are two principal ways to define an optimal 
experiment design problem: 

(i) Best effort. Here the best quality as, e.g., 
given by one of the quality measures in 
section “Model Quality Measures” is sought 
under constraints on the experimental effort 
and cost. This is the classical problem 
formulation. 

(ii) Least-costly. The cheapest experiment is 
sought that results in a predefined model 
quality. Thus, as compared to best effort 
design, the optimization criterion and 
constraint are interchanged. This type of 
design was introduced by Bombois and 
coworkers; see Bombois et al. (2006). 

As shown in Rojas et al. (2008), the two approaches 
typically lead to designs only differing 
by a scaling factor. 


Computational Issues 


The optimal experiment design problem based on 
the Fisher information is typically non-convex. 
For example, consider a finite-impulse response 
model subject to an experiment of length N with 
the measured outputs collected in the vector 

    $$Y = \Phi\theta + E, \qquad \Phi = \begin{bmatrix} u(0) & \cdots & u(-(n-1)) \\ \vdots & & \vdots \\ u(N-1) & \cdots & u(N-n) \end{bmatrix}$$

where $E \in \mathbb{R}^N$ is zero-mean Gaussian noise with 
covariance matrix $\sigma^2 I_{N \times N}$. Then it holds that 

    $$I_F(\theta_o, N) = \frac{1}{\sigma^2}\,\Phi^T\Phi \qquad (4)$$

From an experiment design point of view, the 
input vector $u = [u(-(n-1)) \; \ldots \; u(N)]^T$ is 
the design variable, but with the elements of 
$I_F(\theta_o, N)$ being a quadratic function of the input 
sequence, all typical quality measures become 
non-convex. 

While various methods for non-convex numerical 
optimization can be used to solve such 
problems, they often encounter problems with, 
e.g., local minima. To address this, a number 
of techniques have been developed where either 
the problem is reparametrized so that it becomes 
convex or where a convex approximation is used. 
The latter technique is called convex relaxation 
and is often based on a reparametrization as well. 
We use the example above to provide a flavor of 
the different techniques. 

Reparametrization 

If the input is constrained to be periodic so that 
u(t) = u(t + N), t = -n, ..., -1, it follows 
that the Fisher information is linear in the sample 
correlations of the input. Using these as design 
variables instead of u makes all quality 
measures referred to above convex functions. 

This reparametrization thus results in the two-step 
procedure discussed in section “External 
Excitation Signals”: First, the sample correlations 
are obtained from an optimal experiment design 
problem, and then an input sequence is generated 
that has this sample correlation. In the second 
step there is considerable freedom. Notice, 
however, that since correlations do not directly 
relate to the actual amplitudes of the resulting 
signals, it is difficult to incorporate waveform 
constraints in this approach. In contrast, 
variance constraints are easy to incorporate. 

Convex Relaxations 

There are several approaches to obtain convex 
relaxations. 


Using the per Sample Fisher Information 
If the input is a realization of a stationary random 
process and the sample size N is large enough, 
$I_F(\theta_o, N)/N$ is approximately equal to the per 
sample Fisher matrix which only depends on 
the correlation sequence of the input. Using this 
approximation, one can now follow the same 
procedure as in the reparametrization approach 
and first optimize the input correlation sequence. 
The generation of a stationary signal with a certain 
correlation is a stochastic realization problem 
which can be solved using spectral factorization 
followed by filtering a white noise sequence, i.e., 
a sequence of independent identically distributed 
random variables, through the (stable) spectral 
factor (Jansson and Hjalmarsson 2005). 
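
The filtering step can be sketched for one simple case. The example 
below assumes, purely for illustration, a desired correlation 
sequence r(tau) = a^|tau|, for which the stable spectral factor is 
the first-order filter sqrt(1 - a^2) / (1 - a q^-1). The function 
names are hypothetical. 

```python
import math
import random


def filter_white_noise(a, n_samples, seed=0):
    """Filter unit-variance white Gaussian noise through
    sqrt(1 - a^2) / (1 - a q^-1), giving correlation r(tau) = a^|tau|."""
    rng = random.Random(seed)
    gain = math.sqrt(1.0 - a * a)
    state = rng.gauss(0.0, 1.0)  # start in the stationary distribution
    u = []
    for _ in range(n_samples):
        state = a * state + gain * rng.gauss(0.0, 1.0)
        u.append(state)
    return u


def sample_correlation(u, lag):
    """Biased sample correlation (1/N) * sum_t u(t) u(t - lag)."""
    return sum(u[t] * u[t - lag] for t in range(lag, len(u))) / len(u)


u = filter_white_noise(a=0.8, n_samples=50_000, seed=1)
for lag in range(4):
    print(lag, round(sample_correlation(u, lag), 3), round(0.8 ** lag, 3))
```

For a general optimized correlation sequence, the spectral factor 
has to be computed numerically; the first-order case is used here 
only because its factor is known in closed form. 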

More generally, it turns out that the per sample 
Fisher information for linear models/systems 
only depends on the joint input/noise spectrum 
(or the corresponding correlation sequence). 
A linear parametrization of this quantity thus 
typically leads to a convex problem (Jansson and 
Hjalmarsson 2005). 
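
For the finite-impulse response example, the linear dependence on 
the correlation sequence can be checked numerically. The sketch 
below assumes a periodic input, so that the relation between the 
Fisher information and the (circular) sample correlations is exact 
rather than asymptotic. All helper names are hypothetical. 

```python
import random


def circular_correlations(u, n):
    """r_k = (1/N) * sum_t u(t) u(t - k), indices taken modulo N."""
    N = len(u)
    return [sum(u[t] * u[(t - k) % N] for t in range(N)) / N
            for k in range(n)]


def fisher_from_correlations(r, N, sigma2):
    """Toeplitz matrix (N / sigma^2) * [r_|i-j|]: linear in r."""
    n = len(r)
    return [[N * r[abs(i - j)] / sigma2 for j in range(n)] for i in range(n)]


def fisher_from_input(u, n, sigma2):
    """(1 / sigma^2) * Phi^T Phi for an FIR model with periodic input."""
    N = len(u)
    phi = [[u[(t - i) % N] for i in range(n)] for t in range(N)]
    return [[sum(phi[t][i] * phi[t][j] for t in range(N)) / sigma2
             for j in range(n)] for i in range(n)]


rng = random.Random(0)
u = [rng.uniform(-1.0, 1.0) for _ in range(40)]
direct = fisher_from_input(u, n=3, sigma2=0.5)
via_r = fisher_from_correlations(circular_correlations(u, 3), N=40, sigma2=0.5)
print(max(abs(direct[i][j] - via_r[i][j]) for i in range(3) for j in range(3)))
```

The printed difference is at the level of rounding errors, which 
reflects that the matrix is quadratic in u but linear in r. 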

The set of all spectra is infinite dimensional 
and this precludes a search over all possible spectra. 
However, since there is a finite-dimensional 
parametrization of the per sample Fisher information 
(it is a symmetric n × n matrix), it is also possible 
to find finite-dimensional sets of spectra that 
parametrize all possible per sample Fisher information 
matrices. Multisines with appropriately 
chosen frequencies are one possibility. However, 
even though all per sample Fisher information 
matrices can be generated, the solution may be 
suboptimal depending on which constraints the 
problem contains. 

The situation for nonlinear problems is conceptually 
the same, but here the entire probability 
density function of the stationary process 
generating the input plays the same role as the 
spectrum in the linear case. This is a much more 
complicated object to parametrize. 


Lifting 

An approach that can deal with amplitude constraints 
is based on a so-called lifting technique: 
Introduce the matrix $U = uu^T$, representing 
all possible products of the elements of u. This 
constraint is equivalent to 

    $$\begin{bmatrix} U & u \\ u^T & 1 \end{bmatrix} \succeq 0, \qquad \operatorname{rank} \begin{bmatrix} U & u \\ u^T & 1 \end{bmatrix} = 1 \qquad (5)$$

The idea of lifting is now to observe that the 
Fisher information matrix is linear in the elements 
of U and by dropping the rank constraint 
in (5) a convex relaxation is obtained, where both 
U and u (subject to the matrix inequality in (5)) 
are decision variables. 


Frequency-by-Frequency Design 
An approximation for linear systems that allows 
frequency-by-frequency design of the input spectrum 
and feedback is obtained by assuming that 
the model is of high order. Then the variance of an 
nth-order estimate, $G(e^{i\omega}, \hat\theta_N)$, of the frequency 
function can approximately be expressed as 

    $$\operatorname{Var}\, G(e^{i\omega}, \hat\theta_N) \approx \frac{n}{N}\,\frac{\Phi_v(\omega)}{\Phi_u(\omega)} \qquad (6)$$

(►System Identification: An Overview) in the 
open loop case (there is a closed-loop extension 
as well), where $\Phi_u$ and $\Phi_v$ are the input and noise 
spectra, respectively. Performance measures of 
the type (2) can then be written as 

    $$\int_{-\pi}^{\pi} W(e^{i\omega})\,\frac{\Phi_v(\omega)}{\Phi_u(\omega)}\,d\omega$$

where the weighting $W(e^{i\omega}) \ge 0$ depends on the 
application. When only variance constraints are 
present, such problems can be solved frequency 
by frequency, providing both simple calculations 
and insight into the design. 


Implementation 

We have used the notation $I_F(\theta_o, N)$ to indicate 
that the Fisher information typically (but not always) 
depends on the parameter corresponding to 
the true system. That the optimal design depends 
on the to-be-identified system is a fundamental 
problem in optimal experiment design. There are 
two basic approaches to address this problem 
which are covered below. Another important aspect 
is the choice of waveform for the external 
excitation signal. This is covered last in this 
section. 


Robust Experiment Design 

In robust experiment design, it is assumed that 
it is known beforehand that the true parameter 
belongs to some set, i.e., $\theta_o \in \Theta$. A minimax 
approach is then typically taken, finding the experiment 
that minimizes the worst performance 
over the set $\Theta$. Such optimization problems are 
computationally very difficult. 

Adaptive Experiment Design 

The alternative to robust experiment design is 
to perform the design adaptively or sequentially, 
meaning that first a design is performed based 
on some initial “guess” of the true parameter, 
and then as samples are collected, the design is 
revised taking advantage of the data information. 
Interestingly, the convergence rate of the parameter 
estimate is typically sufficiently fast that for 
this approach the asymptotic distribution is the 
same as for the design based on the true model 
parameter (Hjalmarsson 2009). 

Waveform Generation 

We have argued above that it is the spectrum of 
the excitation (together with the feedback) that 
determines the achieved model accuracy in the 
linear time-invariant case. In section “Using the 
per Sample Fisher Information” we argued that a 
signal with a particular spectrum can be obtained 
by filtering a white noise sequence through a 
stable spectral factor of the desired spectrum. 
However, we have also in section “Experimental 
Constraints” argued that particular applications 
may require particular waveforms. We will here 
elaborate further on how to generate a waveform 
with desired characteristics. 

From an accuracy point of view, there are two 
general issues that should be taken into account 
when the waveform is selected: 

• Persistence of excitation. A signal with a spectrum 
having n nonzero frequencies (on the 
interval $(-\pi, \pi]$) can be used to estimate at 
most n parameters. Thus, as is typically the 
case, if there is uncertainty regarding which 
model structure to use before the experiment, 
one has to ensure that a sufficient number of 
frequencies is excited. 

• The crest factor. For all systems, the maximum 
input amplitude, say A, is constrained. To deal 
with this from an experiment design point of 
view, it is convenient to introduce what is 
called the crest factor of a signal: 

    $$C_r = \frac{\max_t u^2(t)}{\lim_{N\to\infty} \frac{1}{N}\sum_{t=1}^{N} u^2(t)}$$

The crest factor is thus the ratio between 
the squared maximum amplitude and the 
power of the signal. Now, for a class of signal 
waveforms with a given crest factor, the input 
power that can be used is upper-bounded 
by 

    $$\lim_{N\to\infty} \frac{1}{N}\sum_{t=1}^{N} u^2(t) \le \frac{A^2}{C_r} \qquad (7)$$

However, the power is the integral of the 
signal spectrum, and since increasing the 
amplitude of the input signal spectrum will 
increase a model’s accuracy, cf. (6), it is 
desirable to use as much signal power as 
possible. By (7) we see that this means that 
waveforms with low crest factor should be 
used. 
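
A minimal sketch of this trade-off follows. It takes the crest 
factor to be the ratio between the squared maximum amplitude and 
the power, as described above, and evaluates it over a finite 
record instead of in the limit. The function names and the 
amplitude limit are illustrative assumptions. 

```python
import math


def crest_factor(u):
    """Squared peak amplitude divided by average power of the record."""
    power = sum(x * x for x in u) / len(u)
    return max(x * x for x in u) / power


def max_power(amplitude_limit, cr):
    """Largest usable power under |u(t)| <= A for a given crest factor."""
    return amplitude_limit ** 2 / cr


binary = [1.0, -1.0] * 500
sine = [math.sin(2.0 * math.pi * t / 1000) for t in range(1000)]

A = 2.0  # hypothetical amplitude limit
for name, u in (("binary", binary), ("sine", sine)):
    cr = crest_factor(u)
    print(f"{name:6s}  crest factor = {cr:.3f}  usable power = {max_power(A, cr):.3f}")
```

Under the same amplitude limit, the binary waveform can carry twice 
the power of the sinusoid. 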

A lower bound for the crest factor is readily seen 
to be 1. This bound is achieved for binary symmetric 
signals. Unfortunately, there exists no systematic 
way to design a binary sequence that has 
a prescribed spectrum. However, the so-called 
arcsin law may be used. It states that the sign 
of a zero-mean Gaussian process with correlation 
sequence $r_\tau$ gives a binary signal having correlation 
sequence $\tilde r_\tau = (2/\pi) \arcsin(r_\tau)$. With $\tilde r_\tau$ 
given, one can try to solve this relation for the 
corresponding $r_\tau$. 
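
The relation can be tried out on a simple case. The sketch below 
assumes a unit-variance Gaussian process with correlation 
a^|tau| (an illustrative choice, as are the function names), 
inverts the arcsin law at lag 1, and compares the prediction with 
a simulated sign sequence. 

```python
import math
import random


def binary_correlation(r_gauss):
    """Arcsin law: correlation of sign(x) for unit-variance Gaussian x."""
    return 2.0 / math.pi * math.asin(r_gauss)


def required_gaussian_correlation(r_binary):
    """Inverse of the arcsin law for a single lag."""
    return math.sin(math.pi * r_binary / 2.0)


def sign_of_gaussian_ar1(a, n_samples, seed=0):
    """Sign of a unit-variance Gaussian process with r(tau) = a^|tau|."""
    rng = random.Random(seed)
    gain = math.sqrt(1.0 - a * a)
    x = rng.gauss(0.0, 1.0)
    out = []
    for _ in range(n_samples):
        x = a * x + gain * rng.gauss(0.0, 1.0)
        out.append(1.0 if x >= 0.0 else -1.0)
    return out


target = 0.5                                  # desired binary lag-1 correlation
a = required_gaussian_correlation(target)     # Gaussian lag-1 correlation
s = sign_of_gaussian_ar1(a, 50_000, seed=2)
estimate = sum(s[t] * s[t - 1] for t in range(1, len(s))) / len(s)
print(round(a, 4), round(estimate, 3))
```

Only lag 1 is matched here. The remaining lags follow from the 
chosen Gaussian process, and an arbitrary target sequence need not 
map back to a valid Gaussian correlation sequence, which is why 
the inversion can only be attempted. 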

A crude, but often sufficient, method to generate 
binary sequences with desired spectral content 
is based on the use of pseudorandom binary 
signals (PRBS). Such signals (which are generated 
by a shift register) are periodic signals which 
have correlation sequences similar to random 
white noise, i.e., a flat spectrum. By resampling 
such sequences, the spectrum can be modified. 
It should be noted that binary sequences are less 
attractive when it comes to identifying nonlinearities. 
This is easy to understand by considering a 
static system. If only one amplitude of the input is 
used, it will be impossible to determine whether 
the system is nonlinear or not. 
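
A shift-register generator can be sketched as follows. The register 
length and feedback taps (x^4 + x^3 + 1, period 15) are an 
illustrative choice, and the function names are hypothetical. 

```python
def prbs(n_bits, taps, length):
    """+1/-1 sequence from a Fibonacci linear feedback shift register.

    taps are 1-indexed register positions that are XOR-ed to form the
    feedback bit; (4, 3) corresponds to the polynomial x^4 + x^3 + 1.
    """
    reg = [1] * n_bits
    out = []
    for _ in range(length):
        out.append(1.0 if reg[-1] else -1.0)
        feedback = 0
        for tap in taps:
            feedback ^= reg[tap - 1]
        reg = [feedback] + reg[:-1]
    return out


def hold(u, factor):
    """Repeat every sample `factor` times (shifts power to low frequencies)."""
    return [x for x in u for _ in range(factor)]


M = 15
u = prbs(4, (4, 3), M)
for lag in range(4):
    r = sum(u[t] * u[(t + lag) % M] for t in range(M)) / M
    print(lag, round(r, 4))  # 1 at lag 0 and -1/M at the other lags
```

The nearly flat circular correlation is what makes the signal 
resemble white noise, and `hold` is one simple form of the 
resampling mentioned above. 
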

A PRBS is a periodic signal and can therefore 
be split into its Fourier terms. With a period of 
M, each such term corresponds to one frequency 
on the grid $2\pi k/M$, k = 0, ..., M - 1. Such a 
signal can thus be used to estimate at most M 
parameters. Another way to generate a signal with 
period M is to add sinusoids corresponding to 
the above frequencies, with desired amplitudes. 
A periodic signal generated in this way is commonly 
referred to as a multisine. The crest factor 
of a multisine depends heavily on the relation 
between the phases of the sinusoids, times the 
number of sinusoids. It is possible to optimize the 
crest factor with respect to the choice of phases 
(Rivera et al. 2009). There also exist simple 
deterministic methods for choosing phases that 
give a good crest factor, e.g., Schroeder phasing. 
Alternatively, phases can be drawn randomly and 
independently, giving what is known as random-phase 
multisines (Pintelon and Schoukens 2012), 
a family of random signals with properties similar 
to Gaussian signals. Periodic signals have some 
useful features: 

• Estimation of nonlinearities. A linear time-invariant 
system responds to a periodic input 
signal with a signal consisting of the same 
frequencies, but with different amplitudes 
and phases. Thus, it can be concluded 
that the system is nonlinear if the output 
contains other frequencies than the input. 
This can be exploited in a systematic way 
to estimate also the nonlinear part of a 
system. 

• Estimation of noise variance. For a linear 
time-invariant system, the difference in the 
output between different periods is due entirely 
to the noise if the system is in steady 
state. This can be used to devise simple methods 
to estimate the noise level. 

• Data compression. By averaging measurements 
over different periods, the noise level 
can be reduced at the same time as the number 
of measurements is reduced. 
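
The influence of the phases on the crest factor of a multisine can 
be sketched numerically. The example assumes equal amplitudes, the 
common Schroeder choice phi_k = -pi k (k - 1) / K, and the crest 
factor taken as squared peak over power. The number of sinusoids 
and the period are illustrative. 

```python
import math


def multisine(num_sines, period, phases):
    """Sum of unit-amplitude cosines on the grid 2*pi*k/period, k = 1..K."""
    return [sum(math.cos(2.0 * math.pi * k * t / period + phases[k - 1])
                for k in range(1, num_sines + 1))
            for t in range(period)]


def schroeder_phases(num_sines):
    """Schroeder phases for a flat amplitude spectrum."""
    return [-math.pi * k * (k - 1) / num_sines
            for k in range(1, num_sines + 1)]


def crest_factor(u):
    """Squared peak amplitude divided by average power over one period."""
    power = sum(x * x for x in u) / len(u)
    return max(x * x for x in u) / power


K, M = 20, 256
zero_phase = multisine(K, M, [0.0] * K)
schroeder = multisine(K, M, schroeder_phases(K))
print("zero phases:     ", round(crest_factor(zero_phase), 2))
print("Schroeder phases:", round(crest_factor(schroeder), 2))
```

With all phases equal, the sinusoids peak simultaneously and the 
ratio equals 2K, whereas the Schroeder choice spreads the peaks 
over the period. 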

Further details on waveform generation and 
general-purpose signals useful in system 
identification can be found in Pintelon and 
Schoukens (2012) and Ljung (1999). 


Implications for the Identification 
Problem Per Se 

In order to get some understanding of how 
optimal experimental conditions influence the 
identification problem, let us return to the 
finite-impulse response model example in 
section “Computational Issues.” Consider a least-costly 
setting with an acceptable performance 
constraint. More specifically, we would like 
to use the minimum input energy that ensures 
that the parameter estimate ends up in a set 
of the type (3). An approximate solution to 
this is that a 99 % confidence ellipsoid for the 
resulting estimate is contained in $\mathcal{E}_{app}$. Now, 
it can be shown that a confidence ellipsoid is 
a level set for the average least-squares cost 

    $$E[V_N(\theta)] = E[\|Y - \Phi\theta\|^2] = \|\theta - \theta_o\|^2_{\Phi^T\Phi} + \sigma^2.$$

Assuming the application cost $V_{app}$ also is 
quadratic in $\theta$, it follows after a little bit of 
algebra (see Hjalmarsson 2009) that it must hold 
that 

    $$E[V_N(\theta)] \ge \sigma^2 \left(1 + \gamma c\, V_{app}(\theta)\right), \quad \forall\theta \qquad (8)$$

for a constant c that is not important for our 
discussion. The value of $E[V_N(\theta)] = \|\theta - \theta_o\|^2_{\Phi^T\Phi} + \sigma^2$ 
is determined by how large the 
weighting $\Phi^T\Phi$ is, which in turn depends on how 
large the input u is. In a least-costly setting with 
the energy $\|u\|^2$ as criterion, the best solution 
would be that we have equality in (8). Thus we 
see that optimal experiment design tries to shape 
the identification criterion after the application 
cost. We have the following implications of this 
result: 

(i) Perform identification under appropriate 
scaling of the desired operating conditions. 
Suppose that $V_{app}(\theta)$ is a function of 
how the system outputs deviate from a 
desired trajectory (determined by $\theta_o$). 
Performing an experiment which follows 
the desired trajectory then gives that the 
sum of the squared prediction errors is 
an approximation of $V_{app}(\theta)$, at least for 
parameters close to $\theta_o$. Obtaining equality 
in (8) typically requires an additional scaling 
of the input excitation or the length of 
the experiment. The result is intuitively 
appealing: The desired operating conditions 
should reveal the system properties that are 
important in the application. 

(ii) Identification cost for application performance. 
We see that the required energy 
grows (almost) linearly with $\gamma$, which 
is a measure of how close to the ideal 
performance (using the true parameter $\theta_o$) 
we want to come. Furthermore, it is typical 
that as the performance requirements in the 
application increase, the sensitivity to model 
errors increases. This means that $V_{app}(\theta)$ 
increases, which thus in turn means that the 
identification cost increases. In summary, 
the higher the performance required in the 
application, the higher the identification 
cost. The inequality (8) can be used 
to quantify this relationship. 

(iii) Model structure sensitivity. As $V_{app}$ will be 
sensitive to system properties important for 
the application, while insensitive to system 
properties of little significance, with the 
identification criterion $V_N$ matched to $V_{app}$, 
it is only necessary that the model structure 
is able to model the important properties of 
the system. 

In any case, whatever model structure 
that is used, the identified model will be 
the best possible in that structure for the 
intended application. This is very different 
from an arbitrary experiment where it is 
impossible to control the model fit when a 
model of restricted complexity is used. 

We conclude that optimal experiment design 
simplifies the overall system identification 
problem. 

Identification for Control 

Model-based control is one of the most impor¬ 
tant applications of system identification. Robust 
control ensures performance and stability in the 
presence of model uncertainty. However, the majority 
of such design methods do not employ the 
parametric ellipsoidal uncertainty sets resulting 
from standard system identification. In fact, only 
in the last decade have analysis and design tools for 
this type of model uncertainty started to 
emerge, e.g., Raynaud et al. (2000) and Gevers 
et al. (2003). 

The advantages of matching the identification 
criterion to the application have long been 
recognized in this line of research. For control 
applications this typically implies that the 
identification experiment should be performed under 
the same closed-loop operation conditions as the 
controller to be designed. This was perhaps first 
recognized in the context of minimum variance 
control (see Gevers and Ljung 1986) where variance 
errors were the concern. Later on this was 
recognized to be the case also for the bias error, 
although here pre-filtering can be used to achieve 
the same objective. 

To account for the fact that the controller to be 
designed is not available, techniques where 
control and identification are iterated have been 
developed, cf. adaptive experiment design in 
section “Adaptive Experiment Design.” Convergence 
of such schemes has been established when 
the true system is in the model set but has proved 
out of reach for models of restricted complexity. 

In recent years, techniques integrating experiment 
design and model predictive control have 
started to appear. A general-purpose design criterion 
is used in Rathouský and Havlena (2013), 
while Larsson et al. (2013) uses an application- 
oriented criterion. 


Summary and Future Directions 

When there is the “luxury” to design the experiment, 
then this opportunity should be seized by 
the user. Without informative data there is little 
that can be done. In this exposé we have outlined 
the techniques that exist but also emphasized 
that a well-conceived experiment, reflecting the 
intended application, can significantly simplify 
the overall system identification problem. 

Further developments of computational techniques 
are high on the agenda, e.g., how to handle 
time-domain constraints and nonlinear models. 
To this end, developments in optimization 
methods are rapidly being incorporated. While, 
as reported in Hjalmarsson (2009), there are some 
results on how the identification cost depends on 
the performance requirements in the application, 
further understanding of this issue is highly 
desirable. Theory and further development of 
the emerging model predictive control schemes 
equipped with experiment design may very well 
be the direction that will have most impact in 
practice. 


Cross-References 

► System Identification: An Overview 


Recommended Reading 

A classical text on optimal experiment design 
is Fedorov (1972). The textbooks Goodwin 
and Payne (1977) and Zarrop (1979) cover this 
theory adapted to a dynamical system framework. 
A general overview is provided in Pronzato 
(2008). A semi-definite programming framework 
based on the per sample Fisher information is 
provided in Jansson and Hjalmarsson (2005). 
The least-costly framework is covered in 
Bombois et al. (2006). The lifting technique 
was introduced for input design in Manchester 
(2010). Details of the frequency-by-frequency 
design approach can be found in Ljung (1999). 
References to robust and adaptive experiment 
design can be found in Pronzato (2008) and 
Hjalmarsson (2009). For an account of the 
implications of optimal experiment design for 
the system identification problem as a whole, 
see Hjalmarsson (2009). Thorough accounts of 
the developments in identification for control 
are provided in Hjalmarsson (2005) and Gevers 
(2005). 


Abstract 

Model predictive control (MPC) has been used 
in the process industries for more than 30 years 
because of its ability to control multivariable 
systems in an optimized way under constraints 
on input and output variables. Traditionally, MPC 
requires the solution of a quadratic program 
(QP) online to compute the control action, often 
restricting its applicability to slow processes. 
Explicit MPC completely removes the need for 
on-line solvers by precomputing the control law 
off-line, so that online operations reduce to a 
simple function evaluation. Such a function is 
piecewise affine in most cases, so that the MPC 
controller is equivalently expressed as a lookup 
table of linear gains, a form that is extremely easy 
to code, requires only basic arithmetic operations, 
and requires a maximum number of iterations that 
can be exactly computed a priori. 


Keywords 

Constrained control; Embedded optimization; 
Model predictive control; Multiparametric 


Introduction 

Model predictive control (MPC) is a well-known 
methodology for synthesizing feedback control 
laws that optimize closed-loop performance 
subject to prespecified operating constraints 
on inputs, states, and outputs (Borrelli et al. 
2011; Mayne and Rawlings 2009). In MPC, the 
control action is obtained by solving a finite 
horizon open-loop optimal control problem at 
each sampling instant. Each optimization yields 
a sequence of optimal control moves, but only 
the first move is applied to the process: At the 
next time step, the computation is repeated over a 
shifted time horizon by taking the most recently 
available state information as the new initial 
condition of the new optimal control problem. 
For this reason, MPC is also called “receding 
horizon control.” In most practical applications, 
MPC is based on a linear discrete-time time- 
invariant model of the controlled system and 
quadratic penalties on tracking errors and actuation 
efforts; in such a formulation, the optimal 
control problem can be recast as a quadratic 
programming (QP) problem, whose linear term 
of the cost function and right-hand side of the 
constraints depend on a vector of parameters that 
may change from one step to another (such as 
the current state and reference signals). To enable 
the implementation of MPC in real industrial 
products, a QP solution method must be embedded 
in the control hardware. The method must 
be fast enough to provide a solution within short 
sampling intervals and require simple hardware, 
limited memory to store the data defining the 
optimization problem and the code implementing 
the algorithm itself, a simple program code, and 
good worst-case estimates of the execution time 
to meet real-time system requirements. 

Several online solution algorithms have been 
studied for embedding quadratic optimization 
in control hardware, such as active-set methods 
(Ricker 1985), interior-point methods (Wang 
and Boyd 2010), and fast gradient projection 
methods (Patrinos and Bemporad 2014). Explicit 
MPC takes a different approach to meet the above 
requirements, where multiparametric quadratic 
programming is proposed to pre-solve the QP 
off-line, therefore converting the MPC law into a 
continuous and piecewise-affine function of the 
parameter vector (Bemporad et al. 2002b). We 
review the main ideas of explicit MPC in the 
next section, referring the reader to Alessio and 
Bemporad (2009) for a more complete survey 
paper on explicit MPC. 


Model Predictive Control Problem 

Consider the following finite-time optimal control 
problem formulation for MPC: 

    $$V^*(x) = \min_z \; \ell_N(x_N) + \sum_{k=0}^{N-1} \ell(x_k, u_k) \qquad (1a)$$
    $$\text{s.t.} \quad x_{k+1} = A x_k + B u_k \qquad (1b)$$
    $$C_x x_k + C_u u_k \le c, \quad k = 0, \ldots, N-1 \qquad (1c)$$
    $$C_N x_N \le c_N \qquad (1d)$$
    $$x_0 = x \qquad (1e)$$

where N is the prediction horizon; $x \in \mathbb{R}^m$ is 
the current state vector of the controlled system; 
$u_k \in \mathbb{R}^{n_u}$ is the vector of manipulated variables 
at prediction time k, k = 0, ..., N - 1; 
$z = [u_0' \; \ldots \; u_{N-1}']' \in \mathbb{R}^n$, $n = n_u N$, is the vector of 
decision variables to be optimized; 

    $$\ell(x, u) = \frac{1}{2} x'Qx + \frac{1}{2} u'Ru \qquad (2a)$$
    $$\ell_N(x) = \frac{1}{2} x'Px \qquad (2b)$$

are the stage cost and terminal cost, respectively; 
Q, P are symmetric and positive semidefinite 
matrices; and R is a symmetric and positive 
definite matrix. 

Let $n_c \in \mathbb{N}$ be the number of constraints 
imposed at prediction time k = 0, ..., N - 1, 
namely, $C_x \in \mathbb{R}^{n_c \times m}$, $C_u \in \mathbb{R}^{n_c \times n_u}$, $c \in \mathbb{R}^{n_c}$, 
and let $n_N$ be the number of terminal constraints, 
namely, $C_N \in \mathbb{R}^{n_N \times m}$, $c_N \in \mathbb{R}^{n_N}$. The total 
number q of linear inequality constraints imposed 
in the MPC problem formulation (1) is 
$q = N n_c + n_N$. 

By eliminating the states 
$x_k = A^k x + \sum_{j=0}^{k-1} A^j B u_{k-1-j}$ from problem (1), the optimal 
control problem (1) can be expressed as the 
convex quadratic program (QP): 

    $$V^*(x) = \min_z \; \frac{1}{2} z'Hz + x'F'z + \frac{1}{2} x'Yx \qquad (3a)$$
    $$\text{s.t.} \quad Gz \le W + Sx \qquad (3b)$$

where $H = H' \in \mathbb{R}^{n \times n}$ is the Hessian matrix; 
$F \in \mathbb{R}^{n \times m}$ defines the linear term of the cost function; 
$Y \in \mathbb{R}^{m \times m}$ has no influence on the optimizer, as 
it only affects the optimal value of (3a); and the 
matrices $G \in \mathbb{R}^{q \times n}$, $S \in \mathbb{R}^{q \times m}$, $W \in \mathbb{R}^q$ define 
in a compact form the constraints imposed in (1). 
Because of the assumptions made on the weight 
matrices Q, R, P, matrix H is positive definite 
and matrix $\begin{bmatrix} H & F \\ F' & Y \end{bmatrix}$ is positive semidefinite. 

The MPC control law is defined by setting 

    $$u(x) = [I \; 0 \; \ldots \; 0]\, z(x) \qquad (4)$$

where z(x) is the optimizer of the QP problem (3) 
for the current value of x and I is the identity 
matrix of dimension $n_u \times n_u$. 
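
The elimination of the states can be sketched for a scalar system 
without constraints. The example assumes the stage cost 
(1/2) q x^2 + (1/2) r u^2 and terminal cost (1/2) p x^2, and all 
numerical values and function names are illustrative. It builds H, 
F, and Y and checks that the condensed cost equals the cost 
obtained by simulating the dynamics. 

```python
def condense(a, b, q, r, p, N):
    """Return (H, F, Y) of the condensed cost for x+ = a*x + b*u."""
    # x_k = phi[k-1] * x + sum_j gamma[k-1][j] * u_j for k = 1..N
    phi = [a ** k for k in range(1, N + 1)]
    gamma = [[a ** (k - 1 - j) * b if j < k else 0.0 for j in range(N)]
             for k in range(1, N + 1)]
    w = [q] * (N - 1) + [p]  # weights on x_1, ..., x_N
    H = [[sum(w[k] * gamma[k][i] * gamma[k][j] for k in range(N))
          + (r if i == j else 0.0) for j in range(N)] for i in range(N)]
    F = [sum(w[k] * gamma[k][i] * phi[k] for k in range(N)) for i in range(N)]
    Y = q + sum(w[k] * phi[k] ** 2 for k in range(N))
    return H, F, Y


def direct_cost(a, b, q, r, p, x, z):
    """Cost evaluated by simulating the dynamics over the horizon."""
    cost = 0.0
    for u in z:
        cost += 0.5 * q * x * x + 0.5 * r * u * u
        x = a * x + b * u
    return cost + 0.5 * p * x * x


def qp_cost(H, F, Y, x, z):
    """Condensed cost (1/2) z'Hz + x F'z + (1/2) Y x^2."""
    n = len(z)
    quad = sum(z[i] * H[i][j] * z[j] for i in range(n) for j in range(n))
    lin = x * sum(F[i] * z[i] for i in range(n))
    return 0.5 * quad + lin + 0.5 * Y * x * x


H, F, Y = condense(a=0.9, b=0.5, q=1.0, r=0.1, p=2.0, N=3)
z, x = [0.3, -0.7, 0.2], 1.5
print(direct_cost(0.9, 0.5, 1.0, 0.1, 2.0, x, z), qp_cost(H, F, Y, x, z))
```

The two printed values agree up to rounding, and Y only shifts the 
cost without affecting the minimizing z. 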

Multiparametric Solution 

Rather than using a numerical QP solver online to 
compute the optimizer z(x) of (3) for each given 
current state vector x, the basic idea of explicit 
MPC is to pre-solve the QP off-line for the entire 
set of states x (or for a convex polyhedral subset 
$X \subseteq \mathbb{R}^m$ of interest) to get the optimizer function 
z, and therefore the MPC control law u, explicitly 
as a function of x. 

The main tool to get such an explicit solution 
is multiparametric quadratic programming 
(mpQP). For mpQP problems of the form (3), 
Bemporad et al. (2002b) proved that the optimizer 
function $z^* : X_f \mapsto \mathbb{R}^n$ is piecewise affine 
and continuous over the set $X_f$ of parameters 
x for which the problem is feasible ($X_f$ is a 
polyhedral set, possibly $X_f = X$) and that 
the value function $V^* : X_f \mapsto \mathbb{R}$ associating 
with every $x \in X_f$ the corresponding optimal 
value of (3) is continuous, convex, and piecewise 
quadratic. 

An immediate corollary is that the explicit 
version of the MPC control law u in (4), being 
the first $n_u$ components of vector z(x), is also 
a continuous and piecewise-affine state-feedback 
law defined over a partition of the set $X_f$ of states 
into M polyhedral cells: 

    $$u(x) = \begin{cases} F_1 x + g_1 & \text{if } H_1 x \le K_1 \\ \quad\vdots & \quad\vdots \\ F_M x + g_M & \text{if } H_M x \le K_M \end{cases} \qquad (5)$$


An example of such a partition is depicted in 
Fig. 1. The explicit representation (5) has mapped 
the MPC law (4) into a lookup table of linear 
gains, meaning that for each given x, the values 
computed by solving the QP (3) online and those 
obtained by evaluating (5) are exactly the same. 
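
Evaluating a law of the form (5) online amounts to a region search 
followed by one affine map. The sketch below uses a sequential 
search and a hypothetical one-dimensional table (a saturated linear 
feedback), which is not derived from any MPC problem in the text. 

```python
def evaluate_pwa(x, regions, tol=1e-9):
    """Return F_i x + g_i for the first region with H_i x <= K_i."""
    for H, K, F, g in regions:
        inside = all(sum(h * xi for h, xi in zip(row, x)) <= k + tol
                     for row, k in zip(H, K))
        if inside:
            return [sum(f * xi for f, xi in zip(row, x)) + gi
                    for row, gi in zip(F, g)]
    raise ValueError("x lies outside the partitioned set of states")


# Hypothetical table: u = -0.5 x, saturated at |u| <= 1 (scalar state).
regions = [
    ([[1.0]], [-2.0], [[0.0]], [1.0]),                # x <= -2:      u = 1
    ([[1.0], [-1.0]], [2.0, 2.0], [[-0.5]], [0.0]),   # -2 <= x <= 2: u = -0.5 x
    ([[-1.0]], [-2.0], [[0.0]], [-1.0]),              # x >= 2:       u = -1
]

for x in (-3.0, 1.0, 5.0):
    print(x, evaluate_pwa([x], regions))
```

The worst-case number of operations is fixed by the number of 
regions and their defining inequalities, so it can be bounded 
before deployment. 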


Multiparametric QP Algorithms 

A few algorithms have been proposed in the literature 
to solve the mpQP problem (3). All of them 
construct the solution by exploiting the Karush-Kuhn-Tucker 
(KKT) conditions for optimality: 

    $$Hz + Fx + G'\lambda = 0 \qquad (6a)$$
    $$\lambda_i (G^i z - W^i - S^i x) = 0, \quad \forall i = 1, \ldots, q \qquad (6b)$$
    $$Gz \le W + Sx \qquad (6c)$$
    $$\lambda \ge 0 \qquad (6d)$$

where $\lambda \in \mathbb{R}^q$ is the vector of Lagrange multipliers. 
For the strictly convex QP (3), conditions (6) 
are necessary and sufficient to characterize 
optimality. 

An mpQP algorithm starts by fixing an arbitrary
starting parameter vector $x_0 \in X$ (e.g.,
the origin $x_0 = 0$), solving the QP (3) to get the
optimal solution $z(x_0)$, and identifying the subset

$$\tilde G z(x) = \tilde S x + \tilde W \qquad \text{(7a)}$$

of all constraints (6c) that are active at $z(x_0)$ and
the remaining inactive constraints:

$$\hat G z(x) \le \hat S x + \hat W \qquad \text{(7b)}$$

Correspondingly, in view of the complementarity
condition (6b), the vector of Lagrange multipliers
is split into two subvectors:

$$\tilde\lambda(x) \ge 0 \qquad \text{(8a)}$$

$$\hat\lambda(x) = 0 \qquad \text{(8b)}$$

We assume for simplicity that the rows of $\tilde G$
are linearly independent. From (6a), we have the
relation

$$z(x) = -H^{-1}\bigl(Fx + \tilde G'\tilde\lambda(x)\bigr) \qquad \text{(9)}$$

that, when substituted into (7a), provides

$$\tilde\lambda(x) = -\tilde M\bigl(\tilde W + (\tilde S + \tilde G H^{-1} F)x\bigr) \qquad \text{(10)}$$

where $\tilde M = (\tilde G H^{-1} \tilde G')^{-1}$ and, by substitution
in (9),

$$z(x) = H^{-1}\bigl(\tilde G'\tilde M \tilde W + \tilde G'\tilde M(\tilde S + \tilde G H^{-1} F)x - Fx\bigr) \qquad \text{(11)}$$

The solution $z(x)$ provided by (11) is the correct
one for all vectors $x$ such that the chosen combination
of active constraints remains optimal.
All such vectors $x$ are identified by imposing constraints
(7b) and (8a) on $z(x)$ and $\tilde\lambda(x)$, respectively,
which leads to the polyhedral
set (“critical region”):

$$CR_0 = \{x \in X : \tilde\lambda(x) \ge 0,\ \hat G z(x) \le \hat W + \hat S x\} \qquad \text{(12)}$$
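As a small illustration of (9)-(12), the sketch below works through a scalar mpQP chosen here for convenience (it is not taken from the text): minimize 0.5 z^2 + x z subject to z <= 1 and -z <= 1, so that H = 1, F = 1, G = [1, -1]', W = [1, 1]', S = [0, 0]'. It assumes a single active constraint and uses exact rational arithmetic; all function and variable names are hypothetical.

```python
from fractions import Fraction as Fr

# Illustrative scalar mpQP data (an assumption, not from the text):
# min 0.5*H*z^2 + x*F*z   s.t.   G[i]*z <= W[i] + S[i]*x
H, F = Fr(1), Fr(1)
G = [Fr(1), Fr(-1)]
W = [Fr(1), Fr(1)]
S = [Fr(0), Fr(0)]

def affine_solution(active):
    """Return (lam0, lam1, z0, z1) with lambda(x) = lam0 + lam1*x and
    z(x) = z0 + z1*x for one active constraint index, following (9)-(11)."""
    Ga, Wa, Sa = G[active], W[active], S[active]
    M = 1 / (Ga * Ga / H)              # M = (G H^-1 G')^-1 in the scalar case
    lam0 = -M * Wa                     # constant term of (10)
    lam1 = -M * (Sa + Ga * F / H)      # slope of (10)
    z0 = -(Ga * lam0) / H              # (9) with lambda(x) substituted
    z1 = -(F + Ga * lam1) / H
    return lam0, lam1, z0, z1

def in_critical_region(x, active):
    """Check the two conditions in (12) at a given parameter value x."""
    lam0, lam1, z0, z1 = affine_solution(active)
    lam = lam0 + lam1 * x
    z = z0 + z1 * x
    inactive_ok = all(G[i] * z <= W[i] + S[i] * x
                      for i in range(len(G)) if i != active)
    return lam >= 0 and inactive_ok

print(affine_solution(0))              # lambda(x) = -1 - x, z(x) = 1
print(in_critical_region(Fr(-2), 0))   # True: x = -2 satisfies x <= -1
print(in_critical_region(Fr(0), 0))    # False: the multiplier would be negative
```

For this example the critical region associated with the first constraint is x <= -1, where the unconstrained minimizer z = -x would violate z <= 1.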

Different mpQP solvers were proposed to
cover the rest $X \setminus CR_0$ of the parameter set
with other critical regions corresponding to
new combinations of active constraints. The
most efficient methods exploit the so-called
“facet-to-facet” property of the multiparametric
solution (Spjøtvold et al. 2006) to identify
neighboring regions as in Tøndel et al. (2003a)
and Baotić (2002). Alternative methods were
proposed in Jones and Morari (2006), based
on looking at (6) as a multiparametric linear
complementarity problem, and in Patrinos and
Sarimveis (2010), which provides algorithms for
determining all neighboring regions even when
the facet-to-facet property does not hold.


All methods handle the case of degeneracy,
which may happen for some combinations of
active constraints that are linearly dependent, that
is, the associated matrix $\tilde G$ of active constraints
does not have full row rank
(in this case, $\tilde\lambda(x)$ may not be uniquely defined).


Extensions 

The explicit approach described earlier can be 
extended to the following MPC setting: 

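$$
\begin{aligned}
\min_z\ & \sum_{k=0}^{N-1} \tfrac{1}{2}(y_k - \mathbf{r}_k)' Q_y (y_k - \mathbf{r}_k) + \tfrac{1}{2}\Delta u_k' R_\Delta \Delta u_k \\
& \quad + (u_k - \mathbf{u}^r_k)' R (u_k - \mathbf{u}^r_k) + \rho_\epsilon \epsilon^2 && \text{(13a)}\\
\text{s.t.}\ & x_{k+1} = A x_k + B u_k + B_v \mathbf{v}_k && \text{(13b)}\\
& y_k = C x_k + D_u u_k + D_v \mathbf{v}_k && \text{(13c)}\\
& u_k = u_{k-1} + \Delta u_k, \quad k = 0,\dots,N-1 && \text{(13d)}\\
& \Delta u_k = 0, \quad k = N_u,\dots,N-1 && \text{(13e)}\\
& \mathbf{u}^{\min}_k \le u_k \le \mathbf{u}^{\max}_k, \quad k = 0,\dots,N_u-1 && \text{(13f)}\\
& \Delta\mathbf{u}^{\min}_k \le \Delta u_k \le \Delta\mathbf{u}^{\max}_k, \quad k = 0,\dots,N_u-1 && \text{(13g)}\\
& \mathbf{y}^{\min}_k - \epsilon V_{\min} \le y_k \le \mathbf{y}^{\max}_k + \epsilon V_{\max}, \quad k = 0,\dots,N_c-1 && \text{(13h)}
\end{aligned}
$$

where $R_\Delta$ is a symmetric and positive definite
matrix; matrices $Q_y$ and $R$ are symmetric and
positive semidefinite; $\mathbf{v}_k$ is a vector of measured
disturbances; $y_k$ is the output vector; $\mathbf{r}_k$ its corresponding
reference to be tracked; $\Delta u_k$ is the vector
of input increments; $\mathbf{u}^r_k$ is the input reference;
$\mathbf{u}^{\min}_k$, $\mathbf{u}^{\max}_k$, $\Delta\mathbf{u}^{\min}_k$, $\Delta\mathbf{u}^{\max}_k$, $\mathbf{y}^{\min}_k$, $\mathbf{y}^{\max}_k$ are bounds;
and $N$, $N_u$, $N_c$ are, respectively, the prediction,
control, and constraint horizons. The extra variable
$\epsilon$ is introduced to soften output constraints,
penalized by the (usually large) weight $\rho_\epsilon$ in the
cost function (13a).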

Everything marked in bold-face in (13), together
with the command input $u_{-1}$ applied at
the previous sampling step and the current state
$x$, can be treated as a parameter with respect to
which to solve the mpQP problem and obtain the
explicit form of the MPC controller. For example,
for a tracking problem with no anticipative action
($\mathbf{r}_k \equiv \mathbf{r}_0$, $\forall k = 0,\dots,N-1$), no measured
disturbance, and fixed upper and lower bounds, the
explicit solution is a continuous piecewise affine
function of the parameter vector $[x'\ \mathbf{r}_0'\ u_{-1}']'$. Note that
prediction models and/or weight matrices in (13)
cannot be treated as parameters if the
mpQP formulation (3) is to be maintained.

Linear MPC Based on Convex 
Piecewise-Affine Costs 

A similar setting can be repeated for MPC
problems based on linear prediction models
and convex piecewise-affine costs, such as
1- and ∞-norms. In this case, the MPC
problem is mapped into a multiparametric linear
programming (mpLP) problem, whose solution
is again continuous and piecewise-affine with
respect to the vector of parameters. For details,
see Bemporad et al. (2002a).

Robust MPC 

Explicit solutions to min-max MPC problems 
that provide robustness with respect to additive 
and/or multiplicative unknown-but-bounded 
uncertainty were proposed in Bemporad et al. 
(2003), based on a combination of mpLP and 
dynamic programming. Again the solution is 
piecewise affine with respect to the state vector. 

Hybrid MPC 

An MPC formulation based on 1- or ∞-norms
and hybrid dynamics expressed in mixed-logical
dynamical (MLD) form can be solved explicitly
by treating the optimization problem associated
with MPC as a multiparametric mixed
integer linear programming (mpMILP) problem.
The solution is still piecewise affine but may be
discontinuous, due to the presence of binary variables
(Bemporad et al. 2000). A better approach
based on dynamic programming combined with
mpLP (or mpQP) was proposed in Borrelli et al.
(2005) for hybrid systems in piecewise-affine
(PWA) dynamical form and linear (or quadratic)
costs.


Applicability of Explicit MPC 

Complexity of the Solution 

The complexity of the solution is given by the
number $M$ of regions that form the explicit solution
(5), dictating the amount of memory to
store the parametric solution ($F_i$, $g_i$, $H_i$, $K_i$,
$i = 1,\dots,M$), and the worst-case execution
time required to compute $F_i x + g_i$ once the
problem of identifying the index $i$ of the region
$\{x : H_i x \le K_i\}$ containing the current state $x$
is solved (which usually takes most of the time).
The latter is called the “point location problem,”
and a few methods have been proposed to solve
the problem more efficiently than searching linearly
through the list of regions (see, e.g., the
tree-based approach of Tøndel et al. 2003b).
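The baseline that the faster methods improve on, a linear search through the regions followed by evaluation of the affine gain, can be sketched as follows. This is a minimal illustration in plain Python; the three-region partition is a made-up saturated gain, and the function names and data layout are assumptions of this sketch.

```python
def locate_region(x, regions, tol=1e-9):
    """Return the index i of the first region {x : H_i x <= K_i} containing x, or None."""
    for i, (Hi, Ki, _, _) in enumerate(regions):
        if all(sum(h * xj for h, xj in zip(row, x)) <= k + tol
               for row, k in zip(Hi, Ki)):
            return i
    return None

def explicit_mpc(x, regions):
    """Evaluate u = F_i x + g_i for the region that contains x."""
    i = locate_region(x, regions)
    if i is None:
        raise ValueError("x lies outside the stored partition")
    _, _, Fi, gi = regions[i]
    return [sum(f * xj for f, xj in zip(row, x)) + g for row, g in zip(Fi, gi)]

# Hypothetical 1-D partition of [-2, 2], each entry is (H_i, K_i, F_i, g_i)
regions = [
    ([[1.0], [-1.0]], [-1.0, 2.0], [[0.0]], [1.0]),    # -2 <= x <= -1 : u = 1
    ([[1.0], [-1.0]], [1.0, 1.0], [[-1.0]], [0.0]),    # -1 <= x <= 1  : u = -x
    ([[1.0], [-1.0]], [2.0, -1.0], [[0.0]], [-1.0]),   #  1 <= x <= 2  : u = -1
]

print(explicit_mpc([0.5], regions))     # [-0.5]
print(explicit_mpc([-1.5], regions))    # [1.0]
```

The search cost grows linearly with the number of regions and with the number of inequalities per region, which is why point location usually dominates the execution time.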

An upper bound to $M$ is $2^q$, which is the
number of all possible combinations of active
constraints. In practice, $M$ is much smaller than
$2^q$, as most combinations are never active at
optimality for any of the vectors $x$ (e.g., lower
and upper limits on an actuation signal cannot
be active at the same time, unless they coincide).
Moreover, regions in which the first $n_u$
components of the multiparametric solution $z(x)$
are the same can be joined together, provided that
their union is a convex set (an optimal merging
algorithm was proposed by Geyer et al. (2008) to
get a minimal number $M$ of partitions). Nonetheless,
the complexity of the explicit MPC law
typically grows exponentially with the number
$q$ of constraints. The number $m$ of parameters
is less critical and mainly affects the number of
elements to be stored in memory (i.e., the number
of columns of matrices $F_i$, $H_i$). The number $n$
of free variables also affects the number $M$ of
regions, mainly because they are usually upper
and lower bounded.


Computer-Aided Tools 

The Model Predictive Control Toolbox (Bemporad
et al. 2014) offers functions for designing explicit
MPC controllers in MATLAB since 2014.
Other tools exist such as the Hybrid Toolbox
(Bemporad 2003) and the Multi-Parametric Toolbox
(Kvasnica et al. 2006).

Summary and Future Directions

Explicit MPC is a powerful tool to convert an
MPC design into an equivalent control law that
can be implemented as a lookup table of linear
gains. Whether the explicit form is preferable
to solving the QP problem online depends on
available CPU time, data memory, program
memory, and other practical considerations.
Although suboptimal methods have been proposed
to reduce the complexity of the control law,
the explicit MPC approach remains convenient
mainly for relatively small problems (such as one or
two command inputs, short control and constraint
horizons, up to ten states). For larger problems,
and/or problems that are linear time varying,
online QP solution methods tailored to embedded
MPC may be preferable.

Cross-References 

► Model-Predictive Control in Practice 

► Nominal Model-Predictive Control 

► Optimization Algorithms for Model Predictive 
Control 

Recommended Reading 

For getting started in explicit MPC, we
recommend reading the paper by Bemporad et al.
(2002b) and the survey paper Alessio and
Bemporad (2009). Hands-on experience using
one of the MATLAB tools listed above is also
useful for fully appreciating the potentials and
limitations of explicit MPC. For understanding
how to program a good multiparametric QP
solver, the reader is recommended to take the
approach of Tøndel et al. (2003a) and Spjøtvold
et al. (2006) or, alternatively, of Patrinos and
Sarimveis (2010) or Jones and Morari (2006).

Synonyms

EKF 


Abstract 

The extended Kalman filter (EKF) is the most
popular estimation algorithm in practical applications.
It is based on a linear approximation to
the Kalman filter theory. There are thousands of
variations of the basic EKF design, which are
intended to mitigate the effects of nonlinearities,
non-Gaussian errors, ill-conditioning of the
covariance matrix, and uncertainty in the parameters
of the problem.

Keywords 

Estimation; Nonlinear filters 

The extended Kalman filter (EKF) is by far 
the most popular nonlinear filter in practical 
engineering applications. It uses a linear 
approximation to the nonlinear dynamics and 
measurements and exploits the Kalman filter 
theory, which is optimal for linear and Gaussian 
problems; Gelb (1974) is the most accessible 
but thorough book on the EKF. The real-time 
computational complexity of the EKF is rather 
modest; for example, one can run an EKF
with high-dimensional state vectors (d = several
hundreds) in real time on a single microprocessor 
chip. The computational complexity of the EKF 
scales as the cube of the dimension of the state 
vector (d) being estimated. The EKF often gives 
good estimation accuracy for practical nonlinear 
problems, although the EKF accuracy can be 
very poor for difficult nonlinear non-Gaussian 
problems. There are many different variations 
of EKF algorithms, most of which are intended 
to improve estimation accuracy. In particular, 
the following types of EKFs are common in 
engineering practice: (1) second-order Taylor 
series expansion of the nonlinear functions, (2) 
iterated measurement updates that recompute the 
point at which the first order Taylor series is 
evaluated for a given measurement, (3) second- 
order iterated (i.e., combination of items 1 
and 2), (4) special coordinate systems (e.g., 
Cartesian, polar or spherical, modified polar 
or spherical, principal axes of the covariance 
matrix ellipse, hybrid coordinates, quaternions 
rather than Euler angles, etc.), (5) preferred order 
of processing sequential scalar measurement 
updates, (6) decoupled or partially decoupled or 
quasi-decoupled covariance matrices, and many 
more variations. In fact, there is no such thing 
as “the” EKF, but rather there are thousands of 
different versions of the EKF. There are also 
many different versions of the Kalman filter 
itself, and all of these can be used to design EKFs 
as well. For example, there are many different 
equations to update the Kalman filter error 
covariance matrices with the intent of mitigating 
ill-conditioning and improving robustness, 
including (1) square-root factorization of the 
covariance matrix, (2) information matrix update, 
(3) square-root information update, (4) Joseph’s 
robust version of the covariance matrix update, 
(5) at least three distinct algebraic versions of the 
covariance matrix update, as well as hybrids of 
the above. 
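To make the basic mechanics concrete, the following is a minimal sketch of one first-order EKF cycle for a scalar state, using the Joseph form of the covariance update mentioned above. The model (a nearly constant state observed through y = x^2), the noise variances, and all names are assumptions chosen for illustration; it is one of the many possible variants, not a reference design.

```python
def ekf_step(x_est, P, y, f, dfdx, h, dhdx, Q, R):
    """One predict/update cycle of a scalar first-order EKF."""
    # Predict: propagate the estimate through f, the variance through its linearization
    x_pred = f(x_est)
    A = dfdx(x_est)
    P_pred = A * P * A + Q
    # Update: linearize the measurement function about the predicted state
    C = dhdx(x_pred)
    S = C * P_pred * C + R                 # innovation variance
    K = P_pred * C / S                     # Kalman gain
    x_new = x_pred + K * (y - h(x_pred))
    # Joseph-form covariance update, a sum of two nonnegative terms
    P_new = (1 - K * C) * P_pred * (1 - K * C) + K * R * K
    return x_new, P_new

# Hypothetical example: constant state, quadratic measurement
f = lambda x: x
dfdx = lambda x: 1.0
h = lambda x: x * x
dhdx = lambda x: 2.0 * x

x_true = 2.0
x_est, P = 1.5, 1.0
for _ in range(20):
    y = h(x_true)                          # noise-free measurement, for reproducibility
    x_est, P = ekf_step(x_est, P, y, f, dfdx, h, dhdx, Q=1e-4, R=0.01)
print(x_est, P)                            # the estimate approaches the true value 2.0
```

Note that the variance P returned here is computed from the linearization only, so it says nothing by itself about the actual estimation error.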

Many of the good features of the Kalman filter
are also enjoyed by the EKF, but unfortunately
not all. First, we have a very good theory
of stability for the Kalman filter, but there is
no theory that guarantees that an EKF will be
stable in practical applications. The only method
to check whether the EKF is stable is to run
Monte Carlo simulations that cover the relevant
regions in state space with the relevant measurement
parameters (e.g., data rate and measurement
accuracy). Second, the Kalman filter computes
the theoretical error covariance matrix, but there
is no guarantee that the error covariance matrix
computed by the EKF approximates the actual
filter errors; rather, the EKF covariance matrix
could be optimistic by orders of magnitude
in real applications. Third, the numerical values
of the process noise covariance matrix can
be computed theoretically for the Kalman filter,
but there is no guarantee that these will work
well for the EKF; rather, engineers typically
tune the process noise covariance matrix using
Monte Carlo simulations or else use a heuristic
adaptive process (e.g., IMM). All of these
shortcomings of the EKF compared with the
Kalman filter theory are due to a myriad of
practical issues, including (1) nonlinearities in
the dynamics or measurements, (2) non-Gaussian
measurement errors, (3) unmodeled measurement
error sources (e.g., residual sensor bias), (4) unmodeled
errors in the dynamics, (5) data association
errors, (6) unresolved measurement data, (7)
ill-conditioning of the covariance matrix, etc. The
actual estimation accuracy of an EKF can only
be gauged by Monte Carlo simulations over the
relevant parameter space.

The actual performance of an EKF can depend 
crucially on the specific coordinate system that 
is used to represent the state vector. This is 
extremely well known in practical engineering 
applications (e.g., see Mehra 1971; Stallard 
1991; Miller 1982; Markley 2007; Daum 1983; 
Schuster 1993). Intuitively, this is because the
dynamics and measurement equations can be
exactly linear in one coordinate system but not
another. This is easy to see: start with dynamics
and measurements that are exactly linear
in Cartesian coordinates and transform to polar
coordinates, and we get highly nonlinear
equations. Likewise, we can have approximately
linear dynamics and measurements in a specific
coordinate system but highly nonlinear equations
in another coordinate system. But in theory, the
optimal estimation accuracy does not depend on
the coordinate system. Moreover, in math and
physics, coordinate-free methods are preferred,
owing to their greater generality and simplicity 
and power. The physics does not depend on the 
specific coordinate system; this is essentially a 
definition of what “physics” means, and it has 
resulted in great progress in physics over the 
last few hundred years (e.g., general relativity, 
gauge invariance in quantum field theory, Lorentz 
invariance in special relativity, as well as a host 
of conservation laws in classical mechanics that 
are explained by Noether’s theorem which relates 
invariance to conserved quantities). Similarly 
in math, coordinate-free methods have been the 
royal road to progress over the last 100 years 
but not so for practical engineering of EKFs, 
because EKFs are approximations rather than 
being exact, and the accuracy of the EKF 
approximation depends crucially on the specific 
coordinate system used. Moreover, the effect 
of ill-conditioning of the covariance matrices 
in EKFs depends crucially on the specific 
coordinate system used in the computer; for 
example, if we could compute the EKF in 
principal coordinates, then the covariance 
matrices would be diagonal, and there would 
be no effect of ill-conditioning, despite enormous 
condition numbers of the covariance matrices. 
Surprisingly, these two simple points about 
coordinate systems are still not well understood 
by many researchers in nonlinear filtering. 

Cross-References 

► Estimation, Survey on 

► Kalman Filters 

► Nonlinear Filters 

► Particle Filters

Abstract

Extremum seeking (ES) is a method for real-time
non-model-based optimization. Though ES was
invented in 1922, the “turn of the twenty-first century”
has been its golden age, both in terms of the
development of theory and in terms of its adoption
in industry and in fields outside of control
engineering. This entry overviews basic gradient-
and Newton-based versions of extremum seeking
with periodic and stochastic perturbation signals.

Keywords 

Gradient climbing; Newton’s method 


The Basic Idea of Extremum Seeking 

Many versions of extremum seeking exist, with
various approaches to their stability study (Krstic
and Wang 2000; Liu and Krstic 2012; Tan et al.
2006). The most common version employs perturbation
signals for the purpose of estimating the
gradient of the unknown map that is being optimized.
To understand the basic idea of extremum
seeking, it is best to first consider the case of a
static single-input map of the quadratic form, as
shown in Fig. 1.

Extremum Seeking Control, Fig. 1 The simplest
perturbation-based extremum seeking scheme for a
quadratic single-input map $f(\theta) = f^* + \frac{f''}{2}(\theta - \theta^*)^2$,
where $f^*$, $f''$, and $\theta^*$ are all unknown. The user has to only
know the sign of $f''$, namely, whether the quadratic map
has a maximum or a minimum, and has to choose the
adaptation gain $k$ such that $\operatorname{sgn} k = -\operatorname{sgn} f''$. The user
has to also choose the frequency $\omega$ as relatively large
compared to $a$, $k$, and $f''$

Three different thetas appear in Fig. 1: $\theta^*$ is
the unknown optimizer of the map, $\hat\theta(t)$ is the
real-time estimate of $\theta^*$, and $\theta(t)$ is the actual
input into the map. The actual input $\theta(t)$ is based
on the estimate $\hat\theta(t)$ but is perturbed by the
signal $a\sin(\omega t)$ for the purpose of estimating the
unknown gradient $f''\cdot(\theta - \theta^*)$ of the map $f(\theta)$.
The sinusoid is only one choice for a perturbation
signal; many other perturbations, from square
waves to stochastic noise, can be used in lieu of
sinusoids, provided they are of zero mean. The
estimate $\hat\theta(t)$ is generated with the integrator $k/s$,
with the adaptation gain $k$ controlling the speed
of estimation.

The ES algorithm is successful if the error
between the estimate $\hat\theta(t)$ and the unknown $\theta^*$,
namely, the signal

$$\tilde\theta(t) = \hat\theta(t) - \theta^* \qquad \text{(1)}$$

converges towards zero. Based on Fig. 1, the estimate
is governed by the differential equation
$\dot{\hat\theta} = k a \sin(\omega t) f(\theta)$, which means that the estimation
error is governed by

$$\frac{d\tilde\theta}{dt} = k a \sin(\omega t)\left[f^* + \frac{f''}{2}\bigl(\tilde\theta + a\sin(\omega t)\bigr)^2\right] \qquad \text{(2)}$$


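Expanding the right-hand side, one obtains

$$
\begin{aligned}
\frac{d\tilde\theta(t)}{dt} ={}& k a f^* \underbrace{\sin(\omega t)}_{\text{mean}=0} + k a^3 \frac{f''}{2} \underbrace{\sin^3(\omega t)}_{\text{mean}=0} \\
& + k a \frac{f''}{2} \underbrace{\sin(\omega t)}_{\text{fast, mean}=0}\,\underbrace{\tilde\theta(t)^2}_{\text{slow}} + k a^2 f'' \underbrace{\sin^2(\omega t)}_{\text{fast, mean}=1/2}\,\underbrace{\tilde\theta(t)}_{\text{slow}} \qquad \text{(3)}
\end{aligned}
$$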

A theoretically rigorous time-averaging procedure
allows the above sinusoidal
signals to be replaced by their means, yielding the “average
system”

$$\frac{d\tilde\theta_{\mathrm{ave}}}{dt} = \underbrace{\frac{k f'' a^2}{2}}_{<0}\,\tilde\theta_{\mathrm{ave}} \qquad \text{(4)}$$


which is exponentially stable. The averaging theory
guarantees that there exists sufficiently large
$\omega$ such that, if the initial estimate $\hat\theta(0)$ is sufficiently
close to the unknown $\theta^*$,

$$|\theta(t) - \theta^*| \le |\theta(0) - \theta^*|\,e^{\frac{k f'' a^2}{2}t} + O\!\left(\frac{1}{\omega}\right) + a, \quad \forall t \ge 0. \qquad \text{(5)}$$

For the user, the inequality (5) guarantees that,
if $a$ is chosen small and $\omega$ is chosen large, the
input $\theta(t)$ exponentially converges to a small interval
around the unknown $\theta^*$ and, consequently,
the output $f(\theta(t))$ converges to the vicinity of the
optimal output $f^*$.
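The behavior described by (2)-(5) can be checked numerically. The sketch below is a forward-Euler simulation of the single-input loop for a quadratic map with a minimum; the numerical values of the map, the gains, the step size, and the function name are all assumptions made for this illustration, with the demodulation signal taken as a sin(ωt) so that the error dynamics match (2).

```python
import math

def simulate_es(theta_hat0, theta_star=2.0, f_star=1.0, f_pp=2.0,
                a=0.2, k=-10.0, omega=100.0, t_end=20.0, dt=5e-4):
    """Simulate the perturbation-based ES loop and return the final estimate.

    The map f is quadratic with a minimum (f_pp > 0), so k is chosen negative.
    """
    def f(theta):
        # Unknown to the algorithm, which only sees the value f(theta)
        return f_star + 0.5 * f_pp * (theta - theta_star) ** 2

    theta_hat = theta_hat0
    for n in range(int(round(t_end / dt))):
        s = math.sin(omega * n * dt)
        theta = theta_hat + a * s                  # perturbed input to the map
        theta_hat += dt * k * a * s * f(theta)     # integrator k/s on demodulated output
    return theta_hat

print(simulate_es(3.0))    # ends close to theta_star = 2.0
print(simulate_es(1.0))    # likewise when started on the other side
```

With these values the averaged dynamics in (4) have the rate k f'' a^2 / 2 = -0.4, so the initial error has decayed by roughly e^{-8} at t = 20 and what remains is the small ripple of order 1/ω.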



Extremum Seeking Control, Fig. 2 Extremum seeking
algorithm for a multivariable map $y = Q(\theta)$, where $\theta$
is the input vector $\theta = [\theta_1, \theta_2, \dots, \theta_n]^T$. The algorithm
employs the additive perturbation vector signal $S(t)$ given
in (6) and the multiplicative demodulation vector signal
$M(t)$ given in (7)


ES for Multivariable Static Maps 

For static maps, ES extends in a straightforward 
manner from the single-input case shown in Fig. 1 
to the multi-input case shown in Fig. 2. 

The algorithm measures the scalar signal
$y(t) = Q(\theta(t))$, where $Q(\cdot)$ is an unknown map
whose input is the vector $\theta = [\theta_1, \theta_2, \dots, \theta_n]^T$.
The gradient is estimated with the help of the
signals

$$S(t) = \bigl[a_1\sin(\omega_1 t)\ \ \cdots\ \ a_n\sin(\omega_n t)\bigr]^T \qquad \text{(6)}$$

$$M(t) = \left[\frac{2}{a_1}\sin(\omega_1 t)\ \ \cdots\ \ \frac{2}{a_n}\sin(\omega_n t)\right]^T \qquad \text{(7)}$$

with nonzero perturbation amplitudes $a_i$ and with
a gain matrix $K$ that is diagonal. To guarantee
convergence, the user should choose $\omega_i \ne \omega_j$.
This is a key condition that differentiates the
multi-input case from the single-input case. In
addition, for simplicity in the convergence analysis,
the user should choose $\omega_i/\omega_j$ as rational and
$\omega_i + \omega_j \ne \omega_k$ for distinct $i$, $j$, and $k$.

If the unknown map is quadratic, namely,
$Q(\theta) = Q^* + \frac{1}{2}(\theta - \theta^*)^T H(\theta - \theta^*)$, the averaged
system is

$$\dot{\tilde\theta}_{\mathrm{ave}} = K H \tilde\theta_{\mathrm{ave}}, \qquad H = \text{Hessian}. \qquad \text{(8)}$$

Extremum Seeking Control, Fig. 3 The ES algorithm
in the presence of dynamics with an equilibrium map $\theta \mapsto y$
that satisfies the same conditions as in the static case. If
the dynamics are stable and the user employs parameters
in the ES algorithm that make the algorithm dynamics
slower than the dynamics of the plant, convergence is
guaranteed (at least locally). The two filters are useful
in the implementation to reduce the adverse effect of the
perturbation signals on asymptotic performance but are
not needed in the stability analysis


If, for example, the map $Q(\theta)$ has a maximum
that is locally quadratic (which implies $H = H^T < 0$)
and if the user chooses the elements
of the diagonal gain matrix $K$ as positive, the
ES algorithm is guaranteed to be locally convergent.
However, the convergence rate depends on
the unknown Hessian $H$. This weakness of the
gradient-based ES algorithm is removed with the
Newton-based ES algorithm.

A stochastic version of the algorithm in Fig. 2
also exists, in which $S(t)$ and $M(t)$ are replaced by

$$S(\eta(t)) = \bigl[a_1\sin(\eta_1(t)), \dots, a_n\sin(\eta_n(t))\bigr]^T, \qquad \text{(9)}$$

$$M(\eta(t)) = \left[\frac{2}{a_1(1 - e^{-q_1^2})}\sin(\eta_1(t)), \dots, \frac{2}{a_n(1 - e^{-q_n^2})}\sin(\eta_n(t))\right]^T, \qquad \text{(10)}$$

where $\eta_i = \frac{q_i\sqrt{\varepsilon_i}}{\varepsilon_i s + 1}[\dot W_i]$ and $\dot W_i$ are independent
unity-intensity white noise processes.


ES for Dynamic Systems

ES extends in a relatively straightforward manner
from static maps to dynamic systems, provided
the dynamics are stable and the algorithm’s
parameters are chosen so that the algorithm’s
dynamics are slower than those of the plant. The
algorithm is shown in Fig. 3.

The technical conditions for convergence in
the presence of dynamics are that the equilibria
$x = l(\theta)$ of the system $\dot x = f(x, \alpha(x, \theta))$,
where $\alpha(x, \theta)$ is the control law of an internal
feedback loop, are locally exponentially stable
uniformly in $\theta$ and that, given the output map
$y = h(x)$, there exists at least one $\theta^* \in \mathbb{R}^n$
such that $\frac{\partial}{\partial\theta}(h \circ l)(\theta^*) = 0$ and
$\frac{\partial^2}{\partial\theta^2}(h \circ l)(\theta^*) = H < 0$, $H = H^T$.

The stability analysis in the presence of
dynamics employs both averaging and singular
perturbations, in a specific order. The design
guidelines for the selection of the algorithm’s parameters
follow the analysis. Though the guidelines
are too lengthy to state here, they ensure
that the plant’s dynamics are on a fast time scale,
the perturbations are on a medium time scale, and
the ES algorithm is on a slow time scale.

Newton ES Algorithm for Static Map 

A Newton version of the ES algorithm, shown in 
Fig. 4, ensures that the convergence rate be user 
assignable, rather than being dependent on the 
unknown Hessian of the map. 

The elements of the demodulating matrix $N(t)$
for generating the estimate of the Hessian are
given by

$$N_{ii}(t) = \frac{16}{a_i^2}\left(\sin^2(\omega_i t) - \frac{1}{2}\right), \qquad N_{ij}(t) = \frac{4}{a_i a_j}\sin(\omega_i t)\sin(\omega_j t) \qquad \text{(11)}$$

Extremum Seeking Control, Fig. 4 A Newton-based
ES algorithm for a static map. The multiplicative excitation
$N(t)$ helps generate the estimate of the Hessian
$\frac{\partial^2 Q(\theta)}{\partial\theta^2}$ as $\hat H(t) = N(t)y(t)$. The Riccati matrix differential
equation $\Gamma(t)$ generates an estimate of the Hessian’s
inverse matrix, avoiding matrix inversions of Hessian
estimates that may be singular during the transient

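For a quadratic map, the averaged system in
error variables $\tilde\theta = \hat\theta - \theta^*$, $\tilde\Gamma = \Gamma - H^{-1}$ is

$$
\begin{aligned}
\frac{d\tilde\theta_{\mathrm{ave}}}{dt} &= -K\tilde\theta_{\mathrm{ave}} - \underbrace{K\tilde\Gamma_{\mathrm{ave}} H \tilde\theta_{\mathrm{ave}}}_{\text{quadratic}},\\
\frac{d\tilde\Gamma_{\mathrm{ave}}}{dt} &= -\omega_r\tilde\Gamma_{\mathrm{ave}} - \underbrace{\omega_r\tilde\Gamma_{\mathrm{ave}} H \tilde\Gamma_{\mathrm{ave}}}_{\text{quadratic}}. \qquad \text{(12)}
\end{aligned}
$$

Since the eigenvalues are determined by $K$
and $\omega_r$ and are therefore independent of the
unknown $H$, the (local) convergence rate is user
assignable.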

Further Reading on Extremum 
Seeking 

Since the publication of the first proof of stability
of extremum seeking (Krstic and Wang
2000), thousands of papers have been published
on this topic, presenting further theoretical developments
and applications of ES. A proof that
expands the validity of extremum seeking from
local to global stability was published in Tan et al.
(2006). The book Liu and Krstic (2012) presents
stochastic versions of the algorithms in this entry,
where the sinusoids are replaced by filtered white
noise perturbation signals.

Abstract

The fundamental concepts and methods of 
fault detection and diagnosis are reviewed. 
Faults are defined and classified as additive or 
multiplicative. The model-free approach of alarm 
systems is described and critiqued. Residual 
generation, using the mathematical model 
of the plant, is introduced. The propagation 
of additive and multiplicative faults to the 
residuals is discussed, followed by a review 
of the effect of disturbances, noise, and model 
errors. Enhanced residuals (structured and 
directional) are introduced. The main residual 
generation techniques are briefly described, 
including direct consistency relations, parity 
space, and diagnostic observers. Principal 
component analysis and its application to 
fault detection and diagnosis are outlined. The 
article closes with some thoughts about future 
directions. 


Keywords 

Consistency relations; Diagnostic observers;
Fault detection; Fault diagnosis; Parity space;
Principal component analysis; Residual generation


Introduction 

Faults are malfunctions of various elements of
technical systems. Extreme cases of faults, called
failures, are catastrophic breakdowns of the
same. The technical systems (the plant) we are
concerned with range from complex production
systems (chemical plants, oil refineries, power
stations) through major transportation equipment
(airplanes, ships) to consumer machines (automobiles,
home-heating systems, etc.). The faults
may affect various parts of the main technical
system (motors, pumps, storage tanks, pipelines)
or devices interfacing the main technical system
with computers providing for control, monitoring,
and operator information. These latter include
sensors (measuring devices) and actuators
(devices acting on the process, such as valves).

The objective of fault detection is to determine
and signal if there is a fault anywhere in the
system. Fault diagnosis is aimed at providing
more specific information about the fault: fault
isolation is to pinpoint the component(s) (sensors,
actuators, or plant components) where the
fault is located, while fault identification is to
determine (estimate) the size of the fault and, in
some cases, the time of its arrival. With the ubiquitous
presence of the computer, fault detection
and diagnosis (FDD) is, in general, a function of
the computer interfaced to the plant.

The simplest approaches to FDD consist of
comparing individual plant measurements to preset
limits, without utilizing any knowledge of the
plant model (limit checking or alarm systems).
More sophisticated techniques rely on an explicit
mathematical model of the plant. They compare
plant measurements to estimates obtained, from
other measurements, by the model; any discrepancy
may be an indication of faults. Another class
of techniques (generally but incorrectly called
“data driven”), most notably principal component
analysis (PCA), includes the estimation of
an implicit model from empirical plant data, which
is then used in ways similar to the model-based
methods. These approaches will be described in
more detail in the sequel.

Alarm Systems 

Alarm systems rely on the comparison of 
individual plant measurements to their respective 
limits. The limits may be two or one sided 
(upper and lower limit or upper limit only) and 
may have one or two levels (preliminary and 
full alarm). Momentary comparisons may be 
extended to include trend checks. Alarm systems 
are relatively simple but suffer from two major 
shortcomings: 

- They have very limited fault specificity. 
A variable exceeding its limit is not a fault 
but a symptom of faults. A single-component 
fault may cause alarm on many variables 
and a particular alarm may be due to various 
component faults. 

- They have limited fault sensitivity. What is
“normal” for a plant output variable depends
on the value of the plant inputs. Such a relationship,
however, cannot be considered without
a plant model; therefore, the alarm thresholds
need to be set conservatively high.

Because of their simplicity, and in spite of the 
above shortcomings, alarm systems are widely 
used in industrial applications. 
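Limit checking as described above amounts to a few comparisons per variable. The sketch below illustrates one- and two-sided limits with preliminary and full alarm levels; the variable names and limit values are invented for the example.

```python
def alarm_level(value, lower=None, upper_pre=None, upper_full=None):
    """Return 0 (normal), 1 (preliminary alarm), or 2 (full alarm) for one measurement."""
    if upper_full is not None and value > upper_full:
        return 2
    if lower is not None and value < lower:
        return 2
    if upper_pre is not None and value > upper_pre:
        return 1
    return 0

# Hypothetical limits for two plant variables
LIMITS = {
    "tank_temperature": dict(upper_pre=80.0, upper_full=95.0),   # one sided, two levels
    "feed_pressure": dict(lower=1.0, upper_full=6.0),            # two sided, one level
}

def check(measurements):
    """Apply the preset limits to a dictionary of current measurements."""
    return {name: alarm_level(value, **LIMITS[name])
            for name, value in measurements.items()}

print(check({"tank_temperature": 85.0, "feed_pressure": 0.4}))
# {'tank_temperature': 1, 'feed_pressure': 2}
```

The check never looks at the plant inputs, which is exactly the sensitivity limitation noted in the second item above.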


Model-Based FDD Concepts 

Model-based methods utilize an explicit
mathematical model of the plant. Such a model
is usually obtained from empirical plant data by
system identification methods or, exceptionally,
from the “first principles” understanding of the
plant. Model building, though critical to the
success of model-based FDD, is usually not
considered part of the FDD effort. The models
may be linear or nonlinear, static or dynamic,
and continuous or discrete time. In FDD, linear
discrete-time dynamic models are used most
frequently.

The fundamental idea of model-based FDD is
the comparison of measured plant outputs to their
estimates, obtained, via the mathematical model,
from measured or actuated plant inputs (Fig. 1).
Any discrepancy is (at least ideally) an indication
that a fault (or faults) is (are) present in the
system. Mathematically, the difference between
the measured output $y_i(t)$ and its estimate $\hat y_i(t)$
is a (primary) residual (Willsky 1976):

$$e_i(t) = y_i(t) - \hat y_i(t)$$

In general, residuals are quantities that are
zero in the absence of faults and nonzero in their
presence.

Unfortunately, it is not only the faults that can
make the residuals nonzero. Usually, the plant
is subject to disturbances (unmeasured deterministic
inputs) and noise (unmeasured random inputs)
(Fig. 1). In addition, and most importantly,
model-based FDD is subject to model errors (due
either to initial inaccuracies in model building or
to changes in the physical plant). The FDD algorithm
should be designed, as much as possible,
to be insensitive to noise and “robust” in face of
disturbances and model errors.

Fault Detection and Diagnosis, Fig. 1 Analytical re¬

Additive and multiplicative faults. Depending
on the way they appear in the system equations,
faults may be additive or multiplicative. Additive
faults are sensor and actuator biases, leaks in the
plant, etc. Multiplicative faults are changes in the
plant parameters. In the following input-output
relationship, $u(t)$ is the vector of observed (measured
or commanded) plant inputs, $y(t)$ is the
vector of measured plant outputs, $p(t)$ is the
vector of additive faults, and $t$ is the discrete time.
$M(q)$ and $S(q)$ are transfer function matrices in
the shift operator $q$, and $\theta$ is the vector of plant
parameters. Then,

$$y(t) = M(q, \theta)u(t) + S(q, \theta)p(t)$$

The (“primary”) residual vector $e(t)$, in response
to additive faults, is

$$e(t) = y(t) - M(q, \theta)u(t) = S(q, \theta)p(t)$$

If there are multiplicative faults, then $\theta = \theta^\circ + \Delta\theta$,
where $\theta^\circ$ is the nominal parameter vector
and $\Delta\theta$ is its change (the parametric fault); the
residual vector $e(t)$, computed with the nominal
model, is then, to first order in $\Delta\theta$ (Gertler 1998),

$$e(t) = y(t) - M(q, \theta^\circ)u(t) \approx \sum_j \bigl(\partial M(q, \theta)/\partial\theta_j\bigr)u(t)\,\Delta\theta_j$$

Enhanced residuals. To facilitate the isolation
of faults, the primary residuals $e(t)$ are subject
to some enhancement manipulation. The three
widely used enhancement techniques are:

- Structured residuals, where each residual
is selectively sensitive to a subset of faults,
resulting in a fault-specific set of zero/nonzero
residuals upon a particular fault (fault codes)

- Directional residuals, where the residual
vector maintains a fault-specific direction in
response to each particular fault

- Diagonal residuals, where each residual responds
only to a particular fault

Fault Detection and Diagnosis, Fig. 2 Generating
model-based residuals

Residual generators take the input and output observations
from the plant and generate enhanced
residuals by one of the above schemes, utilizing
the mathematical model of the plant (Fig. 2).

Dealing with noise. Noise is practically unavoidable in physical
systems. In FDD, basically two steps may be taken to reduce the effect
of noise:

- Residual filtering. This can be achieved by basing decisions on
moving averages of the residuals, by applying explicit low-pass
filters to the residuals, or by designing the residual generators in
such a way that they have built-in low-pass behavior.

- Statistical testing of the residuals. Structured residuals are
tested individually; each scalar residual is then represented by a
Boolean 1 or 0, depending on the outcome of the test. Directional
residuals are tested as vectors against multivariable distributions.
The test thresholds are determined either theoretically, using
assumptions for the source noise, or empirically, based on
measurements from fault-free operating conditions.
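
The two noise-handling steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure prescribed by this entry: the noise level, window length, fault size, and the 4-sigma empirical threshold are assumed values, and `moving_average` is a hypothetical helper.

```python
import random
import statistics

def moving_average(values, window):
    """Causal moving average; early samples average whatever is available."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

random.seed(0)
SIGMA = 0.1          # assumed standard deviation of the residual noise
WINDOW = 20          # assumed moving-average length
FAULT_START = 300    # sample at which the (assumed) fault appears
FAULT_SIZE = 0.3     # assumed fault contribution to the residual

# Step 2 (empirical threshold): use residuals recorded in fault-free operation.
training = [random.gauss(0.0, SIGMA) for _ in range(2000)]
filtered_training = moving_average(training, WINDOW)[WINDOW:]
threshold = 4.0 * statistics.pstdev(filtered_training)

# Monitored scalar residual: noise only, then noise plus a constant fault effect.
residual = [random.gauss(0.0, SIGMA) + (FAULT_SIZE if t >= FAULT_START else 0.0)
            for t in range(600)]

# Step 1 (filtering), followed by the test: each sample becomes a Boolean 1 or 0.
filtered = moving_average(residual, WINDOW)
alarms = [int(abs(r) > threshold) for r in filtered]
first_alarm = alarms.index(1) if 1 in alarms else None
```

A longer window lowers the threshold that can be used for a given false-alarm rate, at the cost of a longer detection delay.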

Dealing with disturbances. Additive disturbances are unmeasurable
inputs. If the disturbance-to-output transfer function (or equivalent
state-space representation) is known, then it is possible to design
residuals that are completely decoupled from (insensitive to) those
disturbances. However, the FDD algorithm is subject to a certain
degree of “design freedom,” defined by the number of outputs in the
physical system; disturbance decoupling is competing for this freedom
with fault isolation enhancement. If there are too many disturbances,
or if their path to the outputs is unknown, then only approximate
decoupling is possible, making FDD also approximate, usually designed
to optimize some (H-infinity) performance index.

Dealing with model errors. Model errors are also unavoidable in most
practical situations. This is the most serious obstacle in the
application of model-based FDD techniques. In some very special cases,
uncertainty of a particular plant parameter may be handled as a
“multiplicative disturbance,” and residuals designed to be explicitly
decoupled from it. In general, however, only approximate solutions are
possible, reducing the residuals’ sensitivity to modeling errors, at
the expense of also reducing their sensitivity to faults. Design
methods utilizing some optimization techniques, mostly based on
H-infinity or similar performance indices, are available in the
literature (Edelmayer et al. 1994).

Residual Generation Methods 

For linear dynamic systems, provided that an exact (non-approximate)
solution is possible, there are three major techniques to design
residual generators: (i) direct consistency (parity) relations, (ii)
parity space, and (iii) diagnostic observers. We will briefly
introduce the three methods, for discrete-time plant models and
additive faults. Note that though they look formally different, if
designed for the same plant under the same design conditions, the
three methods yield identical residuals (Gertler 1991).

Direct consistency (parity) relations (Gertler 1998). The input-output
model of the plant is utilized directly in the design. The enhanced
residuals are obtained from the primary residuals by a transformation
$W(q)$:

$$r(t) = W(q)\,e(t) = W(q)\,[y(t) - M(q)\,u(t)] = W(q)\,S(q)\,p(t)$$

The desired behavior of the residuals is specified as $r(t) =
Z(q)\,p(t)$, where the specification $Z(q)$ contains the basic
residual properties (structure or directions) plus the residual
dynamics. The resulting design condition is $W(q)S(q) = Z(q)$. If the
$S(q)$ matrix is square, which is usually the case (Gertler 1998),
then this can be solved for $W(q)$ by direct inversion. The residual
generator has to be causal and stable; this can always be achieved by
the appropriate modification of the dynamics in $Z(q)$.
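
As a small worked illustration of the design condition, consider a static case in which the matrices do not depend on $q$. The numerical values are assumptions chosen only to make the inversion easy to follow; the specification $Z = I$ corresponds to diagonal residuals.

```latex
S = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \qquad
Z = I
\;\Longrightarrow\;
W = Z S^{-1} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}, \qquad
r(t) = W e(t) = W S\, p(t) = p(t).
```

Each enhanced residual then responds to exactly one fault. In the dynamic case the same inversion is carried out on transfer function matrices, and the dynamics in $Z(q)$ are adjusted until $W(q)$ is causal and stable.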

Parity space (Chow and Willsky 1984). This method, also known as the
“Chow-Willsky scheme,” relies on the state-space description of the
system:

$$x(t+1) = A\,x(t) + B\,u(t) + E\,p(t)$$
$$y(t) = C\,x(t) + D\,u(t) + F\,p(t)$$

Stacking $n$ consecutive output vectors $y(t)$ (where $n$ is the order
of the model), and chain-substituting the state $x(t)$, yields the
equation

$$Y(t) = J\,x(t-n) + K\,U(t) + L\,P(t)$$

where $Y(t)$, $U(t)$, and $P(t)$ are stacked vectors and $J$, $K$, and
$L$ are hyper-matrices composed of the $A$, $B$, $C$, $D$, $E$, $F$
matrices. Now

$$E^{*}(t) = Y(t) - K\,U(t) = L\,P(t) + J\,x(t-n)$$

would be a stacked vector of primary residuals, were it not for the
presence of the inaccessible initial state $x(t-n)$. To obtain true
residuals, a transformation $r_i(t) = w_i^{T} E^{*}(t)$ is necessary,
so that $w_i^{T} J = 0$. Any vector $w_i$ satisfying this
orthogonality condition is a parity vector; together, such vectors
span the parity space. Any parity vector yields a true residual
$r_i(t)$; they can be chosen so that a set of residuals possesses
structured behavior.
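
The orthogonality condition can be tried out on a very small plant. The sketch below is an illustration under assumptions, not the general Chow-Willsky design: the plant matrices are made-up values, $D = 0$, the fault is a sensor bias, the window covers the three outputs $y(t-2)$, $y(t-1)$, $y(t)$, and the parity vector is obtained from the Cayley-Hamilton theorem, a shortcut that applies to this 2-state, single-output case.

```python
# Illustrative plant (assumed values): x(t+1) = A x(t) + B u(t), y(t) = C x(t) + bias.
A = [[0.5, 0.0],
     [0.0, 0.2]]
B = [1.0, 1.0]
C = [1.0, 1.0]

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def row_mat(r, M):
    return [sum(r[i] * M[i][j] for i in range(len(r))) for j in range(len(M[0]))]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# J stacks C, CA, CA^2: the map from x(t-2) to [y(t-2), y(t-1), y(t)].
CA = row_mat(C, A)
CA2 = row_mat(CA, A)
J = [C, CA, CA2]

# Cayley-Hamilton: A^2 - tr(A) A + det(A) I = 0, so w = [det(A), -tr(A), 1]
# satisfies w^T J = 0 and is therefore a parity vector.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
w = [det, -trace, 1.0]
wJ = row_mat(w, J)

CB = dot(C, B)    # first Markov parameter
CAB = dot(CA, B)  # second Markov parameter

def residual(y, u, t):
    """r(t) = w^T (Y(t) - K U(t)) over the window t-2 .. t."""
    e0 = y[t - 2]
    e1 = y[t - 1] - CB * u[t - 2]
    e2 = y[t] - CAB * u[t - 2] - CB * u[t - 1]
    return w[0] * e0 + w[1] * e1 + w[2] * e2

# Simulate with an initial state unknown to the residual generator
# and a sensor bias that appears at t = 20.
BIAS, FAULT_START, STEPS = 0.5, 20, 40
x = [3.0, -1.0]
u = [1.0 if (t // 5) % 2 == 0 else -1.0 for t in range(STEPS)]
y = []
for t in range(STEPS):
    y.append(dot(C, x) + (BIAS if t >= FAULT_START else 0.0))
    x = [a + b * u[t] for a, b in zip(mat_vec(A, x), B)]

r = {t: residual(y, u, t) for t in range(2, STEPS)}
```

Before the fault the residual is zero up to rounding, regardless of the initial state; once the bias fills the window it settles at `sum(w) * BIAS`.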

Diagnostic observers. Various observer schemes have been extensively
investigated as possible residual generator algorithms. The basic
full-order Luenberger observer (assuming $D = 0$) is

$$\hat{x}(t+1) = A\,\hat{x}(t) + B\,u(t) + K\,e(t)$$

where $K$ is the observer gain matrix and

$$e(t) = y(t) - C\,\hat{x}(t)$$

is the innovation vector. If the observer is stable then, apart from
the start-up transient of the observer, the innovation qualifies as
the primary residual. The gain matrix $K$ is the major design
parameter; it is chosen to place the observer poles, thus achieving
stability and desired dynamic behavior (e.g., noise suppression). The
remaining design freedom can be utilized to influence residual
properties. The latter are further affected by the transformation
$r(t) = H\,e(t)$, where the $H$ matrix is an additional design
parameter. Diagnostic observers can be designed for both structured
and directional residuals (Chen and Patton 1999; White and Speyer
1987). Other observer schemes, most notably the unknown input
observer, have also been proposed (Frank and Wunnenberg 1989). Because
of their complexity, the detailed design procedures of diagnostic
observers go beyond the scope of this entry.
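
A scalar simulation shows how the innovation behaves as a primary residual. This is a minimal sketch with assumed numbers (plant coefficients, gain, fault size, and timing are all illustrative), not a diagnostic observer design in the sense of the cited references.

```python
# Scalar illustration (assumed values):
#   plant:    x(t+1) = a x(t) + b u(t) + p(t),   y(t) = c x(t)
#   observer: x_hat(t+1) = a x_hat(t) + b u(t) + K e(t),   e(t) = y(t) - c x_hat(t)
a, b, c = 0.9, 1.0, 1.0
K = 0.5                       # observer gain; the error pole is a - K*c = 0.4
FAULT_START, FAULT_SIZE, STEPS = 50, 0.3, 100

x, x_hat = 1.0, 0.0           # the observer starts from a wrong state estimate
innovations = []
for t in range(STEPS):
    u = 1.0
    p = FAULT_SIZE if t >= FAULT_START else 0.0   # additive fault on the state equation
    y = c * x
    e = y - c * x_hat         # innovation, used as the primary residual
    innovations.append(e)
    x_hat = a * x_hat + b * u + K * e
    x = a * x + b * u + p
```

The estimation error obeys `err(t+1) = (a - K*c) * err(t) + p(t)`, so the start-up transient decays with the observer pole and a constant fault of size 0.3 leaves a steady innovation of 0.3 / (1 - 0.4) = 0.5.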

Principal Component Analysis 

Principal component analysis is extensively used in the monitoring of
complex plants with hundreds of variables because, by revealing linear
relations among the variables, it significantly reduces the
dimensionality of the plant model (Kresta et al. 1991). The
application of PCA for FDD implies two phases. In the training phase,
an implicit plant model is created from empirical plant data. In the
monitoring phase, this model is used for FDD.

Training data (measured inputs and outputs) are collected from the
plant during fault-free operation. The covariance matrix of the data
is formed and its eigenstructure obtained. Due to linear relations
among the data, some of the eigenvalues will be zero (or near zero, in
the presence of noise). The eigenvectors belonging to the nonzero
eigenvalues form the data space, where the fault-free data exist,
while those belonging to the zero eigenvalues form the residual space.

It is the residual space that is utilized for FDD. The projection of a
measurement vector onto the residual space is the (primary) residual.
A statistical test on its size leads to a detection decision (the
absence or presence of faults). A threshold test is necessary because
noise also causes nonzero residuals. An analysis of the eigenvectors
spanning the residual space shows how the various faults propagate to
the primary residual. This allows for the design of residual
manipulations yielding structured or directional residuals, just as in
the FDD methods based on exact models (Gertler et al. 1999).
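
The training and monitoring phases can be illustrated with two variables, where the eigenstructure of the covariance matrix is available in closed form. The data set, the linear relation between the two variables, and the fault size are assumptions made for the sketch; a real application would use a numerical eigensolver on many variables.

```python
import math

# Training phase: fault-free data (assumed) from two sensors obeying x2 = 2 * x1.
training = [(float(i), 2.0 * i) for i in range(-5, 6)]

n = len(training)
mean = [sum(s[k] for s in training) / n for k in range(2)]
centered = [(s[0] - mean[0], s[1] - mean[1]) for s in training]

# 2x2 sample covariance matrix [[a, b], [b, c]].
a = sum(s[0] * s[0] for s in centered) / (n - 1)
b = sum(s[0] * s[1] for s in centered) / (n - 1)
c = sum(s[1] * s[1] for s in centered) / (n - 1)

# Closed-form eigenvalues of a symmetric 2x2 matrix.
half_trace = (a + c) / 2.0
radius = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
eig_max, eig_min = half_trace + radius, half_trace - radius

# The eigenvector of the (near-)zero eigenvalue spans the residual space.
# The formula (b, eig_min - a) is valid when b != 0.
v = (b, eig_min - a)
norm = math.hypot(v[0], v[1])
residual_dir = (v[0] / norm, v[1] / norm)

def pca_residual(sample):
    """Monitoring phase: project a mean-centered measurement onto the residual space."""
    return ((sample[0] - mean[0]) * residual_dir[0]
            + (sample[1] - mean[1]) * residual_dir[1])

r_ok = pca_residual((1.0, 2.0))      # consistent with the training relation
r_fault = pca_residual((1.0, 2.5))   # sensor 2 carries a +0.5 bias
```

With noisy data `eig_min` would be small rather than zero, and `r_ok` would be compared against a threshold instead of being exactly zero.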

The procedure as described above applies to 
sensor and actuator faults; inclusion of plant 
faults requires extra effort (and experiments). 
Also, PCA is primarily meant for static models. 
Its extension to discrete-time dynamic models is 
straightforward, but it increases the size of the 
model, proportionally to the dynamic order of the 
model. 


Summary and Future Directions 

Fault detection and diagnosis is today a mature field of systems and
control engineering. There is a very significant level of activity, as
measured in published papers and conference contributions, but much of
this (in the opinion of this author) consists of minor refinements of
earlier results. This applies particularly to the long ongoing quest
to create “robust” FDD algorithms, especially in the face of model
errors.

There are still open challenges in a couple of areas, most notably
extensions to various nonlinear or parameter-varying problems. Another
open and active area, of great practical importance, is FDD in
networked control systems. What is really of the greatest interest,
though, is the application of the wealth of available theoretical
results and design methods to real-life problems; there has recently
been some visible progress here, a most welcome development.
Abstract

A closed-loop control system for an engineering process may have
unsatisfactory performance or even instability when faults occur in
actuators, sensors, or other process components. Fault-tolerant
control (FTC) involves the development and design of special
controllers that are capable of tolerating the actuator, sensor, and
process faults while still maintaining desirable and robust
performance and stability properties. FTC designs involve knowledge of
the nature and/or occurrence of faults in the closed-loop system
either implicitly or explicitly using methods of fault detection and
isolation (FDI), fault detection and diagnosis (FDD), or fault
estimation (FE). FTC controllers are reconfigured or restructured
using FDI/FDD information so that the effects of the faults are
reduced or eliminated within each feedback loop in active or passive
approaches or compensated in each control loop using FE methods. A
non-mathematical outline of the essential features of FTC systems is
given with important definitions and a classification of FTC systems
into active and passive approaches, with examples of some well-known
strategies.


Keywords 

Active FTC; Fault accommodation; Fault detection and diagnosis (FDD);
Fault detection and isolation (FDI); Fault estimation (FE);
Fault-tolerant control; Passive FTC; Reconfigurable control


Introduction

The complexity of modern engineering systems has led to strong demands
for enhanced control system reliability, safety, and green operation
in the presence of even minor anomalies. There is a growing need not
only to determine the onset and development of process faults before
they become serious but also to adaptively compensate for their
effects in the closed-loop system or to use hardware redundancy to
replace faulty components by duplicate and fault-free alternatives.
The title “failure tolerant control” was given by Eterno et al. (1985)
working on a reconfigurable flight control study defining the meaning
of control system tolerance to failures or faults. The word “failure”
is used when a fault is so serious that the system function concerned
fails to operate (Isermann 2006). The title failure detection has now
been superseded by fault detection, e.g., in fault detection and
isolation (FDI) or fault detection and diagnosis (FDD) (Chen and
Patton 1999; Gertler 1998; Patton et al. 2000) motivated by studies in
the 1980s on this topic (Patton et al. 1989). Fault-tolerant control
(FTC) began to develop in the early 1990s (Patton 1993) and is now a
standard in the literature (Patton 1997; Blanke et al. 2006; Zhang and
Jiang 2008), based on the aerospace subject of reconfigurable flight
control making use of redundant actuators and sensors (Steinberg 2005;
Edwards et al. 2010).

Definitions Relating to Fault-Tolerant 
Control 

FTC is a strategy in control systems architecture and design to ensure
that a closed-loop system can continue acceptable operation in the
face of bounded actuator, sensor, or process faults. The goal of FTC
design is to ensure that the closed-loop system maintains satisfactory
stability and acceptable performance under the action of one or more
faults. When prescribed stability and closed-loop performance indices
are maintained despite the action of faults, the system is said to be
“fault tolerant,” and the control scheme that ensures the fault
tolerance is the fault-tolerant controller (Blanke et al. 2006; Patton
1997).

Fault modelling is concerned with the representation of the real
physical faults and their effects on the system mathematical model.
Fault modelling is important to establish how a fault should be
detected, isolated, or compensated.


Fault-Tolerant Control, Fig. 1 Closed-loop system with actuator,
process, and sensor faults

The faults illustrated in Fig. 1 act at system locations defined as
follows (Chen and Patton 1999):

An actuator fault ($f_a(t)$) corresponds to variations of the control
input $u(t)$ applied to the controlled system either completely or
partially. The complete failure of an actuator means that it produces
no actuation regardless of the input applied to it, e.g., as a result
of breakage or burnout of wiring. For partial actuator faults, the
actuator becomes less effective and provides the plant with only a
part of the normal actuation signal.

A sensor is an item of equipment that takes a measurement or
observation from the system, e.g., potentiometers, accelerometers,
tachometers, pressure gauges, strain gauges, etc.; a sensor fault
($f_s(t)$) implies that incorrect measurements are taken from the real
system. This fault can also be subdivided into either a complete or
partial sensor fault. When a sensor fails, the measurements no longer
correspond to the required physical parameters. For a partial sensor
fault the measurements give an inaccurate indication of required
physical parameters.

A process fault ($f_p(t)$) directly affects the physical system
parameters and in turn the input/output properties of the system.
Process faults are often termed component faults, arising as
variations from the structure or parameters used during system
modelling, and as such cover a wide class of possible faults, e.g.,
dirty water having a different heat transfer coefficient compared to
when it is clean, changes in the viscosity of a liquid, or components
slowly degrading over time through wear and tear, aging, or
environmental effects.
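
For simulation studies, the actuator and sensor fault definitions above are often turned into simple signal models. The sketch below shows one possible parameterization, assumed here for illustration only: a loss-of-effectiveness factor for the actuator and an additive bias or a stuck reading for the sensor. The function and parameter names are hypothetical.

```python
def apply_actuator_fault(u, effectiveness_loss):
    """Actuation that reaches the plant.

    effectiveness_loss = 0.0 is a healthy actuator, 1.0 is a complete failure
    (no actuation regardless of the command), and values in between model a
    partial fault. The equivalent additive fault is f_a = -effectiveness_loss * u.
    """
    if not 0.0 <= effectiveness_loss <= 1.0:
        raise ValueError("effectiveness_loss must lie in [0, 1]")
    return (1.0 - effectiveness_loss) * u

def apply_sensor_fault(y, bias=0.0, failed=False, stuck_value=0.0):
    """Measurement delivered to the controller.

    A partial fault is modeled as an additive bias f_s = bias. A complete
    failure returns a value unrelated to the true output (here stuck_value).
    """
    return stuck_value if failed else y + bias

healthy = apply_actuator_fault(2.0, 0.0)
partial = apply_actuator_fault(2.0, 0.25)
biased_reading = apply_sensor_fault(3.0, bias=0.5)
```

Process faults do not fit this pattern because they change the plant parameters themselves rather than a signal entering or leaving the plant.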


Architectures and Classification 
of FTC Schemes 

FTC methods are classified according to whether they are “passive” or
“active,” using fixed or reconfigurable control strategies (Eterno et
al. 1985). Various architectures have been proposed for the
implementation of FTC schemes; for example, a structure of
reconfigurable control based on generalized internal model control
(GIMC) has been proposed by Zhou and Ren (2001), with other studies by
Niemann and Stoustrup (2005). Figure 2 shows a suitable architecture
to encompass active and passive FTC methods in which a distinction is
made between “execution” and “supervision” levels. The essential
differences and requirements of passive FTC (PFTC) and active FTC
(AFTC) are outlined in the following paragraphs.

PFTC is based solely on the use of robust control in which potential
faults are considered as if they are uncertain signals acting in the
closed-loop system. This can be related to the concept of reliable
control (Veillette et al. 1992). PFTC requires no online information
from the fault diagnosis (FDI/FDD/FE) function about the occurrence or
presence of faults, and hence it is not by itself an adaptive system
and does not involve controller reconfiguration (Patton 1993, 1997;
Siljak 1980). The PFTC approach can be used if the time window during
which the system remains stabilizable in the presence of a fault is
short; see, for example, the problem of the double inverted pendulum
(Weng et al. 2007), which is unstable during a loop failure.

AFTC has two conceptual steps to provide the system with
fault-tolerant capability (Blanke et al. 2006; Patton 1997; Zhang and
Jiang 2008; Edwards et al. 2009):

• Equip the system with a mechanism to make it able to detect and
isolate (or even estimate) the fault promptly, identify a faulty
component, and select the required remedial action in order to
maintain acceptable operation performance. With no fault, a baseline
controller attenuates disturbances and ensures good stability and
closed-loop tracking performance (Patton 1997), and the diagnostic
(FDI/FDD/FE) block recognizes that the closed-loop system is
fault-free with no control law change required (supervision level).

• Make use of supervision level information and adapt or
reconfigure/restructure the controller parameters so that the required
remedial activity can be achieved (execution level).

Figure 3 gives a classification of PFTC and AFTC methods (Patton
1997).
Fault-Tolerant Control, Fig. 2 Scheme of FTC (Adapted from Blanke et
al. (2006))

Figure 3 shows that AFTC approaches are divided into two main types of
methods: projection-based methods and online automatic controller
redesign methods. The latter involves the calculation of new
controller parameters following control impairment, i.e., using
reconfigurable control. In projection-based methods, a new precomputed
control law is selected according to the required controller structure
(i.e., depending on the type of isolated fault).

AFTC methods use online-fault accommodation based on unanticipated
faults, classified as (Patton 1997):

(a) Based on offline (pre-computed) control laws
(b) Online-fault accommodating
(c) Tolerant to unanticipated faults using FDI/FDD/FE
(d) Dependent upon use of a baseline controller

AFTC Examples 

One example of AFTC is model-based predictive control (MPC), which
uses online computed control redesign. MPC is online-fault
accommodating; it does not use an FDI/FDD unit and is not dependent on
a baseline controller. MPC has a certain degree of fault tolerance
against actuator faults under some conditions even if the faults are
not detected. The representation of actuator faults in MPC is
relatively natural and straightforward since actuator faults such as
jams and slew-rate reductions can be represented by changing the MPC
optimization problem constraints. Other faults can be represented by
modifying the internal model used by MPC (Maciejowski 1998). The fact
that online-fault information is not required means that MPC is an
interesting method for flight control reconfiguration, as demonstrated
by Maciejowski and Jones in the GARTEUR AG 16 project on
“fault-tolerant flight control” (Edwards et al. 2010).

Another interesting AFTC example that makes use of the concept of
model-matching in explicit model following is the so-called
pseudo-inverse method (PIM) (Gao and Antsaklis 1992), which requires
the nominal or reference closed-loop system matrix to compute the new
controller gain after a fault has occurred. The challenges are:

1. Guarantee of stability of the reconfigured closed-loop system
2. Minimization of the time consumed to approach the acceptable
   matching
3. Achieving perfect matching through use of different control
   methodologies

Exact model-matching may be too demanding, and some extensions to this
approach make use of alternative, approximate (norm-based)
model-matching through the computation of the required model-following
gain. To relax the matching condition further, Staroswiecki (2005)
proposed an admissible model-matching approach, which was later
extended by Tornil et al. (2010) using D-region pole assignment. The
PIM approach requires an FDI/FDD/FE mechanism and is online-fault
accommodating only in terms of a priori anticipated faults. This
limits the practical value of this approach.
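
The gain computation at the heart of the PIM can be sketched for a single-input plant. The sketch rests on assumptions that are not spelled out in this entry: the feedback convention $u = Kx$, the commonly quoted least-squares form $K_f = B_f^{+}(A + BK - A_f)$, and made-up plant matrices with a 50% loss of actuator effectiveness. The helper names are hypothetical.

```python
def pim_gain_single_input(A, B, K, A_f, B_f):
    """K_f = pinv(B_f) (A + B K - A_f) for a single-input plant.

    B and B_f are input columns stored as flat lists, so the pseudo-inverse
    of B_f reduces to B_f^T / (B_f^T B_f).
    """
    n = len(A)
    target = [[A[i][j] + B[i] * K[j] - A_f[i][j] for j in range(n)]
              for i in range(n)]
    bb = sum(v * v for v in B_f)
    if bb == 0.0:
        raise ValueError("B_f is zero: the actuator has failed completely")
    return [sum(B_f[i] * target[i][j] for i in range(n)) / bb for j in range(n)]

def closed_loop(A, B, K):
    """Closed-loop matrix A + B K for a single-input plant."""
    n = len(A)
    return [[A[i][j] + B[i] * K[j] for j in range(n)] for i in range(n)]

# Nominal double integrator with closed-loop poles at -1 and -2 (assumed example).
A = [[0.0, 1.0], [0.0, 0.0]]
B = [0.0, 1.0]
K = [-2.0, -3.0]

# Fault: the actuator loses half of its effectiveness; A is unchanged.
B_f = [0.0, 0.5]
K_f = pim_gain_single_input(A, B, K, A, B_f)
```

In this example the faulty input column lies in the same direction as the nominal one, so the nominal closed-loop matrix is recovered exactly. In general the match is only approximate, and the stability of the reconfigured loop has to be checked separately, which is the first challenge listed above.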

As a third example, feedback linearization can be used to compensate
for nonlinear dynamic effects while also implementing control law
reconfiguration or restructure. In flight control an aileron actuator
fault will cause a strong coupling between the lateral and
longitudinal aircraft dynamics. Feedback linearization is an
established technique in flight control (Ochi and Kanai 1991). The
faults are identified indirectly by estimating aircraft flight
parameters online, e.g., using a recursive least-squares algorithm to
update the FTC.


Hence, an AFTC system provides fault tolerance either by selecting a
precomputed control law (projection-based) (Boskovic and Mehra 1999;
Maybeck and Stevens 1991; Rauch 1995) or by synthesizing a new control
strategy online (online controller redesign) (Ahmed-Zaid et al. 1991;
Richter et al. 2007; Efimov et al. 2012; Zou and Kumar 2011).

Another widely studied AFTC method is the 
estimation and compensation approach , where 
a fault compensation input is superimposed 
on the nominal control input (Noura et al. 
2000; Boskovic and Mehra 2002; Sami and 
Patton 2013; Zhang et al. 2004). There is a 
growing interest in robust FE methods based on 
sliding mode estimation (Edwards et al. 2000) 
and augmented observer methods (Gao and 
Ding 2007; Jiang et al. 2006; Sami and Patton 
2013). 

An important development of this approach is the so-called fault
hiding strategy, which is centered on achieving FTC loop goals such
that the nominal control loop remains unchanged through the use of
virtual actuators or virtual sensors (Blanke et al. 2006; Sami and
Patton 2013). Fault hiding makes use of the difference between the
nominal and faulty system state to make changes in the system dynamics
such that the required control objectives are continuously achieved
even if a fault occurs. In the sensor fault case, the effect of the
fault is hidden from the input of the controller. However, actuator
faults are compensated by the effect of the fault (Lunze and Steffen
2006; Richter et al. 2007; Ponsart et al. 2010; Sami and Patton 2013),
in which it is assumed that the FDI/FDD or FE scheme is available. The
virtual actuator/sensor FTC can be of good practical value if
FDI/FDD/FE robustness can be demonstrated.

Traditional adaptive control methods that automatically adapt
controller parameters to system changes can be used in a special
application of AFTC, potentially removing the need for FDI/FDD and
controller redesign steps (Tang et al. 2004; Zou and Kumar 2011) but
possibly using the FE function. Adaptive control is suitable for FTC
on plants that have slowly varying parameters and can tolerate
actuator and process faults. Sensor faults are not tolerated well, as
the controller parameters must adapt according to the faulty
measurements, causing incorrect closed-loop system operation; the
FDI/FDD/FE unit is required for such cases.


Summary and Future Directions 

FTC is now a significant subject in control systems science with many
quite significant application studies, particularly since the new
millennium. Most of the applications are within the flight control
field with studies such as the GARTEUR AG16 project “Fault-Tolerant
Flight Control” (Edwards et al. 2010). As a very complex
engineering-led and mathematically focused subject, it is important
that FTC remains application-driven to keep the theoretical concepts
moving in the right directions and satisfying end-user needs. The
original requirement for FTC in safety-critical systems has now
widened to encompass a good range of fault-tolerance requirements
involving energy and economy, e.g., for greener aircraft and for FTC
in renewable energy.

Faults and modelling uncertainties as well as endogenous disturbances
have potentially competing effects on the control system performance
and stability. This is the robustness problem in FTC, which is beyond
the scope of this article. The FTC system provides a degree of
tolerance to closed-loop system faults, and it is also subject to the
effects of modelling uncertainty arising from the reality that all
engineering systems are nonlinear and can even have complex dynamics.
For example, consider the PFTC approach relying on robustness
principles as a more complex extension to robust control. PFTC design
requires the closed-loop system to be insensitive to faults as well as
modelling uncertainties. This requires the use of multi-objective
optimization methods, e.g., using linear matrix inequalities (LMI), as
well as methods of accounting for dynamical system parametric
variations, e.g., linear parameter varying (LPV) system structures,
Takagi-Sugeno, or sliding mode methods.
Abstract

Effective methods exist for the control of linear systems but this is
less true for nonlinear systems. Therefore, it is very useful if a
nonlinear system can be transformed into or approximated by a linear
system. Linearity is not invariant under nonlinear changes of state
coordinates and nonlinear state feedback. Therefore, it may be
possible to convert a nonlinear system into a linear one via these
transformations. This is called feedback linearization. This entry
surveys feedback linearization and related topics.

Keywords 

Distribution; Frobenius theorem; Involutive distribution; Lie
derivative

Introduction 

A controlled linear dynamics is of the form

$$\dot{x} = Fx + Gu \qquad (1)$$

where the state $x \in \mathbb{R}^{n}$ and the control $u \in
\mathbb{R}^{m}$. A controlled nonlinear dynamics is of the form

$$\dot{x} = f(x, u) \qquad (2)$$

where $x$, $u$ have the same dimensions but may be local coordinates
on some manifolds $X$, $U$. Frequently, the dynamics is affine in the
control, i.e.,

$$\dot{x} = f(x) + g(x)u \qquad (3)$$

where $f(x) \in \mathbb{R}^{n \times 1}$ is a vector field and $g(x) =
[g^{1}(x), \ldots, g^{m}(x)] \in \mathbb{R}^{n \times m}$ is a matrix
field.

Linear dynamics are much easier to analyze and control than nonlinear
dynamics. For example, to globally stabilize the linear dynamics (1),
all we need to do is to find a linear feedback law $u = Kx$ such that
all the eigenvalues of $F + GK$ are in the open left half plane.
Finding a feedback law $u = \kappa(x)$ to globally stabilize the
nonlinear dynamics is very difficult and frequently impossible.
Therefore, finding techniques to linearize nonlinear dynamics has been
a goal for several centuries.

The simplest example of a linearization technique is to approximate a
nonlinear dynamics around a critical point by its first-order terms.
Suppose $x^{\circ}, u^{\circ}$ is an operating point for the nonlinear
dynamics (2), that is, $f(x^{\circ}, u^{\circ}) = 0$. Define
displacement variables $z = x - x^{\circ}$ and $v = u - u^{\circ}$,
and assuming $f(x, u)$ is smooth around this operating point, expand
(2) to first order

$$\dot{z} = \frac{\partial f}{\partial x}(x^{\circ}, u^{\circ})\,z + \frac{\partial f}{\partial u}(x^{\circ}, u^{\circ})\,v + O(z, v)^{2} \qquad (4)$$

Ignoring the higher order terms, we get a linear dynamics (1) where

$$F = \frac{\partial f}{\partial x}(x^{\circ}, u^{\circ}), \qquad G = \frac{\partial f}{\partial u}(x^{\circ}, u^{\circ})$$

This simple technique works very well in many cases and is the basis
for many engineering designs. For example, if the linear feedback $v =
Kz$ puts all the eigenvalues of $F + GK$ in the open left half plane,
then the affine feedback $u = u^{\circ} + K(x - x^{\circ})$ makes the
closed-loop dynamics locally asymptotically stable around $x^{\circ}$.
So, one way to linearize a nonlinear dynamics is to approximate it by
a linear dynamics.
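
The first-order approximation can be computed numerically when analytic derivatives are inconvenient. The sketch below applies central differences to a damped pendulum with a torque input; the model, its coefficients, and the helper name `jacobian` are assumptions chosen for illustration.

```python
import math

def f(x, u):
    """Damped pendulum with torque input (illustrative): x = (angle, angular rate)."""
    return [x[1], -math.sin(x[0]) - 0.5 * x[1] + u[0]]

def jacobian(fun, point, h=1e-6):
    """Central-difference Jacobian of fun at point; entry [i][j] = d fun_i / d point_j."""
    base = fun(point)
    cols = []
    for j in range(len(point)):
        up = list(point)
        dn = list(point)
        up[j] += h
        dn[j] -= h
        f_up, f_dn = fun(up), fun(dn)
        cols.append([(f_up[i] - f_dn[i]) / (2.0 * h) for i in range(len(base))])
    return [[cols[j][i] for j in range(len(point))] for i in range(len(base))]

# Operating point: the pendulum hanging at rest with zero torque.
x0, u0 = [0.0, 0.0], [0.0]
assert all(abs(v) < 1e-12 for v in f(x0, u0))   # f(x0, u0) = 0

F = jacobian(lambda x: f(x, u0), x0)   # df/dx at the operating point
G = jacobian(lambda u: f(x0, u), u0)   # df/du at the operating point
```

For this model the result agrees with the analytic Jacobians, F = [[0, 1], [-1, -0.5]] and G = [[0], [1]], up to the finite-difference error.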

The other way to linearize is by a nonlinear change of state
coordinates and a nonlinear state feedback because linearity is not
invariant under these transformations. To see this, suppose we have a
controlled linear dynamics (1) and we make the nonlinear change of
state coordinates $z = \phi(x)$ and nonlinear feedback $u = \gamma(z,
v)$. We assume that these transformations are invertible from some
neighborhood of $x^{\circ} = 0$, $u^{\circ} = 0$ to some neighborhood
of $z^{\circ}, v^{\circ}$, and the inverse maps are

$$\psi(\phi(x)) = x, \qquad \phi(\psi(z)) = z$$
$$\kappa(\psi(z), \gamma(z, v)) = v, \qquad \gamma(\phi(x), \kappa(x, u)) = u$$

then (1) becomes

$$\dot{z} = \frac{\partial \phi}{\partial x}(x)\,(Fx + Gu) = \frac{\partial \phi}{\partial x}(\psi(z))\,\bigl(F\psi(z) + G\gamma(z, v)\bigr)$$

which is a controlled nonlinear dynamics (2). This raises the question
asked by Brockett (1978), when is a controlled nonlinear dynamics a
change of coordinates and feedback away from a controlled linear
dynamics?
Linearization of a Smooth Vector Field

Let us start by addressing an apparently simpler question that was
first considered by Poincaré. Given an uncontrolled nonlinear dynamics
around a critical point, say $x^{\circ} = 0$,

$$\dot{x} = f(x), \qquad f(0) = 0,$$

find a smooth local change of coordinates

$$z = \phi(x), \qquad \phi(0) = 0$$

which transforms it into an uncontrolled linear dynamics

$$\dot{z} = Fz.$$

This question is apparently simpler, but as we shall see in the next
section, the corresponding question for a controlled nonlinear
dynamics that is affine in the control is actually easier to answer.

Without loss of generality, we can restrict our attention to changes
of coordinates which carry $x^{\circ} = 0$ to $z^{\circ} = 0$ and
whose Jacobian at this point is the identity, i.e.,

$$z = x + O(x)^{2}$$

then

$$F = \frac{\partial f}{\partial x}(0).$$

Poincaré’s formal solution to this problem was to expand the vector
field and the desired change of coordinates in a power series,

$$\dot{x} = Fx + f^{[2]}(x) + O(x)^{3}$$
$$z = x - \phi^{[2]}(x)$$

where $f^{[2]}$, $\phi^{[2]}$ are $n$-dimensional vector fields whose
entries are homogeneous polynomials of degree 2 in $x$. A
straightforward calculation yields

$$\dot{z} = Fz + f^{[2]}(x) - [Fx, \phi^{[2]}(x)] + O(x)^{3}$$

where the Lie bracket of two vector fields $f(x)$, $g(x)$ is the new
vector field defined by

$$[f(x), g(x)] = \frac{\partial g}{\partial x}(x)\,f(x) - \frac{\partial f}{\partial x}(x)\,g(x).$$

Hence, $\phi^{[2]}(x)$ must satisfy the so-called homological equation
(Arnol’d 1983)

$$[Fx, \phi^{[2]}(x)] = f^{[2]}(x).$$

This is a linear equation from the space of quadratic vector fields to
the space of quadratic vector fields. The quadratic $n$-dimensional
vector fields form a vector space of dimension $n \binom{n+1}{2}$.

Poincaré showed that the eigenvalues of the linear map

$$\phi^{[2]}(x) \mapsto [Fx, \phi^{[2]}(x)] \qquad (5)$$

are $\lambda_i + \lambda_j - \lambda_k$, where $\lambda_i$,
$\lambda_j$, $\lambda_k$ are eigenvalues of $F$. If none of these
expressions are zero, then the operator (5) is invertible. A degree
two resonance occurs when $\lambda_i + \lambda_j - \lambda_k = 0$, and
then the homological equation is not solvable for all $f^{[2]}(x)$.
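
The degree two resonance condition is easy to check by enumeration once the eigenvalues of $F$ are known. The function below is a minimal sketch; its name and the tolerance are assumptions, and the eigenvalues are supplied by the caller rather than computed.

```python
from itertools import combinations_with_replacement

def degree_two_resonances(eigenvalues, tol=1e-9):
    """Return index triples (i, j, k), i <= j, with lambda_i + lambda_j - lambda_k = 0.

    Real or complex eigenvalues are accepted; abs() handles both.
    """
    found = []
    n = len(eigenvalues)
    for i, j in combinations_with_replacement(range(n), 2):
        for k in range(n):
            if abs(eigenvalues[i] + eigenvalues[j] - eigenvalues[k]) < tol:
                found.append((i, j, k))
    return found

resonant = degree_two_resonances([-1.0, -2.0])      # (-1) + (-1) - (-2) = 0
non_resonant = degree_two_resonances([-1.0, -3.0])  # no combination vanishes
```

The loop visits $n \binom{n+1}{2}$ combinations, one for each eigenvalue of the map (5).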

Suppose a change of coordinates exists that linearizes the vector
field up to degree $r$. In the new coordinates, the vector field is of
the form

$$\dot{x} = Fx + f^{[r]}(x) + O(x)^{r+1}.$$

We seek a change of coordinates of the form

$$z = x - \phi^{[r]}(x)$$

to cancel the degree $r$ terms, i.e., we seek a solution of the $r$th
degree homological equation,

$$[Fx, \phi^{[r]}(x)] = f^{[r]}(x).$$

A degree $r$ resonance occurs if

$$\lambda_{i_1} + \cdots + \lambda_{i_r} - \lambda_k = 0.$$

If there is no resonance of degree $r$, then the degree $r$
homological equation is uniquely solvable for every $f^{[r]}(x)$.

When there are no resonances of any degree, then the convergence of
the formal power series solution is delicate. We refer the reader to
Arnol’d (1983) for the details.


Linearization of a Controlled 
Dynamics by Change of State 
Coordinates 

Given a controlled affine dynamics (3), when does
there exist a smooth local change of coordinates

\[ z = \phi(x), \qquad 0 = \phi(0), \]

transforming it to

\[ \dot{z} = Fz + Gu, \]

where

\[ F = \frac{\partial f}{\partial x}(0), \qquad G = g(0)? \]

This is an easier question to answer than that of
Poincaré.

The controlled affine dynamics (3) is
said to have well-defined controllability
(Kronecker) indices if there exists a reordering of
$g^1(x), \ldots, g^m(x)$ and integers $r_1 \ge r_2 \ge \cdots \ge r_m \ge 0$
such that $r_1 + \cdots + r_m = n$, and the
vector fields

\[ \{ \mathrm{ad}^k(f)g^j : j = 1, \ldots, m, \; k = 0, \ldots, r_j - 1 \} \]

are linearly independent at each x, where $g^i$
denotes the ith column of g and

\[ \mathrm{ad}^0(f)g^i = g^i, \qquad \mathrm{ad}^k(f)g^i = [f, \mathrm{ad}^{k-1}(f)g^i]. \]

If there are several sets of indices that satisfy this
definition, then the controllability indices are the
smallest in the lexicographic ordering.


A necessary and sufficient condition is that

\[ [\mathrm{ad}^k(f)g^i, \mathrm{ad}^l(f)g^j] = 0 \]

for k = 0, ..., n − 1, l = 0, ..., n.

The proof of this theorem is straightforward.
Under a change of state coordinates, the vector
fields and their Lie brackets are transformed by
the Jacobian of the coordinate change. Trivially
for linear systems,

\[ \mathrm{ad}^k(Fx)G^i = (-1)^k F^k G^i, \]
\[ [\mathrm{ad}^k(Fx)G^i, \mathrm{ad}^l(Fx)G^j] = 0, \]

since each $\mathrm{ad}^k(Fx)G^i$ is a constant vector field.

Feedback Linearization 

We turn to a question posed and partially
answered by Brockett (1978). Given a system affine
in the m-dimensional control

\[ \dot{x} = f(x) + g(x)u, \]

find a smooth local change of coordinates and
smooth feedback

\[ z = \phi(x), \qquad u = \alpha(x) + \beta(x)v \]

transforming it to

\[ \dot{z} = Fz + Gv. \]

Brockett solved this problem under the
assumptions that β is a constant and the control is a
scalar, m = 1. The more general question for
β(x) and arbitrary m was solved in different ways
by Korobov (1979), Jakubczyk and Respondek
(1980), Sommer (1980), Hunt and Su (1981), Su
(1982), and Hunt et al. (1983).

We describe the solution when m = 1. If the
pair F, G is controllable, then there exists an H
such that

\[ HF^{k-1}G = 0, \quad k = 1, \ldots, n-1, \]
\[ HF^{n-1}G = 1. \]

If the nonlinear system is feedback linearizable,
then there exists a function h(x) = Hφ(x)
such that

\[ L_{\mathrm{ad}^{k-1}(f)g} h = 0, \quad k = 1, \ldots, n-1, \]
\[ L_{\mathrm{ad}^{n-1}(f)g} h \neq 0, \]

where the Lie derivative of a function h by a
vector field g is given by

\[ L_g h = \frac{\partial h}{\partial x} g. \]

This is a system of first-order PDEs, and the 
solvability conditions are given by the classical 
Frobenius theorem, namely, that 

\[ \{ g, \ldots, \mathrm{ad}^{n-2}(f)g \} \]

is involutive, i.e., its span is closed under Lie
bracket.

For controllable systems, this is a necessary
and sufficient condition. The controllability
condition is that $\{ g, \ldots, \mathrm{ad}^{n-1}(f)g \}$ spans x space.

Suppose m = 2 and the system has controllability
(Kronecker) indices $r_1 \ge r_2$. Such a system
is feedback linearizable iff

\[ \{ g^1, g^2, \ldots, \mathrm{ad}^{r_i-2}(f)g^1, \mathrm{ad}^{r_i-2}(f)g^2 \} \]

is involutive for i = 1, 2. Another way of
putting this is that the distribution spanned by the first
through rth rows of the following matrix must
be involutive for $r = r_i - 1$, i = 1, 2. This is
equivalent to the distribution spanned by the first
through rth rows of the following matrix being
involutive for all $r = 1, \ldots, r_1$.

\[ \begin{bmatrix}
g^1 & g^2 \\
\mathrm{ad}(f)g^1 & \mathrm{ad}(f)g^2 \\
\vdots & \vdots \\
\mathrm{ad}^{r_2-2}(f)g^1 & \mathrm{ad}^{r_2-2}(f)g^2 \\
\mathrm{ad}^{r_2-1}(f)g^1 & \mathrm{ad}^{r_2-1}(f)g^2 \\
\vdots & \\
\mathrm{ad}^{r_1-2}(f)g^1 & \\
\mathrm{ad}^{r_1-1}(f)g^1 &
\end{bmatrix} \]

One might ask if it is possible to use dynamic
feedback to linearize a system that is not
linearizable by static feedback. Suppose we treat one of
the controls $u_j$ as a state and let its derivative be
a new control,

\[ \dot{u}_j = \bar{u}_j; \]

can the resulting system be linearized by state
feedback and change of state coordinates?
Loosely speaking, the effect of adding such
an integrator to the jth control is to shift the
jth column of the above matrix down by one
row. This changes the distribution spanned by
the first through rth rows of the above matrix
and might make it involutive. A scalar input
system m = 1 that is linearizable by dynamic
state feedback is also linearizable by static state
feedback. There are multi-input systems m > 1
that are dynamically linearizable but not statically
linearizable (Charlet et al. 1989, 1991).

The generic system is not feedback linearizable,
but mechanical systems with one actuator
for each degree of freedom typically are feedback
linearizable. This fact had been used in many
applications, e.g., robotics, before the concept of
feedback linearization.

One should not lose sight of the fact that
stabilization, model-matching, or some other
performance criterion is typically the goal of controller
design. Linearization is a means to the goal.
We linearize because we know how to meet the
performance goal for linear systems.

Even when the system is linearizable, finding
the linearizing coordinates and feedback can be
a nontrivial task. Mechanical systems are the
exception as the linearizing coordinates are usually
the generalized positions. Since the $\mathrm{ad}^{k-1}(f)g$
for k = 1, ..., n − 1 are characteristic directions
of the PDE for h, the general solutions of the
ODEs

\[ \dot{x} = \mathrm{ad}^{k-1}(f)g(x) \]

can be used to construct the solution (Blankenship
and Quadrat 1984). The Gardner-Shadwick
(GS) algorithm (1992) is the most efficient
method that is known.

Linearization of discrete time systems was
treated by Lee et al. (1986). Linearization of
discrete time systems around an equilibrium
manifold was treated by Barbot et al. (1995) and
Jakubczyk (1987). Banaszuk and Hauser have
also considered the feedback linearization of
the transverse dynamics along a periodic orbit
(Banaszuk and Hauser 1995a,b).


Input-Output Linearization 

Feedback linearization as presented above
ignores the output of the system, but typically one
uses the input to control the output. Therefore,
one wants to linearize the input-output response
of the system rather than the dynamics. This was
first treated in Isidori and Krener (1982) and
Isidori and Ruberti (1984).

Consider a scalar input, scalar output system
of the form

\[ \dot{x} = f(x) + g(x)u, \qquad y = h(x). \]

The relative degree of the system is the number of
integrators between the input and the output. To
be more precise, the system is of relative degree
r ≥ 1 if for all x of interest,

\[ L_{\mathrm{ad}^{j}(f)g} h(x) = 0, \quad j = 0, \ldots, r-2, \]
\[ L_{\mathrm{ad}^{r-1}(f)g} h(x) \neq 0. \]

In other words, the control appears first in the rth 
time derivative of the output. Of course, a system 
might not have a well-defined relative degree as 
the r might vary with x. 
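For a linear system the definition can be checked directly, since with f(x) = Fx, constant g = G and h(x) = Hx the Lie derivatives above reduce, up to sign, to the products $HF^{j}G$. The sketch below assumes this linear special case only; the helper names `mat_vec` and `relative_degree` are illustrative.

```python
def mat_vec(A, v):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def relative_degree(F, G, H, tol=1e-12):
    """Smallest r >= 1 with H F^(r-1) G != 0 for x' = Fx + Gu, y = Hx.

    Returns None if H F^(r-1) G vanishes for every r up to the state dimension.
    """
    n = len(F)
    v = list(G)  # v holds F^(r-1) G at the start of iteration r
    for r in range(1, n + 1):
        if abs(sum(h * x for h, x in zip(H, v))) > tol:
            return r
        v = mat_vec(F, v)
    return None


if __name__ == "__main__":
    # Chain of three integrators driven through the last state.
    F = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
    G = [0, 0, 1]
    print(relative_degree(F, G, [1, 0, 0]))  # output taken at the first state
    print(relative_degree(F, G, [0, 1, 0]))  # output taken at the second state
```

The count matches the number of integrators between the input and the chosen output.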

Rephrasing the result of the previous section,
a scalar input nonlinear system is feedback
linearizable if there exists a pseudo-output map
h(x) such that the resulting scalar input, scalar output
system has a relative degree equal to the state
dimension n.

Assume we have a scalar input, scalar output
system with a well-defined relative degree
1 ≤ r < n. We can define r partial coordinate
functions

\[ \xi_i(x) = (L_f)^{i-1} h(x), \quad i = 1, \ldots, r, \]

and choose n − r functions $\eta_i(x)$, i = 1, ...,
n − r so that (ξ, η) are a full set of coordinates on
the state space. Furthermore, it is always possible
(Isidori 1995) to choose $\eta_i(x)$ so that

\[ L_g \eta_i(x) = 0, \quad i = 1, \ldots, n-r. \]

In these coordinates, the system is in the normal
form

\[ y = \xi_1 \]
\[ \dot{\xi}_1 = \xi_2 \]
\[ \vdots \]
\[ \dot{\xi}_{r-1} = \xi_r \]
\[ \dot{\xi}_r = f_r(\xi, \eta) + g_r(\xi, \eta)u \]
\[ \dot{\eta} = \phi(\xi, \eta). \]

The feedback u = u(ξ, η, v) defined by

\[ u = \frac{v - f_r(\xi, \eta)}{g_r(\xi, \eta)} \]

transforms the system to

\[ y = H\xi \]
\[ \dot{\xi} = F\xi + Gv \]
\[ \dot{\eta} = \phi(\xi, \eta) \]

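where F, G, H are the r × r, r × 1, 1 × r matrices

\[ F = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}, \qquad
G = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}, \]

\[ H = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \end{bmatrix}. \]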

The system has been transformed into a string 
of integrators plus additional dynamics that is 
unobservable from the output. 

By suitable choice of additional feedback v =
Kξ, one can ensure that the poles of F + GK
are stable. The stability of the overall system then
depends on the stability of the zero dynamics
(Byrnes and Isidori 1984, 1988),

\[ \dot{\eta} = \phi(0, \eta). \]

If this is stable, then the overall system will be
stable. The zero dynamics is so-called because
it is the dynamics that results from imposing the
constraint y(t) = 0 on the system. For this to be
satisfied, the initial value must satisfy ξ(0) = 0
and the control must satisfy

\[ u(t) = -\frac{f_r(0, \eta(t))}{g_r(0, \eta(t))}. \]


Similar results hold in the multiple input, multiple
output case; see Isidori (1995) for the details.


Approximate Feedback Linearization 

Since so few controlled dynamics are exactly
feedback linearizable, Krener (1984) introduced
the concept of approximate feedback linearization.
The goal is to find a smooth local change
of coordinates and a smooth feedback

\[ z = \phi(x), \qquad u = \alpha(x) + \beta(x)v \]

transforming the controlled affine dynamics
(3) to

\[ \dot{z} = Fz + Gv + N(x, u), \]

where the nonlinearity N(x, u) is small in some
sense. In the design process, the nonlinearity is
ignored, and the controller design is done on the
linear model and then transformed back into a
controller for the original system.

The power series approach of Poincaré was
taken by Krener et al. (1987, 1988, 1991), Krener
(1990), and Krener and Maag (1991). See also
Kang (1994). It is applicable to dynamics which
may not be affine in the control. The controlled
nonlinear dynamics (2), the change of coordinates,
and the feedback are expanded in a power
series

\[ \dot{x} = Fx + Gu + f^{[2]}(x, u) + O(x, u)^3 \]
\[ z = x - \phi^{[2]}(x) \]
\[ v = u - \alpha^{[2]}(x, u). \]

The transformed system is

\[ \dot{z} = Fz + Gv + f^{[2]}(x, u) - [Fx + Gu, \phi^{[2]}(x)] + G\alpha^{[2]}(x, u) + O(x, u)^3. \]

To eliminate the quadratic terms, one seeks a
solution of the degree two homological equations
for $\phi^{[2]}$, $\alpha^{[2]}$,

\[ [Fx + Gu, \phi^{[2]}(x)] - G\alpha^{[2]}(x, u) = f^{[2]}(x, u). \]

Unlike before, the degree two homological
equations are not square. Almost always, the
number of unknowns is less than the number of
equations. Furthermore, the mapping

\[ (\phi^{[2]}(x), \alpha^{[2]}(x, u)) \mapsto [Fx + Gu, \phi^{[2]}(x)] - G\alpha^{[2]}(x, u) \]

is less than full rank. Hence, only an approximate,
e.g., a least squares solution, is possible.
Krener has written a MATLAB toolbox
(http://www.math.ucdavis.edu/~krener 1995)
to compute term by term solutions to the
homological equations. The routine fh2f_h_.m
sequentially computes the least squares solutions
of the homological equations to arbitrary
degree.


Observers with Linearizable Error
Dynamics

The dual of linear state feedback is linear input-output
injection. Linear input-output injection is
the transformation carrying

\[ \dot{x} = Fx + Gu \]


\[ y = Hx \]

into

\[ \dot{x} = Fx + Gu + Ly + Mu \]
\[ y = Hx. \]

Linear input-output injection and linear change
of state coordinates

\[ \dot{x} = Fx + Gu + Ly + Mu, \qquad y = Hx, \]
\[ z = Tx, \]
\[ \dot{z} = TFT^{-1}z + TGu + TLy + TMu \]

define a group action on the class of linear systems.
Of course, output injection is not physically
realizable on the original system, but it is realizable
on the observer error dynamics.

Nonlinear input-output injection is not well
defined independent of the coordinates; input-output
injection in one coordinate system does
not look like input-output injection in another
coordinate system.

If a system

\[ \dot{x} = f(x, u), \qquad y = h(x) \]

can be transformed by nonlinear changes of state
and output coordinates

\[ z = \phi(x), \qquad w = \gamma(y) \]

to a linear system with nonlinear input-output
injection

\[ \dot{z} = Fz + Gu + \alpha(y, u) \]
\[ w = Hz \]

then the observer

\[ \dot{\hat{z}} = (F + LH)\hat{z} + Gu + \alpha(y, u) - Lw \]

has linear error dynamics

\[ \tilde{z} = z - \hat{z} \]
\[ \dot{\tilde{z}} = (F + LH)\tilde{z}. \]

If H, F is detectable, then F + LH can be made
Hurwitz, i.e., all its eigenvalues are in the open
left half plane.

The case when γ = identity, there are no
inputs m = 0 and one output p = 1,

\[ \dot{x} = f(x) \]
\[ y = h(x) \]

was solved by Krener and Isidori (1983) and
Bestle and Zeitz (1983) when the pair H, F
defined by

\[ F = \frac{\partial f}{\partial x}(0), \qquad H = \frac{\partial h}{\partial x}(0) \]

is observable.

One seeks a change of coordinates z = φ(x)
so that the system is linear up to output injection

\[ \dot{z} = Fz + \alpha(y) \]
\[ y = Hz. \]

If they exist, the z coordinates satisfy the PDEs

\[ L_{\mathrm{ad}^{n-k}(f)g}(z_j) = \delta_{k,j} \]

where the vector field g(x) is defined by

\[ L_g L_f^{k-1} h = \begin{cases} 0 & 1 \le k < n \\ 1 & k = n. \end{cases} \]

The solvability conditions for these PDEs are
that for 1 ≤ k < l ≤ n − 1

\[ [\mathrm{ad}^{k-1}(f)g, \mathrm{ad}^{l-1}(f)g] = 0. \]

The general case with γ, m, p arbitrary was
solved by Krener and Respondek (1985). The
solution is a three-step process. First, one must
set up and solve a linear PDE for γ(y). The
integrability conditions for this PDE involve the
vanishing of a pseudo-curvature (Krener 1986).
The next two steps are similar to the above.



One defines a vector field $g^j$, 1 ≤ j ≤ p, for
each output; these define a PDE for the change
of coordinates, for which certain integrability
conditions must be satisfied. The process is more
complicated than feedback linearization and even
less likely to be successful, so approximate solutions
must be sought, which we will discuss later
in this section. We refer the reader to Krener and
Respondek (1985) and related work by Zeitz (1987)
and Xia and Gao (1988a,b, 1989).

Very few systems can be linearized by change 
of state coordinates and input-output injection, so 
Krener et al. (1987, 1988, 1991), Krener (1990), 
and Krener and Maag (1991) sought approximate 
solutions by the power series approach. Again, 
the system, the changes of coordinates, and the 
output injection are expanded in a power series. 
See the above references for details. 


Conclusion 

We have surveyed the various ways a nonlinear 
system can be approximated by a linear system. 


Cross-References 

► Differential Geometric Methods in Nonlinear 
Control 

► Lie Algebraic Methods in Nonlinear Control 

► Nonlinear Zero Dynamics

Abstract

We consider the simplest design problem for nonlinear
systems: the problem of rendering asymptotically
stable a given equilibrium by means of
state feedback. For such a problem, we provide
a necessary condition, known as the Brockett condition,
and a sufficient condition, which relies upon
the definition of a class of functions, known as
control Lyapunov functions. The theory is illustrated
by means of a few examples. In addition,
we discuss a nonlinear enhancement of the so-called
separation principle for stabilization by
means of partial state information.

Keywords 

Brockett theorem; Control Lyapunov function; 
Output feedback; State feedback 

Introduction 

The problem of feedback stabilization, namely,
the problem of designing a feedback control law
locally, or globally, asymptotically stabilizing a
given equilibrium point, is the simplest design
problem for nonlinear systems. If the state of
the system is available for feedback, then the
problem is referred to as the state feedback
stabilization problem, whereas if only part of the state,
for example, an output signal, is available for
feedback, the problem is referred to as the partial
state feedback (or output feedback) stabilization
problem. We initially focus on the state feedback
stabilization problem, which can be formulated as
follows.

Consider a nonlinear system described by the
equation

\[ \dot{x} = F(x, u), \qquad (1) \]

where $x(t) \in \mathbb{R}^n$ denotes the state of the system,
$u(t) \in \mathbb{R}^m$ denotes the input of the system, and
$F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ is a smooth mapping.

Let $x_0 \in \mathbb{R}^n$ be an achievable equilibrium,
i.e., $x_0$ is such that there exists a constant $u_0 \in \mathbb{R}^m$
such that $F(x_0, u_0) = 0$. The state feedback
stabilization problem consists in finding, if possible,
a state feedback control law, described by the
equation

\[ u = \alpha(x), \qquad (2) \]

with $\alpha : \mathbb{R}^n \to \mathbb{R}^m$, such that the equilibrium
$x_0$ is a locally asymptotically stable equilibrium
for the closed-loop system

\[ \dot{x} = F(x, \alpha(x)). \qquad (3) \]

Alternatively, one could require that the equilibrium
be globally asymptotically stable. Note that
it is not always possible to extend local properties
to global properties. For example, for the system
described by the equations $\dot{x}_1 = x_2(1 - x_1)$,
$\dot{x}_2 = u$, with $x_1(t) \in \mathbb{R}$, $x_2(t) \in \mathbb{R}$, and
$u(t) \in \mathbb{R}$, it is not possible to design a feedback
law which renders the zero equilibrium globally
asymptotically stable.

If only partial information on the state is
available, then one has to resort to a dynamic
output feedback controller, namely, a controller
described by equations of the form

\[ \dot{\xi} = \beta(\xi, y), \qquad u = \alpha(\xi), \qquad (4) \]

where $\xi(t) \in \mathbb{R}^\nu$ describes the state of the
controller, $y(t) \in \mathbb{R}^p$ is given by y = h(x),
for some mapping $h : \mathbb{R}^n \to \mathbb{R}^p$, and describes
the available information on the state x, and
$\beta : \mathbb{R}^\nu \times \mathbb{R}^p \to \mathbb{R}^\nu$ and $\alpha : \mathbb{R}^\nu \to \mathbb{R}^m$ are smooth
mappings. Within this scenario, the stabilization
problem boils down to selecting the (nonnegative)
integer ν (i.e., the order of the controller),
a constant $\xi_0 \in \mathbb{R}^\nu$, and the mappings α and β
such that the closed-loop system

\[ \dot{x} = F(x, \alpha(\xi)), \qquad \dot{\xi} = \beta(\xi, h(x)), \qquad (5) \]

has a locally (or globally) asymptotically stable
equilibrium at $(x_0, \xi_0)$. Alternatively, one may
require that the equilibrium $(x_0, \xi_0)$ of the
closed-loop system (5) be locally asymptotically stable
with a region of attraction that contains a given,
user-specified, set.

The rest of the entry is organized as follows.
We begin by discussing two key results. The first
is a necessary condition, due to R.W. Brockett,
for continuous stabilizability. This provides an
obstruction to the solvability of the problem and
can be used to show that, for nonlinear systems,
controllability does not imply stabilizability by
continuous feedback. The second one is the
extension of the Lyapunov direct method to systems
with control. The main idea is the introduction
of a control version of Lyapunov functions, the
control Lyapunov functions, which can be used
to design stabilizing control laws by means of a
universal formula. We then describe two classes
of systems for which it is possible to construct,
with systematic procedures, smooth control laws
yielding global asymptotic stability of a given
equilibrium: systems in feedback and in feedforward
form. There are several other constructive
and systematic stabilization methods which have
been developed in the last few decades. Worth
mentioning are passivity-based methods and center
manifold-based methods.

We conclude the entry by describing a nonlinear
version of the separation principle for the
asymptotic stabilization, by output feedback, of
a general class of nonlinear systems.

Preliminary Results

To highlight the difficulties and peculiarities of
the nonlinear stabilization problem, we recall
some basic facts from linear systems theory and
exploit such facts to derive a sufficient condition
and a necessary condition. In the case of linear
systems, i.e., systems described by the equation
$\dot{x} = Ax + Bu$, with $A \in \mathbb{R}^{n \times n}$ and $B \in \mathbb{R}^{n \times m}$,
and linear state feedback, i.e., feedback described
by the equation u = Kx, with $K \in \mathbb{R}^{m \times n}$, the
stabilization problem boils down to the problem
of placing, in the complex plane, the eigenvalues
of the matrix A + BK to the left of the imaginary
axis. This problem is solvable if and only if the
uncontrollable modes of the system are located,
in the complex plane, to the left of the imaginary
axis.

The linear theory may be used to provide
a simple obstruction to feedback stabilizability
and a simple sufficient condition. Let $x_0$ be an
achievable equilibrium with $u_0 = 0$ and note
that the linear approximation of the system (1)
around $x_0$ is described by an equation of the form
$\dot{x} = Ax + Bu$.

If for any $K \in \mathbb{R}^{m \times n}$ the condition

\[ \sigma(A + BK) \cap \mathbb{C}^{+} \neq \emptyset \qquad (6) \]

holds, then the equilibrium of the nonlinear system
cannot be stabilized by any continuously
differentiable feedback such that $\alpha(x_0) = 0$.
The notation σ(A) denotes the spectrum of the
matrix A, i.e., the eigenvalues of A. Note however
that, if the condition $\alpha(x_0) = 0$ is dropped, the
obstruction does not hold: the zero equilibrium of
$\dot{x} = x + xu$ is not stabilizable by any continuous
feedback such that α(0) = 0, yet the feedback
u = −2 is a (global) stabilizer.

On the contrary, if there exists a K such that

\[ \sigma(A + BK) \subset \mathbb{C}^{-} \]

then the feedback α(x) = Kx locally asymptotically
stabilizes the equilibrium $x_0$ of the closed-loop
system. This fact is often referred to as the
linearization approach.
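The linearization approach amounts to an eigenvalue test on A + BK. The following minimal sketch assumes a planar, single-input linear approximation, for which a real 2 × 2 matrix is Hurwitz exactly when its trace is negative and its determinant is positive. The matrices and the helper names `closed_loop` and `is_hurwitz_2x2` are illustrative assumptions.

```python
def closed_loop(A, B, K):
    """Return A + B K for a 2x2 matrix A, a column B and a row K (as lists)."""
    return [[A[i][j] + B[i] * K[j] for j in range(2)] for i in range(2)]


def is_hurwitz_2x2(M):
    """A real 2x2 matrix has both eigenvalues in the open left half plane
    if and only if its trace is negative and its determinant is positive."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return trace < 0 and det > 0


if __name__ == "__main__":
    # Controllable pair with an unstable open loop (eigenvalues +1 and -1).
    A = [[0.0, 1.0], [1.0, 0.0]]
    B = [0.0, 1.0]
    K = [-2.0, -2.0]
    print(is_hurwitz_2x2(A), is_hurwitz_2x2(closed_loop(A, B, K)))

    # Unstable mode that the input cannot reach: no gain K helps.
    A_bad = [[1.0, 0.0], [0.0, 0.0]]
    print(any(is_hurwitz_2x2(closed_loop(A_bad, B, [k1, k2]))
              for k1 in range(-5, 6) for k2 in range(-5, 6)))
```

In the second pair the determinant of A + BK equals the second gain and the trace equals one plus that gain, so the two sign conditions can never hold together.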


The above linear arguments are often inadequate
to design feedback stabilizers: a theory for
nonlinear feedback has to be developed. However,
this theory is much more involved. In particular,
it is important to observe that the solvability
of the stabilization problem may depend upon
the regularity properties of the feedback, i.e.,
of the mapping α. In fact, a given equilibrium
of a nonlinear system may be rendered locally
asymptotically stable by a continuous feedback,
whereas there may be no continuously differentiable
feedback achieving the same goal. If the
feedback is required to be continuously differentiable,
then the problem is often referred to as the
smooth stabilization problem.

Example 1 To illustrate the role of the regularity
properties of the feedback, consider the system
described by the equations

\[ \dot{x}_1 = x_1 - x_2^3, \qquad \dot{x}_2 = u, \]

with $x_1(t) \in \mathbb{R}$, $x_2(t) \in \mathbb{R}$, and $u(t) \in \mathbb{R}$, and
the equilibrium $x_0 = (0, 0)$. The equilibrium is
globally asymptotically stabilized by the continuous
feedback

\[ \alpha(x) = -x_2 + x_1 + \frac{4}{3} x_1^{1/3} - x_2^3, \]

but it is not stabilizable by any continuously
differentiable feedback. Note, in fact, that condition
(6) holds.


Brockett Theorem 

Brockett’s necessary condition, which is far from
being sufficient, provides a simple test to rule out
the existence of a continuous stabilizer.

Theorem 1 Consider the system (1) and assume
$x_0 = 0$ is an achievable equilibrium with $u_0 = 0$.

Assume there exists a continuous stabilizing
feedback u = α(x). Then, for each ε > 0 there
exists δ > 0 such that, for all y with ‖y‖ < δ, the
equation y = F(x, u) has at least one solution in
the set ‖x‖ < ε, ‖u‖ < ε.

Theorem 1 can be reformulated as follows.
The existence of a continuous stabilizer implies
that the image of the mapping $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$
covers a neighborhood of the origin. Note, in
addition, that the obstruction expressed by
Theorem 1 is of topological nature. Hence, it requires
continuity of F and α, and time invariance: it
does not hold if u = α(x, t), i.e., a time-varying
feedback is designed.

In the linear case, the Brockett condition reduces
to the condition

\[ \mathrm{rank}[A - \lambda I, \; B] = n \]

for λ = 0. This is a necessary, but clearly not
sufficient, condition for the stabilizability of
$\dot{x} = Ax + Bu$.
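The rank test at λ = 0 is simple to evaluate. The sketch below uses plain Gaussian elimination so that it stays self-contained; the helper names `rank` and `brockett_linear` and the example matrices are illustrative assumptions.

```python
def rank(M, tol=1e-12):
    """Rank of a matrix (list of rows) by Gaussian elimination with pivoting."""
    M = [list(map(float, row)) for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        if r == rows:
            break
        pivot = max(range(r, rows), key=lambda i: abs(M[i][c]))
        if abs(M[pivot][c]) < tol:
            continue  # no usable pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def brockett_linear(A, B):
    """Check rank [A, B] = n, i.e., the linear Brockett condition at lambda = 0."""
    augmented = [list(a_row) + list(b_row) for a_row, b_row in zip(A, B)]
    return rank(augmented) == len(A)


if __name__ == "__main__":
    # Condition holds although the mode at +1 is uncontrollable:
    # the test is necessary, not sufficient.
    print(brockett_linear([[1, 0], [0, 0]], [[0], [1]]))
    # Condition fails: the second state obeys x2' = 0 and cannot be moved.
    print(brockett_linear([[0, 1], [0, 0]], [[1], [0]]))
```

The first pair passes the test yet is not stabilizable, which illustrates why the condition is only necessary.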

Example 2 Consider the kinematic model of a
mobile robot given by the equations

\[ \dot{x} = \cos\theta \, v, \]
\[ \dot{y} = \sin\theta \, v, \]
\[ \dot{\theta} = \omega, \]

where $(x(t), y(t)) \in \mathbb{R}^2$ denotes the Cartesian
position of the robot, $\theta(t) \in (-\pi, \pi]$ denotes
the robot orientation (with respect to the x-axis),
$v(t) \in \mathbb{R}$ is the forward velocity of the robot,
and $\omega(t) \in \mathbb{R}$ is its angular velocity. Simple
intuitive considerations suggest that the system
is controllable, i.e., it is possible to select the
forward and angular velocities to drive the robot
from any initial position/orientation to any final
position/orientation in any given positive time.
Nevertheless, the zero equilibrium (and any other
equilibrium of the system) is not continuously
stabilizable. In fact, the equations

\[ y_1 = \cos\theta \, v, \qquad y_2 = \sin\theta \, v, \qquad y_3 = \omega, \]

with $\|(y_1, y_2, y_3)\| < \delta$ and $\|(x, y, \theta)\| < \epsilon$,
$\|(v, \omega)\| < \epsilon$, are in general not solvable. For
example, if ε < π/2 and $y_1 = 0$, $y_2 \neq 0$, $y_3 = 0$,
then the unique solution of the first and third
equations is v = 0 and ω = 0, implying
$\sin\theta \, v = 0$; hence, the second equation does not
have a solution.
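The obstruction in Example 2 can also be seen numerically. The sketch below is an illustration under stated assumptions: the grid search, the function name `min_residual`, and the values of δ and ε are arbitrary choices, and ω is fixed at zero because any other value only increases the distance to a target with $y_3 = 0$.

```python
import math


def min_residual(delta, eps, grid=41):
    """Smallest distance from the target (0, delta, 0) to the image point
    (cos(theta) v, sin(theta) v, 0) over a grid with |theta|, |v| <= eps."""
    best = float("inf")
    for i in range(grid):
        theta = -eps + 2.0 * eps * i / (grid - 1)
        for j in range(grid):
            v = -eps + 2.0 * eps * j / (grid - 1)
            r = math.hypot(math.cos(theta) * v, math.sin(theta) * v - delta)
            best = min(best, r)
    return best


if __name__ == "__main__":
    delta, eps = 0.01, 0.5
    # Minimizing over v for fixed theta gives the distance delta * |cos(theta)|,
    # so the residual never drops below delta * cos(eps) when eps < pi / 2.
    print(min_residual(delta, eps), delta * math.cos(eps))
```

However small the target is, it stays at a positive distance from the image, which is exactly the failure of the condition in Theorem 1.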


Control Lyapunov Functions 

The Lyapunov theory states that the equilibrium
$x_0$ of the system

\[ \dot{x} = f(x), \]

with $f : \mathbb{R}^n \to \mathbb{R}^n$, is locally asymptotically
stable if there exists a continuously differentiable
function $V : \mathbb{R}^n \to \mathbb{R}$, called Lyapunov
function, and a neighborhood U of $x_0$ such that
$V(x_0) = 0$, $V(x) > 0$, for all $x \in U$ and $x \neq x_0$,
and $\frac{\partial V}{\partial x} f(x) < 0$, for all $x \in U$ and $x \neq x_0$.

To apply this idea to the stabilization problem,
consider the system (1). If the equilibrium $x_0$ of
$\dot{x} = F(x, u)$ is continuously stabilizable, then
there must exist a continuously differentiable
function V and a neighborhood U of $x_0$ such that

\[ \inf_u \frac{\partial V}{\partial x} F(x, u) < 0, \]

for all $x \in U$ and $x \neq x_0$. This motivates the
following definition.

Definition 1 A continuously differentiable function
V such that

• $V(x_0) = 0$ and $V(x) > 0$, for all $x \in U$ and
$x \neq x_0$,

• $\inf_u \frac{\partial V}{\partial x} F(x, u) < 0$, for all $x \in U$ and
$x \neq x_0$,

is called a control Lyapunov function.

By Lyapunov theory, the existence of a continuous
stabilizer implies the existence of a control
Lyapunov function. On the other hand, the
existence of a control Lyapunov function does not
guarantee the existence of a stabilizer. However,
in the case of systems affine in the control, i.e.,
systems described by the equation

\[ \dot{x} = f(x) + g(x)u, \qquad (7) \]

with $f : \mathbb{R}^n \to \mathbb{R}^n$ and $g : \mathbb{R}^n \to \mathbb{R}^{n \times m}$
smooth mappings, very general results can be
proven. These have been proven by Z. Artstein,
who gave a nonconstructive statement, and have
been given a constructive form by E.D. Sontag.
In particular, for single-input nonlinear systems,
the following statement holds.

Theorem 2 Consider the system (7), with m =
1, and assume f(0) = 0.

There exists an almost smooth feedback, i.e.,
the feedback α(x) is continuously differentiable
for all $x \in \mathbb{R}^n$ and $x \neq 0$, and continuous at
x = 0, which globally asymptotically stabilizes
the equilibrium x = 0 if and only if there
exists a positive definite, radially unbounded, i.e.,
$\lim_{\|x\| \to \infty} V(x) = \infty$, and smooth function V(x)
such that

1. $\frac{\partial V}{\partial x} g(x) = 0 \Rightarrow \frac{\partial V}{\partial x} f(x) < 0$, for all
   $x \neq 0$;

2. For each ε > 0 there is a δ > 0 such that
   ‖x‖ < δ implies that there is a |u| < ε such
   that

\[ \frac{\partial V}{\partial x} f(x) + \frac{\partial V}{\partial x} g(x) u < 0. \]

Condition 2 is known as the small control
property, and it is necessary to guarantee continuity
of the feedback at x = 0. If Conditions 1 and 2
hold, then an almost smooth feedback is given by
the so-called Sontag’s universal formula:

\[ \alpha(x) = \begin{cases}
0, & \text{if } \dfrac{\partial V}{\partial x} g(x) = 0, \\[2ex]
-\dfrac{\dfrac{\partial V}{\partial x} f(x) + \sqrt{\left( \dfrac{\partial V}{\partial x} f(x) \right)^2 + \left( \dfrac{\partial V}{\partial x} g(x) \right)^4}}{\dfrac{\partial V}{\partial x} g(x)}, & \text{elsewhere.}
\end{cases} \]
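A minimal sketch of the universal formula for a scalar example follows. The system $\dot{x} = x + u$ and the function $V(x) = x^2/2$ are illustrative assumptions chosen so that the result can be checked by hand; the names `sontag` and `V_dot` are likewise hypothetical.

```python
import math


def sontag(a, b):
    """Universal formula for a single input, with
    a = (dV/dx) f(x) and b = (dV/dx) g(x)."""
    if b == 0.0:
        return 0.0
    return -(a + math.sqrt(a * a + b ** 4)) / b


# Assumed example: x' = x + u with V(x) = x^2 / 2.
def f(x):
    return x


def g(x):
    return 1.0


def dV(x):
    return x


def V_dot(x):
    """Derivative of V along the closed loop x' = f(x) + g(x) alpha(x)."""
    a, b = dV(x) * f(x), dV(x) * g(x)
    return a + b * sontag(a, b)


if __name__ == "__main__":
    for x in (-3.0, -1.0, 0.5, 2.0):
        a, b = dV(x) * f(x), dV(x) * g(x)
        # Along the closed loop, V_dot equals -sqrt(a^2 + b^4) whenever b != 0.
        print(x, sontag(a, b), V_dot(x), -math.sqrt(a * a + b ** 4))
```

For this example the formula reduces to the linear feedback $u = -(1 + \sqrt{2})\,x$, which is consistent with the small control property near the origin.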


Constructive Stabilization 

We now introduce two classes of nonlinear systems
for which systematic design methods to
solve the state feedback stabilization problem are
available.

Feedback Systems

Consider a nonlinear system described by equations
of the form

\[ \dot{x}_1 = f_1(x_1, x_2), \qquad \dot{x}_2 = u, \qquad (8) \]

with $x_1(t) \in \mathbb{R}^n$, $x_2(t) \in \mathbb{R}$, $u(t) \in \mathbb{R}$,
and $f_1(0, 0) = 0$. This system belongs to
the so-called class of feedback systems for
which a sort of reduction principle holds: the
zero equilibrium of the system is smoothly
stabilizable if the same holds for the reduced
system $\dot{x}_1 = f_1(x_1, v)$, which is obtained from
the first of Eq. (8) replacing the state variable
$x_2$ with a virtual control input v. To show this
property, suppose there exist a continuously
differentiable function $\alpha_1 : \mathbb{R}^n \to \mathbb{R}$ and
a continuously differentiable and radially
unbounded function $V_1 : \mathbb{R}^n \to \mathbb{R}$ such that
$V_1(0) = 0$, $V_1(x_1) > 0$, for all $x_1 \neq 0$, and

\[ \frac{\partial V_1}{\partial x_1} f_1(x_1, \alpha_1(x_1)) < 0, \]

for all $x_1 \neq 0$, i.e., the zero equilibrium of the
system $\dot{x}_1 = f_1(x_1, v)$ is globally asymptotically
stabilizable.

Consider now the function

\[ V(x_1, x_2) = V_1(x_1) + \frac{1}{2} (x_2 - \alpha_1(x_1))^2, \]

which is radially unbounded and such that
V(0, 0) = 0 and $V(x_1, x_2) > 0$ for all nonzero
$(x_1, x_2)$, and note that

\[ \dot{V} = \frac{\partial V_1}{\partial x_1} f_1(x_1, x_2) + (x_2 - \alpha_1(x_1)) (u + \Delta_1(x_1, x_2)) \]
\[ \phantom{\dot{V}} = \frac{\partial V_1}{\partial x_1} f_1(x_1, \alpha_1(x_1)) + (x_2 - \alpha_1(x_1)) (u + \Delta_2(x_1, x_2)), \]

for some continuously differentiable mappings
$\Delta_1$ and $\Delta_2$. As a result, the feedback

\[ \alpha(x_1, x_2) = -\Delta_2(x_1, x_2) - k (x_2 - \alpha_1(x_1)), \]

with k > 0, yields $\dot{V} < 0$ for all nonzero $(x_1, x_2)$;
hence, the feedback is a continuously differentiable
stabilizer for the zero equilibrium of the
system (8). Note, finally, that the function V is
a control Lyapunov function for the system (8); hence,
Sontag’s formula can be also used to construct a
stabilizer.
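The construction can be traced on a small example. The sketch below assumes the planar system $\dot{x}_1 = x_1^2 + x_2$, $\dot{x}_2 = u$, the virtual control $\alpha_1(x_1) = -x_1^2 - x_1$ and $V_1(x_1) = x_1^2/2$; these choices, the gain k, the Euler step and all function names are illustrative assumptions, not taken from the text. For this example $\Delta_2(x_1, x_2) = x_1 + (2x_1 + 1)(x_1^2 + x_2)$.

```python
def alpha1(x1):
    """Virtual control stabilizing x1' = x1^2 + v (assumed example)."""
    return -x1 * x1 - x1


def delta2(x1, x2):
    """Term Delta_2 collected from the derivative of V for this example."""
    return x1 + (2.0 * x1 + 1.0) * (x1 * x1 + x2)


def feedback(x1, x2, k=1.0):
    """Feedback -Delta_2 - k (x2 - alpha1(x1))."""
    return -delta2(x1, x2) - k * (x2 - alpha1(x1))


def V(x1, x2):
    """V = V1 + (x2 - alpha1(x1))^2 / 2 with V1 = x1^2 / 2."""
    return 0.5 * x1 * x1 + 0.5 * (x2 - alpha1(x1)) ** 2


def V_dot(x1, x2, k=1.0):
    """Derivative of V along the closed loop, computed from the dynamics."""
    u = feedback(x1, x2, k)
    e = x2 - alpha1(x1)
    x1_dot = x1 * x1 + x2
    e_dot = u + (2.0 * x1 + 1.0) * x1_dot  # e' = x2' - alpha1'(x1) x1'
    return x1 * x1_dot + e * e_dot


def simulate(x1, x2, k=1.0, dt=1e-3, steps=10000):
    """Forward Euler integration of the closed loop."""
    for _ in range(steps):
        u = feedback(x1, x2, k)
        x1, x2 = x1 + dt * (x1 * x1 + x2), x2 + dt * u
    return x1, x2


if __name__ == "__main__":
    x1, x2 = 1.0, -0.5
    e = x2 - alpha1(x1)
    # The two numbers agree: V_dot = -x1^2 - k e^2 for this example.
    print(V_dot(x1, x2), -x1 * x1 - e * e)
    print(V(x1, x2), V(*simulate(x1, x2)))
```

Repeating the same step with $x_2$ in the role of the virtual control is what the recursive procedure does for longer chains.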

The result discussed above is at the basis of the
so-called backstepping technique for recursive
stabilization of systems described, for example,
by equations of the form

\[ \dot{x}_1 = x_2 + \varphi_1(x_1), \]
\[ \dot{x}_2 = x_3 + \varphi_2(x_1, x_2), \]
\[ \dot{x}_3 = x_4 + \varphi_3(x_1, x_2, x_3), \]
\[ \vdots \]
\[ \dot{x}_n = u + \varphi_n(x_1, \ldots, x_n), \]

with $x_i(t) \in \mathbb{R}$ for all $i \in [1, n]$, and $\varphi_i$ smooth
mappings such that $\varphi_i(0) = 0$, for all $i \in [1, n]$.

Feedforward Systems 

Consider a nonlinear system described by equations
of the form

\[ \dot{x}_1 = f_1(x_2), \qquad \dot{x}_2 = f_2(x_2) + g_2(x_2)u, \qquad (9) \]

with $x_1(t) \in \mathbb{R}$, $x_2(t) \in \mathbb{R}^n$, $u(t) \in \mathbb{R}$,
$f_1(0) = 0$ and $f_2(0) = 0$. This system belongs
to the so-called class of feedforward systems for
which, similarly to feedback systems, a sort of
reduction principle holds: the zero equilibrium
of the system is smoothly stabilizable if the zero
equilibrium of the reduced system $\dot{x}_2 = f_2(x_2)$
is globally asymptotically stable and some
additional structural assumption holds. To show
this property, suppose there exists a continuously
differentiable and radially unbounded function
$V_2 : \mathbb{R}^n \to \mathbb{R}$ such that $V_2(0) = 0$, $V_2(x_2) > 0$,
for all $x_2 \neq 0$, and

\[ \frac{\partial V_2}{\partial x_2} f_2(x_2) < 0, \]

for all $x_2 \neq 0$. Suppose, in addition, that there
exists a continuously differentiable mapping $M(x_2)$
such that

\[ f_1(x_2) - \frac{\partial M}{\partial x_2} f_2(x_2) = 0, \]

M(0) = 0 and $\left. \frac{\partial M}{\partial x_2} \right|_{x_2 = 0} \neq 0$. Existence of such
a mapping is guaranteed, for example, by asymptotic
stability of the linearization of the system
$\dot{x}_2 = f_2(x_2)$ around the origin and controllability
of the linearization of the system (9) around the
origin.

Consider now the function

\[ V(x_1, x_2) = \frac{1}{2} (x_1 - M(x_2))^2 + V_2(x_2), \]

which is radially unbounded and such that
V(0, 0) = 0 and $V(x_1, x_2) > 0$ for all nonzero
$(x_1, x_2)$, and note that

\[ \dot{V} = -(x_1 - M(x_2)) \frac{\partial M}{\partial x_2} g_2(x_2) u + \frac{\partial V_2}{\partial x_2} f_2(x_2) + \frac{\partial V_2}{\partial x_2} g_2(x_2) u \]
\[ \phantom{\dot{V}} = \frac{\partial V_2}{\partial x_2} f_2(x_2) + \left( \frac{\partial V_2}{\partial x_2} - (x_1 - M(x_2)) \frac{\partial M}{\partial x_2} \right) g_2(x_2) u. \]

As a result, the feedback

\[ \alpha(x_1, x_2) = -k \left( \frac{\partial V_2}{\partial x_2} - (x_1 - M(x_2)) \frac{\partial M}{\partial x_2} \right) g_2(x_2), \]

with k > 0, yields $\dot{V} < 0$ for all nonzero $(x_1, x_2)$;
hence, the feedback is a continuously differentiable
stabilizer for the zero equilibrium of the
system (9). Note, finally, that the function V is
a control Lyapunov function for the system (9); hence,
Sontag’s formula can be also used to construct a
stabilizer.

The result discussed above is at the basis of
the so-called forwarding technique for recursive
stabilization of systems described, for example,
by equations of the form

\[ \dot{x}_1 = \varphi_1(x_2, \ldots, x_n), \]
\[ \dot{x}_2 = \varphi_2(x_3, \ldots, x_n), \]
\[ \vdots \]
\[ \dot{x}_{n-1} = \varphi_{n-1}(x_n), \]
\[ \dot{x}_n = u, \]

with $x_i(t) \in \mathbb{R}$ for all $i \in [1, n]$, and $\varphi_i$ smooth
mappings such that $\varphi_i(0) = 0$, for all $i \in [1, n]$.


4>o(x) = h(x), 

<h(x,v 0 ) = -x- 
ax 

Vf(x) + g(x)u 0 ] , 
d(p 

<p 2 (x, Vq, Vi)=~P- [/(*) + g(x)u 0 ] 
ox 

301 


Stabilization via Output Feedback 

In the previous sections, we have studied the
stabilization problem for nonlinear systems
under the assumption that the whole state
is available for feedback. This requires
the online measurement of the state vector
$x$, which may pose a severe constraint in
applications. This observation motivates the
study of the much more challenging, but
more realistic, problem of stabilization with
partial state information. This problem requires
the introduction of a notion of observability.
Note that for nonlinear systems, it is
possible to define several, nonequivalent,
observability notions. Similarly to section “Control
Lyapunov Functions”, we focus on the class of
systems affine in the control, i.e., systems
described by equations of the form

$$\dot x = f(x) + g(x)u, \qquad y = h(x), \tag{10}$$


$$\phi_k(x, v_0, v_1, \dots, v_{k-1}) = \frac{\partial \phi_{k-1}}{\partial x}\,[f(x) + g(x)v_0] + \sum_{i=0}^{k-2} \frac{\partial \phi_{k-1}}{\partial v_i}\, v_{i+1}, \tag{11}$$

with $k \le n$. Note that if $u(t)$ is of class $C^{k-1}$,
then

$$y^{(k)}(t) = \phi_k(x(t), u(t), \dots, u^{(k-1)}(t)),$$

where the notation $y^{(k)}(t)$, with $k$ a positive integer,
is used to denote the $k$-th order derivative of
the function $y(t)$, provided it exists. The mappings
$\phi_0$ to $\phi_{n-1}$ can be collected into a unique
mapping $\Phi : \mathbb{R}^n \times \mathbb{R}^{n-1} \to \mathbb{R}^n$ defined as

$\Phi(x, v_0, v_1, \dots, v_{n-2}) =$


with $f : \mathbb{R}^n \to \mathbb{R}^n$, $g : \mathbb{R}^n \to \mathbb{R}^{n \times m}$ and
$h : \mathbb{R}^n \to \mathbb{R}^p$ smooth mappings. This is
precisely the class of systems in Eq. (7) with
the addition of the output map $h$, i.e., a map
which describes the information that is available
for feedback. In addition, we assume, to simplify
the notation, that $m = 1$ and $p = 1$: the system
is single input, single output. Finally assume,
without loss of generality, that the equilibrium
to be stabilized is $x_0 = 0$, that any stabilizing
state feedback control law $u = \alpha(x)$ is such that
$\alpha(0) = 0$, and that $h(0) = 0$.

To define the observability notion of interest,
consider the sequence of mappings


$$\begin{bmatrix} \phi_0(x) \\ \phi_1(x, v_0) \\ \vdots \\ \phi_{n-1}(x, v_0, v_1, \dots, v_{n-2}) \end{bmatrix}.$$

The mapping $\Phi$ is, by construction, such that

$$\Phi(x(t), u(t), \dot u(t), \dots, u^{(n-2)}(t)) = [\, y(t) \ \ \dot y(t) \ \ \cdots \ \ y^{(n-1)}(t) \,]^\top,$$

for any $t$ in which the indicated signals exist. As
a consequence, if the mapping $\Phi$ is such that, at
some point $(\bar x, \bar v)$, where $\bar v = [\, \bar v_0 \ \ \bar v_1 \ \ \cdots \ \ \bar v_{n-2} \,]^\top$,

$$\operatorname{rank} \frac{\partial \Phi}{\partial x}(\bar x, \bar v) = n, \tag{12}$$

then, by the Implicit Function Theorem, there
exists locally around $(\bar x, \bar v)$ a smooth mapping
$\Psi : \mathbb{R}^n \times \mathbb{R}^{n-1} \to \mathbb{R}^n$ such that

$$\omega = \Phi(\Psi(\omega, v), v),$$

i.e., the mapping $\Psi$ is the inverse of $\Phi$, parameterized
by $v$. We conclude the discussion noting
that if, at a certain time $\bar t$, $\bar x = x(\bar t)$ and
$\bar v = [\, u(\bar t) \ \ \dot u(\bar t) \ \ \cdots \ \ u^{(n-2)}(\bar t) \,]^\top$ are such that the rank
condition (12) holds, then the mapping $\Psi$ can be
used to reconstruct the state of the system from
measurements of the input and its derivatives,
and of the output and its derivatives:
$x(t) = \Psi(\omega(t), v(t))$ for all
$t$ in a neighborhood of $\bar t$, where

$$v(t) = [\, u(t) \ \ \dot u(t) \ \ \cdots \ \ u^{(n-2)}(t) \,]^\top, \qquad \omega(t) = [\, y(t) \ \ \dot y(t) \ \ \cdots \ \ y^{(n-1)}(t) \,]^\top. \tag{13}$$

This property is a local property: to derive a
property which allows a global reconstruction of
the state, we need to impose additional conditions.

Definition 2 Consider the system (10) with $m = p = 1$.
The system is said to be uniformly
observable if:

(i) The mapping $H : \mathbb{R}^n \to \mathbb{R}^n$ defined as


$$\omega = \Phi(\Psi(\omega, v), v), \qquad x = \Psi(\Phi(x, v), v)$$

hold for all $x$, $v$ and $\omega$. In principle, this property
may be used in an output feedback control architecture
obtained implementing a stabilizing state
feedback $u = \alpha(x)$ as $u = \alpha(\Psi(\omega, v))$, with $v$
and $\omega$ as given in (13). This implementation is
however not possible, since it gives an implicit
definition of $u$ and requires the exact differentiation
of the input and output signals.

To circumvent these difficulties, one needs
to follow a somewhat longer path, as described
hereafter. In addition, the global asymptotic stability
requirement should be replaced by a less
ambitious, yet practically meaningful, requirement:
semi-global asymptotic stability. This requirement
can be formalized as follows.

Definition 3 The equilibrium $x_0$ of the system
(1), or (7), is said to be semi-globally asymptotically
stabilizable if, for each compact set $\mathcal{K} \subset \mathbb{R}^n$
such that $x_0 \in \operatorname{int}(\mathcal{K})$, i.e. the set of all
interior points of $\mathcal{K}$, there exists a feedback control
law, possibly depending on $\mathcal{K}$, such that the
equilibrium $x_0$ is a locally asymptotically stable
equilibrium of the closed-loop system and for any
$x(0) \in \mathcal{K}$ one has $\lim_{t \to \infty} x(t) = x_0$.

To bypass the need for the derivatives of the
input signal, consider the extended system

$$\dot x = f(x) + g(x)v_0, \quad \dot v_0 = v_1, \quad \dot v_1 = v_2, \quad \cdots \quad \dot v_{n-1} = u. \tag{14}$$


$$H(x) = \begin{bmatrix} h(x) \\ L_f h(x) \\ \vdots \\ L_f^{n-1} h(x) \end{bmatrix}$$

is a global diffeomorphism. The functions
$L_f^i h$, with $i$ a nonnegative integer, are defined
as $L_f h(x) = L_f^1 h(x) = \frac{\partial h}{\partial x} f(x)$ and,
recursively, as $L_f^{i+1} h(x) = L_f(L_f^i h(x))$.

(ii) The rank condition (12) holds for all $(x, v) \in \mathbb{R}^n \times \mathbb{R}^{n-1}$.

The notion of uniform observability makes it possible to
perform a global reconstruction of the state, i.e.,
it makes sure that the identities


Note that, as described in section “Feedback
Systems”, if the equilibrium $x_0 = 0$ of the system
$\dot x = f(x) + g(x)u$ is globally asymptotically
stabilizable by a (smooth) feedback $u = \alpha(x)$,
then there exists a smooth state feedback
$u = \tilde\alpha(x, v_0, v_1, \dots, v_{n-1})$ which globally
asymptotically stabilizes the zero equilibrium
of the system (14). In the feedback $\tilde\alpha$, one can
replace $x$ with $\Psi(\omega, v)$, thus yielding a feedback
of the measurable part of the state of the system
(14) and of the output $y$ and its derivatives. Note
that if $\omega(t) = [\, y(t) \ \ \dot y(t) \ \ \cdots \ \ y^{(n-1)}(t) \,]^\top$ then
the feedback $u = \tilde\alpha(\Psi(\omega, v), v_0, v_1, \dots, v_{n-1})$
globally asymptotically stabilizes the zero
equilibrium of the system (14).







To avoid the computation of the derivatives
of $y$, we exploit the uniform observability
property, which implies that the auxiliary
system

$$\dot\eta_0 = \eta_1,$$
$$\dot\eta_1 = \eta_2,$$
$$\vdots$$
$$\dot\eta_{n-1} = \phi_n(\Psi(\eta, v), v_0, v_1, \dots, v_{n-1}), \tag{15}$$

with $\eta = [\, \eta_0, \eta_1, \dots, \eta_{n-1} \,]^\top$, has the property
of reproducing $y(t)$ and its derivatives up to
$y^{(n-1)}(t)$ if properly initialized. This initialization
is not feasible, since it requires the knowledge
of the derivatives of the output at $t = 0$.
Nevertheless, the auxiliary system (15) can be
modified to provide an estimate of $y$ and its
derivatives. The modification is obtained adding
a linear correction term, yielding the system

$$\dot\eta_0 = \eta_1 + L c_{n-1}(y - \eta_0),$$
$$\dot\eta_1 = \eta_2 + L^2 c_{n-2}(y - \eta_0),$$
$$\vdots$$
$$\dot\eta_{n-1} = \phi_n(\Psi(\eta, v), v_0, v_1, \dots, v_{n-1}) + L^n c_0 (y - \eta_0), \tag{16}$$

with $L > 0$ and the coefficients $c_0, \dots, c_{n-1}$ such
that all roots of the polynomial $\lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_1 \lambda + c_0$
are in $\mathbb{C}^-$. The system (16) has the
ability to provide asymptotic estimates of $y$ and
its derivatives up to $y^{(n-1)}$ provided these are
bounded and the gain $L$ is selected sufficiently
large, i.e., the system (16) is a semi-global observer
of $y$ and its derivatives up to $y^{(n-1)}$.
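
The behavior of the observer (16) can be illustrated numerically on a small example. The sketch below is a minimal illustration, assuming a hypothetical plant $\dot x_1 = x_2$, $\dot x_2 = -x_1$ with output $y = x_1$, no input, and $x(0) = (0, 1)$, so that $y(t) = \sin t$, $\Psi(\eta) = \eta$ and $\phi_2 = -\eta_0$. The plant, the gain $L = 10$, the coefficients $c_0 = 1$, $c_1 = 2$ (roots of $\lambda^2 + 2\lambda + 1$ at $-1$) and the Runge-Kutta integrator are all assumptions made for the illustration.

```python
import math


def observer_rhs(t, eta, L, c0, c1):
    """Right-hand side of (16) for n = 2 and the assumed plant."""
    y = math.sin(t)                      # measured output y(t) of the assumed plant
    e = y - eta[0]                       # correction signal y - eta_0
    d_eta0 = eta[1] + L * c1 * e
    d_eta1 = -eta[0] + L ** 2 * c0 * e   # phi_2 = -eta_0 for this plant
    return (d_eta0, d_eta1)


def run_observer(L=10.0, c0=1.0, c1=2.0, T=5.0, dt=1e-3):
    """Integrate the observer from eta(0) = 0 with a classical RK4 scheme."""
    eta = (0.0, 0.0)
    for k in range(int(round(T / dt))):
        t = k * dt
        k1 = observer_rhs(t, eta, L, c0, c1)
        k2 = observer_rhs(t + dt / 2,
                          (eta[0] + dt / 2 * k1[0], eta[1] + dt / 2 * k1[1]),
                          L, c0, c1)
        k3 = observer_rhs(t + dt / 2,
                          (eta[0] + dt / 2 * k2[0], eta[1] + dt / 2 * k2[1]),
                          L, c0, c1)
        k4 = observer_rhs(t + dt,
                          (eta[0] + dt * k3[0], eta[1] + dt * k3[1]),
                          L, c0, c1)
        eta = (eta[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
               eta[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return eta
```

In this example `run_observer()` returns estimates of $y(5)$ and $\dot y(5)$, which can be compared with $\sin 5$ and $\cos 5$; the observer is deliberately started from a wrong initial condition to show that the correction term removes the initialization problem.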

The closed-loop system obtained using the
feedback law $u = \tilde\alpha(\Psi(\eta, v), v_0, v_1, \dots, v_{n-1})$
has a locally asymptotically stable equilibrium at
the origin. To achieve semi-global stability, one
has to select $L$ sufficiently large and replace $\Psi$
with

$$\tilde\Psi(\eta, v) = \begin{cases} \Psi(\eta, v), & \text{if } \|\Psi(\eta, v)\| < M, \\[4pt] M \dfrac{\Psi(\eta, v)}{\|\Psi(\eta, v)\|}, & \text{if } \|\Psi(\eta, v)\| \ge M, \end{cases}$$

with $M > 0$ to be selected, as detailed in the
following statement.
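
The saturation applied to the state estimate is a plain norm clipping. A minimal sketch, assuming the estimate is available as a list of floats (the function name and the data representation are illustrative, not part of the source):

```python
import math


def saturate(psi, M):
    """Return psi if its Euclidean norm is below M, else rescale it to norm M."""
    norm = math.sqrt(sum(p * p for p in psi))
    if norm < M:
        return list(psi)
    return [M * p / norm for p in psi]
```

The clipping leaves the estimate unchanged inside the ball of radius $M$, so it does not alter the local behavior near the origin, while it bounds the effect of large transient estimation errors on the control.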

Theorem 3 Consider the system (10) with $m = p = 1$.
Let $x_0 = 0$ be an achievable equilibrium.
Assume $h(0) = 0$. Suppose the system is uniformly
observable and there exists a smooth state
feedback control law $u = \alpha(x)$ which globally
asymptotically stabilizes the zero equilibrium and
it is such that $\alpha(0) = 0$.

Then for each $R > 0$, there exist $\tilde R > 0$ and
$M^* > 0$, and for each $M > M^*$, there exists $L^*$
such that for each $M > M^*$ and $L > L^*$, the
dynamic output feedback control law

$$\dot v_0 = v_1,$$
$$\dot v_1 = v_2,$$
$$\vdots$$
$$\dot v_{n-1} = \tilde\alpha(\tilde\Psi(\eta, v), v_0, v_1, \dots, v_{n-1}),$$
$$\dot\eta_0 = \eta_1 + L c_{n-1}(y - \eta_0),$$
$$\dot\eta_1 = \eta_2 + L^2 c_{n-2}(y - \eta_0),$$
$$\vdots$$
$$\dot\eta_{n-1} = \phi_n(\tilde\Psi(\eta, v), v_0, v_1, \dots, v_{n-1}) + L^n c_0 (y - \eta_0),$$
$$u = v_0,$$

yields a closed-loop system with the following
properties:

• The zero equilibrium of the system is locally
asymptotically stable.

• For any $x(0)$, $v(0)$ and $\eta(0)$ such that
$\|x(0)\| < R$ and $\|(v(0), \eta(0))\| < \tilde R$, the
trajectories of the closed-loop system are such
that

$$\lim_{t \to \infty} x(t) = 0, \qquad \lim_{t \to \infty} v(t) = 0, \qquad \lim_{t \to \infty} \eta(t) = 0.$$
The foregoing result can be informally formulated
as follows: global state feedback stabilizability
and uniform observability imply semi-global
stabilizability by output feedback. This
can be regarded as a nonlinear enhancement of
the so-called separation principle for the stabilization,
by output feedback, of linear systems.
Note, finally, that a global version of the separation
principle can be derived under one additional
assumption: the existence of an estimator of the
norm of the state $x$.

Summary and Future Directions 

A necessary condition and a sufficient condition
for stabilizability of an equilibrium of a nonlinear
system have been given, together with two systematic
design methods. The necessary condition
allows one to rule out, using a simple algebraic test,
the existence of continuous stabilizers, whereas the
sufficient condition provides a link with classical
Lyapunov theory. In addition, the problem of
semi-global stability by dynamic output feedback
has been discussed in detail. Several issues have
not been discussed, including the use of discontinuous,
hybrid, and time-varying feedbacks;
stabilization by static output feedback and dynamic
state feedback; and robust stabilization. Note finally
that similar considerations can be carried out for
nonlinear discrete-time systems.

Cross-References 

► Controllability and Observability 

► Fundamental Limitation of Feedback Control 

► Input-to-State Stability 

► Linear State Feedback 

► Lyapunov’s Stability Theory 

► Lyapunov Methods in Power System Stability 

► Observers for Nonlinear Systems 

► Observers in Linear Systems Theory 

► Power System Voltage Stability 

► Small Signal Stability in Electric Power Systems

► Stability and Performance of Complex Systems 
Affected by Parametric Uncertainty 

► Stability: Lyapunov, Linear Systems 


Recommended Reading 

Classical references on stabilization for nonlinear 
systems and on recent research directions are 
given below. 
Abstract

Mathematical finance has been an important part of
applied mathematics since the 1980s. At the
beginning, the main goal was to price derivative
products and to provide hedging strategies.
Nowadays, the goal is to provide models
for prices and interest rates such that a better
calibration of parameters can be done. In these
pages, we present some basic models. Details can
be found in Musiela and Rutkowski (2005).

Keywords 

Affine process; Brownian process; Default times;
Lévy process; Wishart distribution

Models for Prices of Stocks 

The first model of prices was elaborated by Louis
Bachelier in his thesis (1900). The idea was
that the dynamics of prices have a trend, perturbed
by a noise. For this noise, Bachelier set the
fundamental properties of the Brownian motion.
Bachelier’s prices were of the form

$$S_t^B = S_0^B + \nu t + \sigma W_t$$

where $W$ is a Brownian motion.

This model, where prices can take negative
values, was changed by taking the exponential
as in the celebrated Black-Scholes-Merton model
(BSM) where

$$S_t = e^{S_t^B} = S_0 \exp(\nu t + \sigma W_t).$$

Only two constant parameters were needed; the
coefficient $\sigma$ is called the volatility. From Itô’s
formula, the dynamics of the BSM price are

$$dS_t = S_t(\mu\,dt + \sigma\,dW_t)$$

where $\mu = \nu + \frac{1}{2}\sigma^2$. The interest rate is assumed
to be a constant $r$.

The price of a derivative product of payoff
$H \in \mathcal{F}_T = \sigma(S_s, s \le T)$ is obtained using a
hedging procedure. One proves that there exists a
(self-financing) portfolio with value $V$, investing
in the savings account and in the stock $S$, with
terminal value $H$, i.e., $V_T = H$. The price of
$H$ at time $t$ is $V_t$. Using that methodology, the
price of a European call with strike $K$ (i.e., for
$H = (S_T - K)^+$) is

$$V_t = BS(\sigma, K)_t := S_t\, \mathcal{N}(d_1) - K e^{-r(T-t)}\, \mathcal{N}(d_2)$$

where $\mathcal{N}$ is the cumulative distribution function
of a standard Gaussian law and

$$d_1 = \frac{1}{\sigma\sqrt{T-t}} \ln\!\left(\frac{S_t}{K e^{-r(T-t)}}\right) + \frac{1}{2}\sigma\sqrt{T-t}, \qquad d_2 = d_1 - \sigma\sqrt{T-t}.$$

Note that the coefficient $\mu$ plays no role in this
pricing methodology. This formula opened the
door to the notion of risk neutral probability measure:
the price of the option (or of any derivative
product) is the expectation under the unique probability
measure $\mathbb{Q}$, equivalent to $\mathbb{P}$, such that the
discounted price $S_t e^{-rt}$, $t \ge 0$, is a $\mathbb{Q}$-martingale.
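
The call price formula is straightforward to evaluate numerically. The following is a minimal sketch using only the standard library; the function and argument names are illustrative, and `tau` stands for the time to maturity $T - t$.

```python
import math


def norm_cdf(x):
    """Cumulative distribution function of the standard Gaussian law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S, K, r, sigma, tau):
    """BSM price of a European call with strike K and time to maturity tau > 0."""
    sqrt_tau = math.sqrt(tau)
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt_tau)
    d2 = d1 - sigma * sqrt_tau
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)
```

For instance, `bs_call(100.0, 100.0, 0.05, 0.2, 1.0)` is approximately 10.45. The drift $\mu$ does not appear among the arguments, in line with the remark that it plays no role in the pricing.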

For about a decade, the financial market was
quite smooth, and this simple model was effective
for pricing derivative products and calibrating the
coefficients from the data. After the Black Monday
of October 1987, the model was recognized
to suffer from some weaknesses, and the door was fully
open to more sophisticated models, with more
parameters. In particular, the smile effect was
seen in the data: were the BSM formula true,
the price of the option would be a deterministic
function of the fixed parameter $\sigma$ and of
$K$ (the other parameters, such as maturity, underlying
price, and interest rate, being fixed), and one would
obtain, using $\psi(\cdot, K)$, the inverse of $BS(\cdot, K)$, the
constant $\sigma = \psi(C^o(K), K)$, where $C^o(K)$ is the
observed price associated with the strike $K$. This
is not the case: the curve $K \to \psi(C^o(K), K)$
has a smile shape (or a smirk shape). The
BSM model is still used as a benchmark through the
concept of implied volatility: for a given observed
option price $C^o(K, T)$ (with strike $K$ and maturity
$T$), one can find the value $\sigma^*$ such
that $BS(\sigma^*, K, T) = C^o(K, T)$. The surface
$\sigma^*(K, T)$ is called the implied volatility surface
and plays an important role in calibration issues.
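
Since the BSM call price is increasing in the volatility, the implied volatility can be computed by a one-dimensional root search. The sketch below uses bisection; the bracketing interval, the iteration count and all names are illustrative assumptions, and the source does not prescribe a particular inversion method.

```python
import math


def norm_cdf(x):
    """Cumulative distribution function of the standard Gaussian law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S, K, r, sigma, tau):
    """BSM price of a European call with strike K and time to maturity tau > 0."""
    sqrt_tau = math.sqrt(tau)
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt_tau)
    d2 = d1 - sigma * sqrt_tau
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


def implied_vol(price, S, K, r, tau, lo=1e-6, hi=5.0, n_iter=100):
    """Bisection on sigma so that bs_call(S, K, r, sigma, tau) matches price.

    Assumes the observed price lies between the model prices at lo and hi.
    """
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, r, mid, tau) < price:
            lo = mid   # model price too low: volatility must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Applying `implied_vol` to observed prices over a grid of strikes and maturities gives a discrete version of the implied volatility surface.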

Due to the need for more accurate formulas,
many models were presented, the only (mathematical)
restriction being that prices have to be
semi-martingales (to avoid arbitrage opportunities).

A first class is the stochastic volatility models.
Assuming that the diffusion coefficient (called
the local volatility) is a deterministic function of
time and underlying, i.e.,

$$dS_t = S_t(\mu\,dt + \sigma(t, S_t)\,dW_t)$$

Dupire proved, using the Kolmogorov backward
equation, that the function $\sigma$ is determined by the
observed prices of call options by

$$\frac{1}{2} K^2 \sigma^2(T, K) = \frac{\partial_T C^o(K, T) + rK\,\partial_K C^o(K, T)}{\partial^2_{KK} C^o(K, T)}$$

where $\partial_T$ (resp. $\partial_K$) is the partial derivative operator
with respect to the maturity (resp., the strike).
However, this important model (which allows
hedging for derivative products) does not allow
a full calibration of the volatility surface.

A second class consists of assuming that the
volatility is a stochastic process. The first example
is the Heston model which assumes that

$$dS_t = S_t(\mu\,dt + \sqrt{v_t}\,dW_t)$$

$$dv_t = -\lambda(v_t - \bar v)\,dt + \eta\sqrt{v_t}\,dB_t$$

where $B$ and $W$ are two Brownian motions with
correlation factor $\rho$. This model is generalized
by Gourieroux and Sufana (2003) as the Wishart
model, where the risky asset $S$ is a $d$-dimensional
process, with matrix of quadratic variation $\Sigma$
satisfying

$$dS_t = \operatorname{Diag}(S_t)(\mu\,dt + \sqrt{\Sigma_t}\,dW_t)$$

$$d\Sigma_t = (AA^\top + M\Sigma_t + \Sigma_t M^\top)\,dt + \sqrt{\Sigma_t}\,dB_t\,Q + Q^\top (dB_t)^\top \sqrt{\Sigma_t}$$

where $W$ is a $d$-dimensional Brownian motion
and $B$ a $(d \times d)$ matrix Brownian motion, $A$, $M$, $Q$ are
$(d \times d)$ matrices, $M$ is negative semidefinite,
and $AA^\top = \beta QQ^\top$ with $\beta > d - 1$ to
ensure strict positivity. The efficiency of these
models was checked using calibration methodology;
however, the main idea of hedging for
pricing issues is forgotten: there is, in general,
no hedging strategy, even for common derivative
products, and the validation of the model
(the form of the volatility, since the drift term
plays no role in pricing) is done by calibration.
The risk neutral probability is no longer
unique.


Interest Rate Models 


At the beginning of the 1970s, specific attention
was paid to interest rate modeling, a constant
interest rate being far from the real world.

A first class consists of a dynamic for the
instantaneous interest rate $r$.

Vasicek suggested an Ornstein-Uhlenbeck diffusion,
i.e., $dr_t = a(b - r_t)\,dt + \sigma\,dW_t$, the
solution being a Gaussian process of the form

$$r_t = (r_0 - b)e^{-at} + b + \sigma \int_0^t e^{-a(t-u)}\,dW_u.$$

This model is fully tractable; one of its weaknesses
is that the interest rate can take negative values.
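
Because the solution is Gaussian with known mean and variance, the Vasicek rate can be sampled exactly on a time grid. The sketch below is a minimal illustration; the function names, the grid and the use of the standard `random.Random` generator are assumptions made for the example.

```python
import math
import random


def vasicek_mean(r0, a, b, t):
    """Mean of r_t: (r0 - b) exp(-a t) + b."""
    return (r0 - b) * math.exp(-a * t) + b


def simulate_vasicek(r0, a, b, sigma, T, n_steps, rng):
    """Sample one path of r on a uniform grid using the exact Gaussian transition."""
    dt = T / n_steps
    decay = math.exp(-a * dt)
    # Standard deviation of the stochastic integral over one step of length dt.
    std = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * a))
    path = [r0]
    for _ in range(n_steps):
        path.append(b + (path[-1] - b) * decay + std * rng.gauss(0.0, 1.0))
    return path
```

A call such as `simulate_vasicek(0.03, 0.5, 0.05, 0.02, 10.0, 1000, random.Random(0))` returns a path of 1001 values; nothing in the scheme prevents the rate from becoming negative, which is the weakness mentioned above.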

Cox, Ingersoll, and Ross (CIR) studied
the square root process

$$dr_t = a(b - r_t)\,dt + \sigma\sqrt{r_t}\,dB_t,$$

where $ab > 0$ (so that $r$ is nonnegative). No
closed form for $r$ is known; however, a closed form
for the price of the associated zero coupons is
known (see below in Affine Processes).

A second class is the celebrated Heath-Jarrow-Morton
model (HJM): the starting point being to
model the price of a zero coupon with maturity
$T$ (i.e., an asset which pays one monetary unit at
time $T$, these prices being observable) in terms of
the instantaneous forward rate $f(t, T)$, as

$$B(t, T) = \exp\left(-\int_t^T f(t, u)\,du\right).$$

Assuming that

$$df(t, T) = \alpha(t, T)\,dt + \sigma(t, T)\,dW_t$$

one finds

$$dB(t, T) = B(t, T)\big(a(t, T)\,dt + b(t, T)\,dW_t\big)$$

where the relationship between $a, b$ and $\alpha, \sigma$ is
known. The instantaneous interest rate is $r_t = f(t, t)$.
This model is quite efficient; however,
no conditions are known in order that the interest
rate is positive.


Models with Jumps

Lévy Processes

Following the idea to produce tractable models
to fit the data, many models are now based on
Lévy processes. These models are used for
modeling prices as $S_t = e^{X_t}$ where $X$ is a
Lévy process. They present a nice feature: even
if closed forms for pricing are not known, numerical
methods are efficient. One of the most
popular is the Carr-Geman-Madan-Yor (CGMY)
model, which is a Lévy process without Gaussian
component and with Lévy density

$$\frac{C}{x^{Y+1}}\, e^{-Mx}\, \mathbb{1}_{\{x > 0\}} + \frac{C}{|x|^{Y+1}}\, e^{Gx}\, \mathbb{1}_{\{x < 0\}}$$

with $C > 0$, $M > 0$, $G > 0$, and $Y < 2$.

These models are often presented as a special
case of a change of time: roughly speaking,
any semi-martingale is a time-changed Brownian
motion, and many examples are constructed as
$S_t = B_{A_t}$, where $B$ is a Brownian motion and
$A$ an increasing process (the change of time),
chosen independent of $B$ (for simplicity) and $A$
a Lévy process (for computational issues).

Affine Processes

Affine models were introduced by Duffie et al.
(2003) and are now a standard tool for modeling
stock prices, or interest rates. An affine process
enjoys the property that, for any affine function $g$,

$$E\left(\exp\left(uX_T + \int_t^T g(X_s)\,ds\right) \Big|\, \mathcal{F}_t\right) = \exp\big(\alpha(t, T)X_t + \beta(t, T)\big)$$

where $\alpha$ and $\beta$ are deterministic solutions of
PDEs.

A class of affine processes $X$ ($\mathbb{R}^n$-valued) is
the one where

$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t + dZ_t$$

where the drift vector $b$ is an affine function of
$x$, the covariance matrix $\sigma(x)\sigma^\top(x)$ is an affine
function of $x$, $W$ is an $n$-dimensional Brownian
motion, and $Z$ is a pure jump process whose
jumps have a given law $\nu$ on $\mathbb{R}^n$ and arrive with
intensity $f(X_t)$ where $f$ is an affine function. An
example is the CIR model (without jumps), where
one can find the price of a zero coupon as

$$E\left(\exp\left(-\int_t^T r_s\,ds\right) \Big|\, \mathcal{F}_t\right) = \Phi(T - t)\exp[-r_t \Psi(T - t)]$$

where

$$\Psi(s) = \frac{2(e^{\gamma s} - 1)}{(\gamma + a)(e^{\gamma s} - 1) + 2\gamma},$$

$$\Phi(s) = \left(\frac{2\gamma\, e^{(\gamma + a)s/2}}{(\gamma + a)(e^{\gamma s} - 1) + 2\gamma}\right)^{2ab/\sigma^2},$$

$$\gamma^2 = a^2 + 2\sigma^2.$$
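
The closed-form CIR zero-coupon price is easy to evaluate. The sketch below assumes the standard form of the CIR bond formula, with exponent $2ab/\sigma^2$ on $\Phi$ and the factor $2\gamma e^{(\gamma+a)s/2}$ in its numerator; the function and parameter names are illustrative.

```python
import math


def cir_zero_coupon(r_t, a, b, sigma, s):
    """Price at time t of a zero coupon maturing at t + s in the CIR model.

    Assumes dr = a (b - r) dt + sigma sqrt(r) dB with sigma > 0.
    """
    gamma = math.sqrt(a ** 2 + 2.0 * sigma ** 2)
    growth = math.exp(gamma * s) - 1.0
    denom = (gamma + a) * growth + 2.0 * gamma
    psi = 2.0 * growth / denom
    phi = (2.0 * gamma * math.exp((gamma + a) * s / 2.0) / denom) ** (2.0 * a * b / sigma ** 2)
    return phi * math.exp(-r_t * psi)
```

As a sanity check, for very small $\sigma$ the price should approach the deterministic discount factor $\exp\big(-bs - (r_t - b)(1 - e^{-as})/a\big)$ obtained by integrating the noiseless rate.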


Models for Defaults 

At the end of the 1990s, new kinds of financial products
appeared on the market: defaultable options,
defaultable zero coupons, and credit derivatives,
such as CDOs. Then, particular attention was paid
to the modeling of default times; see Bielecki and
Rutkowski (2001). The most popular model for
a single default is the reduced form approach,
which is based on the knowledge of the intensity
rate process. Given a nonnegative process $\lambda$,
adapted with respect to a reference filtration $\mathbb{F}$,
a random time $\tau$ is defined so that

$$\mathbb{P}(\tau > t \mid \mathcal{F}_\infty) = \exp\left(-\int_0^t \lambda_s\,ds\right).$$

This implies that $\mathbb{1}_{\tau \le t} - \int_0^{t \wedge \tau} \lambda_u\,du$ is a martingale
(in the smallest filtration $\mathbb{G}$ which contains $\mathbb{F}$ and
makes $\tau$ a random time). This intensity rate $\lambda$
plays the role of the interest spread due to the
equality, for any $Y \in \mathcal{F}_T$ and $\mathbb{F}$-adapted interest
rate $r$,

$$E\left(Y\, \mathbb{1}_{T < \tau} \exp\left(-\int_t^T r_s\,ds\right) \Big|\, \mathcal{G}_t\right) = \mathbb{1}_{t < \tau}\, E\left(Y \exp\left(-\int_t^T (r_s + \lambda_s)\,ds\right) \Big|\, \mathcal{F}_t\right).$$
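
One standard way to build a random time with a given intensity, assumed here for illustration since the text does not spell out a construction, is to set $\tau = \inf\{t : \int_0^t \lambda_s\,ds \ge \Theta\}$ with $\Theta$ a unit exponential variable independent of $\mathcal{F}_\infty$; this gives the survival probability displayed above. The sketch below uses a deterministic intensity function and a simple Riemann sum, both of which are simplifying assumptions.

```python
import random


def default_time(intensity, threshold, dt=1e-4, t_max=100.0):
    """First grid time at which the cumulated intensity reaches the threshold.

    `intensity` is a function of time (a deterministic stand-in for the
    intensity process); returns infinity if no default occurs before t_max.
    """
    t, cumulated = 0.0, 0.0
    while t < t_max:
        cumulated += intensity(t) * dt
        t += dt
        if cumulated >= threshold:
            return t
    return float("inf")


def sample_default_time(intensity, rng, dt=1e-4, t_max=100.0):
    """Draw a default time using a unit exponential threshold."""
    return default_time(intensity, rng.expovariate(1.0), dt, t_max)
```

With a constant intensity, `sample_default_time` returns (up to the grid step) an exponential random variable whose parameter is that intensity.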

The real challenge is to model multi-defaults.
Defaults are assumed to occur at times $\tau_i$, and
one has to describe the (conditional) joint law of
the vector $\tau = (\tau_1, \tau_2, \dots, \tau_n)$, the number $n$ of
defaults being quite large (100). A first step is to
study the law of the defaults, i.e.,

$$\mathbb{P}(\tau_1 > t_1, \dots, \tau_n > t_n)$$

which is performed using the classical copula approach,
based on the knowledge of the marginal
laws of the $\tau_i$, i.e., on the knowledge of
$\mathbb{P}(\tau_i \le s) =: F_i(s)$ where the cumulative distribution
functions $F_i$ are assumed invertible. The simplest
and most popular copula is the Gaussian one

$$\mathbb{P}(F_i(\tau_i) \le u_i,\ i = 1, \dots, n) = \mathcal{N}^n_\Sigma\big(\mathcal{N}^{-1}(u_1), \dots, \mathcal{N}^{-1}(u_n)\big),$$

where $\mathcal{N}^n_\Sigma$ is the c.d.f. for the $n$-variate central
normal distribution with the linear correlation
matrix $\Sigma$ and $\mathcal{N}^{-1}$ is the inverse of the c.d.f.
for the univariate standard normal distribution.
However, a dynamical model was needed, mainly
to study the contagion effect, i.e., how the occurrence
of a default affects the probability of
occurrence of the next defaults.

A first class of models is based on the intensity:
starting from a given family of intensities
$(\lambda^i, i = 1, \dots, n)$ which satisfies

$$d\lambda^i_t = f(t, \lambda^i_t)\,dt + g(t, \lambda^i_t)\,dW_t + dL^{(-i)}_t$$

where $L^{(-i)}_t = \sum_{k \neq i} \beta_k \mathbb{1}_{\tau_k \le t}$, one constructs
random times having the given intensities.

Another class of models, which allows for
common jumps, is based on Markov chains with
an absorbing state, the default time of the $i$-th entity
being the time where the $i$-th component of
the Markov chain enters the absorbing state.
These models are efficient due to the introduction
of common factors.

Many studies are done to find a form of
a dynamic copula, i.e., a family of processes
$G_t(\theta)$, $t \ge 0$, such that

$$G_t(\theta) = \mathbb{P}(\tau_1 > \theta_1, \dots, \tau_n > \theta_n \mid \mathcal{F}_t).$$

Some examples where, for any fixed $t$, the
quantity $G_t(\cdot)$ is a (stochastic) Gaussian law are
known; however, there are few concrete examples
for other cases.


Cross-References 

► Credit Risk Modeling 

► Option Games: The Interface Between Optimal 
Stopping and Game Theory 
Abstract

Mechanical flexibility in robot manipulators is
due to compliance at the joints and/or distributed
deflection of the links. Dynamic models of the
two classes of robots with flexible joints or flexible
links are presented, together with control
laws addressing the motion tasks of regulation
to constant equilibrium states and of asymptotic
tracking of output trajectories. Control design for
robots with flexible joints takes advantage of the
passivity and feedback linearization properties. In
robots with flexible links, basic differences arise
when controlling the motion at the joint level or
at the tip level.

Keywords 

Feedback linearization; Gravity compensation; 
Joint elasticity; Link flexibility; Noncausal and 
stable inversion; Regulation by motor feedback; 
Singular perturbation; Vibration damping 

Introduction 

Robot manipulators are usually considered as
rigid multi-body mechanical systems. This ideal
assumption simplifies dynamic analysis and
control design but may lead to performance
degradation and even unstable behavior, due to
the excitation of vibrational phenomena.

Flexibility is mainly due to the limited 
stiffness of transmissions at the joints (Sweet 
and Good 1985) and to the deflection of 
slender and lightweight links (Cannon and 
Schmitz 1984). Joint flexibility is common when 
motion transmission/reduction elements such 
as belts, long shafts, cables, harmonic drives, 
or cycloidal gears are used. Link flexibility is 
present in large articulated structures, such as 
very long arms needed for accessing hostile 
environments (deep sea or space) or automated 
crane devices for building construction. In 
both situations, static displacements and 
dynamic oscillations are introduced between 
the driving actuators and the actual position 
of the robot end effector. Undesired vibrations 
are typically confined beyond the closed- 
loop control bandwidth, but flexibility cannot 
be neglected when large speed/acceleration 
and high accuracy are requested by the 
task. 

In the dynamic modeling, flexibility is assumed
to be concentrated at the robot joints or distributed
along the robot links (most of the time
with some finite-dimensional approximation). In
both cases, additional generalized coordinates are
introduced beside those used to describe the rigid
motion of the arm in a Lagrangian formulation.
As a result, the number of available control inputs
is strictly less than the number of degrees of
freedom of the mechanical system. This type of
under-actuation, though counterbalanced by the
presence of additional potential energy helping
to achieve system controllability, suggests that
the design of satisfactory motion control laws is
harder than in the rigid case.

From a control point of view, different design
approaches are needed because of structural
differences arising between flexible-joint
and flexible-link robots. These differences hold
for single- or multiple-link robots, in the linear
or nonlinear domain, and depend on the physical
co-location or not of mechanical flexibility versus
control actuation, as well as on the choice of
controlled outputs.
In order to measure the state of flexible
robots for trajectory tracking control or feedback
stabilization purposes, a large variety of sensing
devices can be used, including encoders, joint
torque sensors, strain gauges, accelerometers,
and high-speed cameras. In particular, measuring
the full state of the system would require twice
as many sensors as in the rigid case for
robots with flexible joints and possibly more
for robots with flexible links. The design of
controllers that provably work well with a
reduced set of measurements is thus particularly
attractive.


Robots with Flexible Joints 

Dynamic Modeling 

A robot with flexible joints is modeled as an
open kinematic chain of $n + 1$ rigid bodies,
interconnected by $n$ joints undergoing deflection
and actuated by $n$ electrical motors. Let $\theta$ be
the $n$-vector of motor (i.e., rotor) positions, as
reflected through the reduction gears, and $q$ the
$n$-vector of link positions. The joint deflection is
$\delta = \theta - q \not\equiv 0$. The standard assumptions are:

A1 Joint deflections $\delta$ are small, limited to
the domain of linear elasticity. The elastic
torques due to joint deformations are
$\tau_J = K(\theta - q)$, where $K$ is the positive
definite, diagonal joint stiffness matrix.

A2 The rotors of the electrical motors are modeled
as uniform bodies having their center of
mass on the rotation axis.

A3 The angular velocity of the rotors is due only
to their own spinning.

The last assumption, introduced by Spong
(1987), is very reasonable for large reduction
ratios and also crucial for simplifying the
dynamic model.

From the gravity and elastic potential energy,
$U = U_g + U_\delta$, and the kinetic energy $T$ of the
robot, applying the Euler-Lagrange equations to
the Lagrangian $\mathcal{L} = T - U$ and neglecting all
dissipative effects leads to the dynamic model

$$M(q)\ddot q + n(q, \dot q) + K(q - \theta) = 0 \tag{1}$$

$$B\ddot\theta + K(\theta - q) = \tau, \tag{2}$$

where $M(q)$ is the positive definite, symmetric
inertia matrix of the robot links (including the
motor masses); $n(q, \dot q)$ is the sum of Coriolis
and centrifugal terms $c(q, \dot q)$ (quadratic in $\dot q$)
and gravitational terms $g(q) = (\partial U_g / \partial q)^\top$;
$B$ is the positive definite, diagonal matrix of
motor inertias (reflected through the gear ratios);
and $\tau$ are the motor torques (performing work
on $\theta$). The inertia matrix of the complete system
is then $\mathcal{M}(q) = \text{block diag}\{M(q), B\}$.
The two $n$-dimensional second-order differential
equations (1) and (2) are referred to as the link
and the motor equations, respectively. When the
joint stiffness $K \to \infty$, it is $\theta \to q$ and
$\tau_J \to \tau$, so that the two equations collapse in
the limit into the standard dynamic model of rigid
robots with total inertia $\mathcal{M}(q) = M(q) + B$.
On the other hand, when the joint stiffness $K$ is
relatively large but still finite, robots with elastic
joints show a two-time-scale dynamic behavior.
A common large scalar factor $1/\epsilon^2 \gg 1$ can be
extracted from the diagonal stiffness matrix as
$K = \hat K/\epsilon^2$. The slow subsystem is associated
to the link dynamics

$$M(q)\ddot q + n(q, \dot q) = \tau_J, \tag{3}$$

while the fast subsystem takes the form

$$\epsilon^2 \ddot\tau_J = \hat K\big(B^{-1}(\tau - \tau_J) + M^{-1}(q)(n(q, \dot q) - \tau_J)\big). \tag{4}$$

For small $\epsilon$, Eqs. (3) and (4) represent a singularly
perturbed system. The two separate time scales
governing the slow and fast dynamics are $t$ and
$\sigma = t/\epsilon$.

Regulation 

The basic robotic task of moving between two
arbitrary equilibrium configurations is realized
by a feedback control law that asymptotically
stabilizes the desired robot state.

In the absence of gravity ($g \equiv 0$), the equilibrium
states are parameterized by the desired
reference position $q_d$ of the links and take the
form $q = q_d$, $\theta = \theta_d = q_d$ (with no joint
deflection at steady state) and $\dot q = \dot\theta = 0$. As
a result of passivity of the mapping from $\tau$ to $\dot\theta$,
global regulation is achieved by a decentralized
PD law using only feedback from the motor
variables,

$$\tau = K_P(\theta_d - \theta) - K_D\dot\theta, \tag{5}$$

with diagonal $K_P > 0$ and $K_D > 0$.
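
A minimal simulation sketch of the motor PD law (5) on a single flexible joint without gravity and without Coriolis terms ($n \equiv 0$) is given below. The scalar model and every numerical value (inertias, stiffness, gains, step size) are illustrative assumptions chosen only so that the closed loop is well damped.

```python
def rk4_step(f, s, dt):
    """One classical Runge-Kutta step for the autonomous system s' = f(s)."""
    k1 = f(s)
    k2 = f(tuple(x + 0.5 * dt * k for x, k in zip(s, k1)))
    k3 = f(tuple(x + 0.5 * dt * k for x, k in zip(s, k2)))
    k4 = f(tuple(x + dt * k for x, k in zip(s, k3)))
    return tuple(x + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(s, k1, k2, k3, k4))


def simulate_pd(q_d=1.0, M=1.0, B=1.0, K=10.0, KP=10.0, KD=6.0, T=20.0, dt=1e-3):
    """Single-link flexible-joint robot under motor PD feedback, state (q, dq, theta, dtheta)."""
    def closed_loop(s):
        q, dq, theta, dtheta = s
        tau = KP * (q_d - theta) - KD * dtheta   # PD law (5), theta_d = q_d without gravity
        ddq = -K * (q - theta) / M               # link equation (1) with n = 0
        ddtheta = (tau - K * (theta - q)) / B    # motor equation (2)
        return (dq, ddq, dtheta, ddtheta)

    state = (0.0, 0.0, 0.0, 0.0)
    for _ in range(int(round(T / dt))):
        state = rk4_step(closed_loop, state, dt)
    return state
```

Although the link itself is undamped and only motor variables are fed back, both the link position and the motor position settle at $q_d$ in this example, illustrating the regulation property stated above.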

In the presence of gravity, the (unique)
equilibrium position of the motor associated
with a desired link position $q_d$ becomes
$\theta_d = q_d + K^{-1}g(q_d)$. Global regulation is obtained
by adding an extra gravity-dependent term $\tau_g$ to
the PD control law (5),

$$\tau = K_P(\theta_d - \theta) - K_D\dot\theta + \tau_g, \tag{6}$$

with diagonal matrices $K_P > 0$ (at least) and
$K_D > 0$. The term $\tau_g$ needs to match the gravity
load $g(q_d)$ at steady state. The following choices
are of slightly increasing control complexity, with
progressively better transient performance.

• Constant gravity compensation: $\tau_g = g(q_d)$.
Global regulation is achieved when the smallest
positive gain in the diagonal matrix $K_P$
is large enough (Tomei 1991). This sufficient
condition can be enforced only if the joint
stiffness $K$ dominates the gradient of gravity
terms.

• Online compensation: $\tau_g = g(\tilde\theta)$,
$\tilde\theta = \theta - K^{-1}g(q_d)$. Gravity effects on the links
are approximately compensated during robot
motion. Global regulation is proven under the
same conditions above (De Luca et al. 2005).

• Quasi-static compensation: $\tau_g = g(\bar q(\theta))$.
At any measured motor position $\theta$, the link
position $\bar q(\theta)$ is computed by solving numerically
$g(q) + K(q - \theta) = 0$. This removes
the need of a strictly positive lower bound on
$K_P$ (Kugi et al. 2008), but the joint stiffness
should still dominate the gradient of gravity
terms.
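
The numerical solution of $g(q) + K(q - \theta) = 0$ required by the quasi-static compensation can be sketched with a fixed-point iteration $q \leftarrow \theta - K^{-1}g(q)$, which is a contraction when the stiffness dominates the gradient of the gravity term. The single-link gravity torque $g(q) = mgl \sin q$, the numerical values and the choice of a fixed-point scheme are assumptions made for illustration; the source does not specify a solver.

```python
import math


def gravity_torque(q, mgl=5.0):
    """Gravity torque of an assumed single link: g(q) = m g l sin(q)."""
    return mgl * math.sin(q)


def quasi_static_link_position(theta, K=100.0, mgl=5.0, tol=1e-12, max_iter=100):
    """Solve g(q) + K (q - theta) = 0 by iterating q <- theta - g(q) / K.

    Converges when K exceeds the largest slope of g (here mgl), i.e. when
    the joint stiffness dominates the gradient of the gravity term.
    """
    q = theta
    for _ in range(max_iter):
        q_next = theta - gravity_torque(q, mgl) / K
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q
```

The compensation term is then obtained as `gravity_torque(quasi_static_link_position(theta))` at each measured motor position.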

Additional feedback from the full robot
state $(q, \dot q, \theta, \dot\theta)$, measured or reconstructed
through dynamic observers, can provide faster
and damped transient responses. This solution
is particularly convenient when a joint torque
sensor measuring $\tau_J$ is available (torque-controlled
robots). Using

$$\tau = K_P(\theta_d - \theta) - K_D\dot\theta + K_T(g(q_d) - \tau_J) - K_S\dot\tau_J + g(q_d), \tag{7}$$

the four diagonal gain matrices can be given
a special structure so that asymptotic stability
is automatically guaranteed (Albu-Schäffer and
Hirzinger 2001).

Trajectory Tracking

Let a desired sufficiently smooth trajectory $q_d(t)$
be specified for the robot links over a finite or
infinite time interval. The control objective is
to asymptotically stabilize the trajectory tracking
error $e = q_d(t) - q(t)$ to zero, starting from a
generic initial robot state. Assuming that $q_d(t)$
is four times continuously differentiable, a torque
input profile $\tau_d(t) = \tau_d(q_d, \dot{q}_d, \ddot{q}_d, q_d^{(3)}, q_d^{(4)})$ can
be derived from the dynamic model (1) and (2)
so as to reproduce exactly the desired trajectory,
when starting from matched initial conditions. A
local solution to the trajectory tracking problem is
provided by the combination of such a feedforward
term $\tau_d(t)$ with a stabilizing linear feedback
from the partial or full robot state; see Eqs. (6)
or (7).

When the joint stiffness is large enough, one
can take advantage of the system being singularly
perturbed. A control law $\tau_s$ designed for the rigid
robot will deal with the slow dynamics, while a
relatively simple action $\tau_f$ is used to stabilize the
fast vibratory dynamics around an invariant manifold
associated to the rigid robot control (Spong
et al. 1987). This class of composite control laws
has the general form

$$\tau = \tau_s(q, \dot{q}, t) + \epsilon\,\tau_f(q, \dot{q}, \tau_J, \dot{\tau}_J). \tag{8}$$

When setting $\epsilon = 0$ in Eqs. (3), (4), and (8),
the control setup of the equivalent rigid robot is
recovered as

$$(M(q) + B)\ddot{q} + n(q, \dot{q}) = \tau_s. \tag{9}$$

Though more complex, the best performing
trajectory tracking controller for the general case
is based on feedback linearization. Spong (1987)
has shown that the nonlinear state feedback

$$\tau = \alpha(q, \dot{q}, \ddot{q}, q^{(3)}) + \beta(q)\,v, \tag{10}$$

with

$$\alpha = M(q)\ddot{q} + n(q, \dot{q}) + BK^{-1}\left[(\ddot{M}(q) + K)\ddot{q} + 2\dot{M}(q)q^{(3)} + \ddot{n}(q, \dot{q})\right],$$

$$\beta = BK^{-1}M(q),$$

leads globally to the closed-loop linear system

$$q^{(4)} = v, \tag{11}$$

i.e., to decoupled chains of four input-output
integrators from each auxiliary input $v_i$ to
each link position output $q_i$, for $i = 1, \ldots, n$.
The control design is then completed on the
linear SISO side, by forcing the trajectory
tracking error to be exponentially stable with an
arbitrary decay rate. The control law (10) is
expressed as a function of the linearizing
coordinates $(q, \dot{q}, \ddot{q}, q^{(3)})$ (up to the link jerk),
which can however be rewritten in terms
of the original state $(q, \dot{q}, \theta, \dot{\theta})$ using the
dynamic model equations. This fundamental
result is the direct extension of the so-called
“computed torque” method for rigid
robots.
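Once each link behaves as a chain of four integrators, the remaining linear design reduces to pole placement on the tracking error. The following is a minimal sketch for one link, assuming the auxiliary input is chosen as the fourth derivative of the reference plus linear feedback on the error and its first three derivatives; the function names and the pole locations in the usage note are hypothetical.

```python
import numpy as np


def tracking_gains(poles):
    """Gains (k0, k1, k2, k3) such that the error dynamics
    e'''' + k3 e''' + k2 e'' + k1 e' + k0 e = 0 have the requested poles."""
    c = np.real(np.poly(poles))   # monic coefficients [1, k3, k2, k1, k0]
    return c[4], c[3], c[2], c[1]


def auxiliary_input(ref, lin_state, gains):
    """Auxiliary input v for one link.

    ref       = (q_d, q_d', q_d'', q_d''', q_d'''')
    lin_state = (q, q', q'', q''')  (linearizing coordinates)
    gains     = (k0, k1, k2, k3)
    """
    feedback = sum(k * (r - x) for k, r, x in zip(gains, ref[:4], lin_state))
    return ref[4] + feedback


def error_matrix(gains):
    """Companion matrix of the closed-loop error dynamics."""
    k0, k1, k2, k3 = gains
    A = np.diag(np.ones(3), k=1)
    A[3, :] = [-k0, -k1, -k2, -k3]
    return A
```

For example, `tracking_gains([-2.0, -3.0, -4.0, -5.0])` gives the coefficients of (s + 2)(s + 3)(s + 4)(s + 5), and the eigenvalues of `error_matrix` then coincide with the chosen poles.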


Robots with Flexible Links 

Dynamic Modeling 

For the dynamic modeling of a single flexible
link, the distributed nature of structural flexibility
can be captured, under suitable assumptions, by
partial differential equations (PDE) with associated
boundary conditions. A common model is
the Euler-Bernoulli beam. The link is assumed
to be a slender beam, with uniform geometric
characteristics and homogeneous mass distribution,
clamped at the base to the rigid hub of an
actuator producing a torque $\tau$ and rotating on
a horizontal plane. The beam is flexible in the
lateral direction only, being stiff with respect to
axial forces, torsion, and bending due to gravity.
Deformations are small and are in the elastic domain.
The physical parameters of interest are the
linear density $\rho$ of the beam, its flexural rigidity
$EI$, the beam length $\ell$, and the hub inertia $I_h$
(with $I_t = I_h + \rho\ell^3/3$). The equations of motion
combine lumped and distributed parameter parts,
with the hub rotation $\theta(t)$ and the link deformation
$w(x,t)$, where $x \in [0, \ell]$ is the position along
the link. From Hamilton’s principle, we obtain

$$I_t\ddot{\theta}(t) + \rho\int_0^{\ell} x\,\ddot{w}(x,t)\,dx = \tau(t) \tag{12}$$

$$EI\,w''''(x,t) + \rho\,\ddot{w}(x,t) + \rho\,x\,\ddot{\theta}(t) = 0 \tag{13}$$

$$w(0,t) = w'(0,t) = 0, \qquad w''(\ell,t) = w'''(\ell,t) = 0, \tag{14}$$

where a prime denotes the partial derivative with respect to
space. Equations (14) are the clamped-free boundary
conditions at the two ends of the beam (no
payload is present at the tip).

For the analysis of this self-adjoint PDE problem,
one proceeds by separation of variables in
space and time, defining

$$w(x,t) = \phi(x)\,\delta(t), \qquad \theta(t) = \alpha(t) + k\,\delta(t), \tag{15}$$

where $\phi(x)$ is the link spatial deformation, $\delta(t)$
is its time behavior, $\alpha(t)$ describes the angular
motion of the instantaneous center of mass of the
beam, and $k$ is chosen so as to satisfy (12) for $\tau = 0$.
Since system (12)-(14) is linear, nonrational
transfer functions can be derived in the Laplace
transform domain between the input torque and
some relevant system output, e.g., the angular position
of the hub or of the tip of the beam (Kanoh
1990). The PDE formalism also provides a convenient
basis for analyzing distributed sensing,
feedback from strain sensors (Luo 1993), or even
distributed actuation with piezoelectric devices
placed along the link.

The transcendental characteristic equation associated
to the spatial part of the solution to
Eqs. (12)-(14) is

$$I_h\gamma^3\left(1 + \cos(\gamma\ell)\cosh(\gamma\ell)\right) + \rho\left(\sin(\gamma\ell)\cosh(\gamma\ell) - \cos(\gamma\ell)\sinh(\gamma\ell)\right) = 0. \tag{16}$$
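When the hub inertia is very large, the first bracket of (16) dominates and the roots are those of 1 + cos(γℓ) cosh(γℓ) = 0. The following is a minimal sketch of how these roots can be found numerically by bisection, and of how they map to resonant frequencies through ω_i = γ_i² √(EI/ρ); the function names and the beam parameters in the usage note are hypothetical. The brackets [iπ, (i+1)π] are used because the function changes sign at consecutive multiples of π.

```python
import math


def clamped_condition(x):
    """Large-hub-inertia limit of the characteristic equation, with x = gamma * l."""
    return 1.0 + math.cos(x) * math.cosh(x)


def clamped_roots(count, tol=1e-12):
    """First `count` positive roots x_i = gamma_i * l, by bisection."""
    roots = []
    for i in range(count):
        lo, hi = i * math.pi, (i + 1) * math.pi   # one sign change per bracket
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if clamped_condition(lo) * clamped_condition(mid) <= 0.0:
                hi = mid
            else:
                lo = mid
        roots.append(0.5 * (lo + hi))
    return roots


def resonant_frequencies(roots, length, EI, rho):
    """omega_i = gamma_i^2 * sqrt(EI / rho), with gamma_i = x_i / length."""
    return [(x / length) ** 2 * math.sqrt(EI / rho) for x in roots]
```

For instance, `resonant_frequencies(clamped_roots(3), length=1.0, EI=1.0, rho=1.0)` returns the first three frequencies of a normalized beam. For a finite hub inertia the full expression (16) would have to be used in place of `clamped_condition`.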

When the hub inertia $I_h \to \infty$, the second
term can be neglected and the characteristic
equation collapses into the so-called clamped
condition. Equation (16) has an infinite but
countable number of positive real roots $\gamma_i$, with
associated eigenvalues of resonant frequencies
$\omega_i = \gamma_i^2\sqrt{EI/\rho}$ and orthonormal eigenvectors
$\phi_i(x)$, which are the natural deformation
shapes of the beam (Barbieri and Özgüner
1988). A finite-dimensional dynamic model is
obtained by truncation to a finite number $m_e$ of
eigenvalues/shapes. From

$$w(x,t) = \sum_{i=1}^{m_e} \phi_i(x)\,\delta_i(t) \tag{17}$$

we get

$$I_t\ddot{\alpha}(t) = \tau(t)$$

$$\ddot{\delta}_i(t) + \omega_i^2\delta_i(t) = \phi_i'(0)\,\tau(t), \qquad i = 1, \ldots, m_e, \tag{18}$$

where the rigid body motion (top equation) appears
as decoupled from the flexible dynamics,
thanks to the choice of variable $\alpha$ rather than
$\theta$. Modal damping can be added on the left-hand
sides of the lower equations through terms
$2\zeta_i\omega_i\dot{\delta}_i$ with $\zeta_i \in [0,1]$. The angular position of
the motor hub at the joint is given by

$$\theta(t) = \alpha(t) + \sum_{i=1}^{m_e} \phi_i'(0)\,\delta_i(t), \tag{19}$$

while the tip angular position is

$$y(t) = \alpha(t) + \sum_{i=1}^{m_e} \frac{\phi_i(\ell)}{\ell}\,\delta_i(t). \tag{20}$$


The joint-level transfer function $P_{\mathrm{joint}}(s) = \theta(s)/\tau(s)$
will always have relative degree two
and only minimum phase zeros. On the other
hand, the tip-level transfer function $P_{\mathrm{tip}}(s) = y(s)/\tau(s)$
will contain non-minimum phase
zeros. This basic difference in the pattern of the
transmission zeros is crucial for motion control
design.
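The difference between the two zero patterns can be seen already with a single retained mode. The sketch below assumes one damped mode in (18), so that each output is the rigid term 1/(I_t s²) plus a modal term, and computes the zeros of the resulting numerator. The modal values in the usage note (hub slope 2.0, normalized tip displacement -1.5) are hypothetical numbers, chosen only so that hub slope and tip displacement have opposite signs; they are not taken from a real beam.

```python
import numpy as np


def one_mode_zeros(I_t, omega, zeta, hub_slope, out_gain):
    """Zeros of P(s) = 1/(I_t s^2) + out_gain*hub_slope/(s^2 + 2 zeta omega s + omega^2).

    hub_slope plays the role of phi'(0) (input gain of the mode);
    out_gain is phi'(0) for the joint output or phi(l)/l for the tip output.
    Over a common denominator, the numerator is
    (1 + I_t*out_gain*hub_slope) s^2 + 2 zeta omega s + omega^2.
    """
    num = [1.0 + I_t * out_gain * hub_slope, 2.0 * zeta * omega, omega ** 2]
    return np.roots(num)


# Hypothetical single-mode data (illustrative assumptions only).
joint_zeros = one_mode_zeros(I_t=1.0, omega=10.0, zeta=0.02, hub_slope=2.0, out_gain=2.0)
tip_zeros = one_mode_zeros(I_t=1.0, omega=10.0, zeta=0.02, hub_slope=2.0, out_gain=-1.5)
```

With these numbers the joint output yields a lightly damped complex pair in the left half plane, while the tip output yields one negative and one positive real zero. The joint numerator always has positive coefficients because its modal gain is a square, which is the single-mode counterpart of the minimum phase property stated above.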

In a simpler modeling technique, a specified
class of spatial functions $\phi_i(x)$ is
assumed for describing link deformation.
The functions need to satisfy only a reduced
set of geometric boundary conditions
(e.g., clamped modes at the link base), but
otherwise no dynamic equations of motion
such as (13). The use of finite-dimensional
expansions like (17) limits the validity of the
resulting model to a maximum frequency. This
truncation must be accompanied by suitable
filtering of measurements and of control
commands, so as to avoid or limit spillover effects
(Balas 1978).

In the dynamic modeling of robots with $n$
flexible links, the resort to assumed modes of
link deformation becomes unavoidable. In practice,
some form of approximation and a finite-dimensional
treatment is necessary. Let $\theta$ be the
$n$-vector of joint variables describing the rigid
motion, and $\delta$ be the $m$-vector collecting the
deformation variables of all flexible links. Following
a Lagrangian formulation, the dynamic
model with clamped modes takes the general
form (Book 1984)

$$\begin{pmatrix} M_{\theta\theta}(\theta,\delta) & M_{\theta\delta}(\theta,\delta) \\ M_{\theta\delta}^T(\theta,\delta) & M_{\delta\delta}(\theta,\delta) \end{pmatrix} \begin{pmatrix} \ddot{\theta} \\ \ddot{\delta} \end{pmatrix} + \begin{pmatrix} n_\theta(\theta,\delta,\dot{\theta},\dot{\delta}) \\ n_\delta(\theta,\delta,\dot{\theta},\dot{\delta}) \end{pmatrix} + \begin{pmatrix} 0 \\ D\dot{\delta} + K\delta \end{pmatrix} = \begin{pmatrix} \tau \\ 0 \end{pmatrix}, \tag{21}$$

where the positive definite, symmetric inertia
matrix $M$ of the complete robot and the Coriolis,
centrifugal, and gravitational terms $n$ have
been partitioned in blocks of suitable dimensions,
$K > 0$ and $D > 0$ are the robot link stiffness
and damping matrices, and $\tau$ is the $n$-vector of
actuating torques.

The dynamic model (21) shows the general
couplings existing between nonlinear rigid body
motion and linear flexible dynamics. In this respect,
the linear model (18) of a single flexible
link is a remarkable exception.

The choice of specific assumed modes may
simplify the blocks of the robot inertia matrix,
e.g., orthonormal modes used for each link induce
a decoupled structure of the diagonal inertia
subblocks of $M_{\delta\delta}$. Quite often the total kinetic
energy of the flexible robot is evaluated only
in the undeformed configuration $\delta = 0$. With
this approximation, the inertia matrix becomes
independent of $\delta$, and so do the velocity terms in
the model. Furthermore, due to the hypothesis of
small deformation of each link, the gravity term
in the lower component $n_\delta$ is
only a function of $\theta$.

The validation of (21) goes through the experimental
identification of the relevant dynamic
parameters. Besides those inherited from the rigid
case (mass, inertia, etc.), the set of structural
resonant frequencies and associated deformation
profiles should also be identified.

Control of Joint-Level Motion 

When the target variables to be controlled are
defined at the joint level, the control problem
for robots with flexible links is similar to that of
robots with flexible joints. As a matter of fact,
the models (1)-(2) and (21) are both passive
systems with respect to the output $\theta$; see (19)
in the scalar case. For instance, regulation is
achieved by a PD action with constant gravity
compensation, using a control law of the
form (6) without the need of feeding back link
deformation variables (De Luca and Siciliano
1993a). Similarly, stable tracking of a joint trajectory
$\theta_d(t)$ is obtained by a singular perturbation
control approach, with flexible modes dynamics
acting at multiple time scales with respect to
rigid body motion (Siciliano and Book 1988),
or by an inversion-based control (De Luca and
Siciliano 1993b), where input-output (rather than
full state) exact linearization is realized and the
effects of link flexibility are canceled on the
motion of the robot joints. While vibrational
behavior will still affect the robot at the level of
end-effector motion, the closed-loop dynamics of
the $\delta$ variables is stable and link deformations
converge to a steady-state constant value (zero
in the absence of gravity) thanks to the intrinsic
damping of the mechanical structure. Improved
transients are indeed obtained by active modal
damping control (Cannon and Schmitz 1984).

A control approach specifically developed 
for the rest-to-rest motion of flexible mechanical 
systems is command shaping (Singer and Seering 
1990). The original command designed to 
achieve a desired motion for a rigid robot is 
convolved with suitable signals delayed in time, 
so as to cancel (or reduce to a minimum) the 
effects of the excited vibration modes at the time 
of motion completion. For a single slewing link 
with linear dynamics, as in (18), the rest-to-rest 
input command is computed in closed form by 
using impulsive signals and can be made robust 
via an over-parameterization. 
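For a single mode, the simplest closed-form instance of this idea is the two-impulse zero-vibration (ZV) shaper. The following is a minimal sketch, assuming a known modal frequency `omega` and damping ratio `zeta`; the function names are hypothetical, and the robust variants mentioned above, which use more impulses, are not shown.

```python
import math


def zv_shaper(omega, zeta):
    """Amplitudes and times of the two-impulse zero-vibration shaper."""
    wd = omega * math.sqrt(1.0 - zeta ** 2)              # damped frequency
    k = math.exp(-zeta * math.pi / math.sqrt(1.0 - zeta ** 2))
    amps = [1.0 / (1.0 + k), k / (1.0 + k)]              # amplitudes sum to one
    times = [0.0, math.pi / wd]                          # half a damped period apart
    return amps, times


def residual_vibration(amps, times, omega, zeta):
    """Residual amplitude after the last impulse, relative to a unit impulse."""
    wd = omega * math.sqrt(1.0 - zeta ** 2)
    c = sum(a * math.exp(zeta * omega * t) * math.cos(wd * t) for a, t in zip(amps, times))
    s = sum(a * math.exp(zeta * omega * t) * math.sin(wd * t) for a, t in zip(amps, times))
    return math.exp(-zeta * omega * times[-1]) * math.hypot(c, s)


def shape(command, dt, amps, times):
    """Convolve a sampled command with the impulse sequence."""
    delays = [int(round(t / dt)) for t in times]
    shaped = [0.0] * (len(command) + delays[-1])
    for a, d in zip(amps, delays):
        for i, u in enumerate(command):
            shaped[i + d] += a * u
    return shaped
```

The shaped command reaches the same final value as the original one, delayed by half a damped period. Evaluating `residual_vibration` at a frequency different from the design value shows how sensitive the two-impulse sequence is to modeling errors.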

Control of Tip-Level Motion 

The design of a control law that allows asymptotic
tracking of a desired trajectory for the end
effector of a robot with flexible links needs to
face the unstable zero dynamics associated to the
problem. In the linear case of a single flexible
link, this is equivalent to the presence of non-minimum
phase zeros in the transfer function to
the tip output (20). Direct inversion of the input-output
map leads to instability, due to cancellation
of non-minimum phase zeros by unstable
poles, with link deformation growing unbounded
and control saturations.

The solution instead requires determining the
unique reference state trajectory of the flexible
structure that is associated to the desired tip
trajectory and has bounded deformation. Based
on regulation theory, the control law will be the
superposition of a nominal feedforward action,
which keeps the system along the reference state
trajectory (and thus the output on the desired
trajectory), and of a stabilizing feedback that
reduces the error with respect to this state trajectory
to zero without resorting to dangerous
cancellations.

In general, computing such a control law
requires the solution of a set of nonlinear partial
differential equations. However, in the case of
a single flexible link with linear dynamics, the
feedforward profile is simply derived by an
inversion defined in the frequency domain (Bayo
1987). The desired tip acceleration $\ddot{y}_d(t)$,
$t \in [0, T]$, is considered as part of a rest-to-rest
periodic signal, with zero mean value and zero
integral. The procedure, implemented efficiently
using the Fast Fourier Transform on discrete-time
samples, will automatically generate bounded
time signals only. The resulting unique torque
profile $\tau_d(t)$ will be a noncausal command,
anticipating the actual start of the output
trajectory at $t = 0$ (so as to precharge the link to
the correct initial deformation) and ending after
$t = T$ (to discharge the residual link deformation
and recover the final rest configuration).
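The mechanism can be illustrated on a much simpler system than a flexible link. The following is a minimal sketch, assuming a hypothetical first-order plant with a single zero at s = +1 (it is not the beam model) and a smooth output bump placed inside a periodic time window. Dividing the output spectrum by the plant frequency response and transforming back yields a bounded input that is already active before the output starts to move.

```python
import numpy as np


def noncausal_inverse(y_d, dt, plant):
    """Bounded input whose periodic response through `plant` is y_d."""
    omega = 2.0 * np.pi * np.fft.fftfreq(len(y_d), d=dt)
    U = np.fft.fft(y_d) / plant(1j * omega)   # inversion on the imaginary axis
    return np.fft.ifft(U).real


def toy_plant(s):
    """Hypothetical non-minimum phase plant with a zero at s = +1."""
    return (1.0 - s) / (1.0 + s)


N, T = 2048, 8.0
dt = T / N
t = np.arange(N) * dt
# Desired output: a smooth bump between t = 2 and t = 4, zero elsewhere.
y_d = np.where((t >= 2.0) & (t <= 4.0), np.sin(0.5 * np.pi * (t - 2.0)) ** 2, 0.0)
u = noncausal_inverse(y_d, dt, toy_plant)
```

Inspecting `u` for `t < 2.0` shows the anticipating part of the command, which grows exponentially towards the start of the bump. A causal inverse of the same plant would instead contain an unstable pole at s = +1 and produce an unbounded input. The toy plant has unit gain at zero frequency; a plant with poles at the origin would additionally need the zero-mean condition on the desired signal stated above.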

The same result was recovered by Kwon and
Book (1994) in the time domain, by forward
integrating in time the stable part of the inverse
system dynamics and backward integrating the
unstable part. An extension to the multi-link nonlinear
case uses an iterative approach on repeated
linear approximations of the system along the
nominal trajectory (Bayo et al. 1989).

Summary and Future Directions 

The presence of mechanical flexibility in the
joints and the links of multi-dof robots poses
challenging control problems. Control designs
take advantage of, or are limited by, some system-level
properties. Robots with flexible joints are
passive systems at the level of motor outputs,
have no zero dynamics associated to the link
position outputs, and are always feedback linearizable
systems. Robots with flexible links are
still passive for joint-level outputs, but cannot be
feedback linearized in general, and have unstable
zero dynamics (non-minimum phase zeros in the
linear case) when considering the end-effector
position as controlled output.

State-of-the-art control laws address regulation
and trajectory tracking tasks in a satisfactory
way, at least in nominal conditions and under
full-state feedback. Current research directions
are aimed at achieving robustness to model
uncertainties and external disturbances (with
adaptive, learning, or iterative schemes) and at
further exploiting the design of control laws
under limited measurements and noisy sensing.
Beyond free motion tasks, an accurate treatment
of interaction tasks with the environment,
requiring force or impedance controllers, is
still missing for flexible robots. In this respect,
passivity-based control approaches that do not
necessarily operate dynamic cancellations may
take advantage of the existing compliance,
trading off between improved energy efficiency
and some reduction in nominal performance.

Often seen as a limiting factor for performance,
the presence of joint elasticity
is now becoming an explicit advantage for
safe physical human-robot interaction and for
locomotion. Next generation lightweight robots
and humanoids will use flexible joints and also
compact actuation with online controlled variable
joint stiffness, an area of active research.


Cross-References 

► Feedback Linearization of Nonlinear Systems 

► Modeling of Dynamic Systems from First Principles

► Nonlinear Zero Dynamics 

► PID Control 

► Regulation and Tracking of Nonlinear Systems 


Recommended Reading 

In addition to the works cited in the body of this 
article, a detailed treatment of dynamic modeling 
and control issues for flexible robots can be found 
in De Luca and Book (2008). This includes also 
the use of dynamic feedback linearization for a 
more general model of robots with elastic joints. 
For the same class of robots, Brogliato et al. 
(1995) provided a comparison of passivity-based 
and inversion-based tracking controllers.

Flocking in Networked Systems

Abstract

Flocking is a collective behavior exhibited by
many animal species such as birds, insects, and
fish. Such behavior is generated by distributed
motion coordination through nearest-neighbor interactions.
Empirical study of such behavior has
been an active research area in ecology and evolutionary
biology. Mathematical study of such behaviors
has become an active research area in a
diverse set of disciplines, ranging from statistical
physics and computer graphics to control theory,
robotics, opinion dynamics in social networks,
and the general theory of multiagent systems. While
models vary in detail, they are all based on
local diffusive dynamics that result in the emergence
of consensus in the direction of motion. Flocking
is closely related to the notions of consensus
and synchronization in multiagent systems, as
examples of collective phenomena that emerge
in multiagent systems as a result of local nearest-neighbor
interactions.

Keywords 

Consensus; Dynamics; Flocking; Graph theory; 
Markov chains; Switched dynamical systems; 
Synchronization 

Flocking or social aggregation is a group behavior
observed in many animal species, ranging
from various types of birds to insects and fish.
The phenomenon can be loosely defined as any
aggregate collective behavior in parallel rectilinear
formation or (in the case of fish) in collective
circular motion. The mechanisms leading to such
behavior have been (and continue to be) an
active area of research among ecologists and
evolutionary biologists, dating back to the 1950s
if not earlier. The engineering interest in the topic
is much more recent.

In 1986, Craig Reynolds (1987), a computer
graphics researcher, developed a computer model
of collective behavior for animated artificial objects
called boids. The flocking model for boids
was used to realistically duplicate aggregation
phenomena in bird flocks and fish schools for
computer animation. Reynolds developed a simple,
intuitive physics-based model: each boid was
a point mass subject to three simple steering
forces: alignment (to steer each boid towards
the average heading of its local flockmates), cohesion
(steering to move towards the average
position of local flockmates), and separation (to
avoid crowding local flockmates). The term local
should be understood as referring to those flockmates that
are within each other’s influence zone, which
could be a disk (or a wide-angle sector of a
disk) centered at each boid with a prespecified
radius. This simple zone-based model created
very realistic flocking behaviors and was used in
many animations (Fig. 1).

Nine years later, in 1995, Vicsek and coauthors
(1995) independently developed a model
for velocity alignment of self-propelled particles
(SPPs) in a square with periodic boundary
conditions. SPPs are essentially kinematic
particles moving with constant speed, and
the steering law determines the angle of
the velocity vector (what control theorists call a kinematic,
nonholonomic vehicle model). Vicsek et al.’s
steering law was very intuitive and simple
and was essentially Reynolds’ zone-based
alignment rule (Vicsek was not aware of
Reynolds’ result): each particle averages the
angle of its velocity vector with that of its
neighbors (those within a disk of a prespecified
radius), plus a noise term used to model
inaccuracies in averaging. Once the velocity
vector is determined at each time, each particle
takes a unit step along that direction, then
determines its neighbors again and repeats
the protocol.

The simulations were done in a square of 
unit length with periodic boundary conditions, 
to simulate infinite space. Vicsek and coauthors 
simulated this behavior and found that as the 
density of the particles increased, a preferred 
direction spontaneously emerged, resulting in 
a global group behavior with nearly aligned 
velocity vectors for all particles, despite 
the fact that the update protocol is entirely 
local. 
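The update protocol described above can be written in a few lines. The following is a minimal sketch, assuming a unit square with periodic boundaries, a step length `speed` left as a parameter, noise drawn uniformly from an interval of width `noise`, and direction averaging carried out through the summed sine and cosine of the neighbors' headings; the function names and parameter values are hypothetical.

```python
import numpy as np


def vicsek_step(pos, heading, radius, speed, noise, rng, box=1.0):
    """One synchronous update of all headings and positions."""
    diff = pos[:, None, :] - pos[None, :, :]
    diff -= box * np.round(diff / box)                 # periodic (minimum-image) offsets
    near = ((diff ** 2).sum(axis=2) <= radius ** 2).astype(float)  # includes the particle itself
    # Average direction of each neighborhood via summed unit vectors.
    new_heading = np.arctan2(near @ np.sin(heading), near @ np.cos(heading))
    new_heading += noise * rng.uniform(-0.5, 0.5, size=heading.shape)
    step = speed * np.column_stack((np.cos(new_heading), np.sin(new_heading)))
    return (pos + step) % box, new_heading


def order_parameter(heading):
    """Norm of the mean unit velocity: near 1 when aligned, near 0 when disordered."""
    return np.hypot(np.cos(heading).mean(), np.sin(heading).mean())
```

A simulation repeatedly calls `vicsek_step` and monitors `order_parameter`. Because the neighbor matrix `near` is recomputed from the current positions at every step, the interaction graph changes with the state, which is the feature that makes the analysis of the model nontrivial.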

With the interest in control theory shifting
towards multiagent systems and networked coordination
and control, it became clear that the
mathematics of how birds flock and fish school
and how individuals in a social network reach
agreement (even though they are often only influenced
by other like-minded individuals) are quite
related to the question of how one can engineer a
swarm of robots to behave like bird flocks.



Flocking in Networked Systems, Fig. 1 Reynolds’ three rules of flocking: alignment, cohesion, and separation (photo from http://red3d.com/cwr/boids/)

These questions have occupied the minds
of many researchers in diverse areas ranging 
from control theory to robotics, mathematics, and 
computer science. As discussed above, most of 
the early research, which happened in computer 
graphics and statistical physics, was on modeling 
and simulation of collective behavior. Over the 
past 13 years, however, the focus has shifted to 
rigorous systems theoretic foundations, leading 
to what one might call a theory of collective 
phenomena in multiagent systems. This theory 
blends dynamical systems, graph theory, Markov 
chains, and algorithms. 

Collective phenomena of this type are often
modeled as many-degrees-of-freedom (discrete-time
or continuous-time) dynamical systems
with an additional twist: the interconnection
structure between individual dynamical systems
changes, since the motion of each node in a flock
(or the opinion of an individual) is affected primarily
by those in each node’s local neighborhood.
The local neighborhood
is not fixed: neighbors are defined based on
the actual state of the system. For example, in
the case of Vicsek’s alignment rule, as each particle
averages its velocity direction with that of its
neighbors and then takes a step, the set of its
nearest neighbors can change.

Interestingly, very similar models were developed
in the statistics and mathematical sociology
literature to describe how individuals in a social
network update their opinions as a function of
the opinions of their friends. The first such model
goes back to the seminal work of DeGroot (1974).
DeGroot’s model simply described the
evolution of a group’s scalar opinions as a function
of the opinions of their neighbors by an iterative
averaging scheme that can be conveniently modeled
as a Markov chain. Individuals are represented
by nodes of a graph, start from an opinion
at time zero, and then are influenced by the
people in their social clique. In DeGroot’s model,
though, the network is given exogenously and
does not change as a function of the opinions.
The model therefore can be analyzed using the
celebrated Perron-Frobenius theorem. The evolution
of opinions is a discrete dynamic system
that corresponds to an averaging map. When the
network is fixed and connected (i.e., there is a
path from every node to every other node) and
agents also include their own opinions in the
averaging, the update results in computation of
a global weighted average of initial opinions,
where the weight of each initial opinion in the final
aggregate is proportional to the “importance”
of each node in the network.
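The fixed-network case is easy to check numerically. The following is a minimal sketch, assuming an undirected graph given by a 0/1 adjacency matrix and equal weights over each agent and its neighbors; the function names are hypothetical. The "importance" weights are computed as the left Perron eigenvector of the averaging matrix.

```python
import numpy as np


def degroot_matrix(adjacency):
    """Row-stochastic weights: each agent averages itself and its neighbors equally."""
    A = np.asarray(adjacency, dtype=float) + np.eye(len(adjacency))
    return A / A.sum(axis=1, keepdims=True)


def iterate(W, x0, steps):
    """Apply the averaging map x <- W x for a given number of steps."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = W @ x
    return x


def influence_weights(W):
    """Left eigenvector of W for eigenvalue 1, normalized to sum to one."""
    vals, vecs = np.linalg.eig(W.T)
    v = np.real(vecs[:, np.argmax(np.real(vals))])
    return v / v.sum()
```

For a connected graph the iterates approach the constant vector whose value is `influence_weights(W) @ x0`. With the equal-weight rule assumed here, the weight of a node is proportional to its degree plus one, so better connected agents pull the consensus towards their initial opinion.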

The flocking models of Reynolds (1987) and
Vicsek et al. (1995), however, have an extra
twist: the network changes as opinions are updated.
Similarly, more refined sociological models
developed over the past decade also capture
this endogeneity (Hegselmann and Krause 2002):
each individual agent is influenced by others
only when their opinion is close to her own. In
other words, as opinions evolve, neighborhood
structures change as a function of the evolving
opinions, resulting in a switched dynamical system
in which switching is state dependent.
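A bounded-confidence update of this kind can be sketched as follows, assuming scalar opinions and a hypothetical confidence threshold `eps`: each agent averages the opinions that lie within `eps` of its own, so the neighbor sets are recomputed from the current state at every step.

```python
import numpy as np


def bounded_confidence_step(x, eps):
    """Each agent moves to the mean of all opinions within eps of its own."""
    x = np.asarray(x, dtype=float)
    close = (np.abs(x[:, None] - x[None, :]) <= eps).astype(float)  # state-dependent graph
    return (close @ x) / close.sum(axis=1)
```

Starting from opinions `[0.0, 0.1, 0.9, 1.0]` with `eps = 0.2`, the two pairs never interact and the population settles into two clusters; with a threshold large enough to connect everyone, a single consensus value is reached instead.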

In a 2003 paper, Jadbabaie and coauthors
(2003) studied Reynolds’ alignment
rule in the context of Vicsek’s model when there
is no exogenous noise. To model the endogeneity
of the change in neighborhood structure, they
developed a model based on repeated local
averaging in which the neighborhood structure
changes over time; therefore, instead of a
simple discrete-time linear dynamical system,
the model is a discrete linear inclusion or a
switched linear system. The question of interest
was to determine what regimes of network
changes could result in flocking. Clearly, as
DeGroot’s model also suggests, when the local
neighbor structures do not change, connectivity
is the key factor for flocking. This is a direct
consequence of Perron-Frobenius theory. The
result can also be described in terms of directed
graphs. What is the equivalent condition in
changing networks? Jadbabaie and coauthors
show in their paper that connectivity is indeed
important, but it need not hold at every time: rather,
there need to be time periods over which the
graphs are connected when taken together. More formally, the
process of neighborhood changes due to motion
in Vicsek’s model can be simply abstracted as
a graph in which the links “blink” on and off.
For flocking, one needs to ensure that there are
time periods over which the union of edges (that
occur as a result of proximity of particles)
corresponds to a connected graph and that such
intervals occur infinitely often. It turns
out that many of these ideas were developed
much earlier in a thesis and a subsequent paper
by Tsitsiklis (1984) and Tsitsiklis et al. (1986),
in the context of distributed and asynchronous
computation of global averages in changing
graphs, and in a paper by Chatterjee and Seneta
(1977), in the context of nonhomogeneous
Markov chains. The machinery for proving
such results, however, is classical and has a
long history in the theory of inhomogeneous
Markov chains, a subject studied since the time
of Markov himself, followed by Birkhoff and
other mathematicians such as Hajnal, Dobrushin,
Seneta, Hartfiel, Daubechies, and Lagarias, to
name a few.

The interesting twist in the analysis of Vicsek’s
model is that the Euclidean norm of the distance
to the globally converged “consensus angle” (or
consensus opinion in the case of opinion models)
can actually grow in a single step of the process;
therefore, standard quadratic Lyapunov function
arguments, which serve as the main tool for the analysis
of switched linear systems, are not suitable
for the analysis of such models. However, it
is fairly easy to see that under the process of
local averaging, the largest value cannot increase
and the smallest value cannot decrease. In fact,
one can show that if enough connected graphs
occur as the result of the switching process, the
maximum value will be strictly decreasing and
the minimum value will be strictly increasing.

The paper by Jadbabaie and coauthors (2003)
has led to a flurry of results in this area over
the past decade. One important generalization of
the results of Jadbabaie et al. (2003), Tsitsiklis
(1984), and Tsitsiklis et al. (1986) came 2 years
later in a paper by Moreau (2005), who showed
that these results can be generalized to nonlinear
updates and directed graphs. Moreau showed
that any dynamic process that assigns a point in
the interior of the convex hull of the value of
each node and its neighbors will eventually result
in agreement and consensus, if and only if the
union of graphs from every time step till infinity
contains a directed spanning tree (i.e., there is a node
from which every other node can be reached along directed links).
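The monotonicity of the extreme values under local averaging, and the role of connectivity over time rather than at each instant, can be checked on a small example. The following is a minimal sketch, assuming four agents and two alternating edge sets, neither of which is connected by itself while their union is a path; the function names and the data are hypothetical.

```python
import numpy as np


def averaging_matrix(n, edges):
    """Equal-weight averaging over each node and its current neighbors."""
    A = np.eye(n)
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return A / A.sum(axis=1, keepdims=True)


def run_switched(x0, edge_sets, steps):
    """Apply the averaging maps of a periodically switching graph sequence."""
    x = np.asarray(x0, dtype=float)
    history = [x.copy()]
    for k in range(steps):
        W = averaging_matrix(len(x), edge_sets[k % len(edge_sets)])
        x = W @ x
        history.append(x.copy())
    return np.array(history)


# Links "blink": edges (0,1),(2,3) at even steps, edge (1,2) at odd steps.
history = run_switched([0.0, 1.0, 2.0, 3.0], [[(0, 1), (2, 3)], [(1, 2)]], 100)
```

Along `history`, the row-wise maximum never increases and the row-wise minimum never decreases, and their difference shrinks to zero even though no single graph in the sequence is connected.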

Some of these results were also extended to
the analysis of Reynolds’ model of flocking
including the other two behaviors. First, Tanner
and coauthors (2003, 2007) showed in a series
of papers that a zone-based
model similar to Reynolds’ can result in flocking
for dynamic agents, provided that the graph
representing interagent communications stays
connected. Olfati-Saber and coauthors (2007)
developed similar results with a slightly different
model.

Many generalizations and extensions of these
results exist in a diverse set of disciplines, resulting
in a rich theory which has had applications
from robotics (such as rendezvous in
mobile robots) (Cortes et al. 2006) to mathematical
sociology (Hegselmann and Krause 2002) and
from economics (Golub and Jackson 2010) to distributed
optimization theory (Nedic and Ozdaglar
2009). However, some of the fundamental mathematical
questions related to flocking still remain
open.

First, most results focus on exogenous models
of network change. A notable extension is
a paper by Cucker and Smale (2007), in which
the authors develop and analyze an endogenous
model of flocking that cleverly smooths out
the discontinuous change in network structure by
allowing each node’s influence to decay smoothly
as a function of distance.
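A smooth, distance-dependent influence can be sketched as follows. The text above states only that the influence decays smoothly with distance; the specific kernel K / (σ² + d²)^β used here is the form commonly associated with the Cucker-Smale model, and the parameter values, the explicit Euler step `h`, and the function name are assumptions made for illustration.

```python
import numpy as np


def smooth_influence_step(x, v, h, K=1.0, sigma=1.0, beta=0.25):
    """One explicit Euler step of velocity alignment with smooth weights.

    x, v : arrays of shape (n, d) with positions and velocities.
    Weights a_ij = K / (sigma^2 + |x_i - x_j|^2)^beta are symmetric and
    decay smoothly with distance, so no neighbor set ever switches abruptly.
    """
    diff = x[:, None, :] - x[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    a = K / (sigma ** 2 + dist2) ** beta
    np.fill_diagonal(a, 0.0)
    dv = a @ v - a.sum(axis=1, keepdims=True) * v   # sum_j a_ij (v_j - v_i)
    return x + h * v, v + h * dv
```

Because the weights are symmetric, the mean velocity is preserved by every step, and repeated application drives the individual velocities towards that mean as long as the weights do not decay too quickly while the agents drift apart.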

Recently, in a series of papers, Chazelle has
made progress in this arena by using tools from
computational geometry and algorithms for the analysis
of endogenous models of flocking (Chazelle
2012). Chazelle has introduced the notion of the
s-energy of a flock, which can be thought of as a
parameterized family of Lyapunov functions that
represent the evolution of global misalignment
between flockmates. Via tools from dynamical
systems, computational geometry, combinatorics,
complexity theory, and algorithms, Chazelle creates
an “algorithmic calculus” for diffusive influence
systems: surprisingly, he shows that the
orbit or flow of such systems is attracted to a
fixed point in the case of undirected graphs and a
limit cycle for almost all arbitrarily small random
perturbations. Furthermore, the convergence time
can also be bounded in both cases and the bounds
are essentially optimal. The setup of the diffusive
influence system developed by Chazelle creates a
near-universal setup for analyzing various problems
involving collective behavior in networked
multiagent systems, from flocking, opinion dynamics,
and information aggregation to synchronization
problems.

To make further progress on the analysis of what
one might call networked dynamical systems
(which Chazelle calls influence systems), one
needs to combine the mathematics of algorithms,
complexity, combinatorics, and graphs with
systems theory and dynamical systems.

Summary and Future Directions 

This article presented a brief summary of the
literature on flocking and distributed motion
coordination. Flocking is the process by which
various species exhibit synchronous collective
motion from simple local interaction rules.
Motivated by social aggregation in these species,
a number of algorithms have been developed in the
literature to design distributed control laws for
group behavior in collective robotics and to analyze
opinion dynamics in social networks. The
models describe each agent as a kinematic or
point mass particle that aligns its direction
with that of its neighbors using repeated local
averaging of directions. Since the neighborhood
structures change due to motion, this results in a
distributed switched dynamical system. If a weak
notion of connectivity among agents is preserved
over time, then agents reach consensus in their
direction of motion. Despite the flurry of results
in this area, the analysis of this phenomenon that
accounts for endogenous change in dynamics is
for the most part open.
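
The repeated local averaging mentioned above can be illustrated with a minimal sketch. It assumes a fixed, connected neighbor graph and headings that are averaged linearly; both are simplifications chosen here for illustration, since in the flocking models the neighbor sets change with the motion. All function and variable names are hypothetical.

```python
def average_step(headings, neighbors):
    """One synchronous update: each agent replaces its heading by the
    average of its own heading and those of its neighbors."""
    updated = []
    for i, theta in enumerate(headings):
        group = [theta] + [headings[j] for j in neighbors[i]]
        updated.append(sum(group) / len(group))
    return updated


def run_consensus(headings, neighbors, steps):
    """Apply the local averaging rule `steps` times."""
    for _ in range(steps):
        headings = average_step(headings, neighbors)
    return headings


# Assumed example: four agents on a fixed path graph 0-1-2-3.
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
INITIAL_HEADINGS = [0.0, 0.4, 0.9, 1.3]  # radians

if __name__ == "__main__":
    final = run_consensus(INITIAL_HEADINGS, NEIGHBORS, steps=100)
    print("spread after 100 steps:", max(final) - min(final))
```

Because the assumed graph is connected and never changes, the spread between the largest and smallest heading shrinks at every step; the open questions mentioned above concern the case where the graph itself depends on the agents' motion.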

Cross-References 

► Averaging Algorithms and Consensus 

► Oscillator Synchronization 


Bibliography 

Chatterjee S, Seneta E (1977) Towards consensus: some
convergence theorems on repeated averaging. J Appl
Probab 14:89-97

Chazelle B (2012) Natural algorithms and influence
systems. Commun ACM 55(12):101-110

Cortes J, Martinez S, Bullo F (2006) Robust rendezvous
for mobile autonomous agents via proximity graphs
in arbitrary dimensions. IEEE Trans Autom Control
51(8):1289-1298

Cucker F, Smale S (2007) Emergent behavior
in flocks. IEEE Trans Autom Control 52(5):852-862

DeGroot MH (1974) Reaching a consensus. J Am Stat
Assoc 69(345):118-121

Golub B, Jackson MO (2010) Naive learning in social
networks and the wisdom of crowds. Am Econ J
Microecon 2(1):112-149

Hegselmann R, Krause U (2002) Opinion dynamics and
bounded confidence models, analysis, and simulation.
J Artif Soc Soc Simul 5(3):2

Jadbabaie A, Lin J, Morse AS (2003) Coordination of
groups of mobile autonomous agents using nearest
neighbor rules. IEEE Trans Autom Control
48(6):988-1001

Moreau L (2005) Stability of multiagent systems with
time-dependent communication links. IEEE Trans
Autom Control 50(2):169-182

Nedic A, Ozdaglar A (2009) Distributed subgradient
methods for multi-agent optimization. IEEE Trans
Autom Control 54(1):48-61

Olfati-Saber R, Fax JA, Murray RM (2007) Consensus
and cooperation in networked multi-agent systems.
Proc IEEE 95(1):215-233

Reynolds CW (1987) Flocks, herds, and schools: a
distributed behavioral model. Comput Graph
21(4):25-34. (SIGGRAPH ’87 Conference Proceedings)

Tanner HG, Jadbabaie A, Pappas GJ (2003) Stable
flocking of mobile agents, Part I: fixed topology, Part II:
switching topology. In: Proceedings of the 42nd IEEE
conference on decision and control, Maui, vol 2. IEEE,
pp 2010-2015

Tanner HG, Jadbabaie A, Pappas GJ (2007) Flocking
in fixed and switching networks. IEEE Trans Autom
Control 52(5):863-868

Tsitsiklis JN (1984) Problems in decentralized decision
making and computation (No. LIDS-TH-1424).
Laboratory for Information and Decision Systems,
Massachusetts Institute of Technology

Tsitsiklis J, Bertsekas D, Athans M (1986) Distributed
asynchronous deterministic and stochastic gradient
optimization algorithms. IEEE Trans Autom Control
31(9):803-812

Vicsek T, Czirok A, Ben-Jacob E, Cohen I, Shochet
O (1995) Novel type of phase transition in a system
of self-driven particles. Phys Rev Lett 75(6):1226



Force Control in Robotics 



Luigi Villani 

Dipartimento di Ingegneria Elettrica e Tecnologie
dell’Informazione, Università degli Studi di
Napoli Federico II, Napoli, Italy


Abstract 

Force control is used to handle the physical
interaction between a robot and the environment and
also to ensure safe and dependable operation in
the presence of humans. The control goal may be
either to keep the interaction forces limited or
to guarantee a desired force along the directions
where interaction occurs while a desired motion
is ensured in the other directions. This entry
presents the basic control schemes, focusing on
robot manipulators.

Introduction

Control of the physical interaction between 
a robot manipulator and the environment is 
crucial for the successful execution of a number 
of practical tasks where the robot end effector 
has to manipulate an object or perform some 
operation on a surface. Typical examples in 
industrial settings include polishing, deburring, 
machining, or assembly. 

During contact, the environment may set
constraints on the geometric paths that can be
followed by the robot’s end effector (kinematic
constraints) as in the case of sliding on a rigid
surface. In other situations, the interaction occurs
with a dynamic environment as in the case of
collaboration with a human. In all cases, a pure
motion control strategy is not recommended,
especially if the environment is stiff.

The higher the environment stiffness and
position control accuracy are, the more easily the
contact forces may rise and reach unsafe values.
This drawback can be overcome by introducing
compliance, either in a passive or in an active
fashion, to accommodate the robot motion in
response to interaction forces.

Passive compliance may be due to the structural
compliance of the links, joints, and end
effector or to the compliance of the position servo.
Soft robot arms with elastic joints or links are
purposely designed for intrinsically safe
interaction with humans. In contrast, active compliance
is entrusted to the control system, denoted
interaction control or force control. In some cases, the
measurement of the contact force and moment
is required, which is fed back to the controller
and used to modify or even generate online the
desired motion of the robot (Whitney 1977).

The passive solution is faster than the active
reaction commanded by a computer control
algorithm. However, the use of passive compliance
alone lacks flexibility and cannot guarantee
that high contact forces will never occur. Hence,
the most effective solution is to use active
force control (with or without force feedback)
in combination with some degree of passive
compliance.

In general, six force components are required
to provide complete contact force information:
three translational force components and three
torques. Often, a force/torque sensor is mounted
at the robot wrist (see an example in Fig. 1),
but other possibilities exist. For example, force
sensors can be placed on the fingertips of robotic
hands; also, external forces and moments can be
estimated via shaft torque measurements of joint
torque sensors.

The force control strategies can be grouped
into two categories (Siciliano and Villani 1999):
those performing indirect force control and those
performing direct force control. The main
difference between the two categories is that the
former achieve force control via motion control,
without explicit closure of a force feedback loop;
the latter instead offer the possibility of
controlling the contact force and moment to a desired
value, thanks to the closure of a force feedback
loop.

Force Control in Robotics, Fig. 1 Industrial robot with
wrist force/torque sensor and deburring tool

Modeling 

The case of interaction of the end effector of a
robot manipulator with the environment is
considered, which is the most common situation in
industrial applications.

The end-effector pose can be represented by
the position vector $p_e$ and the rotation matrix $R_e$,
corresponding to the position and orientation of a
frame attached to the end effector with respect to
a fixed-base frame.

The end-effector velocity is denoted by the
$6 \times 1$ twist vector $v_e = (\dot{p}_e^T \ \omega_e^T)^T$, where $\dot{p}_e$ is
the translational velocity and $\omega_e$ is the angular
velocity; it can be computed from the joint
velocity vector $\dot{q}$ using the linear mapping

$$v_e = J(q)\dot{q}.$$

The matrix $J$ is the end-effector Jacobian. For
simplicity, the case of nonredundant nonsingular
manipulators is considered; therefore, the
Jacobian is a square nonsingular matrix.

The force $f_e$ and moment $m_e$ applied by the
end effector to the environment are the
components of the wrench $h_e = (f_e^T \ m_e^T)^T$. The joint
torques $\tau$ corresponding to $h_e$ can be computed
as

$$\tau = J^T(q)h_e.$$
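
As a small illustration of the two mappings above, the sketch below uses a planar two-link arm, so that the Jacobian is $2 \times 2$ rather than $6 \times 6$. The arm, its link lengths, and all function names are assumptions made for this example only.

```python
import math


def jacobian_2r(q1, q2, l1=1.0, l2=1.0):
    """Jacobian of the tip position of a planar two-link arm
    (hypothetical example with link lengths l1, l2)."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def tip_velocity(jac, q_dot):
    """v = J(q) * q_dot."""
    return [jac[0][0] * q_dot[0] + jac[0][1] * q_dot[1],
            jac[1][0] * q_dot[0] + jac[1][1] * q_dot[1]]


def joint_torques(jac, force):
    """tau = J(q)^T * f: joint torques that balance a tip force f."""
    return [jac[0][0] * force[0] + jac[1][0] * force[1],
            jac[0][1] * force[0] + jac[1][1] * force[1]]


if __name__ == "__main__":
    jac = jacobian_2r(0.0, math.pi / 2)
    print(tip_velocity(jac, [1.0, 0.0]))
    print(joint_torques(jac, [0.0, 1.0]))
```

The same Jacobian appears in both directions: it maps joint velocities to the end-effector velocity, and its transpose maps an end-effector force back to joint torques.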

It is useful to consider the operational space
formulation of the dynamic model of a rigid
robot manipulator in contact with the
environment (Khatib 1987):

$$\Lambda(q)\dot{v}_e + \Gamma(q,\dot{q})v_e + \eta(q) = h_c - h_e, \tag{1}$$

where $\Lambda(q)$ is the $6 \times 6$ operational space inertia
matrix, $\Gamma(q,\dot{q})v_e$ is the wrench including
centrifugal and Coriolis effects, and $\eta(q)$ is the wrench of
the gravitational effects. The vector $h_c = J^{-T}\tau_c$ is
the equivalent end-effector wrench corresponding
to the input joint torques $\tau_c$.

Equation (1) can be seen as a representation
of Newton’s second law of motion where all
the generalized forces acting on the joints of the
robot are reported at the end effector.

The full specification of the system dynamics
would also require the analytic description of the
interaction force and moment $h_e$. This is a very
demanding task from a modeling viewpoint.

The design of the interaction control and the
performance analysis are usually carried out
under simplifying assumptions. The following two
cases are considered:

1. The robot is perfectly rigid, all the compliance
in the system is localized in the environment,
and the contact wrench is approximated by
a linear elastic model.

2. The robot and the environment are perfectly
rigid and purely kinematic constraints are
imposed by the environment.

It is obvious that these situations are only ideal.
However, the robustness of the control should be
able to cope with situations where some of the
ideal assumptions are relaxed. In that case the
control laws may be adapted to deal with nonideal
characteristics.

Indirect Force Control 

The aim of indirect force control is to achieve
a desired compliant dynamic behavior of the
robot’s end effector in the presence of interaction
with the environment.

Stiffness Control 

The simplest approach is to impose a suitable
static relationship between the deviation of
the end-effector position and orientation from a
desired pose and the force exerted on the
environment, by using the control law

$$h_c = K_P\Delta x_{de} - K_D v_e + \eta(q), \tag{2}$$

where $K_P$ and $K_D$ are suitable matrix gains and
$\Delta x_{de}$ is a suitable error between a desired and the
actual end-effector position and orientation. The
position error component of $\Delta x_{de}$ can be simply
chosen as $p_d - p_e$. Concerning the orientation
error component, different choices are possible
(Caccavale et al. 1999), which are not all
equivalent, but this issue is outside the scope of this
entry.

The control input (2) corresponds to a wrench
(force and moment) applied to the end effector,
which includes a gravity compensation term $\eta(q)$,
a viscous damping term $K_D v_e$, and an elastic
wrench provided by a virtual spring with stiffness
matrix $K_P$ (or, equivalently, compliance matrix
$K_P^{-1}$) connecting the end-effector frame with a
frame of desired position and orientation. This
control law is known as stiffness control or
compliance control (Salisbury 1980).

Using the Lyapunov method, it is possible to
prove the asymptotic stability of the equilibrium
solution of the equation

$$K_P\Delta x_{de} = h_e,$$

meaning that, at steady state, the robot’s end
effector has a desired elastic behavior under the
action of the external wrench $h_e$. It is clear that,
if $h_e \neq 0$, then the end effector deviates from the
desired pose, which is usually denoted as virtual
pose.

Physically, the closed-loop system (1) with
(2) can be seen as a 6-DOF nonlinear and
configuration-dependent mass-spring-damper
system with inertia (mass) matrix $\Lambda(q)$ and
adjustable damping $K_D$ and stiffness $K_P$, under
the action of the external wrench $h_e$.
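
The steady-state relation $K_P\Delta x_{de} = h_e$ can be checked numerically on a one-degree-of-freedom sketch. The example assumes a unit point mass moving horizontally (so the gravity term is zero), a constant external force, and simple Euler integration; the function name and all numerical values are hypothetical.

```python
def simulate_stiffness_control(m, k_p, k_d, x_d, h_e, dt=1e-3, t_end=5.0):
    """Integrate m * x'' = h_c - h_e with the stiffness control law
    h_c = k_p * (x_d - x) - k_d * x' (1-DOF, no gravity)."""
    x, v = x_d, 0.0  # start at rest at the desired position
    for _ in range(int(round(t_end / dt))):
        h_c = k_p * (x_d - x) - k_d * v   # virtual spring and damper
        a = (h_c - h_e) / m               # scalar version of Eq. (1)
        v += a * dt
        x += v * dt
    return x, v


if __name__ == "__main__":
    x, v = simulate_stiffness_control(m=1.0, k_p=100.0, k_d=20.0,
                                      x_d=0.1, h_e=5.0)
    print("steady-state spring force:", 100.0 * (0.1 - x))
```

With these assumed values the mass settles where the virtual spring force equals the external force, that is, at a deviation of $h_e / k_P = 0.05$ from the desired position.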

Impedance Control 

A configuration-independent dynamic behavior
can be achieved if the measurement of the end-effector
force and moment $h_e$ is available, by using the
control law, known as impedance control (Hogan
1985):

$$h_c = \Lambda(q)\alpha + \Gamma(q,\dot{q})v_e + \eta(q) + h_e,$$

where $\alpha$ is chosen as

$$\alpha = \dot{v}_d + K_M^{-1}(K_D\Delta v_{de} + K_P\Delta x_{de} - h_e).$$

The following expression can be found for the
closed-loop system:

$$K_M\Delta\dot{v}_{de} + K_D\Delta v_{de} + K_P\Delta x_{de} = h_e, \tag{3}$$

representing the equation of a 6-DOF
configuration-independent mass-spring-damper
system with adjustable inertia (mass) matrix
$K_M$, damping $K_D$, and stiffness $K_P$, known as
mechanical impedance.
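
The step from the control law to Eq. (3) can be spelled out as follows. This is a short sketch that assumes the Coriolis and centrifugal compensation is written as $\Gamma(q,\dot{q})v_e$, as in Eq. (1), that $\Lambda(q)$ is invertible, and that $\Delta v_{de} = v_d - v_e$.

```latex
\begin{align*}
\Lambda(q)\dot{v}_e + \Gamma(q,\dot{q})v_e + \eta(q) &= h_c - h_e
  && \text{(Eq. (1))} \\
\Lambda(q)\dot{v}_e + \Gamma(q,\dot{q})v_e + \eta(q)
  &= \Lambda(q)\alpha + \Gamma(q,\dot{q})v_e + \eta(q) + h_e - h_e
  && \text{(insert } h_c\text{)} \\
\dot{v}_e &= \alpha
  && \text{(cancel terms, invert } \Lambda\text{)} \\
\dot{v}_e &= \dot{v}_d + K_M^{-1}\left(K_D\Delta v_{de} + K_P\Delta x_{de} - h_e\right)
  && \text{(insert } \alpha\text{)} \\
K_M\Delta\dot{v}_{de} + K_D\Delta v_{de} + K_P\Delta x_{de} &= h_e
  && \text{(Eq. (3))}
\end{align*}
```

The measured wrench $h_e$ appears twice in the control law: once to cancel the contact wrench in Eq. (1) and once, inside $\alpha$, to drive the desired impedance.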

A block diagram of the resulting impedance
control is sketched in Fig. 2.

The selection of good impedance parameters
ensuring a satisfactory behavior is not an easy
task and can be simplified under the hypothesis
that all the matrices are diagonal, resulting in
a decoupled behavior for the end-effector
coordinates.

Moreover, the dynamics of the controlled system
during the interaction depends on the dynamics
of the environment that, for simplicity, can
be approximated as a simple elastic law for each
coordinate, of the form

$$h_e = k\,\Delta x_{eo},$$

where $\Delta x_{eo} = x_e - x_o$, while $x_o$ and $k$ are the
undeformed position and the stiffness coefficient
of the spring, respectively.

Force Control in Robotics, Fig. 2 Impedance control

Under the above hypotheses, the transient behavior
of each component of Eq. (3) can be set by
assigning the natural frequency and damping ratio
with the relations

$$\omega_n = \sqrt{\frac{k_P + k}{k_M}}, \qquad \zeta = \frac{1}{2}\,\frac{k_D}{\sqrt{k_M(k_P + k)}}.$$

Hence, if the gains are chosen so that a given
natural frequency and damping ratio are ensured
during the interaction (i.e., for $k \neq 0$), a smaller
natural frequency with a higher damping ratio
will be obtained when the end effector moves in
free space (i.e., for $k = 0$). As for the steady-state
performance, the end-effector error and the
interaction force for the generic component are

$$\Delta x_{de} = \frac{k}{k_P + k}\,\Delta x_{do}, \qquad h = \frac{k_P k}{k_P + k}\,\Delta x_{do},$$

showing that, during interaction, the contact
force can be made small at the expense of
a large position error in steady state, as long
as the robot stiffness $k_P$ is set low with respect
to the stiffness of the environment $k$ and vice
versa.
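
The relations above are easy to evaluate for a single coordinate. The sketch below assumes scalar gains and takes $\Delta x_{do}$ to be the distance between the desired position and the undeformed environment position, by analogy with the other error symbols; the function names and numbers are hypothetical.

```python
import math


def natural_frequency_and_damping(k_m, k_d, k_p, k):
    """Natural frequency and damping ratio of one impedance-controlled
    coordinate in contact with an elastic environment of stiffness k
    (k = 0 corresponds to motion in free space)."""
    w_n = math.sqrt((k_p + k) / k_m)
    zeta = 0.5 * k_d / math.sqrt(k_m * (k_p + k))
    return w_n, zeta


def steady_state(k_p, k, dx_do):
    """Steady-state position error and contact force for one coordinate."""
    dx_de = k / (k_p + k) * dx_do
    h = k_p * k / (k_p + k) * dx_do
    return dx_de, h


if __name__ == "__main__":
    print(natural_frequency_and_damping(1.0, 40.0, 100.0, 300.0))  # contact
    print(natural_frequency_and_damping(1.0, 40.0, 100.0, 0.0))    # free space
    print(steady_state(100.0, 300.0, 0.01))
```

With these assumed gains the coordinate is critically damped in contact and overdamped, with half the natural frequency, in free space, which matches the qualitative statement in the text.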


Direct Force Control 

Indirect force control does not require explicit 
knowledge of the environment, although to 
achieve a satisfactory dynamic behavior, the 
control parameters have to be tuned for 
a particular task. On the other hand, a model 
of the interaction task is usually required for the 
synthesis of direct force control algorithms. 

In the following, it is assumed that the
environment is rigid and frictionless and imposes
kinematic constraints on the robot’s end-effector
motion (Mason 1981). These constraints reduce
the dimension of the space of the feasible
end-effector velocities and of the contact forces and
moments. In detail, in the presence of $m$
independent constraints ($m < 6$), the end-effector
velocity belongs to a subspace of dimension $6 - m$,
while the end-effector wrench belongs to a
subspace of dimension $m$ and can be expressed
in the form

$$v_e = S_v(q)\nu, \qquad h_e = S_f(q)\lambda,$$

where $\nu$ is a suitable $(6 - m) \times 1$ vector and $\lambda$ is
a suitable $m \times 1$ vector. Moreover, the subspaces
of forces and velocity are reciprocal, i.e.:

$$h_e^T v_e = 0, \qquad S_f^T(q)S_v(q) = 0.$$

The concept of reciprocity expresses the physical
fact that, in the hypothesis of rigid and
frictionless contact, the wrench does not cause any work
against the twist.

An interaction task can be assigned in terms of
a desired end-effector twist $v_d$ and wrench $h_d$ that
are computed as:

$$v_d = S_v\nu_d, \qquad h_d = S_f\lambda_d,$$

by specifying vectors $\lambda_d$ and $\nu_d$.

In many robotic tasks it is possible to set an
orthogonal reference frame, usually referred to as the task
frame (De Schutter and Van Brussel 1988), in
which the matrices $S_v$ and $S_f$ are constant.
Moreover, the interaction task is specified by assigning
a desired force/torque or a desired linear/angular
velocity along/about each of the frame axes.

An example of task frame definition and task
specification is given below.

Peg-in-Hole: The goal of this task is to push
the peg into the hole while avoiding wedging and
jamming. The peg has two degrees of motion
freedom; hence, the dimension of the
velocity-controlled subspace is $6 - m = 2$, while the
dimension of the force-controlled subspace is
$m = 4$. The task frame can be chosen as shown
in Fig. 3, and the task can be achieved by
assigning the following desired forces and torques:

• Zero forces along the $x_t$ and $y_t$ axes

• Zero torques about the $x_t$ and $y_t$ axes

and the desired velocities:

• A nonzero linear velocity along the $z_t$-axis

• An arbitrary angular velocity about the $z_t$-axis

The task continues until a large reaction force in
the $z_t$ direction is measured, indicating that the
peg has hit the bottom of the hole, not represented
in the figure. Hence, the matrices $S_f$ and $S_v$ can be
chosen as

$$S_f = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad S_v = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 1 \end{pmatrix}.$$

Force Control in Robotics, Fig. 3 Insertion of a
cylindrical peg into a hole

The task frame can be chosen attached either to 
the end effector or to the environment. 
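
The reciprocity of the two subspaces can be verified directly for the peg-in-hole matrices. The sketch below is a plain-Python check; the numerical values chosen for the velocity and force coordinates are arbitrary assumptions, and the helper names are hypothetical.

```python
# Selection matrices for the peg-in-hole task; rows are ordered as
# (f_x, f_y, f_z, m_x, m_y, m_z) for S_F and (v_x, v_y, v_z, w_x, w_y, w_z)
# for S_V, all expressed in the task frame.
S_F = [[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 0]]
S_V = [[0, 0],
       [0, 0],
       [1, 0],
       [0, 0],
       [0, 0],
       [0, 1]]


def transpose(a):
    return [list(col) for col in zip(*a)]


def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def mat_vec(a, x):
    return [sum(p * q for p, q in zip(row, x)) for row in a]


def dot(x, y):
    return sum(p * q for p, q in zip(x, y))


if __name__ == "__main__":
    print(mat_mul(transpose(S_F), S_V))        # 4 x 2 block of zeros
    v_e = mat_vec(S_V, [0.02, 0.5])            # arbitrary velocity coordinates
    h_e = mat_vec(S_F, [1.0, -2.0, 0.1, 0.3])  # arbitrary force coordinates
    print(dot(h_e, v_e))                       # power exchanged with the contact
```

Any twist built from $S_v$ and any wrench built from $S_f$ exchange zero power, which is the reciprocity condition stated earlier for rigid and frictionless contact.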

Hybrid Force/Motion Control 

The reciprocity of the velocity and force
subspaces naturally leads to a control approach,
known as hybrid force/motion control (Raibert
and Craig 1981; Yoshikawa 1987), aimed at
controlling simultaneously both the contact force
and the end-effector motion in two reciprocal
subspaces.

The reduced order dynamics of the robot with
kinematic constraints is described by $6 - m$
second-order equations

$$\Lambda_v(q)\dot{\nu} = S_v^T[h_c - \mu(q,\dot{q})],$$

where $\Lambda_v = S_v^T\Lambda S_v$ and $\mu(q,\dot{q}) = \Gamma(q,\dot{q})v_e + \eta(q)$,
assuming constant matrices $S_v$ and $S_f$.
Moreover, the vector $\lambda$ can be computed as

$$\lambda = S_f^{\dagger}(q)[h_c - \mu(q,\dot{q})],$$

revealing that the contact force is a constraint
force which instantaneously depends on the
applied input wrench $h_c$.

An inverse-dynamics inner control loop can be
designed by choosing the control wrench $h_c$ as

$$h_c = \Lambda(q)S_v\alpha_v + S_f f_\lambda + \mu(q,\dot{q}),$$

where $\alpha_v$ and $f_\lambda$ are properly designed control
inputs, which leads to the equations

$$\dot{\nu} = \alpha_v, \qquad \lambda = f_\lambda,$$

showing a complete decoupling between motion
control and force control.

Then, the desired force $\lambda_d(t)$ can be achieved
by setting

$$f_\lambda = \lambda_d(t),$$

but this choice is very sensitive to disturbance
forces, since it contains no force feedback.
Alternative choices are

$$f_\lambda = \lambda_d(t) + K_{P\lambda}[\lambda_d(t) - \lambda(t)],$$

or

$$f_\lambda = \lambda_d(t) + K_{I\lambda}\int_0^t[\lambda_d(\tau) - \lambda(\tau)]\,d\tau,$$

where $K_{P\lambda}$ and $K_{I\lambda}$ are suitable positive-definite
matrix gains. The proportional feedback is able to
reduce the force error due to disturbance forces,
while the integral action is able to compensate for
constant bias disturbances.
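
The effect of the three choices can be made concrete for a single force coordinate. The sketch below assumes scalar gains $k_{P\lambda}$, $k_{I\lambda}$ and a constant disturbance $\delta$ added to the ideal relation $\lambda = f_\lambda$; this disturbance model is an illustrative assumption, not part of the entry.

```latex
\begin{align*}
\text{feedforward only:}\quad
  & \lambda = \lambda_d + \delta
  \;\Rightarrow\; \lambda_d - \lambda = -\delta \\
\text{proportional:}\quad
  & \lambda = \lambda_d + k_{P\lambda}(\lambda_d - \lambda) + \delta
  \;\Rightarrow\; \lambda_d - \lambda = -\frac{\delta}{1 + k_{P\lambda}} \\
\text{integral:}\quad
  & \lambda = \lambda_d + k_{I\lambda}\int_0^t(\lambda_d - \lambda)\,d\tau + \delta
  \;\Rightarrow\; \frac{d}{dt}(\lambda_d - \lambda) = -k_{I\lambda}(\lambda_d - \lambda)
  \;\Rightarrow\; \lambda_d - \lambda \to 0
\end{align*}
```

Under this assumption the proportional term shrinks the force error by the factor $1 + k_{P\lambda}$, while the integral term removes a constant bias completely.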

Velocity control is achieved by setting

$$\alpha_v = \dot{\nu}_d(t) + K_{P\nu}[\nu_d(t) - \nu(t)] + K_{I\nu}\int_0^t[\nu_d(\tau) - \nu(\tau)]\,d\tau,$$

where $K_{P\nu}$ and $K_{I\nu}$ are suitable matrix gains. It is
straightforward to show that asymptotic tracking
of $\nu_d(t)$ and $\dot{\nu}_d(t)$ is ensured with exponential
convergence for any choice of positive-definite
matrices $K_{P\nu}$ and $K_{I\nu}$.
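
A scalar simulation illustrates the tracking behavior of this velocity law. The sketch assumes the ideal decoupled model $\dot{\nu} = \alpha_v$, a sinusoidal reference $\nu_d(t) = \sin t$, and forward Euler integration; the gains, the reference, and the function name are hypothetical choices.

```python
import math


def simulate_velocity_pi(k_p, k_i, nu0, dt=1e-3, t_end=5.0):
    """Track nu_d(t) = sin(t) with the PI velocity law and return the
    tracking error nu_d - nu at the final time."""
    nu = nu0
    integral = 0.0   # running integral of the velocity error
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        error = math.sin(t) - nu
        alpha_v = math.cos(t) + k_p * error + k_i * integral
        integral += error * dt
        nu += alpha_v * dt   # decoupled model: d(nu)/dt = alpha_v
        t += dt
    return math.sin(t) - nu


if __name__ == "__main__":
    print("final tracking error:", simulate_velocity_pi(8.0, 16.0, nu0=1.0))
```

With the assumed gains the error obeys a second-order equation with a double pole at $-4$, so an initial error of one unit decays to the level of the integration error within a few seconds.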

Notice that the implementation of force
feedback requires the computation of vector $\lambda$ from
the measurement of the end-effector wrench $h_e$ as
$S_f^{\dagger}h_e$, $S_f^{\dagger}$ being a suitable pseudoinverse of matrix
$S_f$. Analogously, vector $\nu$ can be computed from
$v_e$ as $S_v^{\dagger}v_e$.

The hypothesis of rigid contact can be
removed, and this implies that along some
directions both motion and force are allowed, although
they are not independent. Hybrid force/motion
control schemes can be defined also in this case
(Villani and De Schutter 2008).


Summary and Future Directions 

This entry has sketched the main approaches to
force control in a unifying perspective. However,
there are many aspects that have not been
considered here. The two major paradigms of force
control (impedance and hybrid force/motion control)
are based on several simplifying assumptions that
are only partially satisfied in practical
implementations and that have been partially removed in
more advanced control methods.

Notice that the performance of a
force-controlled robotic system depends on the
interaction with a changing environment which
is very difficult to model and identify correctly.
Hence, the standard performance indices used
to evaluate a control system, i.e., stability,
bandwidth, accuracy, and robustness, cannot
be defined by considering the robotic system
alone, as for the case of robot motion control, but
must always be referred to the particular contact
situation at hand.

Force control in industrial applications can
be considered a mature technology, although,
for the reason explained above, standard design
methodologies are not yet available. Force
control techniques are also employed in medical
robotics, haptic systems, telerobotics, humanoid
robotics, microrobotics, and nanorobotics.
An interesting field of application is related to
human-centered robotics, where control plays
a key role in achieving adaptability, reaction
capability, and safety. Robots and biomechatronic
systems based on the novel variable impedance
actuators, with physically adjustable compliance
and damping, capable of reacting softly when
touching the environment, necessitate the design
of specific control laws. The combined use of
exteroceptive sensing (visual, depth, proximity,
force, tactile sensing) for reactive control in
the presence of uncertainty represents another
challenging research direction.

Cross-References

► Robot Grasp Control 

► Robot Motion Control 


Recommended Reading 

This entry has presented a brief overview of the
basic force control techniques, and the cited
references represent a selection of the main pioneering
contributions. A more extensive treatment of this
topic with related bibliography can be found
in Villani and De Schutter (2008). Besides
impedance control and hybrid force/position
control, an approach designed to cope with
uncertainties in the environment geometry is
the parallel force/position control (Chiaverini
and Sciavicco 1993; Chiaverini et al. 1994). In
Ott et al. (2008) the passive compliance
of lightweight robots is combined with the active
compliance ensured by impedance control. A
systematic constraint-based methodology to
specify complex tasks has been presented by
De Schutter et al. (2007).

Abstract

In this chapter we give an introduction to
frequency domain system identification. We start
from the identification work loop in ► System
Identification: An Overview, Fig. 4, and we
discuss the impact of selecting the time or frequency
domain approach on each of the choices that are
in this loop. Although there is a full theoretical
equivalence between the time and frequency
domain identification approach, it turns out that,
from a practical point of view, there can be a
natural preference for one of the two domains.

Keywords 

Discrete and continuous time models; Experiment
setup; Frequency and time domain identification;
Plant and noise model

Introduction

System identification provides methods to build
a mathematical model for a dynamical system
starting from measured input and output signals
(► System Identification: An Overview; see the
section “Models and System Identification”).
Initially, the field was completely dominated by
the time domain approach, and the frequency
domain was used to interpret the results (Ljung
and Glover 1981). This picture changed in the
1990s with the development
of advanced frequency domain methods (Ljung
2006; Pintelon and Schoukens 2012), and
nowadays it is widely accepted that there is
a full theoretical equivalence between time
and frequency domain system identification
under some weak conditions (Agüero et al.
2009; Pintelon and Schoukens 2012). Dedicated
toolboxes are available for both domains (Kollar
1994; Ljung 1988). This raises the question of how
to choose between time and frequency domain
identification. Often the choice between
the two approaches can be made based on the user’s
familiarity with one of the two methods. However,
for some problems it turns out that there is a
natural preference for the time or the frequency
domain. This contribution discusses the main
issues that need to be considered when making
this choice, and it provides additional insights to
guide the reader to the best solution for her/his
problem.

In the identification work loop (► System
Identification: An Overview, Fig. 4), we need
to address three important questions (Ljung
1999, Sect. 1.4; Söderström and Stoica 1989,
Chap. 1; Pintelon and Schoukens 2012, Sect. 1.4)
that directly interact with the choice between
time and frequency domain system
identification:

• What data are available? What data are
needed? This discussion will influence the
selection of the measurement setup, the model
choice, and the design of the experiment.

• What kind of models will be used? We will
mainly focus on the identification of discrete
time and continuous time models, using
exactly the same frequency domain tools.

• How will the model be matched to the data?
This question boils down to the choice of a
cost function that measures the distance
between the data and the model. We will discuss
the use of nonparametric weighting functions
in the frequency domain.

In the next sections, we will address these
and similar questions in more detail. First,
we discuss the measurement of the raw
data. The choices that are made in this
step will have a strong impact on many
user aspects of the identification process.
The frequency domain formulation will turn
out to be a natural choice to propose a
unified formulation of the system identification
problem, including discrete and continuous
time modeling. Next, a generalized frequency
domain description of the system relation will
be proposed. This model will be matched to
the data using a weighted least squares cost
function formulated in the frequency domain.
This will allow for the use of
nonparametric weighting functions, based on
a nonparametric preprocessing of the data.
Finally, some remaining user aspects are
discussed.

Data Collection 

In this section, we discuss the measurement
assumptions that are made when the raw data are
collected. It will turn out that these will directly
influence the natural choice of the models that
are used to describe the continuous time physical
systems.

Time Domain and Frequency Domain 
Measurements 

The data can be collected either in the time or 
in the frequency domain, and we discuss briefly 
both options. 

Time Domain Measurements 
Most measurements are nowadays made in the
time domain because very fast high-quality
analog-to-digital converters (ADC) became
available at a low price. These allow us to sample
and discretize the continuous time input and
output signals and process these on a digital
computer. Also the excitation signals are mostly
generated from a discrete time sequence with a
digital-to-analog converter (DAC).

Frequency Domain System Identification,
Fig. 1 Comparison of the ZOH and BL signal
reconstruction of a discrete time sequence: (a) time
domain: • • • BL, - ZOH; (b) spectrum ZOH signal;
(c) spectrum BL signal

The continuous time input and output signals
$u_c(t), y_c(t)$ of the system to be modeled
are measured at the sampling moments $t_k = kT_s$,
with $T_s = 1/f_s$ the sampling period and
$f_s$ the sampling frequency: $u(k) = u_c(kT_s)$
and $y(k) = y_c(kT_s)$. The discrete time signals
$u(k), y(k)$, $k = 1,\ldots,N$, are transformed to
the frequency domain using the discrete Fourier
transform (DFT) (► Nonparametric Techniques
in System Identification, Eq. 1), resulting in the
DFT spectra $U(l), Y(l)$ at the frequencies $f_l = l f_s/N$.
With some abuse of notation, we will
reuse the same symbols later in this text to
denote the Z-transform of the signals; for example,
$Y(z)$ will also be used for the Z-transform of
$y(k)$.
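
The mapping from sampled signals to DFT spectra and frequency bins can be sketched in a few lines. The example assumes an unscaled DFT and sample indices starting at zero, which may differ from the normalization and indexing used in the referenced equation; the test signal and function names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Unscaled discrete Fourier transform of the sequence x."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * l * k / n)
                for k in range(n))
            for l in range(n)]


def dft_frequencies(n, fs):
    """Frequency f_l = l * fs / n (in Hz) of each DFT bin."""
    return [l * fs / n for l in range(n)]


if __name__ == "__main__":
    fs, n = 100.0, 100                     # assumed sampling setup
    u = [math.sin(2 * math.pi * 5.0 * k / fs) for k in range(n)]
    spectrum = dft(u)
    freqs = dft_frequencies(n, fs)
    peak = max(range(n // 2), key=lambda l: abs(spectrum[l]))
    print("peak bin:", peak, "frequency:", freqs[peak], "Hz")
```

Because the assumed sine completes an integer number of periods in the record, all its power falls in a single bin below half the sampling frequency.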

Frequency Domain Measurements 
A major exception to this general trend toward
time domain measurements are the
(high-frequency) network analyzers that measure
the transfer function of a system frequency
by frequency, starting from the steady-state
response to a sine excitation. The
frequency is stepped over the frequency
band of interest, resulting directly in a
measured frequency response function at a
user selected set of frequencies $\omega_k$, $k = 1,\ldots,F$:

$$G(\omega_k).$$

From the identification point of view, we
can easily fit the latter situation in the
frequency domain identification framework, by
putting

$$U(k) = 1, \qquad Y(k) = G(\omega_k).$$

For that reason we will focus completely on
the time domain measurement approach in the
remaining part of this contribution.

Zero-Order-Hold and Band-Limited Setup: 
Impact on the Model Choice 

No information is available on how the
continuous time signals $u_c(t), y_c(t)$ vary in between the
measured samples $u(k), y(k)$. For that reason
we need to make an assumption and make sure
that the measurement setup is selected such that
the intersample assumption is met. Two
intersample assumptions are very popular (Pintelon
and Schoukens 2012; Schoukens et al. 1994,
pp. 498-512): the zero-order-hold (ZOH) and the
band-limited assumption (BL). Both options are
shown in Fig. 1 and discussed below. The choice
of these assumptions does not only affect the
measurement setup; it also has a strong impact on
the selection between a continuous or a discrete
time model.

Zero-Order Hold 

The ZOH setup assumes that the excitation 
remains constant in between the samples. In 
practice, the model is identified between the 
discrete time reference signal in the memory 
of the generator and the sampled output. The 
intersample behavior is an intrinsic part of the 
model: if the intersample behavior changes, also 
the corresponding model will change. The ZOH 
assumption is very popular in digital control. In 
that case the sampling frequency f s is commonly 
chosen 10 times larger than the frequency band 
of interest. 

A discrete time model gives, in case of
noise-free data, an exact description between the
sampled input $u(k)$ and output $y(k)$ of the continuous
time system:

$$y(k) = G(q,\theta)u(k)$$

in the time domain (► System Identification: An
Overview, Eq. 6). In this expression, $q$ denotes
the shift operator (time domain), and it is replaced
by $z$ in the z-domain description (transfer
function description):

$$Y(z) = G(z,\theta)U(z).$$

Evaluating the transfer function at the unit circle
by replacing $z = e^{i\omega}$ results in the frequency
domain description of the system:

$$Y(e^{i\omega}) = G(e^{i\omega},\theta)U(e^{i\omega})$$

(► System Identification: An Overview, Eq. 34).
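
The exactness of a discrete time model under the ZOH assumption can be illustrated with a first-order example. The sketch assumes the continuous time system $G(s) = 1/(\tau s + 1)$, whose step-invariant (ZOH) discrete equivalent is a standard result; the system, the numerical values, and the function names are choices made for this illustration.

```python
import cmath
import math


def zoh_first_order(tau, ts):
    """Coefficients (a, b) of y(k+1) = a*y(k) + b*u(k), the ZOH equivalent
    of G(s) = 1 / (tau*s + 1) sampled with period ts."""
    a = math.exp(-ts / tau)
    return a, 1.0 - a


def simulate(a, b, u):
    """Response of the discrete time model to the input sequence u,
    starting from rest; returns y(0), ..., y(len(u) - 1)."""
    y = [0.0]
    for u_k in u[:-1]:
        y.append(a * y[-1] + b * u_k)
    return y


def freq_response(a, b, omega):
    """G(e^{i*omega}) = b / (e^{i*omega} - a), omega in rad/sample."""
    return b / (cmath.exp(1j * omega) - a)


if __name__ == "__main__":
    tau, ts = 0.5, 0.1
    a, b = zoh_first_order(tau, ts)
    y = simulate(a, b, [1.0] * 20)
    exact = [1.0 - math.exp(-k * ts / tau) for k in range(20)]
    print(max(abs(p - q) for p, q in zip(y, exact)))
```

For a piecewise-constant input the discrete model reproduces the samples of the continuous time step response up to rounding, which is what "exact description" means in this setup.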

Band-Limited Setup

The BL setup assumes that above a given
frequency $f_{\max} < f_s/2$, there is no power in
the signals. The continuous time signals are
filtered by well-tuned anti-alias filters (cutoff
frequency $f_{\max} < f_s/2$) before they are
sampled. Outside the digital control world, the
BL setup is the standard choice for discrete
time measurements. Without using anti-alias
filters, large errors can be created due to aliasing
effects: the high frequency ($f > f_s/2$) content
of the measured signals is folded down into the
frequency band of interest and acts there as a
disturbance. For that reason it is strongly advised
always to use anti-alias filters in the measurement
setup.
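
The folding effect can be demonstrated with two sampled sines. The sketch assumes ideal sampling without any anti-alias filter; the frequencies and function names are hypothetical.

```python
import math


def alias_frequency(f, fs):
    """Frequency in [0, fs/2] at which a sine of frequency f appears
    after ideal sampling at the rate fs."""
    return abs(f - fs * round(f / fs))


def sample_sine(f, fs, n):
    """First n samples of sin(2*pi*f*t) taken at the rate fs."""
    return [math.sin(2 * math.pi * f * k / fs) for k in range(n)]


if __name__ == "__main__":
    fs = 100.0
    print("70 Hz appears at", alias_frequency(70.0, fs), "Hz")
    high = sample_sine(70.0, fs, 50)
    low = sample_sine(30.0, fs, 50)
    # The samples coincide up to a sign change, so they cannot be told apart.
    print(max(abs(h + l) for h, l in zip(high, low)))
```

Once the signal has been sampled, the 70 Hz component is indistinguishable from a 30 Hz component, which is why the filtering has to happen before the sampler.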

The exact relation between BL signals is
described by a continuous time model, for example,
in the frequency domain:

$$Y(\omega) = G(\omega,\theta)U(\omega)$$

(Schoukens et al. 1994).

Combining Discrete Time Models and BL Data
It is also possible to identify a discrete time model between the BL data, at the cost of creating (very) small model errors (Schoukens et al. 1994). This is the standard setup in digital signal processing applications such as digital audio processing. The level of the model errors can be reduced by lowering the ratio $f_{\max}/f_s$ or by increasing the model order. In a properly designed setup, the discrete time model errors can be made very small, e.g., relative errors below $10^{-5}$. Table 1 gives an overview of the models corresponding to the experimental conditions.

Extracting Continuous Time Models from 
ZOH Data 

Although the most robust and practical choice 
to identify a continuous time model is to start 
from BL data, it is also possible to extract a 
continuous time model under the ZOH setup. 
A first possibility is to assume that the ZOH 
assumption is perfectly met, which is a very 
hard assumption to realize in practice. In that 
case the continuous time model can be retrieved 
by a linear step invariant transformation of the 
discrete time model. A second possibility is to 
select a very high sample frequency with respect 
to the bandwidth of the system. In that case it is 
advantageous to describe the discrete time model using the delta operator (Goodwin 2010), and we have that the coefficients of the discrete time model converge to those of the continuous time model.

Frequency Domain System Identification, Table 1 Relations between the continuous time system G(s) and the identified models as a function of the signal and model choices

Signal setup   DT-model (assuming ZOH setup)                             CT-model (assuming BL setup)
ZOH setup      Exact DT-model:                                           Not studied
               $G(z) = (1 - z^{-1})\,Z\{G(s)/s\}$
               'standard conditions DT modelling'
BL setup       Approximate DT-model:                                     Exact CT-model $G(s)$
               $G(z = e^{j\omega T_s}) \approx G(s = j\omega)$,           'standard conditions CT modelling'
               $|\omega| < \omega_s/2$
               'digital signal processing field'
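
A minimal sketch of both possibilities for an assumed first-order system $G(s) = a/(s + a)$: the step-invariant (ZOH) transformation gives $G(z) = (1 - p)/(z - p)$ with $p = e^{-aT}$, and rewriting the same model with the delta operator $\delta = (q - 1)/T$ gives $c/(\delta + c)$ with $c = (1 - p)/T$. The function names and the value $a = 2$ are illustrative assumptions.

```python
import math

def zoh_first_order(a, T):
    """Step-invariant (ZOH) discretization of G(s) = a/(s + a).

    Returns (b0, p) such that G(z) = b0 / (z - p).
    """
    p = math.exp(-a * T)
    return 1.0 - p, p

def delta_form(a, T):
    """Coefficient c of the same model written as c / (delta + c),
    with delta = (q - 1)/T."""
    b0, _ = zoh_first_order(a, T)
    return b0 / T

a = 2.0  # assumed continuous time pole at s = -2
for T in (0.1, 0.01, 0.001):
    b0, p = zoh_first_order(a, T)
    print(f"T = {T}: shift-form pole p = {p:.6f}, delta-form c = {delta_form(a, T):.6f}")
```

As the sampling period shrinks, the shift-operator pole approaches 1 for any value of `a`, whereas the delta-operator coefficient approaches the continuous time coefficient `a`.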

Models of Cascaded Systems
In some problems we want to build models for a cascade of two systems $G_1$, $G_2$. It is well known that the overall transfer function $G$ is given by the product $G(\omega) = G_1(\omega)G_2(\omega)$. This result also holds for models that are identified under the BL signal assumption: the model for the cascade will be the product of the models of the individual systems. However, the result does not hold under the ZOH assumption, because the intermediate signal between $G_1$ and $G_2$ does not meet the ZOH assumption. For that reason, the ZOH model of a cascaded system is not obtained by cascading the ZOH models of the individual systems.
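
The last statement can be checked numerically. The sketch below uses two assumed first-order systems, $G_1 = 1/(s+1)$ and $G_2 = 1/(s+2)$, and the standard ZOH model of $1/(s+a)$, namely $\frac{1}{a}(1 - p)/(z - p)$ with $p = e^{-aT}$. Because the step-invariant transformation is linear, the ZOH model of the cascade follows from the partial fraction expansion $G_1 G_2 = 1/(s+1) - 1/(s+2)$.

```python
import cmath
import math

def zoh_pole(a, T, z):
    """ZOH (step-invariant) model of 1/(s + a), evaluated at z."""
    p = math.exp(-a * T)
    return (1.0 - p) / a / (z - p)

T = 0.1                      # assumed sampling period in seconds
z = cmath.exp(1j * 0.5)      # a point on the unit circle (0.5 rad/sample)

# Product of the individual ZOH models of G1 and G2.
product_of_zoh = zoh_pole(1.0, T, z) * zoh_pole(2.0, T, z)

# ZOH model of the cascade, using G1*G2 = 1/(s+1) - 1/(s+2) and linearity.
zoh_of_product = zoh_pole(1.0, T, z) - zoh_pole(2.0, T, z)

rel_diff = abs(product_of_zoh - zoh_of_product) / abs(zoh_of_product)
```

Both expressions agree at DC ($z = 1$), but at this frequency they differ by a clearly non-negligible relative amount.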

Experiment Design: Periodic or Random 
Excitations? 

In general, arbitrary data can be used to identify 
a system as long as some basic requirements 
are respected (►System Identification: An 
Overview, section on Experiment Design). 
Imposing periodic excitations can be an important restriction of the user’s freedom to design the experiment, but we will show in the next sections that it also offers major advantages at many steps in the identification work loop (► System Identification: An Overview, Fig. 4).

With the availability of arbitrary waveform generators, it became possible to generate arbitrary periodic signals. The user should make two major choices during the design of the periodic excitation: the selection of the amplitude spectrum (How is the available power distributed over the frequency?) and the choice of the frequency resolution (What is the frequency step between two successive points of the measured FRF?) (Pintelon and Schoukens 2012, Sect. 5.3).

The amplitude spectrum is mainly set by the 
requirement that the excited frequency band 
should cover the frequency band of interest. 
A white noise excitation covers the full frequency 
band, including those bands that are of no interest 
for the user. This is a waste of power and it should 
be avoided. Designing a good power spectrum for 
identification and control purposes is discussed 
in ►Experiment Design and Identification for 
Control. 

The frequency resolution $f_0 = 1/T$ is set by the inverse of the period $T$ of the signal. It should be small enough so that no important dynamics are missed, e.g., a very sharp mechanical resonance.

The reader should be aware that exactly the same choices have to be made during the design of nonperiodic excitations. If, for example, a random noise excitation is used, the frequency resolution is also restricted by the length of the experiment $T_m$, and the corresponding frequency resolution is again $f_0 = 1/T_m$. The power spectrum of the noise excitation should be well shaped using a digital filter.
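
The two design choices for a periodic excitation can be sketched as follows. The example builds one period of a random-phase multisine, a sum of cosines at selected multiples of $f_0$. The function name `multisine`, the sampling frequency, the period length, and the 10 Hz to 100 Hz band are all assumptions made for illustration.

```python
import math
import random

def multisine(N, harmonics, amplitudes, seed=0):
    """One period (N samples) of a random-phase multisine.

    harmonics: excited DFT bins k (frequency k*f0, with f0 = fs/N).
    amplitudes: amplitude per excited bin (this sets the amplitude spectrum).
    """
    rng = random.Random(seed)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in harmonics]
    return [
        sum(A * math.cos(2.0 * math.pi * k * n / N + ph)
            for k, A, ph in zip(harmonics, amplitudes, phases))
        for n in range(N)
    ]

fs = 1000.0        # assumed sampling frequency in Hz
N = 500            # samples per period, so f0 = fs / N = 2 Hz
f0 = fs / N

# Put power only in the (assumed) band of interest, 10 Hz to 100 Hz.
band = [k for k in range(1, N // 2) if 10.0 <= k * f0 <= 100.0]
u = multisine(N, band, [1.0] * len(band))
```

Choosing a longer period `N` refines the frequency resolution, and changing `band` or the amplitudes reshapes the power spectrum without spending power outside the band.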

Nonparametric Preprocessing of the Data 
in the Frequency Domain 

Before a parametric model is identified from the raw data, a lot of information can be gained, almost for free, by making a nonparametric analysis of the data. This can be done with very little user interaction. Some of these methods are explicitly linked to the periodic nature of the data; other methods also apply to random excitations.

Nonparametric Frequency Analysis 
of Periodic Data 

By using simple DFT techniques, the following frequency domain information is extracted from sampled time domain data $u(t), y(t)$, $t = 1, \ldots, N$ (Pintelon and Schoukens 2012):

• The signal information: $U(k)$, $Y(k)$, the DFT spectra of the input and output, evaluated at the frequencies $f_k = k f_0$, with $k = 1, 2, \ldots, F$.

• Disturbing noise variance information: The full data record is split, so that each subrecord contains a single period. For each of these, the DFT spectrum is calculated. Since the signals are periodic, they do not vary from one period to the other, so that the observed variations can be attributed to the noise. By calculating the sample mean and variance over the periods at each frequency, a nonparametric noise analysis is available. The estimated variances $\hat{\sigma}_U^2(k)$, $\hat{\sigma}_Y^2(k)$ measure the disturbing noise power spectrum at frequency $f_k$ on the input and the output, respectively. The covariance $\hat{\sigma}_{YU}^2(k)$ characterizes the linear relations between the noise on the input and the output.

This is very valuable information because, even before starting the parametric identification step, we already get full access to the quality of the raw data. As a consequence, there is also no interference between the plant model estimation and the noise analysis: plant model errors do not affect the estimated noise model. It is also important to realize that no user interaction is required to make this analysis; it follows directly from a simple DFT analysis of the raw data. These are two major advantages of using periodic excitations.
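
The period-by-period analysis can be sketched in a few lines. The code splits a record into periods, computes the DFT of each period at selected bins, and returns the sample mean and sample variance over the periods. Only one signal is processed here; the input and the input-output covariance would be handled in the same way. The function names and the synthetic data (a sine at bin 3 plus white noise with standard deviation 0.1) are assumptions for illustration.

```python
import cmath
import math
import random

def dft_bin(x, k):
    """DFT of the sequence x at bin k (no normalization)."""
    N = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * n / N) for n, v in enumerate(x))

def periodic_noise_analysis(y, n_periods, bins):
    """Sample mean and sample variance of the DFT over the periods, per bin.

    Returns {k: (mean spectrum, variance of a single-period spectrum)}.
    """
    P = len(y) // n_periods          # samples per period
    out = {}
    for k in bins:
        spectra = [dft_bin(y[p * P:(p + 1) * P], k) for p in range(n_periods)]
        mean = sum(spectra) / n_periods
        var = sum(abs(s - mean) ** 2 for s in spectra) / (n_periods - 1)
        out[k] = (mean, var)
    return out

# Synthetic data (assumed): 8 periods of a sine at bin 3 plus white noise.
rng = random.Random(1)
P, M = 64, 8
y = [math.sin(2 * math.pi * 3 * n / P) + rng.gauss(0.0, 0.1) for n in range(P * M)]
stats = periodic_noise_analysis(y, M, bins=[3, 5])
```

At the excited bin the mean is dominated by the signal, while at the unexcited bin both the mean and the variance reflect the noise alone. No model of the plant is involved at any point.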

Nonlinear Analysis 

Using well-designed periodic excitations, it is possible to detect the presence of nonlinear distortions during the nonparametric frequency step. The level of the nonlinear distortions at the output of the system is measured as a function of the frequency, and it is even possible to differentiate between even (e.g., $x^2$) and odd (e.g., $x^3$) distortions. While the former only act as disturbing noise in a linear modeling framework, the latter will also affect the linearized dynamics and can change, for example, the pole positions of a system (Pintelon and Schoukens 2012, Sect. 4.3).
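
One way to see how even and odd distortions can be separated is to excite only odd harmonics and inspect the unexcited lines. The sketch below is an illustration under assumed conditions (static nonlinearities $0.1x^2$ and $0.1x^3$, four excited odd bins, a fixed phase pattern); it is not the detection procedure of the cited reference.

```python
import cmath
import math

def dft_mag(x, k):
    """Magnitude of the DFT of x at bin k, normalized by the record length."""
    N = len(x)
    return abs(sum(v * cmath.exp(-2j * math.pi * k * n / N)
                   for n, v in enumerate(x))) / N

N = 256
odd_bins = [1, 3, 5, 7]      # excite odd harmonics only (assumed design)
u = [sum(math.cos(2 * math.pi * k * n / N + 0.7 * k) for k in odd_bins)
     for n in range(N)]

y_even = [v + 0.1 * v ** 2 for v in u]   # linear part plus an even distortion
y_odd = [v + 0.1 * v ** 3 for v in u]    # linear part plus an odd distortion

even_lines = [2, 4, 6, 8]        # unexcited even bins
odd_unexcited = [9, 11, 13]      # unexcited odd bins
```

The even distortion shows up only at the even lines, and the odd distortion only at odd lines, including the unexcited ones. Odd contributions that fall on the excited lines are the ones that alter the linearized dynamics.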


Noise and Data Reduction 

By averaging the periodic signals over the successive periods, we get a first reduction of the noise. An additional noise reduction is possible when not all frequencies are excited. If a very wide frequency band has to be covered, a fine frequency resolution is needed at the low frequencies, whereas in the higher frequency bands, the resolution can be reduced. Signals with a logarithmic frequency distribution are used for that purpose. Eliminating the unexcited frequencies not only reduces the noise, it also significantly reduces the amount of raw data to be processed. By combining different experiments that each cover a specific frequency band, it is possible to measure a system over multiple decades, e.g., electrical machines are measured from a few mHz to a few kHz.

In a similar way, it is also possible to focus the fit on the frequency band of interest by including only those frequencies in the parametric modeling step.


High-Quality Frequency Response Function 
Measurements 

For periodic excitations, it is very simple to obtain high-quality measurements of the nonparametric frequency response function of the system. These results can be extended to random excitations at the cost of using more advanced algorithms that require more computation time (Pintelon and Schoukens 2012, Chap. 7). This approach is discussed in detail in ► Nonparametric Techniques in System Identification (Eq. 1).

Generalized Frequency Domain Models

A very important step in the identification work loop is the choice of the model class (► System Identification: An Overview, Fig. 4). Although most physical systems are continuous time, the models that we need might be either discrete time (e.g., digital control, computer simulations, digital signal processing) or continuous time (e.g., physical interpretation of the model, analog control) (Ljung 1999, Sects. 2.1 and 4.3). A major advantage of the frequency domain is that both model classes are described by the same transfer function model. The only difference is the choice of the frequency variable. For continuous time models we operate in the Laplace domain, and the frequency variable is retrieved on the imaginary axis by putting $s = j\omega$. For discrete time systems we work on the unit circle, which is described by the frequency variable $z = e^{j2\pi f/f_s}$. The application class can even be extended to include diffusion phenomena by putting $\Omega = \sqrt{j\omega}$ (Pintelon et al. 2005). From now on we will use the generalized frequency variable $\Omega$, and depending on the selected domain, the proper substitution $j\omega$, $e^{j2\pi f/f_s}$, or $\sqrt{j\omega}$ should be made. The unified discrete and continuous time description

$$Y(k) = G(\Omega_k, \theta)\,U(k)$$

also illustrates very nicely that in the frequency domain there is a strong similarity between discrete and continuous time system identification.

To apply this model to finite length measurements, it should be generalized to include the effect of the initial conditions (time domain) or the begin and end effects (called leakage in the frequency domain). See ► Nonparametric Techniques in System Identification, “The Leakage Problem” section. The amazing result is that in the frequency domain, both effects are described by exactly the same mathematical expression. This leads eventually to the following model in the frequency domain that is valid for periodic and arbitrary (nonperiodic) BL or ZOH excitations (Pintelon et al. 1997; McKelvey 2002; Pintelon and Schoukens 2012, Chap. 6):

$$Y(k) = G(\Omega_k, \theta)\,U(k) + T_G(\Omega_k, \theta),$$

which becomes for SISO (single-input-single-output) systems:

$$G(\Omega, \theta) = \frac{B(\Omega, \theta)}{A(\Omega, \theta)} \quad \text{and} \quad T_G(\Omega, \theta) = \frac{I(\Omega, \theta)}{A(\Omega, \theta)}.$$

$A$, $B$, $I$ are all polynomials in $\Omega$. The transient term $T_G(\Omega, \theta)$ models transient and leakage effects. It is most important for the reader to realize that this is an exact description for noise-free data. Observe that it is very similar to the description in the time domain: $y(t) = G(q, \theta)u(t) + t_G(t, \theta)$. In that case the transient term $t_G(t, \theta)$ models the initial transient that is due to the initial conditions.
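
A minimal sketch of the generalized frequency variable and of the model $Y(k) = G(\Omega_k)U(k) + T_G(\Omega_k)$ with $G = B/A$ and $T_G = I/A$. The function names, the string labels for the three domains, and the coefficient layout (descending powers of $\Omega$) are assumptions made for this example.

```python
import cmath
import math

def omega_var(f, domain, fs=None):
    """Generalized frequency variable for a frequency f in Hz."""
    if domain == "continuous":
        return 2j * math.pi * f                    # s = j*omega
    if domain == "discrete":
        return cmath.exp(2j * math.pi * f / fs)    # z = exp(j*2*pi*f/fs)
    if domain == "diffusion":
        return cmath.sqrt(2j * math.pi * f)        # sqrt(j*omega)
    raise ValueError(f"unknown domain: {domain}")

def polyval(coeffs, x):
    """Horner evaluation of a polynomial; coeffs in descending powers."""
    acc = 0j
    for c in coeffs:
        acc = acc * x + c
    return acc

def model_output(U, f, b, a, i, domain, fs=None):
    """Y = (B/A) * U + I/A, all evaluated at the generalized frequency."""
    Om = omega_var(f, domain, fs)
    A = polyval(a, Om)
    return polyval(b, Om) / A * U + polyval(i, Om) / A

# Assumed example: G(s) = 1/(s + 1) at omega = 1 rad/s, no transient term.
Y_ct = model_output(1.0, 1.0 / (2.0 * math.pi), [1.0], [1.0, 1.0], [0.0], "continuous")
```

The same `model_output` routine serves continuous time, discrete time, and diffusion models; only the substitution made in `omega_var` changes.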


Parametric Identification 

Once we have the data and the model available in the frequency domain, we define the following weighted least squares cost function to match the model to the data (Schoukens et al. 1997):

$$V(\theta) = \frac{1}{F}\sum_{k=1}^{F} \frac{\left|Y(k) - G(\Omega_k, \theta)\,U(k) - T_G(\Omega_k, \theta)\right|^2}{\hat{\sigma}_Y^2(k) + \hat{\sigma}_U^2(k)\,\left|G(\Omega_k, \theta)\right|^2 - 2\,\mathrm{Re}\!\left(\hat{\sigma}_{YU}^2(k)\,\overline{G(\Omega_k, \theta)}\right)}$$

where the overline denotes the complex conjugate. The properties of this estimator are fully studied in Schoukens et al. (1999) and Pintelon and Schoukens (2012, Sect. 10.3), and it is shown that it is a consistent and almost efficient estimator under very mild conditions.

The formal link with the time domain cost function, as presented in ► System Identification: An Overview, can be made by assuming that the input is exactly known ($\sigma_U^2(k) = 0$, $\sigma_{YU}^2(k) = 0$), and replacing the nonparametric noise model on the output by a parametric model:

$$\sigma_Y^2(k) = \lambda\,\left|H(\Omega_k, \theta)\right|^2.$$

These changes reduce the cost function, within a parameter independent constant $\lambda$, to

$$V_F(\theta) = \frac{1}{F}\sum_{k=1}^{F} \frac{\left|Y(k) - G(\Omega_k, \theta)\,U_0(k) - T_G(\Omega_k, \theta)\right|^2}{\left|H(\Omega_k, \theta)\right|^2},$$

which is exactly the same expression as Eq. 37 in ► System Identification: An Overview, provided that the frequencies cover the full unit circle and the transient term $T_G$ is omitted. The latter models the initial condition effects in the time domain. The expression shows the full equivalence with the classical discrete time domain formulation. If only a subsection of the full unit circle is used for the fit, the additional log det term in ► System Identification: An Overview, Eq. 21 should be added to the cost function $V_F(\theta)$.
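
The weighted least squares cost can be evaluated with a few lines of code once the spectra, the model values, and the nonparametric noise (co)variances are available at the $F$ frequencies. This is a sketch under stated assumptions: the function name `wls_cost` and the argument layout are hypothetical, and the conjugate on $G$ in the weighting assumes the covariance is defined as the expectation of the output noise times the conjugate of the input noise.

```python
def wls_cost(Y, U, G, T, var_Y, var_U, cov_YU):
    """Weighted least squares cost V = (1/F) * sum_k |e(k)|^2 / w(k).

    All arguments are equal-length sequences over the F frequencies:
    measured spectra Y, U; model values G and transient term T;
    nonparametric noise variances var_Y, var_U and covariance cov_YU.
    """
    F = len(Y)
    total = 0.0
    for k in range(F):
        e = Y[k] - G[k] * U[k] - T[k]                  # equation error
        w = (var_Y[k] + var_U[k] * abs(G[k]) ** 2      # its noise variance
             - 2.0 * (cov_YU[k] * G[k].conjugate()).real)
        total += abs(e) ** 2 / w
    return total / F

# Tiny assumed example: exact input (var_U = cov_YU = 0), unit output variance.
V = wls_cost(Y=[1 + 0j, 2 + 0j], U=[1.0, 1.0], G=[1 + 0j, 1 + 0j], T=[0.0, 0.0],
             var_Y=[1.0, 1.0], var_U=[0.0, 0.0], cov_YU=[0.0, 0.0])
```

In an identification run this function would be minimized over the model parameters that determine `G` and `T`; restricting the sequences to selected frequencies focuses the fit on a band of interest.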


Additional User Aspects in Parametric 
Frequency Domain System 
Identification 

In this section we highlight some additional user 
aspects that are affected by the choice for a time 
or frequency approach to system identification. 

Nonparametric Noise Models 

The use of a nonparametric noise model is a natural choice in the frequency domain. It is of course also possible to use the parametric noise model in the frequency domain formulation, but then we would lose two major advantages of the frequency domain formulation: (i) For periodic excitations, there is no interaction between the identification of the plant model and the nonparametric noise model. Plant model errors do not show up in the noise model. (ii) The availability of the nonparametric noise models eliminates the need for tuning the parametric noise model order, resulting in algorithms that are easier to use.

A disadvantage of using a nonparametric noise model is that we can no longer express that the noise model can share some dynamics with the plant model, for example, when the disturbance is an unobserved plant input.

It is also possible to use a nonparametric noise model in the time domain. This leads to a Toeplitz weighting matrix, and the fast numerical algorithms that are used to deal with these internally make use of FFT (fast Fourier transform) algorithms, which brings us back to the frequency domain representation of the data.

Stable and Unstable Plant Models 

In the frequency domain formulation, no special precaution is needed to deal with unstable models, so that we can tolerate these models without any problem. There are multiple reasons why this can be advantageous. The most obvious one is the identification of an unstable system, operating in a stabilizing closed loop. It can also happen that the intermediate models that are obtained during the optimization process are unstable. Imposing stability at each iteration can be too restrictive, resulting in estimates that are trapped in a local minimum. A last significant advantage is the possibility to split the identification problem (extract a model from the noisy data) and the approximation problem (approximate the unstable model by a stable one). This allows us to use in each step a cost function that is optimal for that step: maximum noise reduction in the first step, followed by a user-defined approximation criterion in the second step.

Model Selection and Validation 

An important step in the system identification procedure is the tuning of the model complexity, followed by the evaluation of the model quality on a fresh data set. The availability of a nonparametric noise model and a high-quality frequency response function measurement simplifies these steps significantly:

Absolute Interpretation of the Cost Function: 
Impact on Model Selection and Validation 
The weighted least squares cost function, using the nonparametric noise weighting, is a normalized function. Its expected value equals $E\{V(\hat{\theta})\} = (F - n_\theta/2)/F$, with $n_\theta$ the number of free parameters in the model and $F$ the number of frequency points. A cost function that is far too large points to remaining model errors. Unmodeled dynamics result in correlated residuals (difference between the measured and the modeled FRF): the user should increase the model order to capture these dynamics in the linear model. A cost function that is far too large, while the residuals are white, points to the presence of nonlinear distortions: the best linear approximation is identified, but the user should be aware that this approximation is conditioned on the actual excitation signal. A cost function that is far too low points to an error in the preprocessing of the data, resulting in a bad noise model.
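
As a quick worked instance of the expected value, with assumed numbers $F = 200$ frequency points and $n_\theta = 10$ free parameters (chosen only for illustration):

```latex
E\{V(\hat{\theta})\} = \frac{F - n_\theta/2}{F} = \frac{200 - 5}{200} = 0.975 .
```

A minimized cost close to this value is what a model without remaining errors would be expected to give; values far above or far below it call for the checks described above.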

Missing Resonances 

Some systems are lightly damped, resulting in a resonant behavior, for example, a vibrating mechanical structure. By comparing the parametric transfer function model with the nonparametric FRF measurements, it becomes clearly visible if a resonance is missed in the model. This can be due either to a model structure that is too simple (the model order should be increased) or to the model being trapped in a local minimum. In the latter case, better numerical optimization and initialization procedures should be looked for.

Identification in the Presence of Noise on 
the Input and Output Measurements 

Within the band-limited measurement setup, both the input and the output have to be measured. This leads in general to an identification framework where both the input and the output are disturbed by noise. Such problems are studied in the errors-in-variables (EIV) framework (Soderstrom 2012). A special case is the identification of a system that is captured in a feedback loop. In that case the noisy output measurements are fed back to the input of the system, which creates a dependency between the input and output disturbances. We discuss both situations briefly below.

Errors-in-Variables Framework 
The major difficulty of the EIV framework 
is the simultaneous identification of the plant 
model describing the input-output relations, 
the noise models that describe the input and 
the output noise disturbances, and the signal 
model describing the coloring of the excitation 
(Soderstrom 2012). Advanced identification 
methods are developed, but today it is still 
necessary to impose strong restrictions on the 
noise models, e.g., correlations between input 
and output noise disturbances are not allowed. 
The periodic frequency domain approach 
encapsulates the general EIV, including mutually 
correlated colored input-output noise. Again, a 
full nonparametric noise model is obtained in the 
preprocessing step. This reduces the complexity 
of the EIV problem to that of a classical 
weighted least squares identification problem 
which makes a huge difference in practice 
(Pintelon and Schoukens 2012; Soderstrom et 
al. 2010). 

Identification in a Feedback Loop 
Identification under feedback conditions can be solved with the time domain prediction error method (Ljung 1999, Sect. 13.4; Soderstrom and Stoica 1989, Chap. 10). This leads to consistent estimates, provided that the exact plant and noise model structure and order are retrieved. In the periodic frequency domain approach, a nonparametric noise model is extracted (variances of the input and output noise, and covariance between input and output noise) in the preprocessing step, without any user interaction. Next, these are used as a weighting in the weighted least squares cost function, which leads to consistent estimates provided that the plant model is flexible enough to capture the true plant transfer function (Pintelon and Schoukens 2012, Sect. 9.18).

Summary and Future Directions 

Theoretically, there is a full equivalence between the time and frequency domain formulations of the system identification problem. In many practical situations the user can make a free choice between both approaches, based on nontechnical arguments like familiarity with one of the two domains. However, some problems can be formulated more easily in the frequency domain. Identification of continuous time models is not more involved than retrieving a discrete time model. The frequency domain formulation is also the natural choice for using nonparametric noise models. This eliminates the need to select a specific noise model structure and order, although this might be a drawback for experienced users who can take advantage of a clever choice of this structure. The advantages that are directly linked to periodic excitation signals can be explored most naturally in the frequency domain: the noise model is available for free, EIV identification is not more involved than the output error identification problem, and identification under feedback conditions does not differ from open-loop identification. Nonstationary effects are an example of a problem that is more easily detected in the time domain. In general, we advise the reader to take the best of both approaches and to swap from one domain to the other whenever it gives some advantage to do so. In the future, it will be necessary to extend the framework to include a characterization of nonlinear and time-varying effects.

Cross-References 

► System Identification: An Overview

► Nonparametric Techniques in System Identification

► Experiment Design and Identification for Control


Recommended Reading 

We recommend the books of Ljung (1999) and Soderstrom and Stoica (1989) for a systematic study of time domain system identification. The book of Pintelon and Schoukens (2012) gives a comprehensive introduction to frequency domain identification. An extended discussion of the basic choices (intersample behavior, measurement setup) is given in Chap. 13 of Pintelon and Schoukens (2012) or in Schoukens et al. (1994). The other references in this list highlight some of the technical aspects that were discussed in this text.

Frequency-Response and Frequency-Domain Models

Abstract

A major advantage of using frequency response is the ease with which experimental information can be used for design purposes. Raw measurements of the output amplitude and phase of a plant undergoing a sinusoidal input excitation are sufficient to design a suitable feedback control. No intermediate processing of the data (such as finding poles and zeros or determining system matrices) is required to arrive at the system model. The wide availability of computers has rendered this advantage less important now than it was years ago; however, for relatively simple systems, frequency response is often still the most cost-effective design method. The method is most effective for systems that are stable in open loop. Yet another advantage is that it is the easiest method to use for designing dynamic compensation.

Keywords 

Bandwidth; Bode plot; Frequency response; Gain 
margin (GM); Magnitude; Phase; Phase margin 
(PM); Resonant peak; Stability 

Introduction: Frequency Response 

A very common way to use the exponential response of linear time-invariant systems (LTIs) is in finding the frequency response, or response to a sinusoid. First we express the sinusoid as a sum of two exponential expressions (Euler’s relation):

$$A\cos(\omega t) = \frac{A}{2}\left(e^{j\omega t} + e^{-j\omega t}\right). \tag{1}$$

Suppose we have an LTI system with input $u$ and output $y$. If we let $s = j\omega$ in the transfer function $G(s)$, then the response to $u(t) = e^{j\omega t}$ is $y(t) = G(j\omega)e^{j\omega t}$; similarly, the response to $u(t) = e^{-j\omega t}$ is $G(-j\omega)e^{-j\omega t}$. By superposition, the response to the sum of these two exponentials, which make up the cosine signal, is the sum of the responses:

$$y(t) = \frac{A}{2}\left[G(j\omega)e^{j\omega t} + G(-j\omega)e^{-j\omega t}\right]. \tag{2}$$

The transfer function $G(j\omega)$ is a complex number that can be represented in polar form or in magnitude-and-phase form as $G(j\omega) = M(\omega)e^{j\varphi(\omega)}$, or simply $G = Me^{j\varphi}$. With this substitution, Eq. (2) becomes for a specific input frequency $\omega = \omega_0$

$$y(t) = \frac{A}{2}M\left(e^{j(\omega t + \varphi)} + e^{-j(\omega t + \varphi)}\right) = AM\cos(\omega t + \varphi), \tag{3}$$

$$M = |G(j\omega)| = |G(s)|_{s = j\omega_0} = \sqrt{\{\mathrm{Re}[G(j\omega_0)]\}^2 + \{\mathrm{Im}[G(j\omega_0)]\}^2},$$

$$\varphi = \angle G(j\omega) = \tan^{-1}\left[\frac{\mathrm{Im}[G(j\omega_0)]}{\mathrm{Re}[G(j\omega_0)]}\right]. \tag{4}$$

This means that if an LTI system represented by the transfer function $G(s)$ has a sinusoidal input with magnitude $A$, the output will be sinusoidal at the same frequency with magnitude $AM$ and will be shifted in phase by the angle $\varphi$. $M$ is usually referred to as the amplitude ratio or magnitude and $\varphi$ is referred to as the phase, and they are both functions of the input frequency, $\omega$. The frequency response can be measured experimentally quite easily in the laboratory by driving the system with a known sinusoidal input, letting the transient response die, and measuring the steady-state amplitude and phase of the system’s output as shown in Fig. 1. The input frequency is set to sufficiently many values so that curves such as the one in Fig. 2 are obtained. Bode suggested that we plot $\log|M|$ vs. $\log\omega$ and $\varphi(\omega)$ vs. $\log\omega$ to best show the essential features of $G(j\omega)$. Hence, such plots are referred to as Bode plots. Bode plotting techniques are discussed in Franklin et al. (2015).
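
The magnitude and phase relations can be tried out numerically. The sketch below assumes an example plant $G(s) = 1/(s+1)$, an input amplitude $A = 2$, and an input frequency of 10 rad/s; these values and the function names are illustrative. The phase is computed with a four-quadrant arctangent (`cmath.phase`) so that it lands in the correct quadrant.

```python
import cmath
import math

def freq_response(G, w):
    """Return (M, phi): M = |G(jw)| and phi = angle of G(jw) in radians."""
    val = G(1j * w)
    return abs(val), cmath.phase(val)

G = lambda s: 1.0 / (s + 1.0)     # assumed example plant
A, w0 = 2.0, 10.0                 # assumed input amplitude and frequency (rad/s)
M, phi = freq_response(G, w0)

def y_ss(t):
    """Steady-state response to u(t) = A*cos(w0*t): A*M*cos(w0*t + phi)."""
    return A * M * math.cos(w0 * t + phi)
```

Repeating `freq_response` over a logarithmically spaced set of frequencies gives the data for a Bode plot of the assumed plant.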

We are interested in analyzing the frequency response not only because it will help us understand how a system responds to a sinusoidal input, but also because evaluating $G(s)$ with $s$ taking on values along the $j\omega$ axis will prove to be very useful in determining the stability of a closed-loop system. Since the $j\omega$ axis is the boundary between stability and instability, evaluating $G(j\omega)$ provides information that allows us to determine closed-loop stability from the open-loop $G(s)$.

For the second-order system

$$G(s) = \frac{1}{(s/\omega_n)^2 + 2\zeta(s/\omega_n) + 1} \tag{5}$$

the Bode plot is shown in Fig. 3 for various values of $\zeta$.

A natural specification for system performance in terms of frequency response is the bandwidth, defined to be the maximum frequency at which the output of a system will track an input sinusoid in a satisfactory manner. By convention, for the system shown in Fig. 4 with a sinusoidal input $r$, the bandwidth is the frequency of $r$ at which the output $y$ is attenuated to a factor of 0.707 times the input. (If the output is a voltage across a 1-Ω resistor, the power is $v^2$, and when $|v| = 0.707$, the power is reduced by a factor of 2. By convention, this is called the half-power point.) Figure 5 depicts the idea graphically for the frequency response of the closed-loop transfer function

$$\frac{Y(s)}{R(s)} \triangleq T(s) = \frac{KG(s)}{1 + KG(s)}. \tag{6}$$


Frequency-Response and Frequency-Domain Models, Fig. 1 Response of G(s) to the input u = sin 10t (Source: Franklin et al. (2010), p. 298, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)

Frequency-Response and Frequency-Domain Models, Fig. 2 Frequency response for G(s) (Source: Franklin et al. (2010), p. 83, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)


The plot is typical of most closed-loop systems in that (1) the output follows the input ($|T| \approx 1$) at the lower excitation frequencies and (2) the output ceases to follow the input ($|T| < 1$) at the higher excitation frequencies. The maximum value of the frequency-response magnitude is referred to as the resonant peak $M_r$.

Bandwidth is a measure of speed of response and is therefore similar to time-domain measures such as rise time and peak time or the $s$-plane measure of dominant-root(s) natural frequency. In fact, if the $KG(s)$ in Fig. 4 is such that the closed-loop response is given by Fig. 3a, we can see that the bandwidth will equal the natural frequency of the closed-loop root (that is, $\omega_{BW} = \omega_n$ for a closed-loop damping ratio of $\zeta = 0.7$). For other damping ratios, the bandwidth is approximately equal to the natural frequency of the closed-loop roots, with an error typically less than a factor of 2.
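
The relation between bandwidth and natural frequency can be checked numerically for the standard second-order form $T(s) = 1/((s/\omega_n)^2 + 2\zeta(s/\omega_n) + 1)$. The sketch below finds the frequency at which $|T|$ drops to $1/\sqrt{2} \approx 0.707$ by bisection; the function names, the search interval, and the tolerance are assumptions of this example.

```python
import math

def mag_T(w, wn, zeta):
    """|T(jw)| for T(s) = 1 / ((s/wn)^2 + 2*zeta*(s/wn) + 1)."""
    x = w / wn
    return 1.0 / math.sqrt((1.0 - x * x) ** 2 + (2.0 * zeta * x) ** 2)

def bandwidth(wn, zeta, tol=1e-10):
    """Frequency where |T| falls to 1/sqrt(2), found by bisection.

    The search starts at the resonant peak (or at 0 if there is none),
    beyond which |T| decreases monotonically.
    """
    target = 1.0 / math.sqrt(2.0)
    lo = wn * math.sqrt(max(0.0, 1.0 - 2.0 * zeta ** 2))
    hi = 10.0 * wn
    while hi - lo > tol * wn:
        mid = 0.5 * (lo + hi)
        if mag_T(mid, wn, zeta) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for zeta in (0.2, 0.5, 1.0 / math.sqrt(2.0), 1.0):
    print(f"zeta = {zeta:.3f}: bandwidth / wn = {bandwidth(1.0, zeta):.3f}")
```

For $\zeta = 1/\sqrt{2}$ the ratio is 1, and for the other damping ratios tried here it stays within a factor of 2 of the natural frequency.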

For a second-order system, the time responses are functions of the pole-location parameters $\zeta$ and $\omega_n$. If we consider the curve for $\zeta = 0.5$ to be an average, the rise time $t_r$ from $y = 0.1$ to $y = 0.9$ is approximately $\omega_n t_r = 1.8$. Thus, we can say that

$$t_r \cong \frac{1.8}{\omega_n}. \tag{7}$$

Although this relationship could be embellished by including the effect of the damping ratio, it is important to keep in mind how Eq. (7) is typically used. It is accurate only for a second-order system with no zeros; for all other systems it is a rough approximation to the relationship between $t_r$ and $\omega_n$. Most systems being analyzed for control systems design are more complicated than the pure second-order system, so designers use Eq. (7) with the knowledge that it is a rough approximation only. Hence, for a second-order system the bandwidth is inversely proportional to the rise time $t_r$, and we are able to link the time and frequency domain quantities in this way.
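
As a quick worked instance of Eq. (7), with an assumed natural frequency of $\omega_n = 2$ rad/s chosen only for illustration:

```latex
t_r \cong \frac{1.8}{\omega_n} = \frac{1.8}{2\ \text{rad/s}} = 0.9\ \text{s},
\qquad
\omega_n \to 2\,\omega_n \;\Rightarrow\; t_r \to \tfrac{1}{2}\,t_r .
```

Doubling the natural frequency, and hence roughly the bandwidth, halves the estimated rise time.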

The definition of the bandwidth stated here is meaningful for systems that have a low-pass filter behavior, as is the case for any physical control system. In other applications the bandwidth may be defined differently. Also, if the ideal model of the system does not have a high-frequency roll-off (e.g., if it has an equal number of poles and zeros), the bandwidth is infinite; however, this does not occur in nature as nothing responds well at infinite frequencies.

Frequency-Response and Frequency-Domain Models, Fig. 3 Frequency responses of standard second-order systems (a) magnitude (b) phase (Source: Franklin et al. (2010), p. 303, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)

Frequency-Response and Frequency-Domain Models, Fig. 4 Unity feedback system (Source: Franklin et al. (2010), p. 304, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)

In many cases, the designer’s primary concern is the error in the system due to disturbances rather than the ability to track an input. For error analysis, we are more interested in the sensitivity function $S(s) = 1 - T(s)$ than in $T(s)$. For most open-loop systems with high gain at low frequencies, $S(s)$ for a disturbance input has very low values at low frequencies and grows as the frequency of the input or disturbance approaches the bandwidth. For analysis of either $T(s)$ or $S(s)$, it is typical to plot their response versus the frequency of the input. Either frequency response for control systems design can be evaluated using the computer or can be quickly sketched for simple systems using the efficient methods described in Franklin et al. (2015). The methods described next are also useful to expedite the design process as well as to perform sanity checks on the computer output.
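
A minimal sketch of evaluating $T$ and $S = 1 - T$ by computer for a unity feedback loop. The loop gain $KG(s) = 10/(s(s+1))$ is an assumed example with high gain at low frequencies, and the function name is illustrative.

```python
def T_and_S(L):
    """Closed-loop T = L/(1 + L) and sensitivity S = 1 - T = 1/(1 + L)
    for a loop gain value L = KG(jw)."""
    T = L / (1.0 + L)
    return T, 1.0 - T

KG = lambda s: 10.0 / (s * (s + 1.0))   # assumed loop gain

for w in (0.01, 0.1, 1.0, 10.0):
    T, S = T_and_S(KG(1j * w))
    print(f"w = {w:6.2f} rad/s   |T| = {abs(T):.3f}   |S| = {abs(S):.3f}")
```

For this assumed loop, $|S|$ is very small at low frequencies and grows toward 1 as the frequency increases, while $|T|$ starts near 1 and rolls off.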


Neutral Stability: Gain and Phase 
Margins 

In the early days of electronic communications, 
most instruments were judged in terms of their 
frequency response. It is therefore natural that 
when the feedback amplifier was introduced, 
techniques to determine stability in the presence 
of feedback were based on this response. 

Suppose the closed-loop transfer function
of a system is known. We can determine the
stability of a system by simply inspecting the
denominator in factored form (because the
factors give the system roots directly) to observe
whether the real parts are positive or negative.
However, the closed-loop transfer function is
usually not known. In fact, the whole purpose
behind understanding the root-locus technique is
to be able to find the factors of the denominator
in the closed-loop transfer function, given only
the open-loop transfer function. Another way
to determine closed-loop stability is to evaluate
the frequency response of the open-loop transfer
function $KG(j\omega)$ and then perform a test on
that response. Note that this method also does
not require factoring the denominator of the
closed-loop transfer function. In this section we
will explain the principles of this method.

Frequency-Response and Frequency-Domain Models, Fig. 5 Definitions of bandwidth and resonant peak (Source: Franklin et al. (2010), p. 304, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)

Frequency-Response and Frequency-Domain Models, Fig. 6 Stability example: (a) system definition; (b) root locus (Source: Franklin et al. (2010), p. 318, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)

Suppose we have a system defined by Fig. 6a
and whose root locus behaves as shown in
Fig. 6b; that is, instability results if $K$ is larger
than 2. The neutrally stable points lie on the
imaginary axis, that is, where $K = 2$ and
$s = j1.0$. Furthermore, all points on the root
locus have the property that

$$|KG(s)| = 1 \quad \text{and} \quad \angle G(s) = -180^\circ.$$

At the point of neutral stability we see that these
root-locus conditions hold for $s = j\omega$, so

$$|KG(j\omega)| = 1 \quad \text{and} \quad \angle G(j\omega) = -180^\circ. \tag{8}$$

Thus, a Bode plot of a system that is neutrally
stable (that is, with $K$ defined such that a closed-loop
root falls on the imaginary axis) will satisfy
the conditions of Eq. (8). Figure 7 shows the
frequency response for the system whose root
locus is plotted in Fig. 6 for various values of $K$.
The magnitude response corresponding to $K = 2$
passes through 1 at the same frequency ($\omega =
1$ rad/s) at which the phase passes through $-180^\circ$,
as predicted by Eq. (8).
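
The neutral-stability conditions of Eq. (8) can be checked numerically. The sketch below is only an illustration: the text does not state the plant of Fig. 6, so the code assumes $G(s) = 1/(s(s+1)^2)$, which is consistent with the stated neutral-stability point ($K = 2$, $\omega = 1$ rad/s). The function name `loop_gain` is hypothetical.

```python
import math

def loop_gain(K, w):
    """Magnitude and phase (degrees) of K*G(jw).

    ASSUMPTION: G(s) = 1 / (s * (s + 1)**2); the text does not give G(s).
    """
    # |G(jw)| = 1 / (w * |1 + jw|^2) = 1 / (w * (1 + w^2))
    magnitude = K / (w * (1.0 + w * w))
    # The integrator contributes -90 deg, each pole at s = -1 contributes -atan(w)
    phase_deg = -90.0 - 2.0 * math.degrees(math.atan(w))
    return magnitude, phase_deg

mag, phase = loop_gain(K=2.0, w=1.0)
print(f"|KG(j1)| = {mag:.3f}, angle G(j1) = {phase:.1f} deg")
```

Under this assumed plant, both conditions of Eq. (8) hold at $\omega = 1$ rad/s when $K = 2$; smaller gains give a magnitude below 1 at the same frequency.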

Having determined the point of neutral stability,
we turn to a key question: Does increasing the
gain increase or decrease the system's stability?
We can see from the root locus in Fig. 6b that
any value of $K$ less than the value at the neutrally
stable point will result in a stable system. At the
frequency $\omega$ where the phase $\angle G(j\omega) = -180^\circ$
($\omega = 1$ rad/s), the magnitude $|KG(j\omega)| < 1.0$
for stable values of $K$ and $> 1$ for unstable values
of $K$. Therefore, we have the following trial
stability condition, based on the character of the
open-loop frequency response:

$$|KG(j\omega)| < 1 \quad \text{at} \quad \angle G(j\omega) = -180^\circ. \tag{9}$$

This stability criterion holds for all systems for
which increasing gain leads to instability and
$|KG(j\omega)|$ crosses the magnitude (= 1) once, the
most common situation. However, there are systems
for which an increasing gain can lead from
instability to stability; in this case, the stability
condition is

$$|KG(j\omega)| > 1 \quad \text{at} \quad \angle G(j\omega) = -180^\circ. \tag{10}$$

Based on the above ideas, we can now define the
robustness metrics gain and phase margins:

Phase Margin: Suppose at $\omega_1$, $|G(j\omega_1)| = 1/K$.
How much more phase could the system tolerate
(as a time delay, perhaps) before reaching
the stability boundary? The answer to this
question follows from Eq. (8), i.e., the phase
margin (PM) is defined as

$$\mathrm{PM} = \angle G(j\omega_1) - (-180^\circ). \tag{11}$$

Gain Margin: Suppose at $\omega_2$, $\angle G(j\omega_2) = -180^\circ$.
How much more gain could the
system tolerate (as an amplifier, perhaps)
before reaching the stability boundary? The
answer to this question follows from Eq. (9),
i.e., the gain margin (GM) is defined as

$$\mathrm{GM} = \frac{1}{K|G(j\omega_2)|}. \tag{12}$$
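
Equations (11) and (12) translate directly into a small computation. The following sketch again assumes the plant $G(s) = 1/(s(s+1)^2)$, which is not given in the text, and uses a plain bisection search to locate the two frequencies $\omega_1$ and $\omega_2$; the helper names are hypothetical.

```python
import math

def mag(w, K=1.0):
    # |K G(jw)| for the ASSUMED plant G(s) = 1 / (s * (s + 1)**2)
    return K / (w * (1.0 + w * w))

def phase_deg(w):
    # angle of G(jw) in degrees for the same assumed plant
    return -90.0 - 2.0 * math.degrees(math.atan(w))

def bisect(f, lo, hi, tol=1e-12):
    """Root of f on [lo, hi], assuming f changes sign exactly once there."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

K = 1.0
w1 = bisect(lambda w: mag(w, K) - 1.0, 0.1, 10.0)     # |K G(j w1)| = 1
w2 = bisect(lambda w: phase_deg(w) + 180.0, 0.1, 10.0)  # angle G(j w2) = -180 deg

PM = phase_deg(w1) - (-180.0)   # Eq. (11)
GM = 1.0 / mag(w2, K)           # Eq. (12)
print(f"w1 = {w1:.4f} rad/s, PM = {PM:.2f} deg")
print(f"w2 = {w2:.4f} rad/s, GM = {GM:.3f}")
```

For this assumed plant with $K = 1$, the gain margin equals 2, matching the observation that the loop becomes neutrally stable at $K = 2$.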

There are also rare cases when $|KG(j\omega)|$ crosses
magnitude (= 1) more than once, or where an increasing
gain leads to stability. A rigorous way
to resolve these situations is to use the Nyquist
stability criterion as discussed in Franklin et al.
(2015).

Frequency-Response and Frequency-Domain Models, Fig. 7 Stability example: (a) system definition; (b) root locus (Source: Franklin et al. (2010), p. 319, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)


Closed-Loop Frequency Response

The closed-loop bandwidth was defined earlier
in this section. The natural frequency is always
within a factor of 2 of the bandwidth for a second-order
system. We can help establish a more exact
correspondence by making a few observations.
Consider a system in which $|KG(j\omega)|$ shows the
typical behavior

$$|KG(j\omega)| \gg 1 \quad \text{for } \omega \ll \omega_c,$$
$$|KG(j\omega)| \ll 1 \quad \text{for } \omega \gg \omega_c,$$

where $\omega_c$ is the crossover frequency. The closed-loop
frequency-response magnitude is approximated by

$$|T(j\omega)| = \left|\frac{KG(j\omega)}{1 + KG(j\omega)}\right| \approx \begin{cases} 1, & \omega \ll \omega_c, \\ |KG|, & \omega \gg \omega_c. \end{cases} \tag{13}$$

In the vicinity of crossover, where
$|KG(j\omega)| = 1$, $|T(j\omega)|$ depends heavily on the
PM. A PM of $90^\circ$ means that $\angle G(j\omega_c) = -90^\circ$,
and therefore $|T(j\omega_c)| = 0.707$. On the other
hand, PM = $45^\circ$ yields $|T(j\omega_c)| = 1.31$.

The exact evaluation of Eq. (13) was used
to generate the curves of $|T(j\omega)|$ in Fig. 8. It
shows that the bandwidth for smaller values of
PM is typically somewhat greater than $\omega_c$, though
usually it is less than $2\omega_c$; thus,

$$\omega_c \le \omega_{BW} \le 2\omega_c. \tag{14}$$

Frequency-Response and Frequency-Domain Models, Fig. 8 Closed-loop bandwidth with respect to PM (Source: Franklin et al. (2010), p. 347, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)
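
The two crossover values quoted above (0.707 for PM = 90° and 1.31 for PM = 45°) follow from evaluating $|KG/(1+KG)|$ with $|KG| = 1$ and $\angle KG = -180^\circ + \mathrm{PM}$. A minimal sketch of that arithmetic, with a hypothetical function name, is:

```python
import cmath
import math

def closed_loop_mag_at_crossover(pm_deg):
    """|T(j w_c)| for a unity-feedback loop with phase margin pm_deg (degrees)."""
    # At crossover the loop gain has unit magnitude and phase -180 deg + PM.
    kg = cmath.rect(1.0, math.radians(pm_deg - 180.0))
    return abs(kg / (1.0 + kg))

for pm in (90.0, 45.0):
    print(f"PM = {pm:4.0f} deg  ->  |T(j w_c)| = {closed_loop_mag_at_crossover(pm):.3f}")
```

Equivalently, $|T(j\omega_c)| = 1/(2\sin(\mathrm{PM}/2))$, which shows directly how a shrinking phase margin raises the closed-loop peak near crossover.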

Another specification related to the closed-loop
frequency response is the resonant-peak
magnitude $M_r$, defined in Fig. 5. For linear
systems, $M_r$ is generally related to the damping
of the system. In practice, $M_r$ is rarely used; most
designers prefer to use the PM to specify the
damping of a system, because the imperfections
that make systems nonlinear or cause delays
usually erode the phase more significantly than
the magnitude.

It is also important in the design to achieve 
certain error characteristics and these are often 
evaluated as a function of the input or disturbance 
frequency. In some cases, the primary function 
of the control system is to regulate the output 
to a certain constant input in the presence of 
disturbances. For these situations, the key item of 
interest for the design would be the closed-loop 
frequency response of the error with respect to 
disturbance inputs. 

Summary and Future Directions 

Frequency-response methods are the
most popular because they can deal with
model uncertainty and can be measured in
the laboratory. A wide range of information
about the system can be displayed in a Bode
plot. The dynamic compensation can be carried
out directly from the Bode plot. Extension of
the ideas to multivariable systems has been
done via singular value plots. Extension to
nonlinear systems is still the subject of current
research.

Cross-References

► Classical Frequency-Domain Design
► Frequency Domain System Identification


Abstract

Feedback systems are designed to meet many
different objectives. Yet not all design objectives
are achievable as desired, because
they are often mutually conflicting and because
the system properties themselves may impose
design constraints and thus limit the
performance attainable. An important step in the
control design process is then to analyze which
system characteristics impose constraints and how,
and accordingly how to make tradeoffs
between different objectives by judiciously navigating
between the constraints. Fundamental limitation
of feedback control is an area of research
that addresses these constraints, limitations, and
tradeoffs.


Keywords 

Bode integrals; Design tradeoff; Performance 
limitation; Tracking and regulation limits 


Introduction 

Fundamental control limitations are
those limitations intrinsic to feedback that can be neither
overcome nor circumvented, regardless of how the
controller is designed. By its nature, the study of
fundamental limitations dwells on a Hamletian
question: Can or can't it be done? What can
and cannot be done? To be more specific, yet
still general enough, at heart here are issues
concerning the benefit and cost of feedback.
We ask such questions as (1) What system
characteristics may impose inherent limitations
regardless of controller design? (2) What inherent
constraints may exist in design, and what kind of
tradeoffs are to be made? (3) What are the best
achievable performance limits? (4) How can the
constraints, limitations, and limits be quantified
in ways meaningful for control analysis and
design? Needless to say, issues of this kind
are very general and in fact are commonplace
in science and engineering. Analogies can be
made, for example, to Shannon's theorems
in communications theory, the Cramer-Rao
bound in statistics, and Heisenberg's uncertainty
principle in quantum mechanics; they all address
fundamental limits and limitations, though for
different problems and in different contexts. The
search for fundamental limitations of feedback
control, as such, may be considered a quest for
an "ultimate truth" or the "law of feedback."

Because they are fundamental and important,
inquiries into control performance limitations
have persisted over time and continue to be of
vital interest. It is worth emphasizing, however,
that performance limitation studies are not
merely driven by intellectual curiosity, but
lead to better and more realistic feedback
system design and hence are of tangible practical
value. An analysis of performance limitations
can aid control design in several respects. First,
it may provide a fundamental limit on the best
performance attainable irrespective of controller
design, thus furnishing a guiding benchmark in
the design process. Second, it helps a designer
assess which system properties may be
inherently conflicting, and how, and thus pose inherent
difficulties to performance objectives, which
in turn helps the designer specify reasonable
goals and make judicious modifications and
revisions to the design. In this process, the theory
of fundamental control limitations promises
to provide valuable insights and analytical
justifications for long-held design heuristics and,
indeed, to extend such heuristics further.
This has become increasingly relevant, as
modern control design theory and practice rely
heavily on optimization-based numerical routines
and tools.

Systematic investigation and understanding
of fundamental control limitations began with
the classical work of Bode in the 1940s on
logarithmic sensitivity integrals, known as the
Bode integrals. Bode's work has had a lasting
impact on the theory and practice of control
and has inspired research effort that continues
to the present, leading to a variety of extensions
and new results which seek to quantify design
constraints and performance limitations by
logarithmic integrals of Bode and Poisson type.
On the other hand, the search for the best
achievable performance is a natural goal in
optimal control problems, which has led to bounds
on optimal performance indices defined under
various criteria. Likewise, the latter developments
have also been substantial and are continuing to
branch to different problems and different system
categories.

In this entry we attempt to provide a summary
overview of the key developments in the study
of fundamental limitations of feedback control.
While the understanding of this subject has been
compelling and the results are rather prolific, we
focus on Bode-type integral relations and the
best achievable performance limits, two branches
of the study that are believed to be the most
well developed. Roughly speaking, the Bode-type
integrals are most useful for quantifying the
inherent design constraints and tradeoffs in the
frequency domain, while the performance results
provide fundamental limits of canonical control
objectives defined using frequency- and time-domain
criteria. Invariably, the two sets of results
are intimately related and reinforce each other.
The essential message then is that despite its
many benefits, feedback has its own limitations
and is subject to various constraints. Feedback
design, for that reason, often requires a hard
tradeoff.


Control Design Specifications

We begin by introducing the basic notation to be
used in the sequel. Let $\mathbb{C}_+ := \{z : \mathrm{Re}(z) > 0\}$
denote the open right half plane (RHP) and $\overline{\mathbb{C}}_+$
the closed RHP (CRHP). For a complex number
$z$, we denote its conjugate by $\bar{z}$. For a complex
vector $x$, we denote its conjugate transpose by
$x^H$, and its Euclidean norm by $\|x\|_2$. The largest
singular value of a matrix $A$ will be written as
$\bar{\sigma}(A)$. If $A$ is a Hermitian matrix, we denote
by $\bar{\lambda}(A)$ its largest eigenvalue. For any unitary
(i.e., unit-norm) vectors $u, v \in \mathbb{C}^n$, we denote by $\angle(u, v)$ the
principal angle between the two one-dimensional
subspaces, called the directions, spanned by $u$ and
$v$:

$$\cos\angle(u, v) := |u^H v|.$$

For a stable continuous-time system with transfer
function matrix $G(s)$, we define its $H_\infty$ norm by

$$\|G\|_\infty := \sup_{\mathrm{Re}(s) > 0} \bar{\sigma}(G(s)).$$

We consider the standard configuration of
finite-dimensional linear time-invariant (LTI)
feedback control systems given in Fig. 1. In this
setup, $P$ and $K$ represent the transfer functions
of the plant model and controller, respectively,
$r$ is a command signal, $d$ a disturbance, $n$ a
noise signal, and $y$ the output response. Define
the open-loop transfer function, the sensitivity
function, and the complementary sensitivity
function by

$$L = PK, \quad S = (I + L)^{-1}, \quad T = L(I + L)^{-1},$$

respectively. Then the output can be expressed as

$$y = Sd - Tn + SPr.$$

Fundamental Limitation of Feedback Control, Fig. 1 Feedback configuration

The goal of feedback control design is to design a
controller $K$ so that the closed-loop system is stable
and that it achieves certain performance specifications.
Typical design objectives include:

• Disturbance attenuation. The effect of the
disturbance signal on the output should be
kept small, which translates into the requirement
that the sensitivity function be small in
magnitude at the frequencies of interest. For a
single-input single-output (SISO) system, this
mandates that

$$|S(j\omega)| < 1, \quad \forall \omega \in [0, \omega_1).$$

The sensitivity magnitude $|S(j\omega)|$ is to be
kept as small as possible in the low frequency
range.

• Noise reduction. The noise response should be
reduced at the output. This requires that the
complementary sensitivity function be small
in magnitude at frequencies of interest. For a
SISO system, the objective is to achieve

$$|T(j\omega)| < 1, \quad \forall \omega \in [\omega_2, \infty).$$

Similarly, the magnitude $|T(j\omega)|$ is desired to
be the smallest at high frequencies.

Moreover, feedback can be introduced to achieve
many other objectives including regulation, command
tracking, reduced sensitivity to parameter
variations, and, more generally, system robustness,
all by manipulating the three key transfer
functions: the open-loop transfer function, the
sensitivity function, and the complementary sensitivity
function.

The design and implementation of feedback
systems, on the other hand, are also subject to
many constraints, which include

1. Causality: A system must be causal for it to be
implementable. This constraint requires that
no ideal filter can be used for compensation
and that the system's relative degree and delay
be preserved.

2. Stability: The closed-loop system must be stable.
This implies that every closed-loop transfer
function must be bounded and analytic in
CRHP.

3. Interpolation: There should be no unstable
pole-zero cancelation between the plant and
controller, in order to rule out hidden instability.
Thus, at each RHP pole $p_i$ and zero $z_i$, it is
necessary that

$$S(p_i) = 0, \quad T(p_i) = 1,$$
$$S(z_i) = 1, \quad T(z_i) = 0.$$

4. Structural constraints: Constraints in this category
arise from the feedback structure itself;
for example, $S(s) + T(s) = 1$. The implication
then is that the closed-loop transfer
functions cannot be independently designed,
thus resulting in conflicting design objectives.

For a given plant, each of these constraints is
unalterable and hence is fundamental, and each
will constrain the performance attainable in one
way or another. The question we face then is how
the constraints may be captured in a form that is
directly pertinent and useful to feedback design.
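
The interpolation and structural constraints listed above can be verified on a small example. The sketch below uses a hypothetical loop that does not appear in the text: the plant $P(s) = (s-2)/((s-1)(s+3))$, with one RHP zero at $s = 2$ and one RHP pole at $s = 1$, and a static controller $K = -1.75$ chosen only because it stabilizes the loop. $S$ and $T$ are evaluated in polynomial form so that they remain well defined at the open-loop pole.

```python
import numpy as np

# HYPOTHETICAL example loop (not from the text):
#   L(s) = K * P(s) = -1.75 * (s - 2) / ((s - 1) * (s + 3))
num_L = -1.75 * np.array([1.0, -2.0])          # K * (s - 2)
den_L = np.polymul([1.0, -1.0], [1.0, 3.0])    # (s - 1)(s + 3)
char_poly = np.polyadd(den_L, num_L)           # closed-loop characteristic polynomial

def S(s):
    # S = 1 / (1 + L) = den_L / (den_L + num_L)
    return np.polyval(den_L, s) / np.polyval(char_poly, s)

def T(s):
    # T = L / (1 + L) = num_L / (den_L + num_L)
    return np.polyval(num_L, s) / np.polyval(char_poly, s)

closed_loop_poles = np.roots(char_poly)
print("closed-loop poles:", closed_loop_poles)  # all in the open left half plane

p, z = 1.0, 2.0
print("S(p), T(p):", S(p), T(p))   # interpolation at the RHP pole
print("S(z), T(z):", S(z), T(z))   # interpolation at the RHP zero
print("S + T at s = 3j:", S(3j) + T(3j))
```

No choice of stabilizing controller can change the values of $S$ and $T$ at $p$ and $z$; only the behavior elsewhere in the complex plane is left to the designer.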


Bode Integral Relations 

In classical feedback control theory, Bode's
gain-phase formula (Bode 1945) is used to express
the aforementioned design constraints for
SISO systems.

Bode Gain-Phase Integral Suppose that $L(s)$
has no pole and zero in $\mathbb{C}_+$. Then at any frequency
$\omega_0$,

$$\angle L(j\omega_0) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{d \log|L(j\omega_0 e^{\nu})|}{d\nu} \, \log\coth\frac{|\nu|}{2} \, d\nu.$$

A special form of the Hilbert transform, this
gain-phase formula relates the gain and phase of
the open-loop transfer function evaluated along
the imaginary axis. Its implication may be
explained as follows. In order to make the sensitivity
response small in the low frequency range,
the open-loop transfer function is required to
have a high gain; the higher, the better. On the
other hand, for noise reduction and robustness
purposes, we need to keep the loop gain low at
high frequencies; the lower, the better. Evidently,
to best meet these objectives, we want the two
frequency bands to be as wide as possible. This then
requires a steep decrease of the loop gain and
hence a rather negative slope in the crossover
region, say, the intermediate frequency range near
$\omega_0$. But the gain-phase relationship tells us that a
very negative derivative in the gain will lead to
a very negative phase, driving the phase closer
to negative 180 degrees, namely, the critical
point of stability. It consequently reduces the
phase margin and may even cause instability. As
a result, the gain-phase relationship demonstrates
a conflict between the two design objectives. It is
safe to claim that much of the classical feedback
design theory came as a consequence of this
simple relationship, aiming to shape the open-loop
frequency response in the crossover region
by trial and error, using lead or lag filters.
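
The gain-phase relation can be checked numerically on a simple case. The sketch below is an illustration under stated assumptions: it takes $L(s) = 1/(s+1)$ (an arbitrary stable, minimum phase example), uses natural logarithms, and evaluates the integral at $\omega_0 = 1$ with a midpoint rule whose grid never lands on the integrable singularity at $\nu = 0$. The function names are hypothetical.

```python
import numpy as np

def gain_phase_integral(dloggain_dnu, nu_max=40.0, n=400_000):
    """(1/pi) * integral of dlog|L|/dnu * log(coth(|nu|/2)) over [-nu_max, nu_max].

    Midpoint rule; with n even, no grid point coincides with nu = 0.
    """
    h = 2.0 * nu_max / n
    nu = -nu_max + (np.arange(n) + 0.5) * h
    weight = np.log(1.0 / np.tanh(np.abs(nu) / 2.0))   # log coth(|nu|/2)
    return np.sum(dloggain_dnu(nu) * weight) * h / np.pi

def slope_first_order_lag(nu):
    # ASSUMED example L(s) = 1/(s+1) at w0 = 1:
    # log|L(j e^nu)| = -0.5*log(1 + e^(2 nu)), so its derivative is -1/(1 + e^(-2 nu)).
    return -1.0 / (1.0 + np.exp(-2.0 * nu))

phase_rad = gain_phase_integral(slope_first_order_lag)
print(f"phase from integral: {np.degrees(phase_rad):.2f} deg")
print(f"exact angle of 1/(1+j): {np.degrees(np.angle(1.0 / (1.0 + 1.0j))):.2f} deg")
```

The weight $\log\coth(|\nu|/2)$ is sharply peaked at $\nu = 0$, so the phase at $\omega_0$ is dominated by the gain slope near $\omega_0$; this is why a steep roll-off through crossover costs phase margin.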

With the gain-phase formula, the design
specifications imposed on closed-loop transfer
functions are translated only approximately into
requirements on the open-loop transfer function,
and the tradeoff between different design
goals is achieved by shaping the open-loop gain
and phase. A more direct vehicle to accomplish
this same goal is Bode's sensitivity integral (Bode
1945).

Bode Sensitivity Integrals Let $p_i \in \mathbb{C}_+$ be the
unstable poles and $z_i \in \mathbb{C}_+$ the nonminimum
phase zeros of $L(s)$. Suppose that the closed-loop
system in Fig. 1 is stable.

(i) If $L(s)$ has relative degree greater than one,
then

$$\int_0^\infty \log|S(j\omega)| \, d\omega = \pi \sum_i p_i.$$

(ii) If $L(s)$ contains no less than two integrators,
then

$$\int_0^\infty \frac{\log|T(j\omega)|}{\omega^2} \, d\omega = \pi \sum_i \frac{1}{z_i}.$$

Bode's original work concerns the sensitivity
integral for open-loop stable systems only. The
integral relations shown herein, which are attributed
to Freudenberg and Looze (1985) and
Middleton (1991), respectively, provide generalizations
to open-loop unstable and nonminimum
phase systems.

Why are Bode sensitivity integrals important?
What is the hidden message behind the mathematical
formulas? Simply put, the Bode integral
shows that a feedback system's sensitivity must
obey a kind of conservation law, or invariance
property: the integral of the logarithmic
sensitivity magnitude over the entire frequency
range must be a nonnegative constant, determined
by the open-loop unstable poles. This property
mandates a tradeoff between sensitivity reduction
and sensitivity amplification in different frequency
bands. Indeed, to achieve disturbance
attenuation, the logarithmic sensitivity magnitude
must stay below zero dB, the lower the better.
For noise reduction and robustness, however, its
tail has to roll off sufficiently fast to zero dB
at high frequencies. Since, in light of the integral
relation, the total area under the logarithmic
magnitude curve is nonnegative, the logarithmic
magnitude must rise above zero dB, so that under
its curve, the positive and negative areas may
cancel each other to yield a nonnegative value.
As such, an undesirable sensitivity amplification
occurs, resulting in a fundamental tradeoff
between the desirable sensitivity reduction and
the undesirable sensitivity amplification, known
colloquially as the waterbed effect.


MIMO Integral Relations

For a multi-input multi-output (MIMO) system
depicted in Fig. 1, the sensitivity and complementary
sensitivity functions, which now are transfer
function matrices, satisfy similar interpolation
constraints: at each RHP pole $p_i$ and zero $z_i$ of
$L(s)$, the equations

$$S(p_i)\eta_i = 0, \quad T(p_i)\eta_i = \eta_i,$$
$$w_i^H S(z_i) = w_i^H, \quad w_i^H T(z_i) = 0$$

hold with some unitary vectors $\eta_i$ and $w_i$, where
$\eta_i$ is referred to as a right pole direction vector
associated with $p_i$, and $w_i$ a left zero direction
vector associated with $z_i$.

While it seems both natural and tempting, the
extension of Bode integrals to MIMO systems has
been a highly nontrivial task. Deep at the root is
the complication resulting from the directionality
properties of MIMO systems. Unlike in a SISO
system, the measure of frequency response
magnitude is now the largest singular value
of a transfer function matrix, which represents
the worst-case amplification of energy-bounded
signals, a direct counterpart to the gain of a
scalar transfer function. This fact alone
makes a fundamental difference and poses a
formidable obstacle. From a technical standpoint,
the logarithmic function of the largest singular
value is no longer a harmonic function as
in the SISO case, but only a subharmonic
function. As a consequence, familiar
tools from analytic function theory, such
as the Cauchy and Poisson theorems, the very
backbone in developing Bode integrals, cease to
be applicable. Nevertheless, it remains possible
to extend Bode integrals in their essential spirit.
Advances have been made by Chen (1995, 1998,
2000).

MIMO Bode Sensitivity Integrals Let $p_i \in \mathbb{C}_+$
be the unstable poles of $L(s)$ and $z_i \in \mathbb{C}_+$
the nonminimum phase zeros of $L(s)$. Suppose
that the closed-loop system in Fig. 1 is
stable.

(i) If $L(s)$ has relative degree greater than one,
then

$$\int_0^\infty \log\bar{\sigma}(S(j\omega)) \, d\omega \ge \pi \bar{\lambda}\left(\sum_i p_i \eta_i \eta_i^H\right).$$

(ii) If $L(s)$ contains no less than two integrators,
then

$$\int_0^\infty \frac{\log\bar{\sigma}(T(j\omega))}{\omega^2} \, d\omega \ge \pi \bar{\lambda}\left(\sum_i \frac{1}{z_i} w_i w_i^H\right),$$

where $\eta_i$ and $w_i$ are some unitary vectors related
to the right pole direction vectors associated with
$p_i$ and the left zero direction vectors associated
with $z_i$, respectively.

From these extensions, it is evident that the same
limitations and tradeoffs on the sensitivity and
complementary sensitivity functions carry over
to MIMO systems; in fact, both integrals reduce
to the Bode integrals when specialized to
SISO systems. Yet there is something additional
and unique to MIMO systems: the integrals now
depend not only on the locations but also on the
directions of the zeros and poles. In particular, it
can be shown that they depend on the mutual orientation
of these directions, and the dependence
can be explicitly characterized geometrically by
the principal angles between the directions. This
new phenomenon, which finds no analog in SISO
systems, thus highlights the important role of
directionality in sensitivity tradeoff and, more
generally, in the design of MIMO systems.

A more sophisticated and, accordingly, more
informative variant of Bode integrals is the Poisson
integral for sensitivity and complementary
sensitivity functions (Freudenberg and Looze
1985), which can be used to provide quantitative
estimates of the waterbed effect. MIMO versions
of Poisson integrals are also available (Chen
1995, 2000).
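
The waterbed effect expressed by the SISO sensitivity integral (i) can be observed numerically. The sketch below uses a hypothetical loop that is not taken from the text, $L(s) = 10/((s-1)(s+3))$, which has relative degree two, one unstable pole at $p = 1$, and a stable closed loop (characteristic polynomial $s^2 + 2s + 7$). The semi-infinite frequency axis is mapped to a finite interval by the substitution $\omega = \tan\theta$, and natural logarithms are assumed.

```python
import numpy as np

def log_abs_S(w):
    # HYPOTHETICAL loop: L(s) = 10 / ((s - 1)(s + 3)), unstable pole at p = 1
    s = 1j * w
    L = 10.0 / ((s - 1.0) * (s + 3.0))
    return np.log(np.abs(1.0 / (1.0 + L)))

# Substitute w = tan(theta), dw = dtheta / cos(theta)^2, theta in (0, pi/2);
# the midpoint rule keeps the grid away from both endpoints.
n = 200_000
h = (np.pi / 2.0) / n
theta = (np.arange(n) + 0.5) * h
integrand = log_abs_S(np.tan(theta)) / np.cos(theta) ** 2

total = np.sum(integrand) * h
reduction_area = np.sum(integrand[integrand < 0]) * h      # |S| < 1 (attenuation)
amplification_area = np.sum(integrand[integrand > 0]) * h  # |S| > 1 (amplification)

print(f"integral of log|S| = {total:.5f}  (pi * p = {np.pi:.5f})")
print(f"area below 0 dB: {reduction_area:.4f}, area above 0 dB: {amplification_area:.4f}")
```

In this example the area of sensitivity amplification exceeds the area of sensitivity reduction by exactly the amount dictated by the unstable pole, which is the tradeoff the integral relation describes.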


Frequency-Domain Performance Bounds

Performance bounds complement the integral
relations and provide fundamental thresholds on
the best performance attainable.
Such bounds are useful in providing benchmarks
for evaluating a system's performance prior to
and after controller design. In the frequency
domain, fundamental limits can be specified as
the minimal peak magnitude of the sensitivity and
complementary sensitivity functions achievable
by feedback, or formally, the minimal achievable
$H_\infty$ norms:

$$\gamma_{\min}^S := \inf\{\|S(s)\|_\infty : K(s) \text{ stabilizes } P(s)\},$$
$$\gamma_{\min}^T := \inf\{\|T(s)\|_\infty : K(s) \text{ stabilizes } P(s)\}.$$

Drawing upon Nevanlinna-Pick interpolation
theory for analytic functions, one can obtain
exact performance limits under rather general
circumstances (Chen 2000).

$H_\infty$ Performance Limits Let $z_i \in \mathbb{C}_+$ be the
nonminimum phase zeros of $P(s)$ with left direction
vectors $w_i$, and $p_i \in \mathbb{C}_+$ the unstable poles
of $P(s)$ with right direction vectors $\eta_i$, where $z_i$
and $p_i$ are all distinct. Then,

$$\gamma_{\min}^S = \gamma_{\min}^T = \sqrt{1 + \bar{\sigma}^2\left(Q_z^{-1/2} Q_{zp} Q_p^{-1/2}\right)},$$

where $Q_z$, $Q_p$, and $Q_{zp}$ are the matrices given
by

$$Q_z := \left[\frac{w_i^H w_j}{z_i + \bar{z}_j}\right], \quad Q_p := \left[\frac{\eta_i^H \eta_j}{\bar{p}_i + p_j}\right], \quad Q_{zp} := \left[\frac{w_i^H \eta_j}{z_i - p_j}\right].$$

More explicit bounds showing how zeros and
poles may interact to have an effect on these
limits can be obtained, e.g., as

$$\gamma_{\min}^S = \gamma_{\min}^T \ge \sqrt{\sin^2\angle(w_i, \eta_j) + \left|\frac{\bar{p}_j + z_i}{p_j - z_i}\right|^2 \cos^2\angle(w_i, \eta_j)},$$

which demonstrates once again that the pole
and zero directions play an important role in
MIMO systems. Note that for RHP poles and
zeros located close to each other, this bound
can become excessively large, which is
another reason why unstable pole-zero cancelation
must be prohibited. Note also that for
MIMO systems, however, whether near pole-zero
cancelation is problematic depends additionally
on the mutual orientation of the pole and zero
directions.


Tracking and Regulation Limits

Tracking and regulation are two canonical
objectives of servomechanisms and constitute
chief criteria in assessing the performance of
feedback control systems. Understanding gained
from these problems sheds light on more
general issues in feedback design.
In its full generality, a tracking system can be
depicted as in Fig. 2, in which a 2-DOF (degree
of freedom) controller $K$ is to be designed for
the output $z$ to track a given reference input $r$,
based on the feedforward of the reference signal
$r$ and the feedback of the measured output $y$.
The tracking performance is defined in the time
domain by the integral square error

$$J = \int_0^\infty \|z(t) - r(t)\|_2^2 \, dt.$$

Fundamental Limitation of Feedback Control, Fig. 2 2-DOF tracking control structure

Typically, we take $r$ to be a step signal, which
in the MIMO setting corresponds to a unitary
constant vector, i.e., $r(t) = v$, $t > 0$ and $r(t) =
0$, $t < 0$, where $\|v\|_2 = 1$. We assume $P$ to be
LTI. But $K$ can be arbitrarily general, as long as
it is causal and stabilizing. We want $z$ to not only
track $r$ asymptotically but also minimize $J$. But
how small can it be?

For the regulation problem, a general setup is
given in Fig. 3. Likewise, $K$ may be taken as a
2-DOF controller. The control output energy is
measured by the quadratic cost

$$E = \int_0^\infty \|u(t)\|_2^2 \, dt.$$

Fundamental Limitation of Feedback Control, Fig. 3 2-DOF regulator

We consider a disturbance signal $d$, typically
taken as an impulse signal, $d(t) = v\delta(t)$, where
$v$ is a unitary vector. In this case, the disturbance
can be interpreted as a nonzero initial condition,
and the controller $K$ is to regulate the system's
zero-input response. Similarly, we assume that $P$
is LTI, but allow $K$ to be any causal, stabilizing
controller. Evidently, for a stable $P$, the problem
is trivial; the response will restore itself to the
origin and thus no energy is required. But what
if the system is unstable? How much energy
must the controller generate to combat the
disturbance? What is the smallest amount of
energy required? These questions are answered
by the best achievable limits of the tracking
and regulation performance (Chen et al. 2000,
2003).

Tracking and Regulation Performance Limits
Let $p_i \in \mathbb{C}_+$ and $z_i \in \mathbb{C}_+$ be the RHP poles and
zeros of $P(s)$, respectively. Then,

$$\inf\{E : K \text{ stabilizes } P(s)\} = 2\sum_i p_i \cos^2\angle(\xi_i, v),$$

$$\inf\{J : K \text{ stabilizes } P(s)\} = 2\sum_i \frac{1}{z_i} \cos^2\angle(\zeta_i, v),$$

where $\xi_i$ and $\zeta_i$ are some unitary vectors related
to the right pole direction vectors associated with
$p_i$ and the left zero direction vectors associated
with $z_i$, respectively.

It becomes instantly clear that the optimal
performance depends on both the pole/zero locations
and their directions. In particular, it depends
on the mutual orientation between the input and
pole/zero directions. This sheds some interesting
light. Take the tracking performance as an example.
For a SISO system, the minimal tracking
error can never be made zero for a nonminimum
phase plant; in other words, perfect tracking
can never be achieved. Yet this is possible for
MIMO systems, when the input and zero directions
are appropriately aligned, specifically when
they are orthogonal. Interestingly, the optimal
performance in both cases can be achieved by LTI
controllers, even though more general controllers are allowed.
Consequently, the results herein provide the true fundamental
limits that cannot be further improved
by using any other more general form of control,
such as nonlinear, time-varying feedforward and
feedback. It is simply the best one can ever hope
for, and the LTI controllers turn out to be optimal.
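
The role of the alignment factor $\cos^2\angle(\cdot, v)$ in the tracking limit can be illustrated with a few lines of code. The sketch below is hedged in two ways: it assumes real RHP zeros, and it assumes the limit has the form $2\sum_i z_i^{-1}\cos^2\angle(\zeta_i, v)$, with the leading factor exposed as the parameter `scale` so that it is easy to change. The zero location and direction vectors are hypothetical.

```python
import numpy as np

def cos2_angle(direction, v):
    """cos^2 of the principal angle between two directions: |d^H v|^2 for unit vectors."""
    d = np.asarray(direction, dtype=complex)
    v = np.asarray(v, dtype=complex)
    d = d / np.linalg.norm(d)
    v = v / np.linalg.norm(v)
    return abs(np.vdot(d, v)) ** 2      # np.vdot conjugates its first argument

def tracking_limit(zeros, zero_dirs, v, scale=2.0):
    """ASSUMED form: scale * sum_i cos^2(angle(zeta_i, v)) / z_i, for real RHP zeros."""
    return scale * sum(cos2_angle(d, v) / z for z, d in zip(zeros, zero_dirs))

zeros = [1.0]                  # hypothetical nonminimum phase zero
zero_dirs = [[1.0, 0.0]]       # hypothetical direction vector for that zero

for v in ([1.0, 0.0], [1.0, 1.0], [0.0, 1.0]):
    print(f"step direction {v}: best achievable J = {tracking_limit(zeros, zero_dirs, v):.3f}")
```

When the step direction is orthogonal to the zero direction the limit is zero, which is the MIMO-only possibility of perfect tracking for a nonminimum phase plant noted above.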


Summary and Future Directions 

Whether in the time or frequency domain, while
the results presented herein may differ in form
and context, they unequivocally point to the
fact that inherent constraints exist in feedback
design, and fundamental limitations will necessarily
arise, limiting the performance achievable
regardless of controller design. Such constraints
and limitations are especially exacerbated by the
nonminimum phase zeros and unstable poles in
the system. Understanding of these constraints
and limitations proves essential to the success of
control design.
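
How unstable poles and nonminimum phase zeros jointly limit performance can be made concrete with the $H_\infty$ limit $\sqrt{1 + \bar{\sigma}^2(Q_z^{-1/2} Q_{zp} Q_p^{-1/2})}$ stated earlier. The sketch below evaluates that expression with NumPy for hypothetical pole/zero data; direction vectors are stored as matrix columns, and the helper names are illustrative only.

```python
import numpy as np

def inv_sqrt(Q):
    """Inverse square root of a Hermitian positive definite matrix."""
    vals, vecs = np.linalg.eigh(Q)
    return (vecs / np.sqrt(vals)) @ vecs.conj().T

def gamma_min(zeros, W, poles, Eta):
    """H-infinity limit for S and T.

    W:   columns are left zero direction vectors w_i (unit norm)
    Eta: columns are right pole direction vectors eta_i (unit norm)
    """
    z = np.asarray(zeros, dtype=complex)
    p = np.asarray(poles, dtype=complex)
    W = np.asarray(W, dtype=complex)
    Eta = np.asarray(Eta, dtype=complex)
    Qz = (W.conj().T @ W) / (z[:, None] + z.conj()[None, :])
    Qp = (Eta.conj().T @ Eta) / (p.conj()[:, None] + p[None, :])
    Qzp = (W.conj().T @ Eta) / (z[:, None] - p[None, :])
    M = inv_sqrt(Qz) @ Qzp @ inv_sqrt(Qp)
    return float(np.sqrt(1.0 + np.linalg.norm(M, 2) ** 2))

# SISO case, hypothetical zero z = 2 and pole p = 1: limit is |p + z| / |p - z| = 3
print(gamma_min([2.0], [[1.0]], [1.0], [[1.0]]))

# Two-channel case with the same locations but different mutual orientations
w = [[1.0], [0.0]]
print(gamma_min([2.0], w, [1.0], [[0.0], [1.0]]))                   # orthogonal directions
print(gamma_min([2.0], w, [1.0], np.array([[1.0], [1.0]]) / np.sqrt(2.0)))  # 45 degrees apart
```

Moving the pole toward the zero makes the SISO value grow without bound, whereas orthogonal pole and zero directions leave the limit at 1 regardless of how close the pole and zero are.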

For both its intrinsic appeal and fundamental
implications, the study of fundamental control
limitations will continue to be a topic of enduring
vitality. Open challenges remain and further effort
is called for, e.g., to incorporate information and
communication constraints into control limitation
studies, of which networked control and multi-agent
systems are notable examples.


Cross-References 
► $H_2$ Optimal Control

► Linear Quadratic Optimal Control 
Abstract

Game theory provides a mature mathematical
foundation for making security decisions in a
principled manner. Security games help formalize
security problems and decisions using quantitative
models. The resulting analytical frameworks
lead to better allocation of limited resources
and result in more informed responses to
security problems in complex systems and organizations.
The game-theoretic approach to security
is applicable to a wide variety of systems and
critical infrastructures such as electricity, water,
financial services, and communication networks.

Keywords 

Complex systems; Cyberphysical system security; 
Game theory; Security games 

Introduction 

Securing a system involves making numerous
decisions, whether the system is a computer
network, part of a business process in an
organization, or part of a critical infrastructure.
One has to decide, for example,
how to configure sensors for surveillance, whether
to collect further information on system properties,
how to allocate resources to secure a critical
segment, or who should be able to access a specific
function in the system. The decision-maker can be,
depending on the setting, a regular employee, a system
administrator, or the chief technical officer of an
organization. In many cases, the decisions are
made automatically by a computer program, such
as allowing a packet to pass the firewall or filtering
it out. The time frame of these decisions varies
widely, from milliseconds if made by software
to days and weeks, e.g., when they are part of a
strategic plan. Each security decision has a cost,
and any decision-maker is always constrained by a
limited amount of available resources. More
importantly, each decision carries a security risk
that needs to be taken into account when balancing
the costs and the benefits.

Security games facilitate building analytical
models which capture the interaction between
malicious attackers, who aim to compromise networks,
and owners or administrators defending
them. Attacks exploiting vulnerabilities of the
underlying systems and defensive countermeasures
constitute the moves of the game. Thus, the
strategic struggle between attackers and defenders
is formalized quantitatively based on the solid
mathematical foundation provided by the field of
game theory.
An important aspect of security games is the
allocation of limited available resources from the
perspectives of both attackers and defenders. If
the players had access to unlimited resources
(e.g., time, computing power, bandwidth), then
the resulting security games would be trivial. In
real-world security settings, however, both attackers
and defenders have to act strategically and
make numerous decisions when allocating their
respective resources. Unlike in an optimization
approach, security games take into account the
decisions and resource limitations of both the
attackers and the defenders.

Security Games 

A security game is defined by four components:
the players, the set of possible actions or strategies
for each player, the outcome of the game for
each player as a result of their action-reaction,
and the information structures in the game. The players
have their own (selfish or malicious) motivations
and inherent resource constraints. Based on
these motivations and the information available, they
choose the most beneficial strategies for themselves
and act accordingly. Hence, game theory
helps analyze decision-makers interacting on a
system in a quantitative manner.

An Example Formulation 

A simple security game can be formulated as a
two-player strategic (noncooperative) game,
where one player is the attacker and the other
is the defender protecting a system. Let
the discrete actions available to the attacker and
the defender be $\{a, b\}$ and $\{c, d\}$, respectively.
Each attack-defense pair leads to one of the
outcome pairs for the attacker and the defender
$\{(x_1, y_1), (x_2, y_2), (x_3, y_3), (x_4, y_4)\}$, which
represent the respective player’s gains (or losses).
This security game is depicted graphically in
Fig. 1. It can also be represented as a matrix
game as follows, where the attacker is the row
player and the defender is the column player:

$$
\begin{array}{c|cc}
 & (c) & (d) \\ \hline
(a) & (x_1, y_1) & (x_2, y_2) \\
(b) & (x_3, y_3) & (x_4, y_4)
\end{array}
$$



Game Theory for Security, Fig. 1 A simple, two-player
security game [figure not reproduced; it lists the outcomes
$(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$, $(x_4, y_4)$]


If the decision variables of the players are
continuous, for example, $x \in [0, a]$ and $y \in [0, c]$
denote attack and defense intensity, respectively,
then the resulting continuous-kernel game is described
using functions instead of a matrix. Then,
$J^{\text{attacker}}(x, y)$ and $J^{\text{defender}}(x, y)$ quantify the cost
of the attacker and defender as a function of their
actions, respectively.
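
The matrix form above lends itself to a direct check of which action pairs are stable against unilateral deviation. The following is a minimal sketch, not part of the original formulation: the numerical payoffs and the function name `pure_nash_equilibria` are hypothetical, chosen only to illustrate the idea, and both players are assumed to maximize their gains.

```python
from itertools import product

ATTACKER_ACTIONS = ["a", "b"]  # row player
DEFENDER_ACTIONS = ["c", "d"]  # column player

# Hypothetical gains (x, y) for attacker and defender; the article keeps
# (x1, y1), ..., (x4, y4) symbolic, so these numbers are assumptions.
PAYOFF = {
    ("a", "c"): (1, 2),   # (x1, y1)
    ("a", "d"): (4, -1),  # (x2, y2)
    ("b", "c"): (2, 1),   # (x3, y3)
    ("b", "d"): (3, 0),   # (x4, y4)
}


def pure_nash_equilibria(payoff, rows, cols):
    """Return all action pairs from which neither player gains by deviating alone."""
    equilibria = []
    for r, c in product(rows, cols):
        x, y = payoff[(r, c)]
        # The attacker cannot do better by switching rows in the same column.
        attacker_ok = all(payoff[(r2, c)][0] <= x for r2 in rows)
        # The defender cannot do better by switching columns in the same row.
        defender_ok = all(payoff[(r, c2)][1] <= y for c2 in cols)
        if attacker_ok and defender_ok:
            equilibria.append((r, c))
    return equilibria


print(pure_nash_equilibria(PAYOFF, ATTACKER_ACTIONS, DEFENDER_ACTIONS))
```

A game of this size may have no such pair at all, in which case one has to look at randomized (mixed) strategies instead.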

Security Game Types 

In its simplest formulation, the conflict between
those defending a system and malicious attackers
targeting it can be modeled as a two-person zero-sum
security game, where the loss of a player
is the gain of the other. Alternatively, two- and
multi-person nonzero-sum security games generalize
this for capturing a broader range of interactions.
Static game formulations and their repeated
versions are helpful for modeling myopic behavior
of players in fast-changing situations where
planning future actions is of little use. In the
case where the underlying system dynamics are
predictable and available to the players, dynamic
security game formulations can be utilized. If
there is an order of actions in the game, for example,
a purely reactionary defender, then leader-follower
games can be used to formulate such
cases where the attacker takes the lead and the
defender follows.

Within the framework of security games, the
concept of Nash equilibrium, where no player
gains from deviating from its own Nash equilibrium
strategy if others stick with theirs, provides
a solid foundation. However, there are refinements
and additional solution concepts for cases where the
game is dynamic, where there is more than one
Nash equilibrium, or where there are information
limitations in the game. These constitute an open and ongoing
research topic. A related research question is the
design of security games to ensure a favorable
outcome from a global perspective while taking
into account the independence of individual players
in their decisions.

In certain cases, it is useful to analyze the
player interactions in multiple layers. For example,
in some security games there may be defenders
and malicious attackers trying to influence a
population of other players by indirect means.
In order to circumvent the modeling complexity of
the problem, evolutionary games have been suggested.
While evolutionary games forsake modeling
individual player actions, they provide valuable
insights into the collective behavior of populations
of players and ensure tractability. Such models
are useful, for example, in the analysis of various
security policies affecting many users or the security
of large-scale critical systems.

Another important aspect of security decisions
is the availability and acquisition of information
on the properties of the system at hand,
the actions of other players, and the incentives
behind them. Clearly, the amount of information
available has a direct influence on the decisions,
yet acquiring information is often costly or even
infeasible in some cases. Then, the decisions
have to be made with partial information, and
information collection becomes part of the decision
process itself, creating a complex feedback
loop.

Statistical or machine learning techniques and
system identification are other useful methods
in the analysis of security games where players
use the acquired information iteratively to update
their own model of the environment and other
players. The players then decide on their best
courses of action. Existing work on fictitious
play and reinforcement learning methods such as
Q-learning is applicable and useful. A unique
feature of security games is the fact that players
try to hide their actions from others. These observability
issues and distortions in observations
can be captured by modeling the interaction between
players who observe each other’s actions
as a noisy communication channel.
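
As an illustration of the iterative model updating mentioned above, the following is a minimal sketch of discrete-time fictitious play in a two-player zero-sum matrix game. It is an assumption-laden toy, not a method taken from the cited literature: the function name `fictitious_play`, the payoff matrix, and the noise-free observation of past actions are all hypothetical simplifications.

```python
def fictitious_play(attacker_payoff, rounds=10000):
    """Fictitious play for a two-player zero-sum matrix game.

    attacker_payoff[i][j] is the attacker's gain (the defender's loss) when
    the attacker plays row i and the defender plays column j. Each player
    best-responds to the empirical frequency of the opponent's past actions.
    Returns the empirical action frequencies of attacker and defender.
    """
    n_rows, n_cols = len(attacker_payoff), len(attacker_payoff[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    row_counts[0] += 1  # arbitrary initial actions (an assumption)
    col_counts[0] += 1
    for _ in range(rounds - 1):
        # Attacker's gain for each row against the defender's action counts.
        row_values = [
            sum(attacker_payoff[i][j] * col_counts[j] for j in range(n_cols))
            for i in range(n_rows)
        ]
        # Attacker's gain for each column against the attacker's own counts.
        col_values = [
            sum(attacker_payoff[i][j] * row_counts[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        best_row = max(range(n_rows), key=lambda i: row_values[i])
        best_col = min(range(n_cols), key=lambda j: col_values[j])
        row_counts[best_row] += 1
        col_counts[best_col] += 1
    return ([c / rounds for c in row_counts], [c / rounds for c in col_counts])


# Hypothetical scenario: the attacker targets one of two nodes, the defender
# monitors one of them; the attacker gains 1 if unobserved and loses 1 if caught.
attack_vs_monitor = [[-1, 1], [1, -1]]
attacker_mix, defender_mix = fictitious_play(attack_vs_monitor)
print(attacker_mix, defender_mix)
```

In this toy game the empirical frequencies drift toward an even mix over the two nodes. Modeling hidden or distorted observations, as discussed above, would require replacing the exact action counts with noisy estimates.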


Applications 

An early application of the decision and game-theoretic
approach has been to the well-defined
jamming problem, where malicious attackers aim
to disrupt wireless communication between legitimate
parties (Kashyap et al. 2004; Zander 1990).
Detection of security intrusion and anomalies due
to attacks is another problem, where the interaction
between attackers and defenders has been
modeled successfully using game theory (Alpcan
and Başar 2011; Kodialam and Lakshman
2003). Decision and game-theoretic approaches
have been applied to a broad variety of networked
system security problems such as security investments
in organizations (Miura-Ko et al. 2008),
(location) privacy (Buttyan and Hubaux 2008;
Kantarcioglu et al. 2011), distributed attack detection,
attack trees and graphs, adversarial control
(Altman et al. 2010), network path selection
(Zhang et al. 2010) and topology planning in the
presence of adversaries (Gueye et al. 2010), as
well as to other types of security games and
decisions. More recently, security games have
been used to investigate the cyberphysical security
of the (smart) power grid (Law et al. 2012). The
proceedings of the last three Conferences on Decision
and Game Theory for Security, published
as edited volumes in 2010 (Alpcan et al. 2010),
2011 (Baras et al. 2011), and 2012 (Grossklags
and Walrand 2012), as well as the recent survey
paper (Manshaei et al. 2013) present an extensive
segment of the literature on the subject.

Analytical risk management is a related 
emerging research subject. Analytical methods 
and game theory have been applied to 
the field only recently but with increasing 
success (Guikema 2009; Mounzer et al. 2010). 
Another emerging topic is adversarial
mechanism design (Chorppath and Alpcan
2011; Roth 2008), where the goal is to design
mechanisms resistant to malicious behavior.


Summary and Future Directions 

Game theory provides quantitative methods for 
studying the players in security problems such 
as attackers, defenders, and users as well as their
interaction and incentives. Hence, it facilitates 
making decisions on the best courses of action 
in addressing security problems while taking 
into account resource limitations, underlying 
incentive mechanisms, and security risks. Thus, 
security games and associated quantitative 
models have started to replace the prevalent 
ad hoc decision processes in a wide variety of 
security problems from safeguarding critical 
infrastructure to risk management, trust, and 
privacy in networked systems. 

Game theory for security is a young and active
research area as evidenced by the recently initiated
conference series “Conference on Decision
and Game Theory for Security” (www.gamesec-conf.org),
the increasing number of journal and
conference articles, as well as the recently published
books (Alpcan and Başar 2011; Buttyan
and Hubaux 2008; Tambe 2011).
Abstract

This article provides an overview of the aspects
of game theory that are covered in this Encyclopedia,
which includes a broad spectrum of topics
on static and dynamic game theory. It starts with
a brief overview of game theory, identifying its
basic ingredients, and continues with a brief historical
account of the development and evolution
of the field. It concludes by providing pointers to
other articles in the Encyclopedia on game theory,
and a list of references.


Keywords 

Cooperation; Dynamic games; Evolutionary 
game theory; Nash equilibrium; Stackelberg
equilibrium 

What Is Game Theory? 

Game theory deals with strategic interactions
among multiple decision makers, called players
(and in some contexts agents), with each player’s
preference ordering among multiple alternatives
captured in an objective function for that player,
which she either tries to maximize (in which case
the objective function is a utility function or a
benefit function) or minimize (in which case we
refer to the objective function as a cost function or
a loss function). For a nontrivial game, the objective
function of a player depends on the choices
(actions or equivalently decision variables) of
at least one other player, and generally of all
the players, and hence a player cannot simply
optimize her own objective function independent
of the choices of the other players. This
brings in a coupling among the actions of the
players and binds them together in decision making
even in a noncooperative environment. If the
players are able to enter into a cooperative agreement
so that the selection of actions or decisions
is done collectively and with full trust, so that
all players would benefit to the extent possible,
then we would be in the realm of cooperative
game theory, where issues such as bargaining
and characterization of fair outcomes, coalition
formation, and excess utility distribution are of
relevance; an article in this Encyclopedia (by
Haurie) discusses cooperation and cooperative
outcomes in the context of dynamic games. Other
aspects of cooperative game theory can be found
in several standard texts on game theory, such as
Owen (1995), Vorob’ev (1977), or Fudenberg and
Tirole (1991). See also the survey article by
Saad et al. (2009), which emphasizes applications
of cooperative game theory to communication
networks.

If no cooperation is allowed among the players,
then we are in the realm of noncooperative
game theory, where first one has to introduce a
satisfactory solution concept. Leaving aside for
the moment the issue of how the players can
reach such a solution point, let us address the
issue of what would be the minimum features one
would expect to see there. To first order, such
a solution point should have the property that
if all players but one stay put, then the player
who has the option of moving away from the
solution point should not have any incentive to
do so because she cannot improve her payoff.
Note that we cannot allow two or more players
to move collectively from the solution point,
because such a collective move requires cooperation,
which is not allowed in a noncooperative
game. Such a solution point where none of the
players can improve her payoff by a unilateral
move is known as a noncooperative equilibrium
or Nash equilibrium, named after John Nash, who
introduced it and proved that it exists in finite
games (i.e., games where each player has only
a finite number of alternatives), over 60 years
ago; see Nash (1950, 1951). This result and
its various extensions for different frameworks
as well as its computation (both off-line and
online) are discussed in several articles in this
Encyclopedia. Another noncooperative equilibrium
solution concept is the Stackelberg equilibrium,
introduced in von Stackelberg (1934),
and predating the Nash equilibrium, where there
is a hierarchy in decision making among the
players, with some of the players, designated
as leaders, having the ability to first announce
their actions (and make a commitment to play
them) and the remaining players, designated as
followers, taking these actions as given in the
process of computation of their noncooperative
(Nash) equilibria (among themselves). Before
announcing their actions, the leaders would of
course anticipate these responses and determine
their actions in a way that the final outcome
will be most favorable to them (in terms of
their objective functions). For a comprehensive
treatment of Nash and Stackelberg equilibria for
different classes of games, see Başar and Olsder
(1999).
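
The leader-follower reasoning above can be made concrete for a finite game with one leader and one follower. The sketch below is illustrative only: the function name `stackelberg_pure`, the payoff numbers, the assumption that both players maximize, and the rule that the follower breaks ties in the leader's favor are all choices made here, not taken from the cited references.

```python
def stackelberg_pure(leader_payoff, follower_payoff):
    """Pure-strategy Stackelberg solution of a finite two-player game.

    payoff[i][j] is the payoff when the leader commits to action i and the
    follower responds with action j; both players maximize. Ties in the
    follower's best response are resolved in the leader's favor (assumption).
    Returns (leader_action, follower_action, leader_value).
    """
    best = None
    for i, row in enumerate(follower_payoff):
        top = max(row)
        # All follower actions that are optimal against the announced action i.
        responses = [j for j, v in enumerate(row) if v == top]
        j = max(responses, key=lambda k: leader_payoff[i][k])
        candidate = (i, j, leader_payoff[i][j])
        # The leader anticipates the response and keeps the best commitment.
        if best is None or candidate[2] > best[2]:
            best = candidate
    return best


# Hypothetical payoffs. Without commitment, leader action 0 dominates
# (2 > 1 and 4 > 3) and the follower answers with action 0, so the leader gets 2.
leader = [[2, 4], [1, 3]]
follower = [[1, 0], [0, 2]]
print(stackelberg_pure(leader, follower))  # (1, 1, 3)
```

In this toy game, announcing action 1 steers the follower to respond with action 1, so the leader's payoff rises from 2 to 3. This is the sense in which the ability to commit first can be advantageous.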

We say that a noncooperative game is nonzero-sum
if the sum of the players’ objective functions
cannot be made zero after appropriate positive
scaling and/or translation that do not depend on
the players’ decision variables. We say that a
two-player game is zero-sum if the sum of the
objective functions of the two players is zero
or can be made zero by appropriate positive
scaling and/or translation that do not depend on
the decision variables of the players; hence, two-player
zero-sum games can be viewed as a special
subclass of two-player nonzero-sum games,
and in this case the Nash equilibrium becomes
the saddle-point equilibrium. A game is a finite
game if each player has only a finite number of
alternatives, that is, the players pick their actions
out of finite sets (action sets); otherwise, the game
is an infinite game. Finite games are also known
as matrix games. An infinite game is said to be a
continuous-kernel game if the action sets of the
players are continua and the players’ objective
functions are continuous with respect to action
variables of all players. A game is said to be
deterministic if the players’ actions uniquely determine
the outcome, as captured in the objective
functions, whereas if the objective function of at
least one player depends on an additional variable
(state of nature) with a probability distribution
known to all players (or one that can be learned online),
then we have a stochastic game. A game is a complete
information game if the description of the
game (i.e., the players, the objective functions,
and the underlying probability distributions (if
stochastic)) is common information to all players;
otherwise, we have an incomplete information
game. We say that a game is static if players have
access to only the a priori information (shared
by all) and none of the players has access to
information on the actions of any of the other
players; otherwise, what we have is a dynamic
game. A game is a single-act game if every player
acts only once; otherwise, the game is multi-act.
Note that it is possible for a single-act game to be
dynamic and for a multi-act game to be static. A
dynamic game is said to be a differential game if
the evolution of the decision process (controlled
by the players over time) takes place in continuous
time and generally involves a differential
equation; if it takes place over a discrete-time
horizon, the dynamic game is sometimes called
a discrete-time game.
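
For the smallest finite zero-sum games, the saddle-point equilibrium mentioned above can be computed in closed form. The following sketch handles the 2x2 case only; the function name `solve_2x2_zero_sum` and the example matrices are hypothetical, and the row player is assumed to maximize while the column player minimizes.

```python
from fractions import Fraction


def solve_2x2_zero_sum(a):
    """Saddle point of a 2x2 zero-sum matrix game; the row player maximizes a[i][j].

    Returns (p, q, value), where p is the probability of row 0, q is the
    probability of column 0, and value is the resulting expected payoff.
    """
    (a11, a12), (a21, a22) = [[Fraction(v) for v in row] for row in a]
    rows = ((a11, a12), (a21, a22))
    # Pure saddle point: an entry that is the minimum of its row and the
    # maximum of its column.
    for i, row in enumerate(rows):
        for j, v in enumerate(row):
            col = (a11, a21) if j == 0 else (a12, a22)
            if v == min(row) and v == max(col):
                return Fraction(1 - i), Fraction(1 - j), v
    # Otherwise both players randomize so that the opponent is indifferent.
    denom = a11 - a12 - a21 + a22
    p = (a22 - a21) / denom
    q = (a22 - a12) / denom
    value = (a11 * a22 - a12 * a21) / denom
    return p, q, value


print(solve_2x2_zero_sum([[1, -1], [-1, 1]]))  # matching pennies: even mixes, value 0
print(solve_2x2_zero_sum([[3, -1], [-2, 1]]))  # p = 3/7, q = 2/7, value = 1/7
```

Larger matrix games are usually handled by linear programming rather than by a closed-form expression; the 2x2 case is used here only because it can be checked by hand.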

In dynamic games, as the game progresses 
players acquire information (complete or partial) 



Game Theory: Historical Overview 




on past actions of other players and use this
information in selecting their own actions (also
dictated by the equilibrium solution concept at
hand). In finite dynamic games, for example, the
progression of a game involves a tree structure
(also called extensive form) where each node
is identified with a player along with the time
when she acts, and branches emanating from a
node show the possible moves of that particular
player. A player, at any point in time, could
generally be at more than one node, which is a
situation that arises when the player does not have
complete information on the past moves of other
players and hence may not know with certainty
which particular node she occupies at any particular
time. This uncertainty leads to a clustering of
nodes into what are called information sets for
that player. What players decide on within the
framework of the extensive form is not their
actions, but their strategies, that is, what action
they would take at each information set (in other
words, correspondences between their information
sets and their allowable actions). They then
take specific actions (or actions are executed on
their behalf), dictated by the strategies chosen
as well as the progression of the game (decision)
process along the tree. The equilibrium is
then defined in terms of not actions but strategies.
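
The distinction between actions and strategies can be illustrated by enumerating strategies as maps from information sets to allowable actions. The sketch below uses a hypothetical single-act game in which Player 1 moves first (L or R) and Player 2 then chooses u or d; the labels and the helper `enumerate_strategies` are assumptions made for illustration.

```python
from itertools import product


def enumerate_strategies(information_sets):
    """List every pure strategy: one allowable action chosen per information set.

    information_sets maps an information-set label to its allowable actions.
    """
    labels = sorted(information_sets)
    choices = [information_sets[label] for label in labels]
    return [dict(zip(labels, combo)) for combo in product(*choices)]


# Player 2 observes Player 1's move: one information set per observed move.
p2_observes = {"after L": ["u", "d"], "after R": ["u", "d"]}
# Player 2 does not observe the move: both nodes share one information set.
p2_blind = {"after L or R": ["u", "d"]}

print(len(enumerate_strategies(p2_observes)))  # 4 strategies
print(len(enumerate_strategies(p2_blind)))     # 2 strategies
```

Player 2 has the same two actions in both versions, yet four strategies when the first move is observed and only two when it is not, which is why equilibria in dynamic games are stated in terms of strategies.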

The notion of a strategy, as a mapping from
the collection of information sets to action sets,
extends readily to infinite dynamic games, and 
hence, in both differential games and difference 
games, Nash equilibria are defined in terms of 
strategies. Several articles in this Encyclopedia 
discuss such equilibria, for both zero-sum and 
nonzero-sum dynamic games, with and without 
the presence of probabilistic uncertainty. 

In the broad scheme of things, game theory
and particularly noncooperative game theory
can be viewed as an extension of two fields,
both covered in this Encyclopedia: Mathematical
Programming and Optimal Control Theory. Any
problem in game theory collapses to a problem
in one of these disciplines if there is only one
player. One-player static games are essentially
mathematical programming problems (linear
programming or nonlinear programming), and
one-player difference or differential games can
be viewed as optimal control problems.


Highlights on the History and 
Evolution of Game Theory 

Game theory has enjoyed over 70 years of
scientific development, with the publication of
the Theory of Games and Economic Behavior by
von Neumann and Morgenstern (1947) generally
acknowledged to have kick-started the field. It has
experienced incessant growth in both the number 
of theoretical results and the scope and variety 
of applications. As a recognition of the vitality 
of the field, through 2012 a total of 10 Nobel 
Prizes were given in Economic Sciences for 
work primarily in game theory, with the first such 
recognition bestowed in 1994 on John Harsanyi, 
John Nash, and Reinhard Selten “for their 
pioneering analysis of equilibria in the theory 
of noncooperative games.” The second round 
of Nobel Prizes in game theory went to Robert 
Aumann and Thomas Schelling in 2005, “for 
having enhanced our understanding of conflict 
and cooperation through game-theory analysis.” 
The third round recognized Leonid Hurwicz, 
Eric Maskin, and Roger Myerson in 2007, “for 
having laid the foundations of mechanism design 
theory.” And the most recent one was in 2012, 
recognizing Alvin Roth and Lloyd Shapley, 
“for the theory of stable allocations and the
practice of market design.” To this list of highest-level
awards related to contributions to game
theory, one should also add the 1999 Crafoord
Prize (which is the highest prize in Biological 
Sciences), which went to John Maynard Smith 
(along with Ernst Mayr and G. Williams) 
“for developing the concept of evolutionary 
biology,” where Smith’s recognized contributions 
had a strong game-theoretic underpinning, 
through his work on evolutionary games and 
evolutionary stable equilibrium (Smith 1974, 
1982; Smith and Price 1973); this is the topic 
of one of the articles in this Encyclopedia 
(by Altman). Several other “game theory”
articles in the Encyclopedia also relate to the
contributions of the Nobel Laureates mentioned
above.

Even though von Neumann and Morgenstern’s
1944 book is taken as the starting point of
the scientific approach to game theory, game-theoretic
notions and some isolated key results
date back to earlier years and even centuries.
Sixteen years earlier, in 1928, von Neumann
himself had resolved completely an open
fundamental problem in zero-sum games, that
every finite two-player zero-sum game admits a
saddle point in mixed strategies, which is known
as the Minimax Theorem (von Neumann 1928),
a result which Émile Borel had conjectured
to be false eight years before. Some early
traces of game-theoretic thinking can be seen
in the 1802 work (Considérations sur la théorie
mathématique du jeu) of André-Marie Ampère
(1775-1836), who was influenced by the 1777
writings (Essai d’arithmétique morale) of
Georges Louis Buffon (1707-1788).

Which event or writing really started
game-theoretic thinking or the game-theoretic approach
to decision making (in law, politics, economics, operations
research, engineering, etc.) may be a topic of
debate, but what is indisputable is that in (zero-sum)
differential games (which are most relevant
to control theory) the starting point was the work
of Rufus Isaacs at the RAND Corporation in
the early 1950s, which remained classified for
at least a decade, before being made accessible
to a broad readership in 1965 (Isaacs 1965); see
also the review (Ho 1965) which first introduced
the book to the control community. One of the
articles in this Encyclopedia (by Bernhard) talks
about this history and the theory developed by
Isaacs, within the context of pursuit-evasion
games, and another article (again by Bernhard)
discusses the impact the zero-sum differential
game framework has made on robust control
design (Başar and Bernhard 1995). Extension
of the game-theoretic framework to nonzero-sum
differential games with Nash equilibrium as
the solution concept was initiated in Starr and
Ho (1969) and with Stackelberg equilibrium
as the solution concept in Simaan and Cruz
(1973). Systematic study of the role information
structures play in the existence of such equilibria
and their uniqueness or nonuniqueness (termed
informational nonuniqueness) was carried out in
Başar (1974, 1976, 1977).

Related Articles on Game Theory in 
the Encyclopedia 

Several articles in the Encyclopedia introduce
various subareas of game theory and discuss important
developments (past and present) in each
corresponding area.

The article ► Strategic Form Games and
Nash Equilibrium introduces the static game
framework along with the Nash equilibrium
concept, for both finite and infinite games,
and discusses the issues of existence and
uniqueness as well as efficiency. The article
► Dynamic Noncooperative Games focuses on
dynamic games, again for both finite and infinite
games, and discusses extensive form descriptions
of the underlying dynamic decision process,
either as trees (in finite games) or difference
equations (in discrete-time infinite games).
Bernhard, in two articles, discusses continuous-time
dynamic games, described by differential
equations (so-called differential games), but
in the two-person zero-sum case. One of these
articles, ► Pursuit-Evasion Games and Zero-Sum
Two-Person Differential Games, describes the
framework initiated by Isaacs, and several of
its extensions for pursuit-evasion games, and
the other one, ► Linear Quadratic Zero-Sum
Two-Person Differential Games, presents results
on the special case of linear quadratic differential
games, with an important application of that
framework to robust control and more precisely
$H^\infty$-optimal control.

When the number of players in a nonzero-sum
game is countably infinite, or even just
sufficiently large, some simplifications arise in
the computation and characterization of Nash
equilibria. The mathematical framework applicable
to this context is provided by mean field
theory, which is the topic of the article ► Mean
Field Games, which discusses this relatively new
theory within the context of stochastic differential
games.
Cooperative solution concepts for dynamic
games are discussed in the article ► Cooperative
Solutions to Dynamic Games, which introduces
Pareto optimality, the bargaining solution concept
by Nash, characteristic functions, core, and C-optimality,
and presents some selected results using
these concepts. In the article ► Evolutionary
Games, the foundations of, as well as the recent
advances in, evolutionary games are presented,
along with examples showing their potential as
a tool for capturing and modeling interactions in
complex systems.

The article ► Learning in Games addresses the
online computation of Nash equilibrium through
an iterative process which takes into account each
player’s response to choices made by the remaining
players, with built-in learning and adaptation
rules; one such scheme that is discussed in the
article is the well-known fictitious play. Learning
is also the topic of the article ► Stochastic Games
and Learning, which presents a framework and a
set of results using the stochastic games formulation
introduced by Shapley in the early 1950s.

The article ► Network Games shows how
game theory plays an important role in modeling
interactions between entities on a network, particularly
communication networks, and presents a
simple mathematical model to study one such
instance, namely, resource allocation in the
Internet. How to design a game so as to obtain a
desired outcome (as captured by, say, a Nash equilibrium)
is a question central to mechanism design,
which is covered in the article ► Mechanism
Design, which discusses as a specific example the
Vickrey-Clarke-Groves (VCG) mechanism.

Two other applications of game theory are
to the design of auctions and to security. The article
► Auctions addresses the former, discussing
general auction theory along with equilibrium
strategies and more specifically combinatorial
auctions. The latter is addressed in the article
► Game Theory for Security, which discusses
how the game-theoretic approach leads to more
effective responses to security in complex
systems and organizations, with applications
to a wide variety of systems and critical
infrastructures such as electricity, water, financial
services, and communication networks.


Future of Game Theory 

The second half of the twentieth century was a 
golden era for game theory, and all evidence so 
far in the twenty-first century indicates that the 
next half century is destined to be a platinum era. 
In all respects game theory is on an upward slope 
in terms of its vitality, the wealth of topics that fall 
within its scope, the richness of the conceptual 
framework it offers, the range of applications, and 
the challenges it presents to an inquisitive mind. 
Abstract

The linear-quadratic (LQ) problem is the
prototype of a large number of optimal control
problems, including the fixed endpoint, the point-to-point,
and several $H_2$/$H_\infty$ control problems,
as well as the dual counterparts. In the past
50 years, these problems have been addressed
using different techniques, each tailored to
their specific structure. It is only in the last
10 years that it was recognized that a unifying
framework is available. This framework hinges
on formulae that parameterize the solutions of
the Hamiltonian differential equation in the
continuous-time case and the solutions of the
extended symplectic system in the discrete-time
case. Whereas traditional techniques involve the
solutions of Riccati differential or difference
equations, the formulae used here to solve the
finite-horizon LQ control problem only rely on
solutions of the algebraic Riccati equations.
In this article, aspects of the framework are
described within a discrete-time context.

Keywords 

Cyclic boundary conditions; Discrete-time linear
systems; Fixed end-point; Initial value; Point-to-point
boundary conditions; Quadratic cost;
Riccati equations

Introduction 

Ever since the linear-quadratic (LQ) optimal 
control problem was introduced in the 1960s by 
Kalman in his pioneering paper (1960), it has 
found countless applications in areas such as 
chemical process control, aeronautics, robotics, 
servomechanisms, and motor control, to name but 
a few. 

For details on the raisons d'être of LQ problems, 
readers are referred to the classical textbooks 
on this topic (Anderson and Moore 1971; 
Kwakernaak and Sivan 1972) and to the Special 
Issue on LQ optimal control problems in 
IEEE Trans. Aut. Contr., vol. AC-16, no. 6, 1971. 
The LQ regulator is not only important per se. It 
is also the prototype of a variety of fundamental 
optimization problems. Indeed, several optimal 
control problems that are extremely relevant in 
practice can be recast into composite LQ, dual 
LQ, or generalized LQ problems. Examples include 
LQG, $H_2$ and $H_\infty$ problems, and Kalman 
filtering problems. Moreover, LQ optimal control 
is intimately related, via matrix Riccati equations, 
to absolute stability, dissipative networks, and 
optimal filtering. The importance of LQ problems 
is not restricted to linear systems. For example, 
LQ control techniques can be used to modify an 
optimal control law in response to perturbations 
in the dynamics of a nonlinear plant. For these 
reasons, the LQ problem is universally regarded 
as a cornerstone of modern control theory. 

In its simplest and most classical version, the 
finite-horizon discrete LQ optimal control can be 
stated as follows: 

Problem 1 Let $A \in \mathbb{R}^{n \times n}$ and $B \in \mathbb{R}^{n \times m}$, and 
consider the linear system 

$$x_{t+1} = A x_t + B u_t, \qquad y_t = C x_t + D u_t, \tag{1}$$

where the initial state $x_0 \in \mathbb{R}^n$ is given. Let 
$W = W^T \in \mathbb{R}^{n \times n}$ be positive semidefinite. Find 
a sequence of inputs $u_t$, with $t = 0, 1, \ldots, N-1$, 
minimizing the cost function 

$$J_{N,x_0}(u) = \sum_{t=0}^{N-1} \|y_t\|^2 + x_N^T W x_N. \tag{2}$$

Historically, LQ problems were first introduced 
and solved by Kalman (1960). In that 
paper, Kalman showed that the LQ problem can 
be solved for any initial state $x_0$, and the optimal 
control can be written as a state feedback $u(t) = 
K(t)x(t)$, where $K(t)$ can be found by solving 
a famous quadratic matrix difference equation 
known as the Riccati equation. When $W$ is no 
longer assumed to be positive semidefinite, the 
optimal solution may or may not exist. A complete 
analysis of this case has been worked out 
in Bilardi and Ferrante (2007). In the infinite-horizon 
case (i.e., when $N$ is infinite), the optimal 
control (when it exists) is stationary and may be 
computed by solving an algebraic Riccati equation 
(Anderson and Moore 1971; Kwakernaak 
and Sivan 1972). 


Since its introduction, the original formulation 
of the classic LQ optimal control problem has 
been generalized in several different directions, 
to accommodate more general scenarios 
than the one represented 
by Problem 1. Examples include the so-called 
fixed endpoint LQ, in which the extreme states 
are sharply assigned, and the point-to-point 
case, in which the initial and terminal values 
of an output of the system are constrained to be 
equal to specified values. This led to a number 
of contributions in the area where different 
adaptations of the Riccati theory were tailored to 
these diversified contexts of LQ optimal control. 
These variations of the classic LQ problem are 
becoming increasingly important due to their 
use in several applications of interest. Indeed, 
many applications including spacecraft, aircraft, 
and chemical processes involve maneuvering 
between two states during some phases of a 
typical mission. Another interesting example is 
the $H_2$ optimization of transients in switching 
plants, where the problem can be divided 
into a set of finite-horizon LQ problems with 
welding conditions on the optimal arcs for 
each switch instant. This problem has been 
the object of a large number of contributions 
in the recent literature, under different names: 
parameter-varying systems, jump linear systems, 
switching systems, and bumpless systems are 
terms extensively used to denote different 
classes of systems affected by significant changes 
in their parameters or structures (Balas and 
Bokor 2004). In recent years, a new unified 
approach emerged in Ferrante et al. (2005), 
Ferrante and Ntogramatzidis (2005), Ferrante 
and Ntogramatzidis (2007a), and Ferrante 
and Ntogramatzidis (2007b) that solves the 
finite-horizon LQ optimal control problem 
via a formula which parameterizes the set of 
trajectories generated by the corresponding 
Hamiltonian differential equation in the 
continuous time and the extended symplectic 
difference equation in the discrete case. Loosely, 
we can say that the expressions parameterizing 
the trajectories of the Hamiltonian differential 
equation and the extended symplectic difference 
equation using this approach hinge on a pair of 
“opposite” solutions of the associated algebraic 
Riccati equations. This active stream of research 
considerably enlarged the range of optimal 
control problems that can be successfully 
addressed. This point of view always requires 
some controllability-type assumption and the 
extended symplectic pencil (Ferrante and 
Ntogramatzidis 2005, 2007b) to be regular 
and devoid of generalized eigenvalues on the 
unit circle. More recently, a new point of 
view has emerged which yields a more direct 
solution to this problem, without requiring 
system-theoretic assumptions (Ferrante and 
Ntogramatzidis 2013a,b; Ntogramatzidis and 
Ferrante 2013). 

The discussion here is restricted to the 
discrete-time case; for the corresponding 
continuous-time counterpart, we will only make 
some comments and refer to the literature. 

Notation. For the reader's convenience, we 
briefly review some, mostly standard, matrix 
notation used throughout the paper. Given a 
matrix $B \in \mathbb{R}^{n \times m}$, we denote by $B^T$ its 
transpose and by $B^\dagger$ its Moore-Penrose pseudoinverse, 
the unique matrix $B^\dagger$ that satisfies 
$B B^\dagger B = B$, $B^\dagger B B^\dagger = B^\dagger$, $(B B^\dagger)^T = B B^\dagger$, 
and $(B^\dagger B)^T = B^\dagger B$. The kernel of $B$ is 
the subspace $\{x \in \mathbb{R}^m \mid Bx = 0\}$ and is 
denoted by $\ker B$. The image of $B$ is the subspace 
$\{y \in \mathbb{R}^n \mid \exists\, x \in \mathbb{R}^m : y = Bx\}$ and is denoted 
by $\operatorname{im} B$. Given a square matrix $A$, we denote by 
$\sigma(A)$ its spectrum, i.e., the set of its eigenvalues. 
We write $A_1 > A_2$ (resp. $A_1 \ge A_2$) when $A_1 - A_2$ 
is positive definite (resp. positive semidefinite). 


Classical Finite-Horizon 
Linear-Quadratic Optimal Control 

The simplest classical version of the finite-horizon 
LQ optimal control is Problem 1. By 
employing some standard linear algebra, this 
problem may be solved by the classical technique 
known as “completion of squares”. First of all, 
the cost can be rewritten as 

$$J_{N,x_0}(u) = \sum_{t=0}^{N-1} \begin{bmatrix} x_t^T & u_t^T \end{bmatrix} \Pi \begin{bmatrix} x_t \\ u_t \end{bmatrix} + x_N^T W x_N, \qquad \Pi = \begin{bmatrix} Q & S \\ S^T & R \end{bmatrix} \overset{\text{def}}{=} \begin{bmatrix} C^T \\ D^T \end{bmatrix} \begin{bmatrix} C & D \end{bmatrix} = \Pi^T \ge 0. \tag{3}$$

Now, let $X_0, X_1, \ldots, X_N$ be an arbitrary 
sequence of $n \times n$ symmetric matrices. We have 
the identity 

$$\sum_{t=0}^{N-1} \left[ x_{t+1}^T X_{t+1} x_{t+1} - x_t^T X_t x_t \right] + x_0^T X_0 x_0 - x_N^T X_N x_N = 0. \tag{4}$$

Adding (4) to (3) and using the expression (1) for 
$x_{t+1}$, we get 

$$J_{N,x_0}(u) = \sum_{t=0}^{N-1} \begin{bmatrix} x_t^T & u_t^T \end{bmatrix} \begin{bmatrix} Q + A^T X_{t+1} A - X_t & S + A^T X_{t+1} B \\ S^T + B^T X_{t+1} A & R + B^T X_{t+1} B \end{bmatrix} \begin{bmatrix} x_t \\ u_t \end{bmatrix} + x_N^T (W - X_N) x_N + x_0^T X_0 x_0, \tag{5}$$

which holds for any sequence of matrices $X_t$. 
With $X_N = W$ fixed, for $t = N-1, N-2, \ldots, 0$, 
let 

$$X_t = Q + A^T X_{t+1} A - (S + A^T X_{t+1} B)(R + B^T X_{t+1} B)^\dagger (S^T + B^T X_{t+1} A). \tag{6}$$

It is now easy to see that all the matrices of the 
sequence $X_t$ defined above are positive semidefinite. 
Indeed, $X_N = W \ge 0$. Assume by induction 
that $X_{t+1} \ge 0$. Then, 

$$M_{t+1} \overset{\text{def}}{=} \begin{bmatrix} Q + A^T X_{t+1} A & S + A^T X_{t+1} B \\ S^T + B^T X_{t+1} A & R + B^T X_{t+1} B \end{bmatrix} = \Pi + \begin{bmatrix} A^T \\ B^T \end{bmatrix} X_{t+1} \begin{bmatrix} A & B \end{bmatrix} \ge 0.$$

Since $X_t$ is the generalized Schur complement of 
the lower right block of $M_{t+1}$ in $M_{t+1}$, it follows 
that $X_t \ge 0$, which in turn implies 
$R + B^T X_t B \ge 0$. 

Moreover, by employing Eq. (6) and recalling 
that, given a positive semidefinite matrix 
$\Pi_0 = \begin{bmatrix} Q_0 & S_0 \\ S_0^T & R_0 \end{bmatrix} = \Pi_0^T \ge 0$, we have 

$$\begin{bmatrix} S_0 R_0^\dagger S_0^T & S_0 \\ S_0^T & R_0 \end{bmatrix} = \begin{bmatrix} S_0 \\ R_0 \end{bmatrix} R_0^\dagger \begin{bmatrix} S_0^T & R_0 \end{bmatrix} \ge 0,$$

we easily see that 

$$M_{t+1} - \begin{bmatrix} X_t & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} S + A^T X_{t+1} B \\ R + B^T X_{t+1} B \end{bmatrix} (R + B^T X_{t+1} B)^\dagger \begin{bmatrix} S^T + B^T X_{t+1} A & R + B^T X_{t+1} B \end{bmatrix} \ge 0.$$

Hence, (5) takes the form 

$$J_{N,x_0}(u) = \sum_{t=0}^{N-1} \left\| \left[ (R + B^T X_{t+1} B)^{1/2} \right]^\dagger (S^T + B^T X_{t+1} A) x_t + (R + B^T X_{t+1} B)^{1/2} u_t \right\|_2^2 + x_0^T X_0 x_0. \tag{7}$$

Now it is clear that $u_t$ is optimal if and only if 

$$\left[ (R + B^T X_{t+1} B)^{1/2} \right]^\dagger (S^T + B^T X_{t+1} A) x_t + (R + B^T X_{t+1} B)^{1/2} u_t = 0,$$

whose solutions are parameterized by the feedback 
control 

$$u_t = -K_t x_t + G_t v_t, \tag{8}$$

where $K_t = (R + B^T X_{t+1} B)^\dagger (S^T + B^T X_{t+1} A)$ 
and $G_t = [I - (R + B^T X_{t+1} B)^\dagger (R + B^T X_{t+1} B)]$ 
is the orthogonal projector onto 
the linear space of vectors that can be added 
to the optimal control $u_t$ without affecting 
optimality and $v_t$ is a free parameter. The optimal 
state trajectory is now given by the closed-loop 
dynamics 

$$x_{t+1} = (A - B K_t) x_t + B G_t v_t. \tag{9}$$

The optimal cost is clearly 

$$J^* = x_0^T X_0 x_0. \tag{10}$$

The corresponding results in continuous 
time can be obtained along the same lines 
as the discrete-time case; see Ferrante and 
Ntogramatzidis (2013b) and references therein. 


More General Linear-Quadratic Problems 

The problem discussed in the previous section 
presents some limitations that prevent its applicability 
in several important situations. In particular, 
three relevant generalizations of the classical 
problem are: 

1. The fixed endpoint case, where the states at the 
endpoints $x_0$ and $x_N$ are both assigned. 

2. The point-to-point case, where the initial and 
terminal values $z_0$ and $z_N$ of a linear combination 
$z_t = V x_t$ of the state of the dynamical 
system described by (1) are constrained to be 
equal to two assigned vectors. 

3. The cyclic case, where the states at the endpoints 
$x_0$ and $x_N$ are not sharply assigned, but 
they are constrained to be equal (clearly, we 
can have combinations of (2) and (3)). 

All these problems are special cases of a 
general LQ problem that can be stated as follows: 

Problem 2 Consider the dynamical setting (1) of 
Problem 1. Find a sequence of inputs $u_t$, with $t = 
0, 1, \ldots, N-1$, and an initial state $x_0$ minimizing 
the cost function 

$$J_{N,x_0}(u) = \sum_{t=0}^{N-1} \|y_t\|^2 + \begin{bmatrix} x_0^T - \bar{x}_0^T & x_N^T - \bar{x}_N^T \end{bmatrix} \begin{bmatrix} W_{11} & W_{12} \\ W_{12}^T & W_{22} \end{bmatrix} \begin{bmatrix} x_0 - \bar{x}_0 \\ x_N - \bar{x}_N \end{bmatrix} \tag{11}$$

under the dynamic constraints (1) and the endpoint 
constraints 

$$V \begin{bmatrix} x_0 \\ x_N \end{bmatrix} = v. \tag{12}$$


Generalized Finite-Horizon Linear-Quadratic Optimal Control 


Here $W \overset{\text{def}}{=} \begin{bmatrix} W_{11} & W_{12} \\ W_{12}^T & W_{22} \end{bmatrix}$ is a positive semidefinite 
matrix (partitioned in four blocks) that quadratically 
penalizes the differences between the initial 
state $x_0$ and a desired initial state $\bar{x}_0$ and between 
the final state $x_N$ and a desired final state $\bar{x}_N$. 
(This general problem formulation can also encompass 
problems where the difference $x_0 - x_N$ 
is not fixed but has to be quadratically penalized 
with a matrix $\Delta = \Delta^T \ge 0$ in the performance 
index $J'_{N,x_0}(u) = \sum_{t=0}^{N-1} \begin{bmatrix} x_t^T & u_t^T \end{bmatrix} \Pi \begin{bmatrix} x_t \\ u_t \end{bmatrix} + (x_0 - x_N)^T \Delta (x_0 - x_N)$. 
It is simple to see that this 
performance index can be brought back to (11) 
by setting $W = \begin{bmatrix} \Delta & -\Delta \\ -\Delta & \Delta \end{bmatrix}$ and $\bar{x}_0 = \bar{x}_N = 0$.) 
Equation (12) also makes it possible to impose a hard 
constraint on an arbitrary linear combination of 
initial and final states. 

The solution of Problem 2 can be obtained 
by parameterizing the solutions of the so-called 
extended symplectic system (see Ferrante and 
Levy (1998) and references therein for a discussion 
on symplectic matrices and pencils). This 
solution can be convenient also for the classical 
case of Problem 1. In fact, it does not require 
iterating the difference Riccati equation 
(which can be undesirable if the time horizon 
is large) but only solving an algebraic Riccati 
equation and a related discrete Lyapunov 
equation. 
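
For comparison, the difference Riccati iteration for Problem 1 can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: it assumes NumPy, and the function names `riccati_recursion` and `closed_loop_cost`, as well as the numerical data (a discrete double integrator), are hypothetical choices made here. It iterates (6) backward from X_N = W, forms the gain K_t of (8) with the free parameter v_t set to zero, and compares the simulated cost with the value x_0^T X_0 x_0 predicted by (10).

```python
import numpy as np

def riccati_recursion(A, B, C, D, W, N):
    """Iterate the difference Riccati equation (6) backward from X_N = W.

    Returns the lists [X_0, ..., X_N] and [K_0, ..., K_{N-1}].
    """
    Q, S, R = C.T @ C, C.T @ D, D.T @ D           # blocks of Pi in (3)
    X = [None] * (N + 1)
    K = [None] * N
    X[N] = W
    for t in range(N - 1, -1, -1):
        R_t = R + B.T @ X[t + 1] @ B
        S_t = S + A.T @ X[t + 1] @ B
        K[t] = np.linalg.pinv(R_t) @ S_t.T        # gain K_t of (8), via the pseudoinverse
        X_t = Q + A.T @ X[t + 1] @ A - S_t @ K[t]
        X[t] = (X_t + X_t.T) / 2                  # remove rounding asymmetry
    return X, K

def closed_loop_cost(A, B, C, D, W, K, x0, offset=0.0):
    """Evaluate the cost (2) along the trajectory generated by u_t = -K_t x_t + offset."""
    x, J = x0.copy(), 0.0
    for K_t in K:
        u = -K_t @ x + offset
        y = C @ x + D @ u
        J += float(y @ y)
        x = A @ x + B @ u
    return J + float(x @ W @ x)

# Hypothetical example data (not taken from the article).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0], [0.0, 0.0]])
D = np.array([[0.0], [1.0]])
W = np.eye(2)
N = 20
x0 = np.array([1.0, 0.0])

X, K = riccati_recursion(A, B, C, D, W, N)
J_opt = closed_loop_cost(A, B, C, D, W, K, x0)                # optimal feedback
J_pert = closed_loop_cost(A, B, C, D, W, K, x0, offset=0.1)   # perturbed input
J_pred = float(x0 @ X[0] @ x0)                                # value predicted by (10)
print(J_opt, J_pred, J_pert)
```

Note that the loop runs over all N steps, which is exactly the cost that the parameterization discussed next avoids when the horizon is long.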

This solution requires some definitions, preliminary 
results, and standing assumptions (see 
Ntogramatzidis and Ferrante (2013) and Ferrante 
and Ntogramatzidis (2013a) for a more general 
approach which does not require such assumptions). 
A detailed proof of the main result can 
be found in Ferrante and Ntogramatzidis (2007b); 
see also Zattoni (2008) and Ferrante and Ntogramatzidis 
(2012). The extended symplectic pencil 
is defined by 

$$zF - G, \qquad F = \begin{bmatrix} I_n & 0 & 0 \\ 0 & -A^T & 0 \\ 0 & -B^T & 0 \end{bmatrix}, \qquad G = \begin{bmatrix} A & 0 & B \\ Q & -I_n & S \\ S^T & 0 & R \end{bmatrix}, \tag{13}$$

where $Q$, $S$, $R$ are defined as in (3). We make the 
following assumptions: 

(A1) The pair $(A, B)$ is modulus controllable, 
i.e., $\forall \lambda \in \mathbb{C} \setminus \{0\}$ at least one of the two 
matrices $[\lambda I - A \mid B]$ and $[\lambda^{-1} I - A \mid B]$ 
has full row rank. 

(A2) The pencil $zF - G$ is regular (i.e., 
$\det(zF - G)$ is not the zero polynomial) 
and has no generalized eigenvalues on the 
unit circle. 

Consider the discrete algebraic Riccati 
equation 

$$X = A^T X A - (A^T X B + S)(R + B^T X B)^{-1}(S^T + B^T X A) + Q. \tag{14}$$

Under assumptions (A1)-(A2), (14) admits 
a strongly unmixed solution $X = X^T$, i.e., a 
solution $X$ for which the corresponding closed-loop 
matrix 

$$A_X = A - B K_X, \qquad K_X = (R + B^T X B)^{-1}(S^T + B^T X A) \tag{15}$$

has spectrum that does not contain reciprocal 
values, i.e., $\lambda \in \sigma(A_X)$ implies $\lambda^{-1} \notin \sigma(A_X)$. It 
can now be proven that the following closed-loop 
Lyapunov equation admits a unique solution 
$Y = Y^T \in \mathbb{R}^{n \times n}$: 

$$A_X Y A_X^T - Y + B (R + B^T X B)^{-1} B^T = 0. \tag{16}$$

The following theorem provides an explicit formula 
parameterizing all the optimal state and 
control trajectories for Problem 2. Notice that this 
formula can be readily implemented starting from 
the problem data. 

Theorem 1 With reference to Problem 2, assume 
that (A1) and (A2) are satisfied. Let $X = X^T$ 
be any strongly unmixed solution of (14) and 
$Y = Y^T$ be the corresponding solution of (16). 
Let $N_V$ be a basis matrix of the 
null space of $V$ (in the case when 
$\ker V = \{0\}$, we consider $N_V$ to be void). Moreover, let 

$$F = A_X^N, \qquad K_\star = K_X Y A_X^T - (R + B^T X B)^{-1} B^T,$$

$$\hat{X} = \operatorname{diag}(-X, X), \qquad \bar{x} = \begin{bmatrix} \bar{x}_0 \\ \bar{x}_N \end{bmatrix},$$

$$w \overset{\text{def}}{=} \begin{bmatrix} v \\ -N_V^T W \bar{x} \end{bmatrix}, \qquad L \overset{\text{def}}{=} \begin{bmatrix} I_n & Y F^T \\ F & Y \end{bmatrix}, \qquad U \overset{\text{def}}{=} \begin{bmatrix} 0 & -F^T \\ 0 & I_n \end{bmatrix},$$

$$M \overset{\text{def}}{=} \begin{bmatrix} V L \\ N_V^T [(\hat{X} - W) L - U] \end{bmatrix}.$$

Problem 2 admits solutions if and only if 
$w \in \operatorname{im} M$. In this case, let $N_M$ be a basis matrix 
of the null space of $M$, and define 

$$\mathcal{P} = \{ \pi = M^\dagger w + N_M \xi \mid \xi \text{ arbitrary} \}. \tag{17}$$

Then, the set of optimal state and control trajectories 
of Problem 2 is parameterized in terms of 
$\pi \in \mathcal{P}$, by 

$$\begin{bmatrix} x(t) \\ u(t) \end{bmatrix} = \begin{bmatrix} A_X^t & Y (A_X^T)^{N-t} \\ -K_X A_X^t & -K_\star (A_X^T)^{N-t-1} \end{bmatrix} \pi, \qquad 0 \le t \le N-1, \tag{18}$$

together with the terminal state $x(N) = \begin{bmatrix} A_X^N & Y \end{bmatrix} \pi$. 


The interpretation of the above result is the 
following. As $\pi$ varies, (18) describes the trajectories 
of the extended symplectic system. The set 
$\mathcal{P}$ defined in (17) is the set of $\pi$ for which these 
trajectories satisfy the boundary conditions. All 
the details of this construction can be found in 
Ferrante and Ntogramatzidis (2007b). If the pair 
$(A, B)$ is stabilizable, we can choose $X = X^T$ 
to be the stabilizing solution of (14). In such 
case, the matrices $A_X^t$, $(A_X^T)^{N-t}$, and $(A_X^T)^{N-t-1}$ 
appearing in (18) are asymptotically stable for all 
$t = 0, \ldots, N$. Thus, in this case, the optimal state 
trajectory and control are expressed in terms of 
powers of strictly stable matrices in the overall 
time interval, thus ensuring the robustness of 
the obtained solution even for very large time 
horizons. Indeed, the stabilizing solution of an 
algebraic Riccati equation and the solution of a 
Lyapunov equation may be computed by standard 
and robust algorithms available in any control 
package (see the MATLAB® routines dare.m 
and dlyap.m). We refer to Ferrante et al. (2005) 
and Ferrante and Ntogramatzidis (2013b) for the 
continuous-time counterpart of the above results. 
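
To make the construction concrete, the following sketch specializes it to the fixed endpoint case, where V is the identity in (12), N_V is void, and therefore M = L and w = v. It is a minimal illustration under stated assumptions, not the authors' code: the names `dare_by_iteration` and `two_point_trajectory` and the numerical data are hypothetical, and the stabilizing solution of (14) is obtained here by simply iterating the Riccati map to convergence so that only NumPy is needed. In practice one would call a library routine such as `dare`/`dlyap` in MATLAB or `solve_discrete_are`/`solve_discrete_lyapunov` in SciPy.

```python
import numpy as np

def dare_by_iteration(A, B, Q, S, R, tol=1e-12, max_iter=100000):
    """Stabilizing solution of (14), found by iterating the Riccati map from X = 0."""
    X = np.zeros_like(Q)
    for _ in range(max_iter):
        R_t = R + B.T @ X @ B
        S_t = S + A.T @ X @ B
        X_new = Q + A.T @ X @ A - S_t @ np.linalg.solve(R_t, S_t.T)
        X_new = (X_new + X_new.T) / 2
        if np.max(np.abs(X_new - X)) < tol:
            return X_new
        X = X_new
    raise RuntimeError("Riccati iteration did not converge")

def two_point_trajectory(A, B, C, D, N, a, b):
    """Trajectory (18) for the fixed endpoint problem x_0 = a, x_N = b."""
    n = A.shape[0]
    Q, S, R = C.T @ C, C.T @ D, D.T @ D
    X = dare_by_iteration(A, B, Q, S, R)
    R_X = R + B.T @ X @ B
    K_X = np.linalg.solve(R_X, S.T + B.T @ X @ A)            # gain in (15)
    A_X = A - B @ K_X                                        # closed-loop matrix in (15)
    G = B @ np.linalg.solve(R_X, B.T)
    # Lyapunov equation (16), Y = A_X Y A_X^T + G, solved through its vectorized form.
    Y = np.linalg.solve(np.eye(n * n) - np.kron(A_X, A_X), G.reshape(-1)).reshape(n, n)
    K_star = K_X @ Y @ A_X.T - np.linalg.solve(R_X, B.T)
    F = np.linalg.matrix_power(A_X, N)
    L = np.block([[np.eye(n), Y @ F.T], [F, Y]])
    v = np.concatenate([a, b])
    pi = np.linalg.pinv(L) @ v                               # element of the set in (17)
    if not np.allclose(L @ pi, v):
        raise ValueError("boundary conditions cannot be met")
    p, q = pi[:n], pi[n:]
    xs, us = [], []
    for t in range(N):
        fwd = np.linalg.matrix_power(A_X, t)
        xs.append(fwd @ p + Y @ np.linalg.matrix_power(A_X.T, N - t) @ q)
        us.append(-K_X @ fwd @ p - K_star @ np.linalg.matrix_power(A_X.T, N - t - 1) @ q)
    xs.append(F @ p + Y @ q)                                 # terminal state x_N
    return np.array(xs), np.array(us)

# Hypothetical example data (not taken from the article).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0], [0.0, 0.0]])
D = np.array([[0.0], [1.0]])
N = 15
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])

xs, us = two_point_trajectory(A, B, C, D, N, a, b)
print(xs[0], xs[-1])
```

The horizon N enters only through matrix powers of A_X, so a longer horizon does not require any additional Riccati iterations.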

Summary 

With the technique discussed in this paper, a 
large number of LQ problems can be tackled 
in a unified framework. Moreover, several finite- 
horizon LQ problems that can be interesting and 
useful in practice can be recovered as particular 
cases of the control problem considered here. The 
generality of the optimal control problem herein 
considered is crucial in the solution of several 
$H_2$-$H_\infty$ optimization problems whose optimal 
trajectory is composed of a set of arcs, each one 
solving a parametric LQ subproblem in a specific 
time horizon and all joined together at the endpoints 
of each subinterval. In these cases, in fact, 
a very general form of constraint on the extreme 
states is essential in order to express the condition 
of conjunction of each pair of subsequent arcs 
at the endpoints. 

Abstract 

Graphs constitute natural models for networks of 
interacting agents. This chapter introduces graph 
theoretic formalisms that facilitate analysis and 
synthesis of coordinated control algorithms over 
networks. 

Keywords 

Distributed control; Graph theory; Multi-agent 
networks 

Introduction 

Distributed and networked systems are characterized 
by a set of dynamical units (agents, actors, 
nodes) that share information with each other 
in order to achieve a global performance objective 
using locally available information. Information 
can typically be shared if agents are within 
communication or sensing range of each other. 
It is useful to abstract away the particulars of 
the underlying information-exchange mechanism 
and simply say that an information link exists 
between two nodes if they can share information. 
Such an abstraction is naturally represented in 
terms of a graph. 

A graph is a combinatorial object defined by 
two constructs: vertices (or nodes) and edges (or 
links) connecting pairs of distinct vertices. The 
set of $N$ vertices specified by $V = \{v_1, \ldots, v_N\}$ 
corresponds to the agents, and an edge between 
vertices $v_i$ and $v_j$ is represented by $(v_i, v_j)$; 
the set of all edges constitutes the edge set $E$. 
The graph $G$ is thus the pair $G = (V, E)$ and 
the interpretation is that an edge $(v_i, v_j) \in E$ 
if information can flow from vertex $i$ to vertex 
$j$. If the information exchange is sensor based, 
and if there is a state $x_i$ associated with vertex $i$, 
e.g., its position, then this information is typically 
relative, i.e., the states are measured relative to 
each other and the information obtained along 
the edge is $x_j - x_i$. If on the other hand the 
information is communicated, then the full state 
information $x_i$ can be transmitted along the edge 
$(v_i, v_j)$. The graph abstraction for a network of 
agents with sensor-based information exchange is 
illustrated in Fig. 1. 

Graphs for Modeling Networked Interactions, Fig. 1 A network of 
agents equipped with omnidirectional range sensors can be viewed as a 
graph (undirected in this case), with nodes corresponding to the agents 
and edges to their pairwise interactions, which are enabled whenever the 
agents are within a certain distance from each other 

It is often useful to differentiate between scenarios 
where the information exchange is bidirectional 
(if agent $i$ can get information from 
agent $j$, then agent $j$ can get information from 
agent $i$) and when it is not. In the language of 
graph theory, an undirected graph is one where 
$(v_i, v_j) \in E$ implies that $(v_j, v_i) \in E$, while a 
directed graph is one where such an implication 
may not hold. 

Graph-Based Coordination Models 

Graphs provide structural insights into how different 
coordination algorithms behave over a network. 
A coordination algorithm, or protocol, is 
an update rule that describes how the node states 
should evolve over time. To understand such 
protocols, one needs to connect the interaction 
dynamics to the underlying graph structure. This 
connection is facilitated through the common 
intersection of linear system theory and graph 
theory, namely, the broad discipline of algebraic 
graph theory, by first associating matrices with 
graphs. For undirected graphs, the following matrices 
play a key role: 

Degree matrix: $\Delta = \mathrm{Diag}(\deg(v_1), \ldots, \deg(v_N))$, 
Adjacency matrix: $A = [a_{ij}]$, 

where Diag denotes a diagonal matrix whose 
diagonal consists of its argument and $\deg(v_i)$ 
is the degree of vertex $i$ in the graph, i.e., the 
cardinality of the set of edges incident on vertex 
$i$. Moreover, 

$$a_{ij} = \begin{cases} 1 & \text{if } (v_j, v_i) \in E \\ 0 & \text{otherwise.} \end{cases}$$

As an example of how these matrices come into 
play, the so-called consensus protocol over scalar 
states can be compactly written in ensemble form 
as 

$$\dot{x} = -Lx,$$

where $x = [x_1, \ldots, x_N]^T$ and $L$ is the graph 
Laplacian: 

$$L = \Delta - A.$$
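
As an illustration, the consensus protocol can be simulated with a few lines of plain Python. The sketch below is only an example under assumptions made here: the helper names `laplacian` and `consensus_step`, the four-node path graph, the initial states, and the forward-Euler discretization with step 0.05 are all hypothetical choices, not part of the original text. It builds L = Δ − A from an undirected edge list and integrates ẋ = −Lx; on a connected undirected graph the states approach the average of the initial values.

```python
def laplacian(n, edges):
    """Graph Laplacian L = Delta - A of an undirected graph on vertices 0..n-1.

    Each undirected edge is listed once as a pair (i, j).
    """
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1.0      # degree of vertex i
        L[j][j] += 1.0      # degree of vertex j
        L[i][j] -= 1.0      # adjacency entries a_ij = a_ji = 1
        L[j][i] -= 1.0
    return L

def consensus_step(L, x, dt):
    """One forward-Euler step of the consensus protocol x' = -L x."""
    n = len(x)
    return [x[i] - dt * sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]

# Hypothetical example: a path graph 0 - 1 - 2 - 3 with scalar states.
edges = [(0, 1), (1, 2), (2, 3)]
L = laplacian(4, edges)
x = [1.0, 3.0, 5.0, 11.0]          # average of the initial states is 5.0
for _ in range(2000):
    x = consensus_step(L, x, 0.05)
print(x)
```

The step size must be small relative to the largest eigenvalue of L for the Euler iteration to remain stable; the value used here is comfortably within that range for this graph.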

A useful matrix for directed networks is the 
incidence matrix, obtained by associating an index 
to each edge in $E$. We say that $v_i = \mathrm{tail}(e_j)$ if 
edge $e_j$ starts at node $v_i$ and $v_i = \mathrm{head}(e_j)$ if $e_j$ 
ends at $v_i$, leading to the 

Incidence matrix: $D = [\iota_{ij}]$, 

where 

$$\iota_{ij} = \begin{cases} 1 & \text{if } v_i = \mathrm{head}(e_j) \\ -1 & \text{if } v_i = \mathrm{tail}(e_j) \\ 0 & \text{otherwise.} \end{cases}$$

It now follows that for undirected networks, the 
Laplacian has an equivalent representation as 

$$L = D D^T,$$


where $D$ is the incidence matrix associated with 
an arbitrary orientation (assignment of directions 
to the edges) of the undirected graph. This in 
turn implies that for undirected networks, $L$ is 
a positive semi-definite matrix and that all of its 
eigenvalues are nonnegative. 

If the network is directed, one has to pay 
attention to the direction in which information 
is flowing, using the in-degree and out-degree 
of the vertices. The out-degree of vertex $i$ is 
the number of directed edges that originate at 
$i$, and similarly the in-degree of node $i$ is the 
number of directed edges that terminate at node $i$. 
A directed graph is balanced if the out-degree is 
equal to the in-degree at every vertex in the graph. 
The graph Laplacian for directed graphs is 
obtained by only counting information flowing in 
the correct direction, i.e., if $L = [l_{ij}]$ then $l_{ii}$ is 
the in-degree of vertex $i$ and $l_{ij} = -1$ if $i \ne j$ 
and $(v_j, v_i) \in E$. 

As a final note, for both directed and 
undirected networks, it is possible to associate 
weights to the edges, $w : E \to T$, where $T$ is a set 
of nonnegative reals or more generally a field, in 
which case the Laplacian's diagonal elements are 
the sum of the weights of edges incident to node 
$i$ and the off-diagonal elements are $-w(v_j, v_i)$ 
when $(v_j, v_i) \in E$. 


Applications 

Graph-based coordination has been used in a 
number of application domains, such as multi-agent 
robotics, mobile sensor and communication 
networks, formation control, and biological systems. 
One way in which the consensus protocol 
can be generalized is by defining an edge-tension 
energy $E_{ij}(\|x_i - x_j\|)$ along each edge in the 
graph, which gives the total energy in the network 
as 

$$E(x) = \sum_{i=1}^{N} \sum_{j \in N_i} E_{ij}(\|x_i - x_j\|).$$

If the agents update their states in such a way as 
to reduce the total energy in the system according 
to a gradient descent scheme, the update law 
becomes 

$$\dot{x}_i = -\frac{\partial E(x)}{\partial x_i},$$

so that 

$$\dot{E}(x) = -\left\| \frac{\partial E(x)}{\partial x} \right\|_2^2,$$

which is nonpositive, i.e., the total energy is 
reduced in the network. For undirected networks, 
the ensemble version of this protocol assumes the 
form 

$$\dot{x} = -L_w(x)\, x,$$

where the weighted graph Laplacian is 

$$L_w(x) = D W(x) D^T,$$

with the weight matrix $W(x) = \mathrm{Diag}(w_1(x), \ldots, 
w_M(x))$. Here $M$ is the total number of edges 
in the network, and $w_k(x)$ is the weight that 
corresponds to the $k$th edge, given an arbitrary 
ordering of the edges consistent with the incidence 
matrix $D$. 

This energy interpretation allows for the 
synthesis of coordination laws for multi-agent 
networks with desirable properties, such as 
making the agents reach the desired interagent 
distances $d_{ij}$, as shown in Fig. 2. Other 
applications where these types of constructions 
have been used include collision avoidance and 
connectivity maintenance. 
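
The weighted protocol can be sketched in the same spirit. The example below is an illustration under assumptions made here, not the construction used for Fig. 2: the helper names, the three agents with scalar positions on a path graph, the desired distances, and the quadratic edge energy E_k = ½(|e_k| − d_k)² (counted once per edge, with e_k the state difference along edge k) are all hypothetical. For this energy the edge weight is w_k(x) = (|e_k| − d_k)/|e_k|, and a forward-Euler discretization of ẋ = −D W(x) D^T x drives the interagent distances toward d_k while the total energy decreases.

```python
def incidence(n, edges):
    """Incidence matrix D for edges given as (tail, head) pairs on vertices 0..n-1."""
    D = [[0.0] * len(edges) for _ in range(n)]
    for k, (tail, head) in enumerate(edges):
        D[tail][k] = -1.0
        D[head][k] = 1.0
    return D

def weighted_laplacian(D, w):
    """Weighted Laplacian L_w = D W D^T with W = Diag(w)."""
    n, M = len(D), len(w)
    return [[sum(D[i][k] * w[k] * D[j][k] for k in range(M)) for j in range(n)]
            for i in range(n)]

def edge_weights(x, edges, d):
    """State-dependent weights w_k(x) = (|e_k| - d_k) / |e_k|, with e_k = x_head - x_tail."""
    return [(abs(x[h] - x[t]) - dk) / abs(x[h] - x[t]) for (t, h), dk in zip(edges, d)]

def energy(x, edges, d):
    """Total edge-tension energy, each edge counted once (an assumption of this sketch)."""
    return sum(0.5 * (abs(x[h] - x[t]) - dk) ** 2 for (t, h), dk in zip(edges, d))

# Hypothetical example: three agents on a line, desired spacing 1 along each edge.
edges = [(0, 1), (1, 2)]
d = [1.0, 1.0]
D = incidence(3, edges)
L_unweighted = weighted_laplacian(D, [1.0, 1.0])   # equals D D^T = Delta - A

x = [0.0, 0.5, 3.0]
dt = 0.05
energies = [energy(x, edges, d)]
for _ in range(1000):
    Lw = weighted_laplacian(D, edge_weights(x, edges, d))
    x = [x[i] - dt * sum(Lw[i][j] * x[j] for j in range(3)) for i in range(3)]
    energies.append(energy(x, edges, d))
print(x, energies[-1])
```

The weights are undefined when two neighboring agents coincide; the initial condition above is chosen so that this does not occur.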


Graphs for Modeling Networked Interactions, Fig. 2 Fifteen mobile robots are forming the letter “G” by executing 
a weighted version of the consensus protocol. (a) Formation control (t = 0). (b) Formation control (t = 5) 


Summary and Future Directions 

A number of issues pertaining to graph-based distributed 
control remain to be resolved. These include 
how heterogeneous networks, i.e., networks 
comprising agents with different capabilities, 
can be designed and understood. A variation on 
this theme is networks of networks, i.e., networks 
that are loosely coupled together and that 
must coordinate at a higher level of abstraction. 
Another key issue concerns how human 
operators should interact with networked control 
systems. 


Cross-References 

► Averaging Algorithms and 
Consensus 

► Distributed Optimization 

► Dynamic Graphs, Connectivity of 

► Flocking in Networked Systems 

► Networked Systems 

► Optimal Deployment and Spatial 

Recommended Reading 

There are a number of research manuscripts 
and textbooks that explore the role of network 
structure on the system theoretic aspects of 
networked dynamic systems and its many 
ramifications. Some of these references are listed 
below. 

Abstract 

An optimization-based approach to linear 
feedback control system design uses the H 2 
norm, or energy of the impulse response, 
to quantify closed-loop performance. In this 
entry, an overview of state-space methods for 
solving H 2 optimal control problems via Riccati 
equations and matrix inequalities is presented 
in a continuous-time setting. Both regular and 
singular problems are considered. Connections 
to so-called LQR and LQG control problems are 
also described. 

Keywords 

Feedback control; H 2 control; Linear matrix 
inequalities; Linear systems; Riccati equations; 
State-space methods 

Introduction 

Modern multivariable control theory based 
on state-space models is able to handle 


multi-feedback-loop designs, with the added 
benefit that design methods derived from it 
are amenable to computer implementation. 
Indeed, over the last five decades, a number of 
multivariable analysis and design methods have 
been developed using the state-space description 
of systems. Of these design tools, $H_2$ optimal 
control problems involve minimizing the $H_2$ 
norm of the closed-loop transfer function from 
exogenous disturbance signals to pertinent 
controlled output signals of a given plant 
by appropriate use of an internally stabilizing 
feedback controller. It was not until the 1990s 
that a complete solution to the general $H_2$ optimal 
control problem began to emerge. To elaborate 
on this, let us concentrate our discussion on $H_2$ 
optimal control for a continuous-time system $\Sigma$ 
expressed in the following state-space form: 

$$\dot{x} = Ax + Bu + Ew \tag{1}$$

$$y = C_1 x + D_{11} u + D_1 w \tag{2}$$

$$z = C_2 x + D_2 u + D_{22} w \tag{3}$$

where x is the state variable, u is the control 
input, w is the exogenous disturbance input, y is 
the measurement output, and z is the controlled 
output. The system $\Sigma$ is typically an augmented 
or generalized plant model including weighting 
functions that reflect design requirements. The 
$H_2$ optimal control problem is to find an appropriate 
control law, relating the control input 
$u$ to the measured output $y$, such that when it 
is applied to the given plant in Eqs. (1)-(3), the 
resulting closed-loop system is internally stable, 
and the $H_2$ norm of the resulting closed-loop 
transfer matrix from the disturbance input $w$ to 
the controlled output $z$, denoted by $T_{zw}(s)$, is 
minimized. For a stable transfer matrix $T_{zw}(s)$, 
the $H_2$ norm is defined as 

$$\|T_{zw}\|_2 = \left( \frac{1}{2\pi} \operatorname{trace} \int_{-\infty}^{\infty} T_{zw}(j\omega)\, T_{zw}^H(j\omega)\, d\omega \right)^{1/2} \tag{4}$$

where $T_{zw}^H$ is the conjugate transpose of $T_{zw}$. Note 
that the $H_2$ norm is equal to the energy of the 
impulse response associated with $T_{zw}(s)$ and this 
is finite only if the direct feedthrough term of the 
transfer matrix is zero. 
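
As a simple worked illustration of definition (4) (the scalar transfer function below is a hypothetical example chosen for convenience, not taken from the original text), consider $T_{zw}(s) = 1/(s+a)$ with $a > 0$, whose impulse response is $e^{-at}$ for $t \ge 0$:

```latex
\|T_{zw}\|_2^2
  = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{d\omega}{\omega^2+a^2}
  = \frac{1}{2\pi}\cdot\frac{\pi}{a}
  = \frac{1}{2a},
\qquad
\int_0^{\infty}\bigl(e^{-at}\bigr)^2\,dt = \frac{1}{2a}.
```

Both computations give $\|T_{zw}\|_2 = 1/\sqrt{2a}$, in agreement with the impulse-response interpretation. If a constant feedthrough term $d \ne 0$ were added, $|T_{zw}(j\omega)|^2$ would tend to $d^2$ as $\omega \to \infty$ and the integral in (4) would diverge.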

It is standard to make the following assumptions 
on the problem data: $D_{11} = 0$; $D_{22} = 
0$; $(A, B)$ is stabilizable; $(A, C_1)$ is detectable. 
The last two assumptions are necessary for the 
existence of an internally stabilizing control law. 
The first assumption can be made without loss 
of generality via a constant loop transformation. 
Finally, either the assumption $D_{22} = 0$ can be 
achieved by a pre-static feedback law, or the 
problem does not yield a solution that has finite 
$H_2$ closed-loop norm. 

There are two main groups into which all H 2 
optimal control problems can be divided. The 
first group, referred to as regular H 2 optimal 
control problems, consists of those problems for 
which the given plant satisfies two additional 
assumptions: 

1. The subsystem from the control input to the 
controlled output, i.e., $(A, B, C_2, D_2)$, has no 
invariant zeros on the imaginary axis, and 
its direct feedthrough matrix, $D_2$, is injective 
(i.e., it is tall and of full rank). 

2. The subsystem from the exogenous disturbance 
to the measurement output, i.e., 
$(A, E, C_1, D_1)$, has no invariant zeros on 
the imaginary axis, and its direct feedthrough 
matrix, $D_1$, is surjective (i.e., it is fat and of 
full rank). 

Assumption 1 implies that $(A, B, C_2, D_2)$ is left 
invertible with no infinite zero, and Assumption 2 
implies that $(A, E, C_1, D_1)$ is right invertible 
with no infinite zero. The second group, referred to 
as singular $H_2$ optimal control problems, consists 
of those which are not regular. 

Most of the research in the literature has been devoted to regular problems. Likewise, most of the available textbooks and review articles (see, for example, Anderson and Moore (1989), Bryson and Ho (1975), Fleming and Rishel (1975), Kailath (1974), Kwakernaak and Sivan (1972), Lewis (1986), and Zhou et al. (1996), to name a few) cover predominantly only a subset of regular problems. The singular $H_2$ control problem with state feedback was studied in Geerts (1989) and Willems et al. (1986). Using different classes of state- and measurement-feedback control laws, Stoorvogel et al. (1993) studied the general $H_2$ optimal control problems for the first time. In particular, necessary and sufficient conditions are provided therein for the existence of a solution in the case of state-feedback control and in the case of measurement-feedback control. Following this, Trentelman and Stoorvogel (1995) explored necessary and sufficient conditions for the existence of an $H_2$ optimal controller within the context of discrete-time and sampled-data systems. At the same time, Chen et al. (1993, 1994a) provided a thorough treatment of the $H_2$ optimal control problem with state-feedback controllers. This includes a parameterization and construction of the set of all $H_2$ optimal controllers and the associated sets of $H_2$ optimal fixed modes and $H_2$ optimal fixed decoupling zeros. They also provided a computationally feasible design algorithm for selecting an $H_2$ optimal state-feedback controller that places the closed-loop poles at desired locations whenever possible. Furthermore, Chen and Saberi (1993) and Chen et al. (1996) developed the necessary and sufficient conditions for the uniqueness of an $H_2$ optimal controller. Interested readers are referred to the textbook Saberi et al. (1995) for a detailed treatment of $H_2$ optimal control problems in their full generality.


Regular Case 

Solving regular $H_2$ optimal control problems is relatively straightforward. In the case that all of the state variables of the given plant are available for feedback, i.e., $y = x$, and Assumption 1 holds, the corresponding $H_2$ optimal control problem can be solved in terms of the unique positive semi-definite stabilizing solution $P \ge 0$ of the following algebraic Riccati equation:

$$A^T P + PA + C_2^T C_2 - (PB + C_2^T D_2)(D_2^T D_2)^{-1}(D_2^T C_2 + B^T P) = 0 \tag{5}$$

The $H_2$ optimal state-feedback law is given by

$$u = Fx = -(D_2^T D_2)^{-1}(D_2^T C_2 + B^T P)\,x \tag{6}$$

and the resulting closed-loop transfer matrix from $w$ to $z$, $T_{zw}(s)$, has the following property:

$$\|T_{zw}\|_2 = \sqrt{\operatorname{trace}(E^T P E)} \tag{7}$$

Note that the $H_2$ optimal state-feedback control law is generally nonunique. A trivial example is the case when $E = 0$, whereby every stabilizing control law is an optimal solution. It is also interesting to note that the closed-loop system comprising the given plant with $y = x$ and the state-feedback control law of Eq. (6) has poles at all the stable invariant zeros and all the mirror images of the unstable invariant zeros of $(A, B, C_2, D_2)$ together with some other fixed locations in the left half complex plane. More detailed results about the optimal fixed modes and fixed decoupling zeros for general $H_2$ optimal control can be found in Chen et al. (1993).

It can be shown that the well-known linear quadratic regulation (LQR) problem can be reformulated as a regular $H_2$ optimal control problem. For a given plant

$$\dot{x} = Ax + Bu, \quad x(0) = X_0 \tag{8}$$

with $(A, B)$ being stabilizable, the LQR problem is to find a control law $u = Fx$ such that the following performance index is minimized:

$$J = \int_0^\infty (x^T Q_* x + u^T R_* u)\,dt, \tag{9}$$

where $R_* > 0$ and $Q_* \ge 0$ with $(A, Q_*^{1/2})$ being detectable. The LQR problem is equivalent to finding a static state-feedback $H_2$ optimal control law for the following auxiliary plant $\Sigma_{LQR}$:

$$\dot{x} = Ax + Bu + X_0 w \tag{10}$$

$$y = x \tag{11}$$

$$z = \begin{pmatrix} Q_*^{1/2} \\ 0 \end{pmatrix} x + \begin{pmatrix} 0 \\ R_*^{1/2} \end{pmatrix} u \tag{12}$$

For the measurement-feedback case with both Assumptions 1 and 2 being satisfied, the corresponding $H_2$ optimal control problem can be solved by finding a positive semi-definite stabilizing solution $P \ge 0$ for the Riccati equation given in Eq. (5) and a positive semi-definite stabilizing solution $Q \ge 0$ for the following Riccati equation:

$$QA^T + AQ + EE^T - (QC_1^T + ED_1^T)(D_1 D_1^T)^{-1}(D_1 E^T + C_1 Q) = 0 \tag{13}$$

The $H_2$ optimal measurement-feedback law is given by

$$\dot{v} = (A + BF + KC_1)v - Ky, \quad u = Fv \tag{14}$$

where $F$ is as given in Eq. (6) and

$$K = -(QC_1^T + ED_1^T)(D_1 D_1^T)^{-1} \tag{15}$$

In fact, such an optimal control law is unique and the resulting closed-loop transfer matrix from $w$ to $z$, $T_{zw}(s)$, has the following property:

$$\|T_{zw}\|_2 = \left\{\operatorname{trace}(E^T P E) + \operatorname{trace}\left[(A^T P + PA + C_2^T C_2)\,Q\right]\right\}^{1/2} \tag{16}$$
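As a hedged illustration of Eqs. (5), (6), (13), (15), and (16), the sketch below works a scalar plant by hand. The plant data are assumptions chosen for convenience and are not from the entry: $A = a$, $B = b$, $E = (e \;\; 0)$, $C_1 = c_1$, $D_1 = (0 \;\; 1)$, $C_2 = (1 \;\; 0)^T$, $D_2 = (0 \;\; 1)^T$, so that both Riccati equations reduce to scalar quadratics with closed-form stabilizing roots. The function name `h2_scalar_design` is hypothetical.

```python
import math


def h2_scalar_design(a, b, e, c1):
    """Scalar regular H2 design under the assumed plant structure.

    With C2 = [1, 0]^T and D2 = [0, 1]^T, Eq. (5) becomes
        2*a*P + 1 - (b*P)**2 = 0,
    and with E = [e, 0], D1 = [0, 1], Eq. (13) becomes
        2*a*Q + e**2 - (c1*Q)**2 = 0.
    Requires b != 0 and c1 != 0.
    """
    # Stabilizing (larger) roots of the two scalar quadratics.
    P = (a + math.sqrt(a * a + b * b)) / (b * b)
    Q = (a + math.sqrt(a * a + (e * c1) ** 2)) / (c1 * c1)
    F = -b * P        # Eq. (6): state-feedback gain
    K = -Q * c1       # Eq. (15): observer gain
    # Eq. (16): closed-loop H2 norm under measurement feedback.
    norm = math.sqrt(e * e * P + (2.0 * a * P + 1.0) * Q)
    return P, Q, F, K, norm


if __name__ == "__main__":
    P, Q, F, K, norm = h2_scalar_design(a=1.0, b=1.0, e=1.0, c1=1.0)
    print("P =", P, "Q =", Q)
    print("A + B F =", 1.0 + 1.0 * F, " A + K C1 =", 1.0 + K * 1.0)
    print("H2 norm =", norm)
```

For this unstable plant ($a = 1$) both $A + BF$ and $A + KC_1$ come out negative, which is what "stabilizing solution" requires. For matrix-valued plants a numerical Riccati solver would be used instead of the closed-form roots.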

Similarly, consider the standard LQG problem for the following system:

$$\dot{x} = Ax + Bu + G_* d \tag{17}$$

$$y = Cx + N_* n, \quad N_* > 0 \tag{18}$$

$$z = \begin{pmatrix} H_* x \\ R_* u \end{pmatrix}, \quad R_* > 0, \quad w = \begin{pmatrix} d \\ n \end{pmatrix} \tag{19}$$

where $x$ is the state, $u$ is the control, $d$ and $n$ are white noises with identity covariance, and $y$ is the measurement output. It is assumed that $(A, B)$ is stabilizable and $(A, C)$ is detectable. The control objective is to design an appropriate control law that minimizes the expectation of $|z|^2$. Such an LQG problem can be solved via the $H_2$ optimal control problem for the following auxiliary system $\Sigma_{LQG}$ (see Doyle 1983):

$$\dot{x} = Ax + Bu + [\,G_* \;\; 0\,]\,w \tag{20}$$

$$y = Cx + [\,0 \;\; N_*\,]\,w \tag{21}$$

$$z = \begin{pmatrix} H_* \\ 0 \end{pmatrix} x + \begin{pmatrix} 0 \\ R_* \end{pmatrix} u \tag{22}$$

The $H_2$ optimal control problem for discrete-time systems can be solved in a similar way via the corresponding discrete-time algebraic Riccati equations. It is worth noting that many works can be found in the literature that deal with solutions to discrete-time algebraic Riccati equations related to optimal control problems; see, for example, Kucera (1972), Pappas et al. (1980), and Silverman (1976), to name a few. It is proven in Chen et al. (1994b) that solutions to the discrete- and continuous-time algebraic Riccati equations for optimal control problems can be unified. More specifically, a discrete-time Riccati equation can be solved through an equivalent continuous-time one and vice versa.


Singular Case

As in the previous section, only the key procedure in solving the singular $H_2$-optimization problem for continuous-time systems is addressed. For the singular problem, it is generally not possible to obtain an optimal solution, except for some situations when the given plant satisfies certain geometric constraints; see, e.g., Chen et al. (1993) and Stoorvogel et al. (1993). It is more feasible to find a suboptimal control law for the singular problem, i.e., to find an appropriate control law such that the $H_2$ norm of the resulting closed-loop transfer matrix from $w$ to $z$ can be made arbitrarily close to the best possible performance. The procedure given below is to transform the original problem into an $H_2$ almost disturbance decoupling problem; see Stoorvogel (1992) and Stoorvogel et al. (1993).

Consider the given plant in Eqs. (1)-(3) with Assumption 1 and/or Assumption 2 not satisfied. First, find the largest solution $P \ge 0$ for the following linear matrix inequality

$$F(P) = \begin{pmatrix} A^T P + PA + C_2^T C_2 & PB + C_2^T D_2 \\ B^T P + D_2^T C_2 & D_2^T D_2 \end{pmatrix} \ge 0 \tag{23}$$

and find the largest solution $Q \ge 0$ for

$$G(Q) = \begin{pmatrix} AQ + QA^T + EE^T & QC_1^T + ED_1^T \\ C_1 Q + D_1 E^T & D_1 D_1^T \end{pmatrix} \ge 0 \tag{24}$$


Note that by decomposing the quadruples $(A, B, C_2, D_2)$ and $(A, E, C_1, D_1)$ into various subsystems in accordance with their structural properties, solutions to the above linear matrix inequalities can be obtained by solving a Riccati equation similar to those in Eq. (5) or Eq. (13) for the regular case. In fact, for the regular problem, the largest solution $P \ge 0$ for Eq. (23) and the stabilizing solution $P \ge 0$ for Eq. (5) are identical. Similarly, the largest solution $Q \ge 0$ for Eq. (24) and the stabilizing solution $Q \ge 0$ for Eq. (13) are also the same. Interested readers are referred to Stoorvogel et al. (1993) for more details or to Chen et al. (2004) for a more systematic treatment of the structural decomposition of linear systems and its connection to the solutions of the linear matrix inequalities.

It can be shown that the best achievable $H_2$ norm of the closed-loop transfer matrix from $w$ to $z$, i.e., the best possible performance over all internally stabilizing control laws, is given by

$$\gamma_2^* = \left\{\operatorname{trace}(E^T P E) + \operatorname{trace}\left[(A^T P + PA + C_2^T C_2)\,Q\right]\right\}^{1/2} \tag{25}$$
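Next, partition

$$F(P) = \begin{pmatrix} C_P^T \\ D_P^T \end{pmatrix} \begin{pmatrix} C_P & D_P \end{pmatrix} \quad \text{and} \quad G(Q) = \begin{pmatrix} E_Q \\ D_Q \end{pmatrix} \begin{pmatrix} E_Q^T & D_Q^T \end{pmatrix} \tag{26}$$

where $[\,C_P \;\; D_P\,]$ and $[\,E_Q^T \;\; D_Q^T\,]$ are of maximal rank, and then define an auxiliary system $\Sigma_{PQ}$:

$$\dot{x}_{PQ} = A x_{PQ} + B u + E_Q w_{PQ} \tag{27}$$

$$y = C_1 x_{PQ} + D_Q w_{PQ} \tag{28}$$

$$z_{PQ} = C_P x_{PQ} + D_P u \tag{29}$$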

It can be shown that the quadruple $(A, B, C_P, D_P)$ is right invertible and has no invariant zeros in the open right-half complex plane, and the quadruple $(A, E_Q, C_1, D_Q)$ is left invertible and has no invariant zeros in the open right-half complex plane. It can also be shown that there exists an appropriate control law such that when it is applied to $\Sigma_{PQ}$, the resulting closed-loop system is internally stable and the $H_2$ norm of the closed-loop transfer matrix from $w_{PQ}$ to $z_{PQ}$ can be made arbitrarily small. Equivalently, the $H_2$ almost disturbance decoupling problem for $\Sigma_{PQ}$ is solvable.

More importantly, it can further be shown that if an appropriate control law solves the $H_2$ almost disturbance decoupling problem for $\Sigma_{PQ}$, then it solves the $H_2$ suboptimal problem for $\Sigma$. As such, the singular $H_2$ control problem for $\Sigma$ can be solved by finding a solution to the $H_2$ almost disturbance decoupling problem for $\Sigma_{PQ}$. There is a vast body of results in the literature dealing with disturbance decoupling problems. More detailed treatments can be found in Saberi et al. (1995).

Conclusion 

This entry considers the basic solutions to $H_2$ optimal control problems for continuous-time systems. Both the regular problem and the general singular problem are presented. Readers interested in more details are referred to Saberi et al. (1995) and the references therein for the complete treatment of $H_2$ optimal control problems, and to Chap. 10 of Chen et al. (2004) for the unification and differentiation of $H_2$ control, $H_\infty$ control, and disturbance decoupling control problems. $H_2$ optimal control is a mature area and has a long history. Possible future research includes issues on how to effectively utilize the theory in solving real-life problems.
Abstract

The area of robust control, where the performance of a feedback system is designed to be robust to uncertainty in the plant being controlled, has received much attention since the 1980s. System analysis and controller synthesis based on the H-infinity norm has been central to progress in this area. This article outlines how the control law that minimizes the H-infinity norm of the closed-loop system can be derived. Connections to other problems, such as game theory and risk-sensitive control, are discussed, and finally appropriate problem formulations to produce “good” controllers using this methodology are outlined.

Keywords 

Loop-shaping; Robust control; Robust stability 

Introduction 

The $H_\infty$-norm probably first entered the study of robust control with the observations made by Zames (1981) in considering optimal sensitivity. The so-called $H_\infty$ methods were subsequently developed and are now routinely available to control engineers. In this entry we consider the $H_\infty$ methods for control, and for simplicity of exposition, we will restrict our attention to linear, time-invariant, finite-dimensional, continuous-time systems. Such systems can be represented by their transfer function matrix, $G(s)$, which will then be a rational function of $s$. Although the Hardy space, $H_\infty$, also includes nonrational functions, a rational $G(s)$ is in $H_\infty$ if and only if it is proper and all its poles are in the open left half plane, in which case the $H_\infty$-norm is defined as:

$$\|G(s)\|_\infty = \sup_{\mathrm{Re}\, s > 0} \sigma_{\max}(G(s)) = \sup_{-\infty < \omega < \infty} \sigma_{\max}(G(j\omega))$$

(where $\sigma_{\max}$ denotes the largest singular value). Hence for a single input/single output system with transfer function, $g(s)$, its $H_\infty$-norm, $\|g(s)\|_\infty$, gives the maximum value of $|g(j\omega)|$ and hence the maximum amplification of sinusoidal signals by a system with this transfer function. In the multi-input/multi-output case a similar result holds regarding the system amplification of a vector of sinusoids. There is now a good collection of graduate-level textbooks that cover the area in some detail from a variety of approaches. These are listed in the Recommended Reading section, and the references in this article are generally to these texts rather than to the original journal papers.
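The single input/single output interpretation above can be checked numerically. The following is a minimal sketch, not a production algorithm: it estimates $\|g\|_\infty$ by sampling $|g(j\omega)|$ on a logarithmic frequency grid, which only gives a lower bound on the true supremum. The example plant (a lightly damped second-order system) and the function names are assumptions made for illustration.

```python
import math


def hinf_norm_siso(g, w_min=1e-2, w_max=1e2, n=20001):
    """Grid estimate (a lower bound) of sup over w of |g(jw)| for a stable SISO g."""
    peak = 0.0
    for i in range(n):
        # Logarithmically spaced frequency between w_min and w_max.
        w = w_min * (w_max / w_min) ** (i / (n - 1))
        peak = max(peak, abs(g(1j * w)))
    return peak


def second_order(s, zeta=0.1, wn=1.0):
    """Assumed example: g(s) = wn^2 / (s^2 + 2*zeta*wn*s + wn^2)."""
    return wn ** 2 / (s ** 2 + 2.0 * zeta * wn * s + wn ** 2)


if __name__ == "__main__":
    zeta = 0.1
    estimate = hinf_norm_siso(second_order)
    # Known resonance peak of this second-order system for zeta < 1/sqrt(2).
    exact = 1.0 / (2.0 * zeta * math.sqrt(1.0 - zeta ** 2))
    print("grid estimate:", estimate, " closed form:", exact)
```

A grid can miss a very sharp resonance, so state-space methods are normally preferred for accurate computation; the sweep is shown only because it mirrors the definition directly.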

Consider a system with transfer function, $G(s)$, input vector, $u(t) \in L_2(0, \infty)$ and an output vector, $y(t)$, whose Laplace transforms are given by $u(s)$ and $y(s)$. Such a system will have a state space realization,

$$\dot{x}(t) = Ax(t) + Bu(t), \quad y(t) = Cx(t) + Du(t)$$

giving $G(s) = D + C(sI - A)^{-1}B$, which we also denote

$$G(s) = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]$$

and hence $y(s) = G(s)u(s)$ if $x(0) = 0$.

H-Infinity Control, Fig. 1 Lower linear fractional transformation: feedback system

There are two main reasons for using the $H_\infty$-norm. The first is in representing the system gain for input signals $u(t) \in L_2(0, \infty)$ or equivalently $u(j\omega) \in L_2(-\infty, \infty)$, with corresponding norm $\|u\|_2^2 = \int_0^\infty u(t)^* u(t)\, dt$ (where $v^*$ denotes the conjugate transpose of the vector $v$ (or a matrix)). With these input and output spaces the induced norm of the system is easily shown to be the $H_\infty$-norm of $G(s)$, and in particular,

$$\|y\|_2 \le \|G(s)\|_\infty \|u\|_2$$

Hence in a control context the $H_\infty$-norm can give a measure of the gain, for example, from disturbances to the resulting errors. In the interconnection of systems, the property that $\|P(s)Q(s)\|_\infty \le \|P(s)\|_\infty \|Q(s)\|_\infty$ is often useful.

The second reason for using the $H_\infty$-norm is in representing uncertainty in the plant being controlled, e.g., the nominal plant is $P_0(s)$ but the actual plant is $P(s) = P_0(s) + \Delta(s)$ where $\|\Delta(s)\|_\infty < 1$.

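A typical control design problem is given in Fig. 1, i.e.,

$$\begin{bmatrix} z \\ y \end{bmatrix} = P \begin{bmatrix} w \\ u \end{bmatrix} = \begin{bmatrix} P_{11}w + P_{12}u \\ P_{21}w + P_{22}u \end{bmatrix}, \quad u = Ky$$

$$\Rightarrow \quad y = (I - P_{22}K)^{-1}P_{21}w, \quad u = K(I - P_{22}K)^{-1}P_{21}w$$

$$z = \left(P_{11} + P_{12}K(I - P_{22}K)^{-1}P_{21}\right)w =: F_l(P, K)\,w =: T_{z \leftarrow w}\,w$$

where $F_l(P, K)$ denotes the lower Linear Fractional Transformation (LFT) with connection around the lower terminals of $P$ as in Fig. 1.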
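The lower LFT formula can be sanity-checked for scalar blocks by eliminating $y$ and $u$ from the loop equations directly. The sketch below assumes every block of $P$ and the controller $K$ are plain (real or complex) numbers, for instance frequency responses evaluated at one $s = j\omega$; for matrix blocks the products do not commute and the ordering in the formula above must be kept. The function names are hypothetical.

```python
def lower_lft(p11, p12, p21, p22, k):
    """Scalar lower LFT: F_l(P, K) = P11 + P12*K*(1 - P22*K)^(-1)*P21."""
    loop = 1.0 - p22 * k
    if loop == 0:
        raise ZeroDivisionError("ill-posed feedback loop: 1 - P22*K = 0")
    return p11 + p12 * k * p21 / loop


def closed_loop_by_elimination(p11, p12, p21, p22, k, w=1.0):
    """Solve y = P21*w + P22*u with u = K*y, then return z/w."""
    y = p21 * w / (1.0 - p22 * k)
    u = k * y
    z = p11 * w + p12 * u
    return z / w


if __name__ == "__main__":
    blocks = dict(p11=0.5, p12=2.0, p21=1.0, p22=-3.0, k=0.25)
    print(lower_lft(**blocks), closed_loop_by_elimination(**blocks))
```

Both routes give the same closed-loop gain from $w$ to $z$, which is the content of the identity $z = F_l(P, K)\,w$.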

The standard $H_\infty$-control synthesis problem is to find a controller with transfer function, $K$, that stabilizes the closed-loop system in Fig. 1 and minimizes $\|F_l(P, K)\|_\infty$.

That is, the controller is designed to minimize the worst-case effect of the disturbance $w$ on the output/error signal $z$ as measured by the $L_2$ norm of the signals. This article will describe the solution to this problem.

Robust Stability 

Before we describe the solution to the synthesis problem, consider the problem of the robust stability of an uncertain plant with a feedback controller. Suppose the plant is given by the upper LFT, $F_u(P, \Delta)$ with $\|\Delta\|_\infty \le 1/\gamma$ as illustrated in Fig. 2,

$$y = F_u(P, \Delta)\,u, \tag{1}$$

$$\text{where } F_u(P, \Delta) := P_{22} + P_{21}\Delta(I - P_{11}\Delta)^{-1}P_{12} \tag{2}$$

H-Infinity Control, Fig. 2 Upper linear fractional transformation

H-Infinity Control, Fig. 3 Feedback system with plant uncertainty

The small gain theorem then states that the feedback system of Fig. 3 will be stable for all such $\Delta$ if the feedback connection of $P_{22}$ and $K$ is stable and $\|F_l(P, K)\|_\infty < \gamma$. This robust stability result is valid if $P$ and $\Delta$ are both stable; more care is required when either or both are unstable, but with such care a similar result is true.

Let us consider a couple of examples. First suppose that the uncertainty is represented as output multiplicative uncertainty,

$$P_\Delta = (I + W_1 \Delta W_2)P_0 = F_u\left(\begin{bmatrix} 0 & W_2 P_0 \\ W_1 & P_0 \end{bmatrix}, \Delta\right)$$

with robust stability test given by

$$\left\|F_l\left(\begin{bmatrix} 0 & W_2 P_0 \\ W_1 & P_0 \end{bmatrix}, K\right)\right\|_\infty = \left\|W_2 P_0 K(I - P_0 K)^{-1} W_1\right\|_\infty < \gamma$$

As a second example consider the plants $P_\Delta = (M + \Delta_M)^{-1}(N + \Delta_N)$, with $\Delta = [\,\Delta_N \;\; \Delta_M\,]$ and $\|\Delta\|_\infty \le 1/\gamma$. Here $P_0 = M^{-1}N$ is a left coprime factorization of the nominal plant and the plants $P_\Delta$ are represented by perturbations to these coprime factors. In this case $P_\Delta = F_u(P, \Delta)$, where

$$P = \begin{bmatrix} \begin{bmatrix} 0 \\ -M^{-1} \end{bmatrix} & \begin{bmatrix} I \\ -P_0 \end{bmatrix} \\ M^{-1} & P_0 \end{bmatrix}$$

and the robust stability test will be

$$\|F_l(P, K)\|_\infty = \left\|\begin{bmatrix} K \\ -I \end{bmatrix}(I - P_0 K)^{-1}M^{-1}\right\|_\infty < \gamma$$

This is related to plant perturbations in the gap metric (see Vinnicombe 2001). It is therefore observed that the robust stability test for these useful representations of uncertain plants is given by an $H_\infty$-norm test just as in the controller synthesis problem.


Derivation of the $H_\infty$-Control Law

In this section we present a solution to the $H_\infty$-control problem and give some interpretations of the solution. The approach presented is as in Doyle et al. (1989); see also Zhou et al. (1996). We will make some simplifying structural assumptions to make the formulae less complex and will not state the required assumptions on rank, stabilizability, and detectability. Let the system in Fig. 1 be described by the equations:

$$\dot{x}(t) = Ax(t) + B_1 w(t) + B_2 u(t) \tag{3}$$

$$z(t) = C_1 x(t) + D_{12}u(t) \tag{4}$$

$$y(t) = C_2 x(t) + D_{21}w(t) \tag{5}$$

i.e., in Fig. 1

$$P = \left[\begin{array}{c|cc} A & B_1 & B_2 \\ \hline C_1 & 0 & D_{12} \\ C_2 & D_{21} & 0 \end{array}\right]$$

where we also assume, with little loss of generality, that $D_{12}^* D_{12} = I$, $D_{21}D_{21}^* = I$, $D_{12}^* C_1 = 0$ and $B_1 D_{21}^* = 0$. Since we wish to have $\|T_{z \leftarrow w}\|_\infty < \gamma$, we need to find $u$ such that

$$\|z\|_2^2 - \gamma^2\|w\|_2^2 < 0 \quad \text{for all } w \ne 0 \in L_2(0, \infty).$$

We could consider $w$ to be an adversary trying to make this expression positive, while $u$ has to ensure that it always remains negative in spite of the malicious intentions of $w$, as in a noncooperative game. Suppose that there exists a solution, $X_\infty$, to the Algebraic Riccati Equation (ARE),

$$A^* X_\infty + X_\infty A + C_1^* C_1 + X_\infty(\gamma^{-2}B_1 B_1^* - B_2 B_2^*)X_\infty = 0 \tag{6}$$

with $X_\infty \ge 0$ and $A + (\gamma^{-2}B_1 B_1^* - B_2 B_2^*)X_\infty$ a stable “A-matrix.” A simple substitution then gives that

$$\frac{d}{dt}\left(x(t)^* X_\infty x(t)\right) = -z^* z + \gamma^2 w^* w + v^* v - \gamma^2 r^* r$$

where

$$v := u + B_2^* X_\infty x, \quad r := w - \gamma^{-2}B_1^* X_\infty x.$$

Now let $x(0) = 0$ and assume stability so that $x(\infty) = 0$; then integrating from $0$ to $\infty$ gives

$$\|z\|_2^2 - \gamma^2\|w\|_2^2 = \|v\|_2^2 - \gamma^2\|r\|_2^2 \tag{7}$$

If the state is available to $u$, then the control law $u = -B_2^* X_\infty x$ gives $v = 0$ and $\|z\|_2^2 - \gamma^2\|w\|_2^2 < 0$ for all $w \ne 0$. It can be shown that (6) has a solution if there exists a controller such that $\|F_l(P, K)\|_\infty < \gamma$. In addition, since transposing a system does not change its $H_\infty$-norm, the following dual ARE will also have a solution, $Y_\infty \ge 0$,

$$A Y_\infty + Y_\infty A^* + B_1 B_1^* + Y_\infty(\gamma^{-2}C_1^* C_1 - C_2^* C_2)Y_\infty = 0 \tag{8}$$

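To obtain a solution to the output feedback case, note that (7) implies that $\|z\|_2^2 < \gamma^2\|w\|_2^2$ if and only if $\|v\|_2^2 < \gamma^2\|r\|_2^2$ and $v = F_l(P_{tmp}, K)\,r$ where

$$\begin{bmatrix} v \\ y \end{bmatrix} = P_{tmp}\begin{bmatrix} r \\ u \end{bmatrix} \quad \text{and} \quad P_{tmp} = \left[\begin{array}{c|cc} A + \gamma^{-2}B_1 B_1^* X_\infty & B_1 & B_2 \\ \hline B_2^* X_\infty & 0 & I \\ C_2 & D_{21} & 0 \end{array}\right]$$

The special structure of this problem enables a solution to be derived in much the same way as the dual of the state feedback problem. The corresponding ARE will have a solution $Y_{tmp} = (I - \gamma^{-2}Y_\infty X_\infty)^{-1}Y_\infty \ge 0$ if and only if the spectral radius $\rho(Y_\infty X_\infty) < \gamma^2$.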

The above outline, supported by significant technical detail and assumptions, will therefore demonstrate that there exists a stabilizing controller, $K(s)$, such that the system described by (3)-(5) satisfies $\|T_{z \leftarrow w}\|_\infty < \gamma$ if and only if there exist stabilizing solutions to the AREs in (6) and (8) such that

$$X_\infty \ge 0, \quad Y_\infty \ge 0, \quad \rho(Y_\infty X_\infty) < \gamma^2 \tag{9}$$

The state equations for the resulting controller can be written as

$$\dot{\hat{x}} = A\hat{x} + B_1\hat{w}_{worst} + B_2 u + Z_\infty L_\infty(C_2\hat{x} - y)$$

$$u = F_\infty\hat{x}, \quad \hat{w}_{worst} = \gamma^{-2}B_1^* X_\infty\hat{x}$$

$$F_\infty := -B_2^* X_\infty, \quad L_\infty := -Y_\infty C_2^*, \quad Z_\infty := (I - \gamma^{-2}Y_\infty X_\infty)^{-1}$$

giving feedback from a state estimator in the presence of an estimate of the worst-case disturbance.

As $\gamma \to \infty$ the standard LQG controller is obtained with state feedback of a state estimate obtained from a Kalman filter. In contrast to the LQG problem, the controller depends on the value of $\gamma$, and if this is chosen to be too small, then one of the conditions in (9) will be violated. In order to determine the minimum achievable value of $\gamma$, a bisection search over $\gamma$ can be performed, checking (9) for each candidate value of $\gamma$.
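The bisection search can be sketched for a scalar plant, where the two AREs become quadratics with closed-form stabilizing roots. Everything specific here is an assumption for illustration: the plant has one state with scalars `a`, `b1`, `b2`, `c1`, `c2` standing for $A$, $B_1 B_1^* = b_1^2$, $B_2 B_2^* = b_2^2$, $C_1^* C_1 = c_1^2$, $C_2^* C_2 = c_2^2$ (so the structural assumptions on $D_{12}$ and $D_{21}$ are taken as satisfied by extra input/output channels), `b1` and `c1` are nonzero, and feasibility is assumed monotone in $\gamma$. The function names are hypothetical.

```python
import math


def stabilizing_are_solution(a, q, k):
    """Stabilizing root X >= 0 of the scalar ARE 2*a*X + q - k*X**2 = 0, or None.

    Assumes q > 0. Uses X = q / (sqrt(a^2 + k*q) - a), for which the
    closed-loop pole a - k*X equals -sqrt(a^2 + k*q).
    """
    disc = a * a + k * q
    if disc <= 0.0:
        return None  # no real stabilizing solution
    denom = math.sqrt(disc) - a
    if denom <= 0.0:
        return None  # stabilizing root would be negative or unbounded
    return q / denom


def gamma_is_achievable(gamma, a, b1, b2, c1, c2):
    """Check the scalar version of conditions (9) for one candidate gamma."""
    g2 = gamma ** -2
    # Eq. (6): 2aX + c1^2 + X^2 (g2*b1^2 - b2^2) = 0
    x = stabilizing_are_solution(a, c1 * c1, b2 * b2 - g2 * b1 * b1)
    # Eq. (8): 2aY + b1^2 + Y^2 (g2*c1^2 - c2^2) = 0
    y = stabilizing_are_solution(a, b1 * b1, c2 * c2 - g2 * c1 * c1)
    return x is not None and y is not None and x * y < gamma * gamma


def minimum_gamma(a, b1, b2, c1, c2, lo=1e-6, hi=1e3, tol=1e-9):
    """Bisection for the smallest gamma satisfying (9), assuming monotone feasibility."""
    if not gamma_is_achievable(hi, a, b1, b2, c1, c2):
        raise ValueError("upper bracket is not achievable; increase hi")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gamma_is_achievable(mid, a, b1, b2, c1, c2):
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    print(minimum_gamma(a=1.0, b1=1.0, b2=1.0, c1=1.0, c2=1.0))
```

For the assumed data $a = b_1 = b_2 = c_1 = c_2 = 1$ the conditions reduce by hand to $\gamma^2 - 2\gamma - 2 > 0$, so the search should settle near $1 + \sqrt{3} \approx 2.732$. The binding condition in this case is the spectral radius coupling $X_\infty Y_\infty < \gamma^2$, not the existence of either Riccati solution.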

In the limit as $\gamma \to \gamma_{opt}$ (its minimum value), a variety of situations can arise and the formulae given here may become ill-conditioned. Typically achieving $\gamma_{opt}$ is more of an interesting and sometimes challenging mathematical exercise than a control system requirement.

This control problem does not have a unique solution, and all solutions can be characterized by an LFT form such as $K = F_l(M, Q)$ where $Q \in H_\infty$ with $\|Q\|_\infty < 1$; the present solution is sometimes referred to as the “central solution” obtained with $Q = 0$.


Relations for Other Solution Methods 
and Problem Formulations 

The $H_\infty$-control problem has been shown to be related to an extraordinarily wide variety of mathematical techniques and to other problem areas, and investigations of these connections have been most fruitful. Earlier approaches (see Francis 1988) firstly used the characterization of all stabilizing controllers of Youla et al. (see Vidyasagar 1985), which shows that all stable closed-loop systems can be written as

$$F_l(P, K) = T_1 + T_2 Q T_3, \quad \text{where } Q \in H_\infty$$

and then solved the model matching problem $\inf_{Q \in H_\infty}\|T_1 + T_2 Q T_3\|_\infty$. This model matching problem is related to interpolation theory and resulted in a productive interaction with operator theory. One solution method reduces this problem to J-spectral factorisation problems (where $J = \begin{bmatrix} I & 0 \\ 0 & -I \end{bmatrix}$) and generates state-space solutions to these problems (Kimura 1997).

The derivation above clearly demonstrates relations to noncooperative differential games, and this is fully developed in Başar and Bernhard (1995) and Green and Limebeer (1995).

The model matching problem is clearly a convex optimization problem. The solution of linear matrix inequalities can give effective methods for solving certain convex optimization problems (e.g., calculating the $H_\infty$ norm using the bounded real lemma) and can be exploited in the $H_\infty$-control problem. See Boyd and Barratt (1991) for a variety of results on convex optimization and control and Dullerud and Paganini (2000) for this approach in robust control.

As noted above there is a family of solutions to the $H_\infty$-control problem. The central solution in fact minimizes the entropy integral given by

$$I(T_{z \leftarrow w}; \gamma) = -\frac{\gamma^2}{2\pi}\int_{-\infty}^{\infty}\ln\left|\det\left(I - \gamma^{-2}T_{z \leftarrow w}(j\omega)^* T_{z \leftarrow w}(j\omega)\right)\right| d\omega \tag{10}$$

It can be seen that this criterion will penalize the singular values of $T_{z \leftarrow w}(j\omega)$ from being close to $\gamma$ for a large range of frequencies.

One of the more surprising connections is with the risk-sensitive stochastic control problem (Whittle 1990) where $w$ is assumed to be Gaussian white noise and it is desired to minimize

$$J_T(\gamma) := \frac{2\gamma^2}{T}\ln E\left\{e^{\frac{1}{2}\gamma^{-2}V_T}\right\} \tag{11}$$

$$\text{where } V_T := \int_0^T z(t)^* z(t)\, dt \tag{12}$$

The situation with $\gamma^2 > 0$ corresponds to the risk-averse controller since large values of $V_T$ are heavily penalized by the exponential function. It can be shown that if $\|T_{z \leftarrow w}\|_\infty < \gamma$, then

$$\lim_{T \to \infty} J_T(\gamma) = I(T_{z \leftarrow w}; \gamma)$$

and hence the central controller minimizes both the entropy integral and the risk-sensitive cost function. When $\gamma$ is chosen to be too small, Whittle refers to the controller having a “neurotic breakdown” because the cost will be infinite for all possible control laws! If in (11) we set $\gamma^2 = -\theta^{-1}$, then the entropy minimizing controller will have $\theta < 0$ and will be risk-averse. The risk-neutral controller is obtained when $\theta \to 0$, $\gamma \to \infty$ and gives the standard LQG case. If $\theta > 0$, then the controller will be risk-seeking, believing that large variance will be in its favor.

Controller Design with $H_\infty$ Optimization

The above solutions to the $H_\infty$ mathematical problem do not give guidance on how to set up a problem to give a “good” control system design. The problem formulation typically involves identifying frequency-dependent weighting matrices to characterize the disturbances, $w$, and the relative importance of the errors, $z$ (see Skogestad and Postlethwaite 1996). The choice of weights should also incorporate system uncertainty to obtain a robust controller.

One approach that combines both closed-loop system gain and system uncertainty is called $H_\infty$ loop-shaping, where the desired closed-loop behavior is determined by the design of the loop-shape using pre- and post-compensators and the system uncertainty is represented in the gap metric (see Vinnicombe 2001). This makes classical criteria such as low-frequency tracking error, bandwidth, and high-frequency roll-off all easy to incorporate. In this framework the performance and robustness measures are very well matched to each other. Such an approach has been successfully exploited in a number of practical examples (e.g., Hyde (1995) for flight control taken through to successful flight tests). Standard control design software packages now routinely have $H_\infty$-control design modules.


Summary and Future Directions 

We have outlined the derivation of $H_\infty$ controllers with straightforward assumptions that nevertheless exhibit most of the features of linear time-invariant systems without such assumptions and for which routine design software is now available. Connections to a surprisingly large range of other problems are also discussed.

Generalizations to more general cases such as time-varying and nonlinear systems, where the norm is interpreted as the induced norm of the system in $L_2$, can be derived, although the computational aspects are no longer routine. For the problems of robust control, there are necessarily continuing efforts to match the mathematical representation of system uncertainty and system performance to the physical system requirements and to have such representations amenable to analysis and computation.
Abstract

This entry gives an overview of the development of adaptive control, starting with the early efforts in flight and process control. Two popular schemes, the model reference adaptive controller and the self-tuning regulator, are described with a thumbnail overview of theory and applications. There is currently a resurgence in adaptive flight control as well as in other applications. Some reflections on future development are also given.
Introduction

In everyday language, to adapt means to change a behavior to conform to new circumstances, for example, when the pupil area changes to accommodate variations in ambient light. The distinction between adaptation and conventional feedback is subtle because feedback also attempts to reduce the effects of disturbances and plant uncertainty. Typical examples are adaptive optics and adaptive machine tool control, which are conventional feedback systems, with controllers having constant parameters. In this entry we take the pragmatic attitude that an adaptive controller is a controller that can modify its behavior in response to changes in the dynamics of the process and the character of the disturbances, by adjusting the controller parameters.

Adaptive control has had a colorful history with many ups and downs and intense debates in the research community. It emerged in the 1950s, stimulated by attempts to design autopilots for supersonic aircraft. Autopilots based on constant-gain, linear feedback worked well in one operating condition but not over the whole flight envelope. In process control there was also a need for automatic tuning of simple controllers.

Much research in the 1950s and early 1960s 
contributed to conceptual understanding of 
adaptive control. Bellman showed that dynamic 
programming could capture many aspects of 
adaptation (Bellman 1961). Feldbaum introduced 
the notion of dual control, meaning that control
should be probing as well as directing; the 
controller should thus inject test signals to obtain 
better information. Tsypkin showed that schemes 
for learning and adaptation could be captured in 
a common framework (Tsypkin 1971). 

Gabor’s work on adaptive filtering (Gabor 
et al. 1959) inspired Widrow to develop an 
analogue neural network (Adaline) for adaptive 
control (Widrow 1962). Widrow’s adaptation 
mechanism was inspired by Hebbian learning in 
biological systems (Hebb 1949). 

There are adaptive control problems in economics and operations research. In these fields the problems are often called decision making under uncertainty. A simple idea, called the certainty equivalence principle, proposed by Simon (1956), is to neglect uncertainty and treat estimates as if they are true. Certainty equivalence was commonly used in early work on adaptive control.

A period of intense research and ample funding ended dramatically in 1967 with a crash of the rocket-powered X-15-3 using Honeywell’s MH-96 self-oscillating adaptive controller. The self-oscillating adaptive control system has, however, been successfully used in several missiles.

Research in adaptive control resurged in the 1970s, when two schemes, model reference adaptive control (MRAC) and the self-tuning regulator (STR), emerged together with successful applications. The research was influenced by stability theory and advances in the field of system identification. There was an intensive period of research from the late 1970s through the 1990s. The insight and understanding of stability, convergence, and robustness increased. Recently there has been renewed interest because of flight control (Hovakimyan and Cao 2010; Lavretsky and Wise 2013) and other applications; there is, for example, a need for adaptation in autonomous systems.


The Brave Era 

Supersonic flight posed new challenges for flight control. With researchers eager to obtain results, there was a very short path from idea to flight test, with very little theoretical analysis in between. A number of research projects were sponsored by the US Air Force. Adaptive flight control systems were developed by General Electric, Honeywell, MIT, and other groups. The systems are documented in the Self-Adaptive Flight Control Systems Symposium held at the Wright Air Development Center in 1959 (Gregory 1959) and the book (Mishkin and Braun 1961).

Whitaker of the MIT team proposed the model reference adaptive controller system, which is based on the idea of specifying the performance of a servo system by a reference model. Honeywell proposed a self-oscillating adaptive system (SOAS) which attempted to keep a given gain margin by bringing the system to self-oscillation. The system was flight-tested on several aircraft. It experienced a disaster in a test on the X-15. This, combined with the success of gain scheduling based on air data sensors, significantly diminished the interest in adaptive flight control.

There was also interest in adaptation for process control. Foxboro patented an adaptive process controller with a pneumatic adaptation mechanism in 1950 (Foxboro 1950). DuPont had joint studies with IBM aimed at computerized process control. Kalman worked for a short time at the Engineering Research Laboratory at DuPont, where he started work that led to a paper (Kalman 1958), which is the inspiration of the self-tuning regulator. The abstract of this paper has the statement, This paper examines the problem of building a machine which adjusts itself automatically to control an arbitrary dynamic process, which clearly captures the dream of early adaptive control.

Draper and Li investigated the problem of operating aircraft engines optimally, and they developed a self-optimizing controller that would drive the system towards optimal working conditions. The system was successfully flight-tested (Draper and Li 1966) and initiated the field of extremal control.

Many of the ideas that emerged in the brave 
era inspired future research in adaptive control. 
The MRAC, the STR, and extremal control are 
typical examples. 


Model Reference Adaptive Control 
(MRAC) 

The MRAC was one idea from the early work on flight control that had a significant impact on adaptive control. A block diagram of a system with model reference adaptive control is shown in Fig. 1. The system has an ordinary feedback loop with a controller, having adjustable parameters, and the process. There is also a reference model which gives the ideal response $y_m$ to the command signal and a mechanism for adjusting the controller parameters $\theta$. The parameter adjustment is based on the process output $y$, the control signal $u$, and the output $y_m$ of the reference model. Whitaker proposed the following rule for adjusting the parameters:
$$\frac{d\theta}{dt} = -\gamma\, e\, \frac{\partial e}{\partial \theta}, \qquad (1)$$

where $e = y - y_m$ and $\partial e/\partial\theta$ is the sensitivity
derivative. Efficient ways to compute the sensitivity
derivative were already available in sensitivity
theory. The adaptation law (1) became known as
the MIT rule.

History of Adaptive Control, Fig. 1 Block diagram of a feedback
system with a model reference adaptive controller (MRAC)
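As an illustration of the MIT rule (1), the following minimal Python sketch adapts a single feedforward gain. The first-order process and reference model, the square-wave command, the numerical values, and the function name `simulate_mit_rule` are all assumptions chosen for illustration; they are not taken from the flight control work described above. For this particular setup the sensitivity derivative is proportional to the model output $y_m$, and the constant of proportionality is absorbed into the adaptation gain.

```python
import math


def simulate_mit_rule(k=2.0, k0=1.0, gamma=0.5, dt=0.001, t_end=60.0):
    """Euler simulation of MIT-rule adaptation of a feedforward gain.

    Assumed (hypothetical) setup:
      process:  dy/dt  = -y  + k  * u,   with u = theta * uc
      model:    dym/dt = -ym + k0 * uc
    The ideal gain is theta = k0 / k.
    """
    y = ym = theta = 0.0
    steps = int(t_end / dt)
    for i in range(steps):
        t = i * dt
        # Square-wave command signal with period 20 s (an arbitrary choice).
        uc = 1.0 if math.sin(2.0 * math.pi * t / 20.0) >= 0.0 else -1.0
        u = theta * uc
        e = y - ym
        # MIT rule: dtheta/dt = -gamma * e * de/dtheta, where de/dtheta is
        # proportional to ym here; the factor k/k0 is absorbed into gamma.
        dtheta = -gamma * ym * e
        y += dt * (-y + k * u)
        ym += dt * (-ym + k0 * uc)
        theta += dt * dtheta
    return theta


if __name__ == "__main__":
    print("final gain:", simulate_mit_rule())
```

In this sketch the gain should settle near `k0 / k`. The example is deliberately benign; it is not meant to reproduce the stability problems observed with higher-order dynamics or large adaptation gains.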

Experiments and simulations of the model
reference adaptive systems indicated that there
could be problems with instability, in particular
if the adaptation gain $\gamma$ in Eq. (1) is large.
This observation inspired much theoretical
research. The goal was to replace the MIT
rule by other parameter adjustment rules with
guaranteed stability; the models used were nonlinear
continuous-time differential equations. The
papers Butchart and Shackcloth (1965) and Parks
(1966) demonstrated that control laws could
be obtained using Lyapunov theory. When all
state variables are measured, the adaptation laws
obtained were similar to the MIT rule (1), but
the sensitivity function was replaced by linear
combinations of states and control variables.
The problem was more difficult for systems
that only permitted output feedback. Lyapunov
theory could still be used if the process transfer
function was strictly positive real, establishing a
connection with Popov's hyper-stability theory
(Landau 1979). The assumption of a positive
real process is a severe restriction because such
systems can be successfully controlled by
high-gain feedback. The difficulty was finally resolved
by using a scheme called error augmentation
(Monopoli 1974; Morse 1980).


There was much research, and by the late
1980s, there was a relatively complete theory for
MRAC and a large body of literature (Anderson
et al. 1986; Åström and Wittenmark 1989;
Egardt 1979; Goodwin and Sin 1984; Kumar
and Varaiya 1986; Narendra and Annaswamy
1989; Sastry and Bodson 1989). The problem of
flight control was, however, solved by using gain
scheduling based on air data sensors and not by
adaptive control (Stein 1980). The MRAC was
also extended to nonlinear systems using
backstepping (Krstić et al. 1993); Lyapunov stability
and passivity were essential ingredients in developing
the algorithm and analyzing its stability.

The Self-Tuning Regulator 

The self-tuning regulator was inspired by
steady-state regulation in process control. The
mathematical setting was discrete-time stochastic
systems. A block diagram of a system with a
self-tuning regulator is shown in Fig. 2. The system
has an ordinary feedback loop with a controller
and the process. There is an external loop for
adjusting the controller parameters based on
real-time parameter estimation and control design.
There are many ways to estimate the process
parameters and many ways to do the control
design. Simple schemes do not take parameter
uncertainty into account when computing the
controller parameters, invoking the certainty
equivalence principle.

History of Adaptive Control, Fig. 2 Block diagram of a feedback
system with a self-tuning regulator (STR)

Single-input, single-output stochastic systems
can be modeled by

$$y(t) + a_1 y(t-h) + \cdots + a_n y(t-nh) = b_1 u(t-h) + \cdots + b_n u(t-nh) + c_1 w(t-h) + \cdots + c_n w(t-nh) + e(t), \qquad (2)$$

where $u$ is the control signal, $y$ the process output,
$w$ a measured disturbance, and $e$ a stochastic
disturbance. Furthermore, $h$ is the sampling
period and $a_k$, $b_k$, and $c_k$ are the parameters.
Parameter estimation is typically done using least
squares, and a control design that minimized the
variance of the output variations was well suited for
regulation. A surprising result was that if the
estimates converge, the limiting controller is a
minimum variance controller even if the disturbance
$e$ is colored noise (Åström and Wittenmark
1973). Convergence conditions for the self-tuning
regulator were given in Goodwin et al. (1980),
and a very detailed analysis was presented in Guo
and Chen (1991).
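The combination of least-squares estimation and certainty-equivalence design can be sketched in Python for the simplest instance of model (2). Everything specific here is an assumption made for illustration: a first-order model $y(t) + a\,y(t-1) = b\,u(t-1) + e(t)$ with no measured disturbance, a small deterministic dither added to keep the regressors exciting, a crude guard on the estimate of $b$, and the hypothetical names `rls_update` and `run_self_tuner`. For this model the minimum variance law is $u(t) = (a/b)\,y(t)$, and the sketch simply inserts the current estimates.

```python
import random


def rls_update(theta, P, phi, y, lam=1.0):
    """One recursive least-squares step for a two-parameter model y = phi^T theta."""
    Pphi = [P[0][0] * phi[0] + P[0][1] * phi[1],
            P[1][0] * phi[0] + P[1][1] * phi[1]]
    denom = lam + phi[0] * Pphi[0] + phi[1] * Pphi[1]
    K = [Pphi[0] / denom, Pphi[1] / denom]
    err = y - (phi[0] * theta[0] + phi[1] * theta[1])
    theta = [theta[0] + K[0] * err, theta[1] + K[1] * err]
    # P is symmetric, so phi^T P equals (P phi)^T.
    P = [[(P[i][j] - K[i] * Pphi[j]) / lam for j in range(2)] for i in range(2)]
    return theta, P


def run_self_tuner(a=-0.9, b=0.5, steps=200, noise_std=0.1, seed=0):
    """Certainty-equivalence self-tuner for y(t+1) = -a*y(t) + b*u(t) + e(t+1)."""
    rng = random.Random(seed)
    theta = [0.0, 1.0]                    # initial guesses for (a, b)
    P = [[1e6, 0.0], [0.0, 1e6]]          # large initial covariance
    y = 1.0
    outputs = []
    for t in range(steps):
        a_hat, b_hat = theta
        # Crude guard against division by a near-zero estimate of b.
        b_safe = b_hat if abs(b_hat) > 0.05 else 0.05
        dither = 0.1 if (t % 7) < 3 else -0.1   # small excitation signal
        u = (a_hat / b_safe) * y + dither        # minimum variance law + dither
        y_next = -a * y + b * u + noise_std * rng.gauss(0.0, 1.0)
        theta, P = rls_update(theta, P, [-y, u], y_next)
        y = y_next
        outputs.append(y)
    return theta, outputs


if __name__ == "__main__":
    est, _ = run_self_tuner()
    print("estimated (a, b):", est)
```

With `noise_std=0.0` the estimates should reach the true parameters after a few samples, and the output then reflects only the dither. With noise, the estimates fluctuate around the true values, and the dither stands in for the excitation that a practical scheme would have to ensure by other means.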

The problem of output feedback does not
appear for the model (2) because the sequence
of past inputs and outputs $y(t-h), \ldots, y(t-nh), u(t-h), \ldots, u(t-nh)$
is indeed a state,
albeit not a minimal state representation. The
continuous analogue would be to use derivatives
of states and inputs, which is not feasible because
of measurement noise. The selection of the sampling
period is, however, important.

Early industrial experience indicated that the 
ability of the STR to adapt feedforward gains was 
particularly useful, because feedforward control 
requires good models. 


Insight from system identification showed that
excitation is required to obtain good estimates. In
the absence of excitation, a phenomenon of bursting
could be observed. There could be epochs
with small control actions due to insufficient
excitation. The estimated parameters then drifted
towards values close to or beyond the stability
boundary, generating large control actions. Good
parameter estimates were then obtained and the
system quickly recovered stability. The behavior
then repeated in an irregular fashion. There are
two ways to deal with the problem. One possibility
is to detect when there is poor excitation and
stop adaptation (Hägglund and Åström 2000).
The other is to inject perturbations when there is
poor excitation, in the spirit of dual control.


Robustness and Unification 

The model reference adaptive control and the
self-tuning regulator originate from different application
domains, flight control and process control.
The differences are amplified because they
are typically presented in different frameworks,
continuous time for MRAC and discrete time
for the STR. The schemes are, however, not too
different. For a given process model and given
design criterion, the process model can often be
re-parameterized in terms of controller parameters,
and the STR is then equivalent to an MRAC.
Similarly, there are indirect MRAC where the
process parameters are estimated (Egardt 1979).
A fundamental assumption made in the early
analyses of model reference adaptive controllers
was that the process model used for analysis had
the same structure as the real process. Work by Rohrs at
MIT, which showed that systems with guaranteed
convergence could be very sensitive to unmodeled
dynamics, generated a good deal of research
to explore robustness to unmodeled dynamics.
Averaging theory, which is based on the observation
that there are two loops in an adaptive
system, a fast ordinary feedback loop and a slow
parameter adjustment loop, turned out to be a key
tool for understanding the behavior of adaptive
systems. A large body of theory was generated
and many books were written (Ioannou and Sun
1995; Sastry and Bodson 1989).

The theory resulted in several improvements
of the adaptive algorithms. In the MIT rule (1)
and similar adaptation laws derived from Lyapunov
theory, the rate of change of the parameters
is the product of the error $e$ and
other signals in the system. The adaptation rate
may then become very large when signals are
large. The analysis of robustness showed that
there were advantages in avoiding large adaptation
rates by normalizing the signals. The stability
analysis also required that parameter estimates
be bounded. To achieve this, parameters
were projected on regions given by prior
parameter bounds. The projection did, however,
require prior process knowledge. The improved
insight obtained from the robustness analysis is
well described in the books Goodwin and Sin
(1984), Egardt (1979), Åström and Wittenmark
(1989), Narendra and Annaswamy (1989), Sastry
and Bodson (1989), Anderson et al. (1986), and
Ioannou and Sun (1995).
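The two improvements described above, normalization and projection, can be sketched as a single discrete-time update step in Python. This is only one possible form, stated under assumptions: the signal vector `phi` plays the role of the sensitivity derivative, the normalization term is `alpha + phi^T phi`, the prior parameter bounds form a box, and the function name `normalized_projected_update` is hypothetical.

```python
def normalized_projected_update(theta, phi, e, gamma, dt, lower, upper, alpha=1.0):
    """One Euler step of a normalized gradient law followed by box projection.

    theta_i <- clip(theta_i - dt * gamma * phi_i * e / (alpha + phi^T phi),
                    lower_i, upper_i)
    """
    norm = alpha + sum(p * p for p in phi)      # normalization bounds the step size
    updated = []
    for th, p, lo, hi in zip(theta, phi, lower, upper):
        th_new = th - dt * gamma * p * e / norm
        updated.append(min(max(th_new, lo), hi))  # projection onto prior bounds
    return updated


if __name__ == "__main__":
    # A very large signal produces only a small parameter change.
    print(normalized_projected_update([0.0], [1000.0], 1.0, 1.0, 0.1, [-1.0], [1.0]))
    # A large error is cut off at the prior bound.
    print(normalized_projected_update([0.95], [-1.0], 10.0, 1.0, 1.0, [-1.0], [1.0]))
```

The normalization keeps the step small even when the signals are large, and the clipping keeps the estimates inside the region given by the prior bounds, which is where the prior process knowledge enters.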

Applications 

There were severe practical difficulties in
implementing the early adaptive controllers
using the analogue technology available in the
brave era. Kalman used a hybrid computer when
he attempted to implement his controller. There
were dramatic improvements when mini- and
microcomputers appeared in the 1970s. Since
computers were still slow at the time, it was
natural that most experiments were executed in
process control or ship steering, which are slow
processes. Advances in computing eliminated the
technological barriers rapidly.

Self-oscillating adaptive controllers are used
in several missiles. In piloted aircraft there were
complaints about the perturbation signals that
were always exciting the system.

Self-tuning regulators have been used industrially
since the early 1970s. Adaptive autopilots
for ship steering were developed at the same
time. They outperformed conventional autopilots
based on PID control, because disturbances
generated by waves were estimated and compensated
for. These autopilots are still on the
market (Northrop Grumman 2005). Asea (now
ABB) developed a small distributed control system,
Novatune, which had blocks for self-tuning
regulators based on least-squares estimation and
minimum variance control. The company First
Control, formed by members of the Novatune
team, has delivered SCADA systems with adaptive
control since 1985. The controllers are used
for high-performance process control systems for
pulp mills, paper machines, rolling mills, and
pilot plants for chemical process control. The
adaptive controllers are based on recursive estimation
of a transfer function model and a control
law based on pole placement. The controller also
admits feedforward. The algorithm is provided
with extensive safety logic: parameters are projected,
and adaptation is interrupted when variations
in measured signals and control signals are
too small.

The most common industrial use of adaptive
techniques is automatic tuning of PID controllers.
The techniques are used both in single-loop
controllers and in DCS systems. Many different
techniques are used, pattern recognition
as well as parameter estimation. Relay auto-tuning
has proven very useful and has been shown
to be very robust because it automatically provides
proper excitation of the process. Some of
the systems use automatic tuning to automatically
generate gain schedules, and they also adapt
feedback and feedforward gains (Åström
and Hägglund 2005).
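A relay experiment yields a limit cycle from which PID settings can be computed. The following Python sketch assumes the usual describing-function approximation of the ultimate gain, $K_u \approx 4d/(\pi a)$ for relay amplitude $d$ and oscillation amplitude $a$, and then applies the classical Ziegler-Nichols ultimate-cycle rules. The choice of tuning rule and the function name `relay_autotune` are assumptions for illustration; commercial auto-tuners use their own rules and safeguards.

```python
import math


def relay_autotune(relay_amplitude, oscillation_amplitude, oscillation_period):
    """Estimate the ultimate gain from a relay experiment and derive PID settings.

    Ku ~ 4 d / (pi a) is the describing-function approximation; the PID
    parameters follow the Ziegler-Nichols ultimate-cycle rules (assumed here).
    """
    ku = 4.0 * relay_amplitude / (math.pi * oscillation_amplitude)
    tu = oscillation_period
    return {
        "Ku": ku,
        "Tu": tu,
        "Kp": 0.6 * ku,      # proportional gain
        "Ti": 0.5 * tu,      # integral time
        "Td": 0.125 * tu,    # derivative time
    }


if __name__ == "__main__":
    print(relay_autotune(relay_amplitude=1.0, oscillation_amplitude=0.5,
                         oscillation_period=8.0))
```

The appeal of the method is visible in the signature: only the relay amplitude and two measured features of the oscillation are needed, and the relay itself supplies the excitation.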
Summary and Future Directions

Adaptive control has had a turbulent history with
alternating periods of optimism and pessimism.
This history is reflected in the conferences. When
the IEEE Conference on Decision and Control
started in 1962, it included a Symposium on
Adaptive Processes, which was discontinued after
the 20th CDC in 1981. There were two IFAC
symposia on the Theory of Self-Adaptive Control
Systems, the first in Rome in 1962 and the second
in Teddington in 1965 (Hammond 1966). The
symposia were discontinued but reappeared when
the Theory Committee of IFAC created a working
group on adaptive control chaired by Prof. Landau
in 1981. The group brought the communities
of control and signal processing together, and a
workshop on Adaptation and Learning in Signal
Processing and Control (ALCOSP) was created.
The first symposium was held in San Francisco in
1983 and the 11th in Caen in 2013.

Adaptive control can give significant benefits:
it can deliver good performance over wide operating
ranges, and commissioning of controllers
can be simplified. Automatic tuning of PID controllers
is now widely used in the process industry.
Auto-tuning of more general controllers is
clearly of interest. Regulation performance is often
characterized by the Harris index, which compares
actual performance with minimum variance
control. Evaluation can be dispensed with by
applying a self-tuning regulator.

There are adaptive controllers that have been
in operation for more than 30 years, for example,
in ship steering and rolling mills. There is a
variety of products that use scheduling, MRAC,
and STR in different ways. Automatic tuning
is widely used; virtually all new single-loop
controllers have some form of automatic tuning.
Automatic tuning is also used to build gain schedules
semiautomatically. The techniques appear
in tuning devices, in single-loop controllers, in
distributed systems for process control, and in
controllers for special applications. There are
strong similarities between adaptive filtering and
adaptive control. Noise cancellation and adaptive
equalization are widespread uses of adaptation.
The signal processing applications are a
little easier to analyze because the systems do
not have a feedback controller. New adaptive
schemes are appearing. The $\mathcal{L}_1$ adaptive controller
is one example. It inherits features of
both the STR and the MRAC. The model-free
controller by Fliess and Join (2013) is another
example. It is similar to a continuous-time version
of the self-tuning regulator.

There is renewed interest in adaptive control
in the aerospace industry, both for aircraft and
missiles (Lavretsky and Wise 2013). Good results
in flight tests have been reported both using
MRAC and the recently developed $\mathcal{L}_1$ adaptive
controller (Hovakimyan and Cao 2010).

Adaptive control is a rich field, and to understand
it well, it is necessary to know a wide range
of techniques: nonlinear, stochastic, and sampled
data systems, stability, robust control, and system
identification.

In the early development of adaptive control,
there was a dream of the universal adaptive
controller that could be applied to any process
with very little prior process knowledge. The
insight gained by the robustness analysis shows
that knowledge of bounds on the parameters is
essential to ensure robustness. With the knowledge
available today, adaptive controllers can be
designed for particular applications. Design of
proper safety nets is an important practical issue.
One useful approach is to start with a basic
constant-gain controller and provide adaptation
as an add-on. This approach also simplifies design
of supervision and safety networks.

There are still many unsolved research
problems. Methods to determine the achievable
adaptation rates are not known. Finding ways
to provide proper excitation is another problem.
The dual control formulation is very attractive
because it automatically generates proper
excitation when it is needed. The computations
required to solve the Bellman equations are, however,
prohibitive, except in very simple cases. The
self-oscillating adaptive system, which has been
successfully applied to missiles, does provide
excitation. The success of the relay auto-tuner
for simple controllers indicates that it may
be called in to provide excitation of adaptive
controllers. Adaptive control can be an important
component of emerging autonomous systems.
One may expect that the current upswing in
systems biology may provide more inspiration
because many biological systems clearly have adaptive
capabilities.
Abstract

The control of systems with hybrid dynamics 
requires algorithms capable of dealing with 
the intricate combination of continuous and 
discrete behavior, which typically emerges from 
the presence of continuous processes, switching 
devices, and logic for control. Several analysis 
and design techniques have been proposed for 
the control of nonlinear continuous-time plants, 
but little is known about controlling plants that 
feature truly hybrid behavior. This short entry 
focuses on recent advances in the design of 
feedback control algorithms for hybrid dynamical 
systems. The focus is on hybrid feedback
controllers that are systematically designed employing
Lyapunov-based methods. The control
design techniques summarized in this entry 
include control Lyapunov function-based control, 
passivity-based control, and trajectory tracking 
control. 


Keywords 

Feedback control; Hybrid control; Hybrid systems;
Asymptotic stability

Definition 

A hybrid control system is a feedback system
whose variables may flow and, at times, jump.
Such a hybrid behavior can be present in one or
more of the subsystems of the feedback system:
in the system to control, i.e., the plant; in the
algorithm used for control, i.e., the controller;
or in the subsystems needed to interconnect the
plant and the controller, i.e., the interfaces/signal
conditioners. Figure 1 depicts a feedback system
in closed-loop configuration with such subsystems
under the presence of environmental disturbances.
Due to its hybrid dynamics, a hybrid
control system is a particular type of hybrid
dynamical system.


Motivation 

Hybrid dynamical systems are ubiquitous in science
and engineering as they permit capturing
the complex and intertwined continuous/discrete
behavior of a myriad of systems with variables
that flow and jump. The recent popularity of feedback
systems combining physical and software
components demands tools for stability analysis
and control design that can systematically handle
such a complex combination. To avoid the issues
due to approximating the dynamics of a system,
in numerous settings, it is mandatory to keep
the system dynamics as pure as possible and
to be able to design feedback controllers that
can cope with flow and jump behavior in the
system.


Modeling Hybrid Dynamical Control 
Systems 

In this entry, hybrid control systems are 
represented in the framework of hybrid 
equations/inclusions for the study of hybrid 
dynamical systems. Within this framework, the 
continuous dynamics of the system are modeled 
using a differential equation/inclusion, while the 
discrete dynamics are captured by a difference 
equation/inclusion. A solution to such a system 
can flow over nontrivial intervals of time and 
jump at certain time instants. The conditions 
determining whether a solution to a hybrid 
system should flow or jump are captured by 
subsets of the state space and input space of the 
hybrid control system. In this way, a plant with 
hybrid dynamics can be modeled by the hybrid 
inclusion. 
$$\mathcal{H}_P:\quad \begin{cases} \dot{z} \in F_P(z,u), & (z,u) \in C_P \\ z^+ \in G_P(z,u), & (z,u) \in D_P \\ y = h_P(z,u) \end{cases} \qquad (1)$$

where $z$ is the state of the plant and takes values
from the Euclidean space $\mathbb{R}^{n_P}$, $u$ is the input
and takes values from $\mathbb{R}^{m_P}$, $y$ is the output
and takes values from the output space $\mathbb{R}^{r_P}$,
and $(C_P, F_P, D_P, G_P, h_P)$ is the data of the
hybrid system. The set $C_P$ is the flow set, the
set-valued map $F_P$ is the flow map, the set
$D_P$ is the jump set, the set-valued map $G_P$ is
the jump map, and the single-valued map $h_P$ is
the output map. (This hybrid inclusion captures
the dynamics of (constrained or unconstrained)
continuous-time systems when $D_P = \emptyset$ and $G_P$
is arbitrary. Similarly, it captures the dynamics
of (constrained or unconstrained) discrete-time
systems when $C_P = \emptyset$ and $F_P$ is arbitrary. Note
that while the output inclusion does not explicitly
include a constraint on $(z,u)$, the output map is
only evaluated along solutions.)

Fig. 1 A hybrid control system: a feedback system
with a plant, controller, and interfaces/signal conditioners
(along with environmental disturbances) as subsystems
featuring variables that flow and, at times, jump

Given an input $u$, a solution to a hybrid inclusion
is defined by a state trajectory $\phi$ that
satisfies the inclusions. Both the input and the
state trajectory are functions of
$(t,j) \in \mathbb{R}_{\ge 0} \times \mathbb{N} := [0,\infty) \times \{0,1,2,\ldots\}$,
where $t$ keeps track
of the amount of flow, while $j$ counts the number
of jumps of the solution. These functions are
given by hybrid arcs and hybrid inputs, which are
defined on hybrid time domains. More precisely,
hybrid time domains are subsets $E$ of $\mathbb{R}_{\ge 0} \times \mathbb{N}$
such that, for each $(T,J) \in E$, the set

$$E \cap \left([0,T] \times \{0,1,\ldots,J\}\right)$$

can be written in the form

$$\bigcup_{j=0}^{J-1} \left([t_j, t_{j+1}], j\right)$$

for some finite sequence of times
$0 = t_0 \le t_1 \le t_2 \le \cdots \le t_J$. A hybrid arc $\phi$ is a
function on a hybrid time domain. The set
$E \cap ([0,T] \times \{0,1,\ldots,J\})$ defines a compact hybrid
time domain since it is bounded and closed. The
hybrid time domain of $\phi$ is denoted by $\mathrm{dom}\,\phi$.
A hybrid arc is such that, for each $j \in \mathbb{N}$,
$t \mapsto \phi(t,j)$ is absolutely continuous on intervals of
flow $I^j := \{t : (t,j) \in \mathrm{dom}\,\phi\}$ with nonzero
Lebesgue measure. A hybrid input $u$ is a function
on a hybrid time domain such that, for each $j \in \mathbb{N}$,
$t \mapsto u(t,j)$ is Lebesgue measurable and locally
essentially bounded on the interval $I^j$.

In this way, a solution to the plant $\mathcal{H}_P$ is
given by a pair $(\phi,u)$ with $\mathrm{dom}\,\phi = \mathrm{dom}\,u$
$(= \mathrm{dom}(\phi,u))$ satisfying

(S0) $(\phi(0,0),u(0,0)) \in C_P$ or $(\phi(0,0),u(0,0)) \in D_P$,
and $\mathrm{dom}\,\phi = \mathrm{dom}\,u$;

(S1) For each $j \in \mathbb{N}$ such that $I^j$ has nonempty
interior $\mathrm{int}(I^j)$, we have
$(\phi(t,j),u(t,j)) \in C_P$ for all $t \in \mathrm{int}(I^j)$
and

$$\frac{d}{dt}\phi(t,j) \in F_P(\phi(t,j),u(t,j)) \quad \text{for almost all } t \in I^j;$$

(S2) For each $(t,j) \in \mathrm{dom}(\phi,u)$ such that
$(t,j+1) \in \mathrm{dom}(\phi,u)$, we have
$(\phi(t,j),u(t,j)) \in D_P$
and

$$\phi(t,j+1) \in G_P(\phi(t,j),u(t,j)).$$

A solution pair $(\phi,u)$ to $\mathcal{H}$ is said to be
complete if $\mathrm{dom}(\phi,u)$ is unbounded and maximal
if there does not exist another pair $(\phi,u)'$ such
that $(\phi,u)$ is a truncation of $(\phi,u)'$ to some proper
subset of $\mathrm{dom}(\phi,u)'$. A solution pair $(\phi,u)$ to
$\mathcal{H}$ is said to be Zeno if it is complete and the
projection of $\mathrm{dom}(\phi,u)$ onto $\mathbb{R}_{\ge 0}$ is bounded.

Input and output modeling remark: At times, it
is convenient to define inputs $u_c \in \mathbb{R}^{m_{P,c}}$ and
$u_d \in \mathbb{R}^{m_{P,d}}$ collecting every component of the
input $u$ that affects flows and that affects jumps,
respectively. (Some of the components of $u$ can
be used to define both $u_c$ and $u_d$; that is, there
could be inputs that affect both flows and jumps.)
Similarly, one can define $y_c$ and $y_d$ as the components
of $y$ that are measured during flows and
jumps, respectively.
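The solution concept above, flowing while the state is in the flow set and jumping when it is in the jump set, with solutions parameterized by $(t,j)$, can be illustrated by a small Python sketch. The bouncing ball used here is a standard textbook example chosen as an assumption; it has no input, and the data, the forward Euler integration, and the function name `simulate_bouncing_ball` are all hypothetical. The state is $z = (\text{height}, \text{velocity})$, the flow set is $\{z : z_1 \ge 0\}$, and the jump set is $\{z : z_1 \le 0,\ z_2 \le 0\}$.

```python
def simulate_bouncing_ball(z0=(1.0, 0.0), g=9.81, restitution=0.8,
                           dt=1e-4, t_max=3.0, j_max=20):
    """Approximate a hybrid arc of a bouncing ball on a hybrid time domain.

    Returns a list of samples (t, j, height, velocity), where t measures the
    amount of flow and j counts the jumps.
    """
    h, v = z0
    t, j = 0.0, 0
    arc = [(t, j, h, v)]
    while t < t_max and j < j_max:
        if h <= 0.0 and v <= 0.0:
            # State in the jump set: apply the jump map, t stays fixed.
            h, v = 0.0, -restitution * v
            j += 1
        else:
            # State in the flow set: one Euler step of the flow map (v, -g).
            h += dt * v
            v += dt * (-g)
            t += dt
        arc.append((t, j, h, v))
    return arc


if __name__ == "__main__":
    arc = simulate_bouncing_ball()
    print("final hybrid time (t, j):", arc[-1][0], arc[-1][1])
```

Each sample carries both $t$ and $j$, so consecutive samples with equal $t$ and increasing $j$ correspond to jumps. The bound `j_max` is a practical guard: the exact solution of this example accumulates infinitely many jumps in finite flow time, which is the Zeno behavior defined above.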

To control the hybrid plant $\mathcal{H}_P$ in (1),
control algorithms that can cope with the
nonlinearities introduced by the flow and
jump equations/inclusions are required. In
general, feedback controllers designed using
classical techniques from the continuous-time
and discrete-time domain fall short. Due to this
limitation, hybrid feedback controllers would
be more suitable for the control of plants with
hybrid dynamics. Then, following the hybrid
plant model above, hybrid controllers for the
plant $\mathcal{H}_P$ in (1) will be given by the hybrid
inclusion

$$\mathcal{H}_K:\quad \begin{cases} \dot{\xi} \in F_K(\xi,v), & (\xi,v) \in C_K \\ \xi^+ \in G_K(\xi,v), & (\xi,v) \in D_K \\ \eta = \kappa(\xi,v) \end{cases} \qquad (2)$$

where $\xi$ is the state of the controller and takes
values from the Euclidean space $\mathbb{R}^{n_K}$, $v$ is the
input and takes values from $\mathbb{R}^{r_P}$, $\eta$ is the output
and takes values from the output space $\mathbb{R}^{m_P}$, and
$(C_K, F_K, D_K, G_K, \kappa)$ is the data of the hybrid
inclusion defining the hybrid controller.

The control of $\mathcal{H}_P$ via $\mathcal{H}_K$ defines an interconnection
through the input/output assignment $u = \eta$
and $v = y$; the system in Fig. 1 without interfaces
represents this interconnection. The resulting
closed-loop system is a hybrid dynamical system
given in terms of a hybrid inclusion/equation
with state $x = (z,\xi)$. We will denote such a
closed-loop system by $\mathcal{H}$. Its data can be constructed
from the data $(C_P, F_P, D_P, G_P, h_P)$ and
$(C_K, F_K, D_K, G_K, \kappa)$ of each of the subsystems.
Solutions to both $\mathcal{H}_K$ and $\mathcal{H}$ are understood
following the notion introduced above.

Definitions and Notions 

For convenience, we use the equivalent notation
$[x^\top\ y^\top]^\top$ and $(x,y)$ for vectors $x$ and $y$. Also,
we denote by $\mathcal{K}_\infty$ the class of functions from $\mathbb{R}_{\ge 0}$
to $\mathbb{R}_{\ge 0}$ that are continuous, zero at zero, strictly
increasing, and unbounded.

The dynamics of hybrid inclusions have
right-hand sides given by set-valued maps.
Unlike functions or single-valued maps, set-valued
maps may return a set when evaluated
at a point. For instance, at points in $C_P$,
the set-valued flow map $F_P$ of the hybrid
plant $\mathcal{H}_P$ might return more than one value,
allowing for different values of the derivative
of $z$. A particular continuity property of set-valued
maps that will be needed later is lower
semicontinuity. A set-valued map $S$ from $\mathbb{R}^n$ to
$\mathbb{R}^m$ is lower semicontinuous if for each $x \in \mathbb{R}^n$
one has that $\liminf_{x_i \to x} S(x_i) \supset S(x)$, where
$\liminf_{x_i \to x} S(x_i) = \{z : \forall x_i \to x,\ \exists z_i \to z \text{ s.t. } z_i \in S(x_i)\}$
is the so-called inner limit of $S$.

A vast majority of control problems consist of
designing a feedback algorithm that assures that
a function of the solutions to the plant approaches
a desired set-point condition (attractivity) and,
when close to it, the solutions remain nearby
(stability). In some scenarios, the desired set-point
condition is not necessarily an isolated
point, but rather a set. The problem of designing
a hybrid controller $\mathcal{H}_K$ for a hybrid plant $\mathcal{H}_P$
typically pertains to the stabilization of sets, in
particular because the hybrid controller's state
includes timers that persistently evolve within
a bounded time interval and logic variables that
take values from discrete sets. Denoting by $\mathcal{A}$
the set of points to stabilize for the closed-loop
system $\mathcal{H}$ and by $|\cdot|_{\mathcal{A}}$ the distance to such a set, the
following definitions capture the typically desired
properties outlined above. A closed set $\mathcal{A}$ is said
to be:

(S) Stable: for each $\varepsilon > 0$ there exists $\delta > 0$
such that each maximal solution $\phi$ to $\mathcal{H}$ with
$\phi(0,0) = x_0$, $|x_0|_{\mathcal{A}} \le \delta$ satisfies
$|\phi(t,j)|_{\mathcal{A}} \le \varepsilon$ for all $(t,j) \in \mathrm{dom}\,\phi$.

(A) Attractive: there exists $\mu > 0$ such that every
maximal solution $\phi$ to $\mathcal{H}$ with $\phi(0,0) = x_0$,
$|x_0|_{\mathcal{A}} \le \mu$ is bounded and, if it is complete,
satisfies $\lim_{(t,j) \in \mathrm{dom}\,\phi,\ t+j \to \infty} |\phi(t,j)|_{\mathcal{A}} = 0$.

(AS) Asymptotically stable: it is stable and attractive.

The basin of attraction of an asymptotically stable
set $\mathcal{A}$ is the set of points from where the attractivity
property holds. The set $\mathcal{A}$ is said to be globally
asymptotically stable when the basin of attraction
is equal to the entire state space.

A dynamical system with assigned inputs is
said to be detectable when its output being held to
zero implies that its state converges to the origin.
A similar property can be defined for hybrid
dynamical systems. For the closed-loop system
$\mathcal{H}$, given sets $\mathcal{A}$ and $K$, the distance to $\mathcal{A}$ is
0-input detectable relative to $K$ for $\mathcal{H}$ if every
complete solution $\phi$ to $\mathcal{H}$ satisfies

$$\phi(t,j) \in K \ \ \forall (t,j) \in \mathrm{dom}\,\phi \quad \Longrightarrow \quad \lim_{(t,j) \in \mathrm{dom}\,\phi,\ t+j \to \infty} |\phi(t,j)|_{\mathcal{A}} = 0,$$

where "$\phi(t,j) \in K$" captures the "output being
held to zero" property in the usual detectability
notion.


Feedback Control Design for Hybrid 
Dynamical Systems 

Several methods for the design of a hybrid controller
$\mathcal{H}_K$ rendering a given set asymptotically
stable are given below. At the core of these
methods are sufficient conditions in terms of Lyapunov
functions guaranteeing that the asymptotic
stability property defined in section "Definitions
and Notions" holds. Some of the methods presented
below exploit such sufficient conditions
when applied to the closed-loop system $\mathcal{H}$, while
others exploit the properties of the hybrid plant
to design controllers with a particular structure.
The design methods are presented in order of
complexity of the controller, namely, from it being
a static state-feedback law to being a generic
algorithm with true hybrid dynamics.

CLF-Based Control Design 

In simple terms, a control Lyapunov function
(CLF) is a regular enough scalar function that
decreases along solutions to the system for some
values of the unassigned input. When such a
function exists, it is very tempting to exploit
its properties to construct an asymptotically
stabilizing control law. Following the ideas from
the literature of continuous-time and discrete-time
nonlinear systems, we define control
Lyapunov functions for hybrid plants $\mathcal{H}_P$ and
present results on CLF-based control design.
For simplicity, as mentioned in the input and
output modeling remark in section "Definitions
and Notions," we use inputs $u_c$ and $u_d$ instead of $u$.
Also, we restrict the discussion to sets $\mathcal{A}$ that are
compact as well as hybrid plants with $F_P$, $G_P$
single valued and such that $h_P(z,u) = z$. For
notational convenience, we use $\Pi$ to denote
the "projection" of $C_P$ and $D_P$ onto $\mathbb{R}^{n_P}$,
i.e., $\Pi(C_P) = \{z : \exists u_c \text{ s.t. } (z,u_c) \in C_P\}$ and
$\Pi(D_P) = \{z : \exists u_d \text{ s.t. } (z,u_d) \in D_P\}$, and the
set-valued maps $\Psi_c(z) = \{u_c : (z,u_c) \in C_P\}$
and $\Psi_d(z) = \{u_d : (z,u_d) \in D_P\}$.

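Given a compact set $\mathcal{A}$, a continuously differentiable
function $V : \mathbb{R}^{n_P} \to \mathbb{R}$ is a control
Lyapunov function for $\mathcal{H}_P$ with respect to $\mathcal{A}$
if there exist $\alpha_1, \alpha_2 \in \mathcal{K}_\infty$ and a continuous,
positive definite function $\rho$ such that

$$\alpha_1(|z|_{\mathcal{A}}) \le V(z) \le \alpha_2(|z|_{\mathcal{A}}) \qquad \forall z \in \mathbb{R}^{n_P}$$

$$\inf_{u_c \in \Psi_c(z)} \langle \nabla V(z), F_P(z,u_c) \rangle \le -\rho(|z|_{\mathcal{A}}) \qquad \forall z \in \Pi(C_P) \qquad (3)$$

$$\inf_{u_d \in \Psi_d(z)} V(G_P(z,u_d)) - V(z) \le -\rho(|z|_{\mathcal{A}}) \qquad \forall z \in \Pi(D_P) \qquad (4)$$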

With the availability of a CLF, the set $\mathcal{A}$
can be asymptotically stabilized if it is possible
to synthesize a controller $\mathcal{H}_K$ from inequalities
(3) and (4). Such a synthesis is feasible,
in particular, for the special case of $\mathcal{H}_K$ being a
static state-feedback law $z \mapsto \kappa(z)$. Sufficient
conditions guaranteeing the existence of such a
controller as well as a particular state-feedback
law with point-wise minimum norm are given
next.

Given a compact set $\mathcal{A}$ and a control Lyapunov
function $V$ (with respect to $\mathcal{A}$), define, for each
$r \ge 0$, the set $\mathcal{I}(r) := \{z \in \mathbb{R}^{n_P} : V(z) \ge r\}$.
Moreover, for each $(z,u_c)$ and $r \ge 0$, define the
function

$$\Gamma_c(z,u_c,r) := \begin{cases} \langle \nabla V(z), F_P(z,u_c) \rangle + \tfrac{1}{2}\rho(|z|_{\mathcal{A}}) & \text{if } (z,u_c) \in C_P \cap (\mathcal{I}(r) \times \mathbb{R}^{m_{P,c}}), \\ -\infty & \text{otherwise} \end{cases}$$

and, for each $(z,u_d)$ and $r \ge 0$, the function

$$\Gamma_d(z,u_d,r) := \begin{cases} V(G_P(z,u_d)) - V(z) + \tfrac{1}{2}\rho(|z|_{\mathcal{A}}) & \text{if } (z,u_d) \in D_P \cap (\mathcal{I}(r) \times \mathbb{R}^{m_{P,d}}), \\ -\infty & \text{otherwise} \end{cases}$$

The following result states conditions on the
data of $\mathcal{H}_P$ guaranteeing that, for each $r > 0$,
there exists a continuous state-feedback law
$z \mapsto \kappa(z) = (\kappa_c(z), \kappa_d(z))$ rendering the compact set

$$\mathcal{A}_r := \{z \in \mathbb{R}^{n_P} : V(z) \le r\}$$

asymptotically stable. This property corresponds
to a practical version of asymptotic stabilizability.

Theorem 1 Given a hybrid plant $\mathcal{H}_P = (C_P, F_P, D_P, G_P, h_P)$,
a compact set $\mathcal{A}$, and a
control Lyapunov function $V$ for $\mathcal{H}_P$ with respect
to $\mathcal{A}$, if

(C1) $C_P$ and $D_P$ are closed sets, and $F_P$ and
$G_P$ are continuous;

(C2) The set-valued maps $\Psi_c(z) = \{u_c : (z,u_c) \in C_P\}$
and $\Psi_d(z) = \{u_d : (z,u_d) \in D_P\}$
are lower semicontinuous with convex values;

(C3) For every $r > 0$, we have that, for every
$z \in \Pi(C_P) \cap \mathcal{I}(r)$, the function $u_c \mapsto \Gamma_c(z,u_c,r)$
is convex on $\Psi_c(z)$ and that, for every
$z \in \Pi(D_P) \cap \mathcal{I}(r)$, the function $u_d \mapsto \Gamma_d(z,u_d,r)$
is convex on $\Psi_d(z)$;

then, for every $r > 0$, the compact set $\mathcal{A}_r$ is
asymptotically stabilizable for $\mathcal{H}_P$ by a state-feedback
law $z \mapsto \kappa(z) = (\kappa_c(z), \kappa_d(z))$ with $\kappa_c$
continuous on $\Pi(C_P) \cap \mathcal{I}(r)$ and $\kappa_d$ continuous
on $\Pi(D_P) \cap \mathcal{I}(r)$.

Theorem 1 assures the existence of a continuous
state-feedback law practically asymptotically
stabilizing $\mathcal{A}$. However, Theorem 1 does not
provide an expression of an asymptotically stabilizing
control law. The following result provides
an explicit construction of such a control law.

Theorem 2 Given a hybrid plant $\mathcal{H}_P = (C_P, F_P, D_P, G_P, h_P)$,
a compact set $\mathcal{A}$, and a control Lyapunov function $V$ for
$\mathcal{H}_P$ with respect to $\mathcal{A}$, if (C1)-(C3) in Theorem 1
hold then, for every $r > 0$, the state-feedback law pair

$$\kappa_c : \Pi(C_P) \to \mathbb{R}^{m_{P,c}}, \qquad \kappa_d : \Pi(D_P) \to \mathbb{R}^{m_{P,d}}$$

defined on $\Pi(C_P)$ and $\Pi(D_P)$ as

$$\kappa_c(z) := \arg\min \{ |u_c| : u_c \in \mathcal{T}_c(z) \} \qquad \forall z \in \Pi(C_P) \cap \mathcal{I}(r)$$

$$\kappa_d(z) := \arg\min \{ |u_d| : u_d \in \mathcal{T}_d(z) \} \qquad \forall z \in \Pi(D_P) \cap \mathcal{I}(r)$$

respectively, renders the compact set $\mathcal{A}_r$ asymptotically
stable for $\mathcal{H}_P$, where
$\mathcal{T}_c(z) = \Psi_c(z) \cap \{u_c : \Gamma_c(z,u_c,V(z)) \le 0\}$ and
$\mathcal{T}_d(z) = \Psi_d(z) \cap \{u_d : \Gamma_d(z,u_d,V(z)) \le 0\}$.
Furthermore, if the set-valued maps $\Psi_c$ and $\Psi_d$ have a closed
graph, then $\kappa_c$ and $\kappa_d$ are continuous on
$\Pi(C_P) \cap \mathcal{I}(r)$ and $\Pi(D_P) \cap \mathcal{I}(r)$,
respectively.
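
As a rough illustration of the pointwise minimum-norm selection in Theorem 2, the sketch below treats a special case that is assumed here and not taken from the text: the input is unconstrained and the constraint function is affine in the input, so that the admissible set at a given state is the half-space {u : a + b·u ≤ 0}. The names `min_norm_control`, `a`, and `b` are hypothetical. Under these assumptions the minimizer is zero when a ≤ 0 and otherwise the projection of the origin onto the boundary of the half-space.

```python
def min_norm_control(a, b):
    """Smallest-norm u satisfying a + b.u <= 0 (assumed affine constraint).

    a : float, input-independent part of the constraint at the current state
    b : list of floats, coefficient of u in the constraint at the current state
    """
    if a <= 0.0:
        # u = 0 already satisfies the constraint, so it is the minimizer
        return [0.0 for _ in b]
    bb = sum(bi * bi for bi in b)
    if bb == 0.0:
        # a > 0 and b = 0: no input can satisfy the constraint
        raise ValueError("constraint a + b.u <= 0 is infeasible")
    # projection of the origin onto the hyperplane a + b.u = 0
    return [-a * bi / bb for bi in b]


u = min_norm_control(2.0, [3.0, 4.0])
```

With a general constraint set or a non-affine constraint function, the minimization would have to be solved numerically at each state.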

The stability properties guaranteed by
Theorems 1 and 2 are practical. Under further
properties, similar results hold when the input
$u$ is not partitioned into $u_c$ and $u_d$. To achieve
asymptotic stability (or stabilizability) of $\mathcal{A}$ with
a continuous state-feedback law, extra conditions
are required to hold near the compact set;
for the stabilization of continuous-time systems,
these are the so-called small control
properties. Furthermore, the continuity of the
feedback law assures that the closed-loop system
has closed flow and jump sets as well as continuous
flow and jump maps, which, in turn, due to
the compactness of $\mathcal{A}$, implies that the asymptotic
stability property is robust. Robustness follows
from results for hybrid systems without inputs.

Passivity-Based Control Design 

Dissipativity and its special case, passivity, provide
a useful physical interpretation of a feedback
control system as they characterize the exchange
of energy between the plant and its controller.
For an open system, passivity (in its very pure
form) is the property that the energy stored in
the system is no larger than the energy it has
absorbed over a period of time. The energy stored
in a system is given by the difference between
the initial and final energy over a period of time,
where the energy function is typically called the
storage function. Hence, conveniently, passivity
can be expressed in terms of the derivative of a
storage function (i.e., the rate of change of the
internal energy) and the product between inputs
and outputs (i.e., the system’s power flow). Under
further observability conditions, this power
inequality can be employed as a design tool
by selecting a control law that makes the rate
of change of the internal energy negative. This
method is called passivity-based control design.

The passivity-based control design method
can be employed in the design of a controller for
a “passive” hybrid plant $\mathcal{H}_P$, in which energy
might be dissipated during flows, jumps, or both.
Passivity notions and a passivity-based control
design method for hybrid plants are given next.
Since the form of the plant’s output plays a key
role in asserting a passivity property, and this
property may not necessarily hold both during
flows and jumps, as mentioned in the input and
output modeling remark in section “Definitions
and Notions,” we define outputs $y_c$ and $y_d$,
which, for simplicity, are assumed to be single
valued: $y_c = h_c(x)$ and $y_d = h_d(x)$. Moreover,
we consider the case when the dimension of the
space of the inputs $u_c$ and $u_d$ coincides with
that of the outputs $y_c$ and $y_d$, respectively, i.e., a
“duality” of the output and input space.

Given a compact set $\mathcal{A}$ and functions $h_c$, $h_d$
such that $h_c(\mathcal{A}) = h_d(\mathcal{A}) = 0$, a hybrid plant
$\mathcal{H}_P$ for which there exists a continuously differentiable
function $V : \mathbb{R}^{n_P} \to \mathbb{R}_{\ge 0}$ satisfying,
for some functions $\omega_c : \mathbb{R}^{m_{P,c}} \times \mathbb{R}^{n_P} \to \mathbb{R}$ and
$\omega_d : \mathbb{R}^{m_{P,d}} \times \mathbb{R}^{n_P} \to \mathbb{R}$,

$$\langle \nabla V(z), F_P(z,u_c) \rangle \le \omega_c(u_c,z) \qquad \forall (z,u_c) \in C_P \qquad (5)$$

$$V(G_P(z,u_d)) - V(z) \le \omega_d(u_d,z) \qquad \forall (z,u_d) \in D_P \qquad (6)$$

is said to be passive with respect to a compact set $\mathcal{A}$ if

$$(u_c,z) \mapsto \omega_c(u_c,z) = u_c^\top y_c \qquad (7)$$

$$(u_d,z) \mapsto \omega_d(u_d,z) = u_d^\top y_d \qquad (8)$$

The function $V$ is the so-called storage function.
If (5) holds with $\omega_c$ as in (7), and (6) holds with
$\omega_d = 0$, then the system is called flow-passive,
i.e., the power inequality holds only during flows.
If (5) holds with $\omega_c = 0$, and (6) holds with $\omega_d$ as
in (8), then the system is called jump-passive, i.e.,
the energy of the system decreases only during
jumps.
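
To make the flow-passivity inequality concrete, the following sketch checks (5) and (6) numerically on an actuated bouncing ball. This plant is chosen here purely for illustration and is not discussed in the text; all data are assumptions: state z = (height, velocity), flow map (z2, -g + u_c), jump map (0, -e z2) with restitution coefficient e in [0, 1), output y_c = z2, and storage function V(z) = g z1 + z2^2/2. With these choices the flow inequality holds with equality and the storage decreases at impacts.

```python
G = 9.81   # gravity (assumed value)
E = 0.8    # restitution coefficient (assumed value)


def storage(z):
    """Total energy per unit mass, used as the storage function V."""
    return G * z[0] + 0.5 * z[1] ** 2


def flow_power_gap(z, u_c):
    """<grad V(z), F_P(z, u_c)> - u_c * y_c; flow-passivity needs this <= 0."""
    grad_v = (G, z[1])
    f = (z[1], -G + u_c)
    y_c = z[1]
    return grad_v[0] * f[0] + grad_v[1] * f[1] - u_c * y_c


def jump_storage_change(z):
    """V(G_P(z)) - V(z) at an impact; should be <= 0 (omega_d = 0)."""
    z_plus = (0.0, -E * z[1])
    return storage(z_plus) - storage(z)


gap = flow_power_gap((1.0, -2.0), 3.0)       # zero up to rounding
drop = jump_storage_change((0.0, -3.0))      # negative: energy lost at impact
```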

Under additional detectability properties,
these passivity notions can be used to design
static output feedback controllers. The following
result gives two design methods for hybrid plants.

Theorem 3 Given a hybrid plant $\mathcal{H}_P = (C_P, F_P, D_P, G_P, h_P)$ satisfying

(C1') $C_P$ and $D_P$ are closed sets; $F_P$ and $G_P$
are continuous; and $h_c$ and $h_d$ are continuous;

and a compact set $\mathcal{A}$, we have:

(1) If $\mathcal{H}_P$ is flow-passive with respect to $\mathcal{A}$ with
a storage function $V$ that is positive definite
with respect to $\mathcal{A}$ and has compact sublevel
sets, and if there exists a continuous function
$\kappa_c : \mathbb{R}^{m_{P,c}} \to \mathbb{R}^{m_{P,c}}$, $y_c^\top \kappa_c(y_c) > 0$ for all
$y_c \ne 0$, such that the resulting closed-loop
system with $u_c = -\kappa_c(y_c)$ and $u_d = 0$ has
the following properties:

(1.1) The distance to $\mathcal{A}$ is detectable relative to

$$\{ z \in \Pi(C_P) \cup \Pi(D_P) \cup G_P(D_P) : h_c(z)^\top \kappa_c(h_c(z)) = 0,\ (z, -\kappa_c(h_c(z))) \in C_P \};$$

(1.2) Every complete solution $\phi$ is such that, for
some $\delta > 0$ and some $J \in \mathbb{N}$, we have
$t_{j+1} - t_j > \delta$ for all $j > J$;

then the control law $u_c = -\kappa_c(y_c)$, $u_d = 0$
renders $\mathcal{A}$ globally asymptotically stable.

(2) If $\mathcal{H}_P$ is jump-passive with respect to $\mathcal{A}$ with
a storage function $V$ that is positive definite
with respect to $\mathcal{A}$ and has compact sublevel
sets, and if there exists a continuous function
$\kappa_d : \mathbb{R}^{m_{P,d}} \to \mathbb{R}^{m_{P,d}}$, $y_d^\top \kappa_d(y_d) > 0$ for all
$y_d \ne 0$, such that the resulting closed-loop
system with $u_c = 0$ and $u_d = -\kappa_d(y_d)$ has
the following properties:

(2.1) The distance to $\mathcal{A}$ is detectable relative to

$$\{ z \in \Pi(C_P) \cup \Pi(D_P) \cup G_P(D_P) : h_d(z)^\top \kappa_d(h_d(z)) = 0,\ (z, -\kappa_d(h_d(z))) \in D_P \};$$

(2.2) Every complete solution $\phi$ is Zeno;

then the control law $u_d = -\kappa_d(y_d)$, $u_c = 0$
renders $\mathcal{A}$ globally asymptotically stable.

Strict passivity notions can also be formulated 
for hybrid plants, including the special cases 
where the power inequalities hold only during 
flows or jumps. In particular, strict passivity and 
output strict passivity can be employed to assert 
asymptotic stability with zero inputs. 


Tracking Control Design 

While numerous control problems pertain to the
stabilization of a set-point condition, at times,
it is desired to stabilize the solutions to the
plant to a time-varying trajectory. In this section,
we consider the problem of designing a hybrid
controller $\mathcal{H}_K$ for a hybrid plant $\mathcal{H}_P$ to track
a given reference trajectory $r$ (a hybrid arc).
The notion of tracking is introduced below. We
propose sufficient conditions that general hybrid
plants and controllers should satisfy to solve such
a problem. For simplicity, we consider tracking
of state trajectories and assume that the hybrid controller
can measure both the state of the plant $z$ and the
reference trajectory $r$; hence, $v = (z, r)$.

The particular approach used here consists of
recasting the tracking control problem as a set
stabilization problem for the closed-loop system
$\mathcal{H}$. To do this, we embed the reference trajectory
$r$ into an augmented hybrid model for which it
is possible to define a set capturing the condition
that the plant tracks the given reference trajectory.
This set is referred to as the tracking set. More
precisely, given a reference $r : \operatorname{dom} r \to \mathbb{R}^{n_P}$,
we define the set $\mathcal{T}_r$ collecting all of the points
$(t, j)$ in the domain of $r$ at which $r$ jumps,
that is, every point $(t_j^r, j) \in \operatorname{dom} r$ such that
$(t_j^r, j + 1) \in \operatorname{dom} r$. Then, the state of the closed
loop $\mathcal{H}$ is augmented by the addition of states
$\tau \in \mathbb{R}_{\ge 0}$ and $k \in \mathbb{N}$. The dynamics of the
states $\tau$ and $k$ are such that $\tau$ counts elapsed
flow time, while $k$ counts the number of jumps
of $\mathcal{H}$; hence, during flows $\dot{\tau} = 1$ and $\dot{k} = 0$,
while at jumps $\tau^+ = \tau$ and $k^+ = k + 1$. These
new states are used to parameterize the given
reference trajectory $r$, which is employed in the
definition of the tracking set

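$$\mathcal{A} = \{ (z, \eta, \tau, k) \in \mathbb{R}^{n_P} \times \mathbb{R}^{n_K} \times \mathbb{R}_{\ge 0} \times \mathbb{N} : z = r(\tau, k),\ \eta \in \Phi_K \} \qquad (9)$$

This set is the target set to be stabilized for $\mathcal{H}$.
The set $\Phi_K \subset \mathbb{R}^{n_K}$ in the definition of $\mathcal{A}$ is some
closed set capturing the set of points asymptotically
approached by the controller’s state $\eta$.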
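
The role of the timer and the jump counter can be sketched in a few lines of code. The class below is a hypothetical illustration, not part of the original construction: it assumes the jump times of the reference are given as a finite increasing list, advances the timer during flows, increments the counter at jumps, and tests whether the current pair (tau, k) is a point at which the reference jumps.

```python
class ReferenceClock:
    """Timer tau and jump counter k that index a hybrid reference r(tau, k)."""

    def __init__(self, jump_times):
        # jump_times[k] is the time of the (k+1)-th jump of the reference
        # (assumed known and strictly increasing)
        self.jump_times = list(jump_times)
        self.tau = 0.0
        self.k = 0

    def flow(self, dt):
        # during flows: d(tau)/dt = 1 and k stays constant
        self.tau += dt

    def at_reference_jump(self, tol=1e-9):
        # (tau, k) is a jump point of r when tau equals the next jump time
        return (self.k < len(self.jump_times)
                and abs(self.tau - self.jump_times[self.k]) <= tol)

    def jump(self):
        # at jumps: tau+ = tau and k+ = k + 1
        self.k += 1


clock = ReferenceClock([1.0, 2.5])
clock.flow(1.0)
if clock.at_reference_jump():
    clock.jump()
```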

The following result establishes a sufficient
condition for stabilization of the tracking set
$\mathcal{A}$. For notational convenience, we define $x = (z, \eta, \tau, k)$,

$$C = \{ x : (z, \kappa_c(\eta, z, r(\tau,k))) \in C_P,\ \tau \in [t_k^r, t_{k+1}^r],\ (\eta, z, r(\tau,k)) \in C_K \}$$

$$F(z, \eta, \tau, k) = \big( F_P(z, \kappa_c(\eta, z, r(\tau,k))),\ F_K(\eta, z, r(\tau,k)),\ 1,\ 0 \big)$$

$$D = D_1 \cup D_2, \quad D_1 = \{ x : (z, \kappa_c(\eta, z, r(\tau,k))) \in D_P,\ (\tau, k) \in \mathcal{T}_r \}, \quad D_2 = \{ x : \tau \in [t_k^r, t_{k+1}^r),\ (\eta, z, r(\tau,k)) \in D_K \}$$

$$G_1(z, \eta, \tau, k) = \big( G_P(z, \kappa_c(\eta, z, r(\tau,k))),\ \eta,\ \tau,\ k + 1 \big),$$

$$G_2(z, \eta, \tau, k) = \big( z,\ G_K(\eta, z, r(\tau,k)),\ \tau,\ k \big)$$

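Theorem 4 Given a complete reference trajectory
$r : \operatorname{dom} r \to \mathbb{R}^{n_P}$ and associated tracking
set $\mathcal{A}$ in (9), if there exists a hybrid controller $\mathcal{H}_K$
guaranteeing that

(1) The jumps of $r$ and $\mathcal{H}_P$ occur simultaneously;

(2) There exist a function
$V : \mathbb{R}^{n_P} \times \mathbb{R}^{n_K} \times \mathbb{R}_{\ge 0} \times \mathbb{N} \to \mathbb{R}$
that is continuously differentiable; functions
$\alpha_1, \alpha_2 \in \mathcal{K}_\infty$; and continuous, positive
definite functions $\rho_1, \rho_2, \rho_3$ such that

(a) For all $(z, \eta, \tau, k) \in C \cup D \cup G_1(D) \cup G_2(D)$

$$\alpha_1(|(z,\eta,\tau,k)|_{\mathcal{A}}) \le V(z,\eta,\tau,k) \le \alpha_2(|(z,\eta,\tau,k)|_{\mathcal{A}})$$

(b) For all $(z, \eta, \tau, k) \in C$ and all $\zeta \in F(z, \eta, \tau, k)$,

$$\langle \nabla V(z,\eta,\tau,k), \zeta \rangle \le -\rho_1(|(z,\eta,\tau,k)|_{\mathcal{A}})$$

(c) For all $(z, \eta, \tau, k) \in D_1$ and all $\zeta \in G_1(z, \eta, \tau, k)$

$$V(\zeta) - V(z,\eta,\tau,k) \le -\rho_2(|(z,\eta,\tau,k)|_{\mathcal{A}})$$

(d) For all $(z, \eta, \tau, k) \in D_2$ and all $\zeta \in G_2(z, \eta, \tau, k)$

$$V(\zeta) - V(z,\eta,\tau,k) \le -\rho_3(|(z,\eta,\tau,k)|_{\mathcal{A}})$$

then $\mathcal{A}$ is globally asymptotically stable.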


Theorem 4 imposes that the jumps of the 
plant and of the reference trajectory occur 
simultaneously. Though restrictive, at times, this 
property can be enforced by proper design of the 
controller. 


Summary and Future Directions 

Advances over the last decade on modeling and 
robust stability of hybrid dynamical systems 
(without control inputs) have paved the road 
for the development of systematic methods 
for the design of control algorithms for hybrid 
plants. The results selected for this short 
expository entry, along with recent efforts on 
multimode/logic-based control, event-based 
control, and backstepping, which were not 
covered here, contribute to that long-term 
goal. Future research directions include the
development of more powerful tracking control
design methods, state observers, and optimal
controllers for hybrid plants.

► Lyapunov’s Stability Theory

► Output Regulation Problems in Hybrid

► Stability Theory for Hybrid Dynamical

Abstract

In the first part of the paper, two consolidated
hybrid observer designs for non-hybrid systems
are presented. In the second part, recent results
available in the literature related to the observability
and observer design for different classes
of hybrid systems are introduced.


Keywords 

Hybrid systems; Observer design; Observability; 
Switching systems 


Introduction 

Observer design, i.e., the design of systems that estimate the
unmeasured plant state, has received a lot of attention
since the late 1960s. One of the first leading
contributions to clearly formalize the estimation
problem and propose a solution in the linear case
is due to Luenberger (1966). The
recipe to implement a Luenberger-type observer
for a continuous-time linear system described by

$$\dot{x} = Ax + Bu, \qquad y = Cx + Du, \qquad (1)$$

with $x \in \mathbb{R}^n$, $u \in \mathbb{R}^p$, $y \in \mathbb{R}^m$, $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times p}$,
$C \in \mathbb{R}^{m \times n}$, and $D \in \mathbb{R}^{m \times p}$, has three main
ingredients: system data, the correction term
commonly referred to as output injection, and
the observability/detectability/determinability
conditions. A Luenberger-type observer for
(1), which consists of a copy of the (system
data) dynamics (1) with a linear correction term
$L(y - \hat{y})$, is given by

$$\dot{\hat{x}} = A\hat{x} + Bu + L(y - \hat{y}), \qquad \hat{y} = C\hat{x} + Du, \qquad (2)$$

with $L \in \mathbb{R}^{n \times m}$ and where $\hat{x}$ is the estimated
value of $x$. The estimation error $e = x - \hat{x}$
satisfies the differential equation $\dot{e} = (A - LC)e$
with initial condition $e(0) = x(0) - \hat{x}(0)$. Since
the observer has a copy of the plant dynamics
and the correction term is $L(y - \hat{y}) = LCe$, the
zero estimation error manifold $\hat{x} = x$ is invariant
(if $\hat{x}(0) = x(0)$, then $e(t) = 0$ for all $t \ge 0$),
whereas its attractivity (yielding global exponential
stability of the estimation error system)
requires $A - LC$ to be Hurwitz. Such an $L$, if $A$
is not already Hurwitz, exists if the pair $(A, C)$
is detectable or (sufficient condition) observable.
The observer in (2) exploits only the injection
term in the continuous-time dynamics (flow
map), and one may ask how profitable
resets of the observer state (jump map) could be,
which leads to the design of a hybrid observer.
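
A minimal simulation sketch of the observer (2) follows. The plant (a double integrator with position output), the gain L = [3, 2]', the sinusoidal input, and the function names are all assumptions made for illustration. With this gain, A - LC has eigenvalues -1 and -2, so the estimation error should decay exponentially regardless of the input.

```python
import math


def rk4_step(f, s, t, dt):
    """One classical Runge-Kutta step for s' = f(s, t)."""
    k1 = f(s, t)
    k2 = f([si + 0.5 * dt * ki for si, ki in zip(s, k1)], t + 0.5 * dt)
    k3 = f([si + 0.5 * dt * ki for si, ki in zip(s, k2)], t + 0.5 * dt)
    k4 = f([si + dt * ki for si, ki in zip(s, k3)], t + dt)
    return [si + dt * (a + 2 * b + 2 * c + d) / 6.0
            for si, a, b, c, d in zip(s, k1, k2, k3, k4)]


def simulate_observer(t_end=10.0, dt=1e-3, gain=(3.0, 2.0)):
    """Simulate plant and observer together; return the final error norm."""
    l1, l2 = gain

    def rhs(s, t):
        x1, x2, xh1, xh2 = s
        u = math.sin(t)              # arbitrary known input
        innovation = x1 - xh1        # y - y_hat with C = [1, 0], D = 0
        return [x2, u,               # plant: x1' = x2, x2' = u
                xh2 + l1 * innovation,   # observer copy + output injection
                u + l2 * innovation]

    state = [1.0, 0.0, 0.0, 0.0]     # x(0) = (1, 0), x_hat(0) = (0, 0)
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(rhs, state, t, dt)
        t += dt
    return math.hypot(state[0] - state[2], state[1] - state[3])


final_error = simulate_observer()
```

The initial error norm is 1; after ten time units the error is dominated by the slower mode exp(-t) and is several orders of magnitude smaller.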

Observer design for hybrid systems is a
relatively new area of research, and results are
consolidated only for a few classes of linear hybrid
systems.

In section “Continuous-Time Plants,” a hybrid
redesign of the observer (2) is discussed
first and then a more general design for nonlinear
systems is introduced, whereas in section
“Systems with Flows and Jumps” the recent
results related to observability and observer
designs for hybrid systems are discussed. Conclusions
are given in section “Summary and Future
Directions.”


Hybrid Observers: Different 
Strategies 

The community of researchers working on hybrid
observers, a quite recent area and the
subject of growing interest, is wide, and a unique
formal definition/notation has not been reached
yet. This fact is strictly related to the large number
of different hybrid system models that are
currently adopted by researchers. To keep this
short presentation as simple as possible, we let
the state $x(t)$ of a hybrid system be driven by
the flow map (differential equation) when $t \ne t_j$
and by the jump map (difference equation)
when $t = t_j$, with $x(t)$ right continuous, i.e.,
$\lim_{t \to t_j^+} x(t) = x(t_j)$.

Continuous-Time Plants 

Linear Case 

A simple strategy to improve convergence to
zero of the estimation error for (1) has been
proposed in Raff and Allgöwer (2008) and consists
in resetting the observer state $\hat{x}$, at predetermined
fixed time intervals $t_j$, by means of
the linear correction term $K(t)(y(t) - C\hat{x}(t))$ at
jump times, yielding

$$\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + L(y(t) - C\hat{x}(t)), \qquad (3a)$$

$$\hat{x}(t_j) = \hat{x}(t_j^-) + K(t_j^-)(y(t_j^-) - C\hat{x}(t_j^-)), \qquad (3b)$$

where $t_0 = 0$, $t_{j+1} - t_j = T > 0$, $j \in \mathbb{N}_{\ge 1}$ and
$T$ is a parameter that defines the time interval
between resets and has to be chosen such that

$$\operatorname{Im}(\lambda_p - \lambda_r)\, T \ne 2\pi \ell, \qquad \ell \in \mathbb{Z} \setminus \{0\}, \qquad (4)$$

for each pair $(\lambda_p, \lambda_r)$ of complex eigenvalues
of the matrix $A - LC$. This preserves the (continuous-time
or flow) observability of the system (1)
when sampled at time instants $t_j$ and
allows one to select a matrix $K_0$ such that
$(I - K_0 C)\exp((A - LC)T)$ has all its eigenvalues
at zero. Then, the estimation error $e(t)$ converges
to zero in finite time ($nT$) if (1) is observable
and the matrix $K(t) : \mathbb{R}_{\ge 0} \to \mathbb{R}^{n \times m}$ is selected
such that $K(t) = K_0$ if $t \le t_n$ and $K(t) = 0$
otherwise. It is important to note that the state
reset (3b) yields a hybrid estimation error system
given by

$$\dot{e}(t) = (A - LC)\, e(t), \qquad (5a)$$

$$e(t_j) = (I - K(t_j^-) C)\, e(t_j^-). \qquad (5b)$$

The stability property of the origin can be easily
deduced by noting that

$$e(t_j) = \prod_{k=1}^{j} \big[ (I - K(t_k^-) C) \exp((A - LC) T) \big]\, e(0),$$

and given that $(I - K_0 C)\exp((A - LC)T)$ is
nilpotent, then $e(t_n) = e(nT) = 0$.
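
The finite-time property can be checked numerically on a small case. The sketch below is an illustration under assumptions not in the text: a second-order system with C = [1, 0] and A - LC = [[-3, 1], [-2, 0]] (real eigenvalues -1 and -2, so condition (4) holds for every T), a truncated Taylor series for the matrix exponential, and a reset gain K0 obtained by requiring zero trace and zero determinant of (I - K0 C) exp((A - LC)T). The function names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expm_taylor(m, terms=40):
    """Matrix exponential by truncated Taylor series (fine for small norms)."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, m)]  # m^k / k!
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result


def reset_gain_and_error(T=0.5, e0=(1.0, -2.0)):
    """Return K0, N = (I - K0 C) exp((A - LC) T), and the error after 2 resets."""
    a_cl = [[-3.0, 1.0], [-2.0, 0.0]]            # assumed A - LC
    m = expm_taylor([[v * T for v in row] for row in a_cl])
    # det N = 0 forces k1 = 1; trace N = 0 then gives k2 = m22 / m12
    k0 = [1.0, m[1][1] / m[0][1]]
    i_minus_kc = [[1.0 - k0[0], 0.0], [-k0[1], 1.0]]   # I - K0 C, C = [1, 0]
    n = mat_mul(i_minus_kc, m)
    e = [[e0[0]], [e0[1]]]
    for _ in range(2):                            # n = 2 flow-and-reset cycles
        e = mat_mul(n, e)
    return k0, n, [e[0][0], e[1][0]]


k0, n_mat, e_final = reset_gain_and_error()
```

After two reset cycles the error is zero up to rounding, consistent with the nilpotency argument for n = 2.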

The potential benefits of hybrid observers
in improving the performance of classic
continuous-time observers are a relatively new area
of research. Along this line, the recent work
proposed in Prieur et al. (2012) limits
the peaking phenomenon for a class of high-gain
observers by suitably resetting the observer’s (augmented)
state. Moreover, when the output of (1) is a
nonlinear function of the state, $y = h(x)$, with
$h(\cdot)$ not invertible (e.g., the saturation function),
it would be possible to rewrite (1) as a hybrid
system with linear flow map and augmented state
and to design a hybrid observer as in Carnevale and
Astolfi (2009).

Nonlinear Case 

When the input of a continuous-time plant is
piecewise constant, the hybrid observer proposed
in Moraal and Grizzle (1995), exploiting sampled
measurements, can be successfully applied to a
class of nonlinear continuous-time (or discrete-time)
systems

$$\dot{x}(t) = f(x(t), u(t)), \qquad y(t) = h(x(t), u(t)), \qquad (6)$$

with sufficiently smooth maps $f(\cdot,\cdot)$ and $h(\cdot,\cdot)$
and where

$$x(t_j) = F(x(t_{j-1}), u(t_{j-1})), \qquad (7)$$

is the sampled-data (discrete-time) version of (6)
with sampling time $T = t_j - t_{j-1}$. Then, it
is possible to define a hybrid observer of the
following type:

$$\dot{\hat{x}}(t) = f(\hat{x}(t), u(t)), \qquad (8a)$$

$$\hat{x}(t_j) = \Gamma\big( y(t_j^-), \hat{x}(t_j^-), \xi(t_j^-) \big), \qquad (8b)$$

where the reset map $\Gamma$ and the dynamics of the
new variable $\xi(t)$ have to be properly defined.
The main idea in Moraal and Grizzle (1995)
is that the Newton method, in continuous and
discrete time, can be used to estimate the value
of $\xi$ that renders zero the function

$$W_j^N(\xi) = Y_j^N - H(\xi, U_j^N), \qquad (9)$$

where $U_j^N = [u'(t_{j-N+1}), \ldots, u'(t_j)]'$ and
$Y_j^N = [y'(t_{j-N+1}), \ldots, y'(t_j)]'$ are the sampled
input and output vectors, respectively, and
$H$ maps the state $x(t_j)$ and
the $N$-tuple of control inputs into the output
vector $Y_j^N$, i.e., $H(x(t_j), U_j^N) = Y_j^N$, and is
defined as

$$H(x, U_j^N) = \begin{bmatrix}
h\big( F^{-1}( \cdots F^{-1}(x, u(t_{j-1})) \cdots, u(t_{j-N+1})),\ u(t_{j-N+1}) \big) \\
\vdots \\
h\big( F^{-1}(x, u(t_{j-1})),\ u(t_{j-1}) \big) \\
h\big( x, u(t_j) \big)
\end{bmatrix} \qquad (10)$$

where $F^{-1}$ denotes the inverse of the
map $F$ such that $x(t_{j-1}) = F^{-1}(x(t_j), u(t_{j-1}))$.

The system (6)-(7) is said to be $N$-observable,
for some $N \ge 1$ (the generic selection is
$N = 2n + 1$), when $W_j^N(\xi) = 0$ holds only
if $\xi = x(t_j)$, uniformly in $U_j^N$. Then, under
certain technical assumptions (see Moraal and
Grizzle 1995) related to the derivatives of $f$
and $h$ and the invertibility of the Jacobian
matrix $J(x) = \partial H(x)/\partial x$, it is possible to
select

$$\dot{\xi}(t) = k\, J(\xi(t))^{-1} \big[ Y_j^N - H(\xi(t), U_j^N) \big], \qquad (11a)$$

$$\xi(t_j) = F\big( \xi(t_j^-), u(t_{j-1}) \big), \qquad (11b)$$

with a sufficiently high gain $k > 0$ and the reset
map $\Gamma(\cdot) = F(\xi(t_j^-), u(t_{j-1}))$. Note that
(11a) is commonly referred to as Newton flow.
This approach could be easily extended to other
continuous-time minimization algorithms (normalized
gradient, line search, etc.) by changing the
right-hand side of (11a), or even to discrete-time methods
iterated at higher frequency within the sample
time $T$, yielding faster convergence to zero of the
estimation error.
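
The behavior of a Newton flow such as (11a) can be seen on a scalar toy problem. In the sketch below, the map H(xi) = xi^3 + xi, the gain, and the forward-Euler integration are assumptions made for illustration; they stand in for the sampled output map and are not the construction of Moraal and Grizzle. Along the exact flow the residual W = Y - H(xi) obeys dW/dt = -k W, so it decays like exp(-k t).

```python
def newton_flow(y_meas, xi0, gain=5.0, t_end=2.0, dt=1e-3):
    """Integrate xi' = gain * J(xi)^-1 * (y_meas - H(xi)) by forward Euler.

    H and J below are a hypothetical scalar example with J > 0 everywhere,
    so the Jacobian is always invertible.
    Returns the final estimate and the final residual y_meas - H(xi).
    """
    def H(xi):
        return xi ** 3 + xi

    def J(xi):
        return 3.0 * xi ** 2 + 1.0

    xi = xi0
    for _ in range(int(round(t_end / dt))):
        xi += dt * gain * (y_meas - H(xi)) / J(xi)
    return xi, y_meas - H(xi)


# the "true state" 1.5 produces the measurement H(1.5) = 4.875
xi_hat, residual = newton_flow(4.875, 0.0)
```

Raising the gain speeds up convergence between samples, at the price of a stiffer differential equation to integrate.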

The same approach can be used when a
continuous-time observer for (6) is considered in
place of (8a), and the Newton-based resets can be
used to possibly improve the performance. The
continuous- and discrete-time Newton algorithms
require knowledge of the jump map $F$ that
defines (7), i.e., the exact discrete-time model of
(6), and of the Jacobian matrix $J(x) = \partial H(x)/\partial x$.
An approach that does not require such knowledge
is proposed in Biyik and Arcak (2006), where
continuous-time filters and the secant method are used
to estimate (numerically) the map $F$ and the
Jacobian matrix, or in Sassano et al. (2011),
where an extremum-seeking-based technique is
considered.

A different approach to estimate the state of
a continuous-time plant, pursued for example
in Ahrens and Khalil (2009) and Liu (1997),
exploits switching output injections, letting the
correction term $l_\sigma(\cdot)$ switch among suitable
values selected by an appropriate definition (often
derived from a Lyapunov-based proof) of the switching
signal $\sigma(t)$. These switching gains make it possible to
improve observer performance and robustness
against measurement noise and model uncertainties.

Systems with Flows and Jumps 

The classical notion of observability does not
carry over directly to hybrid systems. As an example, consider
the autonomous linear hybrid system described
by $\dot{x}(t) = Ax(t)$ and $x(t_j) = Jx(t_j^-)$ with

$$A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \qquad
J = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}$$

and $C = [0, 1, 0]$. Evidently the flow is
not observable in the classic sense given
that $O_{flow} = [C', (CA)', (CA^2)']'$ is not
full rank and the flow-unobservable subspace
is $\ker(O_{flow}) = \{x \in \mathbb{R}^3 : x_2 = x_3 = 0\}$.
Nevertheless, in the first flow time interval
$\tau = t_1 - t_0$, it is possible to estimate (e.g., in finite
time using the observability Gramian matrix)
the initial conditions $(x_2(t_0), x_3(t_0))$. Then, when
the first jump takes place at time $t_1$, thanks to the
structure of the jump map $J$ that resets the value
of $x_3(t_1)$ with the flow-unobservable $x_1$, it is
possible to estimate in the next flow time interval
the value of $x_1(t_1^-)$ so that the initial condition
$x(t_0)$ can be completely determined. The hybrid
observability matrix in this case has the following
expression

$$O_{hybrid} = \big[ O_{flow}',\ (O_{flow} J e^{A T_1})',\ (O_{flow} (J e^{A T_2})^2)' \big]'$$

and is full rank for all $T_j = t_j - t_{j-1}$
that satisfy (4). Note that from a practical
point of view, in this case the time interval
that allows the complete state to be reconstructed
is $[t_0, t_1 + \epsilon)$, since the observer needs at
least an $\epsilon$ time of the new measurements
(after the first jump) to evaluate the full state. This
simple example suggests that (impulsive)
hybrid systems might have a richer notion
of observability than the classical ones. These
properties have been studied also for mechanical
systems subject to non-smooth impacts in
Martinelli et al. (2004), where a high-gain-like
observer design has been proposed assuming
the knowledge of the impact times $t_j$, no Zeno
phenomena (no finite accumulation point for
the $t_j$’s), and a minimum dwell-time, $t_{j+1} - t_j \ge \delta > 0$.
With the aforementioned assumptions
and considering the more general class of hybrid
systems described by

$$\dot{x}(t) = f(x, u), \qquad x(t_j) = g(x(t_j^-), u(t_j^-)), \qquad (13)$$

with $y = h(x,u)$, a frequent choice is to consider
the hybrid observer of the form

$$\dot{\hat{x}}(t) = f(\hat{x}, u) + l(y, \hat{x}, u), \qquad (14a)$$

$$\hat{x}(t_j) = g(\hat{x}(t_j^-), u(t_j^-)) + m(\hat{x}(t_j^-), u(t_j^-)), \qquad (14b)$$

with $l(\cdot)$ and $m(\cdot)$ that are zero when $\hat{x} = x$,
rendering the manifold $\hat{x} = x$ flow- and jump-invariant,
and relying only on the correction term $l(\cdot)$ ($m = 0$)
in a high-gain-like design during the flow. The
correction during the flow has to recover, within
the minimum dwell-time $\delta$, the worst deterioration
of the estimation error induced by the jumps
(if any) and the transients such that
$\|e(t_{j+1})\| < \|e(t_j)\|$, or $V(e(t_{j+1})) < V(e(t_j))$ if a
Lyapunov analysis is considered. This type of
observer design, with $m = 0$ and the linear
choice $l(y, \hat{x}, u) = L(y - M\hat{x})$, has been
proposed in Heemels et al. (2011) for linear
complementarity systems (LCS) in the presence of
state jumps induced by impulsive input. Therein,
solutions of LCS are characterized by means of
piecewise Bohl distributions and the specially
defined well-posedness and low-index properties,
which, combined with passivity-based arguments,
allow the design of a global hybrid observer with
exponential convergence. A separation principle
to design an output feedback controller is also
proposed.
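
The rank claims for the three-state example earlier in this section (matrices A and J, output row C) can be checked with a few lines of code. The sketch below is only an illustration: the helper names and the choice T1 = T2 = 1 are assumptions, the rank is computed by plain Gaussian elimination, and the matrix exponential is exact here because A squared is zero.

```python
A = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
J = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]
C = [[0, 1, 0]]


def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rank(m, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting."""
    rows = [[float(v) for v in r] for r in m]
    rk = 0
    for col in range(len(rows[0])):
        if rk == len(rows):
            break
        pivot = max(range(rk, len(rows)), key=lambda i: abs(rows[i][col]))
        if abs(rows[pivot][col]) <= tol:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def exp_At(T):
    """exp(A T) = I + A T, since A*A = 0 for this particular A."""
    return [[float(i == j) + A[i][j] * T for j in range(3)] for i in range(3)]


def flow_obs():
    """Stack C, CA, CA^2 row-wise."""
    ca = mat_mul(C, A)
    return C + ca + mat_mul(ca, A)


def hybrid_obs(T1, T2):
    """Stack O_flow, O_flow J e^{A T1}, O_flow (J e^{A T2})^2 row-wise."""
    o = flow_obs()
    m1 = mat_mul(J, exp_At(T1))
    m2 = mat_mul(J, exp_At(T2))
    return o + mat_mul(o, m1) + mat_mul(o, mat_mul(m2, m2))


ranks = (rank(flow_obs()), rank(hybrid_obs(1.0, 1.0)))
```

The flow observability matrix has rank 2, while the stacked hybrid matrix reaches rank 3 once the block after the first jump is included.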

An interesting approach is pursued in Forni
et al. (2003), where global output tracking results
on a class of linear hybrid systems subject to
impacts are introduced. Therein, the key ingredient
is the definition of a “mirrored” tracking
reference (a change of coordinates) that depends
on the sequence of different jumps between the
desired trajectory (a virtual bouncing ball) and
the plant (the controlled ball). Exploiting this
(time-varying) change of coordinates and assuming
that the impact times are known, it is possible
to define an estimation error that is not
discontinuous even when the tracked ball has
a bounce (state jump) and the plant does not.
A time regularization is included in the model
by embedding a minimum dwell-time between jumps.
In this way, it is possible to design a linear
hybrid observer represented by (14) with a linear
(mirrored) term $l(\cdot)$ and $m(\cdot) = 0$, proving (by
standard quadratic Lyapunov functions) that the
origin of the estimation error system is globally
exponentially stable (GES). In
this case, the standard observability condition for
the pair $(A, C)$ is required.

Switching Systems and Hybrid Automata

Switching systems and hybrid automata have
been the subject of intense study by many researchers
in the last two decades. For this class
of systems, there is a neat separation $x = [z, q]'$
between the purely discrete-time state $q$ (switching
signal or system mode) and the rest of the state $z$,
which generically can both flow and jump. The
observability of the entire system is often divided
into the problem of determining the switching
signal $q$ first and then $z$. The switching signal
can be divided into two categories: arbitrary
(universal problem) or specific (existential problem)
switchings.

In Vidal et al. (2003) the observability of 
autonomous linear switched systems with no 
state jump, minimum dwell time, and unknown 
switching signal is analyzed. Necessary and 
sufficient observability conditions based on 
rank tests and output discontinuities detection 
strategies are given. Along the same line, the 
results are extended in Babaali and Pappas (2005) 
to non-autonomous switched systems with non- 
Zeno solutions and without the minimum dwell- 
time requirement, providing state z and mode q 
observability characterized by linear-algebraic 
conditions. 

Luenberger-type observers with two distinct
gain matrices $L_1$ and $L_2$ are proposed in the case
of bimodal piecewise linear systems in Juloski
et al. (2007) (where state jumps are considered),
whereas recently in Tanwani et al. (2013),
algebraic observability conditions and observer
design are proposed for switched linear systems
admitting state jumps with known switching
signal (although some asynchronism between
the observer and the plant switches is allowed).
Results related to the observability of hybrid
automata, which include switching systems,
can be found in Balluchi et al. (2002) and the
related references. Therein the location observer
first estimates the system’s current location $q$,
processing the system input and output and assuming that
the system is current-location observable, a property that
is related to the system current-location observation
tree. This graph is iteratively explored at each
new input to determine the node associated with the
current value of $q(t)$. Then, a linear (switched)
Luenberger-type observer for the estimation of
the state $z$, assuming minimum dwell-time and
observability of each pair $(A_q, C_q)$, is proposed.

Summary and Future Directions 

Observer design and observability properties
of general hybrid systems are an active field of
research, and a number of different results have
been proposed, although they are not as consolidated as for
classical linear systems. The results are based
on different notations and definitions for hybrid
systems. Efforts to provide a unified approach, in
many cases considering the general framework for
hybrid systems proposed in Goebel et al. (2009),
are being pursued by the scientific community to
improve consistency and cohesion of the general
results. Observer designs, observability properties,
and the separation principle, even with linear flow and
jump maps, are not yet completely characterized
and, in the nonlinear case, only a few works have
been proposed (see Teel (2010)), providing open
challenges for the scientific community.

Abstract

We explore the problem of identification and control
of living cell populations. We describe how
de novo control systems can be interfaced with
living cells and used to control their behavior. Using
computer-controlled light pulses in combination
with a genetically encoded light-responsive
module and a flow cytometer, we demonstrate
how in silico feedback control can be configured
to achieve precise and robust set point regulation
of gene expression. We also outline how external
control inputs can be used in experimental design
to improve our understanding of the underlying
biochemical processes.

Identification; Intrinsic variability; Population

Introduction

Control systems, particularly those that employ
feedback strategies, have been used successfully
in engineered systems for centuries. But natural
feedback circuits evolved in living organisms
much earlier, as they were needed for regulating
the internal milieu of the early cells. Owing
to modern genetic methods, engineered feedback
control systems can now be used to control
biological systems in real time, much like they
control any other process. The challenges of controlling
living organisms are unique. To be successful,
suitable sensors must be used to measure
the output of a single cell (or a sample of cells in a
population), actuators are needed to effect control
action at the cellular level, and a controller that
connects the two should be suitably designed. As
a model-based approach is needed for effective
control, methods for identification of models of
cellular dynamics are also needed. In this entry,
we give a brief overview of the problem of identification
and control of living cells. We discuss the
dynamic model that can be used, as well as the
practical aspects of selecting sensors and actuators.
The control systems can either be realized on
a computer (in silico feedback) or through genetic
manipulations (in vivo feedback). As an example,
we describe how de novo control systems can be
interfaced with living cells and used to control
their behavior. Using computer-controlled light
pulses in combination with a genetically encoded
light-responsive module and a flow cytometer, we
demonstrate how in silico feedback control can
be configured to achieve precise and robust set
point regulation of gene expression.


Dynamical Models of Cell Populations 

In this entry, we focus on a model of an essential 
biological process: gene expression. The goal 
is to come up with a mathematical model for 
gene expression that can be used for model-based 
control. Due to cell variability, we will work with 
a model that describes the average concentration 
of the product of gene expression (the regulated 
variable). This allows us to use population 
measurements and treat them as measurements of 
the regulated variable. We refer the reader to the 
entry ► Stochastic Description of Biochemical 
Networks in this encyclopedia for more information 
on stochastic models of biochemical reaction 
networks. In this framework, the model consists 
of an N-vector stochastic process X(t) describing 
the number of molecules of each chemical 
species of interest in a cell. Given the chemical 
reactions in which these species are involved, the 
mean, E[X(t)], of X(t) evolves according to 
deterministic equations described by 

$$\frac{d}{dt}\,E[X(t)] = S\,E[w(X(t))],$$

where S is an N × M matrix that describes 
the stoichiometry of the M reactions described 
in the model, while w(·) is an M-vector of 
propensity functions. The propensity functions 
reflect the rate of the reactions being modeled. 
When one considers elementary reactions 
(see ► Stochastic Description of Biochemical 
Networks), the propensity function of the i-th 
reaction, $w_i(\cdot)$, is a quadratic function of the 
form $w_i(x) = a_i + b_i^T x + c_i\,x^T Q_i x$. Typically, 
$w_i$ is either a constant, $w_i(x) = a$, a linear 
function of the form $w_i(x) = b x_j$, or a simple 
quadratic of the form $w_i(x) = c x_j^2$. Following 
the same procedure, similar dynamical models 
can be derived that describe the evolution of 
higher-order moments (variances, covariances, 
third-order moments, etc.) of the stochastic 
process X(t). 
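
As a rough illustration of the mean equation above, the following sketch integrates dE[X]/dt = S E[w(X)] for a hypothetical two-species gene expression model (mRNA and protein). The reaction set, the rate constants, and the function names are assumptions made for this example only. Because every propensity here is constant or linear in x, E[w(X)] = w(E[X]) and the mean equations close without higher-order moments; with quadratic propensities this would no longer hold.

```python
# Hypothetical two-species gene expression model (mRNA, protein).
# All reactions and rate constants are illustrative assumptions.

S = [                    # stoichiometry: N = 2 species x M = 4 reactions
    [1, -1, 0, 0],       # mRNA: transcription, mRNA degradation
    [0, 0, 1, -1],       # protein: translation, protein degradation
]


def propensities(x, k_m=2.0, g_m=0.2, k_p=1.0, g_p=0.1):
    """Constant and linear propensities w(x), so E[w(X)] = w(E[X])."""
    m, p = x
    return [k_m, g_m * m, k_p * m, g_p * p]


def mean_derivative(mean):
    """Right-hand side S * w(E[X]) of the mean equation."""
    w = propensities(mean)
    return [sum(S[i][j] * w[j] for j in range(len(w))) for i in range(len(S))]


def simulate_mean(mean0, t_end, dt=0.01):
    """Forward-Euler integration of the mean equation."""
    mean = list(mean0)
    for _ in range(int(round(t_end / dt))):
        d = mean_derivative(mean)
        mean = [m + dt * di for m, di in zip(mean, d)]
    return mean


steady = simulate_mean([0.0, 0.0], t_end=200.0)
```

For these assumed rates the mean settles near k_m/g_m molecules of mRNA and k_p k_m/(g_m g_p) molecules of protein, which is the kind of population-average quantity the entry treats as the regulated variable.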


Identification of Cell Population 
Models 

The model structure outlined above captures the 
fundamental information about the chemical 
reactions of interest. The model parameters that 
enter the functions $w_i(x)$ reflect the reaction 
rates, which are typically unknown. Moreover, 
these reaction rates often vary between different 
cells, because, for example, they depend on the 
local cell environment, or on unmodeled chemical 
species whose numbers differ from cell to 
cell (Swain et al. 2002). The combination of this 
extrinsic parameter variability with the intrinsic 
uncertainty of the stochastic process X(t) makes 
the identification of the values of these parameters 
especially challenging. 

To address this combination of intrinsic 
and extrinsic variabilities, one can compute 
the moments of the stochastic process X(t) 
together with the cross moments of X(t) and the 
extrinsic variability. In the process, the moments 
of the parametric uncertainty themselves become 
parameters of the extended system of ordinary 
differential equations and can, in principle, 
be identified from data. Even though doing 
so requires solving a challenging optimization 
problem, effective results can often be obtained 
by randomized optimization methods. For 
example, Zechner et al. (2012) present the 
successful application of this approach to a 
complex model of the system regulating osmotic 
stress response in yeast. 

When external signals are available, or when 
one would like to determine which species to 
measure and when, such moment-based methods can also 
be used in experiment design. The aim here is to 
determine a priori which perturbation signals and 
which measurements will maximize the information 
on the underlying chemical process that can 
be extracted from experimental data, reducing the 
risk of conducting expensive but uninformative 
experiments. One can show that, given a tentative 
model for the biochemical process, the moments 
of the stochastic process X(t) (and cross 
X(t)-parameter moments in the presence of extrinsic 
variability) can be used to approximate the Fisher 
information matrix and hence characterize 
the information that particular experiments 
contain about the model parameters; an approximation 
of the Fisher information based on the first 
two moments was derived in Komorowski et al. 
(2011) and an improved estimate using correction 
terms based on moments up to order 4 was 
derived in Ruess et al. (2013). Once an estimate 
of the Fisher information matrix is available, 
one can design experiments to maximize the 
information gained about the parameters of the 
model. The resulting optimization problem (over 
an appropriate parametrization of the space of 
possible experiments) is again challenging but 
can be approached by randomized optimization 
methods. 
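
To make the experiment design idea concrete, here is a deliberately simplified sketch. It does not use the moment-based approximations cited above; instead it assumes that the population mean follows mu(t) = (k/g)(1 - exp(-g t)) and is measured with independent Gaussian errors of known standard deviation, so that the Fisher information reduces to a sum of outer products of mean sensitivities. The model, the parameter values, the two candidate sampling schedules, and the use of the determinant (D-optimality) as the design criterion are all assumptions chosen for illustration.

```python
import math


def mean_sensitivities(t, k, g):
    """Sensitivities of mu(t) = (k/g)*(1 - exp(-g*t)) with respect to (k, g)."""
    decay = math.exp(-g * t)
    d_k = (1.0 - decay) / g
    d_g = -(k / g**2) * (1.0 - decay) + (k / g) * t * decay
    return d_k, d_g


def fisher_information(times, k, g, sigma=1.0):
    """2x2 Fisher information for independent Gaussian errors of std sigma."""
    f_kk = f_kg = f_gg = 0.0
    for t in times:
        d_k, d_g = mean_sensitivities(t, k, g)
        f_kk += d_k * d_k / sigma**2
        f_kg += d_k * d_g / sigma**2
        f_gg += d_g * d_g / sigma**2
    return [[f_kk, f_kg], [f_kg, f_gg]]


def d_optimality(fim):
    """Determinant of the 2x2 information matrix (larger is more informative)."""
    return fim[0][0] * fim[1][1] - fim[0][1] * fim[1][0]


# Two hypothetical sampling schedules with the same number of measurements.
early = d_optimality(fisher_information([1, 2, 3, 4], k=1.0, g=0.1))
spread = d_optimality(fisher_information([1, 5, 15, 40], k=1.0, g=0.1))
```

Under these assumptions the schedule that covers both the transient and the approach to steady state carries more information about (k, g) than one that samples only the early transient, which is the kind of comparison an experiment design procedure automates.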


Control of Cell Populations 

There are two control strategies that one can 
implement. The control system can be realized 
on a computer, using real-time measurements 
from the cell population to be controlled. These 
cells must be equipped with actuators that 
respond to the computer signals that close the 
feedback loop. We will refer to this as in silico 
feedback. Alternatively, one can implement the 
sensors, actuators, and control system in their 
entirety within the machinery of the living cells. 
At least in principle, this can be achieved through 
genetic manipulation techniques that are common 
in synthetic biology. We shall refer to this type of 
control as in vivo feedback. Of course, some 
combination of the two strategies can be envisioned. 
In vivo feedback is generally more difficult to 
implement, as it involves working within the 
noisy, uncertain environment of the cell and 
requires implementations that are biochemical in 
nature. Such controllers will work autonomously 
and are heritable, which could prove advantageous 
in some applications. Moreover, coupled 
with intercellular signaling mechanisms such 
as quorum sensing, in vivo feedback may lead 
to tighter regulation (e.g., reduced variance) of 
the cell population. On the other hand, in silico 
controllers are much easier to program, debug, 
and implement and can have much more complex 
dynamics than would be possible with in vivo 
controllers. However, in silico controllers require 
a setup that maintains contact with all the cells to 
be controlled and cannot independently control 
large numbers of such cells. In this entry we 
focus exclusively on in silico controllers. 

The Actuator 

There are several ways to send actuating 
signals into living cells. One consists of chemical 
inducers that the cells respond to either through 
receptors outside the cell or through translocation 
of the inducer molecules across the cellular 
membrane. The chemical signal carried by these 
inducers is then transduced to affect gene 
expression. Another approach, which we describe here, is 
induction through light. There are several light 
systems that can be used. One of these includes 
a light-sensitive protein called phytochrome B 
(PhyB). When red light of wavelength 650 nm 
is shone on PhyB in the presence of the 
phycocyanobilin (PCB) chromophore, PhyB is activated. In 
this active state it binds to another protein, Pif3, 
with high affinity, forming a PhyB-Pif3 complex. 
If far-red light (730 nm) is then shone, PhyB 
is deactivated and dissociates from Pif3. This 
can be exploited for controlling gene expression 
as follows: PhyB is fused to a GAL4 binding 
domain (GAL4BD), which then binds to DNA at a 
specific site just upstream of the gene of interest. 
Pif3 in turn is fused to a GAL4 activating 
domain (GAL4AD). Upon red light induction, the 
Pif3-GAL4AD complex is recruited to PhyB, where 
GAL4AD acts as a transcription factor to initiate 
gene expression. After far-red light is shone, 
the dissociation of the GAL4BD-PhyB complex from 
Pif3-GAL4AD means that GAL4AD no longer 
activates gene expression, and the gene is off. This 
way, one can control gene expression - at least in 
open loop. 

The Sensor 

To measure the output protein concentration 
in cell populations, a fluorescent protein tag is 
needed. This tag can be fused to the protein of 
interest, and the fluorescence intensity emanating 
from each cell is a direct measure of the protein 
concentration in that cell. There are several 
technologies for measuring fluorescence of cell 
populations. While fluorimeters measure the 
overall intensity of a population, flow cytometry 
and microscopy can measure the fluorescence 
of each individual cell in a population sample 
at a given time. This provides a snapshot 
measurement of the probability density function 
of the protein across the population. Repeated 
measurements over time can be used as a basis 
for model identification (Fig. 1). 

Identification and Control of Cell Populations, Fig. 1 
Top figure: A yeast cell whose gene expression can 
be induced by light: red light turns on gene expression 
while far-red turns it off. Bottom figure: Each input light 
sequence can be applied to a culture of light-responsive 
yeast cells, resulting in a corresponding gene expression 
pattern that is measured by flow cytometry. By applying 
multiple carefully chosen light input test sequences and 
looking at their corresponding gene expression patterns, a 
dynamic model of gene expression can be identified 

The Control System 

Equipped with sensors, actuators, and a model 
identified with the methods outlined above, 
one can proceed to design control algorithms 
to regulate the behavior of living cells. Even 
though moment equations lead to models that 
look like conventional ordinary differential 
equations, from a control theory point of view, 
cell population systems offer a number of 
challenges. Biochemical processes, especially 
genetic regulation, are often very slow, with time 
constants of the order of tens of minutes. This 
suggests that pure feedback control without some 
form of preview may be insufficient. Moreover, 
due to our incomplete understanding of the 
underlying biology, the available models are 
typically inaccurate, or even structurally wrong. 
Finally, the control signals are often 
unconventional; for example, for the light control system 
outlined above, experimental limitations imply 
that the system must be controlled using discrete 
light pulses, rather than continuous signals. 

Fortunately, advances in control theory allow 
one to effectively tackle most of these challenges. 
The availability of a model, for example, enables 
the use of model predictive control methods that 
introduce the necessary preview into the feedback 
process. The presence of unconventional inputs 
may make the resulting optimization problems 
difficult, but the slow dynamics work in our favor, 
providing time to search the space of possible 
input trajectories. Finally, the fundamental 
principle of feedback is often enough to deal 
with inaccurate models. Unlike systems biology 
applications where the goal is to develop a model 
that faithfully captures the biology, in population 
control applications even an inaccurate model 
is often enough to provide adequate closed-loop 
performance. Exploring these issues, Milias-Argeitis 
et al. (2011) developed a feedback 
mechanism for genetic regulation using the 
light control system, based on an extended 
Kalman filter and a model predictive controller 
(Figs. 2 and 3). A related approach was taken 
in Uhlendorf et al. (2012) to regulate the osmotic 
stress response in yeast, while Toettcher et al. 
(2011) developed what is effectively a PI controller 
for a faster cell signaling system. 

Identification and Control of Cell Populations, Fig. 2 
Architecture of the closed-loop light control system. Cells 
are kept in darkness until they are exposed to light pulse 
sequences from the in silico feedback controller. Cell culture 
samples are passed to the flow cytometer, whose output 
is fed back to the computer, which implements a Kalman 
filter plus a Model Predictive Controller. The objective 
of the control is to have the mean gene expression level 
follow a desired set value 

Identification and Control of Cell Populations, Fig. 3 
Left panel: The closed-loop control strategy (orange) 
enables set point tracking, whereas an open-loop strategy 
(green) does not. Right panel: Four different experiments, 
each with a different initial condition. Turning closed-loop 
control on at time t = 0 shows that tracking can 
be achieved regardless of initial condition (see 
Milias-Argeitis et al. (2011)). [Panel labels: Precomputed Input 
Model (no feedback); Feedback Control Experiment; 
Setpoint = 5; horizontal axis: Time (min); Random Pulses 
(t < 0), Feedback Control (t > 0)] 
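
The combination of a slow process, an inaccurate model, and pulse-type inputs can be illustrated with a small receding-horizon sketch. It is not the controller of Milias-Argeitis et al. (2011): the first-order model, its coefficients, the short horizon, the brute-force enumeration of pulse sequences, and the use of a direct state measurement in place of a Kalman filter are all simplifying assumptions made here.

```python
from itertools import product


def predict(x, u_seq, a, b):
    """Model prediction of the expression level for a pulse sequence."""
    traj = []
    for u in u_seq:
        x = a * x + b * u
        traj.append(x)
    return traj


def mpc_pulse(x, setpoint, a, b, horizon=5):
    """Enumerate all on/off pulse sequences and return the first move
    of the sequence with the smallest predicted squared tracking error."""
    best_seq, best_cost = None, float("inf")
    for u_seq in product((0, 1), repeat=horizon):
        cost = sum((y - setpoint) ** 2 for y in predict(x, u_seq, a, b))
        if cost < best_cost:
            best_seq, best_cost = u_seq, cost
    return best_seq[0]


def run_closed_loop(setpoint=5.0, steps=60):
    a_model, b_model = 0.9, 1.0   # controller's (deliberately wrong) model
    a_plant, b_plant = 0.9, 0.8   # hypothetical "true" process
    x, history = 0.0, []
    for _ in range(steps):
        u = mpc_pulse(x, setpoint, a_model, b_model)  # re-plan at every sample
        x = a_plant * x + b_plant * u                 # apply first pulse only
        history.append(x)
    return history


history = run_closed_loop()
```

Because the plan is recomputed from the latest measurement at every sample, the expression level stays near the set point even though the controller overestimates the effect of each pulse, which mirrors the observation above that an inaccurate model is often sufficient in closed loop.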

Summary and Future Directions 

The control of cell populations offers novel 
challenges and novel vistas for control engineering 
as well as for systems and synthetic biology. 
Using external input signals and experiment 
design methods, one can more effectively probe 
biological systems to force them to reveal 
their secrets. Regulating cell populations in a 
feedback manner opens new possibilities for 
biotechnology applications, among them the 
reliable and efficient production of antibiotics 
and biofuels using bacteria. Beyond biology, the 
control of populations is bound to find further 
applications in the control of large-scale, 
multiagent systems, including those in transportation, 
demand response schemes in energy systems, 
crowd control in emergencies, and education. 

Cross-References 

► Deterministic Description of Biochemical Networks 

► Modeling of Dynamic Systems from First Principles 

► Stochastic Description of Biochemical Networks 

► System Identification: An Overview 
Industrial MPC of Continuous 
Processes 

Mark L. Darby 

CMiD Solutions, Houston, TX, USA 


Abstract 

Model predictive control (MPC) has become 
the standard for implementing constrained, 
multivariable control of industrial continuous 
processes. These are processes which are 
designed to operate around nominal steady-state 
values, which include many of the important 
processes found in the refining and chemical 
industries. The following provides an overview 
of MPC, including its history, major technical 
developments, and how MPC is applied today 
in practice. Possible future developments are 
also outlined. 

Keywords 

Constraints; Modeling; Model predictive control; 
Multivariable systems; Process identification; 
Process testing 

Introduction 

Model predictive control (MPC) refers to a class 
of control algorithms that explicitly incorporate a 
process model for predicting the future response 
of a plant and rely on optimization as the means 
of determining control action. At each sample 
interval, MPC computes a sequence of future 
plant input signals that optimize future plant 
behavior. Only the first element of the future input 
sequence is applied to the plant, and the optimization is 
repeated at subsequent sample intervals. 

MPC provides an integrated solution for 
controlling non-square systems with complex 
dynamics, interacting variables, and constraints. 
MPC has become a standard in the continuous 
process industries, particularly in refining and 
chemicals, where it has been widely applied 
for over 25 years. In most commercial MPC 
products, an embedded steady-state optimizer 
is cascaded to the MPC controller. The MPC 
steady-state optimizer determines feasible, 
optimal settling values of the manipulated and 
controlled variables. The MPC controller then 
optimizes the dynamic path to optimal steady- 
state values. 

The scope of an MPC application may include 
a unit operation such as a distillation column or 
reactor, or a larger scope such as multiple 
distillation columns, or a scope that combines reaction 
and separation sections of a plant in one 
controller. MPC is positioned in the control and 
decision hierarchy of a processing facility as shown 
in Fig. 1. The variables associated with MPC 
consist of manipulated variables (MVs), controlled 
variables (CVs), and disturbance variables (DVs). 


Industrial MPC of Continuous Processes, Fig. 1 
Industrial control and decision hierarchy 
[Figure labels: Planning & Scheduling; Real Time 
Optimization; targets, limits, objectives] 


CVs include variables normally controlled at a 
fixed value, such as a product impurity, as 
well as those considered constraints, for example 
limits related to capacity or safety that may only 
be sometimes active. DVs are measurements that 
are treated as feedforward variables in MPC. The 
manipulated variables are typically setpoints of 
underlying PID controllers, but may also include 
valve position signals. Most of the targets and 
limits are local to the MPC, but others come 
directly from real-time optimization (if present), 
or indirectly from planning/scheduling, which are 
normally translated to the MPC in an open-loop 
manner by the operations personnel. 

Linear and nonlinear model forms are found in 
industrial MPC applications; however, the majority 
of the applications continue to rely on a linear 
model, identified from data generated from a 
dedicated plant test. Nonlinearities that primarily 
affect system gains are often adequately 
controlled with linear MPC through gain scheduling 
or by applying linearizing static transformations. 
Nonlinear MPC applications tend to be reserved 
for those applications where nonlinearities are 
present in both system gains and dynamic 
responses and the controller must operate at 
significantly different targets. 
Origins and History 

MPC has its origins in the process industries 
in the 1970s. The year 1978 marked the first 
published description of predictive control under 
the name IDCOM, an acronym for Identification 
and Command (Richalet et al. 1978). A short time 
later, Cutler and Ramaker (1979) published a 
predictive control algorithm under the name 
Dynamic Matrix Control (DMC). Both approaches 
had been applied industrially for several years 
before the first publications appeared. These 
predictive control approaches targeted the more 
difficult industrial control problems that could not 
be adequately handled with other methods, 
either with conventional PID control or with 
advanced regulatory control (ARC) techniques that 
rely on single-loop controllers augmented with 
overrides, feedforwards/decouplers, and custom 
logic. 

The basic idea behind the predictive control 
approach is shown in Fig. 2 for the case of a 
single input single output, stable system. Future 
predictions of inputs and outputs are denoted with 
the hat symbol and shown as dashes; double 
indexes, $t|k$, indicate future values at time t 
based on information up to and including time 
k. The optimization problem is to bring future 
predicted outputs $(\hat{y}_{k+1|k}, \ldots, \hat{y}_{k+P|k})$ close to 
a desired trajectory over a prediction horizon, 
P, by means of a future sequence of inputs 
$(\hat{u}_{k|k}, \ldots, \hat{u}_{k+M-1|k})$ calculated over a control 
horizon M. The trajectory may be a constant 
setpoint. In the general case, the optimization 
is performed subject to constraints that may be 
imposed on future inputs and outputs. Only the 
first of the future moves is implemented and the 
optimization is repeated at the next time instant. 
Feedback, which accounts for unmeasured 
disturbances and model error, is incorporated by 
shifting all future output predictions, prior to the 
optimization, based on the difference between the 
output measurement $y_k$ and the previous 
prediction $\hat{y}_{k|k-1}$, denoted by $d_k$ (i.e., the prediction 
error at time instant k). Future predicted values 
of the outputs depend on both past and future 
values of the inputs. If no future input changes are made 
(at time k or after), the model can be used to 
calculate the future "free" output response, $\hat{y}^0_{k+j|k}$, 
which will ultimately settle at a new steady-state 
value based on the settling time (or time to steady 
state of the model, $T_{ss}$). For the unconstrained 
case, it is straightforward to show that the optimal 
result is a linear control law that depends only on 
the error between the desired trajectory and the 
free output response. 
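
The mechanics described above (free response, prediction-error shift, and an unconstrained move that depends only on the error between the trajectory and the shifted free response) can be sketched for a scalar example. This is not IDCOM or DMC as implemented commercially: the first-order model, the single-move control horizon (M = 1), the constant setpoint trajectory, the move penalty, and all numerical values are assumptions chosen to keep the least-squares solution in closed form.

```python
def step_coeffs(a, b, horizon):
    """Step-response coefficients s_1..s_P of y[k+1] = a*y[k] + b*u[k]."""
    coeffs, s = [], 0.0
    for _ in range(horizon):
        s = a * s + b
        coeffs.append(s)
    return coeffs


def predictive_move(y_meas, y_model, u_prev, setpoint, a, b,
                    horizon=10, move_penalty=0.1):
    """One unconstrained predictive-control step with a single future move."""
    s = step_coeffs(a, b, horizon)
    d = y_meas - y_model                  # prediction error d_k
    errors, free = [], y_model
    for _ in range(horizon):
        free = a * free + b * u_prev      # free response: input held at u_prev
        errors.append(setpoint - (free + d))   # shifted by the prediction error
    # Least-squares move: linear in the error against the free response.
    du = sum(sj * ej for sj, ej in zip(s, errors)) / (
        sum(sj * sj for sj in s) + move_penalty)
    return u_prev + du


def simulate(steps=100, setpoint=1.0):
    a, b = 0.8, 0.5                       # controller model
    a_p, b_p, dist = 0.8, 0.4, 0.3        # plant: gain error plus disturbance
    y_plant = y_model = u = 0.0
    for _ in range(steps):
        u = predictive_move(y_plant, y_model, u, setpoint, a, b)
        y_model = a * y_model + b * u     # model prediction of the next output
        y_plant = a_p * y_plant + b_p * u + dist
    return y_plant


final_output = simulate()
```

In this sketch the prediction-error shift plays the role of integral action: despite the assumed gain error and constant disturbance, the plant output settles at the setpoint.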

The predictive approach seemed to contrast 
with the state-space optimal control method of 
the time, the linear quadratic regulator (LQR). 
Later research exposed the similarities to LQR 
and also Internal Model Control (IMC) (Garcia 
and Morari 1982), although these techniques 
did not solve an online optimization problem. 
Optimization-based control approaches became 
feasible for industrial applications due to (1) the 
slower sampling requirements of most industrial 
control problems (on the order of minutes) and 
(2) the hierarchical implementations in which MPC 
provides setpoints to lower level PID controllers, 
which execute on a much faster sample time (on 
the order of seconds or faster). 

Industrial MPC of Continuous Processes, Fig. 2 
Predictive control approach 

Although the basic ideas behind MPC 
remain, industrial MPC technology has changed 
considerably since the first formulations in the 
late 1970s. Qin and Badgwell (2003) describe 
the enhancements to MPC technology that 
occurred over the next 20 plus years until the 
late 1990s. Enhancements since that time are 
highlighted in Darby and Nikolaou (2012). 
These improvements to MPC reflect increases 
in computer processing capability and additional 
requirements of industry, which have led to 
increased functionality and tools/techniques 
to simplify implementation. A summary 
of the significant enhancements that have 
been made to industrial MPC is highlighted 
below. 

Constraints. Posing input and output constraints 
as linear inequalities, expressed as a function 
of the future input sequence (Garcia and 
Morshedi 1986), and solved by a standard 
quadratic program or an iterative scheme 
which approximates one. 

Two-Stage Formulations. Limitations of a 
single objective function led to two-stage 
formulations to handle MV degrees of freedom 
(constraint pushing) and steady-state 
optimization via a linear program (LP). 

Integrators. In their native form, impulse and 
step response models can be applied only to 
stable systems (in which the impulse response 
model coefficients approach zero). Extensions 
to handle integrating variables included 
embedding a model of the difference of the 
integrating signal or integrating a fraction of the 
current prediction error into the future 
(implying an increasing $|d_{k+j|k}|$ for j > 1 in Fig. 2). 
The desired value of an integrator at steady 
state (e.g., zero slope) has been incorporated 
into two-stage formulations (see, e.g., Lee and 
Xiao 2000). 

State Space Models. The first state space 
formulation of MPC, which was introduced in 
the late 1980s (Marquis and Broustail 1988), 
allowed MPC to be extended to integrating 
and unstable processes. It also made use of 
the Kalman filter, which provided additional 
capability to estimate plant states and 
unmeasured disturbances. Later, a state space 
MPC offering was developed based on an 
infinite horizon (for both control and prediction) 
(Froisy 2006). These state space approaches 
provided a connection back to unconstrained 
LQR theory. 

Nonlinear MPC. The first applications of 
nonlinear MPC, which appeared in the 1990s, 
were based on neural net models. In these 
approaches, a linear dynamic model was 
combined with a neural net model that accounted 
for static nonlinearity (Demoro et al. 1997; 
Zhao et al. 2001). 

The late 1990s saw the introduction of an 
industrial nonlinear MPC based on first 
principle models derived from differential mass 
and energy balances and reaction kinetic 
expressions, expressed in differential algebraic 
equation (DAE) form (Young et al. 2002). 

A process where nonlinear MPC is routinely 
applied is polymer manufacturing. 

Identification Techniques. Multivariable 
prediction error techniques are now routinely 
used. More recently, industrial application of 
subspace identification methods has appeared, 
following the development of these algorithms 
in the 1990s. Subspace methods incorporate 
the correlation of output measurements in the 
identification of a multivariable state space 
model, which can be used directly in a state 
space MPC or converted to an impulse or step 
response model based MPC. 

Testing Methods. The 1990s saw increased 
use of automatic testing methods to generate 
data for (linear) dynamic model 
identification using uncorrelated binary 
signals. Since 2000, closed-loop testing 
methods have received considerable attention. 
The motivation for closed-loop testing is to 
reduce implementation time and/or effort 
of the initial implementation as well as 
the ongoing need to re-identify the model 
of an industrial application in light of 
process changes. These closed-loop testing 
methods, which require a preliminary or 
existing model, utilize uncorrelated dither 
signals either introduced as biases to the 
controller MVs or injected through the 
steady-state LP or QP, where additional 
logic or optimization of the test protocol 
may be performed (Kalafatis et al. 2006; 
MacArthur and Zhan 2007; Zhu et al. 
2012). 


Mathematical Formulation 


While there are differences in how the MPC 
problem is formulated and solved, the following 
general form captures most of the MPC products 
(Qin and Badgwell 2003), although not all terms 
may be present in a given product: 


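$$
\min_{U}\;\left[\;\sum_{j=1}^{P}\left\|y_{k+j|k}-y^{\mathrm{ref}}_{k+j|k}\right\|_{Q}^{2}
+\sum_{j=1}^{P}\left\|s_{j}\right\|_{T}^{2}
+\sum_{j=0}^{M-1}\left\|u_{k+j|k}-u_{ss}\right\|_{R}^{2}
+\sum_{j=0}^{M-1}\left\|\Delta u_{k+j|k}\right\|_{S}^{2}\;\right] \tag{1}
$$

subject to: 

$$
\begin{aligned}
&\left.\begin{aligned}
x_{k+j|k} &= f\!\left(x_{k+j-1|k},\,u_{k+j-1|k}\right), && j = 1,\ldots,P\\
y_{k+j|k} &= g\!\left(x_{k+j|k},\,u_{k+j|k}\right), && j = 1,\ldots,P
\end{aligned}\right\}\ \text{Model equations}\\[4pt]
&\left.\begin{aligned}
y^{\min}-s_{j} \le\; & y_{k+j|k} \le y^{\max}+s_{j}, && j = 1,\ldots,P\\
& s_{j} \ge 0, && j = 1,\ldots,P
\end{aligned}\right\}\ \text{Output constraints/slacks}\\[4pt]
&\left.\begin{aligned}
u^{\min} \le\; & u_{k+j|k} \le u^{\max}, && j = 0,\ldots,M-1\\
\Delta u^{\min} \le\; & \Delta u_{k+j|k} \le \Delta u^{\max}, && j = 0,\ldots,M-1
\end{aligned}\right\}\ \text{Input constraints}
\end{aligned}
$$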


where the minimization is performed over the 
future sequence of inputs $U = (u_{k|k}, u_{k+1|k}, \ldots, 
u_{k+M-1|k})$. The four terms in the objective 
function represent conflicting quadratic penalties 
($\|x\|_A^2 = x^T A x$); the penalty matrices are almost 
always diagonal. The first term penalizes the 
error relative to a desired reference trajectory 
(cf. Fig. 2) originating at $y_{k|k}$ and terminating 
at a desired steady state, $y_{ss}$; the second term 
penalizes output constraint violations over the 
prediction horizon (constraint softening); the 
third term penalizes input deviations from a 
desired steady state, either manually specified 
or calculated. The fourth term penalizes input 
changes as a means of trading off output tracking 
and input movement (move suppression). 
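
For readers who find it easier to see the four penalties as code, the sketch below evaluates an objective of this form for scalar signals. The function name, the argument names, and the scalar weights q, t, r, s (standing in for diagonal penalty matrices) are assumptions made for illustration; no commercial product is claimed to compute its objective this way.

```python
def mpc_objective(y_pred, y_ref, slacks, u_seq, u_prev, u_ss, q, t, r, s):
    """Four-term quadratic MPC objective for scalar input and output."""
    # Term 1: error relative to the reference trajectory.
    tracking = q * sum((y - yr) ** 2 for y, yr in zip(y_pred, y_ref))
    # Term 2: output constraint violations (constraint softening).
    softening = t * sum(sl ** 2 for sl in slacks)
    # Term 3: input deviation from the desired steady state.
    input_dev = r * sum((u - u_ss) ** 2 for u in u_seq)
    # Term 4: input changes (move suppression), first move relative to u_prev.
    moves = [u_seq[0] - u_prev] + [b - a for a, b in zip(u_seq, u_seq[1:])]
    move_supp = s * sum(du ** 2 for du in moves)
    return tracking + softening + input_dev + move_supp


cost = mpc_objective(
    y_pred=[1.0, 2.0], y_ref=[1.5, 2.0], slacks=[0.0, 0.5],
    u_seq=[1.0, 1.0], u_prev=0.5, u_ss=0.5,
    q=2.0, t=10.0, r=1.0, s=4.0,
)
```

Raising s relative to q in such a function makes large input moves more expensive than tracking error, which is the trade-off the move suppression term is meant to expose.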

The above formulation applies to both linear 
and nonlinear MPC. For linear MPCs, except 
for state space formulations, there are no state 
equations and the outputs in the dynamic model 
are a function of only past inputs, such as with the 
finite step response model. 

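When a steady-state optimizer is present in the 
MPC, it provides the steady-state targets for $u_{ss}$ 
(in the third quadratic term) and $y_{ss}$ (in the output 
reference trajectory). Consider the case of linear 
MPC with LP as the steady-state optimizer. The 
LP is typically formulated as 

$$
\min_{\Delta u_{ss},\,s_{+},\,s_{-}}\; c_{u}^{T}\,\Delta u_{ss} + c_{y}^{T}\,\Delta y_{ss} + q_{+}^{T}\,s_{+} + q_{-}^{T}\,s_{-}
$$

subject to: 

$$
\begin{aligned}
&\left.\begin{aligned}
\Delta y_{ss} &= G_{ss}\,\Delta u_{ss}\\
u_{ss} &= u_{k-1} + \Delta u_{ss}\\
y_{ss} &= \hat{y}^{0}_{k+T_{ss}|k} + \Delta y_{ss}
\end{aligned}\right\}\ \text{Model equations}\\[4pt]
&\left.\;y^{\min} - s_{-} \le y_{ss} \le y^{\max} + s_{+}\;\right\}\ \text{Output constraints}\\[4pt]
&\left.\begin{aligned}
u^{\min} \le\; & u_{ss} \le u^{\max}\\
-M\,\Delta u^{\max} \le\; & \Delta u_{ss} \le M\,\Delta u^{\max}
\end{aligned}\right\}\ \text{Input constraints}
\end{aligned}
$$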

$G_{ss}$ is formed from the gains of the linear 
dynamic model. Deviations outside minimum and 
maximum output limits ($s_-$ and $s_+$, respectively) 
are penalized, which provides constraint softening 
in the event all outputs cannot be simultaneously 
controlled within limits. The weights in $q_-$ and 
$q_+$ determine the relative priorities of the 
output constraints. The input constraints, expressed 
in terms of $\Delta u^{\max}$, prevent targets from being 
passed to the dynamic optimization that cannot 
be achieved. The resulting solution - $u_{ss}$ and 
$y_{ss}$ - provides a consistent, achievable steady 
state for the dynamic MPC controller. Notice that 
for inputs, the steady-state delta is applied to the 
current value and, for outputs, the steady-state 
delta is applied to the steady-state prediction of 
the output without future moves, after correcting 
for the current model error (cf. Fig. 2). If a 
real-time optimizer is present, its outputs, which may 
be targets for CVs and/or MVs, are passed to the 
MPC steady-state optimizer and considered with 
other objectives but at lower weights or priorities. 
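
A one-MV, one-CV instance of this steady-state problem is small enough to solve by inspection, which the following sketch exploits. It is not a general LP solver: it assumes scalar variables, nonnegative slack weights, slacks that are themselves nonnegative, and a nonempty feasible range for the input move, so that the piecewise-linear cost attains its minimum at an endpoint of that range or where the output reaches one of its limits. All names and numbers are hypothetical.

```python
def steady_state_target(u_prev, y_free, gain, u_lim, y_lim, du_max, m_horizon,
                        c_u, c_y, q_minus, q_plus):
    """Scalar steady-state target calculation by breakpoint enumeration."""
    u_min, u_max = u_lim
    y_min, y_max = y_lim
    # Feasible range for the steady-state move (both input constraints).
    lo = max(u_min - u_prev, -m_horizon * du_max)
    hi = min(u_max - u_prev, m_horizon * du_max)

    def cost(du):
        y_ss = y_free + gain * du
        s_minus = max(0.0, y_min - y_ss)    # violation of the lower limit
        s_plus = max(0.0, y_ss - y_max)     # violation of the upper limit
        return c_u * du + c_y * gain * du + q_minus * s_minus + q_plus * s_plus

    # Candidate optima: range endpoints and moves that put y_ss on a limit.
    candidates = [lo, hi]
    if gain != 0.0:
        for y_bound in (y_min, y_max):
            du = (y_bound - y_free) / gain
            if lo <= du <= hi:
                candidates.append(du)
    du_best = min(candidates, key=cost)
    return u_prev + du_best, y_free + gain * du_best


# Hypothetical case: a negative input cost rewards raising the MV (e.g., feed).
u_ss, y_ss = steady_state_target(
    u_prev=1.0, y_free=3.0, gain=2.0, u_lim=(0.0, 10.0), y_lim=(0.0, 5.0),
    du_max=0.5, m_horizon=4, c_u=-1.0, c_y=0.0, q_minus=100.0, q_plus=100.0,
)
```

In this example the target pushes the input up until the output reaches its upper limit and stops there, because the slack penalty outweighs the economic incentive to move further.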

Some additional differences or features found 
in industrial MPCs include: 

1-Norm formulations where absolute deviations, 
instead of quadratic deviations, are penalized. 

Use of zone trajectories or "funnels" with small 
or no penalty applied if predictions remain 
within the specified zone boundaries. 

Use of a minimum movement criterion in 
either the dynamic or steady-state optimizations, 
which leads to MV movement only when CV 
predictions go outside specified limits. This 
can provide controller robustness to modeling 
errors. 

Multiobjective formulations which solve a series 
of QP or LP problems instead of a single one, 
and can be applied to the dynamic or 
steady-state optimizations. In these formulations, 
higher priority objectives are solved first, 
followed by lesser priority objectives with 
the solution of the higher priority objectives 
becoming equality constraints in subsequent 
optimizations (Maciejowski 2002). 

MPC Design 

Key design decisions for a given application are 
the number of MPC controllers and the selection 
of the MVs, DVs, and CVs for each controller; 
however, design decisions are not limited to just 
the MPC layer. The design problem is one of 
deciding on the best overall structure for the 
MPC(s) and the regulatory controls, given the 
control objectives, expected constraints, 
qualitative knowledge of the expected disturbances, and 
robustness considerations. It may be that 
existing measurements are insufficient and additional 
sensors may be required. In addition, a 
measurement may not be updated on a time interval 
consistent with acceptable dynamic control, for 
example, laboratory measurements and process 
composition analyzers. In this case, a soft sensor, 
or inferential estimator, may need to be developed 
from temperature and pressure measurements. 

MPC is frequently applied to a major plant 
unit, with the MVs selected based on their 
sensitivity to key unit CVs and plant economics. 
Decisions regarding the number and size of the 
MPCs for a given application depend on plant 
objectives, (expected) constraints, and also designer 
preferences. When the objective is to minimize 
energy consumption based on fixed or specified 
feed rate, multiple smaller controllers can be 
used. In this situation, controllers are normally 
designed based on the grouping of MVs with 
the largest effect on the identified CVs, often 
leading to MPCs designed for individual sections 
of equipment, such as reactors and distillation 
columns. When the objective is to maximize feed 
(or certain products), larger controllers are 
normally designed, especially if there are multiple 
constraints that can limit plant throughput. The 
MPC steady-state LP or QP is ideally suited to 
solving the throughput maximization problem by 
utilizing all available MVs. The location of the 
most limiting constraints can impact the number 
of MPCs. If the major constraints are near the 
front-end of the plant, one MPC can be designed 
which connects these constraints with key MVs 
such as feed rates, and other MPCs designed for 
the rest of the plant. If the major constraints are 
located near the back of the plant, then a single 
MPC is normally considered; alternatively, an 
MPC cascade could be considered, although this 
is not a common practice across the industry (and 
often requires customization). 

The feed maximization objective is a major 
reason why MPCs have become larger with 
the advances in computer processing capability. 
However, there is generally a higher requirement 
on model consistency for larger controllers due 
to the increased number of possible constraint 
sets against which the MPC can operate. A larger 
controller can also be harder to implement and 
understand. This is a reason why some 
practitioners prefer implementing smaller MPCs at the 
potential loss of benefits. 

MPC Practice 

An MPC project is typically implemented in the 
following sequence: 

Pretest and preliminary MPC design 
Plant testing 
Model and controller development 
Commissioning 

These tasks apply whether the MPC is linear 
or nonlinear, but with some differences, primarily 
in model development and in plant testing. In 
nonlinear MPC, key decisions are related to the 
model form and level of rigor. Note that with a 
fundamental model, lower level PID loops must 
be included in the model, if the dynamics are 
significant; this is in contrast to empirical 
modeling, where the dynamics of the PID loops are 
embedded in the plant responses. A fundamental 
model will typically require less plant testing 
and make use of historical operating data to 
fit certain model parameters such as heat 
transfer coefficients and reaction constants. Historical 
data and/or data from a validated nonlinear static 
model can also be used to develop nonlinear 
static models (e.g., neural net) to combine with 
empirical dynamic models. As mentioned earlier, 
most industrial applications continue to rely on 
empirical linear dynamic models, fit to data from 
a dedicated plant test. This will be the basis in the 
following discussion. 

In the pretest phase of work, the key activity 
is one of determining the base level regulatory 
controls for MPC, tuning of these controls, and 
determining if the current plant instrumentation 
is adequate. It is common to retune a significant 
number of PID loops, with significant benefits 
often resulting from this step alone. 

A range of testing approaches is used in plant
testing for linear MPC, including both manual
and automatic (computer-generated) test signal
designs, most often in open loop but, increasingly,
in closed loop. Most input testing continues
to be based on uncorrelated signals, implemented
either manually or from computer-generated random
sequences. Model accuracy requirements
dictate accuracy across a range of frequencies,
which is achieved by varying the duration of the
steps. Model identification runs are made throughout
the course of a test to determine when model
accuracy is sufficient and a test can be stopped. 
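
A minimal sketch of a computer-generated test signal of the kind described above: a two-level step sequence whose hold times are drawn at random, so that short and long steps excite different frequency ranges. The function name, the uniform hold-time distribution, and the parameter values are assumptions for illustration; commercial test-design tools use their own (often more elaborate) designs.

```python
import random

def step_test_signal(n_samples, min_hold, max_hold, amplitude=1.0, seed=0):
    """Two-level step sequence with random hold times (in samples).

    Each step is held for a duration drawn uniformly from
    [min_hold, max_hold]; the level flips at every step change.
    """
    rng = random.Random(seed)
    level = rng.choice((-amplitude, amplitude))
    signal = []
    while len(signal) < n_samples:
        hold = rng.randint(min_hold, max_hold)
        signal.extend([level] * hold)
        level = -level
    return signal[:n_samples]

# One independently seeded sequence per MV keeps the inputs approximately
# uncorrelated over a sufficiently long test.
mv_signals = [step_test_signal(600, 5, 60, amplitude=0.5, seed=k)
              for k in range(3)]
```

The signals would be added as deviations around the current MV targets; the hold-time range is the tuning knob that trades low-frequency (gain) information against high-frequency (dead time, fast dynamics) information.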

In the next phase of work, modeling of the
plant is performed. This includes constructing the
overall MPC model from individual identification
runs; for example, deciding which models
are significant and judging the models' characteristics
(dead times, inverse response, settling
time, gains) based on engineering/process and
a priori knowledge. An important step is analyzing,
and adjusting if necessary, the gains
of the constructed model to ensure that the model
gains satisfy mass balances and that gain ratios do
not result in fictitious degrees of freedom (due
to model errors) that the steady-state optimizer
could exploit. Also included is the development
of any required inferential or soft sensors, typically
based on multivariate regression techniques
such as principal component regression (PCR),
principal component analysis (PCA), and partial
least squares (PLS), or sometimes based on a
fundamental model. 
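
One simple way to screen a gain matrix for fictitious degrees of freedom is to look for 2x2 submatrices that are nearly, but not exactly, singular: their gain ratios almost match, so a small identification error creates an apparent extra degree of freedom. The sketch below is one possible screening heuristic under that assumption; the function name, the relative-determinant measure, and the 5% tolerance are illustrative choices, not a procedure prescribed by the text.

```python
from itertools import combinations

def near_collinear_pairs(gain, tol=0.05):
    """Flag 2x2 gain submatrices that are nearly singular.

    gain[i][k] is the steady-state gain from MV k to CV i.
    Returns tuples (cv_i, cv_j, mv_k, mv_l) whose relative determinant
    is nonzero but below tol.
    """
    flagged = []
    n_cv, n_mv = len(gain), len(gain[0])
    for i, j in combinations(range(n_cv), 2):
        for k, l in combinations(range(n_mv), 2):
            a, b = gain[i][k], gain[i][l]
            c, d = gain[j][k], gain[j][l]
            scale = abs(a * d) + abs(b * c)
            if scale == 0:
                continue  # at most trivial coupling in this block
            rel_det = abs(a * d - b * c) / scale
            if 0 < rel_det < tol:
                flagged.append((i, j, k, l))
    return flagged

# Hypothetical gains: the ratios 1.0/2.0 and 0.5/1.02 almost coincide.
suspect = near_collinear_pairs([[1.0, 2.0], [0.5, 1.02]])
clean = near_collinear_pairs([[1.0, 2.0], [0.5, -1.0]])
```

A flagged block is a prompt for engineering judgment: either the gains are adjusted so the ratios match exactly (removing the spurious degree of freedom) or the difference is confirmed as physically real.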

During controller development, initial controller
tuning is performed. This relates to establishing
criteria for utilizing available degrees of
freedom and setting control variable priorities. In
addition, initial tuning values are established for
the dynamic control. Steady-state responses corresponding
to expected constraint scenarios are
analyzed to determine if the controller behaves as
expected, especially with respect to the steady-state
changes in the manipulated variables.

Commissioning involves testing and tuning
the controller against different constraint sets. It
is not unusual to modify or revisit model decisions
made earlier. In the worst case, control
performance may be deemed unacceptable and
the control engineer is forced to revisit earlier
decisions such as the base level regulatory strategy
or plant model quality, which would require
re-testing and re-identification of portions of the
plant model. The main commissioning effort typically
takes place over a two- to three-week period,
but can vary based on the size and model density
of the controller. In reality, commissioning, or
more accurately, controller maintenance, is an
ongoing activity. It is important that the operating
company have in-house expertise that can be used
to answer questions (“why is the controller doing
that?”), troubleshoot, and modify the controller to
reflect new operating modes and constraint sets.

Future Directions 

Likely future developments are expected to follow
extensions of current approaches. Due to
the success of automatic, closed-loop testing, one
possibility is extending it to “dual” or “joint” control,
where control and identification objectives
are combined and the user can select how
much the control (e.g., output variance) can be
affected by test perturbation signals. Another is
formulating the plant test as a DOE (design
of experiments) optimization problem that could,
for example, target specific models or model
parameters. In the identification area, extensions
have started to appear which allow constraints
to be imposed, for example, on dead times or
gains, thus allowing a priori knowledge to be
used. Another important area that has seen recent
emphasis, and in which more development can be
expected, is monitoring and diagnosis, for example,
detecting which submodels of MPC have
become inaccurate and require re-identification.


As mentioned earlier, one of the advantages 
of state-space modeling is the inherent flexibility 
to model unmeasured disturbances (i.e., dk+j\j, 
cf. Fig. 2); however, these have not found widespread
use in industry. A useful enhancement
would be a framework for developing and implementing
improved estimators in a convenient and
transparent manner that would be applicable to
traditional FIR- and FSR-based MPCs.

In the area of nonlinear control, the use of 
hybrid modeling approaches has increased, for 
example, integrating known fundamental model 
relationships with neural net or linear time- 
varying dynamic models. The motivation is in 
reducing complexity and controller execution 
times. The use of hybrid techniques can be 
expected to further increase, especially if 
nonlinear control is to be applied more broadly to 
larger control problems. Even in situations where 
control with linear MPC is adequate, there may 
be benefits from the use of hybrid or fundamental 
models, even if the models are not directly used 
in the control calculation. The resulting model 
could be used offline in model development or 
online to update the linear MPC model. Benefits 
would come from reduced plant testing and in 
ensuring model consistency. In the longer term, 
one can foresee a more general modeling and 
control environment where the user would not 
have to be concerned with the distinction between 
linear and nonlinear models and would be able 
to easily incorporate known relationships into the 
controller model. 

An area that has not received significant attention,
but is suggested as an area worth pursuing,
concerns MPC cascades. Most of the applications
and research are based on a single MPC
or multiple distributed MPCs. An MPC cascade
would permit the lower MPC to run at a faster 
time period and allow the user to decide which 
degrees of freedom are to be used for higher level 
objectives, such as feed maximization. 

Cross-References 

► Control Hierarchy of Large Processing Plants: An Overview
► Control Structure Selection
► Model-Based Performance Optimizing Control
► Model-Predictive Control in Practice
► Nominal Model-Predictive Control
► Real-Time Optimization of Industrial Processes
Information and Communication Complexity of Networked Control Systems

Abstract

Information and communication complexity of 
a networked control system identifies the minimum
amount of information exchange needed
between the decision makers (such as encoders, 
controllers, and actuators) to achieve a certain 
objective, which may be in terms of reaching a 
target state or achieving a given cost threshold. 
This formulation does not impose any constraints 
on the computational requirements to perform the 
communication or control. Both stochastic and 
deterministic formulations are considered. 


Keywords 
Introduction

Consider a dynamic team problem with L control 
stations (these will be referred to as decision 
makers and denoted by DMs) under the following 
dynamics and measurement equations: 

$$x_{t+1} = f_t(x_t, u_t^1, \ldots, u_t^L, w_t), \quad t = 0, 1, \ldots \tag{1}$$

$$y_t^i = g_t^i(x_t, u_{t-1}^1, \ldots, u_{t-1}^L, v_t^i), \tag{2}$$

where $i \in \{1, 2, \ldots, L\} =: \mathcal{L}$ and $x_0$, $w_{[0,T-1]}$,
$v^i_{[0,T-1]}$ are mutually independent random
variables with specified probability distributions.
Here, we use the notation $w_{[0,t]} := \{w_s, 0 \le s \le t\}$.

Information and Communication Complexity of Networked Control Systems, Fig. 1 A decentralized networked control system with information exchange between decision makers

The DMs are allowed to exchange limited information:
see Fig. 1. The information exchange
is facilitated by an encoding protocol $\mathcal{E}$, which is
a collection of admissible encoding functions described
as follows. Let the information available
to DM $i$ at time $t$ be

$$I_t^i = \{ y^i_{[1,t]}, u^i_{[1,t-1]}, z^{i,j}_{[0,t]}, z^{j,i}_{[0,t]}, j \in \mathcal{L} \},$$

where $z_t^{i,j}$ takes values in $\mathcal{Z}_t^{i,j}$ and is the information
variable transmitted from DM $i$ to DM $j$ at
time $t$ generated with

$$z_t^i = \{ z_t^{i,j}, j \in \mathcal{L} \} = \mathcal{E}_t^i(I_{t-1}^i, u_{t-1}^i, y_t^i), \tag{3}$$

and for $t = 0$, $z_0^i = \{ z_0^{i,j}, j \in \mathcal{L} \} = \mathcal{E}_0^i(y_0^i)$. The
control actions are generated with

$$u_t^i = \gamma_t^i(I_t^i),$$

for all DMs. Define $\log_2(|\mathcal{Z}_t^{i,j}|)$ to be the communication
rate from DM $i$ to DM $j$ at time $t$
and $R(z_{[0,T-1]}) = \sum_{t=0}^{T-1} \sum_{i,j \in \mathcal{L}} \log_2(|\mathcal{Z}_t^{i,j}|)$ to
be the (total) communication rate. The minimum
(total) communication rate over all coding and
control policies subject to a design objective
is called the communication complexity for this
objective.

The above is a fixed-rate formulation for communication
complexity, since for any two coder
outputs, a fixed number of bits is used at any
given time. One could also use variable-rate formulations.
The variable-rate formulation exploits
the probabilistic distribution of the system variables:
see Cover and Thomas (1991).
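
To make the distinction concrete, the short sketch below computes a fixed-rate total (the sum of log2 of the message alphabet sizes) and, for comparison, the Shannon entropy of one message, which lower-bounds the expected length of any uniquely decodable variable-rate code. The function names, alphabet sizes, and the probability mass function are hypothetical illustration values.

```python
import math

def fixed_rate_bits(alphabet_sizes):
    """Total fixed-rate cost: sum of log2|Z| over all transmitted symbols."""
    return sum(math.log2(size) for size in alphabet_sizes)

def entropy_bits(pmf):
    """Shannon entropy in bits of a message with the given distribution."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

# Three messages (over links and time stages) with alphabets of size 4, 2, 8.
total_rate = fixed_rate_bits([4, 2, 8])

# A single 4-ary message with a skewed distribution: fixed rate is 2 bits,
# while a variable-rate code can average fewer bits.
pmf = [0.5, 0.25, 0.125, 0.125]
fixed = fixed_rate_bits([len(pmf)])
variable_lower_bound = entropy_bits(pmf)
```

The gap between `fixed` and `variable_lower_bound` is what a variable-rate formulation can exploit when the message distribution is far from uniform; for a uniform distribution the two coincide.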


Communication Complexity for 
Decentralized Dynamic Optimization 

Let $\mathcal{E}^i = \{\mathcal{E}_t^i, t \ge 0\}$ and $\gamma^i = \{\gamma_t^i, t \ge 0\}$.
Under a team-encoding policy $\underline{\mathcal{E}} = \{\mathcal{E}^1, \mathcal{E}^2, \ldots, \mathcal{E}^L\}$
and a team-control policy
$\underline{\gamma} = \{\gamma^1, \gamma^2, \ldots, \gamma^L\}$, let the induced cost be

$$E^{\underline{\gamma}, \underline{\mathcal{E}}}\left[ \sum_{t=0}^{T-1} c(x_t, u_t^1, u_t^2, \ldots, u_t^L) \right]. \tag{4}$$

In networked control, the goal is to minimize
(4) over all coding and control policies subject
to information constraints in the system. Let
$u_t = \{u_t^1, u_t^2, \ldots, u_t^L\}$. The following definition
and example are from Yüksel and Başar (2013).

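Definition 1 Given a decentralized control problem
as above, the team cost-rate function $C : \mathbb{R} \to \mathbb{R}$
is

$$C(R) := \inf_{\underline{\gamma}, \underline{\mathcal{E}}} \left\{ E^{\underline{\gamma}, \underline{\mathcal{E}}}\left[ \sum_{t=0}^{T-1} c(x_t, u_t) \right] : \frac{1}{T} R(z_{[0,T-1]}) \le R \right\}.$$

We can define a dual function.

Definition 2 Given a decentralized control problem
as above, the team rate-cost function $R : \mathbb{R} \to \mathbb{R}$
is

$$R(C) := \inf_{\underline{\gamma}, \underline{\mathcal{E}}} \left\{ \frac{1}{T} R(z_{[0,T-1]}) : E^{\underline{\gamma}, \underline{\mathcal{E}}}\left[ \sum_{t=0}^{T-1} c(x_t, u_t) \right] \le C \right\}.$$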

The formulation here can be adjusted to include
sequential (iterative) information exchange
given a fixed ordering of actions, as opposed to a
simultaneous (parallel) information exchange at
any given time $t$. That is, instead of (3), we may
have

$$z_t^i = \{ z_t^{i,j}, j \in \{1, 2, \ldots, L\} \} = \mathcal{E}_t^i(I_{t-1}^i, u_{t-1}^i, y_t^i, \{ z_t^{k,i}, k < i \}). \tag{5}$$

Both to make the discussion more explicit and to
show that a sequential (iterative) communication
protocol may perform strictly better than an optimal
parallel communication protocol given a total
rate constraint, we state the following example:
Consider the following setup with two DMs. Let
$x^1$, $x^2$, $p$ be uniformly distributed binary random
variables, DM $i$ have access to $y^i$, $i = 1, 2$, and

$$x = (p, x^1, x^2), \quad y^1 = p, \quad y^2 = (x^1, x^2),$$

and the cost function be

$$c(x, u^1, u^2) = 1_{\{p=0\}} c(x^1, u^1, u^2) + 1_{\{p=1\}} c(x^2, u^1, u^2),$$

with

$$c(s, u^1, u^2) = (s - u^1)^2 + (s - u^2)^2.$$

Suppose that we wish to compute the minimum
expected cost subject to a total rate of 2 bits that
can be exchanged. Under a sequential scheme, if
we allow DM 1 to encode $y^1$ to DM 2 with 1 bit,
then a cost of 0 is achieved since DM 2 knows the
relevant information that needs to be transmitted
to DM 1, again with 1 bit: If $p = 0$, $x^1$ is the
relevant random variable with an optimal policy
$u^1 = u^2 = x^1$, and if $p = 1$, $x^2$ is relevant with
an optimal policy $u^1 = u^2 = x^2$, and a cost of 0 is
achieved. However, if the information exchange
is parallel, then DM 2 does not know which state
is the relevant one, and it can be shown that a cost
of 0 cannot be achieved under any policy.

The formulation in Definition 1 can also be
adjusted to allow for multiple rounds of communication
per time stage. Having multiple rounds
can enhance the performance for a class of team
problems while keeping the total rate constant.


Communication Complexity
in Decentralized Computation

Yao (1979) initiated the research on communication
complexity in distributed computation. This
may be viewed as a special case of the setting
considered earlier but with finite spaces and in a
deterministic and an error-free context: Consider
two decision makers (DMs) who have access to
local variables $x \in \{0,1\}^n$, $y \in \{0,1\}^n$. Given a
function $f$ of variables $(x, y)$, what is the maximum
(over all input variables $x$, $y$) of the minimum
amount of information exchange needed
for at least one agent to compute the value of
the function? Let $s(x, y) = \{m_1, m_2, \ldots, m_t\}$ be
the communication symbols exchanged on input
$(x, y)$ during the execution of a communication
protocol. Let $m_i$ denote the $i$th binary message
symbol with $|m_i|$ bits. The communication complexity
for such a setup is defined as

$$R(f) = \min_{\underline{\gamma}, \underline{\mathcal{E}}} \; \max_{(x,y) \in \{0,1\}^n \times \{0,1\}^n} |s(x, y)|, \tag{6}$$

where $|s(x, y)| = \sum_{i=1}^{t} |m_i|$ and $\underline{\mathcal{E}}$ is a protocol
which dictates the iterative encoding functions as
in (5) and $\underline{\gamma}$ is a decision policy.

For such problems, obtaining good lower
bounds is in general challenging. One lower
bound for such problems is obtained through
the following reasoning: A subset of the form
$A \times B$, where $A$ and $B$ are subsets of $\{0,1\}^n$,
is called an $f$-monochromatic rectangle if
for every $x \in A$, $y \in B$, $f(x, y)$ is the
same. It can be shown that given any finite
message sequence $\{m_1, m_2, \ldots, m_t\}$, the set
$\{(x, y) : s(x, y) = \{m_1, m_2, \ldots, m_t\}\}$ is an
$f$-monochromatic rectangle. Hence, to minimize
the number of messages, one needs to minimize
the number of $f$-monochromatic rectangles,
which has led to research in this direction.
Upper bounds are typically obtained by explicit 
constructions. For a comprehensive review, see 
Kushilevitz and Nisan (2006). 
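
The rectangle property can be checked by brute force on a small instance. The sketch below uses a hypothetical two-message protocol for the function $f(x, y) = (\text{parity of } x) \oplus (\text{parity of } y)$: DM 1 sends the parity of $x$, and DM 2 replies with the function value. The choice of function, the protocol, and all names are assumptions made for illustration only.

```python
from itertools import product

def f(x, y):
    """Target function: XOR of the parities of x and y."""
    return (sum(x) + sum(y)) % 2

def transcript(x, y):
    """Messages exchanged on input (x, y) under the illustrative protocol."""
    a = sum(x) % 2            # message 1: DM 1 sends the parity of x
    b = (a + sum(y)) % 2      # message 2: DM 2 replies with f(x, y)
    return (a, b)

def transcripts_are_monochromatic_rectangles(n):
    """Check that each transcript's input set is a rectangle A x B on which f is constant."""
    inputs = list(product((0, 1), repeat=n))
    groups = {}
    for x, y in product(inputs, inputs):
        groups.setdefault(transcript(x, y), set()).add((x, y))
    for pairs in groups.values():
        A = {x for x, _ in pairs}
        B = {y for _, y in pairs}
        is_rectangle = pairs == set(product(A, B))
        is_monochromatic = len({f(x, y) for x, y in pairs}) == 1
        if not (is_rectangle and is_monochromatic):
            return False
    return True

ok = transcripts_are_monochromatic_rectangles(3)
num_transcripts = len({transcript(x, y)
                       for x in product((0, 1), repeat=3)
                       for y in product((0, 1), repeat=3)})
```

Here the protocol uses 2 bits on every input and partitions the input space into four rectangles; lower-bound arguments run the same logic in reverse, bounding how few monochromatic rectangles can cover the input space.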

For control systems, the discussion takes further
aspects into account including a design objective,
system dynamics, and the uncertainty in
the system variables.

Communication Complexity in Reach 
Control 

Wong (2009) defines the communication complexity
in networked control as follows: Consider
a design specification where two DMs wish to
steer the state of a dynamical system in finite
time. This can be viewed as a setting in (1)-(2)
with 4 DMs, where there is iterative communication
between a sensor and a DM, and there is
no stochastic noise in the system. Given a set of
initial states $x_0 \in \mathcal{X}_0$, and finite sets of objective
choices for each DM ($\mathcal{A}$ for DM 1, $\mathcal{B}$ for DM 2),
the goal is to ensure that (i) there exists a finite
time where both DMs know the final state of the
system, (ii) the final state satisfies the choices
of the DMs, and (iii) the finite time (when the
objective is satisfied) is known by the DMs.

The communication complexity for such a
system is defined as the infimum over all protocols
of the supremum over the triple of initial
states and choices of the DMs, such that the
above is satisfied. That is,

$$R(\mathcal{X}_0, \mathcal{A}, \mathcal{B}) = \inf_{\underline{\gamma}, \underline{\mathcal{E}}} \; \sup_{\alpha, \beta, x_0} R(\underline{\gamma}, \underline{\mathcal{E}}, \alpha, \beta, x_0),$$

where $R(\underline{\gamma}, \underline{\mathcal{E}}, \alpha, \beta, x_0)$ denotes the communication
rate under the control and coding functions
$\underline{\gamma}, \underline{\mathcal{E}}$, which satisfies the objectives given by the
choices $\alpha, \beta$ and initial condition $x_0$.

Wong obtains a cut-set type lower bound:
Given a fixed initial state, a lower bound is given
by $2D(f)$, where $f$ is a function of the objective
choices and $D(f)$ is a variation of $R(f)$
introduced in (6) with the additional property that
both DMs know $f$ at the end of the protocol. An
upper bound is established by the exchange of the
initial states and objective functions, also taking
into account signaling, that is, the communication
through control actions, which is discussed further
below in the context of stabilization. Wong
and Baillieul (2012) consider a detailed analysis
for a real-valued bilinear controlled decentralized
system.


Connections with Information Theory 

Information theory literature has made significant
contributions to such problems. An information
theoretic setup typically entails settings where an
unboundedly long sequence of messages is encoded
and functions of these messages are to be computed.
Such a setting is not applicable in a real-time setting
but is very useful for obtaining performance
bounds (i.e., good lower bounds on complexity)
which can in certain instances be achievable even
in a real-time setting. That is, instead of a single
realization of random variables in the setup of
(1)-(2), the average performance for a large number
of independent realizations/copies for such
problems is typically considered.

In such a context, Definitions 1 and 2 can
be adjusted so that the communication complexity
is computed by mutual information (Cover
and Thomas 1991). Replacing the fixed-rate or
variable-rate (entropy) constraint in Definition 1
with a mutual information constraint leads to
convexity properties for $C(R)$ and $R(C)$. Such
an information theoretic formulation can provide
useful lower bounds and desirable analytical
properties.

We note here the interesting discussion
of the connection between decentralized computation and
communication provided by Orlitsky and Roche
(2001) as well as by Witsenhausen (1976), where
a probability-free construction is considered and
a zero-error (non-asymptotic and error-free)
computation is considered in the same spirit
as in Yao (1979).

Such decentralized computation problems can
be viewed as multiterminal source coding problems
with a cost function aligned with the computation
objective. Ma and Ishwar (2011) and
El Gamal and Kim (2012) provide a comprehensive
treatment and review of information exchange
requirements for computing. Essential in such
constructions is the method of binning, which is
a key tool in distributed source coding problems.
Binning efficiently designates the enumeration of
symbols (which can be confused in the absence
of coding) given the relevant information at a
receiver DM.

Such problems involve interactive communications
as well as multiterminal coding problems.
As mentioned earlier, it is also important to point 
out that multi-round protocols typically reduce 
the average rate requirements. 
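
The benefit of interaction can be checked by brute force on the two-DM binary example given earlier, with cost $(s - u^1)^2 + (s - u^2)^2$ on the relevant state $s$. The sketch below enumerates every parallel protocol with a total of 2 bits and compares it with the sequential protocol described in the text. It assumes real-valued actions, so that each DM's best action is the conditional mean of $s$ and its cost is the conditional variance; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

OUTCOMES = list(product((0, 1), repeat=3))   # (p, x1, x2), each with prob. 1/8

def relevant(p, x1, x2):
    """The state that enters the cost: x1 if p = 0, x2 if p = 1."""
    return x1 if p == 0 else x2

def mmse(info):
    """E[(s - E[s | info])^2]: cost of a DM acting on info(outcome)."""
    groups = {}
    for o in OUTCOMES:
        groups.setdefault(info(o), []).append(relevant(*o))
    total = Fraction(0)
    for vals in groups.values():
        mean = Fraction(sum(vals), len(vals))
        total += sum((v - mean) ** 2 for v in vals)
    return total / len(OUTCOMES)

def parallel_min_cost(total_bits=2):
    """Best cost when both DMs encode simultaneously from their own observations."""
    best = None
    for bits12 in range(total_bits + 1):
        a12, a21 = 2 ** bits12, 2 ** (total_bits - bits12)
        for enc1 in product(range(a12), repeat=2):       # table: p -> message
            for enc2 in product(range(a21), repeat=4):   # table: (x1, x2) -> message
                cost = (mmse(lambda o: (o[0], enc2[2 * o[1] + o[2]]))   # DM 1
                        + mmse(lambda o: (o[1], o[2], enc1[o[0]])))     # DM 2
                best = cost if best is None else min(best, cost)
    return best

def sequential_cost():
    """DM 1 sends p (1 bit); DM 2 then sends the relevant state (1 bit)."""
    return (mmse(lambda o: (o[0], relevant(*o)))     # DM 1 knows p and the reply
            + mmse(lambda o: (o[1], o[2], o[0])))    # DM 2 knows x1, x2 and p

seq = sequential_cost()
par = parallel_min_cost()
```

Under these assumptions the sequential protocol attains zero cost while every parallel protocol with the same 2-bit budget leaves a strictly positive cost, which is the strict gap claimed for this example.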


Communication Complexity 
in Decentralized Stabilization 

An important relevant setting of reach control is
where the target final state is the zero vector: The
system is to be stabilized. Consider the following
special case of (1)-(2) for an LTI system:

$$x_{t+1} = A x_t + \sum_{j=1}^{L} B^j u_t^j, \qquad y_t^i = C^i x_t, \quad t = 0, 1, \ldots \tag{7}$$

where $i \in \mathcal{L}$, and it is assumed that the joint
system is stabilizable and detectable, but the
individual pairs $(A, B^i)$ may not be stabilizable
or $(A, C^i)$ may not be detectable. Here, $x_t \in \mathbb{R}^n$
is the state, $u_t^i \in \mathbb{R}^{m_i}$ is the control applied
by station $i$, and $y_t^i \in \mathbb{R}^{p_i}$ is the observation
available at station $i$, all at time $t$. The initial
state $x_0$ is generated according to a probability
distribution supported on a compact set $\mathcal{X}_0 \subset \mathbb{R}^n$.
We denote controllable and unobservable
subspaces at station $i$ by $K^i$ and $N^i$ and refer to
the subspace orthogonal to $N^i$ as the observable
subspace at the $i$th station, denoted by $O^i$. The
information available to station $i$ at time $t$ is
$I_t^i = \{ y^i_{[0,t]}, u^i_{[0,t-1]} \}$.

Information and Communication Complexity of Networked Control Systems, Fig. 2 Decentralized stabilization with multiple controllers

For such a system (see Fig. 2),
it is possible for the controllers to communicate
through the plant with the process known as
signaling, which can be used for communication
of mode information among the decision makers.
Denote by $i \to j$ the property that DM $i$
can signal to DM $j$. This holds if and only if
$C^j A^l B^i \ne 0$ for at least one $l$, $1 \le l \le n$.
A directed graph $\mathcal{G}$ among the $L$ stations can
be constructed through such a communication
relationship.
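
The signaling condition can be tested numerically and used to build the edge list of the directed graph. The sketch below applies the condition exactly as stated above (the range of $l$ is taken from the text); the matrices, station numbering, and function names are hypothetical, and with floating-point data the exact comparison with zero would need a tolerance.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def can_signal(A, B_i, C_j):
    """True if C^j A^l B^i is nonzero for some l in 1..n (n = state dimension)."""
    n = len(A)
    M = B_i
    for _ in range(n):
        M = matmul(A, M)   # M = A^l B^i for l = 1, ..., n
        if any(v != 0 for row in matmul(C_j, M) for v in row):
            return True
    return False

# Hypothetical two-station system with integer entries.
A = [[2, 0],
     [1, 3]]
B = {1: [[1], [0]], 2: [[0], [1]]}   # input matrices B^1, B^2
C = {1: [[1, 0]],   2: [[0, 1]]}     # output matrices C^1, C^2

edges = [(i, j) for i in B for j in C
         if i != j and can_signal(A, B[i], C[j])]
```

In this example station 1 can influence what station 2 observes, but not the other way around, so the signaling graph has a single directed edge.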

Suppose that $A$ is such that, in its Jordan
form, each Jordan block admits distinct
real eigenvalues. Then, a lower bound on the
communication complexity (per time stage) for
stabilizability is given by $\sum_{|\lambda_i| > 1} \eta_{M_i} \log_2(|\lambda_i|)$,
where

$$\eta_{M_i} = \min \{ d(l, m) + 1 : l \to m, \; [x^i] \subset O^l \cup O^m, \; [x^i] \subset K^m \},$$

with $d(l, m)$ denoting the graph distance (number
of edges in a shortest path) between DM
$l$ and DM $m$ in $\mathcal{G}$ and $[x^i]$ denoting the subspace
spanned by $x^i$. Furthermore, there exist
stabilizing coding and control policies whose
sum rate is arbitrarily close to this bound. When
different Jordan blocks may admit repeated and
possibly complex eigenvalues, variations of the
result above are applicable. In the special case
where there is a centralized controller which receives
information from multiple sensors (under
stabilizability and joint detectability), even in the
presence of noise, to achieve asymptotic stability,
it suffices to have the average total rate be greater
than $\sum_{|\lambda_i| > 1} \log_2(|\lambda_i|)$. The results above follow
from Matveev and Savkin (2008) and Yüksel and
Başar (2013). For the case with a single sensor,
this result has been studied extensively in networked
control (see the chapter on ► Quantized
Control and Data Rate Constraints in the Encyclopedia).
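
For the centralized case, the rate threshold is a direct function of the unstable eigenvalues. The short sketch below evaluates $\sum_{|\lambda_i| > 1} \log_2(|\lambda_i|)$ for a list of eigenvalues supplied by the user; the function name and the example spectrum are illustrative assumptions.

```python
import math

def unstable_rate_threshold(eigenvalues):
    """Sum of log2|lambda| over eigenvalues with |lambda| > 1, in bits per time stage."""
    return sum(math.log2(abs(lam)) for lam in eigenvalues if abs(lam) > 1)

# Hypothetical spectrum: two unstable modes (2 and 4) and one stable mode (0.5).
threshold = unstable_rate_threshold([2.0, 0.5, 4.0])
```

Only the unstable modes contribute: each contributes the number of bits per time stage by which the uncertainty along that mode grows, so stable modes require no communication in this count.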

Summary and Future Directions 

In this text, we discussed the problem of 
communication complexity in networked control 
systems. Our analysis considered both cost 
minimization and controllability/reachability 
problems subject to information constraints. We 
also discussed the communication complexity 
in distributed computing as has been studied in 
the computer science community and provided 
a brief discussion on the information theoretic 
approaches for such problems together with 
structural results. There are many relevant 
open problems on structural results for optimal 
policies, explicit solutions, as well as nontrivial 
upper and lower bounds on the optimal 
performance. 

Cross-References 

► Data Rate of Nonlinear Control Systems and 
Feedback Entropy 

► Flocking in Networked Systems 

► Information-Based Multi-Agent Systems 


► Networked Control Systems: Estimation and 
Control over Lossy Networks 

► Quantized Control and Data Rate Constraints 

Recommended Reading 

The information exchange requirements for 
decentralized optimization depend also on the 
structural properties of the cost functional to 
be minimized. For a class of team problems, 
one might simply need to exchange a sufficient 
statistic needed for optimal solutions. For some 
problems, there may be no need for an exchange 
at all, if the sufficient statistics are already 
available, as in the case of mean field equilibrium 
problems when the number of decision makers 
is unbounded or very large for almost optimal 
solutions; see Huang et al. (2006) and Lasry 
and Lions (2007). In case there is no common 
probabilistic information, the problem considered 
becomes further involved. The consensus 
literature, under both Bayesian and non-Bayesian 
contexts, aims to achieve agreement on a class of 
system variables under information constraints: 
see, e.g., Tsitsiklis et al. (1986). Optimization 
under local interaction and sparsity constraints 
and various criteria have been investigated in 
a number of publications including Rotkowitz 
and Lall (2006). A review of the literature
on norm-optimal control as well as optimal
stochastic dynamic teams is provided in Mahajan
et al. (2012). Tsitsiklis and Athans (1985) have
observed that from a computational complexity
viewpoint, obtaining optimal solutions for a class
of such communication protocol design problems
is intractable (NP-hard).

Even though obtaining explicit solutions for
optimal coding and control results may be difficult,
it is useful to obtain structural results on
optimal coding and control policies since one
can reduce the search space to a smaller class of
functions. For dynamic team problems, these typically
follow from the construction of a controlled
Markov chain (see Walrand and Varaiya 1983)
and the application of tools from stochastic control theory
to obtain structural results on optimal coding
and control policies (see Nayyar et al. 2013).
Along these lines, for system (1)-(2), if the DMs
can agree on a joint belief $P(x_t \in \cdot \mid I_t^i, i \in \mathcal{L})$
at every time stage, then the optimal cost that
would be achieved under a centralized system
could be attained (see Yüksel and Başar 2013).
As a further important illustrative case, if the
problem described in Definition 1 is for a real-time
estimation problem for a Markov source,
then the optimal causal fixed-rate coder minimizing
any cost function uses only the last source
symbol and the information at the controller’s
memory: see Witsenhausen (1979). We also note
that the optimal design of information channels
for optimization under information constraints
is a non-convex problem; see Yüksel and Linder
(2012) and Yüksel and Başar (2013) for a
review of the literature and certain topological
properties of the problem. We refer the reader
to Nemirovsky and Yudin (1983) for a comprehensive
resource on information complexity
for optimization problems. A sequential setting
with an information theoretic approach to the
formulation of communication complexity has
been considered in Raginsky and Rakhlin (2011).
A formulation relevant to the one in Definition 1
has been considered in Teneketzis (1979) with
mutual information constraints. Giridhar and Kumar
(2006) discuss distributed computation for a
class of symmetric functions under information
constraints and present a comprehensive review.


Bibliography 

Cover TM, Thomas JA (1991) Elements of information theory. Wiley, New York
El Gamal A, Kim YH (2012) Network information theory. Cambridge University Press, UK
Giridhar A, Kumar P (2006) Toward a theory of in-network computation in wireless sensor networks. IEEE Commun Mag 44:98-107
Huang M, Caines PE, Malhamé RP (2006) Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Commun Inf Syst 6:221-251
Kushilevitz E, Nisan N (2006) Communication complexity, 2nd edn. Cambridge University Press, New York
Lasry JM, Lions PL (2007) Mean field games. Jpn J Math 2:229-260
Ma N, Ishwar P (2011) Some results on distributed source coding for interactive function computation. IEEE Trans Inf Theory 57:6180-6195
Mahajan A, Martins N, Rotkowitz M, Yüksel S (2012) Information structures in optimal decentralized control. In: IEEE conference on decision and control, Hawaii
Matveev AS, Savkin AV (2008) Estimation and control over communication networks. Birkhäuser, Boston
Nayyar A, Mahajan A, Teneketzis D (2013) The common-information approach to decentralized stochastic control. In: Como G, Bernhardsson B, Rantzer A (eds) Information and control in networks. Springer International Publishing, Switzerland
Nemirovsky A, Yudin D (1983) Problem complexity and method efficiency in optimization. Wiley-Interscience, New York
Orlitsky A, Roche JR (2001) Coding for computing. IEEE Trans Inf Theory 47:903-917
Raginsky M, Rakhlin A (2011) Information-based complexity, feedback and dynamics in convex programming. IEEE Trans Inf Theory 57:7036-7056
Rotkowitz M, Lall S (2006) A characterization of convex problems in decentralized control. IEEE Trans Autom Control 51:274-286
Teneketzis D (1979) Communication in decentralized control. PhD dissertation, MIT
Tsitsiklis J, Athans M (1985) On the complexity of decentralized decision making and detection problems. IEEE Trans Autom Control 30:440-446
Tsitsiklis J, Bertsekas D, Athans M (1986) Distributed asynchronous deterministic and stochastic gradient optimization algorithms. IEEE Trans Autom Control 31:803-812
Walrand JC, Varaiya P (1983) Optimal causal coding-decoding problems. IEEE Trans Inf Theory 29:814-820
Witsenhausen HS (1976) The zero-error side information problem and chromatic numbers. IEEE Trans Inf Theory 22:592-593
Witsenhausen HS (1979) On the structure of real-time source coders. Bell Syst Tech J 58:1437-1451
Wong WS (2009) Control communication complexity of distributed control systems. SIAM J Control Optim 48:1722-1742
Wong WS, Baillieul J (2012) Control communication complexity of distributed actions. IEEE Trans Autom Control 57:2731-2345
Yao ACC (1979) Some complexity questions related to distributive computing. In: Proceedings of the 11th annual ACM symposium on theory of computing, Atlanta
Yüksel S, Başar T (2013) Stochastic networked control systems: stabilization and optimization under information constraints. Birkhäuser, Boston
Yüksel S, Linder T (2012) Optimization and convergence of observation channels in stochastic control. SIAM J Control Optim 50:864-887





Information Structures, the 
Witsenhausen Counterexample, 
and Communicating Using Actions 

Pulkit Grover 

Carnegie Mellon University, Pittsburgh, 

PA, USA 

Abstract 

The concept of “information structures” in 
decentralized control is a formalization of the 
notion of “who knows what and when do they 
know it.” Even seemingly simple problems with 
simply stated information structures can be 
extremely hard to solve. Perhaps the simplest
such unsolved problem is the celebrated
Witsenhausen counterexample, formulated 
by Hans Witsenhausen in 1968. This entry 
discusses how the information structure of the 
Witsenhausen counterexample makes it hard 
and how an information-theoretic approach, 
which explores the knowledge gradient due to 
a nonclassical information pattern, has helped 
obtain insights into the problem. 


Keywords 

Decentralized control; Information theory; Im¬ 
plicit communication; Team decision theory 


Introduction 

Modern control systems often comprise multiple
decentralized control agents that interact over
communication channels (Fig. 1). What characteristic
distinguishes a centralized control problem
from a decentralized one? One fundamental
difference is a “knowledge gradient”: agents in
a decentralized team often observe, and hence
know, different things. This observation leads to
the idea of information patterns (Witsenhausen
1971), a formalization of the notion of “who
knows what and when do they know it” (Ho et al.
1978; Mitter and Sahai 1999).

The information pattern is said to be classical if all agents in the team receive the same information and have perfect recall (so they do not forget it). What is so special about classical information patterns? For these patterns, the presence of external communication links has no effect on the optimal costs! After all, what could the agents use the communication links for, when there is no knowledge gradient? More interesting, therefore, are the problems for which the information pattern is nonclassical. These problems sit at the intersection of communication and control: communication between agents can help reduce the knowledge differential that exists between them, helping them perform the control task. Intellectually and practically, the concept of nonclassical information patterns motivates many formulations at the control-communication intersection. Many of these formulations - including some discussed in this Encyclopedia (e.g., ► Data Rate of Nonlinear Control Systems and Feedback Entropy; ► Information and Communication Complexity of Networked Control Systems; ► Quantized Control and Data Rate Constraints; ► Networked Control Systems: Architecture and Stability Issues; and ► Networked Control Systems: Estimation and Control Over Lossy Networks) - ask the question: for a realistic channel that is constrained by noise, bandwidth, and speed, what is the optimal communication and control strategy?

One could ask the question of optimal control 
strategy even for decentralized control problems 
where no external channel is available to bridge 
this knowledge gradient. Why could these 
problems be of interest? First, these problems 
are limiting cases of control with communication
constraints. Second, and perhaps more
importantly, they bring out an interesting 
possibility that can allow the agents to 
“communicate,” i.e., exchange information, even 
when the external channel is absent. It is possible 
to use control actions to communicate through 
changing the system state! We now introduce 
this form of communication using a simple toy 
example. 
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Information Structures, the Witsenhausen Coun¬ 
terexample, and Communicating Using Actions, 
Fig. 1 The evolution of control systems. Modern “networked control systems” (also called “cyber-physical systems”) are decentralized and networked using communication channels


Communicating Using Actions: An Example

To gain intuition into when communication using actions could be useful, consider the inverted pendulum example shown in Fig. 2. The goal of the two agents is to bring the pendulum as close to the origin as possible. Both controllers have their strengths and weaknesses. The “weak” controller $C_w$ has little energy, but has perfect state observations. On the other hand, the “blurry” controller $C_b$ has infinite energy, but noisy observations. They act one after the other, and their goal is to move the pendulum close to the center from any initial state. The information structure of the problem is nonclassical: $C_w$, but not $C_b$, knows the initial state of the pendulum, and $C_w$ does not know the precise (noisy) observation on which $C_b$ bases its actions.

A possible strategy: A little thought reveals an interesting strategy - the weak controller, having perfect observations, can move the state to the closest of some predecided points in space, effectively quantizing the state. If these quantization points are sufficiently far from each other, they can be estimated accurately (through the noise) by the blurry controller, which can then use its energy to push the pendulum all the way to zero. In this way, the weak controller expends little energy, but is able to “communicate” the state through the noise to the blurry controller, by making it take values on a finite set. Once the blurry controller has received the state through the noise, it can use its infinite energy to push the state to zero.

Information Structures, the Witsenhausen Counterexample, and Communicating Using Actions, Fig. 2 Two controllers, with their respective strengths and weaknesses, attempting to bring an inverted pendulum close to the center. Also shown (using green “+” signs) are possible quantization points chosen by the controllers for a quantization-based control strategy

The Witsenhausen Counterexample 

The above two-controller inverted-pendulum example is, in fact, motivated by what is now known as “the Witsenhausen counterexample,” formulated by Witsenhausen in 1968 (see Fig. 3).

Information Structures, the Witsenhausen Counterexample, and Communicating Using Actions, Fig. 3 The Witsenhausen counterexample is a deceptively simple two-time-step two-controller decentralized control problem. The weak and the blurry controllers, $C_w$ and $C_b$, act in a sequential manner. Objective shown in the figure: $\min \{ k^2 E[u_w^2] + E[x_2^2] \}$

In the counterexample, two controllers (denoted here by $C_w$ for “weak” and $C_b$ for “blurry”) act one after the other in two time-steps to minimize a quadratic cost function. The system state is denoted by $x_t$, where $t$ is the time index. $u_w$ and $u_b$ denote the inputs generated by the two controllers. The cost function is $k^2 E[u_w^2] + E[x_2^2]$ for some constant $k$. The initial state $x_0$ and the noise $z$ at the input of the blurry controller are assumed to be Gaussian distributed and independent, with variances $\sigma_0^2$ and 1, respectively. The problem is a “linear-quadratic-Gaussian” (LQG) problem, i.e., the state evolution is linear, the costs are quadratic, and the primitive random variables are Gaussian.

Why is the problem called a “counterexample”?
The traditional “certainty-equivalence”
principle (Bertsekas 1995) shows that for all 
centralized LQG problems, linear control laws 
are optimal. Witsenhausen (1968) provided 
a nonlinear strategy for the Witsenhausen 
problem which outperforms all linear strategies. 
Thus, the counterexample showed that the 
certainty-equivalence doctrine does not extend 
to decentralized control. 

What is this strategy of Witsenhausen that outperforms all linear strategies? It is, in fact, a quantization-based strategy, as suggested in our inverted-pendulum story above. Further, it was shown by Mitter and Sahai (1999) that multipoint quantization strategies can outperform linear strategies by an arbitrarily large factor! This observation, combined with the simplicity of the counterexample, makes the problem very important in decentralized control. This simple two-time-step two-controller LQG problem needs to be understood to have any hope of understanding larger and more complex problems.

Information Structures, the Witsenhausen Counterexample, and Communicating Using Actions, Fig. 4 The optimization solution of Baglietto et al. (1997) for $k^2 = 0.5$, $\sigma_0^2 = 5$. The information-theoretic strategy of “dirty-paper coding” (Costa 1983) also yields the same curve (Grover and Sahai 2010)
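The gap between linear and quantization-based strategies can be checked numerically. The Monte Carlo sketch below is illustrative only. It assumes the commonly used dynamics $x_1 = x_0 + u_w$, $y = x_1 + z$, $x_2 = x_1 - u_b$ (the entry shows the system only in Fig. 3), and the values $k = 0.2$, $\sigma_0 = 5$, the bin width, and all function names are choices made for this sketch rather than values taken from the entry. The uniform quantizer with nearest-point decoding is one simple instance of a quantization-based strategy, not Witsenhausen's original construction.

```python
import random


def linear_cost(k, sigma0, lam):
    """Closed-form cost when C_w sets x1 = lam * x0 and C_b uses the linear MMSE estimate."""
    var_x1 = (lam * sigma0) ** 2
    stage1 = k ** 2 * (lam - 1.0) ** 2 * sigma0 ** 2  # k^2 E[u_w^2], with u_w = (lam - 1) x0
    stage2 = var_x1 / (var_x1 + 1.0)                  # MMSE of x1 given y = x1 + z, z ~ N(0, 1)
    return stage1 + stage2


def best_linear_cost(k, sigma0, grid=20001):
    """Grid search over lam in [0, 1]; values outside this range only increase the cost."""
    return min(linear_cost(k, sigma0, i / (grid - 1)) for i in range(grid))


def quantization_cost(k, sigma0, bin_width, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the cost of a uniform-quantization strategy."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x0 = rng.gauss(0.0, sigma0)
        x1 = bin_width * round(x0 / bin_width)  # C_w moves the state to the nearest grid point
        u_w = x1 - x0
        y = x1 + rng.gauss(0.0, 1.0)            # noisy observation available to C_b
        u_b = bin_width * round(y / bin_width)  # C_b decodes the nearest grid point
        x2 = x1 - u_b
        total += k ** 2 * u_w ** 2 + x2 ** 2
    return total / n_samples


# Hypothetical parameter values chosen for this sketch.
K, SIGMA0 = 0.2, 5.0
lin = best_linear_cost(K, SIGMA0)
quant = quantization_cost(K, SIGMA0, bin_width=7.0)
print(f"best linear cost ~ {lin:.3f}, quantization cost ~ {quant:.3f}")
```

For these parameter values the quantization strategy should come out well below the best linear strategy, because widely spaced grid points are rarely confused by unit-variance noise while the weak controller only pays for a small correction.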

While the optimal costs for the problem are still unknown (even though it is known that an optimal strategy exists (Witsenhausen 1968; Wu and Verdú 2011)), there exists a wealth of understanding of the counterexample that has helped address more complicated problems. A body of work, starting with that of Baglietto et al. (1997), numerically obtained solutions that could be close to optimal (although there is no mathematical proof thereof). All these solutions have a consistent form (illustrated in Fig. 4), with slight improvements in the achieved cost. Because the discrete version of the problem, appropriately relaxed, is known to be NP-complete (Papadimitriou and Tsitsiklis 1986), this approach cannot be used to understand the entire parameter space and hence has focused on one point: $k^2 = 0.5$, $\sigma_0^2 = 5$. Nevertheless, the approach has proven to be insightful: a recent information-theoretic body of work shows that the strategies of Fig. 4 can be thought of as information-theoretic strategies of “dirty-paper coding” (Costa 1983), which is related to the idea of embedding information in the state. The question here is: how do we embed the information about the state in the state itself?

An information-theoretic view of the counterexample: This information-theoretic approach, which culminated in Grover et al. (2013), also obtained the first approximately optimal solutions to the Witsenhausen counterexample as well as its vector extensions. The result is established by analyzing information flows in the counterexample that work toward minimizing the knowledge gradient, effectively yielding an information pattern in which $C_w$ can predict the observation of $C_b$ more precisely. The analysis provides an information-theoretic lower bound on cost that holds irrespective of what strategy is used. For the original problem, this characterizes the optimal costs (with associated strategies) within a factor of 8 for all problem parameters (i.e., $k$ and $\sigma_0^2$). For any finite-length extension, uniform finite-ratio approximations also exist (Grover et al. 2013). The asymptotically infinite-length extension has been solved exactly (Choudhuri and Mitra 2012).

The problem has also driven delineation of decentralized LQG control problems with optimal linear solutions and those with nonlinear optimal solutions. This led to the development and understanding of many variations of the counterexample (Bansal and Başar 1987; Başar 2008; Ho et al. 1978; Rotkowitz 2006) and to understanding that can extend to larger decentralized control problems. More recent work shows that the promise of the Witsenhausen counterexample was not a misplaced one: the information-theoretic approach that provides approximately optimal solutions to the counterexample (Grover et al. 2013) yields solutions to other more complex (e.g., multi-controller, more time-steps) problems as well (Grover 2010; Park and Sahai 2012).


Summary and Future Directions 

Even simple problems with nonclassical information structures can be hard to solve using classical techniques, as is demonstrated by the Witsenhausen counterexample. However, the nonclassical information patterns of some simple problems - starting with the counterexample - have recently been explored via an information-theoretic lens, yielding the first optimal or approximately optimal solutions to these problems. This approach is promising for larger decentralized control problems as well. It is now important to explore what is the simplest decentralized control problem that cannot be solved (exactly or approximately) using ideas developed for the counterexample. In this manner, the Witsenhausen counterexample can provide a platform to unify the more modern, external-channel centric approaches (see ► Quantized Control and Data Rate Constraints; ► Data Rate of Nonlinear Control Systems and Feedback Entropy; ► Networked Control Systems: Architecture and Stability Issues; ► Networked Control Systems: Estimation and Control Over Lossy Networks; ► Information and Communication Complexity of Networked Control Systems; in the encyclopedia) with the more classical decentralized LQG problems, leading to enriching and useful formulations.
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Abstract 

Multi-agent systems are encountered in nature (animal groups), in various domains of technology (multi-robot networks, mixed robot-human teams), and in various human activities (such as dance and team athletics). Information exchange among agents ranges from being incidentally important to crucial in such systems. Several systems in which information exchange among the agents is either a primary goal or a primary enabler are discussed briefly. Specific topics include power management in wireless communication networks, data-rate constraints, the complexity of distributed control, robotics networks and formation control, action-mediated communication, and multi-objective distributed systems.

Keywords 

Distributed control; Information constraints; 
Multi-agent systems 

Introduction 

The role of information patterns in the decentralized control of multi-agent systems has been studied in different theoretical contexts for more than five decades. The paper Ho (1972) provides references to early work in this area. While research on distributed decision making has continued, a large body of recent research on robotic networks has brought new dimensions of geometric aspects of information patterns to the forefront (Bullo et al. 2009). At the same time, machine intelligence, machine learning, machine autonomy, and theories of operation of mixed teams of humans and robots have considerably extended the intellectual frontiers of information-based multi-agent systems (Baillieul et al. 2012). A further important development has been the study of action-mediated communication and the recently articulated theory of control communication complexity (Wong and Baillieul 2012). These developments may shed light on nonverbal forms of communication among biological organisms (including humans) and on the intrinsic energy requirements of information processing.

In conventional decentralized control, the control objective is usually well-defined and known to all agents. Multi-agent information-based control encompasses a broader scenario, where the objective can be agent dependent and is not necessarily explicitly announced to all. For illustration, consider the power control problem in wireless communication - one of the earliest engineering systems that can be regarded as multi-agent information based. It is common that multiple transmitter-receiver communication pairs share the same radio frequency band and the transmission signals interfere with each other. The power control problem searches for a feedback control for each transmitter to set its power level. The goal is for each transmitter to achieve a targeted signal-to-interference ratio (SIR) level by using information about the observed levels at the intended receiver only.

A popular version of the power control problem (Foschini and Miljanic 1993) defines each individual objective target level by means of a requirement threshold, known only to the intended transmitter. As SIR measurements naturally reside on a receiver, the observed SIR needs to be communicated back to the transmitter. For obvious reasons, the bandwidth for such communication is limited. The resulting model fits the bill of multi-agent information-based control. In Sung and Wong (1999), a tristate power control strategy is proposed in which each power control output is either increased by a fixed number of dB, decreased by a fixed number of dB, or left unchanged. Convergence of the feedback algorithm was shown using a Lyapunov-like function.
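A distributed SIR-tracking loop of the kind described above can be sketched in a few lines. The update used here, in which each transmitter rescales its own power by the ratio of its target to its measured SIR, is a discrete-time rule in the spirit of Foschini and Miljanic (1993); it is not the tristate scheme of Sung and Wong (1999). The gain matrix, noise level, targets, and function names are hypothetical values chosen for illustration, and the SIR here includes receiver noise in the denominator.

```python
def sir(gains, powers, noise, i):
    """Signal-to-interference(-plus-noise) ratio observed at receiver i."""
    interference = sum(gains[i][j] * powers[j] for j in range(len(powers)) if j != i)
    return gains[i][i] * powers[i] / (interference + noise)


def power_control_step(gains, powers, noise, targets):
    """Synchronous update: each transmitter uses only its own target and its own measured SIR."""
    return [powers[i] * targets[i] / sir(gains, powers, noise, i)
            for i in range(len(powers))]


def run_power_control(gains, noise, targets, iterations=200):
    """Iterate the update from unit initial powers."""
    powers = [1.0] * len(targets)
    for _ in range(iterations):
        powers = power_control_step(gains, powers, noise, targets)
    return powers


# Hypothetical three-link example: strong direct gains, weak cross gains.
GAINS = [[1.0, 0.1, 0.1],
         [0.1, 1.0, 0.1],
         [0.1, 0.1, 1.0]]
NOISE = 0.01
TARGETS = [2.0, 2.0, 2.0]

powers = run_power_control(GAINS, NOISE, TARGETS)
sirs = [sir(GAINS, powers, NOISE, i) for i in range(len(TARGETS))]
print(powers, sirs)
```

No transmitter needs to know another transmitter's target or power; the coupling enters only through the interference each receiver measures. The iteration settles only when the targets are jointly feasible, which holds for the weak cross gains assumed here.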

This entry surveys key topics related to multi-agent information-based control systems, including control complexity, control with data-rate constraints, robotic networks and formation control, action-mediated communication, and multi-objective distributed systems.

Control Complexity 

In information-based distributed control systems, how to efficiently share computational and communication resources is a fundamental issue. One of the earliest investigations on how to schedule communication resources to support a network of sensors and actuators is discussed in Brockett (1995). The concept of communication sequencing was introduced to describe how the communication channel is utilized to convey feedback control information in a network consisting of interacting subsystems. In Brockett (1997), the concept of control attention was introduced to provide a measure of the complexity of a control law against its performance. As attention is a shared, limited resource, the goal is to find minimum attention control. Another approach to gauge control complexity in a distributed system is by means of the minimum amount of communicated data required to accomplish a given control task.


Control with Data-Rate Constraints 

A fundamental challenge in any control implementation in which system components communicate with each other over communication links is ensuring that the channel capacity is large enough to deal with the fastest time constants among the system components. In a single-agent system, the so-called Data-Rate Theorem has been formulated in various ways to understand the constraints imposed between the sensor and the controller and between the controller and the actuator. Extensions to this fundamental result have been focused on addressing similar problems in the network control system context. Information on such extensions in the distributed control setting can be found in Nair and Evans (2004) and Yüksel and Başar (2007).

Robotic Networks and Formation 
Control 

The defining characteristic of robotic networks within the larger class of multi-agent systems is the centrality of spatial relationships among network nodes. Graph theory has been shown to provide a generally convenient mathematical language in which to describe spatial concepts, and it is the key to understanding spatial rigidity related to the control of formations of autonomous vehicles (Anderson et al. 2008), or in flocking systems (Leonard et al. 2012), or in consensus problems (Su and Huang 2012), or in rendezvous problems (Cortés et al. 2006). For these distributed control research topics, readers can consult other sections in this Encyclopedia for a comprehensive reference list.

Much of the recent work on formation 
control has included information limitation 
considerations. For consensus problems, for 
example, Olfati-Saber and Murray (2004) 
introduced a sensing cost constraint, and in 
Ren and Beard (2005) information exchange 
constraints are considered, and in Yu and Wang 
(2010) communication delays are explicitly 
modeled. 

Action-Mediated Communication 

Biological organisms communicate through motion. Examples of this include prides of lions or packs of wolves whose pursuit of prey is a cooperative effort and competitive team athletics in the case of humans. Recent research has been aimed at developing a theoretical foundation of action-mediated communication. Communication protocols for motion-based signaling between mobile robots have been developed (Raghunathan and Baillieul 2009) and preliminary steps towards a theory of artistic expression through controlled movements in dance have been reported in Baillieul and Ozcimder (2012). Motion-based communication of this type involves specially tailored motion description languages in which sequences of motion primitives are assembled with the objective of conveying artistic intent, while minimizing the use of limited energy resources in carrying out the movement. These motion primitives constitute the alphabet that enables communication, and physical constraints on the motions define the grammatical rules that govern the ways in which motion sequences may be constructed.

Research on action-mediated communication helps illustrate the close connection between control and information theory. Further discussion of the deep connection between the two can be found, for example, in Park and Sahai (2011), which argues for the equivalence between the stabilization of a distributed linear system and the capacity characterization in linear network coding.

Multi-objective Distributed Systems

In a multi-agent system, agents may aim to carry out individual objectives. These objectives can either be cooperatively aligned (such as in a cooperative control setting) or may contend antagonistically (such as in a zero-sum game setting). In either case, a common assumption is that the objective functions are a priori known to all agents. However, in many practical applications, agents do not know the objectives of other agents, at least not precisely. For example, in the power control problem alluded to earlier, the signal-to-interference requirement of a user may be unknown to other users. Yet this does not prevent the possibility of deriving convergence algorithms to allow the joint goals to be achieved.

The issue of unknown objectives in a multi-agent system is formally analyzed in Wong (2009) via the introduction of choice-based actions. In an open access network, objectives of an individual agent may be known only partially, in the form of a random distribution in some cases. In order to achieve a joint control objective in general, some communication via the system is required if there is no side communications channel. A basic issue is how to measure the minimum amount of information exchange that is required to perform a specific control task. Motivated by the idea of communication complexity in computer science, the idea of control communication complexity was introduced in Wong (2009), which can provide such a measure. In Wong and Baillieul (2009), the idea was extended to a rich class of nonlinear systems that arise as models of physical processes ranging from rigid body mechanics to quantum spin systems.

In some special cases, control objectives 
can be achieved without any communication 
among the agents. For systems with bilinear 
input-output mapping, including the Brockett 
Integrator, it is possible to derive conditions 
that guarantee this property (Wong and Baillieul 
2012). Moreover, for quadratic type of control 
cost, it is possible to compute the optimal 
control cost. Similar results can be extended 
to linear systems as discussed in Liu et al. 
(2013). This circle of ideas is connected to the so- 
called standard parts problem as investigated in 
Baillieul and Wong (2009). Another connection 
is to correlated equilibrium problems that have
been recently studied by game theorists (Shoham
and Leyton-Brown 2009).


Cross-References 

► Motion Description Languages and Symbolic 
Control 

► Multi-vehicle Routing 

► Networked Systems 
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Abstract 

The notion of input to state stability (ISS) 
qualitatively describes stability of the mapping 
from initial states and inputs to internal states 
(and more generally outputs). This entry focuses 
on the definition of ISS and a discussion of 
equivalent characterizations. 


Keywords 

Asymptotic stability; Dissipation; Lyapunov 
functions 


Introduction 

We consider here systems with inputs in the usual sense of control theory:

$$\dot{x}(t) = f(x(t), u(t))$$

(the arguments “$t$” are often omitted). There are $n$ state variables and $m$ input channels. States $x(t)$ take values in Euclidean space $\mathbb{R}^n$, and the inputs (also called “controls” or “disturbances” depending on the context) are measurable locally essentially bounded maps $u(\cdot) : [0, \infty) \to \mathbb{R}^m$. The map $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ is assumed to be locally Lipschitz with $f(0,0) = 0$. The solution, defined on some maximal interval $[0, t_{\max}(x^0, u))$, for each initial state $x^0$ and input $u$, is denoted as $x(t, x^0, u)$ and, in particular, for systems with no inputs $\dot{x}(t) = f(x(t))$, just as $x(t, x^0)$. The zero system associated to $\dot{x} = f(x,u)$ is by definition the system with no inputs $\dot{x} = f(x, 0)$. Euclidean norm is written as $|x|$. For a function of time, typically an input or a state trajectory, $\|u\|$, or $\|u\|_\infty$ for emphasis, is the (essential) supremum or “sup” norm (possibly $+\infty$, if $u$ is not bounded). The norm of the restriction of a signal to an interval $I$ is denoted by $\|u_I\|_\infty$ (or just $\|u_I\|$).


Input-to-State Stability 

It is convenient to introduce “comparison functions” to quantify stability. A class $\mathcal{K}_\infty$ function is a function $\alpha : \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$ which is continuous, strictly increasing, and unbounded and satisfies $\alpha(0) = 0$, and a class $\mathcal{KL}$ function is a function $\beta : \mathbb{R}_{\ge 0} \times \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$ such that $\beta(\cdot, t) \in \mathcal{K}_\infty$ for each $t$ and $\beta(r, t)$ decreases to zero as $t \to \infty$, for each fixed $r$.

For a system with no inputs $\dot{x} = f(x)$, there is a well-known notion of global asymptotic stability (for short from now on, GAS, or “0-GAS” when referring to the zero system $\dot{x} = f(x, 0)$ associated to a given system with inputs $\dot{x} = f(x,u)$) due to Lyapunov and usually defined in “$\varepsilon$-$\delta$” terms. It is an easy exercise to show that this standard definition is in fact equivalent to the following statement:

$$(\exists\, \beta \in \mathcal{KL}) \quad |x(t, x^0)| \le \beta(|x^0|, t) \quad \forall\, x^0,\ \forall\, t \ge 0.$$

The notion of input to state stability (ISS) was introduced in Sontag (1989), and it provides theoretical concepts used to describe stability features of a mapping $(u(\cdot), x(0)) \mapsto x(\cdot)$ that sends initial states and input functions into states (or, more generally, outputs). Prominent among these features are that inputs that are bounded, “eventually small,” “integrally small,” or convergent should lead to outputs with the respective property. In addition, ISS and related notions quantify in what manner initial states affect transient behavior. The formal definition is as follows.

A system is said to be input to state stable (ISS) if there exist some $\beta \in \mathcal{KL}$ and $\gamma \in \mathcal{K}_\infty$ such that

$$|x(t)| \le \beta(|x^0|, t) + \gamma(\|u\|_\infty) \qquad \text{(ISS)}$$

holds for all solutions (meaning that the estimate is valid for all inputs $u(\cdot)$, all initial conditions $x^0$, and all $t \ge 0$). Note that the supremum $\sup_{s \in [0,t]} \gamma(|u(s)|)$ over the interval $[0, t]$ is the same as $\gamma(\|u_{[0,t]}\|_\infty) = \gamma(\sup_{s \in [0,t]} |u(s)|)$, because the function $\gamma$ is increasing, so one may replace this term by $\gamma(\|u\|_\infty)$, where $\|u\|_\infty = \sup_{s \in [0,\infty)} |u(s)|$ is the sup norm of the input, because the solution $x(t)$ depends only on values $u(s)$, $s \le t$ (so, one could equally well consider the input that has values $\equiv 0$ for all $s > t$).

Since, in general, $\max\{a, b\} \le a + b \le \max\{2a, 2b\}$, one can restate the ISS condition in a slightly different manner, namely, asking for the existence of some $\beta \in \mathcal{KL}$ and $\gamma \in \mathcal{K}_\infty$ (in general, different from the ones in the ISS definition) such that

$$|x(t)| \le \max\{\beta(|x^0|, t),\ \gamma(\|u\|_\infty)\}$$

holds for all solutions. Such redefinitions, using “max” instead of sum, are also possible for each of the other concepts to be introduced later.

Input-to-State Stability, Fig. 1 ISS combines overshoot and asymptotic behavior

Intuitively, the definition of ISS requires that, for $t$ large, the size of the state must be bounded by some function of the sup norm - that is to say, the amplitude - of inputs (because $\beta(|x^0|, t) \to 0$ as $t \to \infty$). On the other hand, the $\beta(|x^0|, 0)$ term may dominate for small $t$, and this serves to quantify the magnitude of the transient (overshoot) behavior as a function of the size of the initial state $x^0$ (Fig. 1). The ISS superposition theorem, discussed later, shows that ISS is, in a precise mathematical sense, the conjunction of two properties, one of them dealing with asymptotic bounds on $|x(t)|$ as a function of the magnitude of the input and the other one providing a transient term obtained when one ignores inputs.

For internally stable linear systems $\dot{x} = Ax + Bu$, the variation of parameters formula gives immediately the following inequality:

$$|x(t)| \le \beta(t)\,|x^0| + \gamma\,\|u\|_\infty,$$

where

$$\beta(t) = \|e^{tA}\| \to 0 \quad \text{and} \quad \gamma = \|B\| \int_0^\infty \|e^{sA}\|\, ds < \infty.$$

This is a particular case of the ISS estimate, $|x(t)| \le \beta(|x^0|, t) + \gamma(\|u\|_\infty)$, with linear comparison functions.
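The linear estimate can be checked numerically in the simplest case. The sketch below assumes a scalar system $\dot{x} = -a x + b u$ with $a > 0$, for which $\beta(t) = e^{-at}$ and $\gamma = |b|/a$; the coefficients, the sinusoidal input, the forward-Euler integrator, and the function name are all choices made for this illustration.

```python
import math


def simulate_scalar_linear(a, b, x0, u, t_end=10.0, dt=1e-3):
    """Forward-Euler simulation of x' = -a*x + b*u(t); returns a list of (t, x) samples."""
    steps = int(round(t_end / dt))
    x = x0
    samples = [(0.0, x0)]
    for n in range(steps):
        t = n * dt
        x = x + dt * (-a * x + b * u(t))
        samples.append((t + dt, x))
    return samples


# Hypothetical coefficients and input for this sketch.
A, B, X0 = 2.0, 3.0, 4.0
U_SUP = 1.0                                  # sup norm of the input below


def u(t):
    return math.sin(5.0 * t)


GAMMA = abs(B) / A                           # |b| * integral_0^inf exp(-a*s) ds
traj = simulate_scalar_linear(A, B, X0, u)

# Smallest slack between the ISS bound and the simulated |x(t)| over the trajectory.
worst_margin = min(math.exp(-A * t) * abs(X0) + GAMMA * U_SUP - abs(x)
                   for t, x in traj)
print(f"smallest margin in the bound: {worst_margin:.4f}")
```

A non-negative margin is consistent with the estimate; the bound is usually conservative because it assumes the input acts with its largest amplitude at all times.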


Feedback Redesign 

The notion of ISS arose originally as a way to precisely formulate, and then answer, the following question. Suppose that, as in many problems in control theory, a system $\dot{x} = f(x,u)$ has been stabilized by means of a feedback law $u = k(x)$ (Fig. 2), that is to say, $k$ was chosen such that the origin of the closed-loop system $\dot{x} = f(x, k(x))$ is globally asymptotically stable. (See, e.g., Sontag 1999 for a discussion of mathematical aspects of state feedback stabilization.) Typically, the design of $k$ was performed by ignoring the effect of possible input disturbances $d(\cdot)$ (also called actuator disturbances). These “disturbances” might represent true noise or perhaps errors in the calculation of the value $k(x)$ by a physical controller or modeling uncertainty in the controller or the system itself. What is the effect of considering disturbances? In order to analyze the problem, $d$ is incorporated into the model, and one studies the new system $\dot{x} = f(x, k(x) + d)$, where $d$ is seen as an input (Fig. 3). One may then ask what is the effect of $d$ on the behavior of the system. Disturbances $d$ may well destabilize the system, and the problem may arise even when using a routine technique for control design, feedback linearization. To appreciate this issue, take the following very simple example. Given is the system

$$\dot{x} = f(x,u) = x + (x^2 + 1)u.$$

In order to stabilize it, substitute $u = \tilde{u}/(x^2+1)$ (a preliminary feedback transformation), rendering the system linear with respect to the new input $\tilde{u}$: $\dot{x} = x + \tilde{u}$, and then use $\tilde{u} = -2x$ in order to obtain the closed-loop system $\dot{x} = -x$. In other words, in terms of the original input $u$, the feedback law is

$$k(x) = \frac{-2x}{x^2+1}$$

so that $f(x, k(x)) = -x$. This is a GAS system. The effect of the disturbance input $d$ is analyzed as follows. The system $\dot{x} = f(x, k(x) + d)$ is

$$\dot{x} = -x + (x^2 + 1)d.$$

This system has solutions which diverge to infinity even for inputs $d$ that converge to zero; moreover, the constant input $d \equiv 1$ results in solutions that explode in finite time. Thus $k(x) = \frac{-2x}{x^2+1}$ was not a good feedback law, in the sense that its performance degraded drastically once actuator disturbances were taken into account.

Input-to-State Stability, Fig. 2 Feedback stabilization, closed-loop system $\dot{x} = f(x, k(x))$

Input-to-State Stability, Fig. 3 Actuator disturbances, closed-loop system $\dot{x} = f(x, k(x) + d)$

The key observation for what follows is that if one adds a correction term “$-x$” to the above formula for $k(x)$, so that now,

$$\tilde{k}(x) = \frac{-2x}{x^2+1} - x,$$

then the system $\dot{x} = f(x, \tilde{k}(x) + d)$ with disturbance $d$ as input becomes instead

$$\dot{x} = -2x - x^3 + (x^2+1)d$$

and this system is much better behaved: it is still GAS when there are no disturbances (it reduces to $\dot{x} = -2x - x^3$), but, in addition, it is ISS (easy to verify directly, or appealing to some of the characterizations mentioned later). Intuitively, for large $x$, the term $-x^3$ serves to dominate the term $(x^2+1)d$, for all bounded disturbances $d(\cdot)$, and this prevents the state from getting too large.
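The contrast between the two closed loops can be seen in a short simulation with the constant disturbance $d \equiv 1$. This is a numerical illustration rather than a proof; the forward-Euler scheme, step size, time horizon, escape threshold, and function names are choices made for this sketch.

```python
def simulate(rhs, x0, d, t_end=5.0, dt=1e-4, escape=1e6):
    """Forward-Euler integration of x' = rhs(x, d(t)).

    Stops early once |x| exceeds `escape`, which is used here as a numerical
    proxy for finite-time blow-up. Returns (final_x, final_t, escaped).
    """
    x, t = x0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        x = x + dt * rhs(x, d(t))
        t += dt
        if abs(x) > escape:
            return x, t, True
    return x, t, False


def original_loop(x, d):
    """Closed loop with the feedback-linearizing law: x' = -x + (x^2 + 1) d."""
    return -x + (x * x + 1.0) * d


def redesigned_loop(x, d):
    """Closed loop with the added -x correction: x' = -2x - x^3 + (x^2 + 1) d."""
    return -2.0 * x - x * x * x + (x * x + 1.0) * d


def constant_d(t):
    return 1.0


x_a, t_a, escaped_a = simulate(original_loop, 0.0, constant_d)
x_b, t_b, escaped_b = simulate(redesigned_loop, 0.0, constant_d)
print(f"original loop:   escaped={escaped_a} at t ~ {t_a:.3f}")
print(f"redesigned loop: escaped={escaped_b}, final x ~ {x_b:.3f}")
```

With $d \equiv 1$ the first right-hand side equals $x^2 - x + 1$, which is bounded below by $3/4$, so the state grows without bound; in the redesigned loop the cubic term pulls the state back toward a bounded equilibrium.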

This example is an instance of a general result, which says that, whenever there is some feedback law that stabilizes a system, there is also a (possibly different) feedback so that the system with external input $d$ is ISS.

Theorem 1 (Sontag 1989). Consider a system affine in controls

$$\dot{x} = f(x,u) = g_0(x) + \sum_{i=1}^{m} u_i g_i(x) \qquad (g_0(0) = 0)$$

and suppose that there is some differentiable feedback law $u = k(x)$ so that

$$\dot{x} = f(x, k(x))$$

has $x = 0$ as a GAS equilibrium. Then, there is a feedback law $u = \tilde{k}(x)$ such that

$$\dot{x} = f(x, \tilde{k}(x) + d)$$

is ISS with input $d(\cdot)$.

The reader is referred to the book Krstic et al. 
(1995), and the references given later, for many 
further developments on the subjects of recursive 
feedback design, the “backstepping” approach, 
and other far-reaching extensions. 

Equivalences for ISS 

This section reviews results that show that ISS
is equivalent to several other notions, including
asymptotic gain, existence of robustness margins,
dissipativity, and an energy-like stability
estimate.

Nonlinear Superposition Principle 

Clearly, if a system is ISS, then the system with
no inputs $\dot x = f(x, 0)$ is GAS: the term $\|u\|_\infty$
vanishes, leaving precisely the GAS property.
In particular, then, the system $\dot x = f(x,u)$ is
0-stable, meaning that the origin of the system
without inputs $\dot x = f(x, 0)$ is stable in the sense
of Lyapunov: for each $\varepsilon > 0$, there is some $\delta > 0$
such that $|x^0| < \delta$ implies $|x(t,x^0)| < \varepsilon$. (In
comparison-function language, one can restate
0-stability as follows: there is some $\gamma \in \mathcal{K}$ such that
$|x(t,x^0)| \le \gamma(|x^0|)$ holds for all small $x^0$.)

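On the other hand, since $\beta(|x^0|,t) \to 0$ as
$t \to \infty$, for $t$ large one has that the first term in the
ISS estimate $|x(t)| \le \max\{\beta(|x^0|,t),\, \gamma(\|u\|_\infty)\}$
vanishes. Thus an ISS system satisfies the following
asymptotic gain property (“AG”): there
is some $\gamma \in \mathcal{K}_\infty$ so that:

$$\limsup_{t \to +\infty} |x(t,x^0,u)| \le \gamma(\|u\|_\infty) \quad \forall\, x^0,\ u(\cdot) \qquad \text{(AG)}$$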

(see Fig. 4). In words, for all large enough $t$,
the trajectory exists, and it gets arbitrarily close
to a sphere whose radius is proportional, in a
possibly nonlinear way quantified by the function
$\gamma$, to the amplitude of the input. In the language
of robust control, the estimate (AG) would be
called an “ultimate boundedness” condition; it
is a generalization of attractivity (all trajectories
converge to zero, for a system $\dot x = f(x)$ with
no inputs) to the case of systems with inputs; the
“lim sup” is required since the limit of $x(t)$ as
$t \to \infty$ may well not exist. From now on (and
analogously when defining other properties), we
will just say “the system is AG” instead of the
more cumbersome “satisfies the AG property.”

Observe that, since only large values of $t$ matter
in the limsup, one can equally well consider
merely tails of the input $u$ when computing its sup
norm. In other words, one may replace $\gamma(\|u\|_\infty)$
by $\gamma(\limsup_{t \to +\infty} |u(t)|)$, or (since $\gamma$ is increasing)
$\limsup_{t \to +\infty} \gamma(|u(t)|)$.

The surprising fact is that these two necessary
conditions are also sufficient. This is summarized
by the ISS superposition theorem.

Theorem 2 (Sontag and Wang 1996). A system
is ISS if and only if it is 0-stable and AG.

A minor variation of the above superposition
theorem is as follows. Let us consider the limit
property (LIM):

$$\inf_{t \ge 0} |x(t,x^0,u)| \le \gamma(\|u\|_\infty) \quad \forall\, x^0,\ u(\cdot) \qquad \text{(LIM)}$$

(for some $\gamma \in \mathcal{K}_\infty$).

Theorem 3 (Sontag and Wang 1996). A system
is ISS if and only if it is 0-stable and LIM.

Robust Stability 

In this entry, a system is said to be robustly stable
if it admits a margin of stability $\rho$, that is, a
smooth function $\rho \in \mathcal{K}_\infty$ so the system

$$\dot x = g(x,d) := f(x, d\rho(|x|))$$

is GAS uniformly in this sense: for some $\beta \in \mathcal{KL}$,

$$|x(t, x^0, d)| \le \beta(|x^0|, t)$$

for all possible $d(\cdot) : [0, \infty) \to [-1, 1]^m$. An
alternative way to interpret this concept (cf. Sontag
and Wang 1995) is as uniform global asymptotic
stability of the origin with respect to all possible
time-varying feedback laws $\Delta$ bounded by $\rho$:
$|\Delta(t, x)| \le \rho(|x|)$. In other words, the system

$$\dot x = f(x, \Delta(t,x))$$

(Fig. 5) is stable uniformly over all such
perturbations $\Delta$. In contrast to the ISS definition, which
deals with all possible “open-loop” inputs, the
present notion of robust stability asks about all
possible closed-loop interconnections. One may
think of $\Delta$ as representing uncertainty in the
dynamics of the original system, for example.

Theorem 4 (Sontag and Wang 1995). A system
is ISS if and only if it is robustly stable.

Intuitively, the ISS estimate
$|x(t)| \le \max\{\beta(|x^0|,t),\, \gamma(\|u\|_\infty)\}$ says that the $\beta$ term
dominates as long as $|u(t)| \ll |x(t)|$ for all $t$, but
$|u(t)| \ll |x(t)|$ amounts to $u(t) = d(t)\,\rho(|x(t)|)$
with an appropriate function $\rho$. This is an instance
of a “small gain” argument, see below. One
analog for linear systems is as follows: if $A$ is
a Hurwitz matrix, then $A + Q$ is also Hurwitz,
for all small enough perturbations $Q$; note that
when $Q$ is a nonsingular matrix, $|Qx|$ is a $\mathcal{K}_\infty$
function of $|x|$.
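
The linear analog can be illustrated on a 2x2 example. This is a minimal sketch: the matrices and the helper names are hypothetical choices made here, and the Hurwitz test uses the standard fact that a real 2x2 matrix has both eigenvalues in the open left half-plane exactly when its trace is negative and its determinant is positive.

```python
def is_hurwitz_2x2(M):
    """Real 2x2 matrix: Hurwitz iff trace < 0 and det > 0
    (Routh-Hurwitz test for s^2 - trace*s + det)."""
    (a, b), (c, d) = M
    return (a + d) < 0 and (a * d - b * c) > 0


def add(M, N):
    """Entrywise sum of two 2x2 matrices given as nested lists."""
    return [[M[i][j] + N[i][j] for j in range(2)] for i in range(2)]


A = [[-1.0, 4.0], [0.0, -1.0]]          # Hurwitz: double eigenvalue at -1
Q_small = [[0.0, 0.0], [0.1, 0.0]]      # small perturbation
Q_large = [[0.0, 0.0], [0.5, 0.0]]      # still modest in size, but too large

print(is_hurwitz_2x2(A))                 # nominal matrix
print(is_hurwitz_2x2(add(A, Q_small)))   # perturbed, small Q
print(is_hurwitz_2x2(add(A, Q_large)))   # perturbed, larger Q
```

In this example the larger perturbation destroys the Hurwitz property even though its single nonzero entry is only 0.5, which shows why the margin has to be sized relative to the system.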

Dissipation 

Another characterization of ISS is as a dissipation
notion stated in terms of a Lyapunov-like function.
A continuous function $V : \mathbb{R}^n \to \mathbb{R}$ is said
to be a storage function if it is positive definite,
that is, $V(0) = 0$ and $V(x) > 0$ for $x \ne 0$, and
proper, that is, $V(x) \to \infty$ as $|x| \to \infty$. This
last property is equivalent to the requirement that
the sets $V^{-1}([0, A])$ should be compact subsets
of $\mathbb{R}^n$, for each $A > 0$, and in the engineering
literature, it is usual to call such functions radially
unbounded. It is an easy exercise to show that
$V : \mathbb{R}^n \to \mathbb{R}$ is a storage function if and only if
there exist $\underline{\alpha}, \overline{\alpha} \in \mathcal{K}_\infty$ such that

$$\underline{\alpha}(|x|) \le V(x) \le \overline{\alpha}(|x|) \quad \forall\, x \in \mathbb{R}^n$$

(the lower bound amounts to properness and
$V(x) > 0$ for $x \ne 0$, while the upper
bound guarantees $V(0) = 0$). For convenience,
$\dot V : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ is the function:

$$\dot V(x,u) := \nabla V(x) \cdot f(x,u)$$

which provides, when evaluated at $(x(t),u(t))$,
the derivative $dV(x(t))/dt$ along solutions of
$\dot x = f(x,u)$.

An ISS-Lyapunov function for $\dot x = f(x,u)$
is by definition a smooth storage function $V$ for
which there exist functions $\gamma, \alpha \in \mathcal{K}_\infty$ so that

$$\dot V(x,u) \le -\alpha(|x|) + \gamma(|u|) \quad \forall\, x, u. \qquad \text{(L-ISS)}$$

Integrating, an equivalent statement is that, along
all trajectories of the system, there holds the
following dissipation inequality:

$$V(x(t_2)) - V(x(t_1)) \le \int_{t_1}^{t_2} w(u(s),x(s))\, ds$$

where, using the terminology of Willems
(1976), the “supply” function is $w(u,x) =
\gamma(|u|) - \alpha(|x|)$. For systems with no inputs,
an ISS-Lyapunov function is precisely the same
object as a Lyapunov function in the usual sense.

Theorem 5 (Sontag and Wang 1995). A system
is ISS if and only if it admits a smooth
ISS-Lyapunov function.
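
As a hedged illustration (a worked check added here, not part of the cited results), the redesigned closed loop $\dot x = -2x - x^3 + (x^2+1)d$ from the feedback example, with $d$ in the role of the input, admits the candidate $V(x) = x^2/2$. The particular bounds below come from Young's inequality and are one choice among many:

$$
\begin{aligned}
\dot V(x,d) &= x\left(-2x - x^3 + (x^2+1)d\right) = -2x^2 - x^4 + x^3 d + x d \\
&\le -2x^2 - x^4 + \left(\tfrac{3}{4}x^4 + \tfrac{1}{4}d^4\right) + \left(\tfrac{1}{2}x^2 + \tfrac{1}{2}d^2\right) \\
&= -\left(\tfrac{3}{2}x^2 + \tfrac{1}{4}x^4\right) + \left(\tfrac{1}{2}d^2 + \tfrac{1}{4}d^4\right).
\end{aligned}
$$

So (L-ISS) holds with $\alpha(r) = \tfrac{3}{2}r^2 + \tfrac{1}{4}r^4$ and $\gamma(r) = \tfrac{1}{2}r^2 + \tfrac{1}{4}r^4$, both of class $\mathcal{K}_\infty$, which is consistent with the claim that this system is ISS.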

Since $-\alpha(|x|) \le -\alpha(\overline{\alpha}^{-1}(V(x)))$, the
ISS-Lyapunov condition can be restated as

$$\dot V(x,u) \le -\tilde\alpha(V(x)) + \gamma(|u|) \quad \forall\, x, u$$

for some $\tilde\alpha \in \mathcal{K}_\infty$. In fact, one may strengthen
this a bit (Praly and Wang 1996): for any ISS
system, there is always a smooth ISS-Lyapunov
function satisfying the “exponential” estimate
$\dot V(x,u) \le -V(x) + \gamma(|u|)$.

The sufficiency of the ISS-Lyapunov condition
is easy to show and was already in the original
paper Sontag (1989). A sketch of proof is as
follows, assuming for simplicity a dissipation
estimate in the form $\dot V(x, u) \le -\alpha(V(x)) + \gamma(|u|)$.
Given any $x$ and $u$, either $\alpha(V(x)) \le 2\gamma(|u|)$
or $\dot V \le -\alpha(V)/2$. From here, one deduces by
a comparison theorem that, along all solutions,

$$V(x(t)) \le \max\left\{\beta(V(x^0), t),\ \alpha^{-1}\!\left(2\gamma(\|u\|_\infty)\right)\right\},$$

where the $\mathcal{KL}$ function $\beta(s, t)$ is the solution $y(t)$
of the initial value problem

$$\dot y = -\tfrac{1}{2}\alpha(y), \qquad y(0) = s.$$

Finally, an ISS estimate is obtained from
$V(x^0) \le \overline{\alpha}(|x^0|)$.

The proof of the converse part of the theorem
is based upon first showing that ISS implies
robust stability in the sense already discussed
and then obtaining a converse Lyapunov
theorem for robust stability for the system
$\dot x = f(x, d\rho(|x|)) = g(x,d)$, which is
asymptotically stable uniformly on all Lebesgue-
measurable functions $d(\cdot) : \mathbb{R}_{\ge 0} \to B(0,1)$. This
last theorem was given in Lin et al. (1996) and
is basically a theorem on Lyapunov functions
for differential inclusions. The classical result of
Massera (1956) for differential equations (with
no inputs) becomes a special case.

Using "Energy" Estimates Instead of Amplitudes

In linear control theory, $H_\infty$ theory studies $L^2 \to L^2$
induced norms, which under coordinate
changes leads to the following type of estimate:

$$\int_0^t \alpha(|x(s)|)\, ds \le \alpha_0(|x^0|) + \int_0^t \gamma(|u(s)|)\, ds$$

along all solutions and for some $\alpha, \alpha_0, \gamma \in \mathcal{K}_\infty$.
Just for the statement of the next result, a system
is said to satisfy an integral-integral estimate if
for every initial state $x^0$ and input $u$, the solution
$x(t,x^0,u)$ is defined for all $t \ge 0$ and an estimate
as above holds. (In contrast to ISS, this definition
explicitly demands that $t_{\max} = \infty$.)

Theorem 6 (Sontag 1998). A system is ISS if
and only if it satisfies an integral-integral estimate.

This theorem is quite easy to prove, in
view of previous results. A sketch of proof
is as follows. If the system is ISS, then
there is an ISS-Lyapunov function satisfying
$\dot V(x,u) \le -V(x) + \gamma(|u|)$, so, integrating along
any solution:

$$\int_0^t V(x(s))\, ds \le \int_0^t V(x(s))\, ds + V(x(t)) \le V(x(0)) + \int_0^t \gamma(|u(s)|)\, ds$$

and thus an integral-integral estimate holds.
Conversely, if such an estimate holds, one can prove
that $\dot x = f(x, 0)$ is stable and that an asymptotic
gain exists.


Integral Input to State Stability

A concept of nonlinear stability that is truly
distinct from ISS arises when considering a
mixed notion which combines the “energy” of the
input with the amplitude of the state. A system
is said to be integral-input to state stable (iISS)
provided that there exist $\alpha, \gamma \in \mathcal{K}_\infty$ and $\beta \in \mathcal{KL}$
such that the estimate

$$\alpha(|x(t)|) \le \beta(|x^0|, t) + \int_0^t \gamma(|u(s)|)\, ds \qquad \text{(iISS)}$$

holds along all solutions. Just as with ISS, one
could state this property merely for all times
$t \in [0, t_{\max}(x^0,u))$. Since the right-hand side is
bounded on each interval $[0, t]$ (because, recall,
inputs are by definition assumed to be bounded
on each finite interval), it is automatically true
that $t_{\max}(x^0,u) = +\infty$ if such an estimate
holds along maximal solutions. So forward-
completeness (solution exists for all $t \ge 0$) can
be assumed with no loss of generality.

One might also consider the following type of
“weak integral to integral” mixed estimate:

$$\int_0^t \underline{\alpha}(|x(s)|)\, ds \le \kappa(|x^0|) + \alpha\!\left(\int_0^t \gamma(|u(s)|)\, ds\right)$$

for appropriate $\mathcal{K}_\infty$ functions (note the additional
“$\alpha$”).

Theorem 7 (Angeli et al. 2000b). A system
satisfies a weak integral to integral estimate if
and only if it is iISS.

Another interesting variant is found when considering
mixed integral/supremum estimates:

$$\alpha(|x(t)|) \le \beta(|x^0|, t) + \int_0^t \gamma_1(|u(s)|)\, ds + \gamma_2(\|u\|_\infty)$$

for suitable $\beta \in \mathcal{KL}$ and $\alpha, \gamma_i \in \mathcal{K}_\infty$. One then
has

Theorem 8 (Angeli et al. 2000b). A system
satisfies a mixed estimate if and only if it is iISS.

Dissipation Characterization of iISS

A smooth storage function $V$ is an iISS-Lyapunov
function for the system $\dot x = f(x, u)$ if there are
a $\gamma \in \mathcal{K}_\infty$ and an $\alpha : [0, +\infty) \to [0, +\infty)$
which is merely positive definite (i.e., $\alpha(0) = 0$
and $\alpha(r) > 0$ for $r > 0$) such that the inequality

$$\dot V(x,u) \le -\alpha(|x|) + \gamma(|u|) \qquad \text{(L-iISS)}$$

holds for all $(x,u) \in \mathbb{R}^n \times \mathbb{R}^m$. To compare,
recall that an ISS-Lyapunov function is required
to satisfy an estimate of the same form but where
$\alpha$ is required to be of class $\mathcal{K}_\infty$; since every $\mathcal{K}_\infty$
function is positive definite, an ISS-Lyapunov
function is also an iISS-Lyapunov function.

Theorem 9 (Angeli et al. 2000a). A system is
iISS if and only if it admits a smooth
iISS-Lyapunov function.

Since an ISS-Lyapunov function is also an iISS
one, ISS implies iISS. However, iISS is a strictly
weaker property than ISS, because $\alpha$ may be
bounded in the iISS-Lyapunov estimate, which
means that $V$ may increase, and the state become
unbounded, even under bounded inputs, so long
as $\gamma(|u(t)|)$ is larger than the range of $\alpha$. This
is also clear from the iISS definition, since a
constant input with $|u(t)| = r$ results in a term
in the right-hand side that grows like $rt$.
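
A worked example may make the role of a bounded $\alpha$ concrete. The following is an illustrative computation added here under stated assumptions (the choice of $V$ is one convenient candidate, not taken from the cited papers). For the scalar bilinear system $\dot x = -x + ux$, take $V(x) = \log(1 + x^2)$, which is smooth, positive definite, and proper:

$$
\dot V(x,u) = \frac{2x}{1+x^2}\,(-x + ux) = -\frac{2x^2}{1+x^2} + \frac{2x^2}{1+x^2}\,u \le -\frac{2x^2}{1+x^2} + 2|u|.
$$

Thus (L-iISS) holds with $\gamma(r) = 2r$ and $\alpha(r) = 2r^2/(1+r^2)$, which is positive definite but bounded by 2, so it is not of class $\mathcal{K}_\infty$. For a constant input with $|u| > 1$ one has $\gamma(|u|) > \sup \alpha$, and the estimate no longer forces $V$ to decrease; indeed, for constant $u > 1$ the solutions $x(t) = x^0 e^{(u-1)t}$ are unbounded.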

An interesting general class of examples is
given by bilinear systems

$$\dot x = \left(A + \sum_{i=1}^{m} u_i A_i\right) x + Bu$$

for which the matrix $A$ is Hurwitz. Such systems
are always iISS (see Sontag 1998), but they are
not in general ISS. For instance, in the case when
$B = 0$, boundedness of trajectories for all constant
inputs already implies that $A + \sum_{i=1}^{m} u_i A_i$
must have all eigenvalues with nonpositive real
part, for all $u \in \mathbb{R}^m$, which is a condition
involving the matrices $A_i$ (e.g., $\dot x = -x + ux$
is iISS but it is not ISS).
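
The scalar example can be explored with its closed-form solution, x(t) = x0 * exp(int_0^t u(s) ds - t). The sketch below is a hedged illustration: the two input signals and all function names are assumptions chosen here to contrast a bounded input of infinite energy with an input of finite integral.

```python
import math


def state(x0, t, integral_u):
    """Solution of dx/dt = -x + u(t) x at time t.

    integral_u(t) must return int_0^t u(s) ds for the chosen input.
    """
    return x0 * math.exp(integral_u(t) - t)


def integral_constant(t):
    # Bounded input u = 2: int_0^t u = 2t, so x(t) = x0 * e^t.
    return 2.0 * t


def integral_decaying(t):
    # Integrable input u(t) = 2 e^{-t}: int_0^t u = 2 (1 - e^{-t}) <= 2.
    return 2.0 * (1.0 - math.exp(-t))


times = (0.0, 5.0, 10.0)
x_const = [state(1.0, t, integral_constant) for t in times]
x_decay = [state(1.0, t, integral_decaying) for t in times]
print("u = 2:          ", x_const)
print("u = 2 exp(-t):  ", x_decay)
```

The bounded input u = 2 drives the state to infinity, so no ISS estimate can hold, while the finite-energy input keeps the state bounded and lets it converge to zero, as the iISS estimate predicts.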

The notion of iISS is useful in situations where 
an appropriate notion of detectability can be 
verified using LaSalle-type arguments. There 
follow two examples of theorems along these 
lines. 

Theorem 10 (Angeli et al. 2000a). A system is
iISS if and only if it is 0-GAS and there is a
smooth storage function $V$ such that, for some
$\sigma \in \mathcal{K}_\infty$:

$$\dot V(x,u) \le \sigma(|u|)$$

for all $(x,u)$.

The sufficiency part of this result follows from
the observation that the 0-GAS property by itself
already implies the existence of a smooth and
positive definite, but not necessarily proper, function
$V_0$ such that $\dot V_0 \le \gamma_0(|u|) - \alpha_0(|x|)$ for all
$(x,u)$, for some $\gamma_0 \in \mathcal{K}_\infty$ and positive definite
$\alpha_0$ (if $V_0$ were proper, then it would be an
iISS-Lyapunov function). Now, one uses $V_0 + V$ as an
iISS-Lyapunov function ($V$ provides properness).

Theorem 11 (Angeli et al. 2000a). A system is
iISS if and only if there exists an output function
$y = h(x)$ (continuous and with $h(0) = 0$)
which provides zero detectability ($u \equiv 0$ and
$y \equiv 0 \Rightarrow x(t) \to 0$) and dissipativity in the
following sense: there exists a storage function $V$
and $\sigma \in \mathcal{K}_\infty$, $\alpha$ positive definite, so that

$$\dot V(x, u) \le \sigma(|u|) - \alpha(h(x))$$

holds for all $(x,u)$.

Angeli et al. (2000b) contains several additional 
characterizations of iISS. 

Superposition Principles for iISS 

There are also asymptotic gain characterizations
for iISS. A system is bounded energy weakly
converging state (BEWCS) if there exists some
$\sigma \in \mathcal{K}_\infty$ so that the following implication holds:

$$\int_0^{+\infty} \sigma(|u(s)|)\, ds < +\infty \ \Longrightarrow\ \liminf_{t \to +\infty} |x(t, x^0, u)| = 0 \qquad \text{(BEWCS)}$$

(more precisely: if the integral is finite,
then $t_{\max}(x^0,u) = +\infty$ and the liminf
is zero). It is bounded energy frequently
bounded state (BEFBS) if there exists some
$\sigma \in \mathcal{K}_\infty$ so that the following implication
holds:

$$\int_0^{+\infty} \sigma(|u(s)|)\, ds < +\infty \ \Longrightarrow\ \liminf_{t \to +\infty} |x(t, x^0, u)| < +\infty \qquad \text{(BEFBS)}$$

(again, meaning that $t_{\max}(x^0, u) = +\infty$ and the
liminf is finite).

Theorem 12 (Angeli et al. 2004). The following
three properties are equivalent for any given
system $\dot x = f(x,u)$:

• The system is iISS.

• The system is BEWCS and 0-stable.

• The system is BEFBS and 0-GAS.


Summary and Future Directions 

This entry focuses on stability notions relative to
steady states, but a more general theory is also
possible that allows consideration of more
arbitrary attractors, as well as robust and/or
adaptive concepts. Much else has been omitted
from this entry. Most importantly, one of the key
results is the ISS small-gain theorem due to Jiang
et al. (1994), which provides a powerful sufficient
condition for the interconnection of ISS systems
being itself ISS.

Other topics not treated include, among many
others, all notions involving outputs; ISS properties
of time-varying (and in particular periodic)
systems; ISS for discrete-time systems; questions
of sampling, relating ISS properties of continuous
and discrete-time systems; ISS with respect to
a closed subset $K$; stochastic ISS; applications
to tracking, vehicle formations (“leader to
followers” stability); and averaging of ISS systems.
Sontag (2006) may also be consulted for further
references, a detailed development of some of
these ideas, and citations to the literature for
others. In addition, the textbooks Isidori (1999),
Krstic et al. (1995), Khalil (1996), Sepulchre
et al. (1997), Krstic and Deng (1998), Freeman
and Kokotovic (1996), and Isidori et al. (2003)
contain many extensions of the theory as well as
applications.

Cross-References 

► Feedback Stabilization of Nonlinear Systems 

► Fundamental Limitation of Feedback Control 

► Linear State Feedback 

► Lyapunov’s Stability Theory 

► Stability and Performance of Complex Systems 
Affected by Parametric Uncertainty 

► Stability: Lyapunov, Linear Systems


Abstract

The main functional and support facilities offered
by interactive environments and tools for
computer-aided control system design (CACSD)
and reference examples of such software systems
are presented, from both a user and a developer
perspective. The essential functions these environments
should possess and requirements which
should be satisfied are discussed. The importance
of reliability and efficiency is highlighted, besides
the desired friendliness and flexibility of
the user interface. Widely used environments and
software tools for CACSD, including MATLAB,
Mathematica, Maple, and the SLICOT Library,
serve as illustrative examples.


Keywords 

Automatic control; Controller design; Numerical 
algorithms; Simulation; User interface 


Introduction 

The complexity of many processes or systems
to be controlled, and the strong performance
requirements to be fulfilled nowadays, make
it very difficult or even impossible to design
suitable control laws and algorithms without
resorting to computers and dedicated software
tools. Computer-aided control system design
(CACSD) is the use of computer programs to
support the creation, analysis, evaluation, or
optimization of a control system design. CACSD
is a specialization of computer-aided design
(CAD) for control systems. CAD is used in many
domains, to enhance designers’ productivity and
the design quality and to manage the design
versions and documentation. CACSD is not a
new paradigm, since the first such software
systems were developed about 50 years
ago. See the historical overview in a companion
paper.

The interactive environments and tools for
CACSD have evolved significantly during the last
decades, in parallel with the developments of
numerical linear algebra, scientific computations,
and computer hardware and software, including
programming and networking capabilities.
Starting from simple collections of specialized
tools for solving well-defined system analysis
and design problems, CACSD environments became
increasingly more sophisticated and powerful,
allowing complicated tasks to be orchestrated for
fully covering the stages of control engineering
design, prototyping, and testing, including even
the transfer to practical systems and applications.
Modeling, system analysis and synthesis, and
control system assessment are activities which
are assisted by today’s advanced CACSD
environments and software tools. The main aim is
to help the designer to concentrate on the design
problem itself, not on theoretical approaches,
numerical algorithms, and computational details.
Moreover, CACSD environments allow the
developers and users to do conceptual thinking,
but also programming and debugging at a higher
level of abstraction, in comparison with standard
programming languages, like Fortran, C/C++, or
Java™.

There are both commercial and free, open-source
CACSD environments and tools. State-of-the-art
CACSD systems exist for several
platforms (Windows, Linux/UNIX, and Mac
OS X). Multiple high-speed CPUs, graphics
cards, and large amounts of RAM are well suited
to perform graphically and computationally
intensive tasks. A common feature is the presence
of a “friendly” graphical user interface, but often
a dedicated command language is also available.
The user interacts with the CACSD environment,
e.g., by specifying the model or control structure,
the design requirements, and the values of essential
parameters or by selecting and combining
the tools to be used. The process can be repeated
until a satisfactory behavior is obtained.

Interactive CACSD environment (e.g., MATLAB, Mathematica, Maple)
(for modeling, simulation, analysis, synthesis, etc.)
Toolboxes or packages with executables or functions written in the environment language
(Graphical) User interface, Interactive language, Graphical functions, API
CACSD subroutine libraries (e.g., SLICOT)
Mathematical subroutine libraries (e.g., LAPACK, ARPACK, IMSL, NAG)
Computer-optimized mathematical libraries or their generators (e.g., BLAS, MKL, ATLAS)
Libraries of intrinsic functions (e.g., in Fortran or C/C++)

Interactive Environments and Software Tools for CACSD, Fig. 1 Hierarchy of the software components
incorporated in an interactive CACSD environment

Usually, the underlying computational tools
on which the interactive environments are based
are hidden from the user. Moreover, software for
extensive testing is not normally provided, but
demonstrators running a few examples are offered.
Unfortunately, even mathematically simple
problems of small dimension can lead to
wrong results when using unsuitable algorithms.
Illustrative control-related examples are given,
e.g., in Van Huffel et al. (2004). Since system
analysis and design tasks usually involve sequential
or iterative solution of large and complex
subproblems, it follows that the quality of the
intermediate results is of utmost importance.
Consequently, the interactive environments for
CACSD should be based on reliable, efficient,
and thoroughly tested computational building
blocks, which are called at the lower layers of
calculations. These blocks constitute the computational
engine of an interactive environment.
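
The effect of an unsuitable algorithm on a small problem can be seen in a generic floating-point sketch. This example is an assumption chosen here for illustration (it is not taken from Van Huffel et al. 2004): it computes the smaller root of a quadratic in double precision, once by the textbook formula and once by a rearrangement that avoids subtracting nearly equal numbers.

```python
import math


def small_root_naive(b, c):
    """Smaller root of x^2 - b x + c = 0 (b > 0) by the textbook formula.

    Subtracts two nearly equal numbers when b*b >> 4*c.
    """
    return (b - math.sqrt(b * b - 4.0 * c)) / 2.0


def small_root_stable(b, c):
    """Same root, obtained from the larger root via the product of roots (= c)."""
    large = (b + math.sqrt(b * b - 4.0 * c)) / 2.0
    return c / large


b, c = 1.0e8, 1.0
exact = 1.0e-8  # true small root agrees with 1e-8 to about 16 digits
naive_error = abs(small_root_naive(b, c) - exact) / exact
stable_error = abs(small_root_stable(b, c) - exact) / exact
print("relative error, naive formula: ", naive_error)
print("relative error, stable formula:", stable_error)
```

Both functions implement mathematically equivalent formulas, yet only the second returns an accurate result here, which is the kind of behavior that motivates building environments on carefully tested numerical libraries.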

Figure 1 gives a typical hierarchy of the software
components incorporated in an interactive
CACSD environment.


Interactive Environments for CACSD 

Main Functionality 

A comprehensive set of functions of, and requirements
for, interactive environments for control
engineering is described in MacFarlane et al.
(1989), but such a set has probably not yet been
covered by any single environment. State-of-the-art
interactive environments for CACSD include
many attractive functional features:


• Define or find (via first principles or system
identification) various system models (e.g.,
state-space models or transfer-function matrices)
and convert between different representations

• Find reduced order (or simplified) models,
which can more economically be used for
simulation, control, prediction, etc.

• Analyze basic system properties, like stability,
controllability, observability, stabilizability,
detectability, minimality, properness, etc.

• Analyze interactively the behavior of a control
system for various scenarios

• Provide alternative tools for different categories
of users, from novice to expert, and
from classical to “modern” or advanced analysis
and synthesis techniques, in time domain
or frequency domain

• Provide a wide range of tools, covering modeling,
system identification, filtering, control
system design, simulation, real-time behavior,
hardware-in-the-loop simulation, and code
generation for easy deployment and ensure
their interoperability

• Allow the user to add extensions at various
levels, new functions, interfaces, or even
toolboxes or packages, which can be made
available to a general community and allow
customization
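
As a small illustration of the kind of property analysis listed above, the sketch below tests controllability of a two-state, single-input model by checking that the matrix [B, AB] is nonsingular. It is a toy example under stated assumptions: the model data and helper names are hypothetical, and production tools rely on numerically more robust tests than a determinant.

```python
def matvec(A, v):
    """Product of a square matrix (nested lists) with a vector."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]


def controllability_matrix(A, B):
    """Matrix with columns B, AB, ..., A^(n-1) B for a single-input system."""
    cols, v = [], list(B)
    for _ in range(len(A)):
        cols.append(v)
        v = matvec(A, v)
    # cols holds the columns; transpose into row-major form
    return [[col[i] for col in cols] for i in range(len(A))]


def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


A = [[0.0, 1.0], [-2.0, -3.0]]   # eigenvalues -1 and -2
B_good = [0.0, 1.0]
B_bad = [1.0, -1.0]              # eigenvector of A for eigenvalue -1

controllable_good = abs(det2(controllability_matrix(A, B_good))) > 1e-12
controllable_bad = abs(det2(controllability_matrix(A, B_bad))) > 1e-12
print(controllable_good, controllable_bad)
```

With the input entering along an eigenvector of A, the columns B and AB are parallel and the pair is not controllable, while the other input vector gives a controllable pair.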

In addition to the functional and computational
tools, essential components of an interactive
environment are the user interface, the application
program interface (API), and the support tools
that make it easy to specify, document, and
store a design solution, to visualize and interpret
the results, to export them to other applications
for further processing, to generate reports, etc.
A good paradigm for the data environment is
object orientation.

It is a common feature of an interactive environment
for CACSD to address the requirements
of a large diversity of users, in various stages
of familiarity with the environment. This feature
is expressed, e.g., by the option to use either
a graphical user interface or a command language
to call and sequence various computational
procedures. In addition, tools for easily building
new computational or graphical procedures, or
for managing the codes and results, are often
included. The command language should operate
both on low-level data constructs, such as
a matrix, and on high-level ones (e.g., system
objects), and it should allow operator overloading
(e.g., taking G1 * G2 as the result of a series
interconnection of the systems represented by the
system objects G1 and G2).
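
Operator overloading on system objects can be sketched in a few lines. The class below is a hypothetical, minimal illustration written for this purpose and does not reproduce the interface of any particular CACSD package: it stores a single-input, single-output transfer function as numerator and denominator coefficient lists and defines `*` as the series interconnection.

```python
class TransferFunction:
    """SISO transfer function num(s)/den(s), coefficients in descending powers of s."""

    def __init__(self, num, den):
        self.num, self.den = list(num), list(den)

    @staticmethod
    def _polymul(p, q):
        """Multiply two polynomials given as coefficient lists."""
        out = [0.0] * (len(p) + len(q) - 1)
        for i, a in enumerate(p):
            for j, b in enumerate(q):
                out[i + j] += a * b
        return out

    def __mul__(self, other):
        # Series interconnection: the transfer functions multiply.
        return TransferFunction(self._polymul(self.num, other.num),
                                self._polymul(self.den, other.den))

    def __call__(self, s):
        """Evaluate the transfer function at a (possibly complex) point s."""
        def evaluate(p):
            return sum(c * s ** (len(p) - 1 - k) for k, c in enumerate(p))
        return evaluate(self.num) / evaluate(self.den)


G1 = TransferFunction([1.0], [1.0, 1.0])   # 1 / (s + 1)
G2 = TransferFunction([2.0], [1.0, 3.0])   # 2 / (s + 3)
G = G1 * G2                                # 2 / (s^2 + 4 s + 3)
print(G.num, G.den)
```

The user writes the interconnection in the same notation as on paper, while the object hides the polynomial arithmetic; real environments extend the same idea to state-space objects, parallel and feedback connections, and multivariable systems.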

An environment for CACSD should integrate
advanced user interfaces and API, a collection
of problem solvers based on reliable and efficient
numerical and possibly symbolic algorithms,
and tools for visualizing and interpreting
the results. Widely used environments of this kind are,
for instance, MATLAB from The MathWorks,
Inc., Mathematica from Wolfram Research, or
Maple from Waterloo Maple Inc. (Maplesoft).
Earlier developments of CACSD packages are
surveyed in Frederick et al. (1991). There are
also environments dedicated to modeling and
simulation, which cover a broad range of technical
and engineering computations, including
those for mechanical, electrical, thermodynamic,
hydraulic, pneumatic, or thermal systems. An
example is Dymola, presented in a subsequent
subsection.

Reference interactive environments and tools 
for CACSD are presented in the following 
(sub)sections. 

Reference Interactive Environments 

MATLAB (MATrix LABoratory) is an
integrated, interactive environment for technical
computing, visualization, and programming
(MathWorks 2013). Based on a powerful
high-level interpreter language and development
tools, an easy-to-use, flexible, and customizable
graphical user interface, complemented with
attractive visualization capabilities, and open
for extensions with new toolkits, MATLAB
can be used for solving intricate scientific
and engineering problems, as well as for the
development and deployment of applications.

MATLAB® and Simulink® are registered
trademarks of The MathWorks, Inc. MATLAB,
Simulink, and several toolboxes, including System
Identification Toolbox, Control System Toolbox,
and Robust Control Toolbox, are suitable
for solving various control engineering problems;
other toolboxes, such as Signal Processing Toolbox,
Optimization Toolbox, and Symbolic Math
Toolbox, offer additional useful facilities. See
http://www.mathworks.com/products/.

Simulink is a high-level implementation of the
engineering approach, based on block diagrams,
to analyze and design control systems. It is also a
powerful modeling and multi-domain simulation
and model-based design tool for dynamic systems,
which supports hierarchical system-level
design, simulation, automatic code generation,
and continuous test and verification of embedded
systems. Simulink offers a graphical editor,
customizable block libraries, and solvers for
modeling and simulating dynamic systems. The
models may include MATLAB algorithms, and
the simulation results may be further processed
in MATLAB. Managing projects (files, components,
data), connecting to hardware for real-time
testing, and deploying the designed system are
additional, useful Simulink features. Real-Time
Workshop code generation speeds up
the design and implementation, by generating
syntactically and semantically correct code which
can be uploaded to the target machine.

The MATLAB environment is very suitable for
rapid prototyping, seen in a broad sense. This
may include not only fully designing and implementing
a new control law, testing it on a host
computer, and deploying it on a target computer
but also support for developing and testing new
mathematical or control theories and algorithms.

Born around 1980, MATLAB has evolved and
improved impressively. Since 2004, two releases
have been issued each year. There was a major
change of the interface in Release 2012b, visible
both in the core MATLAB “Desktop” and in
Simulink. The so-called Toolstrip interface replaces
former menus and toolbars and includes
tabs which group functionality for common tasks.
A gallery of applications from the MATLAB
family of products is additionally available and
can be extended by the user.

MATLAB supports developing applications
with graphical user interface (GUI) features;
this itself can be done graphically using GUIDE
(GUI development environment). MATLAB has
support for object-oriented programming and
interfacing with other languages or connecting to
similar environments such as Maple or Mathematica.
When using the command-line interface,
MATLAB helps the user, e.g., by showing the
arguments of the typed MATLAB functions; also,
MATLAB allows execution profiling, for increasing
the computational efficiency, and its editor
can suggest changes in the user functions (the
so-called M-files) for improving the performance.

MATLAB users may upload their own contributions
to the MATLAB Central website or may
download tools developed by other people. User
feedback is used by the MATLAB developers
to improve the functionality, reliability, and efficiency
of the computations.

Commercial competitors to MATLAB include
Mathematica, Maple, and IDL; free open-source
alternatives are, e.g., GNU Octave, FreeMat,
and Scilab, intended to be mostly compatible
with the MATLAB language. For instance,
a set of free CACSD tools for GNU Octave
version 3.6.0 or beyond has been very recently
developed (see http://octave.sourceforge.net/control/).
The Octave extension package called
control is based on the SLICOT Library and
includes functionalities for system identification,
system analysis, control system design (including
$H_\infty$ synthesis), and model reduction, which are
the basic steps of the control engineer design
workflow.

Mathematica is an interactive environment
which supports complete computational workflows,
making it suitable for a convenient
endeavor from ideas to deployed solutions
(see http://www.wolfram.com/mathematica/).
Mathematica offers, e.g., tools for 2D and 3D
data and function visualization and animation,
numeric and symbolic tools for discrete and
continuous calculus, a toolkit for adding user
interfaces to applications, control systems
libraries, tools for parallel programming,
etc. High-performance computing capabilities
include the use of packed and sparse arrays,
multiple precision arithmetic, automatic
multithreading on multi-core computers (based on
processor-specific optimized libraries), hardware
accelerators, support for grid technology,
and CUDA and OpenCL GPU hardware.
Mathematica and SystemModeler (based on
the Modelica language) offer numerous built-in
functions which allow the user to design, analyze, and
simulate continuous- and discrete-time control
systems; simplify models; interactively test
controllers; and document the design. Both
classical and modern techniques are provided. A
powerful symbolic-numeric computation engine
and highly efficient numerical algorithms are
used. Mathematica allows the system
models to be defined in a more natural form than MATLAB.
It can analyze not only numeric systems but
also symbolic ones, represented by state-space
or transfer-function models. The computational
precision and algorithms can be automatically
controlled and selected, respectively, and using
arbitrary precision arithmetic is possible.

Maple is a computer algebra system, which 
combines a powerful engine for mathematical 
calculations with an intuitive user interface 
(see http://www.maplesoft.com/). Classical 
mathematical notation can be used, and the 
interface is customizable. Arbitrary precision 
numerical computations, as well as symbolic 
computations, can be performed. The Maple 
language is provided by a small kernel. NAG 
Numerical Libraries, ATLAS libraries, and other 
libraries written in this language are used for 
numerical calculations. Symbolic expressions 
are stored as directed acyclic graphs. The 
latest release, Maple 17, added hundreds of 
new problem-solving commands and interface 
enhancements. Many calculations recorded an
impressive improvement in efficiency, compared
to the previous release. Examples include calculations
with complex floating-point numbers
and linear algebra operations. It is possible 
to use multiple cores and CPUs. The parallel 
memory management has been improved. Maple 
includes some CACSD tools for linear and 
nonlinear dynamic systems. For instance, the 
built-in package DynamicSystems (available 
since Maple 12 release) covers the analysis of 
linear time-invariant systems. Numerical solvers 
for Sylvester and Lyapunov equations have been 
added to the LinearAlgebra packages in Maple 
13, and solvers for algebraic Riccati equations - 
based on SLICOT Library routines - have been 
included in Maple 14 (available in multiple 
precision arithmetic since Maple 15). Moreover, 
the MapleSim environment, based on Modelica, 
is dedicated to physical modeling and simulation. 
Symbolic simplification, numerical solution 
of the differential-algebraic equations (DAEs), 
and model post-processing (sensitivity analysis, 
linearization, parameter optimization, code 
generation, etc.) can be performed in MapleSim. 
Its Control Design Toolbox provides solutions 
for optimal control, Kalman filtering, pole 
assignment, etc. Bidirectional communication 
with MATLAB is possible. 

MuPAD is another computer algebra system,
initially developed by a group at the University
of Paderborn, Germany, and then in cooperation
with SciFace Software GmbH & Co. KG, a company
purchased in 2008 by The MathWorks, Inc.
MuPAD has been used with Scilab, and it is now
available in the Symbolic Math Toolbox. MuPAD
is able to operate on formulas symbolically or 
numerically (with specified accuracy). It offers a
programming language allowing object-oriented
and functional programming; several packages
for linear algebra, differential equations, number
theory, and statistics; an interactive graphical system
supporting animations and transparent areas
in three-dimensional images; etc.

LabVIEW (Laboratory Virtual Instrumentation
Engineering Workbench), from National
Instruments, is an interactive development
environment, based on MATRIXx, for a visual
programming language mainly used for data
acquisition, instrument control, and industrial
automation. Its Control Design and Simulation
Module (see http://sine.ni.com/psp/app/doc/p/id/psp-648/lang/en)
can be used to build process and
controller models using transfer-function, state-space,
or zero-pole-gain representations; analyze
the open- and closed-loop system behavior;
deploy the designed controllers to real-time
hardware using built-in functions and the LabVIEW
Real-Time Module; etc.

Software Tools for CACSD 

The software tools for CACSD are formally 
divided below into computational and support 
tools. SLICOT Library and Dymola serve as 
illustrative examples. The support tools can also 
include computational components. 

Computational Tools 

The computational tools for CACSD implement 
the main numerical algorithms of the systems and 
control theory and should satisfy several strong 
requirements: 

• Reliability or guaranteed accuracy, which implies
the use of numerically stable algorithms
as much as possible and the estimation of the
problem sensitivity (conditioning) and of the
accuracy of the results; backward numerical stability
ensures that the computed results are exact for
slightly perturbed original data.

• Computational efficiency, which is important
for large-scale engineering design problems or
for real-time control.

• Robustness, which is mainly ensured by
avoiding overflows, harmful underflows,
and unacceptable accumulation of rounding
errors; scaling the data may be essential.

• Ease of use, achieved by a simplified user interface
(hiding the details) and default values for
algorithmic parameters, such as tolerances.

• Wide scope and rich functionality, which address
the range of problems and system representations
that can be handled.

• Portability to various platforms, in the sense
of functional correctness.

• Reusability, in building several dedicated
engineering software systems or environments.
More details are given, e.g., in Van Huffel et al.
(2004). An example addressing all these aspects 
is discussed in what follows. 
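
As a small illustration of the reliability requirement (it is not taken from SLICOT or any other CACSD library), the following Python sketch compares two algebraically equivalent formulas for the roots of a quadratic. The function names and the test polynomial are assumptions chosen for the example; the point is only that one formula suffers from cancellation while the other avoids it.

```python
import math


def roots_naive(a, b, c):
    """Textbook formula; the root obtained by subtraction may lose digits."""
    d = math.sqrt(b * b - 4.0 * a * c)
    return (-b + d) / (2.0 * a), (-b - d) / (2.0 * a)


def roots_stable(a, b, c):
    """Avoid cancellation: form the larger-magnitude root first, then use
    the product of the roots (c / a) to recover the other one.
    Assumes real roots and q != 0."""
    d = math.sqrt(b * b - 4.0 * a * c)
    q = -0.5 * (b + math.copysign(d, b))
    return q / a, c / q


# Illustrative data: x^2 - 1e8 x + 1 = 0 has roots near 1e8 and 1e-8.
a, b, c = 1.0, -1.0e8, 1.0
small_naive = min(roots_naive(a, b, c), key=abs)
small_stable = min(roots_stable(a, b, c), key=abs)
print("naive :", small_naive)
print("stable:", small_stable)
```

Both functions use the same data and the same square root; only the order of operations differs, which is the kind of detail that reliable numerical software has to get right.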

SLICOT Library (Benner et al. 1999; Van Huffel
et al. 2004) is one of the most comprehensive
libraries for control theory numerical
computations, containing over 500 subroutines
which cover system analysis, benchmark and
test problems, data analysis, filtering, identification,
mathematical routines, some capabilities for
nonlinear systems, synthesis, system transformation,
and utility routines (see http://www.slicot.org/).
The requirements above have been taken
into account in the SLICOT Library development.
Some of the SLICOT components are used
in several interactive environments for CACSD,
including MATLAB, Maple, Scilab, and the Octave
control package. The library is still under development.
It is worth mentioning the new focus
on structure-preserving algorithms, which offer
increased accuracy, reliability, and efficiency, in
comparison with standard solvers. Many procedures
for optimal control and filtering, model
reduction, etc., can benefit from using the
“structured” solvers. There are also separate
SLICOT-based toolboxes for MATLAB (Benner et al.
2010). SLICOT components follow predefined
implementation and documentation standards.

SLICOT Library routines and functions from
many interactive environments for CACSD call
components from the Basic Linear Algebra 
Subprograms (BLAS, see Dongarra et al. 1990 
and the references therein) and Linear Algebra 
PACKage (LAPACK, Anderson et al. 1999). This 
approach enhances portability and efficiency, 
since optimized BLAS and LAPACK Libraries 
are provided for major computer platforms. 

Support Tools 

The support software tools for CACSD offer
additional capabilities compared to computational
tools. They may include alternative
algorithms, symbolic computations (usually,
for low-dimensional problems), and extended
functionality, e.g., for modeling/simulation of
nonlinear systems, code generation, etc. The
support tools can be used by software developers
of CACSD environments or computational tools
or directly by other users. For instance, symbolic
calculations are useful for checking the accuracy
of numerical algorithms. The code generation
facility offers safe and convenient support
for deploying a design solution to the control
hardware. A reference support software tool is
briefly presented below.

Dymola (Dynamic modeling laboratory), from
Dassault Systèmes (see
http://www.3ds.com/products/catia/portfolio/dymola),
deals with
high-fidelity modeling and simulation of complex
systems from various domains, like aerospace,
automotive, robotics, process control, and other
applications. Compatible and comprehensive
model libraries, developed by leading experts,
exist for many engineering branches. The
users may create their own libraries or adapt
existing libraries. This flexibility and openness are
provided by the use of the open, object-oriented
modeling language Modelica, currently further
developed by the Modelica Association.

Equation-oriented models, based on DAEs,
and symbolic manipulation are used, stimulating
the reuse of components and enhancing the
reliability and efficiency of the calculations. This
approach simplifies generating the equations
that result from interconnecting various
subsystems and makes it possible to deal with
algebraic loops and structurally singular models.
Algebraic loops are encountered when some auxiliary
variables depend algebraically upon each other in a
mutual way (Cellier and Elmqvist 1992). Structural
singularities are related to DAEs of index higher
than 1.

Dymola supports hardware-in-the-loop
simulation and real-time 3D animation. A
model can be built by graphical composition,
connecting components from various libraries
using simple drag-and-drop operations.
The parameters a model depends on can be tuned
either by parameter estimation (also called model
calibration), which minimizes the error between
the physical measurements and simulation
results, or by optimization, which minimizes
certain performance criteria. Sometimes, e.g.,
when designing certain controllers, the criteria
values are obtained by simulation. Dymola also
offers facilities for model management, including
checking, testing, encrypting, or comparing
models, and version control.

Summary and Future Directions 

The main functional and support facilities offered
by interactive environments and software
tools for CACSD and reference examples have
been presented. Their remarkable evolution during
the past decades, combined with the importance
of the design solutions they offer, is
a strong argument that the CACSD software
arsenal will continue to evolve and that more
reliable, efficient, and powerful systems will come
into place. Progress is expected at all levels,
including basic algorithms and numerical and
symbolic libraries but also command languages,
user interfaces, human-machine communication,
and associated hardware. Tools for adaptive,
nonlinear, and distributed control systems design
should be developed and integrated. Artificial
intelligence support might be required to add
expert capabilities to the forthcoming interactive
environments.


Cross-References 

► Computer-Aided Control Systems Design:
Introduction and Historical Overview

► Model Order Reduction: Techniques and Tools 

► Multi-domain Modeling and Simulation 

► Optimization-Based Control Design
Techniques and Tools

► Robust Synthesis and Robustness Analysis 
Techniques and Tools 

► System Identification: An Overview 

► Validation and Verification Techniques and 
Tools 


Recommended Reading 

CACSD is well presented in many textbooks.
A very recent one is Chin (2012), which covers
modeling, control system design, implementation,
and testing, and describes practical
applications using MATLAB and Simulink.
Many IFAC (International Federation of
Automatic Control) and IEEE (Institute of Electrical
and Electronics Engineers) international
conferences and symposia have been dedicated
to CACSD, going back more than two decades.
A wealth of material is available, e.g., on IEEE
Xplore (ieeexplore.ieee.org), containing the
proceedings of many of the IEEE CACSD events.
A recent event is the 2011 IEEE International
Symposium on CACSD. Similar IEEE events
were held in 2010, 2008, 2006, 2004, 2002,
2000, 1999, 1996, 1994, 1992, and 1989. A new
IEEE CACSD Conference, for Systems under
Uncertainty, took place in July 2013.
Abstract

This entry is a brief survey of classical inventory
models and their extensions in several directions
such as world-driven demands, presence
of forecast updates, multi-delivery modes and
advanced demand information, incomplete
inventory information, and decentralized inventory
control in the context of supply chain management.
Important references are provided. We conclude
with suggestions for future research.


Keywords


Introduction

Optimal inventory theory deals with managing 
stock levels of goods to effectively meet the 
demand of those goods. Because of the huge 
amount of capital that is tied up in inventory, 
its management is critical to the profitability of 
firms. A systematic analysis of inventory problems
began with the development of the classical
economic order quantity (EOQ) formula of
Ford W. Harris in 1913. A substantial amount 
of research was reported in 1958 by Kenneth J. 
Arrow, Samuel Karlin, and Herbert Scarf, and 
much more has accumulated since then. Books on 
the topic include Zipkin (2000), Porteus (2002), 
Axsater (2006), and Bensoussan (2011). 

In this entry, we review single- and multi-period
models with deterministic, stochastic, and
partially observed demand for a single product. In
these models, our aim is to decide on the time
of the orders and the order quantities. The time
between issuing an order and its receipt is called
the lead time. For most of this review, we will
assume the lead time to be zero, and the reader
can consult the referenced books for nonzero lead
time extensions and other topics not covered here.


Deterministic Demand 

We will describe two classical models: the EOQ 
model and the dynamic lot size model. 

The EOQ Model 

This basic and most important deterministic 
model is concerned with a product that has a 
constant demand rate D in continuous time over 
an infinite horizon. No shortages are allowed. 
The costs consist of a fixed setup/ordering cost K 
and a holding cost h per unit of average on-hand 
stock per unit time. The production/purchase cost 
per unit time is a sunk cost since there is no 
choice of a total amount to produce, and hence 
it can be ignored. Although dynamic, the model 
can be reduced to a static model by a simple 
argument of periodicity. Moreover, it is obvious 
that one should never produce or order except for 
when the inventory level is zero, and one should 
order the same lot size Q each time the inventory 
level reaches zero. Since the average inventory
level over time is Q/2 and the number of setups
is D/Q per unit time, the long-run average cost
to be minimized is KD/Q + hQ/2. The optimal
policy that minimizes this cost, obtained using
the first-order condition, is to order the lot size

$$Q^* = \sqrt{\frac{2KD}{h}} \tag{1}$$

every time the inventory level reaches zero.
Harris (1913) introduced the model. Erlenkotter 
(1990) provides a historical account of the 
formula, and Beyer and Sethi (1998) provide 
a mathematically rigorous proof involving 
quasi-variational inequalities (QVI) that arise 
in the course of dealing with continuous-time 
optimization problems involving fixed costs. 
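
As a quick numerical check of the EOQ reasoning, the sketch below evaluates the long-run average cost KD/Q + hQ/2 and the lot size sqrt(2KD/h) that minimizes it. The function names and the numerical values of K, D, and h are illustrative assumptions, not data from the literature cited here.

```python
import math


def average_cost(Q, K, D, h):
    """Long-run average cost K*D/Q + h*Q/2 of ordering Q each cycle."""
    return K * D / Q + h * Q / 2.0


def eoq(K, D, h):
    """Return the lot size sqrt(2*K*D/h) and the resulting average cost."""
    if K <= 0 or D <= 0 or h <= 0:
        raise ValueError("K, D and h must be positive")
    q_star = math.sqrt(2.0 * K * D / h)
    return q_star, average_cost(q_star, K, D, h)


# Hypothetical data: setup cost 100, demand rate 1000, holding cost 5.
q_star, cost = eoq(K=100.0, D=1000.0, h=5.0)
print(q_star, cost)
```

At the optimum the setup and holding components of the cost are equal, which is a convenient sanity check for any implementation.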
The Dynamic Lot Size Model

This is an analogue of the EOQ model when 
the demand varies over time. Wagner and Whitin 
(1958) developed it in the discrete-time finite 
horizon framework. With D(t) denoting the demand
in period t and other costs similar to those
in the EOQ model, they showed that there exists 
an optimal policy in which an order will be 
issued just as the inventory level reaches zero, 
except for the first order. This policy is called 
the zero-inventory policy. With this in hand, the 
problem reduces to selecting only the order times. 
This is accomplished by applying a shortest path 
algorithm. Moreover, there are forward (recursion)
procedures for solving the problem.

An important feature of this model is that in 
most cases, one can detect a forecast horizon 
which essentially separates earlier periods from 
later ones. More specifically, T is a forecast 
horizon if the first order in a T horizon problem 
remains optimal in any finite horizon problem 
with horizon longer than T, regardless of the 
demands beyond the period T. For an extensive 
bibliography of this literature, see Chand et al.
(2002).
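
The zero-inventory property turns the dynamic lot size problem into a choice of order periods, which a forward recursion can solve. The sketch below is one minimal way to do this, assuming zero initial inventory, a fixed cost K per order, and a holding cost h charged on each unit for every period it is carried; the function name and the demand data are hypothetical.

```python
def dynamic_lot_size(demand, K, h):
    """Forward recursion over zero-inventory policies.

    demand[t] is the demand in period t, K the fixed ordering cost and
    h the cost of carrying one unit for one period.
    Returns (minimum total cost, list of order quantities).
    """
    n = len(demand)
    best = [0.0] + [float("inf")] * n  # best[t]: min cost of periods 0..t-1
    last = [0] * (n + 1)               # last[t]: period of the final order
    for t in range(1, n + 1):
        for j in range(t):             # final order placed in period j
            holding = sum(h * (i - j) * demand[i] for i in range(j, t))
            cost = best[j] + K + holding
            if cost < best[t]:
                best[t], last[t] = cost, j
    orders = [0] * n
    t = n
    while t > 0:                       # backtrack the order periods
        j = last[t]
        orders[j] = sum(demand[j:t])
        t = j
    return best[n], orders


total_cost, orders = dynamic_lot_size([20, 50, 10, 50, 50], K=100, h=1)
print(total_cost, orders)
```

The triple loop is kept for readability; faster implementations exist, but they are not needed to illustrate the recursion.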

Stochastic Demand

We shall discuss three classical models and some
of their extensions.

The Single-Period Problem: The
Newsvendor Model

The problem of a newsvendor is to decide on an
order quantity of newspapers to meet a stochastic
demand at a minimum cost. If the realized
demand is larger than the ordered quantity, it is
lost and there is an opportunity loss of $c_u$ (selling
price minus purchase cost) for each paper short.
On the other hand, for each paper ordered but not
sold, there is an opportunity loss of $c_o$ (purchase
cost plus holding cost). The newsvendor conceptualizes
the decision by treating each additional paper as a
separate marginal contribution. The first is almost
certain to be sold. Each additional paper is less
likely to be sold than the previous one. Thus,
each additional paper will be worth somewhat
less, and the marginal paper at the optimum
should be worth exactly zero. Thus, $c_u$ times the
probability of selling the marginal paper minus
$c_o$ times the probability of not selling it should
equal zero. Now, if $F$ denotes the cumulative
probability distribution function of the demand
$D$, then clearly the optimal order quantity $Q^*$
satisfies $c_o F(Q^*) - c_u (1 - F(Q^*)) = 0$, which
gives us the famous newsvendor formula for the
optimal order quantity

$$Q^* = F^{-1}\left(\frac{c_u}{c_u + c_o}\right), \tag{2}$$

where $c_u/(c_u + c_o)$ is known as the critical fractile.

If $p$ denotes the unit sale price, $c$ the unit
cost, and $h$ the holding cost per unit per unit
time, then $c_u = p - c$ and $c_o = c + h$, and
therefore, the critical fractile can be expressed as
$(p - c)/(p + h)$. An extension of the newsvendor
formula to allow for a unit cost $g$ of lost
goodwill and a unit salvage value $s$ received at
the end of the period for each unit not sold is
immediate. If we let $\alpha > 0$ denote the periodic
discount factor, then $c_u = p + g - c$ and
$c_o = c + h - \alpha s$, and the critical fractile becomes
$(p + g - c)/(p + g + h - \alpha s)$, and therefore,

$$Q^* = F^{-1}\left(\frac{p + g - c}{p + g + h - \alpha s}\right). \tag{3}$$

The newsvendor model has been used extensively
in the context of supply chain management
with multiple agents maximizing their individual 
objectives. In this case, inefficiencies arise due 
to double marginalization. Then, a question of 
appropriate contracts that can lead to the first-best 
solution, or coordinate the supply chain, becomes 
important. Cachon (2003) surveys this literature. 
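
The newsvendor quantity is the demand quantile at the critical fractile, so it is straightforward to evaluate once a demand distribution is chosen. The sketch below assumes normally distributed demand (which, as an approximation, allows negative values) and uses hypothetical prices and costs; the function names are illustrative, and the goodwill, salvage, and discount parameters default to values that recover the basic fractile (p - c)/(p + h).

```python
from statistics import NormalDist


def critical_fractile(p, c, h=0.0, g=0.0, s=0.0, alpha=1.0):
    """Return c_u / (c_u + c_o) with c_u = p + g - c, c_o = c + h - alpha*s."""
    c_u = p + g - c          # underage cost per unit short
    c_o = c + h - alpha * s  # overage cost per unit left over
    if c_u <= 0 or c_o <= 0:
        raise ValueError("underage and overage costs must be positive")
    return c_u / (c_u + c_o)


def newsvendor_quantity(demand, p, c, h=0.0, g=0.0, s=0.0, alpha=1.0):
    """Order quantity: the demand quantile at the critical fractile."""
    return demand.inv_cdf(critical_fractile(p, c, h, g, s, alpha))


# Hypothetical data: demand ~ N(100, 20^2), price 10, cost 6, holding 1.
demand = NormalDist(mu=100.0, sigma=20.0)
q = newsvendor_quantity(demand, p=10.0, c=6.0, h=1.0)
print(q, demand.cdf(q))
```

Because the overage cost exceeds the underage cost in this data set, the fractile is below one half and the order quantity falls below the mean demand.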

Multi-period Inventory Models: No 
Fixed Cost 

The newsvendor model is a single-period model,
and its multi-period generalization requires that
the inventory not sold in a period is carried over
to the next period. This results in the multi-period
inventory model with lost sales. It is assumed
that demand in each period is independent and
identically distributed (i.i.d.) with $F$ denoting its
cumulative probability distribution function. A
rigorous analysis requires the method of dynamic
programming, and it shows that there is a stock
level $S_t$, called the base stock in period $t$, that we
would ideally like to have at the beginning of
period $t$. Thus, with $x$ denoting the inventory level
at the beginning of period $t$, the optimal policy in
period $t$, called the base stock policy, is to order

$$Q_t(x) = \begin{cases} S_t - x & \text{if } x < S_t, \\ 0 & \text{if } x \ge S_t. \end{cases} \tag{4}$$

In the special case when the terminal salvage 
value of an item is exactly equal to its cost c, it is 
possible to come up with the optimal policy using 
intuition. Since we do not need to salvage unused 
items in the multi-period setting, one can argue 
that an item carried over to the next period is 
worth its purchase cost c. Therefore, its presence 
means that the next period will need to order one 
less and thus save an amount c. In the last period, 
when there is no next period, our terminal salvage
value assumption also guarantees a leftover
item's worth to be also $c$. Thus, we can modify
(3) and obtain a stationary base stock level

$$S = F^{-1}\left(\frac{p + g - c}{(p + g - c) + (c + h - \alpha c)}\right) = F^{-1}\left(\frac{p + g - c}{p + g + h - \alpha c}\right) \tag{5}$$

for each period $t$.

Thus, the elimination of the endgame effect 
delivers us a myopic policy , a policy optimal in 
the single-period case to be also optimal in the 
dynamic multi-period setting. A more general 
concept than the optimality of a myopic policy 
is that of the forecast horizon mentioned earlier 
in the context of the dynamic lot size model. 

Sometimes, when the demand exceeds the 
on-hand inventory in the period, the demand is 
not lost but backlogged. In this case, each unit 
of backlogged demand is satisfied in the next 
period, and unit revenue p is recovered, but a unit 
backlogging cost b is incurred, due to expediting, 
special handling, delayed receipt of revenue, and 
loss of goodwill. Thus, $c_u = b - (1 - \alpha)c$,
where the second term represents the savings due
to postponing the purchase of the backlogged
demand unit by one period, and $c_o = c + h - \alpha c$
as in (5). This gives us the base stock level

$$S = F^{-1}\left(\frac{b - (1 - \alpha)c}{b + h}\right), \tag{6}$$

which can be used in (4) to give the optimal
policy.
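
The two stationary base stock levels differ only in their critical fractiles, so both can be evaluated with the same quantile call. The sketch below assumes normally distributed i.i.d. demand and hypothetical cost data; the function names are illustrative and the code is not taken from the cited references.

```python
from statistics import NormalDist


def base_stock_lost_sales(demand, p, c, h, g=0.0, alpha=1.0):
    """Lost-sales level: F^{-1}((p + g - c) / (p + g + h - alpha*c))."""
    return demand.inv_cdf((p + g - c) / (p + g + h - alpha * c))


def base_stock_backlog(demand, b, c, h, alpha=1.0):
    """Backlogging level: F^{-1}((b - (1 - alpha)*c) / (b + h))."""
    return demand.inv_cdf((b - (1.0 - alpha) * c) / (b + h))


def base_stock_order(x, S):
    """Order up to S when the starting inventory x is below S."""
    return max(S - x, 0.0)


# Hypothetical data: demand ~ N(100, 20^2) in every period.
demand = NormalDist(mu=100.0, sigma=20.0)
S_lost = base_stock_lost_sales(demand, p=10.0, c=5.0, h=1.0)
S_back = base_stock_backlog(demand, b=4.0, c=5.0, h=1.0, alpha=0.9)
print(S_lost, S_back, base_stock_order(80.0, S_lost))
```

`inv_cdf` raises an error when the fractile is not strictly between 0 and 1, which flags cost data for which the formulas do not apply.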

Sometimes it is possible to have multiple delivery
modes such as fast, regular, and slow as
well as demand forecast updates. Then, at the
beginning of each period, on-hand inventory and
demand information are updated. At the same time,
decisions on how much to order using each of the 
modes are made. Fast, regular, and slow orders 
are delivered at the ends of the current, the next, 
and one beyond the next periods, respectively. 
In such models, a modified base stock policy is 
optimal only for the two fastest modes. For details 
and further generalization, see Sethi et al. (2005). 

An important extension includes serial inventory
systems where stage 1 receives supplies from
an outside source and each downstream stage 
receives supplies from its immediate upstream 
stage. Clark and Scarf (1960) introduced the 
notion of the echelon inventory position at a stage 
to consist of the stock at that stage plus stock 
in transit to that stage plus all downstream stock 
minus the amount backlogged at the final stage. 
Then, the optimal ordering policy at each stage 
is given by an echelon base stock policy with 
respect to the echelon inventory position at that 
stage. It is known that assembly systems can be 
reduced to a serial system. Details can be found 
in Zipkin (2000). 
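
The echelon inventory position is a simple running sum over the stages, as the following sketch shows. It assumes that stages are listed from the most upstream to the final stage and that "all downstream stock" includes stock in transit between downstream stages; the function name and the data are hypothetical.

```python
def echelon_positions(on_hand, in_transit, backlog):
    """Echelon inventory position of every stage in a serial system.

    on_hand[j]    : stock held at stage j
    in_transit[j] : stock in transit to stage j
    backlog       : amount backlogged at the final stage
    Stages are indexed from the most upstream (0) to the final stage.
    """
    positions = []
    downstream = 0.0
    for j in reversed(range(len(on_hand))):
        local = on_hand[j] + in_transit[j]
        positions.append(local + downstream - backlog)
        downstream += local
    positions.reverse()
    return positions


# Hypothetical three-stage system with 6 units backlogged at the end.
print(echelon_positions([10, 5, 2], [4, 3, 1], backlog=6))
```

An echelon base stock policy would then compare each of these positions with the base stock level of the corresponding stage.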


Multi-period Inventory Models: Fixed Cost 

When there is a fixed cost of ordering, it is clear 
that it would not be reasonable to follow the base 
stock policy when the inventory level is not much 
below the base stock level. Indeed, Scarf (1960)
proved that there are numbers $s_t$ and $S_t$, $s_t < S_t$,
for period $t$ such that the optimal policy in period
$t$ is to order

$$Q_t(x) = \begin{cases} S_t - x & \text{if } x < s_t, \\ 0 & \text{if } x \ge s_t. \end{cases} \tag{7}$$

Such a policy is famously known as an (s, S)
policy.
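
The ordering rule of an (s, S) policy is easy to state in code, and a short simulation shows how the fixed cost is traded against holding cost. The sketch below assumes a stationary policy, lost sales, and a cost made up only of the fixed ordering cost and end-of-period holding cost (purchase cost and shortage penalties are omitted); the function names and the demand sampler are hypothetical.

```python
import random


def ss_order(x, s, S):
    """(s, S) rule: order up to S when inventory x is below s."""
    return S - x if x < s else 0


def simulate_ss(s, S, K, h, periods, demand_sampler, x0=0, seed=0):
    """Average per-period fixed plus holding cost under an (s, S) policy."""
    rng = random.Random(seed)
    x, total = x0, 0.0
    for _ in range(periods):
        q = ss_order(x, s, S)
        if q > 0:
            total += K                       # fixed cost of placing an order
        x += q
        x = max(x - demand_sampler(rng), 0)  # unmet demand is lost
        total += h * x                       # holding on end-of-period stock
    return total / periods


# Hypothetical demand: uniform integers between 5 and 15 per period.
avg = simulate_ss(s=10, S=30, K=50.0, h=1.0, periods=1000,
                  demand_sampler=lambda rng: rng.randint(5, 15))
print(avg)
```

Raising S for a fixed s reduces the ordering frequency but increases the average stock, which is the trade-off the optimal pair balances.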

When the demands are not i.i.d., the model 
has been extended to Markovian demands. In this 
case, there is an exogenous Markov process, and 
the distribution of the demand in each period 
depends on the state of the Markov process, 
called the demand state, in that period. It can 
be shown that the optimal policy in period t
is $(s_t^i, S_t^i)$, where i denotes the demand state
in the period. Such a policy is also called a 
state-dependent (s, S) policy. Further details are 
available in Beyer et al. (2010). Recent advances
in information technology have allowed managers
to obtain advance demand information in
addition to forecast updates. In such cases, a
state-dependent (s, S) policy can be shown to be
optimal. For details, refer to Ozer (2011).

The Continuous-Time Model: Fixed Cost 

The marriage of the two classical results (1) 
and (7) is accomplished by Presman and Sethi 
(2006) in a continuous-time stochastic inventory 
model involving a demand that is the sum of a 
constant demand rate and a compound Poisson 
process. The optimal policies that minimize a 
discounted cost or the long-run average cost are 
both of (s, S) type. The (s, S) policy minimizing
the long-run average cost reduces to the EOQ formula
when the intensity of the compound Poisson
process is set to zero. And when the constant demand
component vanishes, the model reduces to
the continuous-review stochastic inventory model 
with fixed cost and compound Poisson demand. 

Incomplete Inventory Information 
Models (i3) 

A critical assumption in the vast inventory theory 
literature has been that the level of inventory at 
any given time is fully observed. The celebrated 
results (1) and (7) have been obtained under the 
assumption of full observation. Yet the inventory 
level is often not fully observed in practice, for a 
variety of reasons such as replenishment errors, 
employee theft, customer shoplifting, improper 
handling and damaging of merchandise, 
misplaced inventories, uncertain yield, imperfect 
inventory audits, and incorrect recording of
sales. In such an environment of incomplete
information, inventories are known to be partially 
observed and most of the well-known inventory 
policies including (1) and (7) are not even 
admissible, let alone optimal. In such cases, 
Bensoussan et al. (2010) show that the dynamic 
programming equation can be written in terms of 
the unnormalized conditional probability of the 
current inventory level given past observations, 
referred to as signals, instead of just the inventory 
level in the full observation case. Furthermore, 
one can write the evolution of the conditional 
probability in terms of its current value, the 
current order, and the current observation. However,
there are no longer simple optimal policies
except in cases of information delay reported in 
Bensoussan et al. (2009) where modified base 
stock and (s, S) policies are shown to be optimal. 

Summary and Future Directions 

We briefly describe some classical results in
inventory theory. These are based on full
observation. Some recent work on inventory models
under incomplete information is reported. This
work leads to a number of new research directions,
both theoretical and empirical, as reported
in Sethi (2010). It would be of much interest
to know the industries where the i3 problem is
serious enough to warrant the difficult mathematical
analysis required. Furthermore, how are the
observed signals related to the inventory level? It
is also clear from the reviewed literature that there
are no simple optimal policies for most i3 problems,
so it would be important to develop efficient
computational procedures to obtain optimal
solutions or to specify a class of simple implementable
policies and optimize within this class.
An important benefit of solving i3 problems optimally
is the provision of an economic justification
for technologies such as RFID that may
reduce inaccuracies in inventory observations.

Another area of research would be to study
multi-period multi-agent supply chains with
stochastic inventory dynamics. While these can
be formulated as dynamic games, there are a
number of equilibrium concepts to deal with,
depending on the information the agents have.
Some of them are time consistent or subgame
perfect and some are not. Regardless, there are
inefficiencies that arise from these decentralized
game settings, and developing contracts for
coordinating dynamic supply chains remains a wide
open topic of research.


Cross-References 

► Nonlinear Filters 

► Stochastic Dynamic Programming 
Investment-Consumption Modeling

L.C.G. Rogers 

University of Cambridge, Cambridge, UK 

Abstract 

The simplest investment-consumption problem is 
the celebrated example of Robert Merton (J Econ 
Theory 3(4):373-413, 1971). This survey shows 
three different ways of solving the problem, each 
of which is a valuable solution method for more 
complicated versions of the question. 


Keywords 

Budget constraint; Hamilton-Jacobi-Bellman
(HJB) equation; Merton problem; Value function 


Introduction 

Consider an investor in a market with a riskless
bank account accruing continuously compounded
interest at rate $r_t$, and with a single risky asset
whose price $S_t$ at time $t$ evolves as

$$dS_t = S_t(\sigma_t\,dW_t + \mu_t\,dt), \tag{1}$$

where $W$ is a standard Brownian motion, and $\sigma$
and $\mu$ are processes previsible with respect to the
filtration of $W$. The investor starts with initial
wealth $w_0$ and chooses the rate $c_t$ of consuming,
and the wealth $\theta_t$ to invest in the risky asset, so
that his overall wealth evolves as

\begin{align}
dw_t &= \theta_t(\sigma_t\,dW_t + \mu_t\,dt) + r_t(w_t - \theta_t)\,dt - c_t\,dt \tag{2} \\
     &= r_t w_t\,dt + \theta_t\{\sigma_t\,dW_t + (\mu_t - r_t)\,dt\} - c_t\,dt. \tag{3}
\end{align}

For convenience, we assume that $\sigma$ and $\mu$
are bounded. See Rogers and Williams (2000a, b)
for background information on stochastic processes.
The three terms in (2) have natural interpretations:
The first expresses the evolution of
the wealth invested in the stock, the second the
interest accruing on the wealth $(w - \theta)$ invested in
the bank account, and the third is the cash being
withdrawn for consumption.
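
To make the wealth dynamics concrete, the sketch below simulates one path with an Euler-Maruyama discretisation, assuming constant coefficients r, mu, and sigma and a feedback rule that returns the consumption rate and the amount held in the risky asset as functions of current wealth. The function name, the step count, and the proportional rule in the usage lines are assumptions made for illustration only.

```python
import math
import random


def simulate_wealth(w0, r, mu, sigma, policy, T=1.0, steps=1000, seed=0):
    """One Euler-Maruyama path of
        dw = r*w*dt + theta*(sigma*dW + (mu - r)*dt) - c*dt,
    where policy(w) returns (c, theta). Returns the wealth at time T,
    or 0.0 if the discretised wealth reaches zero before T."""
    rng = random.Random(seed)
    dt = T / steps
    w = w0
    for _ in range(steps):
        c, theta = policy(w)
        dW = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        w += r * w * dt + theta * (sigma * dW + (mu - r) * dt) - c * dt
        if w <= 0.0:
            return 0.0
    return w


# Hypothetical rule: consume 5% of wealth per unit time, hold 60% in stock.
w_T = simulate_wealth(1.0, r=0.03, mu=0.08, sigma=0.2,
                      policy=lambda w: (0.05 * w, 0.6 * w))
print(w_T)
```

With zero consumption and nothing invested in the risky asset, the scheme reduces to compounding at the riskless rate, which gives a simple correctness check.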

To avoid so-called doubling strategies, we insist
that the wealth process so generated by the
controls $(c, \theta)$ must remain bounded below in
some suitable way, which here is just the condition
$w_t \ge 0$ for all $t \ge 0$; any $(c, \theta)$ satisfying
this condition will be called admissible. The set
of admissible $(c, \theta)$ will be denoted $\mathcal{A}(w_0)$, a
notation which makes explicit the dependence on
the investor's initial wealth.

The investor's objective is taken to be to obtain

$$V(w_0) = \sup_{(c,\theta) \in \mathcal{A}(w_0)} E\left[\int_0^\infty e^{-\rho t}\,U(c_t)\,dt\right] \tag{4}$$

for some constant $\rho > 0$. The problem cannot
be solved explicitly at this level of generality, 
but if we take some special cases, we are able 
to illustrate the main methods used to attack 
it. Many other objectives with various different 
constraints can be handled by similar techniques: 
see Rogers (2013) for a wide range of examples. 


The Main Techniques 

We present here three important techniques for
solving such problems: the value function approach;
the use of dual variables; and the use of
martingale representation. The first two methods
only work if the problem is Markovian; the third
only works if the market is complete. There is
a further method, the Pontryagin-Lagrange approach;
see Sect. 1.5 in Rogers (2013). While
this is a quite general approach, we can only 
expect explicit solutions when further structure is 
available. 

The Value Function Approach 

To illustrate this, we focus on the original Merton
problem (Merton 1971), where $\sigma$ and $\mu$ are both
constant, and the utility $U$ is constant relative risk
aversion (CRRA):

$$U'(x) = x^{-R} \quad (x > 0) \tag{5}$$

for some $R > 0$ different from 1. The case
$R = 1$ corresponds to logarithmic utility, and
can be solved by similar methods. Perhaps the
best starting point is the Davis-Varaiya Martingale
Principle of Optimal Control (MPOC): The
process $Y_t = e^{-\rho t}V(w_t) + \int_0^t e^{-\rho s}U(c_s)\,ds$
is a supermartingale under any control, and a
martingale under optimal control. If we use Itô's
formula, we find that

\begin{align}
e^{\rho t}\,dY_t &= -\rho V(w_t)\,dt + V'(w_t)\,dw_t + \tfrac{1}{2}\sigma^2\theta_t^2 V''(w_t)\,dt + U(c_t)\,dt \notag \\
&\doteq \left[-\rho V + \{\theta_t(\mu - r) - c_t + r w_t\}V' + \tfrac{1}{2}\sigma^2\theta_t^2 V'' + U(c_t)\right] dt, \tag{6}
\end{align}

where the symbol $\doteq$ denotes that the two sides
differ by a (local) martingale. If the MPOC is to
hold, then we expect that the drift in $dY$ should
be non-positive under any control, and equal to
zero under optimal control. We simply assume
for now that local martingales are martingales;
this is of course not true in general, and is a point
that needs to be handled carefully in a rigorous
proof. Directly from (6), we then deduce the
Hamilton-Jacobi-Bellman (HJB) equations for
this problem:

$$0 = \sup_{c,\theta}\left[-\rho V + \{\theta(\mu - r) - c + r w\}V' + \tfrac{1}{2}\sigma^2\theta^2 V'' + U(c)\right]. \tag{7}$$

Write $\tilde U(y) = \sup_x \{U(x) - xy\}$ for the convex dual of
$U$, which in this case has the explicit form

$$\tilde U(y) = -\frac{y^{1-R'}}{1-R'} \tag{8}$$

with $R' = 1/R$. We are then able to perform the optimizations in
(7) quite explicitly to obtain

$$0 = -\rho V + r w V' + \tilde U(V')
  - \tfrac12 \kappa^2 \frac{(V')^2}{V''} \tag{9}$$

where

$$\kappa = \frac{\mu - r}{\sigma}. \tag{10}$$

Nonlinear PDEs arising from stochastic optimal control problems are
not in general easy to solve, but (9) is tractable in this special
setting, because the assumed CRRA form of $U$ allows us to deduce by
a scaling argument that $V(w) \propto w^{1-R} \propto U(w)$, and we
find that

$$V(w) = \gamma_M^{-R}\, U(w), \tag{11}$$

where

$$R\gamma_M = \rho + (R - 1)\big(r + \tfrac12 \kappa^2 / R\big). \tag{12}$$

The optimal investment and consumption behavior is easily deduced
from the optimal choices which took us from (7) to (9). After some
calculations, we discover that

$$\theta_t^* = \pi_M w_t = \frac{\mu - r}{\sigma^2 R}\, w_t, \qquad
  c_t^* = \gamma_M w_t \tag{13}$$

specifies the optimal investment/consumption behavior in this
example. (The positivity of $\gamma_M$ is necessary and sufficient
for the problem to be well posed; see Sect. 1.6 in Rogers (2013).)
Unsurprisingly, the optimal solution scales linearly with wealth.

Dual Variables

We illustrate the use of dual variables in the constant-coefficient
case of the previous section, except that we no longer suppose the
special form (5) for $U$. The analysis runs as before all the way to
(9), but now the convex dual $\tilde U$ is not simply given by (8).
Although it is not now possible to guess and verify, there is a
simple transformation which reduces the nonlinear ODE (9) to
something we can easily handle. We introduce the new variable
$z > 0$ related to $w$ by $z = V'(w)$, and define a function $J$ by

$$J(z) = V(w) - wz. \tag{14}$$

Simple calculus gives us $J' = -w$, $J'' = -1/V''$, so that the HJB
equation (9) transforms into

$$0 = \tilde U(z) - \rho J(z) + (\rho - r) z J'(z)
  + \tfrac12 \kappa^2 z^2 J''(z), \tag{15}$$

which is now a second-order linear ODE, which can be solved by
traditional methods; see Sect. 1.3 of Rogers (2013) for more details.
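
As a numerical illustration of the constant-coefficient CRRA solution, the following minimal sketch evaluates the quantities in (10), (12), and (13) and checks that the candidate value function (11) leaves no residual in the reduced HJB equation (9). The function names and the parameter values used in the check are assumptions chosen for illustration only; they do not come from the text.

```python
import math

def merton_crra(mu, r, sigma, R, rho):
    """Return (kappa, pi_M, gamma_M) for the constant-coefficient CRRA problem.

    kappa is the market price of risk (10); the optimal rule is
    theta* = pi_M * w and c* = gamma_M * w as in (12)-(13).
    """
    if R <= 0 or R == 1:
        raise ValueError("R must be positive and different from 1")
    kappa = (mu - r) / sigma
    pi_M = (mu - r) / (sigma ** 2 * R)
    gamma_M = (rho + (R - 1.0) * (r + 0.5 * kappa ** 2 / R)) / R
    if gamma_M <= 0:
        raise ValueError("gamma_M <= 0: the problem is not well posed")
    return kappa, pi_M, gamma_M

def hjb_residual(w, mu, r, sigma, R, rho):
    """Residual of the reduced HJB equation (9) at wealth w for V from (11)."""
    kappa, _, gamma_M = merton_crra(mu, r, sigma, R, rho)
    A = gamma_M ** (-R)
    V = A * w ** (1 - R) / (1 - R)        # V(w) = gamma_M^{-R} U(w)
    V1 = A * w ** (-R)                    # V'(w)
    V2 = -R * A * w ** (-R - 1)           # V''(w)
    R_dual = 1.0 / R
    U_dual = -V1 ** (1 - R_dual) / (1 - R_dual)   # convex dual (8) at V'
    return -rho * V + r * w * V1 + U_dual - 0.5 * kappa ** 2 * V1 ** 2 / V2

# Illustrative (assumed) parameter values.
kappa, pi_M, gamma_M = merton_crra(mu=0.08, r=0.03, sigma=0.2, R=2.0, rho=0.05)
residual = hjb_residual(1.7, mu=0.08, r=0.03, sigma=0.2, R=2.0, rho=0.05)
```

With these assumed inputs the investor holds a constant multiple `pi_M` of wealth in the risky asset and consumes at the constant rate `gamma_M` per unit of wealth, which is the linear scaling noted above.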

Use of Martingale Representation 

This time, we shall suppose that the coefficients $\mu_t$, $r_t$,
and $\sigma_t$ in the wealth evolution (3) are general previsible
processes; to keep things simpler, we shall suppose that $\mu$, $r$,
and $\sigma^{-1}$ are all bounded previsible processes. The Markovian
nature of the problem which allowed us to find the HJB equation in
the first two cases is now destroyed, and a completely different
method is needed. The way in is to define a positive semimartingale
$\zeta$ by

$$d\zeta_t = \zeta_t(-r_t\,dt - \kappa_t\,dW_t), \qquad \zeta_0 = 1 \tag{16}$$

where $\kappa_t = (\mu_t - r_t)/\sigma_t$ is a previsible process,
bounded by hypothesis. This process, called the state-price density
process, or the pricing kernel, has the property that if $w$ evolves
as (3), then $M_t = \zeta_t w_t + \int_0^t \zeta_s c_s\,ds$ is a
positive local martingale.

Since positive local martingales are supermartingales, we deduce
from this that

$$M_0 = w_0 \ge E\left[\int_0^\infty \zeta_s c_s\,ds\right]. \tag{17}$$


Thus, for any $(c, \theta) \in \mathcal{A}(w_0)$, the budget
constraint (17) must hold. So the solution method here is to
maximize the objective (4) subject to the constraint (17). Absorbing
the constraint with a Lagrange multiplier $\lambda$, we find the
unconstrained optimization problem

$$\sup\, E\left[\int_0^\infty \{e^{-\rho s} U(c_s)
  - \lambda \zeta_s c_s\}\,ds\right] + \lambda w_0 \tag{18}$$

whose optimal solution is given by

$$e^{-\rho s} U'(c_s) = \lambda \zeta_s, \tag{19}$$

and this determines the optimal $c$, up to knowledge of the Lagrange
multiplier $\lambda$, whose value is fixed by matching the budget
constraint (17) with equality.

Of course, the missing logical piece of this argument is that if we
are given some $c \ge 0$ satisfying the budget constraint, is there
necessarily some $\theta$ such that the pair $(c, \theta)$ is
admissible for initial wealth $w_0$? In this setting, this can be
shown to follow from the Brownian integral representation theorem,
since we are in a complete market; however, in a multidimensional
setting, this can fail, and then the problem is effectively
insoluble.


Summary and Future Directions

This brief survey states some of the main ideas of
consumption-investment optimization, and sketches some of the
methods in common use. Explicit solutions are rare, and much of the
interest of the subject focuses on efficient numerical schemes,
particularly when the dimension of the problem is large. A further
area of interest is in continuous-time principal-agent problems;
Cvitanic and Zhang (2012) is a recent account of some of the methods
of this subject, but it has to be said that the theory of such
problems is much less complete than the simple single-agent
optimization problems discussed here.


Cross-References

► Stochastic Dynamic Programming

► Stochastic Linear-Quadratic Control

► Stochastic Maximum Principle


Iterative Learning Control

David H. Owens


Synonyms 

ILC 


Abstract 

Iterative learning control addresses tracking control where the
repetition of a task allows improved tracking accuracy from task to
task. The area inherits the analysis and design issues of classical
control but adds convergence conditions for task-to-task learning,
the need for acceptable task-to-task performance, and the
implications of modeling errors for task-to-task robustness.


Keywords 

Adaptation; Optimization; Repetition; Robustness


Introduction 

Iterative learning control (ILC) is relevant to trajectory tracking
control problems on a finite interval [0, T] (Ahn et al. 2007b; Bien
and Xu 1998; Chen and Wen 1999). It has close links to multipass
process theory (Edwards and Owens 1982) and repetitive control
(Rogers et al. 2007) plus conceptual links to adaptive control. It
focuses on problems where the repetition of a specified task creates
the possibility of improving tracking accuracy from task to task
and, in principle, reducing the tracking error to exactly zero. The
iterative nature of the control schemes proposed, the use of past
executions of the control to update/improve control action, and the
asymptotic learning of the required control signals put the topic in
the area of adaptive control, although other areas of study are
reflected in its methodologies.

Application areas include robotic assembly (Arimoto et al. 1984),
electromechanical test systems (Daley et al. 2007), and medical
rehabilitation robotics (Rogers et al. 2010). For example, consider
a manufacturing robot required to undertake an indefinite number of
identical tasks (such as “pick and place” of components) specified
by a spatial trajectory on a defined time interval. The problem is
two-dimensional. More precisely, the controlled system evolves with
two variables, namely, time $t \in [0, T]$ (elapsed in each
iteration) and iteration index $k \ge 0$. Data consists of signals
$f_k(t)$ denoting the value of the signal $f$ at time $t$ on
iteration $k$. The conceptual algorithm used is:

Step one: (Preconditioning) Implement loop controllers to condition
plant dynamics.

Step two: (Initialization) Given a demand signal $r(t)$,
$t \in [0, T]$, choose an initial input $u_0(t)$, $t \in [0, T]$ and
set $k = 0$.

Step three: (Response measurement) Return the plant to a defined
initial state. Find the output response $y_k$ to the input $u_k$.
Construct the tracking error $e_k = r - y_k$. Store data.

Step four: (Input signal update) Use past records of inputs used and
tracking errors generated to construct a new input $u_{k+1}(t)$,
$t \in [0, T]$ to be used to improve tracking accuracy on the next
trial.

Step five: (Termination/task repetition) Either terminate the
sequence or increase $k$ by unity and return to step three.

It is the updating of the input signal based on observation that
provides the conceptual link to adaptive control. ILC causality
defines “past data” at time $t$ on iteration $k$ as data on the
interval $[0, t]$ on that iteration plus all data on $[0, T]$ on all
previous iterations. Feedback plus feedforward control normally
contains feedforward transfer of information from past iterations to
the current iteration.
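
The trial-to-trial structure of steps two to five can be sketched as a short loop. This is only an illustrative skeleton: the callables `plant` and `update`, the stopping tolerance, and the toy static plant used at the end are all assumptions, not part of the algorithm statement above.

```python
import numpy as np

def run_ilc(plant, update, r, u0, max_trials=50, tol=1e-8):
    """Trial-to-trial loop following steps two to five.

    plant(u) returns the output of one complete trial started from the
    same initial state; update(inputs, errors) builds the next input from
    the stored records. Both callables are hypothetical placeholders.
    """
    inputs = [np.asarray(u0, dtype=float)]      # step two: u_0, k = 0
    errors = []
    for _ in range(max_trials):
        y = plant(inputs[-1])                   # step three: measure response
        errors.append(r - y)                    # e_k = r - y_k, stored
        if np.linalg.norm(errors[-1]) < tol:    # step five: terminate
            break
        inputs.append(update(inputs, errors))   # step four: new input
    return inputs, errors

# Toy usage (assumed): static plant y = 2u with update u_{k+1} = u_k + 0.4 e_k,
# for which e_{k+1} = 0.2 e_k.
r = np.sin(np.linspace(0.0, np.pi, 11))
inputs, errors = run_ilc(lambda u: 2.0 * u,
                         lambda us, es: us[-1] + 0.4 * es[-1],
                         r, np.zeros(11))
error_norms = [float(np.linalg.norm(e)) for e in errors]
```

Passing the full lists of stored inputs and errors to `update` mirrors the causality statement: the new input may depend on any data from previous iterations.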

Modeling Issues 

Design approaches have been model-based. Most nonlinear problems
assume nonlinear state space models relating the $\ell \times 1$
input vector $u(t)$ to the $m \times 1$ output vector $y(t)$ via an
$n \times 1$ state vector $x(t)$ as follows:

$$\dot x(t) = f(x(t), u(t)), \qquad y(t) = h(x(t), u(t)),$$

where $t \in [0, T]$, $x(0) = x_0$ and $f$ and $h$ are vector-valued
functions. The discrete time (sample data) version replaces
derivatives by a forward shift, where $t$ is now a sample counter,
$0 \le t \le N$ (the index of the last sample). The continuous time
linear model is

$$\dot x(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t) + Du(t)$$

with an analogous model for discrete systems. In both cases, the
matrices $A, B, C, D$ are constant or time varying of appropriate
dimension.

Nonlinear systems present the greatest technical challenge. For
linear systems, the challenges are greater for the time-varying,
continuous time case. The simplest linear case of discrete time,
time-invariant systems can be described by a matrix relationship

$$y = Gu + d \tag{1}$$

where $y$ denotes the $m(N + 1) \times 1$ “supervector” generated by
the time series $y(0), y(1), \ldots, y(N)$ and the construction
$y = [y^T(0), y^T(1), \ldots, y^T(N)]^T$, the supervector $u$ is
generated, similarly, by the time series $u(0), u(1), \ldots, u(N)$,
and $d$ is generated by the time series
$Cx_0, CAx_0, \ldots, CA^N x_0$. The matrix $G$ has the lower block
triangular structure

$$G = \begin{bmatrix}
D & 0 & 0 & \cdots & 0 \\
CB & D & 0 & \cdots & 0 \\
CAB & CB & D & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
CA^{N-1}B & CA^{N-2}B & \cdots & \cdots & D
\end{bmatrix}$$

defined in terms of the Markov parameter matrices
$D, CB, CAB, \ldots$ of the plant. This structure has led to a focus
on the discrete time, time-invariant case, and exploitation of
matrix algebra techniques.
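
The supervector description can be checked numerically. The sketch below assembles $G$ and $d$ from the Markov parameters of a discrete time model $x(t+1) = Ax(t) + Bu(t)$, $y(t) = Cx(t) + Du(t)$ and compares $Gu + d$ with a direct time-domain simulation. The function names and the example matrices are assumptions made purely for illustration.

```python
import numpy as np

def lifted_model(A, B, C, D, x0, N):
    """Return (G, d) such that y = G u + d for samples t = 0..N."""
    m, l = D.shape
    # Markov parameters D, CB, CAB, ..., CA^{N-1}B
    markov, CAk = [D], C.copy()
    for _ in range(N):
        markov.append(CAk @ B)
        CAk = CAk @ A
    G = np.zeros((m * (N + 1), l * (N + 1)))
    for i in range(N + 1):
        for j in range(i + 1):                      # lower block triangle
            G[i * m:(i + 1) * m, j * l:(j + 1) * l] = markov[i - j]
    # Free response C x0, C A x0, ..., C A^N x0
    d, x = [], x0.copy()
    for _ in range(N + 1):
        d.append(C @ x)
        x = A @ x
    return G, np.concatenate(d)

def simulate(A, B, C, D, x0, u):
    """Time-domain simulation; u has shape (N+1, l). Returns the supervector y."""
    x, ys = x0.copy(), []
    for ut in u:
        ys.append(C @ x + D @ ut)
        x = A @ x + B @ ut
    return np.concatenate(ys)

# Assumed example plant (two states, one input, one output, D = 0).
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
x0 = np.array([1.0, -1.0])
N = 5
G, d = lifted_model(A, B, C, D, x0, N)
u = np.linspace(-1.0, 1.0, N + 1).reshape(N + 1, 1)
y_lifted = G @ u.reshape(-1) + d
y_sim = simulate(A, B, C, D, x0, u)
```

In this assumed example $D = 0$ and $CB = 0$, so the first nonzero Markov parameter is $CAB$, which appears two block rows below the diagonal of `G`.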

More generally, $G : \mathcal{U} \to \mathcal{Y}$ can be a bounded
linear operator between suitable signal spaces $\mathcal{U}$ and
$\mathcal{Y}$. Taking $G$ as a convolution operator, the
representation (1) also applies to time-varying continuous time and
discrete time systems. The representation also applies to
differential-delay systems, coupled algebraic and differential
systems, multi-rate systems, and other situations of interest.


Formal Design Objectives 

Problem Statement: Given a reference signal $r \in \mathcal{Y}$ and
an initial input signal $u_0 \in \mathcal{U}$, construct a causal
control update rule/algorithm

$$u_{k+1} = \psi(e_{k+1}, e_k, \ldots, e_0, u_k, u_{k-1}, \ldots, u_0)$$

that ensures that $\lim_{k \to \infty} e_k = 0$ (convergence) in the
norm topology of $\mathcal{Y}$.

The update rule represents the simple idea of expressing $u_{k+1}$
in terms of past data. A general linear “high-order” rule is

$$u_{k+1} = \sum_{j=0}^{k} W_j u_{k-j} + \sum_{j=0}^{k+1} K_j e_{k+1-j} \tag{2}$$

with bounded linear operators $W_j : \mathcal{U} \to \mathcal{U}$
and $K_j : \mathcal{Y} \to \mathcal{U}$, regarded as compensation
elements and/or filters to condition the signals. Typically
$K_j = 0$ (resp. $W_j = 0$) for $j > M_e$ (resp. $j > M_u$). A
simple structure is

$$u_{k+1} = W_0 u_k + K_0 e_{k+1} + K_1 e_k \tag{3}$$

Assuming that $G$ and $W_0$ commute (i.e., $GW_0 = W_0 G$), the
resultant error evolution takes the form

$$e_{k+1} = (I + GK_0)^{-1}(W_0 - GK_1)e_k
  + (I + GK_0)^{-1}(I - W_0)(r - d)$$

ROBUST ILC: An ILC algorithm is said to be robust if convergence is
retained in the presence of a defined class of modeling errors.

Results from multipass systems theory (Edwards and Owens 1982)
indicate robust convergence of the sequence $\{e_k\}_{k \ge 0}$ to a
limit $e_\infty \in \mathcal{Y}$ (in the presence of small modeling
errors) if the spectral radius condition

$$r[(I + GK_0)^{-1}(W_0 - GK_1)] < 1 \tag{4}$$

is satisfied where $r[\cdot]$ denotes the spectral radius of its
argument. However, the desired condition $e_\infty = 0$ is true only
if $W_0 = I$. For a given reference $r$, it may be possible to
retain the benefits of choosing $W_0 \ne I$ and still ensure that
$e_\infty$ is sufficiently small for the application in mind, e.g.,
by confining the limit error to a high-frequency band. This and
other spectral radius conditions form the underlying convergence
condition when choosing controller elements but are rarely computed.
The simplest algorithm using eigenvalue computation for a linear
discrete time system defines the relative degree to be $k^* = 0$ if
$D \ne 0$ and the smallest integer $k$ such that $CA^{k-1}B \ne 0$
otherwise. Replacing $\mathcal{Y}$ by the range of $G$; choosing
$W_0 = I$, $K_0 = 0$, and $K_1 = I$; and supposing that $k^* \ge 1$,
the Arimoto input update rule
$u_{k+1}(t) = u_k(t) + e_k(t + k^*)$, $0 \le t \le N + 1 - k^*$
provides robust convergence if, and only if,
$r[I - CA^{k^*-1}B] < 1$. It does not imply that the error signal
necessarily improves each iteration. Errors can reach very high
values before finally converging to zero. However, if (4) is
replaced by the operator norm condition

$$\|(I + GK_0)^{-1}(W_0 - GK_1)\| < 1, \tag{5}$$

then $\{\|e_k - e_\infty\|_Q\}_{k \ge 0}$ monotonically decreases to
zero.

The spectral radius condition throws light on the nature of ILC
robustness. Choosing, for simplicity, $K_0 = 0$ and $W_0 = I$, the
requirement that $r[I - GK_1] < 1$ will be satisfied by a wide range
of processes $G$, namely those for which the eigenvalues of
$I - GK_1$ lie in the open unit circle of the complex plane.
Translating this requirement into useful robustness tests may not be
easy in general. The discussion does however show that the behavior
of $GK_1$ must be “sign-definite” to some extent as, if
$r[I - GK_1] < 1$, then $r[I - (-G)K_1] > 1$, i.e., replacing the
plant by $-G$ (no matter how small) will inevitably produce
non-convergent behavior. A more detailed characterization of this
property is possible for inverse model ILC.
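
The difference between the spectral radius condition (4) and the norm condition (5) can be seen in a small numerical experiment. The sketch below uses an assumed lifted single-input, single-output plant (a lower triangular Toeplitz matrix built from a made-up impulse response) with $W_0 = I$, $K_0 = 0$, and a hypothetical gain $K_1 = 0.5 I$; none of these values come from the text.

```python
import numpy as np

# Assumed impulse response; h[0] plays the role of D (nonzero here).
h = np.array([1.0, 3.0, 3.0, 3.0, 3.0, 3.0])
n = len(h)
G = np.array([[h[i - j] if i >= j else 0.0 for j in range(n)]
              for i in range(n)])

K1 = 0.5 * np.eye(n)        # update u_{k+1} = u_k + K1 e_k
L = np.eye(n) - G @ K1      # error evolution e_{k+1} = L e_k

spectral_radius = float(max(abs(np.linalg.eigvals(L))))
operator_norm = float(np.linalg.norm(L, 2))

# Iterate the error recursion from an assumed initial error.
e = np.ones(n)
norms = [float(np.linalg.norm(e))]
for _ in range(200):
    e = L @ e
    norms.append(float(np.linalg.norm(e)))
```

Here `L` is lower triangular with 0.5 on its diagonal, so the spectral radius is below one and the error converges, yet the operator norm exceeds one and the error norm grows on the first iterations before it decays.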


Inverse Model-Based Iteration 

If a linear system $G$ has a well-defined inverse model $G^{-1}$,
then the required input signal is $u_\infty = G^{-1}(r - d)$. The
simple update rule

$$u_{k+1} = u_k + \beta G^{-1} e_k, \tag{6}$$

where $\beta$ is a learning gain, produces the dynamics

$$e_{k+1} = (1 - \beta)e_k \quad \text{or} \quad
  e_{k+1} = (1 - \beta)^{k+1} e_0,$$

proving that zero error is attainable with added flexibility in
convergence rate control by choosing $\beta \in (0, 2)$. Errors in
the system model used in (6) are an issue. Insight into this problem
has been obtained for single-input, single-output discrete time
systems with multiplicative plant uncertainty $U$ as retention of
monotonic convergence is ensured (Owens and Chu 2012) by a frequency
domain condition

$$\left|\frac{1}{\beta} - U(e^{i\theta})\right| < \frac{1}{\beta}
  \quad \text{for all } \theta \in [0, 2\pi] \tag{7}$$

that illustrates a number of general empirical rules for ILC robust
design. The first is that a small learning gain (and hence small
input update changes and slow convergence) will tend to increase
robustness and, hence, that it is necessary that multiplicative
uncertainties satisfy some form of strict positive real condition
which, for (6), is

$$\mathrm{Re}[U(e^{i\theta})] > 0 \quad
  \text{for all } \theta \in [0, 2\pi], \tag{8}$$

a condition that limits high-frequency roll-off error and constrains
phase errors to the range $(-\pi/2, \pi/2)$. The second observation
is that if $G$ is non-minimum phase, the inverse $G^{-1}$ is
unstable, a situation that cannot be tolerated in practice.
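
Condition (7) is straightforward to test on a frequency grid. The sketch below is an illustration only: the helper names, the grid size, and the example uncertainty $U(z) = 1 + 0.5z^{-1}$ are assumptions. It also uses the elementary fact that, for $\beta > 0$, (7) is equivalent to $\beta < 2\,\mathrm{Re}\,U / |U|^2$ at each frequency, which makes the role of the positive real condition (8) explicit.

```python
import cmath
import math

def satisfies_condition_7(U, beta, n_grid=2000):
    """Check |1/beta - U(e^{i theta})| < 1/beta on a uniform grid over [0, 2 pi].

    U is a callable z -> complex describing the multiplicative uncertainty.
    """
    for i in range(n_grid + 1):
        z = cmath.exp(1j * 2.0 * math.pi * i / n_grid)
        if abs(1.0 / beta - U(z)) >= 1.0 / beta:
            return False
    return True

def largest_gain(U, n_grid=2000):
    """Supremum of the gains beta allowed by (7) on the grid.

    Returns 0.0 if Re U <= 0 at some grid frequency, i.e. if (8) fails.
    """
    best = math.inf
    for i in range(n_grid + 1):
        z = cmath.exp(1j * 2.0 * math.pi * i / n_grid)
        Uz = U(z)
        if Uz.real <= 0.0:
            return 0.0
        best = min(best, 2.0 * Uz.real / abs(Uz) ** 2)
    return best

# Assumed example uncertainty: U(z) = 1 + 0.5 / z.
U_example = lambda z: 1.0 + 0.5 / z
ok_small_gain = satisfies_condition_7(U_example, beta=1.0)
ok_large_gain = satisfies_condition_7(U_example, beta=1.5)
beta_max = largest_gain(U_example)
```

For this assumed uncertainty the bound is tightest at $\theta = 0$, so a gain of 1.0 passes the test while 1.5 does not, in line with the rule that smaller learning gains tend to increase robustness.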

Optimization-Based Iteration 

Design criteria can be strengthened by a monotonicity requirement.
If error magnitude is measured by a norm $\|e\|_Q$ on
$\mathcal{Y}$, such as the weighted mean square error (with $Q$
symmetric, positive definite)

$$\|e\|_Q^2 = \int_0^T e^T(t) Q e(t)\,dt,$$

then the condition $\|e_{k+1}\|_Q < \|e_k\|_Q$ for all $k \ge 0$
provides a performance improvement from iteration to iteration. This
idea leads to a number of design approaches; see Owens and Daley
(2008) and Ahn et al. (2007b) (which also examines aspects of
robustness).

Function/Time Series Optimization

Norm optimal ILC (NOILC) (Owens and Daley 2008) guarantees
monotonicity and convergence to $e_\infty = 0$ by computing
$u_{k+1}$ to minimize an objective function

$$J(u) = \|e\|_Q^2 + \|u - u_k\|_R^2,$$

subject to plant dynamics. For linear models (1),

$$u_{k+1} = u_k + G^* e_{k+1}$$

where $G^* : \mathcal{Y} \to \mathcal{U}$ is the adjoint operator of
$G$. For continuous or discrete time linear state space models, the
problem is a classical optimal tracking problem with a solution with
online state feedback and a feedforward term generated off-line by
simulation of an “adjoint” model. Reducing $R$ in $J$ leads to
faster convergence rates, but the presence of non-minimum-phase
zeros has a negative effect on convergence (Owens and Chu 2010).
Monotonicity and convergence to zero is retained, but, after an
initial fall, the error norm then reduces infinitesimally each
iteration producing the practical effect of limited error reductions
over finite iteration horizons. Rules exist (Owens and Chu 2010) to
minimize the effect by choice of $u_0$ and $r$.
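
For the lifted matrix model (1) the NOILC update can be written in closed form, which gives a compact way to observe the monotonic error norm. The sketch below is a minimal illustration, assuming finite-dimensional weights so that the adjoint is $G^* = R^{-1}G^TQ$; the plant, weights, and reference are made-up example data and the function name is hypothetical.

```python
import numpy as np

def noilc_step(G, Q, R, u_k, e_k):
    """One NOILC update for y = G u + d.

    Minimises ||e||_Q^2 + ||u - u_k||_R^2 over u. The minimiser satisfies
    u_{k+1} = u_k + R^{-1} G^T Q e_{k+1}; here it is solved explicitly.
    """
    du = np.linalg.solve(G.T @ Q @ G + R, G.T @ Q @ e_k)
    return u_k + du, e_k - G @ du

# Assumed example data: lifted SISO plant with impulse response 0.6^t.
n = 8
h = 0.6 ** np.arange(n)
G = np.array([[h[i - j] if i >= j else 0.0 for j in range(n)]
              for i in range(n)])
Q = np.eye(n)
R = 0.5 * np.eye(n)
r = np.ones(n)
d = np.zeros(n)

u = np.zeros(n)
e = r - d - G @ u
q_norms = [float(np.sqrt(e @ Q @ e))]
for _ in range(15):
    u_prev = u
    u, e = noilc_step(G, Q, R, u, e)
    q_norms.append(float(np.sqrt(e @ Q @ e)))
```

Scaling `R` down in this sketch speeds up the decay of `q_norms`, consistent with the remark on reducing $R$ in $J$; the example plant is minimum phase, so the slow-down associated with non-minimum-phase zeros is not reproduced here.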

Related Linear NOILC Problems 

If $\mathcal{Y}$ and $\mathcal{U}$ are real Hilbert spaces,
geometrical arguments can be used to generate algorithms extending
the NOILC algorithm to include (Owens and Daley 2008) acceleration
mechanisms, predictive control, and the inclusion of input signal
constraints. They also allow more flexibility in the form and
specification of the task. In the intermediate point NOILC problem
(denoted IPNOILC), the task requirement is that the output signal
$y(t)$, $0 \le t \le T$ takes specified values
$r(t_1), r(t_2), \ldots, r(t_M)$ as it passes through the $M$
intermediate points $0 < t_1 < t_2 < \cdots < t_M$. The precise
nature of the trajectory between points is of secondary importance.
Again, the solution, for linear state space systems, can be
constructed from Riccati equation-based feedback rules combined with
“jump” conditions and feedforward control signals computed off-line.


The IPNOILC solution is nonunique, and the remaining degrees of
freedom can be used to satisfy other design objectives. Switching
algorithms (Owens et al. 2013) converge to a solution of the problem
while simultaneously minimizing an auxiliary criterion

$$J_{\mathrm{aux}}(u) = \|z - z_0\|^2 + \|u - u_0\|^2.$$

Auxiliary optimization is a tool for shaping the 
solution of the IPNOILC problem. The auxiliary 
variable z could be internal states whose behavior 
is important to plant operation or simply defined 
by the output, e.g., z = y which, if small, might 
reduce input “forces” and hence actuator activity. 

Parameter Optimization 

NOILC can be simplified by reducing the degrees of freedom defining
control action to a small number of control law parameters. For a
discrete system (1), a general update rule is

$$u_{k+1} = u_k + \Gamma(\beta_{k+1}) e_k, \qquad k \ge 0.$$

Here the matrix $\Gamma(\beta)$ is linear in the $p \times 1$
parameter vector $\beta$ with $\Gamma(0) = 0$. Under these
conditions $\Gamma(\beta)e = F(e)\beta$ where the matrix $F(e)$ is
linear in $e$ with $F(0) = 0$. Examples of useful parameterizations
include inverse model control (Owens et al. 2012).

Monotonicity of the error norm is ensured by choosing the parameter
vector $\beta_{k+1}$ to minimize

$$J(\beta) = \|e\|_Q^2 + \beta^T W_{k+1} \beta$$

subject to the dynamic constraint (1). Each $p \times p$ weighting
matrix $W_{k+1}$ is symmetric, positive definite, and may be
iteration dependent. The algorithm creates a nonlinear ILC law
providing a link between parameter evolution, past errors, and the
choice of weight $W_{k+1}$.
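
A parameter-optimal update of this kind reduces each iteration to a small least-squares problem in $\beta$. The sketch below assumes a hypothetical two-parameter law $\Gamma(\beta) = \beta_1 I + \beta_2 G^T$, so that $F(e) = [e,\; G^T e]$; this parameterization, the constant weight $W$, and the example plant are illustrative assumptions and are not taken from the references.

```python
import numpy as np

def poilc_step(G, Q, W, F, u_k, e_k):
    """One parameter-optimal update u_{k+1} = u_k + F(e_k) beta_{k+1}.

    beta_{k+1} minimises ||e_k - G F(e_k) beta||_Q^2 + beta^T W beta, so that
    J(beta_{k+1}) <= J(0) = ||e_k||_Q^2 and the error norm cannot increase.
    """
    Fe = F(e_k)                      # matrix with Gamma(beta) e_k = Fe @ beta
    GF = G @ Fe
    beta = np.linalg.solve(GF.T @ Q @ GF + W, GF.T @ Q @ e_k)
    return u_k + Fe @ beta, e_k - GF @ beta, beta

# Assumed example data.
n = 8
h = 0.6 ** np.arange(n)
G = np.array([[h[i - j] if i >= j else 0.0 for j in range(n)]
              for i in range(n)])
Q = np.eye(n)
W = 0.1 * np.eye(2)
F = lambda e: np.column_stack([e, G.T @ e])   # hypothetical parameterization
r = np.ones(n)
d = np.zeros(n)

u = np.zeros(n)
e = r - d - G @ u
q_norms = [float(np.sqrt(e @ Q @ e))]
betas = []
for _ in range(20):
    u, e, beta = poilc_step(G, Q, W, F, u, e)
    betas.append(beta)
    q_norms.append(float(np.sqrt(e @ Q @ e)))
```

Because `F(e_k)` depends on the current error, the resulting parameters in `betas` vary from iteration to iteration, which is the sense in which the law is nonlinear even though the plant is linear.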

Summary and Future Directions 

The basic structure of ILC is now well understood with a number of
algorithms available with known convergence properties and empirical
links between parameter choice and convergence rates (Ahn et al.
2007a; Bristow et al. 2006; Owens and Daley 2008; Wang et al. 2009;
Xu 2011). Optimization-based algorithms provide a structured
approach to convergence and have a familiar quadratic optimal
control structure. Despite the practical benefits of monotonic error
norms, this approach underlines the difficulties induced by
non-minimum-phase (NMP) properties of the plant. Operator
representations extend this theory to include more general problems
such as the intermediate point tracking problem and, where solutions
are non-unique, can be converted into iterative algorithms that
inherit the properties of NOILC but converge to a solution that also
minimizes an auxiliary optimization criterion.

Many of the challenges addressed by NOILC are inherited by other
algorithms, many of which mimic established control design
paradigms. For example, the commonly used PD update law

$$u_{k+1}(t) = u_k(t) + K_1 e_k(t) + K_2 \dot e_k(t)$$

can produce convergence by suitable choice of $K_1$ and $K_2$.
Proofs of convergence are typically based on spectral radius
conditions similar to (4) for linear systems or on techniques such
as contraction mapping (fixed point) theorems (Xu 2011) for
nonlinear systems. The nonlinear case generally suggests local
convergence conditions dependent on growth conditions on the
nonlinearity. They typically cannot be checked in practice but do
link convergence to simple, empirical, gain selection rules.

ILC, as a topic, is a very large area of study. Survey papers
indicate that progress has been made in a number of other areas
including adaptive ILC, the use of intelligent control ideas of
fuzzy logic and neural networks-based control structures, 2D systems
theory, and mathematical studies of fractional order control laws
(Chen et al. 2013). The further development of ILC from its current
strong base will draw extensively from classical control knowledge
but relies on the three aspects of plant modeling, control design,
and coping with uncertainty. Issues central to medium-term success
include:


1. Extending current ILC knowledge to other classes of model needed
for applications.

2. Integration of online data-based modeling into ILC schemes to
enhance adaptive control options.

3. Ensuring the property of error monotonicity or characterizing any
non-monotonicity to be expected.

4. The construction of robustness tests and using the ideas in new
robust design methodologies.

5. Providing a better understanding of the effect of noise and
disturbances on algorithm performance.

6. Extending the range of tasks to include, for example, different
challenges for different outputs on different subintervals of
[0, T].

7. Creating design tools for nonlinear plant that ensure convergence
and a degree of robustness but, in particular, provide some control
of internal plant states that may be subject to dynamical
constraints.


Abstract

The Kalman filter is a very useful algorithm for linear Gaussian
estimation problems. It is extremely popular and robust in practical
applications. The algorithm is easy to code and test. There are many
reasons for the popularity of the Kalman filter in the real world,
including stability, generality, and simplicity. Moreover, the
real-time computational complexity is very reasonable for
high-dimensional problems. In particular, the computational
complexity scales as the cube of the dimension of the state vector.

Keywords 

Controllability; Discrete time measurements; 
Estimation algorithm; Extended Kalman filter; 
Filtering; Gaussian errors; Linear dynamical 
system; Linear system; Observability; Recursive; 
Smoothing; Stability 

Description of Kalman Filter 

The Kalman filter is an algorithm that computes the best estimate of
the state vector of a linear dynamical system given discrete time
measurements of a linear function of the state vector corrupted by
additive white Gaussian noise. The Kalman filter also quantifies the
uncertainty in its estimate of the state vector, using the
covariance matrix of estimation errors. The detailed equations of
the Kalman filter algorithm and the problem that it solves are given
in Gelb et al. (1974), which is the most accessible but thorough
book on Kalman filters. The linear dynamical system can be time
varying, but its parameters must be known exactly. The measurements
can be made at arbitrary (nonuniform) discrete times, but these
times must be known exactly. Likewise, the covariance matrices of
the measurement errors and the process noise can be arbitrary and
time varying, but the numerical values of these covariance matrices
must be known exactly. Also, the initial uncertainty in the state
vector must be Gaussian and the mean and covariance matrix can be
arbitrary, but these must be known exactly. There is a very powerful
theory of Kalman filter stability due to Kalman (1963), which
guarantees that the Kalman filter is stable under very mild
technical assumptions which can always be satisfied in practice. In
particular, the Kalman filter is stable for estimating the state
vector of linear dynamical systems that are stable or unstable, for
arbitrarily slow measurement rates, provided that the mild technical
assumptions are fulfilled. These assumptions require that the
dimension of the state vector is minimal and that the measurement
error covariance matrix and the process noise covariance matrices
are positive definite, although weaker conditions are also
sufficient for stability in some cases; see Kailath et al. (2000)
for such details on the stability of the Kalman filter. Kalman
filter stability is connected with observability and controllability
of the input-output model of the relevant dynamical system in Kalman
(1963). The corresponding algorithm for continuous time linear
measurements (with Gaussian additive white noise) and continuous
time linear dynamical systems (with Gaussian additive white process
noise) is called the Kalman-Bucy filter; see Kalman (1961).
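
For orientation, one predict/update cycle of the discrete time filter can be sketched as follows. This uses the standard textbook recursion rather than equations quoted from this article (which defers to Gelb et al. (1974) for them); the names `F`, `H`, `Qn`, `Rn` and the constant-velocity tracking example are assumptions for illustration. The measurement update is written in Joseph form, one common way of keeping the covariance symmetric.

```python
import numpy as np

def kf_predict(x, P, F, Qn):
    """Time update: propagate the estimate and its error covariance."""
    return F @ x, F @ P @ F.T + Qn

def kf_update(x, P, z, H, Rn):
    """Measurement update (Joseph form) for z = H x + noise with covariance Rn."""
    S = H @ P @ H.T + Rn                    # innovation covariance
    K = np.linalg.solve(S, H @ P).T         # gain K = P H^T S^{-1}
    x_new = x + K @ (z - H @ x)
    A = np.eye(len(x)) - K @ H
    P_new = A @ P @ A.T + K @ Rn @ K.T
    return x_new, P_new

# Assumed example: position/velocity state, noisy position measurements.
rng = np.random.default_rng(0)
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Qn = 0.01 * np.eye(2)
Rn = np.array([[1.0]])

x_true = np.array([0.0, 1.0])
x_est, P = np.zeros(2), 10.0 * np.eye(2)
initial_trace = float(np.trace(P))
for _ in range(50):
    x_true = F @ x_true + rng.multivariate_normal(np.zeros(2), Qn)
    z = H @ x_true + rng.normal(0.0, 1.0, size=1)
    x_est, P = kf_predict(x_est, P, F, Qn)
    x_est, P = kf_update(x_est, P, z, H, Rn)
```

Note that `P` evolves independently of the measurement values: it is computed from the assumed model and noise covariances alone. The matrix products in the two functions are the operations responsible for the cubic growth of cost with state dimension.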

Design Issues 

In engineering practice, almost all real-world applications are
nonlinear or non-Gaussian, and therefore, they do not fit the Kalman
filter theory. Nevertheless, by approximating the nonlinear dynamics
and measurements with linear equations, one can apply the Kalman
filter theory; this is called the “extended Kalman filter” (EKF);
see Gelb et al. (1974). The linearization of the nonlinear dynamics
and measurements is made by computing the first-order Taylor series
expansion and evaluating it at the estimated state vector; this is a
very simple and fast approximation that is widely used in real-world
applications, and it often gives good estimation accuracy, although
there is no guarantee of that. Moreover, there is no guarantee that
the EKF will be stable, even if the linearized system satisfies all
the theoretical requirements for stability of the Kalman filter.

Even if the dynamics and measurements are exactly linear and if the
measurement noise and process noise and initial uncertainty are all
exactly Gaussian with exactly known means and covariance matrices,
there can still be significant practical problems with Kalman filter
accuracy, owing to ill-conditioning. In particular, the Kalman
filter can be extremely sensitive to quantization errors in the
computer arithmetic and storage. On the other hand, there are many
different methods to try to mitigate ill-conditioning including
(1) double or quadruple or octuple precision arithmetic, (2) making
the covariance matrices symmetric before and after every operation,
(3) Tychonov regularization, (4) tuning the process noise covariance
matrix, (5) coding the Kalman filter in principal coordinates or
approximately principal coordinates (i.e., aligned with the
eigenvectors of the state vector error covariance matrix),
(6) sequential scalar measurement updates in a preferred order, and
(7) various factorizations of the covariance matrices (e.g., square
root, information matrix, information square root, upper triangular
and lower triangular factorization, UDL, etc.). The classic book on
error covariance matrix factorizations is by Bierman (2006).
Unfortunately, there is no guarantee that the Kalman filter will
work well even if all of these mitigation methods are used.
Moreover, there is no useful theoretical analysis of this
phenomenon, with the exception of a few not very tight upper bounds
on the condition number. Plotting the numerical values of the
condition number of the covariance matrix vs. time is often a
helpful diagnostic. In certain real-world applications, the
condition number of the Kalman filter error covariance matrix can be
ten billion or larger.
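
Two of the items above, symmetrizing the covariance matrix and monitoring its condition number, take only a few lines. The helper names and the example matrices below are assumptions used for illustration; the condition number is computed here as the ratio of extreme eigenvalues of the symmetrized matrix.

```python
import numpy as np

def symmetrize(P):
    """Mitigation (2): force exact symmetry of a covariance matrix."""
    return 0.5 * (P + P.T)

def covariance_condition_number(P):
    """Ratio of largest to smallest eigenvalue of the symmetrized covariance.

    Returns infinity if the smallest eigenvalue is not positive.
    """
    w = np.linalg.eigvalsh(symmetrize(P))   # eigenvalues in ascending order
    return float(w[-1] / w[0]) if w[0] > 0 else float("inf")

# Assumed example: position known poorly, velocity known very well.
P_example = np.diag([1.0e6, 1.0e-4])
cond_example = covariance_condition_number(P_example)

# A slightly asymmetric matrix, as can arise from rounding errors.
P_rounded = np.array([[2.0, 0.1], [0.3, 1.0]])
P_fixed = symmetrize(P_rounded)

# Diagnostic history over a (hypothetical) sequence of covariances.
history = [covariance_condition_number(P_t) for P_t in (P_example, P_fixed)]
```

In a running filter, `covariance_condition_number` would be called on the covariance after each update and the resulting `history` plotted against time; the first example matrix gives a value of the order of ten billion.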

Why Is the Kalman Filter So Useful 
and Popular? 

The Kalman filter has been enormously success¬ 
ful in real-world applications, and it is inter¬ 
esting to reflect on why it has been so useful 
and so popular. In particular, Kalman himself 
believes that his filter was successful because it 
was based on probability rather than statistics; 
see Kalman (1978). For example, the error co- 
variance matrix for the Kalman filter is computed 
from the assumed dynamics and measurement 
model and the assumed values of the initial state 
uncertainty and process noise and measurement 
noise covariance matrices, rather than by com¬ 
puting sample covariance matrices. Likewise, the 
Kalman filter computes the estimated state vector 
from assumed Gaussian and linear probability 
models, rather than computing the sample mean, 
as would be done in statistics. There is substantial 
wisdom in Kalman’s assertion, owing to the dif¬ 
ficulty of estimating sample covariance matrices 
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and sample vectors that are sufficiently accu¬ 
rate, given a limited number of samples and ill- 
conditioning and high-dimensional state vectors. 
A second reason that Kalman filters are so popu¬ 
lar is that the real-time computational complexity 
is very reasonable for modern digital comput¬ 
ers, even for problems with a high-dimensional 
state vector. In particular, the computational com¬ 
plexity of the Kalman filter scales as the cube 
of the dimension of the state vector; for example, the modern GPS
system uses a Kalman filter with a state vector of dimension of about
1,000 to jointly estimate the orbits of the satellites in the GPS
constellation. In 1960, when Kalman’s paper was first published,
digital computers were starting to become fast enough at reasonable
cost to multiply large matrices in real time, which is the most
challenging computation in a Kalman filter. Today computers are
roughly ten orders of magnitude faster per unit cost than in 1960, and
hence, we can run Kalman filters for high-dimensional problems on very
inexpensive computers that fit into your wristwatch. A third reason
that Kalman filters are popular is that the algorithms are easy to
understand and code and test. A fourth reason is the guaranteed
stability of the Kalman filter under very mild conditions which can
always be satisfied in practical applications. A fifth reason is that
the Kalman filter is optimal for time-varying unstable linear dynamics
with time-varying measurement noise covariance and process noise
covariance. A sixth reason is that one can use Kalman filters for
nonlinear problems by approximating the nonlinear dynamics and
measurement equations with a first-order Taylor series; this is called
the extended Kalman filter (EKF), which is without a doubt the most
widely used algorithm in real-world estimation applications. A seventh
reason is that the Kalman filter automatically provides a convenient
quantification of uncertainty of the estimated state vector using the
error covariance matrix. The final reason is that it is easy to test
the accuracy of the Kalman filter by comparing the theoretical error
covariance matrix to errors computed by Monte Carlo simulations of the
filter; the two errors should agree approximately, and statistically
significant discrepancies suggest bugs in the code or ill-conditioning
of the error covariance matrices or nonlinearities or errors in
modeling the dynamics or measurements.
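
The Monte Carlo consistency test described above can be sketched as follows. This is a minimal illustration for a scalar linear model, not a production filter: the model parameters `a`, `q`, `r`, `p0` and the function names are assumptions chosen for the example. The sketch compares the filter's theoretical error variance with the mean squared error observed over many simulated runs.

```python
import random


def kalman_run(a, q, r, p0, steps, rng):
    """Simulate one scalar truth trajectory and filter it.

    Model (assumed for illustration): x[k+1] = a*x[k] + w, z[k] = x[k] + v,
    with Var(w) = q, Var(v) = r, and prior x[0] ~ N(0, p0).
    Returns (final estimation error, final theoretical variance).
    """
    x = rng.gauss(0.0, p0 ** 0.5)   # true initial state drawn from the prior
    x_hat, p = 0.0, p0              # filter starts at the prior mean
    for _ in range(steps):
        # Truth propagation and noisy measurement.
        x = a * x + rng.gauss(0.0, q ** 0.5)
        z = x + rng.gauss(0.0, r ** 0.5)
        # Prediction step.
        x_hat = a * x_hat
        p = a * a * p + q
        # Measurement update step.
        k = p / (p + r)
        x_hat = x_hat + k * (z - x_hat)
        p = (1.0 - k) * p
    return x - x_hat, p


def monte_carlo_check(a=0.95, q=0.1, r=1.0, p0=1.0, steps=30, runs=5000, seed=0):
    """Return (empirical mean squared error, theoretical variance)."""
    rng = random.Random(seed)
    sq_err, p_theory = 0.0, p0
    for _ in range(runs):
        err, p_theory = kalman_run(a, q, r, p0, steps, rng)
        sq_err += err * err
    return sq_err / runs, p_theory
```

If the two returned numbers differ by much more than the sampling error (roughly a few percent for 5,000 runs), that points to one of the problems listed above, such as a coding bug or a modeling error.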

Kalman’s 1960 paper represented a big 
paradigm shift in two ways: (1) it exploited fast 
low-cost modern digital computers, whereas the 
literature up to that time did not, and (2) it used 
time domain methods rather than the ubiquitous 
Fourier transform methods, which limited the 
dynamics to steady state asymptotic in time, 
which in turn limited the theory to cover stable 
dynamics. Of course today we take both of 
these big points as normal engineering rather 
than revolutionary and surprising. The state of 
the art prior to Kalman’s 1960 paper was the 
Wiener filter, which was based firmly on the 
Fourier transform. The Wiener filter required 
very lengthy and cumbersome algebraic spectral 
factorization with complex variables, resulting in 
erroneous formulas published in certain books, 
owing to algebraic errors which were not obvious 
and which could not be checked by computers, 
owing to the nonexistence of computer algebra 
software (e.g., MATHEMATICA) in 1960. 
Kalman explains many other problems with the 
Wiener filter in Kalman (2003). 

Summary and Future Directions 

To a large extent the Kalman filter theory is complete. There is a
simple and useful theory of stability of the Kalman filter, which is
completely lacking for nonlinear filters including the extended Kalman
filter (EKF) and particle filters. Moreover, there are many robust
versions of the Kalman filter that have been invented to mitigate
ill-conditioning of the covariance matrix as well as uncertainty in
system models and system parameters. However, there remain many
important design issues that need to be addressed for practical
applications; see Daum (2005) for details. The most obvious issues
include nonlinear measurements, nonlinear plant models, robustness to
uncertainty in measurement models and plant models, non-Gaussian
measurement noise and plant noise, non-Gaussian initial uncertainty in
the state vector, and ill-conditioning






KYP Lemma and Generalizations/Applications 


of the error covariance matrix. That is, any deviation from the exact
mathematical assumptions in the Kalman filter theory can cause
problems in practice. It is the job of engineers to mitigate such
problems and design filters that are robust to such perturbations.
Kalman’s recent papers on how the Kalman filter was invented and why
it is so popular contain interesting and useful ideas; see Kalman
(1978) and Kalman (2003).
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Abstract 

Various properties of dynamical systems 
can be characterized in terms of inequality 
conditions on their frequency responses. The 
Kalman-Yakubovich-Popov (KYP) lemma shows the
equivalence of such a frequency domain inequality
(FDI) and a linear matrix inequality (LMI). The
fundamental result has been a basis for robust 
and optimal control theories in the past several 
decades. The KYP lemma has recently been 
generalized to the case where an FDI on a 
possibly improper transfer function is required 
to hold in a (semi)finite frequency range. The 
generalized KYP lemma allows us to directly 
deal with practical situations where design 
parameters are sought to satisfy FDIs in multiple 
(semi)finite frequency ranges. Various design 
problems, including FIR filter and PID controller, 
reduce to LMI problems which can be solved via 
semidefinite programming. 


Keywords 

Bounded real; Frequency domain inequality; 
Linear matrix inequality; Multi-objective design; 
Optimal control; Positive real; Robust control 


Introduction 

In linear systems analysis and control design, 
dynamical properties are often characterized 
by frequency responses. The shape of a 
frequency response, as visualized by the 
Bode or Nyquist plot, is closely related to 
various performance measures including the 
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steady state error, fast and smooth transient, 
and robustness against unmodeled dynamics. 
Hence, desired system properties can be 
formalized in terms of a set of frequency 
domain inequalities (FDIs) on selected transfer 
functions. The analysis and design problems 
then reduce to verification and satisfaction of 
the FDIs. 

The Kalman-Yakubovich-Popov (KYP) 
lemma (Anderson 1967; Kalman 1963; Rantzer 
1996; Willems 1971) establishes the equivalence 
between an FDI and a linear matrix inequality 
(LMI). The LMI is defined by state space 
matrices of the transfer function in the FDI so 
that the FDI holds true if and only if the LMI 
admits a solution. The LMI characterization of 
an FDI is useful since it replaces the process of 
checking the FDI at infinitely many frequency 
points by the search for a symmetric matrix 
satisfying a finite dimensional convex constraint 
defined by the LMI. In addition to exact and 
tractable computations, benefits of the LMI 
conditions include analytical understanding of 
robust and optimal controls through spectral 
factorizations and storage/Lyapunov functions. 
The KYP lemma is a fundamental result in the 
systems and control field that has provided, in 
the past half century, a theoretical basis for 
developments of various tools for system analysis 
and design. 

A drawback of the KYP lemma is its inability to characterize an FDI
in a finite frequency range. Feedback control designs typically
involve a set of specifications given in terms of multiple FDIs in
various frequency ranges. However, the KYP lemma is not capable of
treating such FDIs directly since it has to consider the entire
frequency range. To address this deficiency, the KYP lemma has
recently been generalized to characterize an FDI in a finite frequency
range exactly (Iwasaki et al. 2000). Further generalizations (Iwasaki
and Hara 2005) are available for FDIs within various frequency ranges
for both continuous- and discrete-time, possibly improper, rational
transfer functions. The generalized KYP lemma allows for direct
multi-objective design of filters, controllers, and dynamical systems.


KYP Lemma 

The KYP lemma may be motivated from various aspects, but let us
explain it as an extension of a gain condition. Consider a stable
linear system

$$\dot{x} = Ax + Bu, \qquad G(s) := (sI - A)^{-1}B,$$

where $x(t) \in \mathbb{R}^n$ is the state, $u(t) \in \mathbb{R}^m$ is
the input, and $G(s)$ is the transfer function from $u$ to $x$. If $u$
is a disturbance to the system and $x$ represents the error from a
desired operating point, we may be interested in how large the state
variables can become for a given magnitude of the disturbance. The
gain $\|G(j\omega)\|$ captures this property for the case of a
sinusoidal disturbance at frequency $\omega$, where $\|\cdot\|$ denotes
the spectral norm (= absolute value for a scalar). If
$\|G(j\omega)\| < \gamma$ holds for all frequencies $\omega$ with a
small $\gamma$, then the system has a good disturbance attenuation
property.

A version of the KYP lemma states that the FDI
$\|G(j\omega)\| < \gamma$ with $\gamma = 1$ holds for all frequencies
$\omega$ if and only if there exists a symmetric matrix $P$ satisfying
the LMI:

$$\begin{bmatrix} PA + A^T P + I & PB \\ B^T P & -I \end{bmatrix} < 0.$$

Thus, existence of one particular $P$ satisfying the LMI is enough to
conclude that the gain is less than one for all, infinitely many,
frequencies. This result is known as the bounded real lemma and has
played a fundamental role in the robust and $H_\infty$ control
theories.
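
To make the bounded real lemma concrete, the following sketch treats the scalar case $A = a$, $B = b$ with $b \neq 0$, so that $G(s) = b/(s - a)$. The function names are hypothetical. In this case the 2x2 LMI reduces, by the Schur complement, to the quadratic inequality $b^2 p^2 + 2ap + 1 < 0$ in the unknown $p$, which has a solution exactly when $|a| > |b|$. The sketch returns the minimizer $p = -a/b^2$ as a certificate and compares the outcome with a brute-force gain evaluation on a frequency grid.

```python
def gain_peak(a, b, omegas):
    """Largest |G(jw)| over a frequency grid for G(s) = b / (s - a)."""
    return max(abs(b / (1j * w - a)) for w in omegas)


def bounded_real_certificate(a, b):
    """Return p solving the scalar bounded real LMI, or None if none exists.

    The LMI [[2*a*p + 1, p*b], [p*b, -1]] < 0 is equivalent to
    b^2 p^2 + 2 a p + 1 < 0 (Schur complement). Assumes b != 0.
    """
    if a * a <= b * b:
        return None            # the quadratic in p is never negative
    return -a / (b * b)        # minimizer of the quadratic


def lmi_is_negative_definite(a, b, p):
    """Check the 2x2 LMI via leading principal minors."""
    m11, m12, m22 = 2.0 * a * p + 1.0, p * b, -1.0
    return m11 < 0.0 and m11 * m22 - m12 * m12 > 0.0
```

For example, `a = -2, b = 1` gives the certificate `p = 2` and a peak gain of 0.5, while `a = -1, b = 2` has peak gain 2 and no certificate. A single number `p` thus replaces the check over infinitely many frequencies.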

The KYP lemma can be introduced as a generalization of the bounded
real lemma. First, note that the gain bound condition
$\|G(j\omega)\| < 1$ and the LMI condition can equivalently be written
as

$$\begin{bmatrix} G(j\omega) \\ I \end{bmatrix}^{*} \Theta \begin{bmatrix} G(j\omega) \\ I \end{bmatrix} < 0, \qquad (1)$$

$$\begin{bmatrix} PA + A^T P & PB \\ B^T P & 0 \end{bmatrix} + \Theta < 0, \qquad (2)$$

where

$$\Theta := \begin{bmatrix} I & 0 \\ 0 & -I \end{bmatrix}.$$

In these equations, the particular matrix $\Theta$ is chosen to
describe the gain bound condition as a special case of the quadratic
form (1), and we observe that $\Theta$ appears in the LMI as in (2).
It turns out that the equivalence of (1) and (2) holds not only for
this particular $\Theta$ but also for an arbitrary symmetric matrix
$\Theta$. This result is called the KYP lemma, which states that, given
arbitrary matrices $A$, $B$, and $\Theta = \Theta^T$, the FDI (1) holds
for all frequencies $\omega$ if and only if there exists a matrix
$P = P^T$ satisfying the LMI (2), provided $A$ has no eigenvalues on
the imaginary axis.

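The FDI in (1) can be specialized to an FDI

$$\begin{bmatrix} L(j\omega) \\ I \end{bmatrix}^{*} \Pi \begin{bmatrix} L(j\omega) \\ I \end{bmatrix} < 0 \qquad (3)$$

on transfer function

$$L(s) := C(sI - A)^{-1}B + D$$

by choosing

$$\Theta := \begin{bmatrix} C & D \\ 0 & I \end{bmatrix}^{T} \Pi \begin{bmatrix} C & D \\ 0 & I \end{bmatrix}. \qquad (4)$$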


The choice of matrix $\Pi$ allows for characterizations of important
system properties involving gain and phase of $L(s)$. For instance,
the FDI (3) with

$$\Pi := \begin{bmatrix} 0 & -I \\ -I & 0 \end{bmatrix}$$

gives $L(j\omega) + L(j\omega)^{*} > 0$. This is called the positive
real property, with which the phase angle remains between ±90° when
$L(j\omega)$ is a scalar.
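
As a short worked check of the positive real case, take $\Pi$ with zero diagonal blocks and $-I$ off-diagonal blocks (the standard choice for this property, stated here as an assumption). Expanding the quadratic form in (3) gives:

```latex
\begin{aligned}
\begin{bmatrix} L(j\omega) \\ I \end{bmatrix}^{*}
\begin{bmatrix} 0 & -I \\ -I & 0 \end{bmatrix}
\begin{bmatrix} L(j\omega) \\ I \end{bmatrix}
&= \begin{bmatrix} L(j\omega)^{*} & I \end{bmatrix}
   \begin{bmatrix} -I \\ -L(j\omega) \end{bmatrix} \\
&= -\bigl( L(j\omega) + L(j\omega)^{*} \bigr) < 0 .
\end{aligned}
```

For a scalar $L(j\omega)$ this reads $\operatorname{Re} L(j\omega) > 0$, which is the statement that the phase stays strictly within ±90°.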


Generalization 


The standard KYP lemma deals with FDIs that are required to hold for
all frequencies. To allow for more flexibility in practical system
designs, the KYP lemma has been generalized to deal with FDIs in
(semi)finite frequency ranges.

For instance, a version of the generalized KYP lemma states that the
FDI (1) holds in the low frequency range $|\omega| \le \varpi_\ell$ if
and only if there exist matrices $P = P^T$ and $Q = Q^T \ge 0$
satisfying

$$\begin{bmatrix} A & B \\ I & 0 \end{bmatrix}^{T} \begin{bmatrix} -Q & P \\ P & \varpi_\ell^2 Q \end{bmatrix} \begin{bmatrix} A & B \\ I & 0 \end{bmatrix} + \Theta < 0, \qquad (5)$$

provided $A$ has no imaginary eigenvalues in the frequency range. In
the limiting case where $\varpi_\ell$ approaches infinity and the FDI
is required to hold for the entire frequency range, the solution $Q$
to (5) approaches zero, and we recover (2).

The role of the additional parameter $Q$ is to enforce the FDI only in
the low frequency range. To see this, consider the case where the
system is stable and a sinusoidal input
$u = \Re[\hat{u}e^{j\omega t}]$, with (complex) phasor vector
$\hat{u}$, is applied. The state converges to the sinusoid
$x = \Re[\hat{x}e^{j\omega t}]$ in the steady state where
$\hat{x} := G(j\omega)\hat{u}$. Multiplying (5) by the column vector
obtained by stacking $\hat{x}$ and $\hat{u}$ in a column from the
right, and by its complex conjugate transpose from the left, we obtain

$$(\varpi_\ell^2 - \omega^2)\,\hat{x}^{*} Q \hat{x} + \begin{bmatrix} \hat{x} \\ \hat{u} \end{bmatrix}^{*} \Theta \begin{bmatrix} \hat{x} \\ \hat{u} \end{bmatrix} < 0.$$

In the low frequency range $|\omega| \le \varpi_\ell$, the first term
is nonnegative, enforcing the second term to be negative, which is
exactly the FDI in (1). If $\omega$ is outside of the range, however,
the first term is negative, and the FDI is not required to hold.

Similar results hold for various frequency ranges. The term involving
$Q$ in (5) can be expressed as the Kronecker product $\Psi \otimes Q$
with $\Psi$ being a diagonal matrix with entries
$(-1, \varpi_\ell^2)$. The matrix $\Psi$ arises from characterization
of the low frequency range:

$$\begin{bmatrix} j\omega \\ 1 \end{bmatrix}^{*} \Psi \begin{bmatrix} j\omega \\ 1 \end{bmatrix} = \varpi_\ell^2 - \omega^2 \ge 0.$$
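
The congruence step in the low-frequency argument can be checked numerically. The sketch below is an illustration for scalar $A = a$, $B = b$ (the function name and test values are assumptions). It evaluates the first term of (5) on the steady-state phasors and confirms that the contribution of $P$ cancels, leaving $(\varpi_\ell^2 - \omega^2)\,q\,|\hat{x}|^2$.

```python
def gkyp_first_term(a, b, p, q, varpi, omega, u_hat=1.0):
    """Evaluate v^* M v with v = F [x_hat; u_hat] for scalar A = a, B = b.

    F = [[a, b], [1, 0]] and M = [[-q, p], [p, varpi**2 * q]], i.e. the
    first term of the low-frequency inequality (5) without Theta.
    Returns (value, x_hat).
    """
    x_hat = b / (1j * omega - a) * u_hat   # steady-state phasor x = G(jw) u
    top = a * x_hat + b * u_hat            # first row of F v; equals jw * x_hat
    bot = x_hat                            # second row of F v
    value = (top.conjugate() * (-q * top + p * bot)
             + bot.conjugate() * (p * top + varpi ** 2 * q * bot))
    return value, x_hat
```

The result is real up to rounding, and its sign changes at $|\omega| = \varpi_\ell$, which is how $Q$ switches the constraint on only inside the low frequency range.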


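By different choices of $\Psi$, middle and high frequency ranges can
also be characterized:

$$\begin{array}{c|ccc}
 & \text{Low} & \text{Middle} & \text{High} \\ \hline
\Omega & |\omega| \le \varpi_\ell & \varpi_1 \le \omega \le \varpi_2 & |\omega| \ge \varpi_h \\
\Psi & \begin{bmatrix} -1 & 0 \\ 0 & \varpi_\ell^2 \end{bmatrix}
     & \begin{bmatrix} -1 & j\varpi_c \\ -j\varpi_c & -\varpi_1\varpi_2 \end{bmatrix}
     & \begin{bmatrix} 1 & 0 \\ 0 & -\varpi_h^2 \end{bmatrix}
\end{array}$$

where $\varpi_c := (\varpi_1 + \varpi_2)/2$ and $\Omega$ is the
frequency range. For each pair $(\Omega, \Psi)$, the FDI (1) holds in
the frequency range $\omega \in \Omega$ if and only if there exist real
symmetric matrices $P$ and $Q \ge 0$ satisfying

$$F^{T}(\Phi \otimes P + \Psi \otimes Q)F + \Theta < 0, \qquad (6)$$

provided $A$ has no eigenvalues in $\Omega$, where

$$\Phi := \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \qquad F := \begin{bmatrix} A & B \\ I & 0 \end{bmatrix}.$$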
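
A small numerical sketch can illustrate how a pair $(\Omega, \Psi)$ works. It assumes the high-frequency choice $\Psi = \mathrm{diag}(1, -\varpi_h^2)$, the swap matrix $\Phi$ with zero diagonal and unit off-diagonal entries, and the gain bound $\Theta = \mathrm{diag}(1, -1)$, all for the scalar system $G(s) = b/(s - a)$. The certificate $p = -aq$ is a hand-derived choice for this example only, not a general solver, and the function names are hypothetical.

```python
def psi_form(psi, omega):
    """sigma(w) = [jw; 1]^* Psi [jw; 1] for a 2x2 Hermitian Psi (nested lists)."""
    v = (1j * omega, 1.0 + 0j)
    total = sum(v[i].conjugate() * psi[i][k] * v[k]
                for i in range(2) for k in range(2))
    return total.real


def high_range_certificate(a, b, varpi_h):
    """Try to certify |G(jw)| < 1 for |w| >= varpi_h, G(s) = b / (s - a).

    With p = -a*q, inequality (6) becomes the diagonal matrix
    diag(1 - q*(a^2 + varpi_h^2), q*b^2 - 1), which is negative definite
    iff 1/(a^2 + varpi_h^2) < q < 1/b^2. Assumes b != 0.
    Returns (p, q) or None if this construction fails.
    """
    lo = 1.0 / (a * a + varpi_h ** 2)
    hi = 1.0 / (b * b)
    if lo >= hi:
        return None
    q = 0.5 * (lo + hi)
    p = -a * q
    # Entries of F^T (Phi*p + Psi*q) F + Theta for F = [[a, b], [1, 0]].
    m11 = a * a * q + 2.0 * a * p - varpi_h ** 2 * q + 1.0
    m12 = b * (p + a * q)
    m22 = q * b * b - 1.0
    assert m11 < 0.0 and m11 * m22 - m12 * m12 > 0.0
    return p, q
```

For `a = -1, b = 2` the gain at zero frequency is 2, so the standard KYP inequality (2) is infeasible, yet `high_range_certificate(-1, 2, 2)` succeeds because the gain is below one for $|\omega| \ge 2$. The helper `psi_form` returns a nonnegative value exactly on the corresponding range, for instance $-(\omega - \varpi_1)(\omega - \varpi_2)$ for the middle-range $\Psi$.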


Further generalizations are available (Iwasaki and Hara 2005). The
discrete-time case (frequency variable on the unit circle) can be
similarly treated by a different choice of $\Phi$. FDIs for descriptor
systems and polynomial (rather than rational) functions can also be
characterized in a form similar to (6) by modifying the matrix $F$.
More specifically, the choices

$$\Phi := \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad F := \begin{bmatrix} A & B \\ E & O \end{bmatrix}$$

give the result for the discrete-time transfer function
$L(z) = (zE - A)^{-1}(B - zO)$.


Applications 

The generalized KYP lemma is useful for a variety of dynamical system
designs. As an example, let us consider a classical feedback control
design via shaping of a scalar open-loop transfer function in the
frequency domain. The objective is to design a controller $K(s)$ for a
given plant $P(s)$ such that the closed-loop system is stable and
possesses a good performance dictated by reference tracking,
disturbance attenuation, noise sensitivity, and robustness against
uncertainties.



KYP Lemma and Generalizations/Applications, Fig. 1 

Loop shaping design specifications 


Typical design specifications are given in terms of bounds on the gain
and phase of the open-loop transfer function $L(s) := P(s)K(s)$ in
various frequency ranges as shown in Fig. 1. The controller $K(s)$
should be designed so that the frequency response $L(j\omega)$ avoids
the shaded regions. For instance, the gain should satisfy
$|L(j\omega)| > 1$ for $|\omega| < \omega_2$ and $|L(j\omega)| < 1$
for $|\omega| > \omega_3$ to ensure the gain crossover occurs in the
range $\omega_2 < \omega < \omega_3$, and the phase bound
$\angle L(j\omega) > \theta$ in this range ensures robust stability by
the phase margin.

The design specifications can be expressed as FDIs of the form (3),
where a particular gain or phase condition can be specified by setting
$\Pi$ as

$$\pm\begin{bmatrix} 1 & 0 \\ 0 & -\gamma^2 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} 0 & j - \tan\theta \\ -j - \tan\theta & 0 \end{bmatrix}$$

with $\gamma = \gamma_1$, $\gamma_4$, or $1$, and the $+/-$ signs for
upper/lower gain bounds. These FDIs in the corresponding frequency
ranges can be converted to inequalities of the form (6) with $\Theta$
given by (4).
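
The gain and phase specifications can be checked pointwise with a few lines of code. The sketch below assumes the two forms of $\Pi$ used for loop shaping, namely $\pm\,\mathrm{diag}(1, -\gamma^2)$ for gain bounds and the matrix with off-diagonal entries $j - \tan\theta$ and $-j - \tan\theta$ for the phase bound; the function names are hypothetical. Expanding the phase form gives $2(\operatorname{Im} L - \tan\theta \operatorname{Re} L)$, so for a phase bound $\theta$ between $-180°$ and $-90°$ (where $\cos\theta < 0$) a negative value corresponds to $\theta < \angle L < \theta + \pi$.

```python
import cmath
import math


def fdi_value(pi, L):
    """[L; 1]^* Pi [L; 1] for a scalar response L and a 2x2 Hermitian Pi."""
    v = (complex(L), 1.0 + 0j)
    total = sum(v[i].conjugate() * pi[i][k] * v[k]
                for i in range(2) for k in range(2))
    return total.real


def gain_pi(gamma, upper=True):
    """Pi for |L| < gamma (upper=True) or |L| > gamma (upper=False)."""
    s = 1.0 if upper else -1.0
    return [[s, 0.0], [0.0, -s * gamma ** 2]]


def phase_pi(theta):
    """Pi for the half-plane phase bound; theta in radians, cos(theta) < 0."""
    t = math.tan(theta)
    return [[0.0, 1j - t], [-1j - t, 0.0]]
```

A specification is met at a frequency when `fdi_value` is negative. For example, with `theta = math.radians(-120)` a response with phase $-100°$ gives a negative value and one with phase $-150°$ gives a positive value.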

The control problem is now reduced to the search for design parameters
satisfying the set of inequality conditions (6). In general, both
coefficient matrices $F$ and $\Theta$ may depend on the design
parameters, but if the poles of the controller are fixed (as in the
PID control), then the design parameters will appear only in $\Theta$.
If in addition an FDI specifies a convex region for $L(j\omega)$ on the
complex plane, then the corresponding inequality (6) gives a convex
constraint on $P$, $Q$, and the design parameters. This is the case
for specifications of gain upper bound (disk: $|L| < \gamma$) and
phase bound (half plane: $\theta < \angle L < \theta + \pi$). A gain
lower bound $|L| > \gamma$ is not convex but can often be approximated
by a half plane. The design parameters satisfying the specifications
can then be computed via convex programming.

Various design problems other than the open-loop shaping can also be
solved in a similar manner, including finite impulse response (FIR)
digital filter design with gain and phase constraints in a passband
and stop-band and sensor or actuator placement for mechanical control
systems (Hara et al. 2006; Iwasaki et al. 2003). Control design with
the Youla parametrization also falls within the framework if a basis
expansion is used for the Youla parameter and the coefficients are
sought to satisfy convex constraints on closed-loop transfer functions.

Summary and Further Directions 

The KYP lemma has played a fundamental role in systems and control
theories, equivalently converting an FDI to an LMI. Dynamical systems
properties characterized in the frequency domain are expressed in
terms of state space matrices without involving the frequency
variable. The resulting LMI condition has been found useful for
developing robust and optimal control theories.

A recent generalization of the KYP lemma characterizes an FDI for a
possibly improper rational function in a (semi)finite frequency range.
The result allows for direct solutions of practical design problems to
satisfy multiple specifications in various frequency ranges. A design
problem is essentially solvable when transfer functions are affine in
the design parameters and are required to satisfy convex FDI
constraints. An important problem, which falls outside of this
framework and remains open, is the design of feedback controllers to
satisfy multiple FDIs on closed-loop transfer functions in various
frequency ranges. There have been some attempts to address this
problem, but none of them has so far succeeded in giving an exact
solution.

The KYP lemma has been extended in 
other directions as well, including FDIs 
with frequency-dependent weights (Graham 
and de Oliveira 2010), internally positive 
systems (Tanaka and Langbort 2011), full 
rank polynomials (Ebihara et al. 2008), real 
multipliers (Pipeleers and Vandenberghe 2011), 
a more general class of FDIs (Gusev 2009), 
multidimensional systems (Bachelier et al. 2008), 
negative imaginary systems (Xiong et al. 2012), 
symmetric formulations for robust stability 
analysis (Tanaka and Langbort 2013), and 
multiple frequency intervals (Pipeleers et al. 
2013). Extensions of the KYP lemma and related 
S-procedures are thoroughly reviewed in Gusev 
and Likhtarnikov (2006). A comprehensive 
tutorial of robust LMI relaxations is provided 
in Scherer (2006) where variations of the KYP 
lemma, including the generalized KYP lemma as 
a special case, are discussed in detail. 

Cross-References 

► Classical Frequency-Domain Design Methods 

► H-Infinity Control 

► LMI Approach to Robust Control 
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Abstract 

This chapter provides an overview of lane keeping systems. First, a
general architecture is introduced and existing solutions for the
necessary sensors and actuators are then overviewed. The threat
assessment and the lane position control problems are discussed,
highlighting challenges and solutions implemented in lane keeping
systems available on the market.
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Introduction 

Lane keeping systems are vehicle guidance 
systems that aim at preventing lane departure 
maneuvers, which may lead to accidents, i.e., 
collision with surrounding obstacles and vehicles. 


By resorting to radar and/or lasers and cameras, a lane keeping system
monitors the adjacent lanes. Crossing the lane markings in the absence
of vehicles and/or obstacles in the adjacent lanes should not cause
any reaction of the lane keeping system, so that the driver can freely
perform the lane change maneuver. In the presence of vehicles or
obstacles in the adjacent lanes, the system should assess the threat
and, in case a risk of collision is detected, either warn the driver
or automatically issue a steering or a single-wheel braking command in
order to prevent the crossing of the lane markings. As discussed next,
despite the apparent simplicity of the threat assessment and the
decision-making and control problem, challenges arise in real traffic
scenarios, which may lead to nuisance due to unnecessary warnings
and/or assisting interventions.

In this entry, we overview the most important aspects in the design of
a lane keeping system. This entry is structured as follows. Section
“Lane Keeping Systems Architecture” illustrates a generic
architecture. Section “Sensing and Actuation” reviews the most used
sensors suitable for lane keeping applications. Section “Decision
Making and Control” introduces the threat assessment and the lane
position control problems, highlighting the most relevant challenges.

Lane Keeping Systems Architecture 

The main components of a lane keeping system 
and their interconnections are shown in Fig. 1. 


J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9, 
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Relative positions and velocities of the host ve¬ 
hicle w.r.t. the surrounding environment are mea¬ 
sured by one radar, typically installed on the 
front of the vehicle, and possibly by the camera, 
typically installed on the windshield. Position 
of the host vehicle within the lane and further 
information, e.g., road geometry, are measured by 
the camera. These measurements are then fused 
by the sensor fusion module to provide accurate 
measurements of the position and velocity of 
the vehicle w.r.t. the surrounding environment 
and the lane in the widest range of operating 
conditions and scenarios. 

The task of the decision-making and control module is to assess the
risk that the vehicle crosses the lane in a dangerous way and,
possibly, to take an action that can range from warning the driver to
issuing an assisting intervention, e.g., braking and/or steering. Such
steering and braking commands are actually implemented by low-level
controllers.

The different modules will be overviewed in 
the following sections. 

Sensing and Actuation 

Radar 

Radars for automotive applications are placed in 
the front of the car, typically behind the grille. 
The radar emits radio waves and distance from 
the vehicle ahead is calculated by measuring the 
arrival time and direction of the reflected radio 
waves. The relative velocity is determined by 


relying on the Doppler effect, i.e., by measuring 
the frequency change of the reflected waves. 
Relative distance and velocity measurements are 
typically updated with a frequency of 10 Hz. 

Radars for automotive applications emit waves 
with a frequency of 77 GHz and detect objects 
within an approximate range of 150 m and a 
view angle of about ±10°, with a deviation of 
20-30 cm from the correct value for 95 % of 
the measurements (Eidehall 2004). New radar 
systems increase the range up to about 200 m 
with a view angle of about ±10° (News Releases 
DENSO Corporation 2013a). 
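
The Doppler-based velocity measurement mentioned above can be illustrated with the standard two-way Doppler relation $\Delta f = 2 v f_0 / c$, valid for speeds far below the speed of light. The sketch below is only an illustration of that relation at the 77 GHz carrier quoted in the text; the function names are hypothetical and it does not describe the signal processing of any particular radar unit.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def doppler_shift(closing_speed_mps, carrier_hz=77e9):
    """Two-way Doppler shift (Hz) for a target closing at the given speed."""
    return 2.0 * closing_speed_mps * carrier_hz / SPEED_OF_LIGHT


def relative_speed_from_doppler(delta_f_hz, carrier_hz=77e9):
    """Closing speed (m/s) recovered from a measured Doppler shift (Hz)."""
    return SPEED_OF_LIGHT * delta_f_hz / (2.0 * carrier_hz)
```

At 77 GHz, a closing speed of 10 m/s corresponds to a frequency shift of roughly 5.1 kHz.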

Typically, radar units are equipped with computer systems running
signal processing algorithms that detect and track objects and, for
each of them, calculate relative position, speed, and azimuth angle,
also providing additional information, e.g., the time an object has
been tracked and a flag indicating that a target has been locked. Such
additional information is typically used in the logic implementing the
decision-making algorithms of, among others, the lane keeping system.

There are several issues arising from the use of a radar in automotive
applications, e.g., wave reflections due to road bumps and barriers
that may cause the signal processing algorithms to produce false
object detections (Eidehall 2004). Moreover, interference and the
vehicle dynamics (News Releases DENSO Corporation 2013a), e.g.,
pitching due to braking, may limit the capability of the signal
processing algorithms to correctly detect and track the surrounding
objects. The latter may be solved by, e.g., using electric motors that
adjust the radar antenna axes in order to compensate for the vehicle
dynamics (News Releases DENSO Corporation 2013a).

Vision Systems 

Vision systems in lane keeping applications are typically based on a
single CCD camera mounted next to the rear-view mirror placed at the
center of the windshield. The image is typically captured at a
resolution of 640 × 480 pixels and then processed by an image
processing unit. The sampling time of the vision system is about
0.1 s, but it can change depending on, e.g., the complexity of the
scene, for example, in city traffic (Eidehall 2004).

Lane markings are detected by using differences in the image contrast
(Technology Daimler and Safety Innovation 2013). The camera can be
either monochrome or full colored. The latter is used to enhance the
detection of lane markings, which have different colors around the
world (News Releases DENSO Corporation 2013b). Distances to the lane
markings and road geometry parameters, like heading angle and
curvature, are determined by the image processing algorithms, which
must be robust to poor image quality due to bad weather conditions or
worn lane markings. Estimation of road geometry parameters, like
curvature, can be a challenging problem (Lundquist and Schon 2011),
especially during rain or fog (Eidehall 2004).

Depending on the image processing algorithms the cameras are equipped
with, surrounding objects can also be detected and tracked. In
particular, pattern recognition algorithms can be used to find objects
in the images and classify them into cars, trucks, motorcycles, and
pedestrians. Vehicles (or other objects) can typically be detected in
a range of about 60-70 m, with lower accuracy than a radar (Eidehall
2004).

Actuators 

In order to keep the vehicle within its lane, the 
most convenient actuator is the steering. Hence, 
a lane keeping system can be quite easily built 
in those vehicles equipped with electric power- 
assisted steering (EPAS) systems. In particular, 
an additional steering torque can be added by the 


EPAS to the driver’s steering torque, in order to 
generate the desired yaw moment calculated by 
the decision-making and control module. 

Clearly, the steering command is not the only means available to
affect the vehicle yaw motion, thus changing its orientation and
lateral position within the lane. Individual wheel braking may also be
used (Technology Daimler and Safety Innovation 2013). In particular,
in vehicles equipped with a yaw motion control system via individual
braking, a braking torque request for each wheel can be sent to the
yaw motion control system in order to generate the desired yaw motion.

Decision Making and Control 

The decision making and control in a lane keeping problem can be
conceptually divided into two tasks: the threat assessment and the
lane position control. The threat assessment problem can be stated as
the problem of detecting the risk of accident due to an unintended
lane departure, for a given situation of the surrounding environment
(i.e., surrounding vehicles and obstacles). The lane position control
problem is the problem of controlling the vehicle yaw and lateral
motion in order to stay within the lane. The lane position control is
activated once the threat assessment detects the risk of accident.

We point out that the border between the corresponding modules
executing these two tasks may be blurred for different existing
commercial lane keeping systems. That is, the two problems may not be
solved by two separate modules, but rather seen and solved as a single
problem. Moreover, the following presentation of the threat assessment
and the lane position control problems and approaches abstracts from
the implementation of a particular lane keeping system available on
the market, rather focusing on fundamental concepts.

Threat Assessment 

The core information in a threat assessment algorithm for lane keeping
applications is given by a measure called time to lane crossing (TLC).
This is the predicted time when a front tire intersects a lane
boundary. As explained in van Winsum et al. (2000), the TLC can be
calculated in different ways. Its simplest expression is (Eidehall
2004)


$$\mathrm{TLC} = \frac{W/2 - h/2 - y_{\mathrm{off}}}{\dot{y}_{\mathrm{off}}}, \qquad (1)$$

where $W$ is the lane width, $y_{\mathrm{off}}$ is the vehicle lateral
position within the lane, and $h$ is the vehicle width. Equation (1)
can be easily modified to calculate the TLC w.r.t. any lane boundary
relative to the adjacent lanes.
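
The TLC expression can be sketched in code as follows. The sign convention is an assumption made for this illustration: the lateral offset and its rate are taken as positive toward the lane boundary under consideration, and an infinite TLC is returned when the vehicle is not drifting toward that boundary. The function and argument names are hypothetical.

```python
def time_to_lane_crossing(lane_width, vehicle_width, y_off, y_off_rate):
    """TLC in seconds from Eq. (1): (W/2 - h/2 - y_off) / y_off_rate.

    Lengths are in meters and y_off_rate in m/s. y_off and y_off_rate are
    assumed positive toward the boundary of interest.
    """
    if y_off_rate <= 0.0:
        return float("inf")          # not approaching this boundary
    margin = lane_width / 2.0 - vehicle_width / 2.0 - y_off
    return max(margin, 0.0) / y_off_rate   # zero once the tire is on the line
```

For a 3.5 m lane, a 1.8 m wide vehicle offset by 0.35 m and drifting at 0.25 m/s, the remaining margin is 0.5 m and the TLC is about 2 s.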

The simplest way of using the TLC is to monitor it and trigger an
action when it falls below a threshold. Nevertheless, depending on the
vehicle manufacturer, more sophisticated logics can be developed in
order to correctly interpret the driver’s intention and minimize
unnecessary assisting interventions. The following scenarios must be
taken into account while developing such logics in order not to
interfere with the driver. In particular, the threat assessment module
should stop or not trigger any assisting intervention while the
vehicle is approaching or crossing a lane boundary if

• The indicators are active, 

• A risk of collision with the vehicle ahead
is detected, such that the vehicle is crossing
the lane markings as a result of an evasive
maneuver,

• The radar detects a slower vehicle ahead and 
the driver accelerates, since this may be an 
overtaking (Technology Daimler and Safety 
Innovation 2013), 

• The driver’s steering wheel torque indicates 
that the driver is acting against the system, 

• The driver manually initiates a maneuver,
driving the vehicle back to its lane (i.e., the
driver executes “the right” maneuver),

• The vehicle enters a motor highway or 
a bend (Technology Daimler and Safety 
Innovation 2013). 
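
The threshold rule and the suppression conditions listed above can be combined in a small decision function. This is a hedged sketch, not the logic of any commercial system: the field names, the single TLC threshold, and the way each condition is detected are all assumptions made for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass
class DriverContext:
    """Hypothetical flags, one per suppression condition in the list above."""
    indicators_active: bool = False
    evasive_maneuver: bool = False
    overtaking_suspected: bool = False
    driver_overriding: bool = False
    driver_correcting: bool = False
    entering_highway_or_bend: bool = False


def should_intervene(tlc_s, tlc_threshold_s, ctx):
    """Return True if an assisting intervention should be triggered."""
    if tlc_s >= tlc_threshold_s:
        return False                      # no imminent lane crossing
    # Any active flag means the crossing is likely intended or handled.
    return not any(asdict(ctx).values())
```

Keeping the suppression conditions in one place makes it easier to review how each of them affects nuisance interventions during tuning.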

Part of the threat assessment task is predicting the trajectories of
the surrounding vehicles. For instance, if a threat vehicle is
traveling in the adjacent lane (in the same or opposite direction),
its position has to be predicted at the TLC in order to decide whether
a collision is predicted and an intervention should therefore be
triggered (Eidehall 2004). This step is repeated for all the detected
threat vehicles, provided that the onboard radar and the camera
support multiple-target tracking.

In order to minimize the interference of the lane keeping system with
the driver and/or to not let the system perform dangerous maneuvers,
assisting interventions should not be triggered if the quality of the
measurements is such that the information about the surrounding
environment is poor. For instance, in case of low visibility that
limits the detection of the lane markings and the estimation of the
road geometry, the system should be temporarily deactivated or
downgraded.

In summary, the threat assessment module has 
to be designed with the objective of detecting 
the risk of accident due to lane departure while 
not interfering with the driver with unnecessary 
interventions (i.e., nuisance minimization). 

Lane Position Control 

As observed in section “Actuators,” the vehicle 
motion within the lane can be affected in two 
ways, i.e., through steering and individual wheel 
braking. Clearly, a steering command can be 
issued by both the driver and the lane keeping 
system. 

Before issuing a steering command, in order to minimize the system
nuisance, the lane keeping system may issue other types of
low-intrusiveness interventions. For instance, if a “low”-level threat
is detected by the threat assessment module (i.e., a threat where the
risk of accidents is not imminent), warnings or other stimuli to the
driver may be issued in order to induce the driver to execute the
right maneuver. For instance, based on, e.g., spectrum analysis of the
driver’s steering command, driver’s inattention or drowsiness may be
detected and a warning issued. As observed in Technology Daimler and
Safety Innovation (2013), different types of warning can be used for
different vehicle types. In passenger cars, in such cases, a vibration
motor in the steering wheel may warn the driver. In trucks, audible,
directional warning signals can be used to let the driver know that
the vehicle trajectory needs to be adjusted. In buses, in order to
avoid bothering the passengers, driver warning is issued through
vibration motors placed in the driver’s seat.

Other types of “soft intervention” aim at increasing the steering
impedance in the direction leading to lane crossing that might cause a
collision with surrounding vehicles. Generating the desired steering
impedance can be easily formulated as a steering torque control
problem. Nevertheless, tuning the control algorithm to obtain the
desired steering feeling can be an involved and time-consuming
procedure based on extensive in-vehicle testing.

Besides warnings and “soft interventions”
aimed at inducing the driver to perform correct
maneuvers, the lane position control
task in a lane keeping system requires a lateral control
algorithm with respect to the lane boundaries.
Consider the vehicle sketched in Fig. 2. The
equations describing the vehicle motion within
the lane can be compactly written in state-space
form as

$\dot{x} = A x + B \delta + D \dot{\psi}_{des}$, (2)

where $x = [e_y \ \dot{e}_y \ e_\psi \ \dot{e}_\psi]^T$, $\dot{\psi}_{des}$ is the desired yaw
rate, e.g., calculated based on the road curvature,
and $A$, $B$, $D$ are speed-dependent matrices that
can be found in Rajamani (2003). The (unstable)
system can be stabilized by a state-feedback control
law

$\delta = -K x + \delta_{ff}$, (3)

where $K$ is a stabilizing static gain and $\delta_{ff}$ is a
feedforward term that can be used to compensate
for the road curvature. In Rajamani (2003), it is
shown that, while $e_y(t) \to 0$ as $t \to \infty$, $e_\psi(t)$
approaches a nonzero steady-state value, no matter
how $\delta_{ff}$ is chosen, on a non-straight road.

Despite a simple problem formulation and 
solution, controlling the vehicle position within 
the lane is not a trivial task. Indeed, having
the control law (3) active all the time may
increase the nuisance, leading to an unacceptable
driving experience. For this reason, the steering
command calculated through (3) may be
active only when the vehicle significantly
deviates from the road centerline, i.e., approaches
the lane markings. Clearly, adding such logic
complicates the analysis of the closed-loop
behavior, thus making extensive in-vehicle
tuning and verification necessary.
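The switching logic described above can be sketched as follows. This is only an illustration under stated assumptions: the gain values, the two thresholds on the lateral error, and the use of hysteresis to avoid chattering are hypothetical choices, not taken from the entry or from Rajamani (2003).

```python
from dataclasses import dataclass


@dataclass
class LaneKeepingSupervisor:
    """Engages the state-feedback law (3) only near the lane markings."""
    gain: tuple            # hypothetical stabilizing gain K (4 entries)
    on_threshold: float    # |e_y| above which the controller engages [m]
    off_threshold: float   # |e_y| below which it disengages [m]
    active: bool = False

    def steering_command(self, x, delta_ff=0.0):
        """Return delta = -K x + delta_ff when active, else None."""
        e_y = x[0]  # first state: lateral position error
        if not self.active and abs(e_y) > self.on_threshold:
            self.active = True
        elif self.active and abs(e_y) < self.off_threshold:
            self.active = False
        if not self.active:
            return None  # no intervention: the driver alone steers
        return -sum(k * xi for k, xi in zip(self.gain, x)) + delta_ff
```

Choosing `off_threshold` smaller than `on_threshold` keeps the controller engaged until the vehicle is well back inside the lane. It is exactly this kind of mode switching that makes the closed-loop analysis harder.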

Summary and Future Directions 

In this chapter, we have overviewed the general 
issues and requirements that must be considered 
in the design of a lane-keeping system. 

The variety of environmental conditions the
sensing system should operate in, together with
the range of diverse scenarios the decision-making
module should cope with, renders the
design and verification problems challenging,
costly, and time consuming for a lane-keeping
system. It is, therefore, necessary to approach
the design of such systems by also providing
safety guarantees to the largest possible extent, while
minimizing the conservatism and intrusiveness of
the overall system. Model-based approaches to
threat assessment and decision-making problems,
as proposed in Falcone et al. (2011) for a
lane departure application, provide neat design
and verification frameworks, which can clearly
describe the safe operation of the overall
system. Adopting such design methodologies can
potentially contribute to a substantial reduction of
the development time by shortening
the a posteriori safety verification phase. On the
other hand, the computational complexity of
formal model-based verification methods can
dramatically increase in those scenarios where
system nonlinearity and nonconvex state spaces
become relevant. Hence, future research efforts
aiming at developing low-complexity verification
methods might greatly impact the future
development of automated driving systems.

Cross-References 

► Adaptive Cruise Control 

► Vehicle Dynamics Control 
Learning in Games

Abstract

In a Nash equilibrium, each player selects a
strategy that is optimal with respect to the strategies
of other players. This definition does not
mention the process by which players reach a
Nash equilibrium. The topic of learning in games
seeks to address this issue in that it explores how
simplistic learning/adaptation rules can lead to
Nash equilibrium. This entry presents a selective
sampling of learning rules and their long-run
convergence properties, i.e., conditions under
which player strategies converge or not to Nash
equilibrium.


Keywords 

Cournot best response; Fictitious play; Log-linear 
learning; Mixed strategies; Nash equilibrium 


Introduction 

In a Nash equilibrium, each player’s strategy is
optimal with respect to the strategies of other 
players. Accordingly, Nash equilibrium offers a 
predictive model of the outcome of a game. That 
is, given the basic elements of a game - (i) a set 
of players; (ii) for each player, a set of strategies; 
and (iii) for each player, a utility function that 
captures preferences over strategies - one can 
model/assert that the strategies selected by the 
players constitute a Nash equilibrium. 

In making this assertion, there is no suggestion
of how players may come to reach a Nash equilibrium.
Two motivating quotations in this regard
are:

The attainment of equilibrium requires a disequilibrium
process (Arrow 1986).

and

The explanatory significance of the equilibrium
concept depends on the underlying dynamics
(Skyrms 1992).

These quotations reflect that a foundation for
Nash equilibrium as a predictive model is dynamics
that lead to equilibrium. Motivated by these
considerations, the topic of “learning in games”
shifts the attention away from equilibrium and
towards underlying dynamic processes and their
long-run behavior. The intent is to understand
how players may reach an equilibrium as well
as understand possible barriers to reaching Nash
equilibrium.

In the setup of learning in games, players 
repetitively play a game over a sequence 
of stages. At each stage, players use past 
experiences/observations to select a strategy 
for the current stage. Once player strategies 
are selected, the game is played, information 
is updated, and the process is repeated. The 
question is then to understand the long-run 
behavior, e.g., whether or not player strategies 
converge to Nash equilibrium. 

Traditionally the dynamic processes considered
under learning in games have players selecting
strategies based on a myopic desire to
optimize for the current stage. That is, players
do not consider long-run effects in updating
their strategies. Accordingly, while players are
engaged in repetitive play, the dynamic processes
generally are not optimal in the long run (as they
would be in the setting of “repeated games”). Indeed, the survey
article of Hart (2005) refers to the dynamic processes
of learning in games as “adaptive heuristics.”
This distinction is important in that an
implicit concern in learning in games is to understand
how “low rationality” (i.e., suboptimal
and heuristic) processes can lead to the “high rationality”
(i.e., mutually optimal) notion of Nash
equilibrium.

This entry presents a sampling of results from
the learning in games literature through a selection
of illustrative dynamic processes, a review
of their long-run behaviors relevant to Nash equilibrium,
and pointers to further work.

Illustration: Commuting Game 

We begin with a description of learning in games
in the specific setting of the commuting game,
which is a special case of so-called congestion
games (cf. Roughgarden 2005). The setup is as
follows. Each player seeks to plan a path from
an origin to a destination. The origins and destinations
can differ from player to player. Players
seek to minimize their own travel times. These
travel times depend both on the chosen path
(distance traveled) and the paths of other players
(road congestion). Every day, a player uses past
information and observations to select that day’s
path according to some selection rule, and this
process is repeated day after day.

In game-theoretic terms, player “strategies” 
are paths linking their origins to destinations, and 
player “utility functions” reflect travel times. At 
a Nash equilibrium, players have selected paths 
such that no individual player can find a shorter 
travel time given the chosen paths of others. The 
learning in games question is then whether player 
paths indeed converge to Nash equilibrium in the 
long run. Not surprisingly, the answer depends 
on the specific process that players use to select 
paths and possible additional structure of the 
commuting game. 

Suppose that one of the players, say “Alice,” 
is choosing among a collection of paths. For 
the sake of illustration, let us give Alice the 
following capabilities: (i) Alice can observe the 
paths chosen by all other players and (ii) Alice 
can compute off-line her travel time as a function 
of her path and the paths of others. 

With these capabilities, Alice can compute
running averages of the travel times along all
available paths. Note that the assumed capabilities
allow Alice to compute the travel time of a
path and hence its running average, whether or
not she took the path on that day. With average
travel time values in hand, two possible learning
rules are:

- Exploitation: Choose the path with the lowest 
average travel time. 

- Exploitation with Exploration: With high 
probability, choose the path with the lowest 
average travel time, and with low probability, 
choose a path at random. 

Assuming that all players implement the same
learning rule, each case induces a dynamic process
that governs the daily selection of paths
and determines the resulting long-run behavior.
We will revisit these processes in a more formal
setting in the next section.
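The two learning rules above can be sketched in a few lines. The function names, the exploration probability `epsilon`, and the dictionary representation of Alice's running averages are illustrative assumptions, not notation from the entry.

```python
import random


def choose_path(avg_travel_time, epsilon, rng):
    """Pick a path from a dict {path: running average travel time}.

    With probability 1 - epsilon, choose the path with the lowest average
    (exploitation); with probability epsilon, choose uniformly at random
    (exploration). Setting epsilon = 0 gives the pure exploitation rule.
    """
    paths = sorted(avg_travel_time)
    if rng.random() < epsilon:
        return rng.choice(paths)
    return min(paths, key=lambda p: avg_travel_time[p])


def update_average(avg, count, new_time):
    """Running average after one more observed travel time.

    count is the number of observations already folded into avg.
    """
    return avg + (new_time - avg) / (count + 1)
```

Because Alice is assumed able to compute the travel time of every path each day, `update_average` would be applied to all paths, not only to the one she drove.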

A noteworthy feature of these learning rules is
that they do not explicitly depend on the utility
functions of other players. For example, suppose
one of the other players is willing to trade off
travel time for more scenic routes. Similarly,
suppose one of the other players prefers to travel
on high congestion paths, e.g., a rolling billboard
seeking to maximize exposure. The aforementioned
learning rules for Alice remain unchanged.
Of course, Alice’s actions implicitly depend on
the utility functions of other players, but only
indirectly through their selected paths. This characteristic
of no explicit dependence on the utility
functions of others is known as “uncoupled”
learning, and it can have major implications for
the achievable long-run behavior (Hart and
Mas-Colell 2003a).

In assuming the ability to observe the paths of
other players and to compute off-line travel times
as a function of these paths, these learning rules
impose severe requirements on the information
available to each player. Less restrictive are learning
rules that are “payoff based” (Young 2005).
A simple modification that leads to payoff-based
learning is as follows. Alice maintains an empirical
average of the travel times of a path using only
the days that she took that path. Note the distinction:
on any given day, Alice remains unaware
of travel times for the routes not selected. Using
these empirical average travel times, Alice can
then mimic any of the aforementioned learning
rules. As intended, she does not directly observe
the paths of others, nor does she have a closed-form
expression for travel times as a function of
player paths. Rather, she only can select a path
and measure the consequences. As before, all
players implementing such a learning rule induce
a dynamic process, but the ensuing analysis in
payoff-based learning can be more subtle.
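A minimal sketch of the payoff-based bookkeeping follows. The class name and the convention of returning `None` for a path that has never been tried are assumptions of this illustration; the entry does not specify how untried paths are handled.

```python
class PayoffBasedAverages:
    """Per-path empirical averages built only from days the path was taken."""

    def __init__(self, paths):
        self.total = {p: 0.0 for p in paths}
        self.count = {p: 0 for p in paths}

    def record(self, path_taken, measured_time):
        # Only the chosen path is updated; all other paths stay unobserved.
        self.total[path_taken] += measured_time
        self.count[path_taken] += 1

    def average(self, path):
        # None marks a path that has never been tried (an assumption here).
        if self.count[path] == 0:
            return None
        return self.total[path] / self.count[path]
```

Compared with the full-information rules, nothing here requires knowing the paths of others or a closed-form travel-time function.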

Learning Dynamics 

We now give a more formal presentation of selected
learning rules and results concerning their
long-run behavior.

Preliminaries 

We begin with the basic setup of games with a
finite set of players, $\{1, 2, \ldots, N\}$, and for each
player $i$, a finite set of strategies, $\mathcal{A}_i$. Let

$\mathcal{A} = \mathcal{A}_1 \times \cdots \times \mathcal{A}_N$

denote the set of strategy profiles. Each player, $i$,
is endowed with a utility function

$u_i : \mathcal{A} \to \mathbb{R}$.

Utility functions capture player preferences over
strategy profiles. Accordingly, for any $a, a' \in \mathcal{A}$,
the condition

$u_i(a) > u_i(a')$

indicates that player $i$ prefers the strategy profile
$a$ over $a'$.

The notation $-i$ indicates the set of players
other than player $i$. Accordingly, we sometimes
write $a \in \mathcal{A}$ as $(a_i, a_{-i})$ to isolate $a_i$, the strategy
of player $i$, versus $a_{-i}$, the strategies of other
players. The notation $-i$ is used in other settings
as well.

Utility functions induce best-response sets.
For $a_{-i} \in \mathcal{A}_{-i}$, define

$B_i(a_{-i}) = \{ a_i : u_i(a_i, a_{-i}) \ge u_i(a_i', a_{-i}) \text{ for all } a_i' \in \mathcal{A}_i \}$.

In words, $B_i(a_{-i})$ denotes the set of strategies
that are optimal for player $i$ in response to the
strategies of other players, $a_{-i}$.

A strategy profile $a^* \in \mathcal{A}$ is a Nash equilibrium
if for any player $i$ and any $a_i' \in \mathcal{A}_i$,

$u_i(a_i^*, a_{-i}^*) \ge u_i(a_i', a_{-i}^*)$.

In words, at a Nash equilibrium, no player can
achieve greater utility by unilaterally changing
strategies. Stated in terms of best-response sets,
a strategy profile, $a^*$, is a Nash equilibrium if for
every player $i$,

$a_i^* \in B_i(a_{-i}^*)$.
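The best-response characterization of Nash equilibrium translates directly into a brute-force check for small finite games. In the sketch below, the function names and the two-commuter example (travel time on a road equals its number of users) are illustrative assumptions, not part of the entry.

```python
from itertools import product


def best_responses(utility, strategy_sets, i, profile):
    """B_i(a_{-i}): strategies of player i that are optimal against the
    strategies of the other players in profile."""
    def u(ai):
        return utility(i, profile[:i] + (ai,) + profile[i + 1:])
    best = max(u(ai) for ai in strategy_sets[i])
    return {ai for ai in strategy_sets[i] if u(ai) == best}


def pure_nash_equilibria(utility, strategy_sets):
    """All profiles a* with a*_i in B_i(a*_{-i}) for every player i."""
    players = range(len(strategy_sets))
    return [a for a in product(*strategy_sets)
            if all(a[i] in best_responses(utility, strategy_sets, i, a)
                   for i in players)]


# Hypothetical commuting game: two players, two roads, and the travel time
# on a road equals the number of players using it (utility = -travel time).
def commute_utility(i, a):
    return -sum(1 for aj in a if aj == a[i])


ROADS = [("A", "B"), ("A", "B")]
```

In this example the equilibria are the two profiles in which the commuters split across the roads; enumeration is only practical for small games because the number of profiles grows exponentially in the number of players.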

We also will need the notions of mixed strategies
and mixed strategy Nash equilibrium. Let
$\Delta(\mathcal{A}_i)$ denote probability distributions (i.e., nonnegative
vectors that sum to one) over the set
$\mathcal{A}_i$. A mixed strategy profile is a collection of
probability distributions, $\alpha = (\alpha_1, \ldots, \alpha_N)$, with
$\alpha_i \in \Delta(\mathcal{A}_i)$ for each $i$. Let us assume that players
choose a strategy randomly and independently
according to these mixed strategies. Accordingly,
define $\Pr[a; \alpha]$ to be the probability of strategy profile $a$
under the mixed strategy profile $\alpha$, and define the
expected utility of player $i$ as

$u_i(\alpha) = \sum_{a \in \mathcal{A}} u_i(a) \cdot \Pr[a; \alpha]$.

A mixed strategy Nash equilibrium is a mixed
strategy profile, $\alpha^*$, such that for any player $i$ and
any $\alpha_i' \in \Delta(\mathcal{A}_i)$,

$u_i(\alpha_i^*, \alpha_{-i}^*) \ge u_i(\alpha_i', \alpha_{-i}^*)$.
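The expected utility under a mixed strategy profile can be computed by direct enumeration, using independence to write $\Pr[a; \alpha]$ as a product of marginals. The sketch below uses matching pennies as a test case; that game and the function names are illustrative choices, not taken from the entry.

```python
from itertools import product


def expected_utility(utility, i, mixed_profile):
    """u_i(alpha) = sum over profiles a of u_i(a) * Pr[a; alpha].

    mixed_profile is a list of dicts {strategy: probability}, one per
    player; players randomize independently, so Pr[a; alpha] is the
    product of the individual probabilities.
    """
    total = 0.0
    for a in product(*(list(alpha) for alpha in mixed_profile)):
        prob = 1.0
        for alpha_j, a_j in zip(mixed_profile, a):
            prob *= alpha_j[a_j]
        total += utility(i, a) * prob
    return total


# Matching pennies: player 0 wins when the coins match, player 1 otherwise.
def pennies(i, a):
    win = 1.0 if a[0] == a[1] else -1.0
    return win if i == 0 else -win
```

At the uniform mixed profile, each player's expected utility is zero and no unilateral change of mixed strategy improves it, which is the defining inequality above.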

Special Classes of Games 

We will reference three special classes of games:
(i) zero-sum games, (ii) potential games, and (iii)
weakly acyclic games.

Zero-sum games: There are only two players (i.e.,
$N = 2$), and $u_1(a) = -u_2(a)$.

Potential games: There exists a (potential) function,

$\phi : \mathcal{A} \to \mathbb{R}$

such that for any pair of strategy profiles, $a = (a_i, a_{-i})$
and $a' = (a_i', a_{-i})$, that differ only in the strategy
of player $i$,

$u_i(a_i, a_{-i}) - u_i(a_i', a_{-i}) = \phi(a_i, a_{-i}) - \phi(a_i', a_{-i})$.

Weakly acyclic games: There exists a function

$\phi : \mathcal{A} \to \mathbb{R}$

with the following property: if $a \in \mathcal{A}$ is not a
Nash equilibrium, then at least one player, say
player $i$, has an alternative strategy, say $a_i' \in \mathcal{A}_i$,
such that

$u_i(a_i', a_{-i}) > u_i(a_i, a_{-i})$

and

$\phi(a_i', a_{-i}) > \phi(a_i, a_{-i})$.

Potential games are a special class of games
for which various learning dynamics converge to
a Nash equilibrium. The aforementioned commuting
game constitutes a potential game under
certain special assumptions. These are as follows:
(i) the delay on a road only depends on the
number of users (and not their identities) and (ii)
all players measure delay in the same manner
(Monderer and Shapley 1996).

Weakly acyclic games are a generalization of
potential games. In potential games, there exists
a potential function that captures differences
in utility under unilateral (i.e., single player)
changes in strategy. In weakly acyclic games
(see Young 1998), if a strategy profile is not a
Nash equilibrium, then there exists a player who
can simultaneously achieve an increase in utility
while increasing the potential function. The characterization
of weakly acyclic games through a
potential function herein is not traditional and is
borrowed from Marden et al. (2009a).
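For a small game, the potential game condition can be verified exhaustively. The sketch below assumes a simplified commuting game in which the delay on a road equals its number of users, together with a Rosenthal-type candidate potential; both are illustrative choices rather than constructions given in the entry.

```python
from itertools import product


def is_potential(utility, potential, strategy_sets, tol=1e-12):
    """Check u_i(a_i, a_-i) - u_i(a_i', a_-i) = phi(a_i, a_-i) - phi(a_i', a_-i)
    for every player i, every profile a, and every unilateral deviation."""
    for a in product(*strategy_sets):
        for i, options in enumerate(strategy_sets):
            for alt in options:
                b = a[:i] + (alt,) + a[i + 1:]
                du = utility(i, a) - utility(i, b)
                dphi = potential(a) - potential(b)
                if abs(du - dphi) > tol:
                    return False
    return True


# Hypothetical commuting game: delay on a road = number of users.
def commute_utility(i, a):
    return -sum(1 for aj in a if aj == a[i])


# Candidate potential: minus the sum over roads of 1 + 2 + ... + n_r,
# where n_r is the number of users of road r.
def rosenthal_potential(a):
    return -sum(n * (n + 1) // 2 for n in (a.count(r) for r in set(a)))
```

The two assumptions listed above are visible in the example: `commute_utility` depends only on how many players share a road, and every player measures delay in the same way.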


Forecasted Best-Response Dynamics 

One family of learning dynamics involves players
formulating a forecast of the strategies of other
players based on past observations and then playing
a best response to this forecast.


Cournot Best-Response Dynamics 
The simplest illustration is Cournot best-response
dynamics. Players repetitively play the same
game over stages $t = 0, 1, 2, \ldots$. At stage $t$,
a player forecasts that the strategies of other
players are the strategies played at the previous
stage $t - 1$. The following rules specify Cournot
best response with inertia. For each stage $t$ and
for each player $i$:

• With probability $p \in (0, 1)$, $a_i(t) = a_i(t - 1)$
(inertia).

• With probability $1 - p$, $a_i(t) \in B_i(a_{-i}(t - 1))$
(best response).

• If $a_i(t - 1) \in B_i(a_{-i}(t - 1))$, then $a_i(t) = a_i(t - 1)$
(continuation).

Proposition 1 For weakly acyclic (and hence
potential) games, player strategies under
Cournot best-response dynamics with inertia
converge to a Nash equilibrium.


Cournot best-response dynamics need not always
converge in games with a Nash equilibrium,
hence the restriction to weakly acyclic games.
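One stage of Cournot best response with inertia can be sketched as below. The two-commuter example, the uniform tie-breaking among best responses, and the function names are assumptions made for illustration.

```python
import random


def cournot_step(utility, strategy_sets, previous, p, rng):
    """One stage of Cournot best response with inertia.

    Each player keeps the previous strategy if it is already a best response
    to the others' previous strategies (continuation) or, otherwise, with
    probability p (inertia); else the player switches to a best response.
    """
    current = []
    for i, options in enumerate(strategy_sets):
        def u(ai):
            return utility(i, previous[:i] + (ai,) + previous[i + 1:])
        best_value = max(u(ai) for ai in options)
        best = [ai for ai in options if u(ai) == best_value]
        if previous[i] in best or rng.random() < p:
            current.append(previous[i])
        else:
            current.append(rng.choice(best))  # tie-breaking is an assumption
    return tuple(current)


# Hypothetical commuting game: delay on a road = number of users.
def commute_utility(i, a):
    return -sum(1 for aj in a if aj == a[i])


ROADS = [("A", "B"), ("A", "B")]
```

Starting from both commuters on road A, dynamics without inertia would have both switch to B, then both back to A, and so on. Inertia breaks this symmetry: once only one of them switches, the profile is a Nash equilibrium and the continuation rule keeps it there.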

Fictitious Play 

In fictitious play, introduced in Brown (1951),
players also use past observations to construct a
forecast of the strategies of other players. Unlike
Cournot best-response dynamics, this forecast is
probabilistic.

As a simple example, consider the commuting
game with two players, Alice and Bob, who both
must choose between two paths, A and B. Now
suppose that at stage t = 10, Alice has observed
that Bob used path A for 6 out of the previous 10
days and path B for the remaining days. Then
Alice’s forecast of Bob is that he will choose path
A with 60 % probability and path B with 40 %
probability. Alice then chooses between path A
and B in order to optimize her expected utility.
Likewise, Bob uses Alice’s empirical averages to
form a probabilistic forecast of her next choice
and selects a path to optimize his expected utility.

More generally, let $\pi_j(t) \in \Delta(\mathcal{A}_j)$ denote the
empirical frequency for player $j$ at stage $t$. This
vector is a probability distribution that indicates
the relative frequency of times player $j$ played
each strategy in $\mathcal{A}_j$ over stages $0, 1, \ldots, t - 1$. In
fictitious play, player $i$ assumes (incorrectly) that
at stage $t$, other players will select their strategies
independently and randomly according to their
empirical frequency vectors. Let $\Pi_{-i}(t)$ denote
the induced probability distribution over $\mathcal{A}_{-i}$ at
stage $t$. Under fictitious play, player $i$ selects an
action according to

$a_i(t) \in \arg\max_{a_i \in \mathcal{A}_i} \sum_{a_{-i} \in \mathcal{A}_{-i}} u_i(a_i, a_{-i}) \cdot \Pr[a_{-i}; \Pi_{-i}(t)]$.

In words, player $i$ selects the action that
maximizes expected utility assuming that other
players select their strategies randomly and
independently according to their empirical
frequencies.
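The fictitious play selection rule can be sketched as follows and checked against the Alice and Bob example. The representation of observations as per-player count dictionaries, the function names, and the travel-time model (delay equals the number of users of a path) are assumptions of this illustration.

```python
from itertools import product


def fictitious_play_action(utility, strategy_sets, i, counts):
    """Best response of player i to the others' empirical frequencies.

    counts[j][s] is the number of past stages in which player j played s.
    The other players are treated as randomizing independently according
    to their empirical frequency vectors.
    """
    n = len(strategy_sets)
    others = [j for j in range(n) if j != i]
    freq = {j: {s: counts[j][s] / sum(counts[j].values())
                for s in strategy_sets[j]} for j in others}

    def expected(ai):
        total = 0.0
        for combo in product(*(strategy_sets[j] for j in others)):
            prob = 1.0
            profile = [None] * n
            profile[i] = ai
            for j, s in zip(others, combo):
                prob *= freq[j][s]
                profile[j] = s
            total += utility(i, tuple(profile)) * prob
        return total

    return max(strategy_sets[i], key=expected)


# Hypothetical commuting game: delay on a path = number of users.
def commute_utility(i, a):
    return -sum(1 for aj in a if aj == a[i])


PATHS = [("A", "B"), ("A", "B")]
```

With Bob observed on path A for 6 of 10 days, Alice's expected travel time is 1.6 on path A and 1.4 on path B under this delay model, so the rule sends her to path B.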

Proposition 2 For (i) zero-sum games, (ii)
potential games, and (iii) two-player games
in which one player has only two actions,
player empirical frequencies under fictitious play
converge to a mixed strategy Nash equilibrium.

These results are reported in Fudenberg and 
Levine (1998), Hofbauer and Sandholm (2002), 
and Berger (2005). Fictitious play need not 
converge to Nash equilibria in all games. An 
early counterexample is reported in Shapley 
(1964), which constructs a two-player game 
with a unique mixed strategy Nash equilibrium. 
A weakly acyclic game with multiple pure 
(i.e., non-mixed) Nash equilibria under which 
fictitious play does not converge is reported in 
Foster and Young (1998). 

A variant of fictitious play is “joint strategy”
fictitious play (Marden et al. 2009b). In this
framework, players construct as forecasts empirical
frequencies of the joint play of other players.
This formulation is in contrast to constructing
and combining empirical frequencies for each
player. In the commuting game, it turns out that
joint strategy fictitious play is equivalent to the
aforementioned “exploitation” rule of selecting
the path with lowest average travel time. Marden
et al. (2009b) show that action profiles under joint
strategy fictitious play (with inertia) converge to
a Nash equilibrium in potential games.

Log-Linear Learning 

Under forecasted best-response dynamics,
players choose a best response to the forecasted
strategies of other players. Log-linear learning,
introduced in Blume (1993), allows the
possibility of “exploration,” in which players can
select nonoptimal strategies but with relatively
low probabilities.

Log-linear learning proceeds as follows. First,
introduce a “temperature” parameter, $T > 0$.

- At stage $t$, a single player, say player $i$, is
selected at random.

- For player $i$,

$\Pr[a_i(t) = a_i'] = \frac{1}{Z} e^{u_i(a_i', a_{-i}(t-1))/T}$.

- For all other players, $j \ne i$,

$a_j(t) = a_j(t - 1)$.


In words, under log-linear learning, only a single
player performs a strategy update at each stage.
The probability of selecting a strategy is proportional
to the exponential of the utility garnered from
that strategy (with other players repeating their
previous strategies), scaled by $1/T$. In the above description, the
dummy parameter $Z$ is a normalizing variable
used to define a probability distribution. In fact,
the specific probability distribution for strategy
selection is a Gibbs distribution with temperature
parameter, $T$. For very large $T$, strategies
are chosen approximately uniformly at random.
However, for small $T$, the selected strategy is
a best response (i.e., $a_i(t) \in B_i(a_{-i}(t - 1))$)
with high probability, and an alternative strategy
is selected with low probability.
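A single log-linear update can be sketched as below. The function names and the two-commuter example are illustrative assumptions; subtracting the maximum utility before exponentiating is a numerical convenience that leaves the probabilities unchanged.

```python
import math
import random


def log_linear_probabilities(utility, strategy_sets, i, previous, T):
    """Gibbs distribution over player i's strategies:
    Pr[a_i(t) = a_i'] is proportional to exp(u_i(a_i', a_{-i}(t-1)) / T)."""
    values = [utility(i, previous[:i] + (ai,) + previous[i + 1:])
              for ai in strategy_sets[i]]
    shift = max(values)  # common shift cancels in the normalization
    weights = [math.exp((v - shift) / T) for v in values]
    Z = sum(weights)
    return {ai: w / Z for ai, w in zip(strategy_sets[i], weights)}


def log_linear_step(utility, strategy_sets, previous, T, rng):
    """Select one player at random and resample only that player's strategy."""
    i = rng.randrange(len(strategy_sets))
    probs = log_linear_probabilities(utility, strategy_sets, i, previous, T)
    choice = rng.choices(list(probs), weights=list(probs.values()))[0]
    return previous[:i] + (choice,) + previous[i + 1:]


# Hypothetical commuting game: delay on a road = number of users.
def commute_utility(i, a):
    return -sum(1 for aj in a if aj == a[i])


ROADS = [("A", "B"), ("A", "B")]
```

Evaluating `log_linear_probabilities` at a large and at a small temperature reproduces the two regimes described above: nearly uniform choice in the first case, and a best response with probability close to one in the second.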

Because of the inherent randomness, strategy
profiles under log-linear learning never converge.
Nonetheless, the long-run behavior can be characterized
probabilistically as follows.

Proposition 3 For potential games with potential
function $\phi(\cdot)$ under log-linear learning, for
any $a \in \mathcal{A}$,

$\lim_{t \to \infty} \Pr[a(t) = a] = \frac{1}{Z} e^{\phi(a)/T}$.

In words, the long-run probabilities of strategy
profiles conform to a Gibbs distribution constructed
from the underlying potential function.
This characterization has the important implication
of (probabilistic) equilibrium selection. Prior
convergence results stated convergence to Nash
equilibria, but did not specify which Nash equilibrium
in the case of multiple equilibria. Under
log-linear learning, there is a probabilistic preference
for the Nash equilibrium that maximizes the
underlying potential function.
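The equilibrium selection effect can be seen by evaluating the Gibbs distribution of Proposition 3 directly. The two-player coordination potential used below is a hypothetical example chosen so that there are two pure Nash equilibria with different potential values.

```python
import math
from itertools import product


def gibbs_distribution(potential, strategy_sets, T):
    """Long-run distribution over strategy profiles:
    Pr[a] = exp(phi(a) / T) / Z, with Z the normalizing constant."""
    profiles = list(product(*strategy_sets))
    shift = max(potential(a) for a in profiles)  # numerical convenience only
    weights = [math.exp((potential(a) - shift) / T) for a in profiles]
    Z = sum(weights)
    return {a: w / Z for a, w in zip(profiles, weights)}


# Hypothetical potential: (X, X) and (Y, Y) are both Nash equilibria of the
# associated coordination game, but (X, X) has the larger potential.
def coordination_potential(a):
    return {("X", "X"): 2.0, ("Y", "Y"): 1.0}.get(a, 0.0)


CHOICES = [("X", "Y"), ("X", "Y")]
```

At low temperature almost all of the probability mass sits on (X, X), the potential maximizer, whereas at high temperature the four profiles are nearly equally likely.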

Extensions and Variations 


Payoff-based learning. The discussion herein
presumed that players can observe the actions of
other players and can compute utility functions
off-line. Payoff-based algorithms, i.e., algorithms
in which players only measure the utility
garnered in each stage, impose less restrictive
informational requirements. See Young (2005)
for a general discussion, as well as Marden et al.
(2009c), Marden and Shamma (2012), and Arslan
and Shamma (2004) for various payoff-based
extensions.

No-regret learning. The broad class of so-called
“no-regret” learning rules has the desirable property
of converging to broader solution concepts
(namely, Hannan consistency sets and correlated
equilibria) in general games. See Hart and
Mas-Colell (2000, 2001, 2003b) for an extensive
discussion.

Calibrated forecasts. Calibrated forecasts are 
more sophisticated than empirical frequencies 
in that they satisfy certain long-run consistency 
properties. Accordingly, forecasted best-response 
learning using calibrated forecasts has stronger 
guaranteed convergence properties, such as 
convergence to correlated equilibria. See Foster 
and Vohra (1997), Kakade and Foster (2008), and 
Mannor et al. (2007). 

Impossibility results. This entry focused on convergence
results in various special cases. There
are broad impossibility results showing that
entire families of learning rules cannot converge
to Nash equilibria in all games. The focus
is on uncoupled learning, i.e., the learning dynamics
for player i do not depend explicitly
on the utility functions of other players (a property
satisfied by all of the learning dynamics presented
herein). See Hart and Mas-Colell (2003a,
2006), Hart and Mansour (2007), and Shamma
and Arslan (2005). Another type of impossibility
result concerns lower bounds on the required rate
of convergence to equilibrium (e.g., Hart and
Mansour 2010).

Welfare maximization. Of special interest are
learning dynamics that select welfare (i.e.,
sum of utilities) maximizing strategy profiles,
whether or not they are Nash equilibria.
Recent contributions include Pradelski and
Young (2012), Marden et al. (2011), and
Arieli and Babichenko (2012).


Summary and Future Directions 

We have presented a selection of learning dynamics
and their long-run characteristics, specifically
in terms of convergence to Nash equilibria. As
stated early on, the original motivation of learning
in games research has been to add credence
to solution concepts such as Nash equilibrium as
a model of the outcome of a game. An emerging
line of research stems from engineering considerations,
in which the objective is to use the
framework of learning in games as a design tool
for distributed decision architecture settings such
as autonomous vehicle teams, communication
networks, or smart grid energy systems. A related
emerging direction is social influence, in which
the objective is to steer the collective behaviors
of human decision makers towards a socially
desirable situation through the disbursement of
incentives. Accordingly, learning in games can
offer baseline models of how individuals update
their behaviors to guide and inform social influence
policies.
Introduction

How does a machine learn an abstract concept
from examples? How can a machine generalize
to previously unseen situations? Learning theory
is the study of (formalized versions of) such
questions. There are many possible ways to formulate
such questions. Therefore, the focus of
this entry is on one particular formalism, known
as PAC (probably approximately correct) learning.
It turns out that PAC learning theory is rich
enough to capture intuitive notions of what learning
should mean in the context of applications
and, at the same time, is amenable to formal
mathematical analysis. There are several precise
and complete studies of PAC learning theory,
many of which are cited in the bibliography.
Therefore, this article is devoted to sketching
some high-level ideas.


Keywords 

Machine learning; Probably approximately correct
(PAC) learning; Support vector machine;
Vapnik-Chervonenkis (V-C) dimension


Problem Formulation 

In the PAC formalism, the starting point is the
premise that there is an unknown set, say an
unknown convex polygon, or an unknown half-plane.
The unknown set cannot be completely
unknown; rather, something should be specified
about its nature, in order for the problem to be
both meaningful and tractable. For instance, in
the first example above, the learner knows that
the unknown set is a convex polygon, though
it is not known which polygon it might be.
Similarly, in the second example, the learner
knows that the unknown set is a half-plane,
though it is not known which half-plane. The
collection of all possible unknown sets is known
as the concept class, and the particular unknown
set is referred to as the “target concept.” In the
first example, this would be the set of all convex
polygons and in the second case it would be
the set of half-planes. The unknown set cannot
be directly observed of course; otherwise, there
would be nothing to learn. Rather, one is given
clues about the target concept by an “oracle,”
which informs the learner whether or not a
particular element belongs to the target concept.
Therefore, the information available to the learner
is a collection of “labelled samples,” in the form
$\{(x_i, I_T(x_i)), i = 1, \ldots, m\}$, where $m$ is the
total number of labelled samples and $I_T(\cdot)$ is
the indicator function of the target concept $T$.
Based on this information, the learner is expected
to generate a “hypothesis” $H_m$ that is a good
approximation to the unknown target concept $T$.

One of the main features of PAC learning
theory that distinguishes it from its forerunners is
the observation that, no matter how many training
samples are available to the learner, the hypothesis
$H_m$ can never exactly equal the unknown
target concept $T$. Rather, all that one can expect
is that $H_m$ converges to $T$ in some appropriate
metric. Since the purpose of machine learning
is to generate a hypothesis $H_m$ that can be used
to approximate the unknown target concept $T$
for prediction purposes, a natural candidate for
the metric that measures the disparity between
$H_m$ and $T$ is the so-called generalization error,
defined as follows: Suppose that, after $m$ training
samples that have led to the hypothesis $H_m$, a
testing sample $x$ is generated at random. One
can now ask: what is the probability that the
hypothesis $H_m$ misclassifies $x$? In other words,
what is the value of $\Pr\{I_{H_m}(x) \ne I_T(x)\}$? This
quantity is known as the generalization error, and
the objective is to ensure that it approaches zero
as $m \to \infty$.

The manner in which the samples are generated
leads to different models of learning. For
instance, if the learner is able to choose the
next sample $x_{m+1}$ on the basis of the previous
$m$ labelled samples, which is then passed on to
the oracle for labeling, this is known as “active
learning.” More common is “passive learning,” in
which the sequence of training samples $\{x_i\}_{i \ge 1}$
is generated at random, in an independent and
identically distributed (i.i.d.) fashion, according
to some probability distribution $P$. In this case,
even the hypothesis $H_m$ and the generalization
error are random, because they depend on the
randomly generated training samples. This is the
rationale behind the nomenclature “probably approximately
correct.” The hypothesis $H_m$ is not
expected to equal the unknown target concept $T$
exactly, only approximately. Even that is only
probably true, because in principle it is possible
that the randomly generated training samples
could be totally unrepresentative and thus lead to
a poor hypothesis. If we toss a coin many times,
there is a small but always positive probability
that it could turn up heads every time. As the coin
is tossed more and more times, this probability
becomes smaller, but will never equal zero.

Examples 

Example 1 Consider the situation where the concept
class consists of all half-planes in $\mathbb{R}^2$, as
indicated in the left side of Fig. 1. Here the
unknown target concept $T$ is some fixed but
unknown half-plane. The symbol $T$ is next to
the boundary of the half-plane, and all points to
the right of the line constitute the target half-plane.
The training samples, generated at random
according to some unknown probability distribution
$P$, are also shown in the figure. The samples that
belong to $T$ are shown as blue rectangles, while
those that do not belong to $T$ are shown as red
dots. Knowing only these labelled samples, the
learner is expected to guess what $T$ might be.

A reasonable approach is to choose some
half-plane that agrees with the data, i.e., that correctly
classifies the labelled data. For instance, the well-known
support vector machine (SVM) algorithm
chooses the unique half-plane such that the
closest sample to the dividing line is as far as
possible from it; see the paper by Cortes and
Vapnik (1997).

The symbol $H$ denotes the boundary of a hypothesis,
which is another half-plane. The shaded
region is the symmetric difference between the
two half-planes. The set $T \Delta H$ is the set of points
that are misclassified by the hypothesis $H$. Of
course, we do not know what this set is, because
we do not know $T$. It can be shown that, whenever
the hypothesis $H$ is chosen to be consistent
in the sense of correctly classifying all labelled
samples, the generalization error goes to zero as
the number of samples approaches infinity.

Learning Theory, Fig. 1 Examples of learning problems. (a) Learning half-planes. (b) Learning convex polygons, with $X = [0, 1]^2$

Learning Theory, Fig. 2 VC dimension illustrations. (a) Shattering a set of three elements. (b) Infinite VC dimension

Example 2 Now suppose the concept class consists of all convex polygons in the unit square, and let T denote the (unknown) target convex polygon. This situation is depicted in the right side of Fig. 1. This time let us assume that the probability distribution that generates the samples is the uniform distribution on X. Given a set of positively and negatively labelled samples (the same convention as in Example 1), let us choose the hypothesis H to be the convex hull of all positively labelled samples, as shown in the figure. Since every positively labelled sample belongs to T, and T is a convex set, it follows that H is a subset of T. Moreover, $P(T \setminus H)$ is the generalization error. It can be shown that this algorithm also “works” in the sense that the generalization error goes to zero as the number of samples approaches infinity.
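The convex hull learner of Example 2 can be tried numerically. The following is only a sketch under stated assumptions: the target T is taken to be the hypothetical square [0.25, 0.75]^2 (not a choice made in the text), the samples are uniform on the unit square, and the generalization error is computed as area(T) minus area(H), which is valid because H is a subset of T and P is uniform.

```python
import random

def cross(o, a, b):
    """Signed area of the parallelogram spanned by (a - o) and (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(vertices):
    """Shoelace formula; fewer than three vertices means zero area."""
    if len(vertices) < 3:
        return 0.0
    s = 0.0
    for i in range(len(vertices)):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % len(vertices)]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def in_target(p):
    """Hypothetical target concept T: the square [0.25, 0.75]^2."""
    return 0.25 <= p[0] <= 0.75 and 0.25 <= p[1] <= 0.75

TARGET_AREA = 0.25

def generalization_error(samples):
    """P(T \\ H) under the uniform distribution, with H the hull of the positives."""
    positives = [p for p in samples if in_target(p)]
    return TARGET_AREA - polygon_area(convex_hull(positives))

rng = random.Random(0)
samples = [(rng.random(), rng.random()) for _ in range(5000)]
# Nested prefixes of one sample sequence, so the hypothesis can only grow.
errors = {n: generalization_error(samples[:n]) for n in (50, 500, 5000)}
print(errors)
```

Because the prefixes are nested, the hull for a longer prefix contains the hull for a shorter one, so the computed error is non-increasing in the number of samples.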


Vapnik-Chervonenkis Dimension 

Given any concept class C, there is a single 
integer that offers a measure of the richness of 
the class, known as the Vapnik-Chervonenkis (or 
VC) dimension, after its originators. 

Definition 1 A set $S \subseteq X$ is said to be shattered by a concept class C if, for every subset $B \subseteq S$, there is a set $A \in C$ such that $S \cap A = B$. The VC dimension of C is the largest integer d such that there is a finite set of cardinality d that is shattered by C.

Example 3 It can be shown that the set of half-planes in $\mathbb{R}^2$ has VC dimension three. Choose a set S = {x, y, z} consisting of three points that are not collinear, as in Fig. 2. Then there are $2^3 = 8$ subsets of S. The point is to show that for each of these eight subsets, there is a half-plane that contains precisely that subset, nothing more and nothing less. That this is possible is shown in Fig. 2. Four out of the eight situations are depicted in this figure, and the remaining four situations can be covered by taking the complement of the half-plane shown. It is also necessary to show that no set with four or more elements can be shattered, but that step is omitted; instead the reader is referred to any standard text such as Vidyasagar (1997). More generally, it can be shown that the set of half-planes in $\mathbb{R}^k$ has VC dimension k + 1.
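The shattering argument of Example 3 can be checked by brute force. The sketch below is an illustration, not a proof: it scans a grid of normal directions (the grid size `n_angles` and the two test sets are assumptions made here), so it can certify that a labelling is realizable but could in principle miss a separating half-plane with a very thin margin. It also tests only one four-point set, whereas the full claim concerns every four-point set.

```python
import itertools
import math

def separable(points, labels, n_angles=720):
    """Look for a half-plane {p : w.p > c} containing exactly the points labelled True."""
    pos = [p for p, flag in zip(points, labels) if flag]
    neg = [p for p, flag in zip(points, labels) if not flag]
    if not pos or not neg:
        return True  # a half-plane containing all points, or none of them, always exists
    for k in range(n_angles):
        angle = 2 * math.pi * k / n_angles
        w = (math.cos(angle), math.sin(angle))
        lowest_pos = min(w[0] * p[0] + w[1] * p[1] for p in pos)
        highest_neg = max(w[0] * p[0] + w[1] * p[1] for p in neg)
        if lowest_pos > highest_neg + 1e-9:
            return True  # any threshold c strictly between the two values works
    return False

def shattered(points):
    """True if every one of the 2^n labellings is realized by some half-plane."""
    return all(separable(points, labels)
               for labels in itertools.product([False, True], repeat=len(points)))

triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]            # three non-collinear points
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]  # one particular four-point set

triangle_shattered = shattered(triangle)
square_shattered = shattered(square)
print(triangle_shattered, square_shattered)
```

For the square, the labelling that marks the two diagonal corners as positive cannot be realized by any half-plane, which is why the second result is negative.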

Example 4 The set of convex polygons has infinite VC dimension. To see this, consider a strictly convex set, as shown in Fig. 2b. (Recall that a set is “strictly convex” if none of its boundary points is a convex combination of other points in the set.) Choose any finite collection of boundary points, call it $S = \{x_1, \ldots, x_n\}$. If B is a subset of S, then the convex hull of B does not contain any other point of S, due to the strict convexity property. The convex hull of B is therefore a convex polygon whose intersection with S is exactly B, so S is shattered. Since this argument holds for every integer n, the class of convex polygons has infinite VC dimension.
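A small numerical check of Example 4, assuming the strictly convex set is the unit disc and the boundary points are n = 8 equally spaced points on the unit circle (both choices are made here for illustration). For a point x of S outside B, the linear functional y -> x.y equals 1 at x and is strictly smaller at every other point of the circle, so it separates x from the convex hull of B.

```python
import itertools
import math

n = 8
S = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n)) for k in range(n)]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def excluded_from_hull(x, B):
    """Certificate that x is not in conv(B): x.b < 1 = x.x for every b in B."""
    return all(dot(x, b) < 1 - 1e-9 for b in B)

# For every subset B of S, no point of S outside B lies in conv(B),
# hence conv(B) intersected with S is exactly B.
all_subsets_cut_out = all(
    excluded_from_hull(x, B)
    for r in range(n + 1)
    for B in itertools.combinations(S, r)
    for x in S if x not in B
)
print(all_subsets_cut_out)
```

Subsets with fewer than three points give degenerate polygons (a segment, a point, or the empty set); the certificate covers those cases as well.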

Two Important Theorems 

Out of the many important results in learning 
theory, two are noteworthy. 

Theorem 1 (Blumer et al. (1989)) A concept 
class is distribution-free PAC learnable if and 
only if it has finite VC dimension. 

Theorem 2 (Benedek and Itai (1991)) Suppose P is a fixed probability distribution. Then the concept class C is PAC learnable if and only if for every positive number $\epsilon$, it is possible to cover C by a finite number of balls of radius $\epsilon$, with respect to the pseudometric $d_P$.
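To give Theorem 1 a quantitative flavour, the sketch below evaluates the sufficient sample size commonly quoted from Blumer et al. (1989) for consistent learners, m >= max((4/eps) log2(2/delta), (8d/eps) log2(13/eps)). The formula is quoted from memory of that paper rather than from this entry, and the names `eps` (accuracy) and `delta` (confidence) are labels assumed here; treat the numbers as indicative only.

```python
import math

def blumer_sample_bound(vc_dim, eps, delta):
    """Sufficient number of samples for a consistent learner (assumed form of the bound)."""
    if not (0 < eps < 1 and 0 < delta < 1 and vc_dim >= 1):
        raise ValueError("need 0 < eps, delta < 1 and vc_dim >= 1")
    term_confidence = (4.0 / eps) * math.log2(2.0 / delta)
    term_richness = (8.0 * vc_dim / eps) * math.log2(13.0 / eps)
    return math.ceil(max(term_confidence, term_richness))

# Half-planes in the plane have VC dimension 3.
m_halfplanes = blumer_sample_bound(vc_dim=3, eps=0.1, delta=0.05)
m_richer = blumer_sample_bound(vc_dim=10, eps=0.1, delta=0.05)
print(m_halfplanes, m_richer)
```

The bound grows linearly with the VC dimension, which is one way to see why a class with infinite VC dimension admits no distribution-free guarantee of this type.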

Now let us return to the two examples studied previously. Since the set of half-planes has finite VC dimension, it is distribution-free PAC learnable. The set of convex polygons can be shown to satisfy the conditions of Theorem 2 if P is the uniform distribution and is therefore PAC learnable. However, since it has infinite VC dimension, it follows from Theorem 1 that it is not distribution-free PAC learnable.

Summary and Future Directions 

This brief entry presents only the most basic aspects of PAC learning theory. Many more results are known about PAC learning theory, and of course many interesting problems remain unsolved. Some of the known extensions are:

• Learning under an “intermediate” family of probability distributions $\mathcal{P}$ that is not necessarily equal to $\mathcal{P}^*$, the set of all distributions (Kulkarni and Vidyasagar 1997)

• Relaxing the requirement that the algorithm should work uniformly well for all target concepts and requiring instead only that it should work with high probability (Campi and Vidyasagar 2001)

• Relaxing the requirement that the training samples are independent of each other and permitting them to have Markovian dependence (Gamarnik 2003; Meir 2000) or $\beta$-mixing dependence (Vidyasagar 2003)

There is considerable research in finding alternate sets of necessary and sufficient conditions for learnability. Unfortunately, many of these conditions are unverifiable and amount to tautological restatements of the problem under study.

Abstract

Lie algebraic methods generalize matrix methods and algebraic rank conditions to smooth nonlinear systems. They capture the essence of noncommuting flows and give rise to noncommutative [...] rank conditions determine controllability, observability, and optimality. Lie algebraic methods are also employed for state-space realization, control design, and path planning.


Keywords 

Baker-Campbell-Hausdorff formula; Chen-Fliess 
series; Lie bracket 


Definition 

This article considers generally nonlinear control systems (affine in the control) of the form

$$\dot{x} = f_0(x) + u_1 f_1(x) + \cdots + u_m f_m(x), \qquad y = \varphi(x), \tag{1}$$

where the state x takes values in $\mathbb{R}^n$, or more generally in an n-dimensional manifold $M^n$, the $f_i$ are smooth vector fields, $\varphi: M^n \to \mathbb{R}^p$ is a smooth output function, and the controls $u = (u_1, \ldots, u_m): [0, T] \to U$ are piecewise continuous, or, more generally, measurable functions taking values in a closed convex subset $U \subseteq \mathbb{R}^m$ that contains 0 in its interior.

The term Lie algebraic techniques refers to analyzing the system (1) and designing controls and stabilizing feedback laws by employing relations satisfied by iterated Lie brackets of the system vector fields $f_i$.


Introduction 

Systems of the form (1) contain as a special case time-invariant linear systems $\dot{x} = Ax + Bu$, $y = Cx$ (with constant matrices $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, and $C \in \mathbb{R}^{p \times n}$) that are well-studied and are a mainstay of classical control engineering. Properties such as controllability, stabilizability, and observability, as well as optimal control and various others, are determined by relationships satisfied by higher-order matrix products of A, B, and C.

Since the early 1970s, it has been well understood that the appropriate generalization of this matrix algebra, and, e.g., invariant linear subspaces, to nonlinear systems is in terms of the Lie algebra generated by the vector fields $f_i$, integral submanifolds of this Lie algebra, and the algebra of iterated Lie derivatives of the output function.

The Lie bracket of two smooth vector fields $f, g: M \to TM$ is defined as the vector field $[f, g]: M \to TM$ that maps any smooth function $\varphi \in C^\infty(M)$ to the function $[f, g]\varphi = fg\varphi - gf\varphi$.

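In local coordinates, if

$$f(x) = \sum_{i=1}^{n} f^i(x) \frac{\partial}{\partial x^i} \quad \text{and} \quad g(x) = \sum_{j=1}^{n} g^j(x) \frac{\partial}{\partial x^j},$$

then

$$[f, g](x) = \sum_{i,j=1}^{n} \left( f^j(x) \frac{\partial g^i}{\partial x^j}(x) - g^j(x) \frac{\partial f^i}{\partial x^j}(x) \right) \frac{\partial}{\partial x^i}.$$

With some abuse of notation, one may abbreviate this to $[f, g] = (Dg)f - (Df)g$, where f and g are considered as column vector fields and Df and Dg denote their Jacobian matrices of partial derivatives.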

Note that with this convention the Lie bracket corresponds to the negative of the commutator of matrices: If $P, Q \in \mathbb{R}^{n \times n}$ define, in matrix notation, the linear vector fields $f(x) = Px$ and $g(x) = Qx$, then $[f, g](x) = (QP - PQ)x = -[P, Q]x$.
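The coordinate formula [f, g] = (Dg)f - (Df)g is easy to evaluate numerically. The sketch below approximates the Jacobians by central differences (the step size `h` and the helper names are choices made here) and checks the linear case against (QP - PQ)x for two sample matrices.

```python
def jacobian(field, x, h=1e-6):
    """Central-difference Jacobian; J[i][j] approximates d field_i / d x_j."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = field(xp), field(xm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def lie_bracket(f, g, x):
    """[f, g](x) = Dg(x) f(x) - Df(x) g(x), the sign convention used in the text."""
    a = matvec(jacobian(g, x), f(x))
    b = matvec(jacobian(f, x), g(x))
    return [ai - bi for ai, bi in zip(a, b)]

# Linear vector fields f(x) = Px and g(x) = Qx.
P = [[0.0, 1.0], [0.0, 0.0]]
Q = [[0.0, 0.0], [1.0, 0.0]]
f = lambda x: matvec(P, x)
g = lambda x: matvec(Q, x)

x0 = [1.0, 2.0]
bracket = lie_bracket(f, g, x0)
# Here QP - PQ = [[-1, 0], [0, 1]], so the bracket at x0 should be close to (-1, 2).
print(bracket)
```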


Noncommuting Flows 

Geometrically the Lie bracket of two smooth vector fields $f_1$ and $f_2$ is an infinitesimal measure of the lack of commutativity of their flows. For a smooth vector field f and an initial point $x(0) = p \in M$, denote by $e^{tf}p$ the solution of the differential equation $\dot{x} = f(x)$ at time t. Then, with the rightmost flow applied first,

$$[f_1, f_2]\varphi(p) = \lim_{t \to 0} \frac{1}{t^2} \left( \varphi\left( e^{-tf_2} e^{-tf_1} e^{tf_2} e^{tf_1} p \right) - \varphi(p) \right).$$

As a simple example, consider parallel parking a unicycle, moving it sideways without slipping. Introduce coordinates $(x, y, \theta)$ for the location in the plane and the steering angle. The dynamics are governed by $\dot{x} = u_1 \cos\theta$, $\dot{y} = u_1 \sin\theta$, and $\dot{\theta} = u_2$, where the control $u_1$ is interpreted as the signed rolling speed and $u_2$ as the angular velocity of the steering angle. Written in the form (1), one has $f_1 = (\cos\theta, \sin\theta, 0)^T$ and $f_2 = (0, 0, 1)^T$. (In this case the drift vector field $f_0 = 0$ vanishes.) If the system starts at $(0, 0, 0)^T$, then via the sequence of control actions of the form turn left, roll forward, turn back, and roll backwards, one may steer the system to a point $(0, \Delta y, 0)^T$ with $\Delta y > 0$. This sideways motion corresponds to the value $(0, 1, 0)^T$ of the Lie bracket $[f_2, f_1] = (-\sin\theta, \cos\theta, 0)^T$ at the origin. It encapsulates that steering and rolling do not commute. This example is easily expanded to model, e.g., the sideways motion of a car, or a truck with multiple trailers; see, e.g., Bloch (2003), Bressan and Piccoli (2007), and Bullo and Lewis (2005). In such cases longer iterated Lie brackets correspond to the more intricate control actions needed to obtain, e.g., a pure sideways motion.
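The parking maneuver can be simulated exactly, because each control action is a flow of a single vector field. The sketch below assumes unit control magnitudes applied for a duration t in each of the four phases (an assumption made here); the net displacement divided by t^2 then approaches (0, 1, 0) as t shrinks, which is the bracket direction at the origin.

```python
import math

def roll(state, distance):
    """Exact flow of f1 = (cos(theta), sin(theta), 0) for a signed distance."""
    x, y, theta = state
    return (x + distance * math.cos(theta), y + distance * math.sin(theta), theta)

def turn(state, angle):
    """Exact flow of f2 = (0, 0, 1) for a signed angle."""
    x, y, theta = state
    return (x, y, theta + angle)

def parking_maneuver(t):
    """Turn left, roll forward, turn back, roll backwards, each for duration t."""
    s = (0.0, 0.0, 0.0)
    s = turn(s, t)
    s = roll(s, t)
    s = turn(s, -t)
    s = roll(s, -t)
    return s

for t in (0.5, 0.1, 0.01):
    x, y, theta = parking_maneuver(t)
    # The displacement scaled by t^2 tends to (0, 1, 0) as t -> 0.
    print(t, x / t**2, y / t**2, theta)
```

In closed form the end point is (t(cos t - 1), t sin t, 0), so the forward error is of order t^3 while the sideways gain is of order t^2.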

In the case of linear systems, if the Kalman rank condition $\operatorname{rank}[B, AB, A^2B, \ldots, A^{n-1}B] = n$ is not satisfied, then all solution curves of the system starting from the same point x(0) = p are at all times T > 0 constrained to lie in a proper affine subspace. In the nonlinear setting the role of the compound matrix of that condition is taken by the Lie algebra $L = L(f_0, f_1, \ldots, f_m)$ of all finite linear combinations of iterated Lie brackets of the vector fields $f_i$. As an immediate consequence of the Frobenius integrability theorem, if at a point x(0) = p the vector fields in L span the whole tangent space, then it is possible to reach an open neighborhood of the initial point by concatenating flows of the system (1) that correspond to piecewise constant controls. Conversely, in the case of analytic vector fields and a compact set U of admissible control values, the Hermann-Nagano theorem guarantees that if the dimension of the subspace $L(p) = \{f(p): f \in L\}$ is not maximal, that is, $\dim L(p) < n$, then all such trajectories are confined to stay in a lower-dimensional proper integral submanifold of L through the point p. For a comprehensive introduction, see, e.g., the textbooks Bressan and Piccoli (2007), Isidori (1995), and Sontag (1998).
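Two short worked computations may help connect the linear and nonlinear rank conditions; both use only the convention $[f, g] = (Dg)f - (Df)g$ stated earlier. For a linear system with drift $f_0(x) = Ax$ and a constant input vector field $b$, the iterated brackets reproduce the columns of the Kalman matrix up to sign. For the unicycle example, the two control vector fields and their bracket $(Df_1)f_2 - (Df_2)f_1 = (-\sin\theta, \cos\theta, 0)^T$ span the tangent space at every point.

```latex
% Linear case: Db = 0 and D(Ax) = A, hence
[f_0, b] = (Db)\,Ax - A\,b = -Ab, \qquad
[f_0, [f_0, b]] = A^2 b, \qquad \ldots, \qquad
(\operatorname{ad}^k f_0, b) = (-1)^k A^k b .

% Unicycle: f_1 = (\cos\theta, \sin\theta, 0)^T, \; f_2 = (0, 0, 1)^T
\det
\begin{bmatrix}
\cos\theta & 0 & -\sin\theta \\
\sin\theta & 0 & \cos\theta \\
0 & 1 & 0
\end{bmatrix}
= -(\cos^2\theta + \sin^2\theta) = -1 \neq 0 .
```

The nonzero determinant shows that the three vector fields are linearly independent for every $\theta$, so the rank condition holds at every point of the state space.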

Controllability 

Define the reachable set $\mathcal{R}_T(p)$ as the set of all terminal points $x(T; u, p)$ at time T of trajectories of (1) that start at the initial point x(0) = p and correspond to admissible controls. Commonly known as the Lie algebra rank condition (LARC), the above condition determines whether the system is accessible from the point p, which means that for arbitrarily small time T > 0, the reachable set $\mathcal{R}_T(p)$ has nonempty n-dimensional interior. For most applications one desires stronger controllability properties. Most amenable to Lie algebraic methods, and practically relevant, is small-time local controllability (STLC): The system is STLC from p if p lies in the interior of $\mathcal{R}_T(p)$ for every T > 0. In the case that there is no drift vector field $f_0$, accessibility is equivalent to STLC. However, in general, the situation is much more intricate, and a rich literature is devoted to various necessary or sufficient conditions for STLC. A popular such condition is the Hermes condition. For this define the subspaces $S^1 = \operatorname{span}\{(\operatorname{ad}^j f_0, f_i) : 1 \le i \le m,\ j \in \mathbb{Z}^+\}$, and recursively $S^{k+1} = \operatorname{span}\{[g_1, g_k] : g_1 \in S^1,\ g_k \in S^k\}$. Here $(\operatorname{ad}^0 f, g) = g$, and recursively $(\operatorname{ad}^{k+1} f, g) = [f, (\operatorname{ad}^k f, g)]$. The Hermes condition guarantees in the case of analytic vector fields and, e.g., $U = [-1, 1]^m$ that if the system satisfies the LARC and for every $k \ge 1$, $S^{2k}(p) \subseteq S^{2k-1}(p)$, then the system is STLC. For more general conditions, see Sussmann (1987) and also Kawski (1990) for a broader discussion.

The importance and value of Lie algebraic conditions may in large part be ascribed to their geometric character, their being invariant under coordinate changes and feedback. In particular, in the analytic case, the Lie relations completely determine the local properties of the system, in the sense that a Lie algebra homomorphism between two systems gives rise to a local diffeomorphism that maps trajectories to trajectories (Sussmann 1974).

Exponential Lie Series 

A central analytic tool in Lie algebraic methods that takes the role of Taylor expansions in the classical analysis of dynamical systems is the Chen-Fliess series, which associates to every admissible control $u: [0, T] \to U$ a formal power series

$$CF(u, T) = \sum_I \int_0^T du^I \; X_{i_1} \cdots X_{i_s} \tag{2}$$

over a set $\{X_0, X_1, \ldots, X_m\}$ of noncommuting indeterminates (or letters). For every multi-index $I = (i_1, i_2, \ldots, i_s) \in \{0, 1, \ldots, m\}^s$, $s \ge 0$, the coefficient of $X_I = X_{i_1} \cdots X_{i_s}$ is the iterated integral defined recursively by

$$\int_0^T du^{(i_1, \ldots, i_s)} = \int_0^T \left( \int_0^t du^{(i_1, \ldots, i_{s-1})} \right) u_{i_s}(t)\, dt, \tag{3}$$

with $u_0 \equiv 1$ and the integral for the empty multi-index equal to 1.

Upon evaluating this series via the substitutions $X_j \leftarrow f_j$, it becomes an asymptotic series for the propagation of solutions of (1): For $f_j$, $\varphi$ analytic, U compact, p in a compact set, and T > 0 sufficiently small, one has

$$\varphi(x(T; u, p)) = \sum_I \int_0^T du^I \cdot (f_{i_1} \cdots f_{i_s} \varphi)(p). \tag{4}$$
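The iterated integrals that form the coefficients of the Chen-Fliess series can be computed numerically. The sketch below assumes the recursion in which the last index of the word carries the outermost integration, with the empty word giving 1; the trapezoidal rule, the step count, and the two sample controls are choices made here. It also checks the integration-by-parts (shuffle) identity for words of length two.

```python
def iterated_integral(controls, word, T, steps=2000):
    """Iterated integral of the controls along `word` over [0, T].

    controls: dict mapping an index to a function of time.
    word: tuple (i_1, ..., i_s); i_1 is integrated innermost, i_s outermost.
    """
    h = T / steps
    ts = [k * h for k in range(steps + 1)]
    inner = [1.0] * (steps + 1)          # empty word: constant 1
    for i in word:
        integrand = [inner[k] * controls[i](ts[k]) for k in range(steps + 1)]
        acc = [0.0]
        for k in range(steps):           # cumulative trapezoidal rule
            acc.append(acc[-1] + 0.5 * h * (integrand[k] + integrand[k + 1]))
        inner = acc
    return inner[-1]

# Hypothetical controls: u_1(t) = 1 and u_2(t) = t on [0, 1].
controls = {1: lambda t: 1.0, 2: lambda t: t}
T = 1.0

I1 = iterated_integral(controls, (1,), T)     # T
I2 = iterated_integral(controls, (2,), T)     # T^2 / 2
I12 = iterated_integral(controls, (1, 2), T)  # T^3 / 3
I21 = iterated_integral(controls, (2, 1), T)  # T^3 / 6
print(I1, I2, I12, I21)
```

The two length-two integrals differ although they involve the same pair of controls, which is the analytic counterpart of the letters of the series not commuting.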

One application of particular interest is to construct approximating systems of a given system (1) that preserve critical geometric properties, but which have a simpler structure. One such class is that of nilpotent systems, that is, systems whose Lie algebra $L = L(f_0, f_1, \ldots, f_m)$ is nilpotent, and for which solutions can be found by simple quadratures. While truncations of the Chen-Fliess series never correspond to control systems of the same form, much work has been done in recent years to rewrite this series in more useful formats. For example, the infinite directed exponential product expansion in Sussmann (1986) that uses Hall trees may immediately be interpreted in terms of free nilpotent systems and consequently helps in the construction of nilpotent approximating systems. More recent work, much of it of a combinatorial algebra nature and utilizing the underlying Hopf algebras, further simplifies similar expansions and in particular yields explicit formulas for a continuous Baker-Campbell-Hausdorff formula or for the logarithm of the Chen-Fliess series (Gehrig and Kawski 2008).


Observability and Realization 

In the setting of linear systems, observability is dual, in a well-defined algebraic sense, to the concept of controllability. Roughly speaking the system (1) is observable if knowledge of the output $y(t) = \varphi(x(t; u, p))$ over an arbitrarily small interval suffices to construct the current state $x(t; u, p)$ and indeed the past trajectory $x(\cdot\,; u, p)$. In the linear setting observability is equivalent to the rank condition $\operatorname{rank}[C^T, (CA)^T, \ldots, (CA^{n-1})^T] = n$ being satisfied. In the nonlinear setting, the place of the rows of this compound matrix is taken by the functions in the observation algebra, which consists of all finite linear combinations of iterated Lie derivatives $f_{i_s} \cdots f_{i_1}\varphi$ of the output function.
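For the linear rank condition, a minimal sketch in exact rational arithmetic is given below. The double integrator and the two output matrices are illustrative choices made here; the rows C, CA, ..., CA^(n-1) are stacked, which has the same rank as the matrix of transposes used in the text.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def observability_matrix(A, C):
    """Stack the blocks C, CA, ..., CA^(n-1)."""
    blocks, current = [], C
    for _ in range(len(A)):
        blocks.extend(current)
        current = matmul(current, A)
    return blocks

A = [[0, 1], [0, 0]]   # double integrator: position and velocity
rank_position = rank(observability_matrix(A, [[1, 0]]))  # measure position
rank_velocity = rank(observability_matrix(A, [[0, 1]]))  # measure velocity only
print(rank_position, rank_velocity)
```

Measuring position gives full rank, whereas measuring only velocity leaves the position unobservable.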

Similar to the Hankel matrices introduced in the linear setting, in the case of a finite Lie rank, one again can use the output algebra to construct realizations in the form of (1) for systems which are initially only given in terms of input-output descriptions, or in terms of formal Fliess operators; see, e.g., Fliess (1980), Gray and Wang (2002), and Jakubczyk (1986) for further reading.

Optimal Control


In a well-defined geometric way, conditions for optimal control are dual to conditions for controllability and thus are directly amenable to Lie algebraic methods. Instead of considering a separate functional

$$J(u) = \psi(x(T; u, p)) + \int_0^T L(t, x(t; u, p), u(t))\, dt \tag{5}$$

to be minimized, it is convenient for our purposes to augment the state by, e.g., defining $\dot{x}_0 = 1$ and $\dot{x}_{n+1} = L(x_0, x, u)$. For example, in the case of time-optimal control, one again obtains an enlarged system of the same form (1); else one utilizes more general Lie algebraic methods that also apply to systems not necessarily affine in the control.

The basic picture for systems with a compact set U of admissible values of the controls involves the attainable funnel $\mathcal{R}_{\le T}(p)$ consisting of all trajectories of the system (1) starting at x(0) = p that correspond to admissible controls. The trajectory corresponding to an optimal control $u^*$ must at time T lie on the boundary of the funnel $\mathcal{R}_{\le T}(p)$ and hence also at all prior times (using the invariance of domain property implied by the continuity of the flow). Hence one may associate a covector field along such an optimal trajectory that at every time points in the direction of an outward normal. The Pontryagin Maximum Principle is a first-order characterization of such trajectory-covector field pairs. Its pointwise maximization condition essentially says that if at any time $t_0 \in [0, T]$ one replaces the optimal control $u^*(\cdot)$ by any admissible control variation on an interval $[t_0, t_0 + s]$, then such variation may be transported along the flow to yield, in the limit as $s \searrow 0$, an inward pointing tangent vector to the reachable set $\mathcal{R}_T(p)$ at $x(T; u^*, p)$. To obtain stronger higher-order conditions for maximality, one may combine several such families of control variations. The effects of such combinations are again calculated in terms of iterated Lie brackets of the vector fields $f_i$. Indeed, necessary conditions for optimality, for a trajectory to lie on the boundary of the funnel $\mathcal{R}_{\le T}(p)$, immediately translate into sufficient conditions for STLC, for the initial point to lie in the interior of $\mathcal{R}_T(p)$, and vice versa. For recent work employing Lie algebraic methods in optimality conditions, see, e.g., Agrachev et al. (2002).


Summary and Future Research 

Lie algebraic techniques may be seen as a direct generalization of matrix linear algebra tools that have proved so successful in the analysis and design of linear systems. However, in the nonlinear case, the known algebraic rank conditions still exhibit gaps between necessary and sufficient conditions for controllability and optimality. Also, new, not yet fully understood, topological and resonance obstructions stand in the way of controllability implying stabilizability. Systems that exhibit special structure, such as living on Lie groups, or being second order such as typical mechanical systems, are amenable to further refinements of the theory; compare, e.g., the use of affine connections and the symmetric product in Bullo et al. (2000). Other directions of ongoing and future research involve the extension of Lie algebraic methods to infinite dimensional systems and the generalization of formulas to systems with less regularity; see, e.g., the work by Rampazzo and Sussmann (2007) on Lipschitz vector fields, thereby establishing closer connections with nonsmooth analysis (Clarke 1983) in control.

Linear Matrix Inequality Techniques in Optimal Control

Abstract

LMI (linear matrix inequality) techniques offer more flexibility in the design of dynamic linear systems than techniques that minimize a scalar functional for optimization. For linear state space models, multiple goals (performance bounds) can be characterized in terms of LMIs, and these can serve as the basis for controller optimization via finite-dimensional convex feasibility problems. LMI formulations of various standard control problems are described in this article, including dynamic feedback stabilization, covariance control, LQR, $H_\infty$ control, $L_\infty$ control, and information architecture design.


Keywords 

Control system design; Covariance control; $H_\infty$ control; $L_\infty$ control; LQR/LQG; Matrix inequalities; Sensor/actuator design


Early Optimization History 

Hamilton invented state space models of nonlinear dynamic systems with his generalized momenta work in the 1800s (Hamilton 1834, 1835), but at that time the lack of computational tools prevented broad acceptance of the first-order form of dynamic equations. With the rapid development of computers in the 1960s, state space models evoked a formal control theory for minimizing a scalar function of control and state, propelled by the calculus of variations and Pontryagin’s maximum principle. Optimal control has been a pillar of control theory for the last 50 years. In fact, all of the problems discussed in this article can perhaps be solved by minimizing a scalar functional, but a search is required to find the right functional. Globally convergent algorithms are available to do just that for quadratic functionals, but more direct methods are now available.

Since the early 1990s, the focus for linear system design has been to pose control problems as feasibility problems, to satisfy multiple constraints. Since then, feasibility approaches have dominated design decisions, and such feasibility problems may be convex or not. If the problem can be reduced to a set of linear matrix inequalities (LMIs) to solve, then convexity is proven. However, failure to find such LMI formulations of the problem does not mean it is not convex, and computer-assisted methods for convex problems are available to avoid the search for LMIs (see Camino et al. 2003).

In the case of linear dynamic models of 
stochastic processes, optimization methods led 
to the popularization of linear quadratic Gaussian 
(LQG) optimal control, which had globally 
optimal solutions (see Skelton 1988). The first 
two moments of the stochastic process (the 
mean and the covariance) can be controlled 
with these methods, even if the distribution of 
the random variables involved is not Gaussian. 
Hence, LQG became just an acronym for the 
solution of quadratic functionals of control 
and state variables, even when the stochastic 
processes were not Gaussian. The label LQG 
was often used even for deterministic problems, 
where a time integral, rather than an expectation 
operator, was minimized, with given initial 
conditions or impulse excitations. These were 
formally called LQR (linear quadratic regulator) 
problems. Later the book (Skelton 1988) gave 
the formal conditions under which the LQG and 
the LQR answers were numerically identical, and 
this particular version of LQR was called the 
deterministic LQG. 

It was always recognized that the quadratic form of the state and control in the LQG problem was an artificial goal. The real control goals usually involved prespecified performance bounds on each of the outputs and bounds on each channel of control. This leads to matrix inequalities (MIs) rather than scalar minimizations. While it was known early that any stabilizing linear controller could be obtained by some choice of weights in an LQG optimization problem (see Chap. 6 and references in Skelton 1988), it was not known until the 1980s what particular choice of weights in LQG would yield a solution to the matrix inequality (MI) problem. See early attempts in Skelton (1988), and see Zhu and Skelton (1992) and Zhu et al. (1997) for a globally convergent algorithm to find such LQG weights when the MI problem has a solution. Since then, rather than stating a minimization problem for a meaningless sum of outputs and inputs, linear control problems can now be stated simply in terms of norm bounds on each input vector and/or each output vector of the system ($L_2$ bounds, $L_\infty$ bounds, or variance bounds and covariance bounds). These feasibility problems are convex for state feedback or full-order output feedback controllers (the focus of this elementary introduction), and these can be solved using linear matrix inequalities (LMIs), as illustrated in this article. However, the earliest approach to these MI problems was iterative LQG solutions (to find the correct weights to use in the quadratic penalty of the state), as in Skelton (1988), Zhu and Skelton (1992), and Zhu et al. (1997).


Matrix Inequalities 

Let Q be any square matrix. The linear matrix inequality (LMI) “Q > 0” is just a short-hand notation to represent a certain scalar inequality. That is, the matrix notation “Q > 0” means “the scalar $x^T Q x$ is positive for all values of x, except x = 0.” Obviously this is a property of Q, not x, hence the abbreviated matrix notation Q > 0. This is called a linear matrix inequality (LMI), since the matrix unknown Q appears linearly in the inequality Q > 0. Note also that any square matrix Q can be written as the sum of a symmetric matrix $Q_s = \frac{1}{2}(Q + Q^T)$ and a skew-symmetric matrix $Q_k = \frac{1}{2}(Q - Q^T)$, but $x^T Q_k x = 0$, so only the symmetric part of the matrix Q affects the scalar $x^T Q x$. We assume hereafter without loss of generality that Q is symmetric. The notation “$Q \ge 0$” means “the scalar $x^T Q x$ cannot be negative for any x.”

Lyapunov proved that x(t) converges to zero if there exists a matrix Q such that, along the nonzero trajectory of a dynamic system (e.g., the system $\dot{x} = Ax$), two scalars have the property $x(t)^T Q x(t) > 0$ and $\frac{d}{dt}\left(x^T(t) Q x(t)\right) < 0$. This proves that the following statements are all equivalent:

1. For any initial condition x(0) of the system $\dot{x} = Ax$, the state x(t) converges to zero.

2. All eigenvalues of A lie in the open left half plane.

3. There exists a matrix Q with the two properties Q > 0 and $QA + A^T Q < 0$.

4. The set of all quadratic Lyapunov functions that can be used to prove the stability or instability of the null solution of $\dot{x} = Ax$ is given by $x^T Q x$, where Q is any square matrix with the two properties of item 3 above.
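Item 3 can be checked numerically for a given candidate Q. The sketch below is an illustration only: the matrices A and Q are examples chosen here (Q solves QA + A^T Q = -I for this A), and positive definiteness is tested by attempting a Cholesky factorization. Note that a failed check for one particular Q does not by itself prove instability; it only means that this Q is not a certificate.

```python
import math

def transpose(M):
    return [list(r) for r in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def is_positive_definite(S, tol=1e-12):
    """Attempt a Cholesky factorization of the symmetric matrix S."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

def certifies_stability(A, Q):
    """Check the two conditions Q > 0 and QA + A^T Q < 0."""
    M = add(matmul(Q, A), matmul(transpose(A), Q))
    minus_M = [[-v for v in row] for row in M]
    return is_positive_definite(Q) and is_positive_definite(minus_M)

A = [[0.0, 1.0], [-2.0, -3.0]]          # eigenvalues -1 and -2
Q = [[1.25, 0.25], [0.25, 0.25]]        # solves QA + A^T Q = -I
A_unstable = [[0.0, 1.0], [2.0, -3.0]]  # one eigenvalue in the right half plane

print(certifies_stability(A, Q), certifies_stability(A_unstable, Q))
```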

LMIs are prevalent throughout the fundamental concepts of control theory, such as controllability and observability. For the linear system example $\dot{x} = Ax + Bu$, $y = Cx$, the “Observability Gramian” is the infinite integral $Q = \int_0^\infty e^{A^T t} C^T C e^{At}\, dt$. Furthermore Q > 0 if and only if (A, C) is an observable pair, and Q is bounded only if the observable modes are asymptotically stable. When it exists, the solution of $QA + A^T Q + C^T C = 0$ satisfies Q > 0 if and only if the matrix pair (A, C) is observable.

Likewise the “Controllability Gramian” $X = \int_0^\infty e^{At} B B^T e^{A^T t}\, dt > 0$ if and only if the pair (A, B) is controllable. If X exists, it satisfies $XA^T + AX + BB^T = 0$, and X > 0 if and only if (A, B) is a controllable pair. Note also that the matrix pair (A, B) is controllable for any A if $BB^T > 0$, and the matrix pair (A, C) is observable for any A if $C^T C > 0$. Hence, the existence of Q > 0 or X > 0 satisfying either ($QA + A^T Q < 0$) or ($AX + XA^T < 0$) is equivalent to the statement that “all eigenvalues of A lie in the open left half plane.”
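A small worked instance may make the Gramian statements concrete. The matrices below are an example chosen here, not taken from the article: a stable second-order system in companion form with a single input.

```latex
A = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}, \qquad
B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad
X = \begin{bmatrix} \tfrac{1}{12} & 0 \\ 0 & \tfrac{1}{6} \end{bmatrix}.

% Check of the Lyapunov equation AX + XA^T + BB^T = 0:
AX = \begin{bmatrix} 0 & \tfrac{1}{6} \\ -\tfrac{1}{6} & -\tfrac{1}{2} \end{bmatrix}, \qquad
AX + XA^T = \begin{bmatrix} 0 & 0 \\ 0 & -1 \end{bmatrix}, \qquad
AX + XA^T + BB^T = 0 .

% Kalman matrix of the same pair:
\begin{bmatrix} B & AB \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ 1 & -3 \end{bmatrix}, \qquad \operatorname{rank} = 2 .
```

Here X is positive definite and the Kalman matrix has full rank, in agreement with the equivalence between X > 0 and controllability of (A, B).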

It should now be clear that the set of all stabilizing state feedback controllers, u = Gx, is parametrized by the inequalities Q > 0, $Q(A + BG) + (A + BG)^T Q < 0$. The difficulty in this MI is the appearance of the product of the two unknowns Q and G, so more work is required to show how to use LMIs to solve this problem.
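One standard way to remove the product of unknowns, sketched here as background and not necessarily the route followed in this article, is a congruence transformation followed by a change of variables. The symbols $X$ and $W$ below are introduced only for this illustration.

```latex
% Start from Q > 0 and Q(A + BG) + (A + BG)^T Q < 0.
% Multiply on the left and on the right by X = Q^{-1} > 0:
(A + BG)X + X(A + BG)^T < 0 .

% Introduce the new unknown W = GX:
AX + XA^T + BW + W^T B^T < 0, \qquad X > 0 .

% This inequality is linear in the pair (X, W); a stabilizing gain is recovered as
G = W X^{-1} .
```

Since the congruence transformation is invertible, the original matrix inequality is feasible in (Q, G) exactly when the transformed LMI is feasible in (X, W).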

In the sequel some techniques are borrowed from linear algebra, where a linear matrix equality (LME) $\Gamma G \Lambda = \Theta$ may or may not have a solution G. For LMEs there are two separate questions to answer. The first question is “Does there exist a solution?” and the answer is “if and only if $\Gamma\Gamma^+\Theta\Lambda^+\Lambda = \Theta$.” The second question is “What is the set of all solutions?” and the answer is “$G = \Gamma^+\Theta\Lambda^+ + Z - \Gamma^+\Gamma Z \Lambda\Lambda^+$, where Z is arbitrary, and the + symbol denotes Pseudo Inverse.” LMI approaches address the same two questions by formulating the necessary and sufficient conditions for the existence of an LMI solution and then parametrizing all solutions.
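Writing the linear matrix equality as $\Gamma G \Lambda = \Theta$, both answers can be verified in a few lines using only the defining pseudo-inverse identities $\Gamma\Gamma^+\Gamma = \Gamma$ and $\Lambda\Lambda^+\Lambda = \Lambda$. The derivation below is a sketch of that check.

```latex
% Necessity of the existence condition: if \Gamma G \Lambda = \Theta for some G, then
\Gamma\Gamma^+ \Theta \Lambda^+\Lambda
= \Gamma\Gamma^+ \Gamma G \Lambda \Lambda^+\Lambda
= \Gamma G \Lambda = \Theta .

% The parametrized G solves the equation whenever the existence condition holds:
\Gamma \left( \Gamma^+\Theta\Lambda^+ + Z - \Gamma^+\Gamma Z \Lambda\Lambda^+ \right) \Lambda
= \Gamma\Gamma^+\Theta\Lambda^+\Lambda + \Gamma Z \Lambda - \Gamma Z \Lambda
= \Theta .

% Every solution G_0 is obtained by the choice Z = G_0:
\Gamma^+\Theta\Lambda^+ + G_0 - \Gamma^+ (\Gamma G_0 \Lambda) \Lambda^+
= \Gamma^+\Theta\Lambda^+ + G_0 - \Gamma^+\Theta\Lambda^+ = G_0 .
```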

Perhaps the earliest book on LMI control 
methods was Boyd et al. (1994), but the results 
and notations used herein are taken from Skelton 
et al. (1998). Other important LMI papers and 
books can give the reader a broader background, 
including Iwasaki and Skelton (1994), Gahinet 
and Apkarian (1994), de Oliveira et al. (2002), 
Li et al. (2008), de Oliveira and Skelton 
(2001), Camino et al. (2001, 2003), Boyd and 
Vandenberghe (2004), Iwasaki et al. (2000), 
Khargonekar and Rotea (1991), Vandenberghe 
and Boyd (1996), Scherer (1995), Scherer et al. 
(1997), Balakrishnan et al. (1994), Gahinet et al. 
(1995), and Dullerud and Paganini (2000). 


Control Design Using LMIs 

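Consider the feedback control system

$$\begin{bmatrix} \dot{x}_p \\ y \\ z \end{bmatrix} = \begin{bmatrix} A_p & D_p & B_p \\ C_p & D_y & B_y \\ M_p & D_z & 0 \end{bmatrix} \begin{bmatrix} x_p \\ w \\ u \end{bmatrix}, \qquad \begin{bmatrix} u \\ \dot{x}_c \end{bmatrix} = \begin{bmatrix} D_c & C_c \\ B_c & A_c \end{bmatrix} \begin{bmatrix} z \\ x_c \end{bmatrix} = G \begin{bmatrix} z \\ x_c \end{bmatrix}, \tag{1}$$
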

where z is the measurement vector, y is the output to be controlled, u is the control vector, $x_p$ is the plant state vector, $x_c$ is the state of the controller, and w is the external disturbance (in some cases below we treat w as a zero-mean white noise). We seek to choose the control matrix G to satisfy the given upper bounds on the output covariance $E[yy^T] < \bar{Y}$, where E represents the steady-state expectation operator in the stochastic case (i.e., when w is white noise), and in the deterministic case E represents the infinite integral of the matrix $[yy^T]$. The math is the same in each case, with appropriate interpretations of certain matrices. For a rigorous equivalence of the deterministic and stochastic interpretations, see Skelton (1988).
By defining the matrices

$$x = \begin{bmatrix} x_p \\ x_c \end{bmatrix}, \quad A = \begin{bmatrix} A_p & 0 \\ 0 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} B_p & 0 \\ 0 & I \end{bmatrix}, \quad M = \begin{bmatrix} M_p & 0 \\ 0 & I \end{bmatrix}, \tag{2}$$

$$D = \begin{bmatrix} D_p \\ 0 \end{bmatrix}, \quad E = \begin{bmatrix} D_z \\ 0 \end{bmatrix}, \quad C = \begin{bmatrix} C_p & 0 \end{bmatrix}, \quad H = \begin{bmatrix} B_y & 0 \end{bmatrix}, \tag{3}$$

$$F = D_y, \tag{4}$$

one can write the closed-loop system dynamics in the form

$$\dot{x} = (A + BGM)x + (D + BGE)w, \qquad y = (C + HGM)x + (F + HGE)w. \tag{5}$$

Often it is of interest to characterize the set of all controllers that can satisfy performance bounds on both the outputs and inputs, $E[yy^T] < \bar{Y}$ and $E[uu^T] < \bar{U}$, and we call these covariance control problems. But without prespecified performance bounds $\bar{Y}$, $\bar{U}$, one can require stability only. Such examples are given below.


Many Control Problems Reduce to the Same LMI

Let the left (right) null spaces of any matrix B be defined by matrices $U_B$ ($V_B$), where $U_B^T B = 0$, $U_B^T U_B > 0$ ($B V_B = 0$, $V_B^T V_B > 0$). For any given matrices $\Gamma$, $\Lambda$, $\Theta$, Chap. 9 of the book (Skelton et al. 1998) provides all G which solve

$$\Gamma G \Lambda + (\Gamma G \Lambda)^T + \Theta < 0, \tag{6}$$

and proves that there exists such a matrix G if and only if the following two conditions hold:

$$U_\Gamma^T \Theta U_\Gamma < 0, \quad \text{or} \quad \Gamma\Gamma^T > 0, \tag{7}$$

$$V_\Lambda^T \Theta V_\Lambda < 0, \quad \text{or} \quad \Lambda^T \Lambda > 0. \tag{8}$$

If G exists, then one set of such G is given by

$$G = -\rho\, \Gamma^T \Phi \Lambda^T (\Lambda \Phi \Lambda^T)^{-1}, \qquad \Phi = (\rho\, \Gamma\Gamma^T - \Theta)^{-1}, \tag{9}$$

where $\rho > 0$ is an arbitrary scalar such that

$$\Phi = (\rho\, \Gamma\Gamma^T - \Theta)^{-1} > 0. \tag{10}$$

All G which solve the problem are given by Theorem 2.3.12 in Skelton et al. (1998). As elaborated in Chap. 9 of Skelton et al. (1998), 17 different control problems (using either state feedback or full-order dynamic controllers) all reduce to this same mathematical problem. That is, by defining the appropriate $\Theta$, $\Lambda$, $\Gamma$, a very large number of different control problems, including the characterization of all stabilizing controllers, covariance control, $H_\infty$ control, $L_\infty$ control, LQG control, and $H_2$ control, can be reduced to the same matrix inequality (6). Several examples from Skelton et al. (1998) follow.

Stabilizing Control 

There exists a controller G that stabilizes the system (1) if and only if (7) and (8) hold, where the matrices are defined by

\[ \begin{bmatrix} \Gamma & \Lambda^T & \Theta \end{bmatrix} = \begin{bmatrix} B & XM^T & AX + XA^T \end{bmatrix}. \qquad (11) \]

One can also write such results in another way, as in Corollary 6.2.1 of Skelton et al. (1998, p. 135): There exists a control of the form u = Gx that can stabilize the system ẋ = Ax + Bu if and only if there exists a matrix X > 0 satisfying B^⊥(AX + XA^T)(B^⊥)^T < 0, where B^⊥ denotes the left null space of B. In this case all stabilizing controllers may be parametrized by G = -B^T P + LQ^{1/2}, for any Q > 0 and a P > 0 satisfying PA + A^T P - PBB^T P + Q = 0. The matrix L is any matrix that satisfies the norm bound ||L|| < 1. Youla et al. (1976) provided a parametrization of the set of all stabilizing controllers, but the parametrization was infinite dimensional (as it did not impose any restriction on the order or form of the controller). So for finite calculations one had to truncate the set to a finite number before optimization or stabilization started. As noted above, on the other hand, all stabilizing state feedback controllers G can be parametrized in terms of an arbitrary but finite-dimensional norm-bounded matrix L. Similar results apply for the dynamic controllers of any
fixed order (see Chap. 6 in Skelton et al. 1998). 
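
The state feedback parametrization G = -B^T P + LQ^{1/2} can be checked by hand in the scalar case. The following is a minimal sketch, assuming a scalar plant ẋ = ax + bu with b ≠ 0 and q > 0; the function names and the numerical values are illustrative assumptions, not part of Skelton et al. (1998).

```python
import math


def scalar_are_solution(a, b, q):
    """Positive root p of 2*a*p - b**2 * p**2 + q = 0.

    This is the scalar form of P A + A^T P - P B B^T P + Q = 0.
    """
    return (a + math.sqrt(a * a + b * b * q)) / (b * b)


def closed_loop_pole(a, b, q, L):
    """Closed-loop pole a + b*G for G = -b*p + L*sqrt(q), with |L| < 1."""
    if not abs(L) < 1.0:
        raise ValueError("the parametrization requires |L| < 1")
    p = scalar_are_solution(a, b, q)
    G = -b * p + L * math.sqrt(q)
    return a + b * G


if __name__ == "__main__":
    # Illustrative (assumed) data: an unstable plant with a = 1, b = 2, q = 3.
    for L in (-0.99, -0.5, 0.0, 0.5, 0.99):
        print(L, closed_loop_pole(1.0, 2.0, 3.0, L))
```

In this scalar setting a - b²p = -sqrt(a² + b²q), while the free term bL·sqrt(q) has magnitude strictly below sqrt(b²q), so every |L| < 1 yields a negative closed-loop pole.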

Covariance Upper Bound Control 

In the system (1), suppose that D_y = 0, B_y = 0 and that w is zero-mean white noise with intensity I. Let a required upper bound Ȳ > 0 on the steady-state output covariance Y = E[yy^T] be given. The following statements are equivalent:

(i) There exists a controller G that solves the covariance upper bound control problem Y < Ȳ.

(ii) There exists a matrix X > 0 such that Y = CXC^T < Ȳ and (7) and (8) hold, where the matrices are defined by

\[ \begin{bmatrix} \Gamma & \Lambda^T & \Theta \end{bmatrix} = \begin{bmatrix} B & XM^T & AX + XA^T & D \\ 0 & E^T & D^T & -I \end{bmatrix} \qquad (12) \]

(Θ occupies the last two columns).

Proof is provided by Theorem 9.1.2 in Skelton et al. (1998).

Linear Quadratic Regulator 

Consider the linear time-invariant system (1). Suppose that D_y = 0, D_z = 0 and that w is the impulsive disturbance w(t) = w_0 δ(t). Let a performance bound γ > 0 be given, where the required performance is to keep the integral squared output (||y||²_{L_2}) less than the prespecified value, ||y||_{L_2} < γ, for any vector w_0 such that w_0^T w_0 ≤ 1, and x_0 = 0. This problem is labeled linear quadratic regulator (LQR). The following statements are equivalent:

(i) There exists a controller G that solves the LQR problem.

(ii) There exists a matrix Y > 0 such that ||D^T Y D|| < γ² and (7) and (8) hold, where the matrices are defined by

\[ \begin{bmatrix} \Gamma & \Lambda^T & \Theta \end{bmatrix} = \begin{bmatrix} YB & M^T & YA + A^T Y & M^T \\ H & 0 & M & -I \end{bmatrix}. \qquad (13) \]

Proof is provided by Theorem 9.1.3 in Skelton et al. (1998).

H∞ Control

The first papers to solve the general H∞ problem, without any restrictions on the plant, used LMI techniques. See Iwasaki and Skelton (1994) and Gahinet and Apkarian (1994).

Let the closed-loop transfer matrix from w to y with the controller in (1) be denoted by T(s):

\[ T(s) = C_{cl} (sI - A_{cl})^{-1} B_{cl} + D_{cl}. \qquad (14) \]

The H∞ control problem can be defined as follows:

Let a performance bound γ > 0 be given. Determine whether or not there exists a controller G in (1) which asymptotically stabilizes the system and yields the closed-loop transfer matrix (14) such that the peak value of the frequency response is less than γ. That is, ||T||_{H∞} = sup_ω ||T(jω)|| < γ.

For the H∞ control problem, we have the following result. Let a required H∞ performance bound γ > 0 be given. The following statements are equivalent:

(i) A controller G solves the H∞ control problem.

(ii) There exists a matrix X > 0 such that (7) and (8) hold, where

\[ \begin{bmatrix} \Gamma & \Lambda^T & \Theta \end{bmatrix} = \begin{bmatrix} B & XM^T & AX + XA^T & XC^T & D \\ H & 0 & CX & -\gamma I & F \\ 0 & E^T & D^T & F^T & -\gamma I \end{bmatrix} \qquad (15) \]

(Θ occupies the last three columns).

Proof is provided by Theorem 9.1.5 in Skelton et al. (1998).

L∞ Control

The peak value of the frequency response is controlled by the above H∞ controller. A similar theorem can be written to control the peak in the time domain.

Define sup_t y(t)^T y(t) = ||y||²_{L∞}, and let the statement ||y||_{L∞} < γ mean that the peak value of y(t)^T y(t) is less than γ². Suppose that D_y = 0 and B_y = 0. There exists a controller G which maintains ||y||_{L∞} < γ in the presence of any energy-bounded input w(t) (i.e., ∫_0^∞ w^T w dt ≤ 1) if and only if there exists a matrix X > 0 such that CXC^T < γ²I and (7) and (8) hold, where

\[ \begin{bmatrix} \Gamma & \Lambda^T & \Theta \end{bmatrix} = \begin{bmatrix} B & XM^T & AX + XA^T & D \\ 0 & E^T & D^T & -\gamma I \end{bmatrix}. \qquad (16) \]

Proof is provided by Theorem 9.1.4 in Skelton et al. (1998).


Information Architecture in 
Estimation and Control Problems 

In the typical “control problem” that occupies most research literature, the sensors and actuators have already been selected. Yet the selection of sensors and actuators and their locations greatly affects the ability of the control system to do its job efficiently. Perhaps in one location a high-precision sensor is needed, and in another location high precision is not needed, and paying for high precision in that location would therefore be a waste of resources. These decisions must be influenced by the control dynamics which are yet to be designed. How does one know where to effectively spend money to improve the system? To answer this question, we must optimize the information architecture jointly with the control law.

Let us consider the problem of selecting the control law jointly with the selection of the precision (defined here as the inverse of the noise intensity) of each actuator/sensor, subject to the constraint of specified upper bounds on the covariance of output error and control signals, and specified upper bounds on the sensor/actuator cost. We assume the cost of these devices is proportional to their precision (i.e., the cost is equal to the price per unit of precision times the precision). Traditionally, with full-order controllers and prespecified sensor/actuator instruments (with specified precisions), this is a well-known solved convex problem (which means it can be converted to an LMI problem if desired); see Chap. 6 of Skelton et al. (1998). If we enlarge the domain of the freedom to include sensor/actuator precisions, it is not obvious whether the feasibility problem is convex or not. The following shows that this problem of including the sensor/actuator precisions within the control design problem is indeed convex and therefore completely solved. The proof is provided in Li et al. (2008).

Consider the linear control system (1)-(5). Assume that the cost of sensors and actuators is proportional to their precision, which we herein define to be the inverse of the noise intensity (or variance, in the discrete-time case). So if the price per unit of precision of the i-th sensor/actuator is P_ii, and if the variance (or intensity) of the noise associated with the i-th sensor/actuator is W_ii, then the total cost of all sensors and actuators is Σ_i P_ii W_ii^{-1}, or simply tr(PW^{-1}), where P = diag(P_ii) and W^{-1} = diag(W_ii^{-1}).
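
The cost expression tr(PW^{-1}) is simply a price-weighted sum of precisions. A minimal sketch, assuming the diagonal entries P_ii and W_ii are supplied as plain Python lists; the function name and the numbers in the usage lines are illustrative assumptions.

```python
def instrument_cost(prices, noise_intensities):
    """Total cost tr(P W^{-1}) = sum_i P_ii / W_ii for diagonal P and W.

    prices[i] is the price per unit of precision of instrument i, and
    noise_intensities[i] is its noise intensity W_ii (precision is 1 / W_ii).
    """
    if len(prices) != len(noise_intensities):
        raise ValueError("one price is needed per instrument")
    if any(w <= 0.0 for w in noise_intensities):
        raise ValueError("noise intensities must be positive")
    return sum(p / w for p, w in zip(prices, noise_intensities))


if __name__ == "__main__":
    # Two hypothetical instruments: precisions 1/0.5 = 2 and 1/0.25 = 4.
    print(instrument_cost([2.0, 1.0], [0.5, 0.25]))
```

Halving the noise intensity of an instrument doubles its precision and therefore doubles its contribution to the cost.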

Consider the control system (1). Suppose that D_y = 0, B_y = 0, w = [w_s^T w_a^T]^T is the zero-mean sensor/actuator noise, D_p = [0 D_a] and D_z = [D_s 0]. If $ represents the allowed upper bound on sensor/actuator costs, there exists a dynamic controller G that satisfies the constraints

\[ E[uu^T] < \bar{U}, \quad E[yy^T] < \bar{Y}, \quad \operatorname{tr}(PW^{-1}) < \$ \qquad (17) \]

in the presence of sensor/actuator noise with intensity diag(W_ii) = W (which, like G, should be considered a design variable not fixed a priori) if and only if there exist matrices L, F, Q, X, Z, W^{-1} such that

\[ \operatorname{tr}(PW^{-1}) < \$ \qquad (18) \]

Note that the matrix inequalities (18)-(20) are LMIs in the collection of variables (L, F, Q, X, Z, W^{-1}), whereby joint control/sensor/actuator design is a convex problem.

Assume a solution (L, F, Q, X, Z, W) is found for the LMIs (18)-(20). Then the problem (17) is solved by the controller

where V_l and V_r are left and right factors of the matrix I - YX (which can be found from the singular value decomposition I - YX = UΣV^T = (UΣ^{1/2})(Σ^{1/2}V^T) = (V_l)(V_r)).

To emphasize the theme of this article, to 
relate optimization to LMIs, we note that three 
optimization problems present themselves in the 
above problem with three constraints: control 
effort U, output performance Y, and instrument 
costs $. To solve optimization problems, one can 


fix any two of these prespecified upper bounds 
and iteratively reduce the level set value of the 
third “constraint” until feasibility is lost. This 
process minimizes the resource expressed by the 
third constraint, while enforcing the other two 
constraints. 
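
The level-set reduction described above can be sketched as a one-dimensional search over the third bound. The text does not prescribe an update rule; bisection is used here as one convenient choice. The callable is_feasible is a hypothetical stand-in for an LMI feasibility solve with the other two bounds held fixed, and it is assumed to be monotone (any bound larger than a feasible one is also feasible).

```python
def minimize_bound(is_feasible, lo, hi, tol=1e-6):
    """Smallest feasible bound in [lo, hi], to within tol, found by bisection.

    is_feasible(bound) -> bool is assumed monotone in the bound.
    lo is assumed infeasible (or a hard lower limit); hi must be feasible.
    """
    if not is_feasible(hi):
        raise ValueError("infeasible even at the loosest bound")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_feasible(mid):
            hi = mid  # still feasible: tighten the level set
        else:
            lo = mid  # feasibility lost: back off
    return hi


if __name__ == "__main__":
    # Toy oracle standing in for the LMI solve: feasible iff the budget is at least 3.7.
    def toy_oracle(budget):
        return budget >= 3.7

    print(minimize_bound(toy_oracle, 0.0, 100.0))
```

Each call to the oracle is a convex feasibility problem, so the overall search costs a logarithmic number of LMI solves in the ratio of the initial interval to the tolerance.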

As an example, if cost is not a concern, one can always set large limits for $ and discover the best assignment of sensor/actuator precisions for the specified performance requirements. These precisions produced by the algorithm are the values W_ii^{-1}, produced from the solution (18)-(20), where the observed rankings W_ii^{-1} > W_jj^{-1} > W_kk^{-1} > ... indicate which sensors or actuators are most critical to the required performance goals (Ū, Ȳ, $). If any precision W_nn^{-1} is essentially zero, compared to other required precisions, then the math is asserting that the information from this sensor (n) is not important for the control objectives specified, or the control signals through this actuator channel (n) are ineffective in controlling the system to these specifications. This information leads us to a technique for choosing the best sensors and actuators and their locations.

The previous discussion provides the precisions (W_ii^{-1}) required of each sensor and each actuator in the system. Our final application of this theory locates sensors and actuators in a large-scale system, by discarding the least effective ones. Suppose we solve any of the above feasibility problems, by starting with the entire admissible set of sensors and actuators (without regard to cost). For example, in a flexible structure control problem we might not know whether to place a rate sensor or displacement sensors at a given location, so we add both. We might not know whether to use torque or force actuators, so we add both. We fill up the system with all the possibilities we might want to consider, and let the above precision rankings (available after the above LMI problem is solved) reveal how much precision is needed at each location and at each sensor/actuator. If there is a large gap in the precisions required (say W_11^{-1} > W_22^{-1} > ... ≫ W_nn^{-1}), then delete the sensor/actuator n and repeat the LMI problem with one fewer sensor or actuator. Continue deleting sensors/actuators in this manner until feasibility of the problem is lost. Then this algorithm, stopping at the previous iteration, has selected the best distribution of sensors/actuators for solving the specific problem specified by the allowable bounds ($, Ū, Ȳ). The most important contribution of the above algorithm has been to extend control theory to solve system design problems that involve more than just designing control gains. This enlarges the set of solved linear control problems, from solutions of linear controllers with sensors/actuators prespecified to solutions which specify the sensor/actuator requirements jointly with the control solution.
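
The deletion procedure can be sketched as a greedy loop. This is a sketch under stated assumptions, not the algorithm of Li et al. (2008) verbatim: solve is a hypothetical stand-in for the LMI problem (18)-(20) that returns the required precision of each active instrument, or None when infeasible, and gap_ratio is an assumed threshold for what counts as a "large gap".

```python
def prune_instruments(candidates, solve, gap_ratio=100.0):
    """Greedy sensor/actuator deletion driven by required precisions.

    solve(active) -> {name: precision} or None if the problem is infeasible.
    Returns the retained instruments and their required precisions.
    """
    active = list(candidates)
    precisions = solve(active)
    if precisions is None:
        raise ValueError("infeasible even with every candidate instrument")
    while len(active) > 1:
        weakest = min(active, key=lambda name: precisions[name])
        strongest = max(precisions[name] for name in active)
        if strongest < gap_ratio * precisions[weakest]:
            break  # no large gap in required precision: keep what is left
        trial = [name for name in active if name != weakest]
        trial_precisions = solve(trial)
        if trial_precisions is None:
            break  # feasibility lost: stop at the previous iteration
        active, precisions = trial, trial_precisions
    return active, precisions


def toy_solve(active):
    """Toy stand-in for the LMI solve: two instruments are essential."""
    needed = {"rate_sensor", "force_actuator"}
    if not needed.issubset(active):
        return None
    return {name: (10.0 if name in needed else 1e-4) for name in active}


if __name__ == "__main__":
    everything = ["rate_sensor", "displacement_sensor",
                  "force_actuator", "torque_actuator"]
    kept, required = prune_instruments(everything, toy_solve)
    print(kept, required)
```

With the toy oracle, the two instruments whose required precision is essentially zero are removed one at a time, and the loop stops once the remaining precisions are of comparable size.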


Summary 

LMI techniques provide more powerful tools 
for designing dynamic linear systems than 
techniques that minimize a scalar functional for 
optimization, since multiple goals (bounds) can 
be achieved for each of the outputs and inputs. 
Optimal control has been a pillar of control theory 
for the last 50 years. In fact, all of the problems 
discussed in this article can perhaps be solved 
by minimizing a scalar functional, but a search 
is required to find the right functional. Globally 
convergent algorithms are available to do just 
that for quadratic functionals. But more direct 
methods are now available (since the early 1990s) 
for satisfying multiple constraints. Since then, 
feasibility approaches have dominated design 
decisions (at least for linear systems), and such 
feasibility problems may be convex or not. If 
the problem can be reduced to a set of LMIs 
to solve, then convexity is proven. However, 
failure to find such LMI formulations of the 
problem does not mean it is not convex, and 
computer-assisted methods for convex problems 
are available to avoid the search for LMIs (see 
Camino et al. 2003). Optimization can also be 
achieved with LMI methods by reducing the 
level set for one of the bounds, while maintaining 
all the other bounds. This level set is reduced 
iteratively, between convex (LMI) solutions, 
until feasibility is lost. Remarkably, most of the common linear control design problems reduce to exactly the same matrix inequality problem (6). The set of such equivalent problems includes LQR, the set of all stabilizing controllers, the set of all H∞ controllers, and the set of all L∞ controllers. The discrete and robust versions of these problems are also included in this equivalent set; 17 control problems have been found to be equivalent to LMI problems.

LMI techniques extend the range of 
solvable system design problems beyond just 
control design. By integrating information 
architecture and control design, one can 
simultaneously choose the control gains and 
the precision required of all sensor/actuators to 
satisfy the closed-loop performance constraints. 
These techniques can be used to select the 
information (with precision requirements) 
required to solve a control or estimation problem, 
using the best economic solution (minimal 
precision). For a more complete discussion of 
LMI problems in control, read Dullerud and 
Paganini (2000), de Oliveira et al. (2002), Li 
et al. (2008), de Oliveira and Skelton (2001), 
Gahinet and Apkarian (1994), Iwasaki and 
Skelton (1994), Camino et al. (2001, 2003), 
Skelton et al. (1998), Boyd and Vandenberghe 
(2004), Boyd et al. (1994), Iwasaki et al. (2000), 
Khargonekar and Rotea (1991), Vandenberghe 
and Boyd (1996), Scherer (1995), Scherer et al. 
(1997), Balakrishnan et al. (1994), and Gahinet 
et al. (1995). 


Cross-References 

► H-Infinity Control 

► H 2 Optimal Control 

► Linear Quadratic Optimal Control 

► LMI Approach to Robust Control 

► Stochastic Linear-Quadratic Control


Abstract

Linear quadratic optimal control is a collective 
term for a class of optimal control problems 
involving a linear input-state-output system 
and a cost functional that is a quadratic form 
of the state and the input. The aim is to 
minimize this cost functional over a given 
class of input functions. The optimal input 
depends on the initial condition, but can be 
implemented by means of a state feedback 
control law independent of the initial condition. 
Both the feedback gain and the optimal cost can be computed in terms of solutions of Riccati equations.


Introduction

Linear quadratic optimal control is a generic term that collects a number of optimal control problems for linear input-state-output systems in which a quadratic cost functional is minimized over a given class of input functions. This functional is formed by integrating a quadratic form of the state and the input over a finite or an infinite time interval. Minimizing the energy of the output over a finite or infinite time interval can be formulated in this framework and in fact provides a major motivation for this class of optimal control problems. A common feature of the solutions to the several versions of the problem is that the optimal input functions can be given in the form of a linear state feedback control law. This makes it possible to implement the optimal controllers as a feedback loop around the system. Another common feature is that the optimal value of the cost functional is a quadratic form of the initial condition on the system. This quadratic form is obtained by taking the appropriate solution of a Riccati differential equation or algebraic Riccati equation associated with the system.


Systems with Inputs and Outputs 

Consider the continuous-time, linear time-invariant input-output system in state space form represented by

\[ \dot{x}(t) = Ax(t) + Bu(t), \quad z(t) = Cx(t) + Du(t). \qquad (1) \]

This system will be referred to as Σ. In (1), A, B, C, and D are maps between suitable spaces (or matrices of suitable dimensions) and the functions x, u, and z are considered to be defined on the real axis ℝ or on any subinterval of it. In particular, one often assumes the domain of definition to be the nonnegative part of ℝ. The function u is called the input, and its values are assumed to be given from outside the system. The class of admissible input functions will be denoted U. Often, U will be the class of piecewise continuous or locally integrable functions, but for most purposes, the exact class from which the input functions are chosen is not important. We will assume that input functions take values in an m-dimensional space 𝒰, which we often identify with ℝ^m. The variable x is called the state variable and it is assumed to take values in an n-dimensional space 𝒳. The space 𝒳 will be called the state space. It will usually be identified with ℝ^n. Finally, z is called the to-be-controlled output of the system and takes values in a p-dimensional space 𝒵, which we identify with ℝ^p. The solution of the differential equation of Σ with initial value x(0) = x_0 will be denoted as x_u(t, x_0). It can be given explicitly using the variation-of-constants formula (see Trentelman et al. 2001, p. 38). The set of eigenvalues of a given matrix M is called the spectrum of M and is denoted by σ(M). The system (1) is called stabilizable if there exists a map (matrix of suitable dimensions) F such that σ(A + BF) ⊂ ℂ⁻. Here, ℂ⁻ denotes the open left half complex plane, i.e., {λ ∈ ℂ | Re(λ) < 0}. We often express this property by saying that the pair (A, B) is stabilizable.

The Linear Quadratic Optimal Control 
Problem 

Assume that our aim is to keep all components of the output z(t) as small as possible, for all t ≥ 0. In the ideal situation, with initial state x(0) = 0, the uncontrolled system (with control input u = 0) evolves along the stationary solution x(t) = 0. Of course, the output z(t) will then also be equal to zero for all t. If, however, at time t = 0 the state of the system is perturbed to, say, x(0) = x_0, with x_0 ≠ 0, then the uncontrolled system will evolve along a state trajectory unequal to the stationary zero solution, and we will get z(t) = Ce^{At}x_0. To remedy this, from time t = 0 on, we can apply an input function u, so that for t ≥ 0 the corresponding output becomes equal to z(t) = Cx_u(t, x_0) + Du(t). Keeping in mind that we want the output z(t) to be as small as possible for all t ≥ 0, we can measure its size by the quadratic cost functional

\[ J(x_0, u) := \int_0^\infty \|z(t)\|^2 \, dt, \qquad (2) \]


where || • || denotes the Euclidean norm. Our 
aim to keep the values of the output as small as 
possible can then be expressed as requiring this 
integral to be as small as possible by suitable 
choice of input function u. In this way we arrive 
at the linear quadratic optimal control problem : 

Problem 1 Consider the system Σ: ẋ(t) = Ax(t) + Bu(t), z(t) = Cx(t) + Du(t). Determine for every initial state x_0 an input u ∈ U (a space of functions [0, ∞) → 𝒰) such that

\[ J(x_0, u) := \int_0^\infty \|z(t)\|^2 \, dt \qquad (3) \]

is minimal. Here z(t) denotes the output trajectory z_u(t, x_0) of Σ corresponding to the initial state x_0 and input function u.

Since the system is linear and the integrand in the cost functional is a quadratic function of z, the problem is called linear quadratic. Of course, ||z||² = x^T C^T C x + 2u^T D^T C x + u^T D^T D u, so the integrand can also be considered as a quadratic function of (x, u). The convergence of the integral in (3) is of course a point of concern. Therefore, one often considers the corresponding finite-horizon problem in a preliminary investigation. In this problem, a final time T is given and one wants to minimize the integral

\[ J(x_0, u, T) := \int_0^T \|z(t)\|^2 \, dt. \qquad (4) \]



In contrast to this, the first problem above is sometimes called the infinite-horizon problem. An important issue is also the convergence of the state. Obviously, convergence of the integral does not always imply the convergence to zero of the state. Therefore, a distinction is made between the problem with zero endpoint and the problem with free endpoint. Problem 1 as stated is referred to as the problem with free endpoint. If one restricts the inputs u in the problem to those for which the resulting state trajectory tends to zero, one speaks about the problem with zero endpoint. Specifically:

Problem 2 In the situation of Problem 1, determine for every initial state x_0 an input u ∈ U such that x_u(t, x_0) → 0 (t → ∞) and such that under this condition, J(x_0, u) is minimized.

In the literature various special cases of these problems have been considered, and names have been associated to these special cases. In particular, Problems 1 and 2 are called regular if the matrix D is injective, equivalently, D^T D > 0. If, in addition, C^T D = 0 and D^T D = I, then the problems are said to be in standard form. In the standard case, the integrand in the cost functional reduces to ||z||² = x^T C^T C x + u^T u. We often write Q = C^T C. The standard case is a special case, which is not essentially simpler than the general regular problem, but which gives rise to simpler formulas. The general regular problem can be reduced to the standard case by means of a suitable feedback transformation.

The Finite-Horizon Problem

The finite-horizon problem in standard form is formulated as follows:

Problem 3 Given the system ẋ(t) = Ax(t) + Bu(t), a final time T > 0, and symmetric matrices N and Q such that N ≥ 0 and Q ≥ 0, determine for every initial state x_0 a piecewise continuous input function u: [0, T] → 𝒰 such that the integral

\[ J(x_0, u, T) := \int_0^T \big( x(t)^T Q x(t) + u(t)^T u(t) \big) \, dt + x(T)^T N x(T) \qquad (5) \]

is minimized.

In this problem, we have introduced a weight 
on the final state , using the matrix N. This 
generalization of the problem does not give rise 
to additional complications. 

A key ingredient in solving this finite-horizon problem is the Riccati differential equation associated with the problem:

\[ \dot{P}(t) = A^T P(t) + P(t) A - P(t) B B^T P(t) + Q, \quad P(0) = N. \qquad (6) \]

This is a quadratic differential equation on the interval [0, ∞) in terms of the matrices A, B, and Q, and with initial condition given by the weight matrix N on the final state. The unknown in the differential equation is the matrix-valued function P(t). The following theorem solves the finite-horizon problem. It states that the Riccati differential equation with initial condition (6) has a unique solution on [0, ∞), that the optimal value of the cost functional is determined by the value of this solution at time T, and that there exists a unique optimal input that is generated by a time-varying state feedback control law:

Theorem 1 Consider Problem 3. The following properties hold:

1. The Riccati differential equation with initial value (6) has a unique solution P(t) on [0, ∞). This solution is symmetric and positive semidefinite for all t ≥ 0.

2. For each x_0 there is exactly one optimal input function, i.e., a piecewise continuous function u* on [0, T] such that J(x_0, u*, T) = J*(x_0, T) := inf{J(x_0, u, T) | u ∈ U}. This optimal input function u* is generated by the time-varying feedback control law

\[ u(t) = -B^T P(T - t) x(t) \quad (0 \le t \le T). \qquad (7) \]

3. For each x_0, the minimal value of the cost functional equals

\[ J^*(x_0, T) = x_0^T P(T) x_0. \]

4. If N = 0, then the function t ↦ P(t) is an increasing function in the sense that P(t) - P(s) is positive semidefinite for t ≥ s.
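
Theorem 1 can be illustrated numerically in the scalar case. The following is a minimal sketch, assuming a scalar system ẋ = ax + bu with weights q and n; the RK4 and Heun integrators, the function names, and the test data are choices made for this illustration and are not part of the source. It integrates the scalar form of (6), applies the feedback (7), and compares the resulting cost (5) with x_0² P(T). For a = 0, b = 1, q = 1, n = 0 the Riccati solution is P(t) = tanh(t), which gives an independent check.

```python
import math


def riccati_rhs(p, a, b, q):
    """Scalar form of (6): dp/dt = 2*a*p - b**2 * p**2 + q."""
    return 2.0 * a * p - (b * p) ** 2 + q


def integrate_riccati(a, b, q, n, T, steps):
    """RK4 integration of (6) from p(0) = n; returns [p(0), p(h), ..., p(T)]."""
    h = T / steps
    p, values = n, [n]
    for _ in range(steps):
        k1 = riccati_rhs(p, a, b, q)
        k2 = riccati_rhs(p + 0.5 * h * k1, a, b, q)
        k3 = riccati_rhs(p + 0.5 * h * k2, a, b, q)
        k4 = riccati_rhs(p + h * k3, a, b, q)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        values.append(p)
    return values


def closed_loop_cost(a, b, q, n, T, x0, steps):
    """Cost (5) obtained with the feedback (7), u(t) = -b * p(T - t) * x(t)."""
    p = integrate_riccati(a, b, q, n, T, steps)
    h = T / steps
    x, cost = x0, 0.0
    for k in range(steps):
        p_now, p_next = p[steps - k], p[steps - k - 1]  # p(T - t_k), p(T - t_{k+1})
        f_now = (a - b * b * p_now) * x
        x_pred = x + h * f_now
        x_next = x + 0.5 * h * (f_now + (a - b * b * p_next) * x_pred)  # Heun step
        g_now = (q + (b * p_now) ** 2) * x * x            # q*x^2 + u^2 at t_k
        g_next = (q + (b * p_next) ** 2) * x_next * x_next
        cost += 0.5 * h * (g_now + g_next)                # trapezoidal rule
        x = x_next
    return cost + n * x * x


if __name__ == "__main__":
    p_T = integrate_riccati(0.0, 1.0, 1.0, 0.0, 1.0, 1000)[-1]
    print(p_T, math.tanh(1.0))
    print(closed_loop_cost(0.0, 1.0, 1.0, 0.0, 1.0, 2.0, 2000), 4.0 * p_T)
```

The two numbers printed on each line should agree up to discretization error, illustrating items 2 and 3 of the theorem; the computed values of P(t) are also nondecreasing in t, in line with item 4.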


The Infinite-Horizon Problem with 
Free Endpoint 

We consider the situation as described in Theorem 1 with N = 0. An obvious conjecture is that x_0^T P(T) x_0 converges to the minimal cost of the infinite-horizon problem as T → ∞. The convergence of x_0^T P(T) x_0 for all x_0 is equivalent to the convergence of the matrix P(T) for T → ∞ to some matrix P⁻. Such a convergence does not always take place. In order to achieve convergence, we make the following assumption: for every x_0, there exists an input u for which the integral

\[ J(x_0, u) := \int_0^\infty \big( x(t)^T Q x(t) + u(t)^T u(t) \big) \, dt \qquad (8) \]

converges, i.e., for which the cost J(x_0, u) is finite. Obviously, for the problem to make sense for all x_0, this condition is necessary. It is easily seen that the stabilizability of (A, B) is a sufficient condition for the above assumption to hold (not necessary, take, e.g., Q = 0). Take an arbitrary initial state x_0 and assume that u is a function such that the integral (8) is finite. We have for every T > 0 that

\[ x_0^T P(T) x_0 \le J(x_0, u, T) \le J(x_0, u), \]

which implies that for every x_0, the expression x_0^T P(T) x_0 is bounded. This implies that P(T) is bounded. Since P(T) is increasing with respect to T, it follows that P⁻ := lim_{T→∞} P(T) exists. Since P satisfies the differential equation (6), it follows that also Ṗ(t) has a limit as t → ∞. It is easily seen that this latter limit must be zero. Hence, P = P⁻ satisfies the following equation:

\[ A^T P + P A - P B B^T P + Q = 0. \qquad (9) \]

This is called the algebraic Riccati equation (ARE). The solutions of this equation are exactly the constant solutions of the Riccati differential equation. The previous consideration shows that the ARE has a positive semidefinite solution P⁻. The solution is not necessarily unique, not even with the extra condition that P ≥ 0. However, P⁻ turns out to be the smallest real symmetric positive semidefinite solution of the ARE.

The following theorem now establishes a complete solution to the regular standard form version of Problem 1:

Theorem 2 Consider the system ẋ(t) = Ax(t) + Bu(t) together with the cost functional

\[ J(x_0, u) := \int_0^\infty \big( x(t)^T Q x(t) + u(t)^T u(t) \big) \, dt, \]

with Q ≥ 0. Factorize Q = C^T C. Then, the following statements are equivalent:

1. For every x_0 ∈ 𝒳, there exists u ∈ U such that J(x_0, u) < ∞.

2. The ARE (9) has a real symmetric positive semidefinite solution P.

Assume that one of the above conditions holds. Then, there exists a smallest real symmetric positive semidefinite solution of the ARE, i.e., there exists a real symmetric solution P⁻ ≥ 0 such that for every real symmetric solution P ≥ 0, we have P⁻ ≤ P. For every x_0, we have

\[ J^*(x_0) := \inf\{ J(x_0, u) \mid u \in U \} = x_0^T P^- x_0. \]

Furthermore, for every x_0, there is exactly one optimal input function, i.e., a function u* ∈ U such that J(x_0, u*) = J*(x_0). This optimal input is generated by the time-invariant feedback law

\[ u(t) = -B^T P^- x(t). \]


The Infinite-Horizon Problem with 
Zero Endpoint 

In addition to the free endpoint problem, we consider the version of the linear quadratic problem with zero endpoint. In this case the aim is to minimize for every x_0 the cost functional over all inputs u such that x_u(t, x_0) → 0 (t → ∞). For each x_0 such u exists if and only if the pair (A, B) is stabilizable. A solution to the regular standard form version of Problem 2 is stated next:

Theorem 3 Consider the system ẋ(t) = Ax(t) + Bu(t) together with the cost functional

\[ J(x_0, u) := \int_0^\infty \big( x(t)^T Q x(t) + u(t)^T u(t) \big) \, dt, \]

with Q ≥ 0. Assume that (A, B) is stabilizable. Then:

1. There exists a largest real symmetric solution of the ARE, i.e., there exists a real symmetric solution P⁺ such that for every real symmetric solution P, we have P ≤ P⁺. P⁺ is positive semidefinite.

2. For every initial state x_0, we have

\[ J_0^*(x_0) = x_0^T P^+ x_0. \]

3. For every initial state x_0, there exists an optimal input function, i.e., a function u* ∈ U with x(∞) = 0 such that J(x_0, u*) = J_0^*(x_0), if and only if every eigenvalue of A on the imaginary axis is (Q, A) observable, i.e.,

\[ \operatorname{rank} \begin{bmatrix} A - \lambda I \\ Q \end{bmatrix} = n \quad \text{for all } \lambda \in \sigma(A) \text{ with } \operatorname{Re}(\lambda) = 0. \]

Under this assumption we have:

4. For every initial state x_0, there is exactly one optimal input function u*. This optimal input function is generated by the time-invariant feedback law

\[ u(t) = -B^T P^+ x(t). \]

5. The optimal closed-loop system ẋ(t) = (A - BB^T P⁺)x(t) is stable. In fact, P⁺ is the unique real symmetric solution of the ARE for which σ(A - BB^T P⁺) ⊂ ℂ⁻.
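
The difference between the smallest positive semidefinite solution P⁻ (free endpoint) and the largest solution P⁺ (zero endpoint) is easy to see in the scalar case, where the ARE (9) is a quadratic in p. A minimal sketch, assuming a scalar system with b ≠ 0 and q ≥ 0; the function names and the example data are illustrative assumptions.

```python
import math


def scalar_are_roots(a, b, q):
    """Both real roots of the scalar ARE 2*a*p - b**2 * p**2 + q = 0, smallest first."""
    if b == 0.0:
        raise ValueError("this sketch assumes b != 0")
    if q < 0.0:
        raise ValueError("q must be nonnegative")
    root = math.sqrt(a * a + b * b * q)
    return (a - root) / (b * b), (a + root) / (b * b)


def extremal_solutions(a, b, q):
    """Return (P_minus, P_plus): smallest positive semidefinite root, largest root."""
    p_low, p_high = scalar_are_roots(a, b, q)
    p_minus = p_low if p_low >= 0.0 else p_high
    return p_minus, p_high


if __name__ == "__main__":
    # Unstable plant with no state penalty: a = 1, b = 1, q = 0.
    p_minus, p_plus = extremal_solutions(1.0, 1.0, 0.0)
    print("P-:", p_minus, "closed-loop pole:", 1.0 - p_minus)
    print("P+:", p_plus, "closed-loop pole:", 1.0 - p_plus)
```

With a = b = 1 and q = 0 the two roots are 0 and 2. The free endpoint optimum is u = 0 with zero cost, but the state diverges; the zero endpoint optimum uses P⁺ = 2 and places the closed-loop pole at -1. As soon as q > 0 the smaller root is negative, so P⁻ and P⁺ coincide in this scalar setting.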

Summary and Future Directions 

Linear quadratic optimal control deals with finding an input function that minimizes a quadratic cost functional for a given linear system. The cost functional is the integral of a quadratic form in the input and state variable of the system. If the integral is taken over a finite time interval, the problem is called a finite-horizon problem, and the optimal cost and optimal state feedback gain can be expressed in terms of the solution of an associated Riccati differential equation. If we integrate over an infinite time interval, the problem is called an infinite-horizon problem. The optimal cost and optimal feedback gain for the free endpoint problem can be found in terms of the smallest nonnegative real symmetric solution of the associated algebraic Riccati equation. For the zero endpoint problem, these are given in terms of the largest real symmetric solution of the algebraic Riccati equation.

Cross-References 

► Generalized Finite-Horizon Linear-Quadratic 
Optimal Control 

► H-Infinity Control 

► H 2 Optimal Control 

► Linear State Feedback 


Recommended Reading 

The linear quadratic regulator problem and the 
Riccati equation were introduced by R.E. Kalman 
in the early 1960s (see Kalman 1960). Extensive 
treatments of the problem can be found in the 
textbooks Brockett (1969), Kwakernaak and 
Sivan (1972), and Anderson and Moore (1971). 
For a detailed study of the Riccati differential 
equation and the algebraic Riccati equation, we 
refer to Wonham (1968). Extensions of the linear 
quadratic regulator problem to linear quadratic 
optimization problems, where the integrand 
of the cost functional is a possibly indefinite 
quadratic function of the state and input variable, 
were studied in the classical paper of Willems 
(1971). A further reference for the geometric 
classification of all real symmetric solutions 
of the algebraic Riccati equation is Coppel 
(1974). For the question what level of system 
performance can be obtained if, in the cost 
functional, the weighting matrix of the control 
input is singular or nearly singular leading to 
singular and nearly singular linear quadratic 
optimal control problems and “cheap control” 
problems, we refer to Kwakernaak and Sivan 
(1972). An early reference for a discussion on 
the singular problem is the work of Clements and 
Anderson (1978). More details can be found 
in Willems (1971) and Schumacher (1983). 
In singular problems, in general one allows
for distributions as inputs. This approach was
worked out in detail in Hautus and Silverman
(1983) and Willems et al. (1986). For a more
recent reference, including an extensive list
of references, we refer to the textbook of
Trentelman et al. (2001).

Abstract

As in optimal control theory, linear quadratic
(LQ) differential games (DG) can be solved,
even in high dimension, via a Riccati equation.
However, contrary to the control case, existence
of the solution of the Riccati equation is not
necessary for the existence of a closed-loop
saddle point. One may “survive” a particular,
nongeneric, type of conjugate point. An
important application of LQDGs is the so-called
$H_\infty$-optimal control, appearing in the theory of
robust control.


Keywords

Differential games; Finite horizon; H-infinity
control; Infinite horizon

Perfect State Measurement

Linear quadratic differential games are a special case of differential
games (DG). See the article ► Pursuit-Evasion Games and Zero-Sum
Two-Person Differential Games. They were first investigated by Ho et al.
(1965), in the context of a linearized pursuit-evasion game. This
subsection is based upon Bernhard (1979, 1980). A linear quadratic DG is
defined as

$$\dot{x} = Ax + Bu + Dv, \qquad x(t_0) = x_0,$$

with $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$, $v \in \mathbb{R}^\ell$,
$u(\cdot) \in L^2([0,T],\mathbb{R}^m)$, $v(\cdot) \in L^2([0,T],\mathbb{R}^\ell)$.
Final time $T$ is given, there is no terminal constraint, and
using the notation $x^t K x = \|x\|_K^2$,

$$J(t_0,x_0;u(\cdot),v(\cdot)) = \|x(T)\|_K^2 + \int_{t_0}^{T} \begin{pmatrix} x^t & u^t & v^t \end{pmatrix} \begin{pmatrix} Q & S_1 & S_2 \\ S_1^t & R & 0 \\ S_2^t & 0 & -\Gamma \end{pmatrix} \begin{pmatrix} x \\ u \\ v \end{pmatrix} dt.$$

The matrices of appropriate dimensions, $A$, $B$, $D$, $Q$, $S_i$, $R$, and
$\Gamma$, may all be measurable functions of time. $R$ and $\Gamma$ must be
positive definite with inverses bounded away from zero. To get the most
complete results available, we assume also that $K$ and $Q$ are
nonnegative definite, although this is only necessary for some of the
following results. Detailed results without that assumption were
obtained by Zhang (2005) and Delfour (2005). We chose to set the cross
term in $uv$ in the criterion null; this is to simplify the results and
is not necessary. This problem satisfies Isaacs’ condition (see article
DG) even with nonzero such cross terms.

Using the change of control variables

$$u = \tilde{u} - R^{-1}S_1^t x, \qquad v = \tilde{v} + \Gamma^{-1}S_2^t x,$$

yields a DG with the same structure, with modified matrices $A$ and $Q$,
but without the cross terms in $xu$ and $xv$. (This extends to the case
with nonzero cross terms in $uv$.) Thus, without loss of generality, we
will proceed with $(S_1\ S_2) = (0\ 0)$.

The existence of open-loop and closed-loop solutions to that game is
ruled by two Riccati equations for symmetric matrices $P$ and $P^*$,
respectively, and by a pair of canonical equations that we shall see
later:

$$\dot{P} + PA + A^tP - PBR^{-1}B^tP + PD\Gamma^{-1}D^tP + Q = 0, \qquad P(T) = K, \tag{1}$$

$$\dot{P}^* + P^*A + A^tP^* + P^*D\Gamma^{-1}D^tP^* + Q = 0, \qquad P^*(T) = K. \tag{2}$$

When both Riccati equations have a solution over $[t,T]$, it holds that
in the partial ordering of definiteness,

$$0 \le P(t) \le P^*(t).$$

When the saddle point exists, it is represented by the state feedback
strategies

$$u = \varphi^*(t,x) = -R^{-1}B^tP(t)x, \qquad v = \psi^*(t,x) = \Gamma^{-1}D^tP(t)x. \tag{3}$$

The control functions generated by this pair of
feedbacks will be noted $\hat{u}(\cdot)$ and $\hat{v}(\cdot)$.

Theorem 1

• A sufficient condition for the existence of
a closed-loop saddle point, then given by
$(\varphi^*, \psi^*)$ in (3), is that Eq. (1) has a solution
$P(t)$ defined over $[t_0, T]$.

• A necessary and sufficient condition for the
existence of an open-loop saddle point is that
Eq. (2) has a solution over $[t_0, T]$ (and then so
does (1)). In that case, the pairs $(\hat{u}(\cdot), \hat{v}(\cdot))$,
$(\hat{u}(\cdot), \psi^*)$, and $(\varphi^*, \psi^*)$ are saddle points.

• A necessary and sufficient condition for
$(\varphi^*, \hat{v}(\cdot))$ to be a saddle point is that Eq. (1)
has a solution over $[t_0, T]$.

• In all cases where a saddle point exists, the
Value function is $V(t,x) = \|x\|_{P(t)}^2$.
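
To make the role of Riccati equation (1) concrete, the following minimal sketch integrates its scalar version backward from the terminal condition. It is an illustration under stated assumptions, not part of the cited results: the function name `integrate_game_riccati`, the parameter `gam` (standing for the weight on the maximizer's control $v$), the fixed-step Runge-Kutta scheme, and the blow-up bound are all hypothetical choices.

```python
import math

def integrate_game_riccati(a, b, d, q, r, gam, k, T, t0=0.0, steps=1000, bound=1e8):
    """Integrate the scalar form of Riccati equation (1) backward from T to t0.

    Scalar form: dP/dt = -(2*a*P - (b*b/r - d*d/gam)*P*P + q), with P(T) = k.
    Returns P(t0), or None if |P| leaves the bound, which is treated here
    as a numerical sign of a finite escape time (no solution over [t0, T]).
    """
    def rhs(p):
        return -(2.0 * a * p - (b * b / r - d * d / gam) * p * p + q)

    h = (t0 - T) / steps  # negative step: march backward in time
    p = k
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step
        k1 = rhs(p)
        k2 = rhs(p + 0.5 * h * k1)
        k3 = rhs(p + 0.5 * h * k2)
        k4 = rhs(p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        if not abs(p) < bound:  # also catches NaN
            return None
    return p

# Illustrative data (assumed): with b = 0 the equation reduces to
# dP/dt = -(P**2 + 1), P(T) = 0, whose solution tan(T - t) escapes
# when T - t reaches pi/2.
p_short = integrate_game_riccati(0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, T=1.0)
p_long = integrate_game_riccati(0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, T=2.0)
```

In this assumed example the short horizon yields a finite $P(t_0)$, so the sufficient condition of Theorem 1 holds, whereas the longer horizon crosses the escape time and the sketch reports that no solution exists over the whole interval.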

However, Eq. (1) may fail to have a solution and
a closed-loop saddle point still exists. The precise
necessary condition is as follows: let $X(\cdot)$ and
$Y(\cdot)$ be two square matrix function solutions of
the canonical equations

$$\begin{pmatrix} \dot{X} \\ \dot{Y} \end{pmatrix} = \begin{pmatrix} A & -BR^{-1}B^t + D\Gamma^{-1}D^t \\ -Q & -A^t \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix}, \qquad \begin{pmatrix} X(T) \\ Y(T) \end{pmatrix} = \begin{pmatrix} I \\ K \end{pmatrix}.$$

The matrix $P(t)$ exists for $t \in [t_0,T]$ if and
only if $X(t)$ is invertible over that range, and
then, $P(t) = Y(t)X^{-1}(t)$. Assume that the rank
of $X(t)$ is piecewise constant, and let $X^\dagger(t)$
denote the pseudo-inverse of $X(t)$ and $\mathcal{R}(X(t))$
its range.

Theorem 2 A necessary and sufficient condition
for a closed-loop saddle point to exist, which is
then given by (3) with $P(t) = Y(t)X^\dagger(t)$, is
that

1. $x_0 \in \mathcal{R}(X(t_0))$.

2. For almost all $t \in [t_0, T]$, $\mathcal{R}(D(t)) \subset \mathcal{R}(X(t))$.

3. $\forall t \in [t_0, T]$, $Y(t)X^\dagger(t) \ge 0$.

In a case where $X(t)$ is only singular at an
isolated instant $t^*$ (then conditions 1 and 2 above
are automatically satisfied), called a conjugate
point, but where $YX^{-1}$ remains positive definite
on both sides of it, the conjugate point is
called even. The feedback gain $F = -R^{-1}B^tP$
diverges upon reaching $t^*$, but on a trajectory
generated by this feedback, the control $u(t) = F(t)x(t)$
remains finite. (See an example in Bernhard 1979.)

If $T = \infty$, with all system and payoff matrices
constant and $Q > 0$, Mageirou (1976) has shown
that if the algebraic Riccati equation obtained by
setting $\dot{P} = 0$ in (1) admits a positive definite
solution $P$, the game has a Value $\|x\|_P^2$, but (3)
may not be a saddle point. ($\psi^*$ may not be an
equilibrium strategy.)

H∞-Optimal Control

This subsection is entirely based upon Başar and
Bernhard (1995). It deals with imperfect state
measurement, using Bernhard’s nonlinear minimax
certainty equivalence principle (Bernhard
and Rapaport 1996).

Several problems of robust control may be
brought to the following one: a linear, time-invariant
system with two inputs (control input
$u \in \mathbb{R}^m$ and disturbance input $w \in \mathbb{R}^\ell$) and two
outputs (measured output $y \in \mathbb{R}^p$ and controlled
output $z \in \mathbb{R}^q$) is given. One wishes to control
the system with a nonanticipative controller
$u(\cdot) = \varphi(y(\cdot))$ in order to minimize the induced
linear operator norm, between spaces of square-integrable
functions, of the resulting operator
$w(\cdot) \mapsto z(\cdot)$.

It turns out that the problem which has a
tractable solution is a kind of dual one: given
a positive number $\gamma$, is it possible to make this
norm no larger than $\gamma$? The answer to this question
is yes if and only if

$$\inf_{\varphi} \sup_{w(\cdot) \in L^2} \int_{-\infty}^{\infty} \left( \|z(t)\|^2 - \gamma^2 \|w(t)\|^2 \right) dt \le 0.$$

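We shall extend somewhat this classical problem
by allowing either a time variable system,
with a finite horizon $T$, or a time-invariant system
with an infinite horizon.

The dynamical system is

$$\dot{x} = Ax + Bu + Dw, \tag{4}$$

$$y = Cx + Ew, \tag{5}$$

$$z = Hx + Gu. \tag{6}$$

We let

$$\begin{pmatrix} D \\ E \end{pmatrix} \begin{pmatrix} D^t & E^t \end{pmatrix} = \begin{pmatrix} M & L \\ L^t & N \end{pmatrix}, \qquad \begin{pmatrix} H^t \\ G^t \end{pmatrix} \begin{pmatrix} H & G \end{pmatrix} = \begin{pmatrix} Q & S \\ S^t & R \end{pmatrix},$$

and we assume that $E$ is onto (equivalently, $N > 0$) and that $G$
is one-to-one (equivalently, $R > 0$).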

Finite Horizon 

In this part, we consider a time-varying system,
with all matrix functions measurable. Since the
state is not known exactly, we assume that the
initial state is not known either. The issue is
therefore to decide whether the criterion

$$J_\gamma = \|x(T)\|_K^2 + \int_{t_0}^{T} \left( \|z(t)\|^2 - \gamma^2 \|w(t)\|^2 \right) dt - \gamma^2 \|x_0\|_{\Sigma_0^{-1}}^2 \tag{7}$$

may be kept finite and with which strategy. Let

$$\gamma^* = \inf\Big\{\gamma \;\Big|\; \inf_{\varphi \in \Phi} \; \sup_{x_0 \in \mathbb{R}^n,\, w(\cdot) \in L^2} J_\gamma < \infty \Big\}.$$

Theorem 3 $\gamma > \gamma^*$ if and only if the following
three conditions are satisfied:

1. The following Riccati equation has a solution
over $[t_0, T]$:

$$-\dot{P} = PA + A^tP - (PB + S)R^{-1}(B^tP + S^t) + \gamma^{-2}PMP + Q, \qquad P(T) = K. \tag{8}$$

2. The following Riccati equation has a solution
over $[t_0, T]$:

$$\dot{\Sigma} = A\Sigma + \Sigma A^t - (\Sigma C^t + L)N^{-1}(C\Sigma + L^t) + \gamma^{-2}\Sigma Q\Sigma + M, \qquad \Sigma(t_0) = \Sigma_0. \tag{9}$$

3. The following spectral radius condition is satisfied:

$$\forall t \in [t_0, T], \qquad \rho(\Sigma(t)P(t)) < \gamma^2. \tag{10}$$


In that case, the optimal controller ensuring
$\inf_\varphi \sup_{x_0, w} J_\gamma$ is given by a “worst case state”
$\hat{x}(t)$ satisfying $\hat{x}(0) = 0$ and

$$\dot{\hat{x}} = [A - BR^{-1}(B^tP + S^t) + \gamma^{-2}D(D^tP + L^t)]\hat{x} + (I - \gamma^{-2}\Sigma P)^{-1}(\Sigma C^t + L)(y - C\hat{x}), \tag{11}$$

and the certainty equivalent controller

$$\varphi^*(y(\cdot))(t) = -R^{-1}(B^tP + S^t)\hat{x}(t). \tag{12}$$

Infinite Horizon 

The infinite horizon case is the traditional $H_\infty$-optimal
control problem reformulated in a state
space setting. We let all matrices defining the
system be constant. We take the integral in (7)
from $-\infty$ to $+\infty$, with no initial or terminal term
of course. We add the hypothesis that the pairs
$(A, B)$ and $(A, D)$ are stabilizable and the pairs
$(C, A)$ and $(H, A)$ detectable. Then, the theorem
is as follows:

Theorem 4 $\gamma > \gamma^*$ if and only if the following
conditions are satisfied: The algebraic
Riccati equations obtained by placing $\dot{P} = 0$
and $\dot{\Sigma} = 0$ in (8) and (9) have positive definite
solutions, which satisfy the spectral radius
condition (10). The optimal controller is given
by Eqs. (11) and (12), where $P$ and $\Sigma$ are the
minimal positive definite solutions of the algebraic
Riccati equations, which can be obtained
as the limit of the solutions of the differential
equations as $t \to -\infty$ for $P$ and $t \to \infty$
for $\Sigma$.


Conclusion 

The similarity of the $H_\infty$-optimal control theory
with the LQG, stochastic, theory is in many
respects striking, as is the duality between observation
and control. Yet, the “observer” of $H_\infty$-optimal control
does not arise from some estimation theory
but from the analysis of a “worst case.”
The best explanation might be in the duality of
the ordinary, or $(+, \times)$, algebra with the idempotent
$(\max, +)$ algebra (see Bernhard 1996).
The complete theory of $H_\infty$-optimal control in
that perspective has yet to be written.

Abstract

Feedback is a fundamental mechanism in nature
and central in the control of systems. The state
contains important system information, and applying
a control law that uses state information is
a very powerful control policy. To illustrate the
effect of feedback in linear systems, continuous-time
and discrete-time state variable descriptions
are used: these allow one to write explicitly the
resulting closed-loop descriptions and to study
the effect of feedback on the eigenvalues of the
closed-loop system. The eigenvalue assignment
problem is also discussed.


Keywords 

Feedback; Linear systems; State feedback; State 
variables 


Introduction 

Feedback is a fundamental mechanism arising in
nature. Feedback is also common in engineered
systems and is essential in the automatic control
of dynamic processes with uncertainties in their
model descriptions and in their interactions with
the environment. When feedback is used, the
actual values of the system variables are sensed,
fed back, and used to control the system. That is,
a control law decision process is based not only
on predictions on the system behavior derived
from a process model (as in open-loop control)
but also on information about the actual behavior
(closed-loop feedback control).


Linear Continuous-Time Systems 

Consider, to begin with, time-invariant systems
described by the state variable description

$$\dot{x} = Ax + Bu, \qquad y = Cx + Du, \tag{1}$$

in which $x(t) \in \mathbb{R}^n$ is the state, $u(t) \in \mathbb{R}^m$ is the
input, $y(t) \in \mathbb{R}^p$ is the output, and $A \in \mathbb{R}^{n \times n}$,
$B \in \mathbb{R}^{n \times m}$, $C \in \mathbb{R}^{p \times n}$, $D \in \mathbb{R}^{p \times m}$ are constant
matrices. In this case, the linear state feedback
(lsf) control law is selected as

$$u(t) = Fx(t) + r(t), \tag{2}$$

where $F \in \mathbb{R}^{m \times n}$ is the constant gain matrix and
$r(t) \in \mathbb{R}^m$ is a new external input.

Substituting (2) into (1) yields the closed-loop
state variable description, namely,

$$\dot{x} = (A + BF)x + Br, \qquad y = (C + DF)x + Dr. \tag{3}$$


Appropriately selecting F, primarily to modify 
A + BF, one affects and improves the behavior 
of the system. 
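
As a small numerical illustration of the closed-loop description (3), the sketch below forms $A + BF$ and $C + DF$ with NumPy and compares open-loop and closed-loop eigenvalues. The plant matrices, the gain `F`, and the helper name `closed_loop` are assumptions chosen only for the example.

```python
import numpy as np

def closed_loop(A, B, C, D, F):
    """Return (A + B F, B, C + D F, D), the matrices of the closed-loop description (3)."""
    A, B, C, D, F = (np.atleast_2d(np.asarray(M, dtype=float)) for M in (A, B, C, D, F))
    return A + B @ F, B, C + D @ F, D

# Hypothetical second-order plant with one unstable open-loop eigenvalue.
A = [[0.0, 1.0], [2.0, -1.0]]
B = [[0.0], [1.0]]
C = [[1.0, 0.0]]
D = [[0.0]]
F = [[-4.0, -2.0]]  # illustrative gain, not taken from the article

Acl, Bcl, Ccl, Dcl = closed_loop(A, B, C, D, F)
open_loop_eigs = np.linalg.eigvals(np.asarray(A))   # roots of s^2 + s - 2
closed_loop_eigs = np.linalg.eigvals(Acl)           # roots of s^2 + 3s + 2
```

With this assumed gain, the unstable open-loop eigenvalue is moved into the left half plane while $B$ and $D$ are left unchanged, as (3) indicates.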

A number of comments are in order:

- Feeding back the information from the state $x$
of the system is expected to be, and it is, an effective
way to alter the system behavior. This
is because knowledge of the (initial) state and
the input uniquely determines the system’s
future behavior, and intuitively, using the state
information should be a good way to control
the system, i.e., to modify its behavior.

- In a state feedback control law, the input $u$
can be any function of the state, $u = f(x,r)$,
not necessarily linear with constant gain $F$ as
in (2). Typically, given (1), the control law (2) is selected
as the linear state feedback primarily because
the resulting closed-loop description (3) is
also a linear time-invariant system. However,
depending on the application needs, the state
feedback control law can be more complex than (2).

- Although the Eqs. (3) that describe the closed-loop
behavior are different from Eq. (1), this
does not imply that the system parameters
have changed. The way feedback control acts
is not by actually changing the system parameters
$A$, $B$, $C$, $D$ but by changing $u$
so that the closed-loop system behaves as if the
parameters were changed. When one applies,
say, a step via $r(t)$ in the closed-loop system,
then $u(t)$ in (2) is modified appropriately so
the system behaves in a desired way.

- It is possible to implement $u$ in (2) as an open-loop
control law, namely,

$$u(s) = F[sI - (A + BF)]^{-1}x(0) + [I - F(sI - A)^{-1}B]^{-1}r(s), \tag{4}$$

where Laplace transforms have been used for
notational convenience. Equation (4) produces
exactly the same input as Eq. (2), but it has
the serious disadvantage that it is based exclusively
on prior knowledge on the system
(notably $x(0)$ and parameters $A$, $B$). As a
result, when there are uncertainties (and there
always are), the open-loop control law (4)
may fail, while the closed-loop control law (2)
succeeds.

Analogous definitions exist for continuous-time,
time-varying systems described by the
equations

$$\dot{x} = A(t)x + B(t)u, \qquad y = C(t)x + D(t)u. \tag{5}$$

In this framework, the control law is described
by

$$u = F(t)x + r, \tag{6}$$

and the resulting closed-loop system is

$$\dot{x} = [A(t) + B(t)F(t)]x + B(t)r, \qquad y = [C(t) + D(t)F(t)]x + D(t)r. \tag{7}$$

Linear Discrete-Time Systems

For the discrete-time, time-invariant case, the
system description is

$$x(k+1) = Ax(k) + Bu(k), \qquad y(k) = Cx(k) + Du(k), \tag{8}$$

the linear state feedback control law is defined as

$$u(k) = Fx(k) + r(k), \tag{9}$$

and the closed-loop system is described by

$$x(k+1) = (A + BF)x(k) + Br(k), \qquad y(k) = (C + DF)x(k) + Dr(k). \tag{10}$$

Similarly, for the discrete-time, time-varying case

$$x(k+1) = A(k)x(k) + B(k)u(k), \qquad y(k) = C(k)x(k) + D(k)u(k), \tag{11}$$

the control law is defined as

$$u(k) = F(k)x(k) + r(k), \tag{12}$$

and the resulting closed-loop system is

$$x(k+1) = [A(k) + B(k)F(k)]x(k) + B(k)r(k), \qquad y(k) = [C(k) + D(k)F(k)]x(k) + D(k)r(k). \tag{13}$$
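
The discrete-time closed loop can be simulated directly by iterating the plant equation with the feedback law applied at each step. The sketch below does this for the time-invariant case; the function name `simulate_closed_loop`, the plant, and the gain are assumptions for illustration. The gain is chosen so that $A + BF$ is nilpotent, which drives the state to zero in two steps when $r = 0$.

```python
import numpy as np

def simulate_closed_loop(A, B, F, x0, r_seq):
    """Iterate x(k+1) = A x(k) + B u(k) with u(k) = F x(k) + r(k); return all states."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    F = np.asarray(F, dtype=float)
    x = np.asarray(x0, dtype=float)
    states = [x.copy()]
    for r in r_seq:
        u = F @ x + np.atleast_1d(r)   # linear state feedback control law
        x = A @ x + B @ u              # plant update
        states.append(x.copy())
    return states

# Hypothetical discrete-time double integrator and gain (assumed values).
A = [[1.0, 1.0], [0.0, 1.0]]
B = [[0.0], [1.0]]
F = [[-1.0, -2.0]]   # makes A + B F nilpotent: (A + B F)^2 = 0

states = simulate_closed_loop(A, B, F, x0=[1.0, 1.0], r_seq=[0.0, 0.0, 0.0])
```

Note that the simulation never forms $A + BF$ explicitly: the plant parameters are untouched and only the input sequence changes, which is exactly how feedback acts.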

Selecting the Gain F 

$F$ (or $F(t)$) is selected so that the closed-loop
system has certain desirable properties. Stability
is of course of major importance. Many control
problems are addressed using linear state feedback,
including tracking and regulation, diagonal
decoupling, and disturbance rejection. Here we
shall focus on stability. Stability can be achieved
under appropriate controllability assumptions. In
the time-varying case, one way to determine
such stabilizing $F(t)$ (or $F(k)$) is to use results
from the optimal linear quadratic regulator (LQR)
theory, which yields the “best” $F(t)$ (or $F(k)$) in
some sense.

In the time-invariant case, one can also use an
LQR formulation, but here stabilization is equivalent
to the problem of assigning the $n$ eigenvalues
of $(A + BF)$ in the stable region of the complex
plane. If $\lambda_i$, $i = 1, \ldots, n$, are the eigenvalues of
$A + BF$, then $F$ should be chosen so that, for all
$i = 1, \ldots, n$, the real part of $\lambda_i$ satisfies $\mathrm{Re}(\lambda_i) < 0$ in
the continuous-time case, and the magnitude of
$\lambda_i$ satisfies $|\lambda_i| < 1$ in the discrete-time case. Eigenvalue
assignment is therefore an important problem,
which is discussed hereafter.

Eigenvalue Assignment Problem 

For continuous-time and discrete-time, time-invariant
systems, the eigenvalue assignment
problem can be posed as follows. Given matrices
$A \in \mathbb{R}^{n \times n}$ and $B \in \mathbb{R}^{n \times m}$, find $F \in \mathbb{R}^{m \times n}$ such
that the eigenvalues of $A + BF$ are assigned to
arbitrary, complex conjugate, locations. Note that
the characteristic polynomial of $A + BF$, namely,
$\det(sI - (A + BF))$, has real coefficients,
which implies that any complex eigenvalue is
part of a pair of complex conjugate eigenvalues.

Theorem 1 The eigenvalue assignment problem 
has a solution if and only if the pair (A, B) is 
reachable. 

For single-input systems, that is, for systems with 
m = 1, the eigenvalue assignment problem has 
a simple solution, as illustrated in the following 
statement: 

Proposition 1 Consider system (1) or (8). Let
$m = 1$. Assume that

$$\operatorname{rank} R = n,$$

where

$$R = [B, AB, \ldots, A^{n-1}B],$$

that is, the system is reachable. Let $p(s)$ be a
desired monic polynomial of degree $n$. Then there
is a (unique) linear state feedback gain matrix $F$
such that the characteristic polynomial of $A + BF$
is equal to $p(s)$. Such linear state feedback gain
matrix $F$ is given by

$$F = -[0 \cdots 0\ 1]R^{-1}p(A). \tag{14}$$

Proposition 1 provides a constructive way to
assign the characteristic polynomial, hence the
eigenvalues, of the matrix $A + BF$. Note that, for
low order systems, i.e., if $n = 2$ or $n = 3$, it may
be convenient to compute directly the characteristic
polynomial of $A + BF$ and then compute
$F$ using the principle of identity of polynomials,
i.e., $F$ should be such that the coefficients of
the polynomials $\det(sI - (A + BF))$ and $p(s)$
coincide. Equation (14) is known as Ackermann’s
formula.
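
Ackermann’s formula translates almost line by line into code. The following is a minimal sketch for the single-input case, using the sign convention $A + BF$ of this article; the function name `ackermann_gain` and the double-integrator example are assumptions, and the routine is meant for low-order illustration rather than numerically robust design.

```python
import numpy as np

def ackermann_gain(A, b, poles):
    """Gain F (1 x n) with eig(A + b F) = poles, via F = -[0 ... 0 1] R^{-1} p(A)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    n = A.shape[0]

    # Reachability matrix R = [b, A b, ..., A^(n-1) b]
    cols = [b]
    for _ in range(n - 1):
        cols.append(A @ cols[-1])
    R = np.hstack(cols)
    if np.linalg.matrix_rank(R) < n:
        raise ValueError("the pair (A, b) is not reachable")

    # Coefficients of the desired monic polynomial p(s), highest power first
    coeffs = np.real(np.poly(poles))

    # Evaluate p(A) with Horner's rule
    pA = np.zeros_like(A)
    for c in coeffs:
        pA = pA @ A + c * np.eye(n)

    last_row = np.zeros((1, n))
    last_row[0, -1] = 1.0
    return -last_row @ np.linalg.solve(R, pA)

# Hypothetical example: double integrator, desired eigenvalues -1 and -2.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
b = np.array([[0.0], [1.0]])
F = ackermann_gain(A, b, [-1.0, -2.0])
assigned = np.linalg.eigvals(A + b @ F)
```

For this assumed example, $p(s) = s^2 + 3s + 2$ and the formula gives $F = [-2\ \ {-3}]$, which can be checked by hand with the identity-of-polynomials approach mentioned above.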

The result summarized in Proposition 1 can be 
extended to multi-input systems. 

Proposition 2 Consider system (1) or (8). Suppose

$$\operatorname{rank} R = n,$$

that is, the system is reachable. Let $p(s)$ be a
monic polynomial of degree $n$. Then there is a
linear state feedback gain matrix $F$ such that the
characteristic polynomial of $A + BF$ is equal
to $p(s)$.

Note that in the case $m > 1$ the linear state
feedback gain matrix $F$ assigning the characteristic
polynomial of the matrix $A + BF$ is not
unique. To compute such a gain matrix, one may
exploit the following fact:

Lemma 1 Consider system (1). Suppose

$$\operatorname{rank} R = n,$$

that is, the system is reachable. Let $b_i$ be a
nonzero column of the matrix $B$. Then there is a
matrix $G$ such that the single-input system

$$\dot{x} = (A + BG)x + b_i v \tag{15}$$

is reachable. Similar results are true for discrete-time
systems (8).

Exploiting Lemma 1, it is possible to design a
matrix $F$ such that the characteristic polynomial
of $A + BF$ equals some monic polynomial $p(s)$
of degree $n$ in two steps. First, we compute a matrix
$G$ such that the system (15) is reachable, and
then we use Ackermann’s formula to compute a
linear state feedback gain matrix $F$ such that the
characteristic polynomial of

$$A + BG + b_i F$$

is $p(s)$. Note also that if $(A, B)$ is reachable,
under mild conditions on $A$, there exists a vector
$g$ so that $(A, Bg)$ is reachable.

There are many other methods to assign the 
eigenvalues which may be found in the references 
below. 


Transfer Functions 

If $H_F(s)$ is the transfer function matrix of the
closed-loop system (3), it is of interest to find its
relation to the open-loop transfer function $H(s)$
of (1). It can be shown that

$$H_F(s) = H(s)[I - F(sI - A)^{-1}B]^{-1} = H(s)[F(sI - (A + BF))^{-1}B + I].$$

In the single-input, single-output case, it can
be readily shown that the linear state feedback
control law (2) only changes the coefficients
of the denominator polynomial in the transfer
function (this result is also true in the multi-input,
multi-output case). Therefore, if any of
the (stable) zeros of $H(s)$ need to be changed,
the only way to accomplish this via linear state
feedback is by pole-zero cancelation (assigning
closed-loop poles at the open-loop zero locations;
in the MIMO case, closed-loop eigenvalue directions
also need to be assigned for cancelations to
take place). Note that it is impossible to change
the unstable zeros of $H(s)$ under stability, since
they would have to be canceled with unstable
poles.
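
The relation between the closed-loop and open-loop transfer functions can be spot-checked numerically by evaluating both sides at a complex test point. The sketch below does so with NumPy; the plant, the gain, the test point `s`, and the helper name `transfer_function` are all assumptions made for the illustration.

```python
import numpy as np

def transfer_function(A, B, C, D, s):
    """Evaluate C (sI - A)^{-1} B + D at the complex point s."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B) + D

# Hypothetical plant and gain, chosen only for illustration.
A = np.array([[0.0, 1.0], [2.0, -1.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.5]])
F = np.array([[-4.0, -2.0]])
s = 1.0 + 2.0j   # arbitrary test point, not a pole of either system

n, m = A.shape[0], B.shape[1]
H = transfer_function(A, B, C, D, s)                    # open loop, from (1)
H_F = transfer_function(A + B @ F, B, C + D @ F, D, s)  # closed loop, from (3)

# First form: H(s) [I - F (sI - A)^{-1} B]^{-1}
open_term = np.linalg.solve(s * np.eye(n) - A, B)
form_1 = H @ np.linalg.inv(np.eye(m) - F @ open_term)

# Second form: H(s) [F (sI - (A + BF))^{-1} B + I]
closed_term = np.linalg.solve(s * np.eye(n) - (A + B @ F), B)
form_2 = H @ (F @ closed_term + np.eye(m))
```

Agreement at one point is of course only a sanity check of the identity, not a proof; evaluating at several points or symbolically would give more confidence.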

Observer-Based Dynamic Controllers 

When the state $x$ is not available for feedback, an
asymptotic estimator (a Luenberger observer) is
typically used to estimate the state. The estimate
$\hat{x}$ of the state, instead of the actual $x$, is then used
in (2) to control the system, in what is known as
the certainty equivalence architecture.


Summary 

The notion of state feedback for linear systems
has been discussed. It has been shown that state
feedback modifies the closed-loop behavior. The
related problem of eigenvalue assignment has
been discussed, and its connection with the reachability
(controllability) properties of the system
has been highlighted. The class of feedback laws
considered is the simplest possible one. If additional
constraints on the input signal, or on the
closed-loop performance, are imposed, then one
perhaps has to resort to nonlinear state feedback,
for example, if the input signal is bounded in
amplitude or rate. If constraints such as decoupling
of the systems into $m$ noninteracting subsystems
or tracking under asymptotic stability are
imposed, then dynamic state feedback may be
necessary.

Abstract

An important input-output description of a linear
continuous-time system is its impulse response,
which is the response $h(t,\tau)$ to an impulse applied
at time $\tau$. In time-invariant systems that are
also causal and at rest at time zero, the impulse
response is $h(t, 0)$ and its Laplace transform is
the transfer function of the system. Expressions
for $h(t,\tau)$ when the system is described by state-variable
equations are also derived.

Keywords 

Continuous-time; Impulse response descriptions; 
Linear systems; Time-invariant; Time-varying; 
Transfer function descriptions 

Introduction 

Consider linear continuous-time dynamical systems,
the input-output behavior of which can
be described by an integral representation of the
form

$$y(t) = \int_{-\infty}^{+\infty} H(t,\tau)u(\tau)\,d\tau \tag{1}$$

where $t, \tau \in \mathbb{R}$, the output is $y(t) \in \mathbb{R}^p$, the input
is $u(t) \in \mathbb{R}^m$, and $H : \mathbb{R} \times \mathbb{R} \to \mathbb{R}^{p \times m}$ is assumed
to be integrable. For instance, any system in state-variable
form

$$\dot{x} = A(t)x + B(t)u, \qquad y = C(t)x + D(t)u \tag{2}$$

or

$$\dot{x} = Ax + Bu, \qquad y = Cx + Du \tag{3}$$

also has a representation of the form (1) as we
shall see below.

Note that it is assumed that at $\tau = -\infty$, the
system is at rest. $H(t,\tau)$ is the impulse response
matrix of the system (1). To explain, consider first
a single-input single-output system:

$$y(t) = \int_{-\infty}^{+\infty} h(t,\tau)u(\tau)\,d\tau, \tag{4}$$

and recall that if $\delta(\hat{t} - \tau)$ denotes an impulse (delta
or Dirac) function applied at time $\tau = \hat{t}$, then for
a function $f(t)$,

$$f(\hat{t}) = \int_{-\infty}^{+\infty} f(\tau)\delta(\hat{t} - \tau)\,d\tau. \tag{5}$$

If now in (4) $u(\tau) = \delta(\hat{t} - \tau)$, that is, an impulse
input is applied at $\tau = \hat{t}$, then the output $y_I(t)$ is

$$y_I(t) = h(t, \hat{t}),$$

i.e., $h(t, \hat{t})$ is the output at time $t$ when an impulse
is applied at the input at time $\hat{t}$. So in (4), $h(t,\tau)$
is the response at time $t$ to an impulse applied
at time $\tau$. Clearly if the impulse response $h(t,\tau)$
is known, the response to any input $u(t)$ can be
derived via (4), and so $h(t,\tau)$ is an input/output
description of the system.

Equation (1) is a generalization of (4) for the
multi-input, multi-output case. If we let all the
components of $u(\tau)$ in (1) be zero except the $j$th
component, then

$$y_i(t) = \int_{-\infty}^{+\infty} h_{ij}(t,\tau)u_j(\tau)\,d\tau, \tag{6}$$

so $h_{ij}(t,\tau)$ denotes the response of the $i$th component
of the output of system (1) at time $t$ due to
an impulse applied to the $j$th component of the
input at time $\tau$ with all remaining components
of the input being zero. $H(t,\tau) = [h_{ij}(t,\tau)]$ is
called the impulse response matrix of the system.

If it is known that system (1) is causal, then
the output will be zero before an input is applied.
Therefore,

$$H(t,\tau) = 0, \quad \text{for } t < \tau, \tag{7}$$

and (1) becomes

$$y(t) = \int_{-\infty}^{t} H(t,\tau)u(\tau)\,d\tau. \tag{8}$$

Rewrite (8) as

$$y(t) = \int_{-\infty}^{t_0} H(t,\tau)u(\tau)\,d\tau + \int_{t_0}^{t} H(t,\tau)u(\tau)\,d\tau = y(t_0) + \int_{t_0}^{t} H(t,\tau)u(\tau)\,d\tau. \tag{9}$$

If (1) is at rest at $t = t_0$ (i.e., if $u(t) = 0$ for
$t \ge t_0$, then $y(t) = 0$ for $t \ge t_0$), $y(t_0) = 0$ and
(9) becomes

$$y(t) = \int_{t_0}^{t} H(t,\tau)u(\tau)\,d\tau. \tag{10}$$

If in addition system (1) is time-invariant, then
$H(t,\tau) = H(t - \tau, 0)$ (also written as $H(t - \tau)$)
since only the elapsed time $(t - \tau)$ from the
application of the impulse is important. Then (10)
becomes

$$y(t) = \int_{0}^{t} H(t - \tau)u(\tau)\,d\tau, \quad t \ge 0, \tag{11}$$

where we chose $t_0 = 0$ without loss of generality.
Equation (11) is the description for causal, time-invariant
systems, at rest at $t = 0$.

Equation (11) is a convolution integral and
if we take the (one-sided or unilateral) Laplace
transform of both sides,

$$\hat{y}(s) = \hat{H}(s)\hat{u}(s), \tag{12}$$

where $\hat{y}(s)$, $\hat{u}(s)$ are the Laplace transforms of
$y(t)$, $u(t)$ and $\hat{H}(s)$ is the Laplace transform of
the impulse response $H(t)$. $\hat{H}(s)$ is the transfer
function matrix of the system. Note that the transfer
function of a linear, time-invariant system is
typically defined as the rational matrix $\hat{H}(s)$ that
satisfies (12) for any input and its corresponding
output assuming zero initial conditions, which is
of course consistent with the above analysis.
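
The convolution description (11) can be evaluated numerically once an impulse response is given. The sketch below approximates the integral with the trapezoidal rule on a uniform grid for a scalar example; the first-order impulse response $h(t) = e^{-at}$, the unit-step input, the grid, and the function name `convolve_response` are assumptions chosen so that the result can be compared with the known step response $(1 - e^{-at})/a$.

```python
import numpy as np

def convolve_response(h, u, t_grid):
    """Approximate y(t) = int_0^t h(t - tau) u(tau) dtau on a uniform grid (trapezoidal rule)."""
    dt = t_grid[1] - t_grid[0]
    y = np.zeros_like(t_grid)
    for i, t in enumerate(t_grid):
        tau = t_grid[: i + 1]
        f = h(t - tau) * u(tau)
        # Trapezoidal rule: full sum minus half of the two end points
        y[i] = dt * (f.sum() - 0.5 * (f[0] + f[-1]))
    return y

a = 2.0                                    # assumed decay rate
t = np.linspace(0.0, 3.0, 3001)
impulse_response = lambda x: np.exp(-a * x)
unit_step = lambda x: np.ones_like(x)

y = convolve_response(impulse_response, unit_step, t)
y_exact = (1.0 - np.exp(-a * t)) / a       # inverse Laplace transform of 1 / (s (s + a))
max_error = np.max(np.abs(y - y_exact))
```

The comparison value comes from (12): for this assumed system the transfer function is $1/(s + a)$ and the step has transform $1/s$, so the product inverts to the closed-form response used above.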


Connection to State-Variable Descriptions

When a system is described by the state-variable
description (2), then

$$y(t) = \int_{t_0}^{t} [C(t)\Phi(t,\tau)B(\tau) + D(t)\delta(t - \tau)]u(\tau)\,d\tau, \tag{13}$$

where it was assumed that $x(t_0) = 0$, i.e., the
system is at rest at $t_0$. Here $\Phi(t,\tau)$ is the state
transition matrix of the system defined by the
Peano-Baker series:

$$\Phi(t,t_0) = I + \int_{t_0}^{t} A(\tau_1)\,d\tau_1 + \int_{t_0}^{t} A(\tau_1)\left[\int_{t_0}^{\tau_1} A(\tau_2)\,d\tau_2\right] d\tau_1 + \cdots;$$

see ► Linear Systems: Continuous-Time, Time-Varying
State Variable Descriptions.

Comparing (13) with (10), the impulse response is

$$H(t,\tau) = \begin{cases} C(t)\Phi(t,\tau)B(\tau) + D(t)\delta(t - \tau), & t \ge \tau, \\ 0, & t < \tau. \end{cases} \tag{14}$$

Similarly, when the system is time-invariant
and is described by (3),

$$y(t) = \int_{t_0}^{t} [Ce^{A(t-\tau)}B + D\delta(t - \tau)]u(\tau)\,d\tau, \tag{15}$$

where $x(t_0) = 0$.

Comparing (15) with (11), the impulse response is

$$H(t - \tau) = \begin{cases} Ce^{A(t-\tau)}B + D\delta(t - \tau), & t \ge \tau, \\ 0, & t < \tau, \end{cases} \tag{16}$$

or as it is commonly written (taking the time
when the impulse is applied to be zero, $\tau = 0$)

$$H(t) = \begin{cases} Ce^{At}B + D\delta(t), & t \ge 0, \\ 0, & t < 0. \end{cases} \tag{17}$$

Take now the (one-sided or unilateral) Laplace
transform of both sides in (17) to obtain

$$\hat{H}(s) = C(sI - A)^{-1}B + D, \tag{18}$$

which is the transfer function matrix in terms
of the coefficient matrices in the state-variable
description (3). Note that (18) can also be derived
directly from (3) by assuming zero initial conditions
($x(0) = 0$) and taking Laplace transform of
both sides.

Finally, it is easy to show that equivalent
state-variable descriptions give rise to the same
impulse responses.

Summary 

The continuous-time impulse response is an
external, input-output description of linear,
continuous-time systems. When the system is
time-invariant, the Laplace transform of the
impulse response $h(t, 0)$ (which is the output
response at time $t$ due to an impulse applied at
time zero with initial conditions taken to be zero)
is the transfer function of the system - another
very common input-output description. The
relationships with the state-variable descriptions
are shown.

Recommended Reading

External or input-output descriptions such as the 
impulse response and the transfer function (in 
the time-invariant case) are described in several 
textbooks below. 


Bibliography 

Antsaklis PJ, Michel AN (2006) Linear systems.
Birkhäuser, Boston

DeCarlo RA (1989) Linear systems. Prentice-Hall, Englewood Cliffs

Kailath T (1980) Linear systems. Prentice-Hall, Englewood Cliffs

Rugh WJ (1996) Linear systems theory, 2nd edn. Prentice-Hall,
Englewood Cliffs

Sontag ED (1990) Mathematical control theory: deterministic
finite dimensional systems. Texts in applied
mathematics, vol 6. Springer, New York


Linear Systems: Continuous-Time, 
Time-Invariant State Variable 
Descriptions 

Panos J. Antsaklis 

Department of Electrical Engineering, University 
of Notre Dame, Notre Dame, IN, USA 

Synonyms 

LTI Systems 


Abstract 

Continuous-time processes that can be modeled
by linear differential equations with constant
coefficients can also be described in a systematic
way in terms of state variable descriptions of
the form $\dot{x}(t) = Ax(t) + Bu(t)$, $y(t) = Cx(t) + Du(t)$.
The response of such systems due
to a given input and a set of initial conditions is
derived and expressed in terms of the variation of
constants formula. Equivalence of state variable
descriptions is also discussed.

Keywords 

Continuous-time; Linear systems; State variable 
descriptions; Time-invariant 

Introduction 

Linear, continuous-time systems are of great interest
because they model, exactly or approximately,
the behavior over time of many practical
physical systems of interest. We are particularly
interested in systems, the behavior of which is described
by linear, ordinary differential equations
with constant coefficients.

Such descriptions can always be rewritten as a
set of first-order differential equations, typically
in the following convenient state variable form:

$$\dot{x}(t) = Ax(t) + Bu(t), \quad y(t) = Cx(t) + Du(t); \quad x(0) = x_0, \tag{1}$$

where $x(t)$, the state vector, is a column vector of
dimension $n$ ($x(t) \in \mathbb{R}^n$) and $\dot{x}(t) = dx(t)/dt$ with
the derivative being taken element by element.
$A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $C \in \mathbb{R}^{p \times n}$, $D \in \mathbb{R}^{p \times m}$ are
matrices with real entries (these are the constant
coefficients that make the system time invariant);
and $u(t) \in \mathbb{R}^m$, $y(t) \in \mathbb{R}^p$ are the inputs and
outputs of the system. The vector differential
equation is the state equation and the algebraic
equation is the output equation.

The advantage of the above state variable
description is that for given input $u(t)$ and initial
condition $x(0)$, its solution (state and output motions
or trajectories) can be conveniently and systematically
characterized. This is shown below.

Deriving State Variable Descriptions

Description (1) may be derived directly, by modeling
the behavior of a linear, continuous-time,
time-invariant system, but more often it is derived
either from the linearization of a nonlinear
equation around an operating point or a trajectory
or from higher-order differential equations that
model the system’s behavior. The example below
illustrates the latter case.



Consider a spring-mass example, where a mass $m$ slides horizontally on a surface with damping coefficient $b$ due to friction and is attached to a wall by a linear spring of spring constant $k$. If $y(t)$ denotes the distance of the center of the mass from a position of rest of the spring, then applying Newton's law gives the following second-order linear ordinary differential equation with constant coefficients:

$$ m\ddot{y}(t) + b\dot{y}(t) + ky(t) = u(t). \tag{2} $$

Here $\dot{y}(t) = dy(t)/dt$. The motion of the mass $y(t)$, $t \geq 0$ is uniquely determined if the applied force $u(t)$, $t \geq 0$ is known and at $t = 0$ the initial position $y(0) = y_0$ and initial velocity $\dot{y}(0) = y_1$ are given. To obtain a state variable description, introduce the state variables $x_1$ and $x_2$ as

$$ x_1(t) = y(t), \quad x_2(t) = \dot{y}(t) $$

to obtain the set of first-order differential equations $m\dot{x}_2(t) + bx_2(t) + kx_1(t) = u(t)$ and $\dot{x}_1(t) = x_2(t)$, which can be rewritten in the form of (1):

$$ \begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -\frac{k}{m} & -\frac{b}{m} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} 0 \\ \frac{1}{m} \end{bmatrix} u(t) \tag{3} $$

and

$$ y(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} $$

with

$$ \begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = \begin{bmatrix} y_0 \\ y_1 \end{bmatrix} $$

as initial conditions. This is of the form (1) where $x(t)$ is a 2-dimensional column vector; $A$ is a $2 \times 2$ matrix; $B$ and $C$ are 2-dimensional column and row vectors, respectively; and $x(0) = x_0$.

Notes:

1. It is always possible to obtain a state variable description which is equivalent to a given set of higher-order differential equations.

2. The choice of the state variables, here $x_1$ and $x_2$, is not unique. Different choices will lead to different $A$, $B$, and $C$.

3. The number of the state variables is typically equal to the order of the set of the higher-order differential equations and equals the number of initial conditions needed to derive the unique solution; in the above example this number is 2.

4. In time-invariant systems, it can be assumed without loss of generality that the starting time is $t = 0$ and so the initial conditions are taken to be $x(0) = x_0$.
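The spring-mass description (3) can be checked numerically. The following is a minimal sketch in plain Python; the function names and the numerical values of $m$, $b$, $k$, the state, and the input are illustrative assumptions, not part of the original example. It builds $A$, $B$, $C$ and confirms that the second state derivative satisfies the original equation (2).

```python
def spring_mass_matrices(m, b, k):
    """Return (A, B, C) of description (3) for m*y'' + b*y' + k*y = u,
    with state variables x1 = y and x2 = y'."""
    A = [[0.0, 1.0],
         [-k / m, -b / m]]
    B = [[0.0],
         [1.0 / m]]
    C = [[1.0, 0.0]]
    return A, B, C


def state_derivative(A, B, x, u):
    """Evaluate x' = A x + B u for a state list x and a scalar input u."""
    n = len(x)
    return [sum(A[i][j] * x[j] for j in range(n)) + B[i][0] * u
            for i in range(n)]


# Assumed illustrative values: m = 2, b = 0.5, k = 8.
m, b, k = 2.0, 0.5, 8.0
A, B, C = spring_mass_matrices(m, b, k)

x = [0.1, -0.3]   # x1 = y (position), x2 = y' (velocity)
u = 1.5           # applied force
dx = state_derivative(A, B, x, u)

# dx[1] is y''; substituting it into (2) should leave no residual.
residual = m * dx[1] + b * x[1] + k * x[0] - u
```

Here `dx[0]` equals the velocity `x[1]`, and `residual` is zero up to rounding, which is what the change of variables requires.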


Solving $\dot{x} = Ax$; $x(0) = x_0$

Consider the homogeneous equation

$$ \dot{x} = Ax; \quad x(0) = x_0 \tag{4} $$

where $x(t) = [x_1(t), \ldots, x_n(t)]^T$ is the state vector of dimension $n$ and $A$ is an $n \times n$ matrix with real entries (i.e., $A \in \mathbb{R}^{n \times n}$).

Equation (4) is a special case of (1) where there are no inputs and outputs, $u$ and $y$. The homogeneous vector differential equation (4) will be solved first, and its solution will be used to find the solution of (1).

Solving (4) is an initial value problem. It can be shown that there always exists a unique solution $\varphi(t)$ such that

$$ \dot{\varphi}(t) = A\varphi(t); \quad \varphi(0) = x_0. $$

To find the unique solution, consider first the one-dimensional case, namely,

$$ \dot{y}(t) = ay(t); \quad y(0) = y_0 $$

the unique solution of which is

$$ y(t) = e^{at}y_0, \quad t \geq 0. $$

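The scalar exponential $e^{at}$ can be expressed in a series form

$$ e^{at} = \sum_{k=0}^{\infty} \frac{t^k}{k!} a^k \quad \left(= 1 + \frac{1}{1}ta + \frac{1}{2}t^2a^2 + \frac{1}{6}t^3a^3 + \ldots\right) $$

The generalization to the $n \times n$ matrix exponential ($A$ is $n \times n$) is given by

$$ e^{At} = \sum_{k=0}^{\infty} \frac{t^k}{k!} A^k \quad \left(= I_n + At + \frac{1}{2}A^2t^2 + \ldots\right) \tag{5} $$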

By analogy, let the solution to (4) be

$$ (x(t) =)\; \varphi(t) = e^{At}x_0 \tag{6} $$

It is a solution since if it is substituted into (4),

$$ \dot{\varphi}(t) = \left[A + A^2t + \frac{1}{2}A^3t^2 + \ldots\right]x_0 = Ae^{At}x_0 = A\varphi(t) $$

and $\varphi(0) = e^{A \cdot 0}x_0 = x_0$, that is, it satisfies the equation and the initial condition. Since the solution of (4) is unique, (6) is the unique solution of (4).

The solution (6) can be derived more formally using the Peano-Baker series (see ► Linear Systems: Continuous-Time, Time-Varying State Variable Descriptions), which in the present time-invariant case becomes the defining series for the matrix exponential (5).
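The defining series (5) can be evaluated directly for small problems. The sketch below, in plain Python, truncates the series after a fixed number of terms; the helper names, the number of terms, and the test matrix are assumptions chosen for illustration. A truncated series is not a robust general-purpose algorithm when the entries of $At$ are large, so this is only a check of the formula. For $A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ we have $A^2 = -I$, so $e^{At} = I\cos t + A\sin t$.

```python
import math


def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def expm_series(A, t, terms=30):
    """Approximate e^{At} by the first `terms` terms of the series (5)."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # holds (At)^k / k!, starting at k = 0
    for k in range(1, terms):
        term = mat_mul(term, A)
        term = [[entry * t / k for entry in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


# Assumed test case: A^2 = -I, so e^{At} = I cos(t) + A sin(t).
A = [[0.0, 1.0], [-1.0, 0.0]]
t = 0.7
E = expm_series(A, t)
expected = [[math.cos(t), math.sin(t)],
            [-math.sin(t), math.cos(t)]]
max_error = max(abs(E[i][j] - expected[i][j])
                for i in range(2) for j in range(2))
```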

System Response 

Based on the solution of the homogeneous equation (4), shown in (6), the solution of the state equation in (1) can be shown to be

$$ x(t) = e^{At}x_0 + \int_0^t e^{A(t-\tau)}Bu(\tau)\,d\tau. \tag{7} $$

The following properties for the matrix exponential $e^{At}$ can be shown directly from the defining series:

1. $Ae^{At} = e^{At}A$.

2. $(e^{At})^{-1} = e^{-At}$.

Equation (7), which is known as the variation of constants formula, can be derived as follows:

Consider $\dot{x} = Ax + Bu$ and let $z(t) = e^{-At}x(t)$. Then $x(t) = e^{At}z(t)$ and substituting,

$$ Ae^{At}z(t) + e^{At}\dot{z}(t) = Ae^{At}z(t) + Bu(t) $$

or $\dot{z}(t) = e^{-At}Bu(t)$, from which

$$ z(t) - z(0) = \int_0^t e^{-A\tau}Bu(\tau)\,d\tau $$

or

$$ e^{-At}x(t) - x(0) = \int_0^t e^{-A\tau}Bu(\tau)\,d\tau $$

or

$$ x(t) = e^{At}x_0 + \int_0^t e^{A(t-\tau)}Bu(\tau)\,d\tau $$

which is the variation of constants formula (7).

Equation (7) is the sum of two parts, the state response (when $u(t) = 0$ and the system is driven only by the initial state conditions) and the input response (when $x_0 = 0$ and the system is driven only by the input $u(t)$); this illustrates the linear system principle of superposition.
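The two parts of (7) can be computed separately. The following is a minimal sketch in plain Python for an assumed example, the double integrator $A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$, $B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$, for which $A^2 = 0$ and hence $e^{At} = I + At$. The integral is approximated with the midpoint rule; the function names, step count, initial state, and input are illustrative choices.

```python
def exp_At(t):
    """e^{At} for the double integrator A = [[0, 1], [0, 0]] (A^2 = 0)."""
    return [[1.0, t], [0.0, 1.0]]


B = [0.0, 1.0]


def response(x0, u, t, steps=200):
    """Evaluate (7): return (state response, input response) at time t.
    The convolution integral is approximated by the midpoint rule."""
    E = exp_At(t)
    state_part = [E[0][0] * x0[0] + E[0][1] * x0[1],
                  E[1][0] * x0[0] + E[1][1] * x0[1]]
    h = t / steps
    input_part = [0.0, 0.0]
    for i in range(steps):
        tau = (i + 0.5) * h
        Ed = exp_At(t - tau)
        for r in range(2):
            input_part[r] += h * (Ed[r][0] * B[0] + Ed[r][1] * B[1]) * u(tau)
    return state_part, input_part


def unit_step(tau):
    """Assumed input: u(t) = 1 for all t >= 0."""
    return 1.0


state_part, input_part = response([1.0, -2.0], unit_step, 3.0)
x = [state_part[0] + input_part[0], state_part[1] + input_part[1]]
```

For this example the exact answer follows from $\ddot{y} = 1$, $y(0) = 1$, $\dot{y}(0) = -2$: at $t = 3$ the state response is $[-5, -2]^T$, the input response is $[4.5, 3]^T$, and their sum is $[-0.5, 1]^T$.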

If the output equation $y(t) = Cx(t) + Du(t)$ is considered, then in view of (7),

$$ \begin{aligned} y(t) &= Ce^{At}x_0 + \int_0^t Ce^{A(t-\tau)}Bu(\tau)\,d\tau + Du(t) \\ &= Ce^{At}x_0 + \int_0^t \left[Ce^{A(t-\tau)}B + D\delta(t-\tau)\right]u(\tau)\,d\tau \end{aligned} \tag{8} $$

The second expression involves the Dirac (or impulse or delta) function $\delta(t)$, and it is derived based on the basic property for $\delta(t)$, namely,

$$ f(t) = \int_{-\infty}^{+\infty} \delta(t-\tau)f(\tau)\,d\tau. $$

It is clear that the matrix exponential $e^{At}$ plays a central role in determining the response of a linear continuous-time, time-invariant system described by (1).

Given $A$, $e^{At}$ may be determined using several methods including its defining series, diagonalization of $A$ using a similarity transformation ($PAP^{-1}$), the Cayley-Hamilton theorem, using expressions involving the modes of the system ($e^{At} = \sum_{i=1}^{n} A_i e^{\lambda_i t}$ when $A$ has $n$ distinct eigenvalues $\lambda_i$; $A_i = v_i\tilde{v}_i$ with $v_i$, $\tilde{v}_i$ the right and left eigenvectors of $A$ that correspond to $\lambda_i$ ($\tilde{v}_i v_j = 1$, $i = j$ and $\tilde{v}_i v_j = 0$, $i \neq j$)), or using the Laplace transform ($e^{At} = \mathcal{L}^{-1}[(sI - A)^{-1}]$). See references below for detailed algorithms.


Equivalent State Variable Descriptions

Given

$$ \dot{x} = Ax + Bu, \quad y = Cx + Du \tag{9} $$

consider the new state vector $\tilde{x}$ where

$$ \tilde{x} = Px $$

with $P$ a real nonsingular matrix. Substituting $x = P^{-1}\tilde{x}$ in (9), we obtain

$$ \dot{\tilde{x}} = \tilde{A}\tilde{x} + \tilde{B}u, \quad y = \tilde{C}\tilde{x} + \tilde{D}u, \tag{10} $$

where

$$ \tilde{A} = PAP^{-1}, \quad \tilde{B} = PB, \quad \tilde{C} = CP^{-1}, \quad \tilde{D} = D. $$

The state variable descriptions (9) and (10) are called equivalent and $P$ is the equivalence transformation. This transformation corresponds to a change in the basis of the state space, which is a vector space. Appropriately selecting $P$, one can simplify the structure of $\tilde{A}$ ($= PAP^{-1}$); the matrices $A$ and $\tilde{A}$ are called similar. When the eigenvectors of $A$ are all linearly independent (this is the case, e.g., when all eigenvalues $\lambda_i$ of $A$ are distinct), then $P$ may be found so that $\tilde{A}$ is diagonal. When $e^{At}$ is to be determined, and $\tilde{A} = PAP^{-1} = \mathrm{diag}[\lambda_i]$ ($A$ and $\tilde{A}$ have the same eigenvalues), then

$$ e^{At} = e^{P^{-1}\tilde{A}Pt} = P^{-1}e^{\tilde{A}t}P = P^{-1}\,\mathrm{diag}[e^{\lambda_i t}]\,P. $$

Note that it can be easily shown that equivalent state space representations give rise to the same impulse response and transfer function (see ► Linear Systems: Continuous-Time Impulse Response Descriptions).
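The diagonalizing transformation and the invariance of the impulse response can be illustrated on a small example. The sketch below is plain Python; the matrices $A$, $B$, $C$ and the helper names are assumptions chosen so that the eigenvalues ($-1$ and $-2$) and eigenvectors are easy to verify by hand. It forms $\tilde{A} = PAP^{-1}$, computes $e^{At} = P^{-1}\,\mathrm{diag}[e^{\lambda_i t}]\,P$, and compares $Ce^{At}B$ with $\tilde{C}e^{\tilde{A}t}\tilde{B}$.

```python
import math


def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def inv2(M):
    """Inverse of a nonsingular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Assumed example with eigenvalues -1 and -2.
A = [[0.0, 1.0], [-2.0, -3.0]]
B = [[0.0], [1.0]]
C = [[1.0, 0.0]]

P_inv = [[1.0, 1.0], [-1.0, -2.0]]   # columns are right eigenvectors of A
P = inv2(P_inv)

A_t = mat_mul(mat_mul(P, A), P_inv)  # P A P^{-1}: diagonal for this P
B_t = mat_mul(P, B)
C_t = mat_mul(C, P_inv)


def exp_diag(t):
    """e^{A_t t} for the diagonal matrix A_t."""
    return [[math.exp(A_t[0][0] * t), 0.0],
            [0.0, math.exp(A_t[1][1] * t)]]


def exp_At(t):
    """e^{At} = P^{-1} diag[e^{lambda_i t}] P."""
    return mat_mul(mat_mul(P_inv, exp_diag(t)), P)


t = 0.4
h_original = mat_mul(mat_mul(C, exp_At(t)), B)[0][0]
h_equivalent = mat_mul(mat_mul(C_t, exp_diag(t)), B_t)[0][0]
```

Both quantities equal $e^{-t} - e^{-2t}$ for this example, so the two descriptions are indistinguishable from the input-output point of view.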


Summary

State variable descriptions for continuous-time time-invariant systems are introduced and the state and output responses to inputs and initial conditions are derived. Equivalence of state variable representations is also discussed.


Cross-References

► Linear Systems: Continuous-Time Impulse Response Descriptions

► Linear Systems: Continuous-Time, Time-Varying State Variable Descriptions

► Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions

Recommended Reading

The state variable description of systems received wide acceptance in systems theory beginning in the late 1950s. This was primarily due to the work of R.E. Kalman and others in filtering theory and quadratic control theory and to the work of applied mathematicians concerned with the stability theory of dynamical systems. For comments and extensive references on some of the early contributions in these areas, see Kailath (1980) and Sontag (1990). The use of state variable descriptions in systems and control opened the way for the systematic study of systems with multi-inputs and multi-outputs.

Linear Systems: Continuous-Time, Time-Varying State Variable Descriptions

Panos J. Antsaklis
of Notre Dame, Notre Dame, IN, USA

Abstract

Continuous-time processes that can be modeled by linear differential equations with time-varying coefficients can be written in terms of state variable descriptions of the form $\dot{x}(t) = A(t)x(t) + B(t)u(t)$, $y(t) = C(t)x(t) + D(t)u(t)$. The response of such systems due to a given input and initial conditions is derived using the Peano-Baker series. Equivalence of state variable descriptions is also discussed.

Keywords 

Continuous-time; Linear systems; State variable 
descriptions; Time-varying 


Introduction 

Dynamical processes that can be described or approximated by linear high-order ordinary differential equations with time-varying coefficients can also be described, via a change of variables, by state variable descriptions of the form

$$ \begin{aligned} \dot{x}(t) &= A(t)x(t) + B(t)u(t); \quad x(t_0) = x_0 \\ y(t) &= C(t)x(t) + D(t)u(t), \end{aligned} \tag{1} $$

where $x(t)$ ($t \in \mathbb{R}$, the set of reals) is a column vector of dimension $n$ ($x(t) \in \mathbb{R}^n$) and $A(t)$, $B(t)$, $C(t)$, $D(t)$ are matrices with entries functions of time $t$: $A(t) = [a_{ij}(t)]$, $a_{ij}(t) : \mathbb{R} \to \mathbb{R}$, with $A(t) \in \mathbb{R}^{n \times n}$, $B(t) \in \mathbb{R}^{n \times m}$, $C(t) \in \mathbb{R}^{p \times n}$, $D(t) \in \mathbb{R}^{p \times m}$. The input vector is $u(t) \in \mathbb{R}^m$ and the output vector is $y(t) \in \mathbb{R}^p$. The vector differential equation in (1) is the state equation, while the algebraic equation is the output equation.

The advantage of the state variable description (1) is that given an input $u(t)$, $t \geq t_0$ and an initial condition $x(t_0) = x_0$, the state trajectory or motion for $t \geq t_0$ can be conveniently characterized. To derive the expressions, we first consider the homogeneous state equation and the corresponding initial value problem.


Solving $\dot{x}(t) = A(t)x(t)$; $x(t_0) = x_0$

Consider the homogeneous equation with the initial condition

$$ \dot{x}(t) = A(t)x(t); \quad x(t_0) = x_0 \tag{2} $$

where $x(t) = [x_1(t), \ldots, x_n(t)]^T$ is the state (column) vector of dimension $n$ and $A(t)$ is an $n \times n$ matrix with entries functions of time that take on values from the field of reals ($A(t) \in \mathbb{R}^{n \times n}$).

Under certain assumptions on the entries of $A(t)$, a solution of (2) exists and it is unique. These assumptions are satisfied, and a solution exists and is unique in the case, for example, when the entries of $A(t)$ are continuous functions of time. In the following we make this assumption.

To find the unique solution of (2), we use the method of successive approximations which when applied to

$$ \dot{x}(t) = f(t, x(t)), \quad x(t_0) = x_0 \tag{3} $$

is described by

$$ \begin{aligned} \varphi_0(t) &= x_0 \\ \varphi_m(t) &= x_0 + \int_{t_0}^{t} f(\tau, \varphi_{m-1}(\tau))\,d\tau, \quad m = 1, 2, \ldots \end{aligned} \tag{4} $$

As $m \to \infty$, $\varphi_m$ converges to the unique solution of (3), assuming that $f$ satisfies certain conditions.

Applying the method of successive approximations to (2) yields

$$ \begin{aligned} \varphi_0(t) &= x_0 \\ \varphi_1(t) &= x_0 + \int_{t_0}^{t} A(\tau)x_0\,d\tau \\ \varphi_2(t) &= x_0 + \int_{t_0}^{t} A(\tau)\varphi_1(\tau)\,d\tau \\ &\;\;\vdots \\ \varphi_m(t) &= x_0 + \int_{t_0}^{t} A(\tau)\varphi_{m-1}(\tau)\,d\tau \end{aligned} $$

from which

$$ \varphi_m(t) = \left[ I + \int_{t_0}^{t} A(\tau_1)\,d\tau_1 + \int_{t_0}^{t} A(\tau_1) \int_{t_0}^{\tau_1} A(\tau_2)\,d\tau_2\,d\tau_1 + \cdots + \int_{t_0}^{t} A(\tau_1) \int_{t_0}^{\tau_1} A(\tau_2) \cdots \int_{t_0}^{\tau_{m-1}} A(\tau_m)\,d\tau_m \cdots d\tau_1 \right] x_0. $$

When $m \to \infty$, and under the above continuity assumptions on $A(t)$, $\varphi_m(t)$ converges to the unique solution of (2), i.e.,

$$ \varphi(t) = \Phi(t, t_0)x_0 \tag{5} $$

where

$$ \Phi(t, t_0) = I + \int_{t_0}^{t} A(\tau_1)\,d\tau_1 + \int_{t_0}^{t} A(\tau_1) \left[ \int_{t_0}^{\tau_1} A(\tau_2)\,d\tau_2 \right] d\tau_1 + \cdots \tag{6} $$

Note that $\Phi(t_0, t_0) = I$ and by differentiation it can be seen that

$$ \dot{\Phi}(t, t_0) = A(t)\Phi(t, t_0), \tag{7} $$

as expected, since (5) is the solution of (2). The $n \times n$ matrix $\Phi(t, t_0)$ is called the state transition matrix of (2). The defining series (6) is called the Peano-Baker series.

Note that when $A(t) = A$, a constant matrix, then (6) becomes

$$ \Phi(t, t_0) = I + \sum_{k=1}^{\infty} \frac{A^k (t - t_0)^k}{k!} $$

(8)

which is the defining series for the matrix exponential $e^{A(t-t_0)}$ (see ► Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions).


System Response

Based on the solution (5) of $\dot{x} = A(t)x(t)$, the solution of the non-homogeneous equation

$$ \dot{x}(t) = A(t)x(t) + B(t)u(t); \quad x(t_0) = x_0 \tag{9} $$

can be shown to be

$$ \varphi(t) = \Phi(t, t_0)x_0 + \int_{t_0}^{t} \Phi(t, \tau)B(\tau)u(\tau)\,d\tau. \tag{10} $$

Equation (10) is the variation of constants formula. This result can be shown via direct substitution of (10) into (9); note that $\varphi(t_0) = \Phi(t_0, t_0)x_0 = x_0$. That (10) is a solution can also be shown using a change of variables in (9), namely,

$$ z(t) = \Phi(t_0, t)x(t). $$

Equation (10) is the sum of two parts, the state response (when $u(t) = 0$ and the system is driven only by the initial state conditions) and the input response (when $x_0 = 0$ and the system is driven only by the input $u(t)$); this illustrates the linear system principle of superposition.

In view of (10), the output $y(t)$ ($= C(t)x(t) + D(t)u(t)$) is

$$ \begin{aligned} y(t) &= C(t)\Phi(t, t_0)x_0 + \int_{t_0}^{t} C(t)\Phi(t, \tau)B(\tau)u(\tau)\,d\tau + D(t)u(t) \\ &= C(t)\Phi(t, t_0)x_0 + \int_{t_0}^{t} \left[ C(t)\Phi(t, \tau)B(\tau) + D(t)\delta(t - \tau) \right] u(\tau)\,d\tau \end{aligned} $$

The second expression involves the Dirac (or impulse or delta) function $\delta(t)$ and is derived based on the basic property for $\delta(t)$, namely,

$$ f(t) = \int_{-\infty}^{+\infty} \delta(t - \tau)f(\tau)\,d\tau, $$

where $\delta(t - \tau)$ denotes an impulse applied at time $\tau = t$.


Properties of the State Transition Matrix $\Phi(t, t_0)$

In general it is difficult to determine $\Phi(t, t_0)$ explicitly; however, $\Phi(t, t_0)$ may be readily determined in a number of special cases including the cases in which $A(t) = A$, $A(t)$ is diagonal, or $A(t)A(\tau) = A(\tau)A(t)$.

Consider $\dot{x} = A(t)x$. We can derive a number of important properties which are described below. It can be shown that given $n$ linearly independent initial conditions $x_{0i}$, the corresponding $n$ solutions $\varphi_i(t)$ are also linearly independent. Let a fundamental matrix $\Psi(t)$ of $\dot{x} = A(t)x$ be an $n \times n$ matrix, the columns of which are a set of linearly independent solutions $\varphi_1(t), \ldots, \varphi_n(t)$. The state transition matrix $\Phi$ is the fundamental matrix determined from solutions that correspond to the initial conditions $[1, 0, 0, \ldots]^T$, $[0, 1, 0, \ldots, 0]^T$, $\ldots$, $[0, 0, \ldots, 1]^T$ (recall that $\Phi(t_0, t_0) = I$). The following are properties of $\Phi(t, t_0)$:

(i) $\Phi(t, t_0) = \Psi(t)\Psi^{-1}(t_0)$ with $\Psi(t)$ any fundamental matrix.

(ii) $\Phi(t, t_0)$ is nonsingular for all $t$ and $t_0$.

(iii) $\Phi(t, \tau) = \Phi(t, \sigma)\Phi(\sigma, \tau)$ (semigroup property).

(iv) $[\Phi(t, t_0)]^{-1} = \Phi(t_0, t)$.
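Properties (iii) and (iv) can be checked numerically by integrating $\dot{\Phi}(t, t_0) = A(t)\Phi(t, t_0)$, $\Phi(t_0, t_0) = I$. The sketch below is plain Python and uses a classical fourth-order Runge-Kutta scheme, which is an assumed numerical method and not one prescribed by the text. The matrix $A(t) = \begin{bmatrix} -1 & t \\ 0 & -1 \end{bmatrix}$ is also an assumed example; since it commutes with itself at different times, its transition matrix is known in closed form, $\Phi(t, t_0) = e^{-(t-t_0)}\begin{bmatrix} 1 & (t^2 - t_0^2)/2 \\ 0 & 1 \end{bmatrix}$, which gives an independent check.

```python
import math


def A(t):
    """Assumed time-varying example; A(t) and A(tau) commute."""
    return [[-1.0, t], [0.0, -1.0]]


def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add_scaled(X, Y, s):
    """Return X + s * Y."""
    return [[X[i][j] + s * Y[i][j] for j in range(2)] for i in range(2)]


def transition_matrix(t, t0, steps=400):
    """Integrate dPhi/ds = A(s) Phi from s = t0 to s = t, Phi(t0, t0) = I (RK4).
    A negative step is used automatically when t < t0."""
    h = (t - t0) / steps
    Phi = [[1.0, 0.0], [0.0, 1.0]]
    s = t0
    for _ in range(steps):
        k1 = mat_mul(A(s), Phi)
        k2 = mat_mul(A(s + h / 2), add_scaled(Phi, k1, h / 2))
        k3 = mat_mul(A(s + h / 2), add_scaled(Phi, k2, h / 2))
        k4 = mat_mul(A(s + h), add_scaled(Phi, k3, h))
        Phi = [[Phi[i][j] + h / 6 * (k1[i][j] + 2 * k2[i][j]
                                     + 2 * k3[i][j] + k4[i][j])
                for j in range(2)] for i in range(2)]
        s += h
    return Phi


def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))


Phi_20 = transition_matrix(2.0, 0.0)
# (iii) semigroup property: Phi(2, 0) = Phi(2, 1) Phi(1, 0)
semigroup_error = max_abs_diff(
    Phi_20, mat_mul(transition_matrix(2.0, 1.0), transition_matrix(1.0, 0.0)))
# (iv) inverse property: Phi(2, 0) Phi(0, 2) = I
inverse_error = max_abs_diff(
    mat_mul(Phi_20, transition_matrix(0.0, 2.0)), [[1.0, 0.0], [0.0, 1.0]])
# closed form for this particular A(t)
closed_form = [[math.exp(-2.0), math.exp(-2.0) * 2.0], [0.0, math.exp(-2.0)]]
closed_form_error = max_abs_diff(Phi_20, closed_form)
```

All three error measures are limited only by the integration step size.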

In the special case of time-invariant systems and $\dot{x} = Ax$, the above properties can be written in terms of the matrix exponential since

$$ \Phi(t, t_0) = e^{A(t - t_0)}. $$

Equivalence of State Variable Descriptions

Given the system

$$ \begin{aligned} \dot{x} &= A(t)x + B(t)u \\ y &= C(t)x + D(t)u \end{aligned} $$

consider the new state vector $\tilde{x}$

$$ \tilde{x}(t) = P(t)x(t) \tag{11} $$

where $P^{-1}(t)$ exists and $P$ and $P^{-1}$ are continuous. Then the system

$$ \begin{aligned} \dot{\tilde{x}} &= \tilde{A}(t)\tilde{x} + \tilde{B}(t)u \\ y &= \tilde{C}(t)\tilde{x} + \tilde{D}(t)u \end{aligned} $$

where

$$ \begin{aligned} \tilde{A}(t) &= [P(t)A(t) + \dot{P}(t)]P^{-1}(t) \\ \tilde{B}(t) &= P(t)B(t) \\ \tilde{C}(t) &= C(t)P^{-1}(t) \\ \tilde{D}(t) &= D(t) \end{aligned} $$

is equivalent to (1). It can be easily shown that equivalent descriptions give rise to the same impulse responses.


Summary

State variable descriptions for continuous-time time-varying systems were introduced and the state and output responses to inputs and initial conditions were derived. The equivalence of state variable representations was also discussed.

Cross-References

► Linear Systems: Continuous-Time Impulse Response Descriptions

► Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions

► Linear Systems: Discrete-Time, Time-Varying, State Variable Descriptions

Recommended Reading

Additional information regarding the time-varying case may be found in Brockett (1970), Rugh (1996), and Antsaklis and Michel (2006). For historical comments and extensive references on some of the early contributions, see Sontag (1990) and Kailath (1980).

Abstract

An important input-output description of a linear discrete-time system is its (discrete-time) impulse response (or pulse response), which is the response $h(k, k_0)$ to a discrete impulse applied at time $k_0$. In time-invariant systems that are also causal and at rest at time zero, the impulse response is $h(k, 0)$, and its z-transform is the transfer function of the system. Expressions for $h(k, k_0)$ when the system is described by state variable equations are derived.

Keywords

At rest; Causal; Discrete-time; Discrete-time
Time-varying; Transfer function descriptions

Introduction 

Consider linear, discrete-time dynamical systems that can be described by

$$ y(k) = \sum_{l=-\infty}^{+\infty} H(k, l)u(l) \tag{1} $$

where $k, l \in \mathbb{Z}$, the set of integers, the output is $y(k) \in \mathbb{R}^p$, the input is $u(k) \in \mathbb{R}^m$, and $H(k, l) : \mathbb{Z} \times \mathbb{Z} \to \mathbb{R}^{p \times m}$. For instance, any system that can be written in state variable form

$$ \begin{aligned} x(k+1) &= A(k)x(k) + B(k)u(k) \\ y(k) &= C(k)x(k) + D(k)u(k) \end{aligned} \tag{2} $$

or

$$ \begin{aligned} x(k+1) &= Ax(k) + Bu(k) \\ y(k) &= Cx(k) + Du(k) \end{aligned} \tag{3} $$

can be represented by (1). Note that it is assumed that at $l = -\infty$, the system is at rest, i.e., no energy is stored in the system at time $-\infty$.

Define the discrete-time impulse (or unit pulse) as

$$ \delta(k) = \begin{cases} 1 & k = 0 \\ 0 & k \neq 0,\ k \in \mathbb{Z} \end{cases} $$

and consider a single-input, single-output system:

$$ y(k) = \sum_{l=-\infty}^{+\infty} h(k, l)u(l) \tag{4} $$

If $u(l) = \delta(l - \hat{l})$, that is, the input is a unit pulse applied at $l = \hat{l}$, then the output is

$$ y(k) = h(k, \hat{l}), $$

i.e., $h(k, \hat{l})$ is the output at time $k$ when a unit pulse is applied at time $\hat{l}$.

So in (4), $h(k, l)$ is the response at time $k$ to a discrete-time impulse (unit pulse) applied at time $l$; $h(k, l)$ is the discrete-time impulse response of the system. Clearly if $h(k, l)$ is known, the response of the system to any input can be determined via (4). So $h(k, l)$ is an input/output description of the system.

Equation (1) is a generalization of (4) for the multi-input, multi-output case. If we let all the components of $u(l)$ in (1) be zero except for the $j$th component, then

$$ y_i(k) = \sum_{l=-\infty}^{+\infty} h_{ij}(k, l)u_j(l) \tag{5} $$

Here $h_{ij}(k, l)$ denotes the response of the $i$th component of the output of system (1) at time $k$ due to a discrete impulse applied to the $j$th component of the input at time $l$ with all remaining components of the input being zero. $H(k, l) = [h_{ij}(k, l)]$ is called the impulse response matrix of the system.

If it is known that system (1) is causal, then the output will be zero before an input is applied. Therefore,

$$ H(k, l) = 0, \quad \text{for } k < l, \tag{6} $$

and so when causality is present, (1) becomes

$$ y(k) = \sum_{l=-\infty}^{k} H(k, l)u(l). \tag{7} $$

A system described by (1) is at rest at $k = k_0$ if $u(k) = 0$ for $k \geq k_0$ implies $y(k) = 0$ for $k \geq k_0$. For a system at rest at $k = k_0$, (7) becomes

$$ y(k) = \sum_{l=k_0}^{k} H(k, l)u(l). \tag{8} $$

If system (1) is time-invariant, then $H(k, l) = H(k - l, 0)$ (also written as $H(k - l)$) since only the time elapsed $(k - l)$ from the application of the discrete-time impulse is important. Then (8) becomes

$$ y(k) = \sum_{l=0}^{k} H(k - l)u(l), \quad k \geq 0, \tag{9} $$

where we chose $k_0 = 0$ without loss of generality. Equation (9) is the description for causal, time-invariant systems, at rest at $k = 0$.

Equation (9) is a convolution sum and if we take the (one-sided or unilateral) z-transform of both sides,

$$ y(z) = H(z)u(z), \tag{10} $$

where $y(z)$, $u(z)$ are the z-transforms of $y(k)$, $u(k)$ and $H(z)$ is the z-transform of the discrete-time impulse response $H(k)$. $H(z)$ is the transfer function matrix of the system. Note that the transfer function of a linear, time-invariant system is defined as the rational matrix $H(z)$ that satisfies (10) for any input and its corresponding output assuming zero initial conditions.
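The convolution sum (9) is straightforward to evaluate for a single-input, single-output system. The sketch below is plain Python; the function name, the impulse response $h(k) = 0.5^k$, and the unit step input are assumptions chosen for illustration. For this choice the exact output is $y(k) = 2 - 0.5^k$.

```python
def convolve_response(h, u):
    """Evaluate (9) for a SISO system: y(k) = sum_{l=0}^{k} h(k - l) u(l),
    for k = 0, ..., len(u) - 1. `h` maps a nonnegative integer to h(k)."""
    return [sum(h(k - l) * u[l] for l in range(k + 1))
            for k in range(len(u))]


def h(k):
    """Assumed impulse response of a causal, time-invariant system."""
    return 0.5 ** k


u = [1.0, 1.0, 1.0, 1.0]          # unit step applied at k = 0
y = convolve_response(h, u)
```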


Connections to State Variable Descriptions

When a system is described by (2), then

$$ y(k) = \sum_{l=k_0}^{k-1} C(k)\Phi(k, l+1)B(l)u(l) + D(k)u(k), \quad k > k_0 \tag{11} $$

where it was assumed that $x(k_0) = 0$, i.e., the system is at rest at $k_0$. Here $\Phi(k, l)$ ($= A(k-1) \cdots A(l)$) is the state transition matrix of the system.

Comparing (11) with (8), the discrete-time impulse response of the system is

$$ H(k, l) = \begin{cases} C(k)\Phi(k, l+1)B(l) & k > l \\ D(k) & k = l \\ 0 & k < l \end{cases} \tag{12} $$

Similarly, when the system is time-invariant and is described by (3),

$$ y(k) = \sum_{l=k_0}^{k-1} CA^{k-(l+1)}Bu(l) + Du(k), \quad k > k_0 \tag{13} $$

where $x(k_0) = 0$ and

$$ H(k, l) = \begin{cases} CA^{k-(l+1)}B & k > l \\ D & k = l \\ 0 & k < l \end{cases} \tag{14} $$

When $l = 0$ (taking the time when the discrete impulse is applied to be zero), the discrete-time impulse response is

$$ H(k) = \begin{cases} CA^{k-1}B & k > 0 \\ D & k = 0 \\ 0 & k < 0 \end{cases} \tag{15} $$

Taking (one-sided or unilateral) z-transforms of both sides in (15),

$$ H(z) = C(zI - A)^{-1}B + D \tag{16} $$

which is the transfer function matrix in terms of the coefficient matrices in the state variable description (3). Note that (16) can also be derived directly from (3) by assuming zero initial conditions ($x(0) = 0$) and taking z-transforms of both sides.

Finally, it is easy to show that equivalent state variable descriptions give rise to the same discrete impulse response.
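Formula (15) can be compared against a direct simulation of (3) driven by a unit pulse. The following is a minimal sketch in plain Python for a single-input, single-output system; the matrices and function names are assumed for illustration.

```python
def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Assumed SISO example.
A = [[0.0, 1.0], [-0.5, 1.0]]
B = [0.0, 1.0]
C = [1.0, 0.0]
D = 2.0


def impulse_response(k):
    """H(k) from (15): C A^(k-1) B for k > 0, D for k = 0, 0 for k < 0."""
    if k < 0:
        return 0.0
    if k == 0:
        return D
    v = list(B)
    for _ in range(k - 1):
        v = mat_vec(A, v)          # v = A^(k-1) B after the loop
    return dot(C, v)


def simulate_unit_pulse(n_steps):
    """Output of (3) with x(0) = 0 and u(k) = delta(k)."""
    x = [0.0, 0.0]
    ys = []
    for k in range(n_steps):
        u = 1.0 if k == 0 else 0.0
        ys.append(dot(C, x) + D * u)
        Ax = mat_vec(A, x)
        x = [Ax[i] + B[i] * u for i in range(2)]
    return ys


ys = simulate_unit_pulse(6)
hs = [impulse_response(k) for k in range(6)]
```

The two lists agree term by term, which is the content of (14) and (15) for $l = 0$.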


Summary

The discrete-time impulse response is an external, input-output description of linear, discrete-time systems. When the system is time-invariant, the z-transform of the impulse response $h(k, 0)$ (which is the output response at time $k$ due to a discrete impulse applied at time zero with initial conditions taken to be zero) is the transfer function - another very common input-output description. The relationships to the state variable descriptions were shown.

Abstract

Discrete-time processes that can be modeled by linear difference equations with constant coefficients can also be described in a systematic way in terms of state variable descriptions of the form $x(k+1) = Ax(k) + Bu(k)$, $y(k) = Cx(k) + Du(k)$. The response of such systems due to a given input and subject to initial conditions is derived. Equivalence of state variable descriptions is also discussed.


Keywords

Discrete-time; Linear systems; State variable descriptions; Time-invariant

Introduction 

Discrete-time systems arise in a variety of ways in the modeling process. There are systems that are inherently defined only at discrete points in time; examples include digital devices, inventory systems, and economic systems such as banking where interest is calculated and added to savings accounts at discrete time intervals. There are also systems that describe continuous-time systems at discrete points in time; examples include simulations of continuous processes using digital computers and feedback control systems that employ digital controllers and give rise to sampled-data systems.

Linear, discrete-time, time-invariant systems can be modeled via state variable equations, namely,

$$ \begin{aligned} x(k+1) &= Ax(k) + Bu(k); \quad x(0) = x_0 \\ y(k) &= Cx(k) + Du(k) \end{aligned} \tag{1} $$

where $k \in \mathbb{Z}$, the set of integers; the state vector $x \in \mathbb{R}^n$, i.e., an $n$-dimensional column vector; $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $C \in \mathbb{R}^{p \times n}$, $D \in \mathbb{R}^{p \times m}$ are matrices with real entries; and $y(k) \in \mathbb{R}^p$, $u(k) \in \mathbb{R}^m$ are the output and the input, respectively. The vector difference equation in (1) is the state equation and the algebraic equation is the output equation.

Note that (1) could have been equivalently written as $x(l) = Ax(l-1) + Bu(l-1)$ where $l = k + 1$ and $x(l-1)$ is an easily visualized delayed version of $x(l)$; this is a form more common in signal processing (where a two-sided or bilateral z-transform is used). In control, where we assume a known initial condition at time equal to zero (and the one-sided or unilateral z-transform is taken), the form in (1) is common.

Similar to the continuous-time case, (1) can be derived from a set of high-order difference equations by introducing the state variables $x(k) = [x_1(k), \ldots, x_n(k)]^T$. Description (1) can also be derived from continuous-time system descriptions by sampling (see ► Sampled-Data Systems).

The advantage of the above state variable description is that given any input $u(k)$ and initial conditions $x(0)$, its solution (state trajectory or motion) can be conveniently and systematically characterized. This is done below. We first consider the solutions of the homogeneous equation $x(k+1) = Ax(k)$.


Solving $x(k+1) = Ax(k)$; $x(0) = x_0$

Consider the homogeneous equation

$$ x(k+1) = Ax(k); \quad x(0) = x_0 \tag{2} $$

where $k \in \mathbb{Z}^+$ is a nonnegative integer, $x(k) = [x_1(k), \ldots, x_n(k)]^T$ is the state column vector of dimension $n$, and $A$ is an $n \times n$ matrix with real entries (i.e., $A \in \mathbb{R}^{n \times n}$).

Write (2) for $k = 0, 1, 2, \ldots$, namely, $x(1) = Ax(0)$, $x(2) = Ax(1) = A^2x(0)$, $\ldots$ to derive the solution

$$ x(k) = A^k x(0), \quad k \geq 0 \tag{3} $$

This result can be shown formally by induction. Note that $A^0 = I$ by convention and so (3) also satisfies the initial condition.

If the initial time were some (integer) $k_0$ instead of zero, then the solution would be

$$ x(k) = A^{k-k_0}x(k_0), \quad k \geq k_0 \tag{4} $$

The solution can be written as

$$ x(k) = \Phi(k, k_0)x(k_0) = \Phi(k - k_0, 0)x(k_0), \quad k \geq k_0 \tag{5} $$

where $\Phi(k, k_0)$ is the state transition matrix and it equals $\Phi(k, k_0) = A^{k-k_0}$. Note that for time-invariant systems, the initial time $k_0$ can always be taken to be zero without loss of generality; this is because the behavior depends only on the time elapsed $(k - k_0)$ and not on the actual initial time $k_0$.

In view of (3), it is clear that $A^k$ plays an important role in the solutions of the difference state equations that describe linear, discrete-time, time-invariant systems; it is actually analogous to the role $e^{At}$ plays in the solutions of the linear differential state equations that describe linear, continuous-time, time-invariant systems.

Notice that in (3), $k \geq 0$. This is so because $A^k$ for $k < 0$ may not exist; this is the case, for example, when $A$ is a singular matrix, that is, when it has at least one eigenvalue at the origin. In contrast, $e^{At}$ exists for any $t$ positive or negative. The implication is that in discrete-time systems we may not be able to determine uniquely the initial past state $x(0)$ from a current state value $x(k)$; in contrast, in continuous-time systems, it is always possible to go backwards in time.

There are several methods to calculate $A^k$ that mirror the methods to calculate $e^{At}$. One could, for example, use similarity transformations, or the z-transform. When all eigenvectors of $A$ are linearly independent (this is the case, e.g., when all eigenvalues $\lambda_i$ of $A$ are distinct), then a similarity transformation exists so that

$$ PAP^{-1} = \tilde{A} = \mathrm{diag}[\lambda_i]. $$

Then

$$ A^k = P^{-1}\tilde{A}^k P = P^{-1} \begin{bmatrix} \lambda_1^k & & \\ & \ddots & \\ & & \lambda_n^k \end{bmatrix} P. $$

Alternatively, using the z-transform, $A^k = \mathcal{Z}^{-1}\{z(zI - A)^{-1}\}$. Also when the eigenvalues $\lambda_i$ of $A$ are distinct, then

$$ A^k = \sum_{i=1}^{n} A_i \lambda_i^k $$

where $A_i = v_i\tilde{v}_i$ with $v_i$, $\tilde{v}_i$ the right and left eigenvectors of $A$ that correspond to $\lambda_i$. Note that $A_i\lambda_i^k$ are the modes of the system. One could also use the Cayley-Hamilton theorem to determine $A^k$.
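The remark that a discrete-time system with singular $A$ cannot always be run backwards can be seen on a two-state example. This is a small sketch in plain Python; the matrix and the two initial states are assumed for illustration.

```python
def step(A, x):
    """One step of the homogeneous equation x(k+1) = A x(k)."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]


# Assumed example: a singular A (both eigenvalues at the origin).
A = [[0.0, 1.0],
     [0.0, 0.0]]

x_a = step(A, [1.0, 0.0])
x_b = step(A, [2.0, 0.0])

# Two different initial states lead to the same x(1), so x(0) cannot be
# recovered from x(1); A^{-1}, and hence A^k for k < 0, does not exist.
same_next_state = (x_a == x_b)
```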


System Response

Consider the description (1). The response can be easily derived by writing the equation for $k = 0, 1, 2, \ldots$ and substituting or formally by induction. It is

$$ x(k) = A^k x(0) + \sum_{j=0}^{k-1} A^{k-(j+1)}Bu(j), \quad k > 0 \tag{6} $$

and

$$ \begin{aligned} y(k) &= CA^k x(0) + \sum_{j=0}^{k-1} CA^{k-(j+1)}Bu(j) + Du(k), \quad k > 0 \\ y(0) &= Cx(0) + Du(0). \end{aligned} \tag{7} $$

Note that (6) can also be written as

$$ x(k) = A^k x(0) + [B, AB, \cdots, A^{k-1}B] \begin{bmatrix} u(k-1) \\ \vdots \\ u(0) \end{bmatrix} \tag{8} $$

Clearly the response is the sum of two components, one due to the initial condition (state response) and one due to the input (input response). This illustrates the linear system principle of superposition.

If the initial time is $k_0$ and (4) is used, then

$$ \begin{aligned} y(k) &= CA^{k-k_0}x(k_0) + \sum_{j=k_0}^{k-1} CA^{k-(j+1)}Bu(j) + Du(k), \quad k > k_0 \\ y(k_0) &= Cx(k_0) + Du(k_0). \end{aligned} \tag{9} $$


Equivalence of State Variable Descriptions

Given description (1), consider the new state vector $\tilde{x}$ where

$$ \tilde{x}(k) = Px(k) $$

with $P \in \mathbb{R}^{n \times n}$ a real nonsingular matrix. Substituting $x = P^{-1}\tilde{x}$ in (1), we obtain

$$ \begin{aligned} \tilde{x}(k+1) &= \tilde{A}\tilde{x}(k) + \tilde{B}u(k) \\ y(k) &= \tilde{C}\tilde{x}(k) + \tilde{D}u(k) \end{aligned} \tag{10} $$

where

$$ \tilde{A} = PAP^{-1}, \quad \tilde{B} = PB, \quad \tilde{C} = CP^{-1}, \quad \tilde{D} = D. $$

The state variable descriptions (1) and (10) are called equivalent and $P$ is the equivalence transformation matrix. This transformation corresponds to a change in the basis of the state space, which is a vector space. Appropriately selecting $P$, one can simplify the structure of $\tilde{A}$ ($= PAP^{-1}$). It can be easily shown that equivalent descriptions give rise to the same discrete impulse response and transfer function.
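The closed-form state response (6) can be compared with a step-by-step iteration of the state equation in (1). The sketch below is plain Python for a single-input system; the matrices, initial state, input sequence, and function names are assumptions made for illustration.

```python
def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def apply_power(A, v, p):
    """Return A^p v by applying A to v p times (A^0 = I)."""
    for _ in range(p):
        v = mat_vec(A, v)
    return v


def simulate(A, B, x0, u):
    """Iterate x(k+1) = A x(k) + B u(k); return [x(0), ..., x(len(u))]."""
    xs = [list(x0)]
    for uk in u:
        Ax = mat_vec(A, xs[-1])
        xs.append([Ax[i] + B[i] * uk for i in range(len(Ax))])
    return xs


def closed_form(A, B, x0, u, k):
    """Evaluate (6): A^k x(0) + sum_{j=0}^{k-1} A^(k-(j+1)) B u(j)."""
    x = apply_power(A, list(x0), k)
    for j in range(k):
        term = apply_power(A, [b * u[j] for b in B], k - (j + 1))
        x = [x[i] + term[i] for i in range(len(x))]
    return x


# Assumed example data.
A = [[0.5, 1.0], [0.0, 0.25]]
B = [0.0, 1.0]
x0 = [1.0, -1.0]
u = [1.0, 0.0, 2.0, -1.0]

xs = simulate(A, B, x0, u)
max_error = max(abs(a - b)
                for k in range(len(u) + 1)
                for a, b in zip(xs[k], closed_form(A, B, x0, u, k)))
```

Setting `u` to zeros isolates the state response, and setting `x0` to zero isolates the input response; by superposition their sum reproduces `xs`.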


Summary

State variable descriptions for discrete-time, time-invariant systems were introduced and the state and output responses to inputs and initial conditions were derived. The equivalence of state variable representations was also discussed.


Cross-References

► Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions

► Linear Systems: Discrete-Time Impulse Response Descriptions

► Linear Systems: Discrete-Time, Time-Varying, State Variable Descriptions

► Sampled-Data Systems

Recommended Reading

The state variable descriptions received wide acceptance in systems theory beginning in the late 1950s. This was primarily due to the work of R.E. Kalman. For historical comments and extensive references, see Kailath (1980). The use of state variable descriptions in systems and control opened the way for the systematic study of systems with multi-inputs and multi-outputs.

Abstract

Discrete-time processes that can be modeled by linear difference equations with time-varying coefficients can be written in terms of state variable descriptions of the form $x(k+1) = A(k)x(k) + B(k)u(k)$, $y(k) = C(k)x(k) + D(k)u(k)$. The response of such systems due to a given input and initial conditions is derived. Equivalence of state variable descriptions is also discussed.

Keywords

Discrete-time; Linear systems; State variable descriptions; Time-varying


Introduction 

Discrete-time systems arise in a variety of ways in the modeling process. There are systems that are inherently defined only at discrete points in time; examples include digital devices, inventory systems, and economic systems such as banking where interest is calculated and added to savings accounts at discrete time intervals. There are also systems that describe continuous-time systems at discrete points in time; examples include simulations of continuous processes using digital computers and feedback control systems that employ digital controllers and give rise to sampled-data systems.

Dynamical processes that can be described or approximated by linear difference equations with time-varying coefficients can also be described, via a change of variables, by state variable descriptions of the form

$$ \begin{aligned} x(k+1) &= A(k)x(k) + B(k)u(k); \quad x(k_0) = x_0 \\ y(k) &= C(k)x(k) + D(k)u(k). \end{aligned} \tag{1} $$

Above, the state vector $x(k)$ ($k \in \mathbb{Z}$, the set of integers) is a column vector of dimension $n$ ($x(k) \in \mathbb{R}^n$); the output is $y(k) \in \mathbb{R}^p$ and the input is $u(k) \in \mathbb{R}^m$. $A(k)$, $B(k)$, $C(k)$, and $D(k)$ are matrices with entries functions of time $k$, $A(k) = [a_{ij}(k)]$, $a_{ij}(k) : \mathbb{Z} \to \mathbb{R}$ ($A(k) \in \mathbb{R}^{n \times n}$, $B(k) \in \mathbb{R}^{n \times m}$, $C(k) \in \mathbb{R}^{p \times n}$, $D(k) \in \mathbb{R}^{p \times m}$). The vector difference equation in (1) is the state equation, while the algebraic equation is the output equation. Note that in the time-invariant case, $A(k) = A$, $B(k) = B$, $C(k) = C$, and $D(k) = D$.

The advantage of the state variable description (1) is that given an input $u(k)$, $k \geq k_0$ and an initial condition $x(k_0) = x_0$, the state trajectories or motions for $k \geq k_0$ can be conveniently characterized. To determine the expressions, we first consider the homogeneous state equation and the corresponding initial value problem.

Solving $x(k+1) = A(k)x(k)$; $x(k_0) = x_0$

Consider the homogeneous equation

$$x(k+1) = A(k)x(k); \quad x(k_0) = x_0 \tag{2}$$

Note that

$$x(k_0 + 1) = A(k_0)x(k_0)$$
$$x(k_0 + 2) = A(k_0 + 1)A(k_0)x(k_0)$$
$$x(k) = A(k-1)A(k-2) \cdots A(k_0)x(k_0) = \prod_{j=k_0}^{k-1} A(j)\, x(k_0), \quad k > k_0$$

where the factors of the product are ordered with the
largest index on the left. This result can be shown
formally by induction. The solution of (2) is then

$$x(k) = \Phi(k, k_0)x(k_0), \tag{3}$$

where $\Phi(k, k_0)$ is the state transition matrix of
(2) given by

$$\Phi(k, k_0) = \prod_{j=k_0}^{k-1} A(j), \quad k > k_0; \quad \Phi(k_0, k_0) = I. \tag{4}$$

Note that in the time-invariant case,
$\Phi(k, k_0) = A^{k-k_0}$.

System Response 

Consider now the state equation in (1). It can be
easily shown that the solution is

$$x(k) = \Phi(k, k_0)x(k_0) + \sum_{j=k_0}^{k-1} \Phi(k, j+1)B(j)u(j), \quad k > k_0, \tag{5}$$

and the response $y(k)$ of (1) is

$$y(k) = C(k)\Phi(k, k_0)x(k_0) + C(k)\sum_{j=k_0}^{k-1} \Phi(k, j+1)B(j)u(j) + D(k)u(k), \quad k > k_0, \tag{6}$$

and

$$y(k_0) = C(k_0)x(k_0) + D(k_0)u(k_0).$$

Equation (5) is the sum of two parts, the state
response (when $u(k) = 0$ and the system is
driven only by the initial state conditions) and the
input response (when $x(k_0) = 0$ and the system
is driven only by the input $u(k)$); this illustrates
the linear systems principle of superposition.
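
The solution formula in Eq. (5) can be checked numerically against direct recursion of the state equation. The following is a minimal sketch in Python with NumPy; the matrices `A(k)`, `B(k)`, the input `u(k)`, and the initial state are hypothetical values chosen only for illustration and do not come from the article.

```python
import numpy as np

def transition(A, k, k0):
    """State transition matrix Phi(k, k0) = A(k-1) A(k-2) ... A(k0), Phi(k0, k0) = I."""
    n = A(k0).shape[0]
    Phi = np.eye(n)
    for j in range(k0, k):
        Phi = A(j) @ Phi  # left-multiply so that later factors end up on the left
    return Phi

def state_by_formula(A, B, u, x0, k, k0):
    """Evaluate Eq. (5): response to the initial state plus the convolution sum."""
    x = transition(A, k, k0) @ x0
    for j in range(k0, k):
        x = x + transition(A, k, j + 1) @ B(j) @ u(j)
    return x

def state_by_recursion(A, B, u, x0, k, k0):
    """Step x(j+1) = A(j) x(j) + B(j) u(j) from k0 up to k."""
    x = x0
    for j in range(k0, k):
        x = A(j) @ x + B(j) @ u(j)
    return x

# Hypothetical time-varying system (illustration only).
A = lambda k: np.array([[0.5, 1.0 / (k + 1)], [0.0, 0.9]])
B = lambda k: np.array([[0.0], [1.0 + 0.1 * k]])
u = lambda k: np.array([np.sin(0.3 * k)])
x0 = np.array([1.0, -1.0])

x_formula = state_by_formula(A, B, u, x0, k=8, k0=2)
x_recursion = state_by_recursion(A, B, u, x0, k=8, k0=2)
```

Both routes should agree up to rounding error. The recursion is cheaper in practice; the closed form is mainly useful for analysis because it separates the two parts of the response.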

Equivalence of State Variable 
Descriptions 

Given (1), consider the new state vector $\tilde{x}$ where

$$\tilde{x}(k) = P(k)x(k)$$

where $P^{-1}(k)$ exists. Then

$$\tilde{x}(k+1) = \tilde{A}(k)\tilde{x}(k) + \tilde{B}(k)u(k)$$
$$y(k) = \tilde{C}(k)\tilde{x}(k) + \tilde{D}(k)u(k) \tag{7}$$

where

$$\tilde{A}(k) = P(k+1)A(k)P^{-1}(k),$$
$$\tilde{B}(k) = P(k+1)B(k),$$
$$\tilde{C}(k) = C(k)P^{-1}(k),$$
$$\tilde{D}(k) = D(k)$$

is equivalent to (1). It can be easily shown that
equivalent descriptions give rise to the same
discrete impulse responses.
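
As a numerical illustration of equivalence, the sketch below simulates a description of the form (1) and its transformed counterpart and compares the outputs. It is written in Python with NumPy under stated assumptions: the system matrices and the transformation `P(k)` are hypothetical, and `P(k)` is chosen so that it is invertible for every `k`.

```python
import numpy as np

def simulate(A, B, C, D, u, x0, k0, steps):
    """Return the outputs y(k0), ..., y(k0 + steps - 1) of a description of form (1)."""
    x, ys = x0, []
    for k in range(k0, k0 + steps):
        ys.append(C(k) @ x + D(k) @ u(k))
        x = A(k) @ x + B(k) @ u(k)
    return np.array(ys)

# Hypothetical original description (illustration only).
A = lambda k: np.array([[0.5, 0.1 * k], [0.0, 0.8]])
B = lambda k: np.array([[1.0], [0.5]])
C = lambda k: np.array([[1.0, float(k % 2)]])
D = lambda k: np.array([[0.2]])
u = lambda k: np.array([np.cos(0.5 * k)])

# Hypothetical time-varying change of state; det P(k) = 2 for all k.
P = lambda k: np.array([[1.0, 0.3 * k], [0.0, 2.0]])
Pinv = lambda k: np.linalg.inv(P(k))

# Transformed description: note P(k + 1) on the left and P(k)^{-1} on the right.
At = lambda k: P(k + 1) @ A(k) @ Pinv(k)
Bt = lambda k: P(k + 1) @ B(k)
Ct = lambda k: C(k) @ Pinv(k)
Dt = D

k0, x0 = 0, np.array([1.0, -2.0])
y_original = simulate(A, B, C, D, u, x0, k0, 10)
y_transformed = simulate(At, Bt, Ct, Dt, u, P(k0) @ x0, k0, 10)
```

The two output sequences coincide when the transformed initial state is taken as `P(k0) @ x0`.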

Summary 

State variable descriptions for linear discrete-time
time-varying systems were introduced and
the state and output responses to inputs and initial 
conditions were derived. The equivalence of state 
variable representations was also discussed. 

Cross-References 

► Linear Systems: Discrete-Time, Time-Invariant
State Variable Descriptions

► Linear Systems: Discrete-Time Impulse
Response Descriptions

► Linear Systems: Continuous-Time, Time-Varying
State Variable Descriptions

► Sampled-Data Systems

Recommended Reading 

The state variable descriptions received wide
acceptance in systems theory beginning in the late
1950s. This was primarily due to the work of
R.E. Kalman. For historical comments and
extensive references, see Kailath (1980). The use of
state variable descriptions in systems and control
opened the way for the systematic study of
systems with multi-inputs and multi-outputs.

Abstract

In the analysis and design of robust control
systems, the LMI method plays a fundamental role.
This article gives a brief introduction to this topic.
After the introduction of LMI, it is illustrated how
a control design problem is related to matrix
inequalities. Then, two methods are explained for
transforming a control problem characterized
by matrix inequalities into LMIs, which is the core
of the LMI approach. Based on this knowledge,
the LMI solutions to various kinds of robust
control problems are illustrated. Included are
$\mathcal{H}_\infty$ and $\mathcal{H}_2$ control, regional pole
placement, and gain-scheduled control.


Keywords 

Gain-scheduled control; $\mathcal{H}_\infty$ and $\mathcal{H}_2$ control;
LMI; Multi-objective control; Regional pole
placement; Robust control


Introduction of LMI 

A matrix inequality of the form

$$F(x) = F_0 + \sum_{i=1}^{m} x_i F_i > 0 \tag{1}$$

is called an LMI (linear matrix inequality). Here,
$x = [x_1 \cdots x_m]$ is the unknown vector and $F_i$
($i = 1, \ldots, m$) is a symmetric matrix. $F(x)$ is an
affine function of $x$. The inequality means that
$F(x)$ is positive definite.

LMIs can be solved efficiently by numerical
algorithms such as the well-known interior point
method (Nesterov and Nemirovskii 1994). MATLAB
has an LMI toolbox (Gahinet et al. 1995)
tailored for solving the related control problems.
Boyd et al. (1994) provide detailed theoretic
fundamentals of LMI. A comprehensive and
up-to-date treatment of the applications of LMI
in robust control is given in Liu and Yao
(2014).

The notation $\mathrm{He}(A) = A + A^T$ is used to
simplify the presentation of large matrices; $A_\perp$
is a matrix whose columns form a basis of
the kernel space of $A$, i.e., $AA_\perp = 0$. Further,
$A \otimes B$ denotes the Kronecker product of matrices
$(A, B)$.

Control Problems and LMI

In control problems, it is often the case that
the variables are matrices. For example, the
necessary and sufficient condition for the stability
of a linear system $\dot{x}(t) = Ax(t)$ is that there
exists a positive-definite matrix $P$ satisfying the
inequality $AP + PA^T < 0$. Although this is
different from the LMI of Eq. (1) in form, it can
be converted to Eq. (1) equivalently by using a
basis of symmetric matrices.
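
The stability test above can be illustrated without an LMI solver. The sketch below, in Python with NumPy, swaps in a different technique: it solves the Lyapunov equation $AP + PA^T = -I$ as a linear system using the Kronecker product and then checks whether $P$ is positive definite. A positive-definite solution satisfies the strict inequality, and for a system that is not stable the computed $P$ cannot be positive definite. The function names and the two example matrices are hypothetical.

```python
import numpy as np

def lyapunov_solve(A, Q):
    """Solve A P + P A^T = -Q for P by row-major vectorization."""
    n = A.shape[0]
    I = np.eye(n)
    # For row-major vec: vec(A P) = (A kron I) vec(P), vec(P A^T) = (I kron A) vec(P).
    K = np.kron(A, I) + np.kron(I, A)
    P = np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against rounding error

def is_stable_by_lyapunov(A):
    """True when A P + P A^T = -I has a positive-definite solution P."""
    n = A.shape[0]
    try:
        P = lyapunov_solve(A, np.eye(n))
    except np.linalg.LinAlgError:
        return False  # singular linear system: no unique solution
    return bool(np.all(np.linalg.eigvalsh(P) > 0))

# Hypothetical examples: eigenvalues {-1, -2} and {1, -2}.
A_stable = np.array([[0.0, 1.0], [-2.0, -3.0]])
A_unstable = np.array([[0.0, 1.0], [2.0, -1.0]])

P_stable = lyapunov_solve(A_stable, np.eye(2))
residual = A_stable @ P_stable + P_stable @ A_stable.T + np.eye(2)
```

Fixing the right-hand side to $-I$ turns the feasibility question into a linear solve. A general LMI solver is needed once the unknowns are coupled with other constraints, as in the design problems that follow.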

Next, consider the stabilization of the system
$\dot{x} = Ax + Bu$ by a state feedback $u = Fx$. The
closed-loop system is $\dot{x} = (A + BF)x$. Therefore,
the stability condition is that there exist a
positive-definite matrix $P$ and a feedback gain
matrix $F$ satisfying the inequality

$$(A + BF)P + P(A + BF)^T < 0. \tag{2}$$

In this inequality, $FP$, the product of the unknown
variables $F$ and $P$, appears. Such a matrix
inequality is called a bilinear matrix inequality, or
BMI for short. A BMI problem is non-convex and
difficult to solve. There are mainly two methods
for transforming a BMI into an LMI: variable
elimination and variable change.


From BMI to LMI: Variable Elimination 

The method of variable elimination is well suited to
optimizing single-objective problems. This method
is based on the lemma below (Gahinet and
Apkarian 1994).

Lemma 1 Given real matrices $E$, $F$, $G$ with $G$
being symmetric, the inequality

$$E^T X F + F^T X^T E + G < 0 \tag{3}$$

has a solution $X$ if and only if the following two
inequalities hold simultaneously:

$$E_\perp^T G E_\perp < 0, \quad F_\perp^T G F_\perp < 0. \tag{4}$$

Application of this lemma to the previous
stabilization problem (2) yields
$(B^T)_\perp^T (AP + PA^T)(B^T)_\perp < 0$, which is an LMI
in $P$. Once $P$ is obtained, it may be substituted back
into inequality (2), which is then solved for $F$.

For output feedback problems, it is often
necessary to construct a new matrix from two given
matrices when solving a control problem with the LMI
approach. The method is given by the following
lemma.

Lemma 2 Given two n-dimensional positive-definite
matrices $X$ and $Y$, a 2n-dimensional
positive-definite matrix $P$ satisfying the
conditions

$$P = \begin{bmatrix} Y & * \\ * & * \end{bmatrix}, \quad P^{-1} = \begin{bmatrix} X & * \\ * & * \end{bmatrix}$$

can be constructed if and only if

$$\begin{bmatrix} X & I \\ I & Y \end{bmatrix} > 0. \tag{5}$$

Factorizing $Y - X^{-1}$ as $FF^T$, a solution is given
by

$$P = \begin{bmatrix} Y & F \\ F^T & I \end{bmatrix}.$$

As an example of output feedback control design,
let us consider the stabilization of the plant

$$\dot{x}_P = Ax_P + Bu, \quad y = Cx_P \tag{6}$$

with a full-order dynamic controller

$$\dot{x}_K = A_K x_K + B_K y, \quad u = C_K x_K + D_K y. \tag{7}$$

The closed-loop system is

$$\begin{bmatrix} \dot{x}_P \\ \dot{x}_K \end{bmatrix} = A_c \begin{bmatrix} x_P \\ x_K \end{bmatrix}, \quad A_c = \begin{bmatrix} A + BD_K C & BC_K \\ B_K C & A_K \end{bmatrix}. \tag{8}$$

The stability condition is that the matrix
inequality

$$A_c^T P + P A_c < 0 \tag{9}$$

has a solution $P > 0$. To apply the variable
elimination method, we need to put all coefficient
matrices of the controller into a single matrix.
This is done as follows:

$$A_c = \bar{A} + \bar{B}\mathcal{K}\bar{C}, \quad \mathcal{K} = \begin{bmatrix} D_K & C_K \\ B_K & A_K \end{bmatrix} \tag{10}$$

in which $\bar{A} = \mathrm{diag}(A, 0)$, $\bar{B} = \mathrm{diag}(B, I)$,
and $\bar{C} = \mathrm{diag}(C, I)$, all being block diagonal.
Then, based on Lemma 1, the stability condition
reduces to the existence of symmetric matrices
$X$, $Y$ satisfying the LMIs

$$(B^T)_\perp^T (AX + XA^T)(B^T)_\perp < 0 \tag{11}$$

$$(C_\perp)^T (YA + A^T Y) C_\perp < 0. \tag{12}$$

Meanwhile, the positive definiteness of matrix $P$
is guaranteed by Eq. (5) in Lemma 2.
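
The construction in Lemma 2 can be checked numerically. The sketch below, in Python with NumPy, rests on two assumptions about formulas that are not legible in this copy of the text: that Eq. (5) is the coupling condition $\begin{bmatrix} X & I \\ I & Y \end{bmatrix} > 0$ and that the constructed matrix is $P = \begin{bmatrix} Y & F \\ F^T & I \end{bmatrix}$ with $FF^T = Y - X^{-1}$. The matrices `X` and `Y` are hypothetical, and a Cholesky factor is used as one possible choice of `F`.

```python
import numpy as np

def build_P(X, Y):
    """Build P = [[Y, F], [F^T, I]] with F F^T = Y - X^{-1} (assumed form)."""
    n = X.shape[0]
    F = np.linalg.cholesky(Y - np.linalg.inv(X))  # lower-triangular factor
    return np.block([[Y, F], [F.T, np.eye(n)]])

# Hypothetical positive-definite data with Y - X^{-1} > 0.
X = np.array([[2.0, 0.5], [0.5, 1.0]])
Y = np.array([[1.5, 0.2], [0.2, 2.0]])

coupling = np.block([[X, np.eye(2)], [np.eye(2), Y]])
P = build_P(X, Y)
P_inv = np.linalg.inv(P)
```

By the Schur complement, the upper-left block of `P_inv` equals $(Y - FF^T)^{-1} = X$, which is why the factorization of $Y - X^{-1}$ is used.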


LMI Approach to Robust Control, Fig. 1 Generalized
feedback system (generalized plant $G$ with inputs $w$, $u$
and outputs $z$, $y$; controller $K$ from $y$ to $u$)


From BMI to LMI: Variable Change

We may also use the method of variable change
to transform a BMI into an LMI. This method is
well suited to multi-objective optimization.

The detail is as follows (Gahinet 1996). A
positive-definite matrix can always be factorized
as the quotient of two triangular matrices, i.e.,

$$P\Pi_1 = \Pi_2, \quad \Pi_1 = \begin{bmatrix} X & I \\ M^T & 0 \end{bmatrix}, \quad \Pi_2 = \begin{bmatrix} I & Y \\ 0 & N^T \end{bmatrix}. \tag{13}$$

$P > 0$ is guaranteed by Eq. (5) for a full-order
controller. Further, the matrices $M$, $N$ are
computed from $MN^T = I - XY$. Consequently,
they are nonsingular.

An equivalent inequality
$\Pi_2^T A_c \Pi_1 + \Pi_1^T A_c^T \Pi_2 < 0$ is obtained by
multiplying Eq. (9) with $\Pi_1^T$ from the left and
$\Pi_1$ from the right. After a change of variables,
this inequality reduces to an LMI

$$\mathrm{He} \begin{bmatrix} AX + B\mathcal{C} & A + B\mathcal{D}C + \mathcal{A}^T \\ 0 & YA + \mathcal{B}C \end{bmatrix} < 0. \tag{14}$$

The new variables $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$, $\mathcal{D}$ are set as

$$\mathcal{A} = NA_K M^T + NB_K CX + YBC_K M^T + Y(A + BD_K C)X,$$
$$\mathcal{B} = NB_K + YBD_K, \quad \mathcal{C} = C_K M^T + D_K CX, \quad \mathcal{D} = D_K. \tag{15}$$

The coefficient matrices of the controller become

$$D_K = \mathcal{D}, \quad C_K = (\mathcal{C} - D_K CX)M^{-T}, \quad B_K = N^{-1}(\mathcal{B} - YBD_K),$$
$$A_K = N^{-1}(\mathcal{A} - NB_K CX - YBC_K M^T - Y(A + BD_K C)X)M^{-T}. \tag{16}$$


$\mathcal{H}_2$ and $\mathcal{H}_\infty$ Control

In system optimization, $\mathcal{H}_2$ and $\mathcal{H}_\infty$ norms
are the most popular and effective performance
indices. The $\mathcal{H}_2$ norm of a transfer function is
closely related to the squared area of its impulse
response. So, a smaller $\mathcal{H}_2$ norm implies a
faster response. Meanwhile, the $\mathcal{H}_\infty$ norm of a
transfer function is the largest magnitude of
its frequency response. Hence, for a transfer
function from the disturbance to the controlled
output, a smaller $\mathcal{H}_\infty$ norm guarantees a better
disturbance attenuation.

Usually $\mathcal{H}_2$ and $\mathcal{H}_\infty$ optimization problems
are treated in the generalized feedback system of
Fig. 1. Here, the generalized plant $G(s)$ includes
the nominal plant, the performance index, and the
weighting functions.
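
The two norms can be evaluated numerically for a small example. The sketch below, in Python with NumPy, is illustrative only: it computes the $\mathcal{H}_2$ norm of a stable, strictly proper system from the controllability Gramian (a standard characterization, not taken from this article) and estimates the $\mathcal{H}_\infty$ norm from below by the peak of the largest singular value on a frequency grid. The first-order system $1/(s+1)$ and the grid are hypothetical choices.

```python
import numpy as np

def h2_norm(A, B, C):
    """H2 norm of C (sI - A)^{-1} B for stable A, via A W + W A^T + B B^T = 0."""
    n = A.shape[0]
    I = np.eye(n)
    K = np.kron(A, I) + np.kron(I, A)  # row-major vectorization of the Lyapunov operator
    W = np.linalg.solve(K, -(B @ B.T).reshape(-1)).reshape(n, n)
    return float(np.sqrt(np.trace(C @ W @ C.T)))

def hinf_norm_grid(A, B, C, D, omegas):
    """Grid-based lower estimate of the H-infinity norm (peak gain over omegas)."""
    n = A.shape[0]
    peak = 0.0
    for w in omegas:
        G = C @ np.linalg.solve(1j * w * np.eye(n) - A, B) + D
        peak = max(peak, float(np.linalg.svd(G, compute_uv=False)[0]))
    return peak

# Hypothetical example: G(s) = 1 / (s + 1).
A = np.array([[-1.0]])
B = np.array([[1.0]])
C = np.array([[1.0]])
D = np.array([[0.0]])

omegas = np.concatenate(([0.0], np.logspace(-3, 3, 400)))
h2 = h2_norm(A, B, C)
hinf = hinf_norm_grid(A, B, C, D, omegas)
```

For this example the impulse response is $e^{-t}$, whose squared area is $1/2$, so the $\mathcal{H}_2$ norm is $\sqrt{1/2}$, while the frequency response peaks at 1 for $\omega = 0$. A frequency grid can miss narrow resonance peaks, so the second value should be read as an estimate.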

Let the generalized plant $G(s)$ be

$$G(s) = \begin{bmatrix} C_1 \\ C_2 \end{bmatrix} (sI - A)^{-1} \begin{bmatrix} B_1 & B_2 \end{bmatrix} + \begin{bmatrix} D_{11} & D_{12} \\ D_{21} & 0 \end{bmatrix}. \tag{17}$$

Further, the stabilizability of $(A, B_2)$ and the
detectability of $(C_2, A)$ are assumed. The
closed-loop transfer matrix from the disturbance
$w$ to the performance output $z$ is denoted by

$$H_{zw}(s) = C_c(sI - A_c)^{-1}B_c + D_c. \tag{18}$$


The condition for $H_{zw}(s)$ to have an $\mathcal{H}_2$ norm
less than $\gamma$, i.e., $\|H_{zw}\|_2 < \gamma$, is that there are
symmetric matrices $P$ and $W$ satisfying

$$\begin{bmatrix} PA_c + A_c^T P & C_c^T \\ C_c & -I \end{bmatrix} < 0, \quad \begin{bmatrix} W & B_c^T P \\ PB_c & P \end{bmatrix} > 0 \tag{19}$$

as well as $\mathrm{Tr}(W) < \gamma^2$. Here, $\mathrm{Tr}(W)$ denotes the
trace of matrix $W$, i.e., the sum of its diagonal
entries.

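The LMI solution is derived via the application
of the variable change method, as given
below.

Theorem 1 Suppose that $D_{11} = 0$. The $\mathcal{H}_2$
control problem is solvable if and only if there
exist symmetric matrices $X$, $Y$, $W$ and matrices $\mathcal{A}$,
$\mathcal{B}$, $\mathcal{C}$ satisfying the following LMIs:

$$\mathrm{He} \begin{bmatrix} AX + B_2\mathcal{C} & 0 & 0 \\ \mathcal{A} + A^T & YA + \mathcal{B}C_2 & 0 \\ C_1 X + D_{12}\mathcal{C} & C_1 & -\tfrac{1}{2}I \end{bmatrix} < 0 \tag{20}$$

$$\begin{bmatrix} W & B_1^T & B_1^T Y \\ B_1 & X & I \\ YB_1 & I & Y \end{bmatrix} > 0, \quad \mathrm{Tr}(W) < \gamma^2. \tag{21}$$

When the LMI Eqs. (20) and (21) have solutions,
an $\mathcal{H}_2$ controller is given by Eq. (16) by setting
$\mathcal{D} = 0$.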

The $\mathcal{H}_\infty$ control problem is to design a
controller so that $\|H_{zw}\|_\infty < \gamma$. The starting point of
$\mathcal{H}_\infty$ control is the well-known bounded real lemma,
which states that $H_{zw}(s)$ has an $\mathcal{H}_\infty$ norm less
than $\gamma$ if and only if there is a positive-definite
matrix $P$ satisfying

$$\begin{bmatrix} A_c^T P + PA_c & PB_c & C_c^T \\ B_c^T P & -\gamma I & D_c^T \\ C_c & D_c & -\gamma I \end{bmatrix} < 0. \tag{22}$$

There are two kinds of LMI solutions to this
control problem: one based on variable elimination
and one based on variable change.

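To state the first solution, define the following
matrices first:

$$N_Y = [C_2 \;\; D_{21}]_\perp, \quad N_X = [B_2^T \;\; D_{12}^T]_\perp. \tag{23}$$

Theorem 2 The $\mathcal{H}_\infty$ control problem has a
solution if and only if Eq. (5) and the following LMIs
have positive-definite solutions $X$, $Y$:

$$\begin{bmatrix} N_X^T & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} AX + XA^T & XC_1^T & B_1 \\ C_1 X & -\gamma I & D_{11} \\ B_1^T & D_{11}^T & -\gamma I \end{bmatrix} \begin{bmatrix} N_X & 0 \\ 0 & I \end{bmatrix} < 0 \tag{24}$$

$$\begin{bmatrix} N_Y^T & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} YA + A^T Y & YB_1 & C_1^T \\ B_1^T Y & -\gamma I & D_{11}^T \\ C_1 & D_{11} & -\gamma I \end{bmatrix} \begin{bmatrix} N_Y & 0 \\ 0 & I \end{bmatrix} < 0. \tag{25}$$

Once a matrix $P$ is computed according to
Lemma 2, Eq. (22) becomes an LMI and its
solution yields the controller.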

The second solution is given below.

Theorem 3 The $\mathcal{H}_\infty$ control problem has a
solution if and only if Eq. (5) and the following LMI
have solutions $X$, $Y$ and $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$, $\mathcal{D}$:

$$\mathrm{He} \begin{bmatrix} AX + B_2\mathcal{C} & A + B_2\mathcal{D}C_2 & B_1 + B_2\mathcal{D}D_{21} & 0 \\ \mathcal{A} & YA + \mathcal{B}C_2 & YB_1 + \mathcal{B}D_{21} & 0 \\ 0 & 0 & -\tfrac{\gamma}{2}I & 0 \\ C_1 X + D_{12}\mathcal{C} & C_1 + D_{12}\mathcal{D}C_2 & D_{11} + D_{12}\mathcal{D}D_{21} & -\tfrac{\gamma}{2}I \end{bmatrix} < 0. \tag{26}$$

The controller is given by Eq. (16).


Regional Pole Placement

The location of system poles determines the
response quality. However, for uncertain systems
it is impossible to place the closed-loop poles at
fixed points because they move with the variation
of the plant. Nevertheless, it is still possible to
place the closed-loop poles inside a region. For
convex regions characterized by LMI, the design
method is mature and proven effective in practice.

Let us see how to characterize a convex region.
It is easy to see that a complex number $z$ is
inside the disk of Fig. 2a if and only if it satisfies

$$\begin{bmatrix} -r & z + c \\ \bar{z} + c & -r \end{bmatrix} < 0.$$

Similarly, $z$ is inside the sector of Fig. 2b if and
only if

$$\begin{bmatrix} (z + \bar{z})\sin\theta & (z - \bar{z})\cos\theta \\ -(z - \bar{z})\cos\theta & (z + \bar{z})\sin\theta \end{bmatrix} < 0.$$

LMI Approach to Robust Control, Fig. 2 Typical
examples of LMI region: (a) disk region, (b) sector region

Generally, the set of complex numbers $z$
characterized by

$$\mathcal{D} = \{z \in C \mid L + zM + \bar{z}M^T < 0\} \tag{27}$$

is called an LMI region, in which $L$ is a symmetric
matrix. For the dynamic system

$$\dot{x} = Ax, \tag{28}$$

all of its poles are in the LMI region $\mathcal{D}$ if and only
if there is a positive-definite matrix $P$ satisfying
the LMI

$$L \otimes P + M \otimes (AP) + M^T \otimes (AP)^T < 0. \tag{29}$$

This forms the basis for the regional pole
placement design.

For the disk region in Fig. 2a, the condition
becomes

$$\begin{bmatrix} -rP & cP + AP \\ cP + (AP)^T & -rP \end{bmatrix} < 0. \tag{30}$$

Meanwhile, for the sector region in Fig. 2b, the
corresponding LMI is

$$\begin{bmatrix} (AP + PA^T)\sin\theta & (AP - PA^T)\cos\theta \\ -(AP - PA^T)\cos\theta & (AP + PA^T)\sin\theta \end{bmatrix} < 0. \tag{31}$$

Moreover, for a composite LMI region, such as
the intersection of the disk and the sector, the pole
placement is guaranteed by enforcing a common
solution $P$ to all the corresponding LMIs.

In the pole placement design, only the variable
change method is applicable. For example, in
the nominal closed-loop system Eq. (8), the pole
placement condition is that the LMI

$$L \otimes \begin{bmatrix} X & I \\ I & Y \end{bmatrix} + \mathrm{He}\left( M \otimes \begin{bmatrix} AX + B\mathcal{C} & A + B\mathcal{D}C \\ \mathcal{A} & YA + \mathcal{B}C \end{bmatrix} \right) < 0 \tag{32}$$

and Eq. (5) are solvable (Chilali and Gahinet
1996).

For systems with norm-bounded parameter
uncertainty, a robust pole placement method is
provided in Chilali et al. (1999).
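
The disk characterization can be tried out on the poles of a given matrix. The sketch below, in Python with NumPy, assumes that the disk condition for a complex number $z$ is $\begin{bmatrix} -r & z + c \\ \bar{z} + c & -r \end{bmatrix} < 0$, which describes the open disk of radius $r$ centered at $-c$ on the real axis; this reading of Fig. 2a is an assumption, since the figure is not available here. The matrix `A` and the helper names are hypothetical.

```python
import numpy as np

def in_disk_by_lmi(z, c, r):
    """True when the 2x2 Hermitian matrix [[-r, z + c], [conj(z) + c, -r]] is negative definite."""
    M = np.array([[-r, z + c], [np.conj(z) + c, -r]], dtype=complex)
    return bool(np.max(np.linalg.eigvalsh(M)) < 0)

def poles_in_disk(A, c, r):
    """Check the matrix condition at every eigenvalue of A."""
    return all(in_disk_by_lmi(z, c, r) for z in np.linalg.eigvals(A))

# Hypothetical system with poles at -2 +/- 1j.
A = np.array([[-2.0, 1.0], [-1.0, -2.0]])

inside_large = poles_in_disk(A, c=2.0, r=1.5)   # distance from -2 is 1 < 1.5
inside_small = poles_in_disk(A, c=2.0, r=0.5)   # distance from -2 is 1 > 0.5
direct_large = all(abs(z + 2.0) < 1.5 for z in np.linalg.eigvals(A))
```

The eigenvalues of the 2x2 matrix are $-r \pm |z + c|$, so negative definiteness is the same as $|z + c| < r$. The design conditions in the text go further: they replace the pole-by-pole test with a single matrix inequality in $P$, which avoids computing the poles and remains usable when the controller is unknown.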


Multi-objective Control 

It is noted that all of the preceding control designs
involve a positive-definite matrix $P$. Therefore, a
multi-objective control design is easily realized
by enforcing a common solution $P$ to the
corresponding matrix inequality conditions.

Gain-Scheduled Control 

In practice, many nonlinear systems can be
expressed, at least formally, as linear systems with
state-dependent coefficients, which is known as the LPV
(linear parameter-varying) form. For example, in
the model of a robot arm $J\ddot{\theta} + mgl\sin\theta = u$,
if we define a parameter as $p(t) = \sin\theta/\theta$,
then it can be written as an LPV model
$J\ddot{\theta}(t) + mgl\,p(t)\theta(t) = u(t)$. In this class of
systems, when the parameter $p(t)$ is available online
and its range is finite, one may tune the controller
parameters based on the information of $p(t)$,
so as to achieve a higher performance. This is
referred to as gain-scheduled control.
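
A short numerical check of the robot-arm example is sketched below in Python with NumPy. It evaluates the scheduling parameter $p = \sin\theta/\theta$ over a range of angles, confirms that the LPV term reproduces the nonlinear gravity term, and reports the range of $p$. The numerical values of `m`, `g`, `l` and the angle range $[-\pi, \pi]$ are hypothetical choices for illustration.

```python
import numpy as np

def scheduling_parameter(theta):
    """p = sin(theta) / theta, extended continuously by p(0) = 1."""
    theta = np.asarray(theta, dtype=float)
    return np.sinc(theta / np.pi)  # numpy's sinc(x) is sin(pi x) / (pi x)

# Hypothetical arm parameters (illustration only).
m, g, l = 1.0, 9.81, 0.4

theta = np.linspace(-np.pi, np.pi, 2001)
p = scheduling_parameter(theta)

nonlinear_term = m * g * l * np.sin(theta)   # gravity term of the original model
lpv_term = m * g * l * p * theta             # same term written as p(t) * theta(t)

p_min, p_max = float(p.min()), float(p.max())
```

On this angle range the parameter stays between 0 and 1, so its range is finite as the text requires, and a controller scheduled on `p` only needs to be designed for that interval.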

Consider the following affine model:

$$\dot{x} = A(p(t))x + B_1(p(t))d + B_2(p(t))u \tag{33}$$
$$z = C_1(p(t))x + D_{11}d + D_{12}u \tag{34}$$
$$y = C_2(p(t))x + D_{21}d \tag{35}$$

where $A(p)$ through $C_2(p)$ are affine functions of
the time-varying parameter vector $p(t)$, such as
$A(p) = A_0 + \sum_i p_i(t)A_i$. The gain-scheduled
control is to impose, on the coefficient matrices
of the controller, the same affine structure in
$p(t)$, such as $A_K(p) = A_{K0} + \sum_i p_i(t)A_{Ki}$.

To simplify the design, it is desirable that the
coefficient matrices of the closed-loop system
become affine functions of the parameter vector
$p(t)$. This may be achieved by restricting some
of the matrices of the controller to constant ones.
The easy-to-design structure of a gain-scheduled
controller is summarized as follows:

• Both $B_2(p)$ and $C_2(p)$ depend on $p(t)$: $(B_K, C_K)$
must be constant matrices and $D_K = 0$.

• Constant $(B_2, C_2)$: All coefficient matrices of
the controller can be affine functions of the
parameter vector $p(t)$.

• Constant $B_2$: $(B_K, D_K)$ must be constant
matrices.

• Constant $C_2$: $(C_K, D_K)$ must be constant
matrices.


When the structure of the gain-scheduled
controller is chosen as summarized above, the
solvability conditions reduce to those at all vertices
$\theta_i$ of the range of the scheduling parameter vector
$p(t)$. Further, a multi-objective design is achieved by
imposing a common solution $P$ on all LMI conditions.
Some concrete examples are given below:

$\mathcal{H}_\infty$ Norm Spec: The conditions of Theorem 3
are satisfied at all vertices $\theta_i$ of the parameter
vector $p(t)$.

$\mathcal{H}_2$ Norm Spec: The conditions of Theorem 1 are
satisfied at all vertices $\theta_i$ of the parameter
vector $p(t)$.

Regional Pole Placement: Eq. (32) is satisfied at
all vertices $\theta_i$ of the parameter vector $p(t)$ and
Eq. (5) holds.

Moreover, a different gain-scheduled method
is proposed in Packard (1994) for parametric
systems with norm-bounded uncertainty.

Summary and Future Direction 

The LMI approach is a powerful method that can
be applied to solve most robust control
problems effectively. In particular, its
capability of handling multi-objective control
problems is very attractive and has proven useful in
industrial applications.

Further study is needed in the following
directions.

• A new method of variable change is desired
in order to deal with the robust performance
design of parametric systems.

• Almost all robust performance designs are
carried out based on sufficient conditions. It
is very important to discover less conservative
design methods.

Abstract

Energy functions, an extension of Lyapunov
functions, have been used in electric power
systems for several applications. An overview
of energy function theory for general nonlinear
autonomous dynamical systems along with
its applications to electric power systems is
presented. The issue of how to optimally
determine the critical level value of an energy
function for estimating stability regions of
nonlinear dynamical systems is also addressed.


Keywords 

Energy function; Lyapunov function theory;
Optimal estimation; Power system stability;
Stability region


Introduction 

Energy functions, an extension of the Lyapunov
functions, have been practically used in electric
power systems for several applications. A
comprehensive energy function theory for general
nonlinear autonomous dynamical systems along
with its applications to electric power systems
will be summarized in this article.

We consider a general nonlinear autonomous
dynamical system described by the following
equation:

$$\dot{x}(t) = f(x(t)) \tag{1}$$

We say a function $V : R^n \to R$ is an energy
function for the system (1) if the following three
conditions are satisfied (Chiang et al. 1987):

(E1): The derivative of the energy function $V(x)$
along any system trajectory $x(t)$ is
nonpositive, i.e., $\dot{V}(x(t)) \le 0$.

(E2): If $x(t)$ is a nontrivial trajectory (i.e., $x(t)$
is not an equilibrium point), then along the
nontrivial trajectory $x(t)$ the set
$\{t \in R : \dot{V}(x(t)) = 0\}$ has measure zero in $R$.

(E3): That a trajectory $x(t)$ has a bounded value
of $V(x(t))$ for $t \in R^+$ implies that the
trajectory $x(t)$ is also bounded.

Condition (E1) indicates that the value of an
energy function is nonincreasing along its
trajectory, but does not imply that the energy function
is strictly decreasing along any trajectory.
Conditions (E1) and (E2) together imply that the energy
function is strictly decreasing along any nontrivial
system trajectory. Property (E3) states that the energy
function is a proper map along any system trajectory but
need not be a proper map for the entire state
space. Obviously, an energy function may not be
a Lyapunov function.

As an illustration of the energy function, we
consider the following classical transient stability
model and derive an energy function for the
model. Consider a power system consisting of
$n$ generators. Let the loads be modeled as
constant impedances. Under the assumption that the
transfer conductance of the reduced network after
eliminating all load buses is zero, the dynamics
of the $i$th generator can be represented by the
equations

$$\dot{\delta}_i = \omega_i$$
$$M_i\dot{\omega}_i = P_i - D_i\omega_i - \sum_{j=1}^{n+1} V_i V_j B_{ij}\sin(\delta_i - \delta_j) \tag{2}$$

where the voltage at node $n+1$ serves as the
reference, i.e., $\delta_{n+1} := 0$. This is a version of the
so-called classical model of the power system. It
can be shown that the following function
$V(\delta, \omega)$ is an energy function which satisfies
conditions (E1)-(E3) for the classical model (2):

$$V(\delta, \omega) = \frac{1}{2}\sum_{i=1}^{n} M_i\omega_i^2 - \sum_{i=1}^{n} P_i(\delta_i - \delta_i^s) - \sum_{i=1}^{n}\sum_{j=i+1}^{n+1} V_i V_j B_{ij}\left\{\cos(\delta_i - \delta_j) - \cos(\delta_i^s - \delta_j^s)\right\} \tag{3}$$

where $x^s = (\delta^s, 0)$ is the stable equilibrium point
under consideration.


Energy Function Theory 

In general, the dynamical behaviors of
trajectories of general nonlinear systems can be very
complicated. The asymptotic behaviors (i.e.,
the $\omega$-limit set) of trajectories can be
quasiperiodic trajectories or chaotic trajectories. However,
as shown below, every trajectory of system (1)
having an energy function has only two modes
of behavior: its trajectory either converges to an
equilibrium point or goes to infinity (becomes
unbounded) as time increases. This result is
explained in the following theorem:


Theorem 1 (Global Behavior of Trajectories)
If there exists a function satisfying condition
(E1) and condition (E2) of the energy function
for system (1), then every bounded trajectory of
system (1) converges to one of the equilibrium
points.

Theorem 1 asserts that there does not exist
any limit cycle (oscillatory behavior) or bounded
complicated behavior such as almost periodic
trajectory, chaotic motion, etc. in the system. We
next show a sharper result, asserting that every
trajectory on the stability boundary must
converge to one of the unstable equilibrium points
(UEPs) on the stability boundary. Recall that a
hyperbolic equilibrium point is an (asymptotically)
stable equilibrium point if all the eigenvalues
of its corresponding Jacobian have negative
real parts; otherwise it is an unstable
equilibrium point. Let $\hat{x}$ be a hyperbolic equilibrium
point. Its stable and unstable manifolds, $W^s(\hat{x})$
and $W^u(\hat{x})$, are well defined. There are many
physical systems, such as electric power systems,
containing multiple stable equilibrium points. A
useful concept for these kinds of systems is that
of the stability region (also called the region
of attraction). The stability region of a stable
equilibrium point $x_s$ is defined as

$$A(x_s) := \left\{x \in R^n : \lim_{t \to \infty} \phi_t(x) = x_s\right\}$$

where $\phi_t(x)$ denotes the trajectory of system (1)
starting from $x$ at $t = 0$. The boundary of the stability
region $A(x_s)$ is called the stability boundary of $x_s$
and will be denoted by $\partial A(x_s)$.

Theorem 2 (Trajectories on the Stability
Boundary (Chiang et al. 1987)) If there exists
an energy function for system (1), then every
trajectory on the stability boundary $\partial A(x_s)$
converges to one of the equilibrium points on
the stability boundary $\partial A(x_s)$.

The significance of this theorem is that it
offers an effective way to characterize the
stability boundary. In fact, Theorem 2 asserts that
the stability boundary $\partial A(x_s)$ is contained in the
union of stable manifolds of the UEPs on the
stability boundary, i.e.,

$$\partial A(x_s) \subseteq \bigcup_{x_i \in \{E \cap \partial A(x_s)\}} W^s(x_i)$$

The following two theorems give interesting
results on the structure of the equilibrium points
on the stability boundary. Moreover, they present
a necessary condition for the existence of certain
types of equilibrium points on a bounded stability
boundary.

Theorem 3 (Structure of Equilibrium Points
on the Stability Boundary (Chiang and Thorp
1989)) If there exists an energy function for
system (1) which has an asymptotically stable
equilibrium point $x_s$ (but not globally asymptotically
stable), then the stability boundary $\partial A(x_s)$ must
contain at least one type-one equilibrium point.
If, furthermore, the stability region is bounded,
then the stability boundary $\partial A(x_s)$ must contain
at least one type-one equilibrium point and one
source.

Theorem 4 (Sufficient Condition for Unbounded
Stability Region (Chiang et al. 1987))
If there exists an energy function for system (1)
which has an asymptotically stable equilibrium
point $x_s$ (but not globally asymptotically stable)
and if $\partial A(x_s)$ contains no source, then the
stability region $A(x_s)$ is unbounded.

A direct application of this is that the stability
boundary $\partial A(x_s)$ of an (asymptotically) stable
equilibrium point of the classical power system
stability model (2) is unbounded.


Optimally Estimating Stability Region 
Using Energy Functions 

In this section, we focus on how to optimally
determine the critical level value of an energy
function for estimating the stability boundary
$\partial A(x_s)$. We consider the following set:

$$S_V(k) = \{x \in R^n : V(x) < k\} \tag{4}$$

where $V(\cdot) : R^n \to R$ is an energy function.
We shall call the boundary of set (4),
$\partial S(k) := \{x \in R^n : V(x) = k\}$, the level set (or
constant energy surface) and $k$ the level value.
Generally speaking, the set $S_V(k)$ can be very
complicated, with several connected components even
for the 2-dimensional case. We use the notation
$S_k(x_s)$ to denote the only component of the several
disjoint connected components of $S_V(k)$ that contains
the stable equilibrium point $x_s$.

Theorem 5 (Optimal Estimation) Consider
the nonlinear system (1) which has an energy
function $V(x)$. Let $x_s$ be an asymptotically
stable equilibrium point whose stability region
$A(x_s)$ is not dense in $R^n$. Let $E_1$ be the
set of type-one equilibrium points and
$c = \min_{x_i \in \partial A(x_s) \cap E_1} V(x_i)$; then

1. $S_c(x_s) \subset A(x_s)$.

2. The set $\{S_b(x_s) \cap A^c(x_s)\}$ is nonempty for any
number $b > c$.

This theorem leads to an optimal estimation of
the stability region $A(x_s)$ via an energy function
$V(\cdot)$ (Chiang and Thorp 1989). For the purpose
of illustration, we consider the following simple
example:

$$\dot{x}_1 = -\sin x_1 - 0.5\sin(x_1 - x_2) + 0.01$$
$$\dot{x}_2 = -0.5\sin x_2 - 0.5\sin(x_2 - x_1) + 0.05 \tag{5}$$

It is easy to show that the following function is an
energy function for system (5):

$$V(x_1, x_2) = -2\cos x_1 - \cos x_2 - \cos(x_1 - x_2) - 0.02x_1 - 0.1x_2 \tag{6}$$

The point $x^s = (x_1^s, x_2^s) = (0.02801, 0.06403)$
is the stable equilibrium point whose stability
region is to be estimated. Applying the optimal
scheme to system (5), we have the critical level
value of $-0.31329$. Curve A in Fig. 1 is
the exact stability boundary $\partial A(x_s)$, while Curve
B is the stability boundary estimated by the
connected component (containing the s.e.p. $x_s$)
of the constant energy surface. It can be seen that
the critical level value, $-0.31329$, is indeed the
optimal value.

Lyapunov Methods in Power System Stability, Fig. 1
Curve A is the exact stability boundary $\partial A(x_s)$ of system
(5), while Curve B is the stability boundary estimated by
the constant energy surface (with level value of $-0.31329$)
of the energy function
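
The claims about example (5) and the function (6) can be checked numerically. The sketch below uses only the Python standard library. It confirms that the stated point is (approximately) an equilibrium, that the derivative of $V$ along the vector field equals $-2(\dot{x}_1^2 + \dot{x}_2^2)$, and that $V$ does not increase along a simulated trajectory. The starting point `(0.5, -0.5)`, the forward-Euler step size, and the finite-difference step are hypothetical choices, and the critical level value quoted in the text is not recomputed here.

```python
import math

def f(x1, x2):
    """Vector field of system (5)."""
    dx1 = -math.sin(x1) - 0.5 * math.sin(x1 - x2) + 0.01
    dx2 = -0.5 * math.sin(x2) - 0.5 * math.sin(x2 - x1) + 0.05
    return dx1, dx2

def V(x1, x2):
    """Energy function (6)."""
    return (-2.0 * math.cos(x1) - math.cos(x2) - math.cos(x1 - x2)
            - 0.02 * x1 - 0.1 * x2)

def V_dot(x1, x2, h=1e-6):
    """Derivative of V along the vector field, with a central-difference gradient."""
    g1 = (V(x1 + h, x2) - V(x1 - h, x2)) / (2 * h)
    g2 = (V(x1, x2 + h) - V(x1, x2 - h)) / (2 * h)
    d1, d2 = f(x1, x2)
    return g1 * d1 + g2 * d2

def simulate(x1, x2, dt=0.01, steps=5000):
    """Forward-Euler integration; returns the end point and the energy history."""
    energies = [V(x1, x2)]
    for _ in range(steps):
        d1, d2 = f(x1, x2)
        x1, x2 = x1 + dt * d1, x2 + dt * d2
        energies.append(V(x1, x2))
    return (x1, x2), energies

x_s = (0.02801, 0.06403)            # stable equilibrium point quoted in the text
residual = f(*x_s)                  # should be close to (0, 0)
end_point, energies = simulate(0.5, -0.5)
```

Because the gradient of $V$ equals $-2f$ for this system, condition (E1) holds with equality only at equilibrium points, which is why the energy history is monotone and the trajectory settles at the stable equilibrium point.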


Constructing Analytical Energy 
Functions for Transient Stability 
Models 

The task of constructing an energy function for
a (post-fault) transient stability model is essential
to direct stability analysis of power systems. The
role of the energy function is to make feasible
a direct determination of whether a given point
(such as the initial point of a post-fault power
system) lies inside the stability region of the post-fault
SEP without performing numerical integration. It
has been shown that a general (analytical) energy
function for power systems with losses does not
exist (Chiang 1989). One key implication is that
any general procedure attempting to construct
an energy function for a lossy power system
transient stability model must include a step that
checks for the existence of an energy function.
This step essentially plays the same role as the
Lyapunov equation in determining the stability of
an equilibrium point.

Several schemes are available for constructing 
numerical energy functions for power system 
transient stability models expressed as a set of 
general differential-algebraic equations (DAEs) 
(Chu and Chiang 1999, 2005). 


Applications 

After decades of research and development in
the energy-function-based direct methods and the
time-domain simulation approach, it has become
clear that the capabilities of direct methods and
those of the time-domain approach complement
each other. The current direction of development
is to include appropriate direct methods and
time-domain simulation programs within the body of
overall power system stability simulation
programs (Chiang 1999, 2011; Chiang et al. 1995;
Fouad and Vittal 1991; Sauer and Pai 1998).
For example, the direct method provides the
advantages of fast computational speed and energy
margins, which make it a good complement to
the traditional time-domain approach. The
energy margin and its functional relations to
certain power system parameters are an effective
complement for developing tools such as preventive
control schemes for credible contingencies which
are unstable and for developing fast calculators for
available transfer capability limited by transient
stability.

An effective, theory-based methodology for 
online screening and ranking of a large set 
of contingencies at operating points obtained 
from state estimators has been developed in 
Chiang et al. (2013). A set of improved BCU 
classifiers, along with their analytical basis, 
has been developed. Extensive evaluation of the 
improved BCU classifiers on a large test system 
and on the actual PJM interconnection system 
for a fast screening has been performed. This 
evaluation study is the largest in terms of system 
size, 14,500 buses and 3,000 generators, for a 
practical online transient stability assessment 
application. The evaluation results, performed 
on a total number of 5.3 million contingencies, 
were very promising in terms of speed, accuracy, 
reliability, and robustness (Chiang et al. 2013). 
This study also confirms the practicality of
theory-based methodology for online transient
stability assessment of large-scale power
systems; in particular, theory-based methods are
suitable for power system online applications
which demand speed, accuracy, reliability, and
robustness.

Recommended Reading

A recent book which contains a comprehensive
treatment of energy function theory and
applications is Chiang (2011).

for time-invariant and time-varying systems
modeled by ordinary differential equations.

Introduction

Stability theory plays a central role in systems
theory and engineering. For systems represented
by state models, stability is characterized by
studying the asymptotic behavior of the state
variables near steady-state solutions, like
equilibrium points or periodic orbits. In this article,
Lyapunov’s method for determining the
stability of equilibrium points is introduced. The
attractive features of the method include a solid
theoretical foundation, the ability to conclude
stability without knowledge of the solution (no
extensive simulation effort), and an analytical
framework that makes it possible to study the
effect of model perturbations and design feedback
control. Its main drawback is the need to search
for an auxiliary function that satisfies certain
conditions.


Stability of Equilibrium Points 

We consider a nonlinear system represented by
the state model

$$\dot{x} = f(x) \tag{1}$$

where the $n$-dimensional locally Lipschitz
function $f(x)$ is defined for all $x$ in a domain
$D \subset \mathbb{R}^n$. A function $f(x)$ is
locally Lipschitz at a point $x_0$ if it satisfies
the Lipschitz condition
$\|f(x) - f(y)\| \le L\|x - y\|$ for all $x, y$ in
some neighborhood of $x_0$, where $L$ is a positive
constant and
$\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$.
The Lipschitz condition guarantees that Eq. (1) has
a unique solution for given initial state $x(0)$.
Suppose $\bar{x} \in D$ is an equilibrium point of
Eq. (1); that is, $f(\bar{x}) = 0$. Whenever the
state of the system starts at $\bar{x}$, it will
remain at $\bar{x}$ for all future time. Our goal is
to characterize and study the stability of
$\bar{x}$. For convenience, we take $\bar{x} = 0$.
There is no loss of generality in doing so because
any equilibrium point $\bar{x}$ can be shifted to
the origin via the change of variables
$y = x - \bar{x}$. Therefore, we shall always assume
that $f(0) = 0$ and study stability of the origin
$x = 0$.

The equilibrium point $x = 0$ of Eq. (1) is
stable if for each $\varepsilon > 0$, there is
$\delta = \delta(\varepsilon) > 0$ such that
$\|x(0)\| < \delta$ implies that
$\|x(t)\| < \varepsilon$ for all $t \ge 0$. It is
asymptotically stable if it is stable and $\delta$
can be chosen such that $\|x(0)\| < \delta$ implies
that $x(t)$ converges to the origin as $t$ tends to
infinity. When the origin is asymptotically stable,
the region of attraction (also called region of
asymptotic stability, domain of attraction, or
basin) is defined as the set of all points $x$ such
that the solution of Eq. (1) that starts from $x$ at
time $t = 0$ approaches the origin as $t$ tends to
$\infty$. When the region of attraction is the whole
space, we say that the origin is globally
asymptotically stable. A stronger form of asymptotic
stability arises when there exist positive constants
$c$, $k$, and $\lambda$ such that the solutions of
Eq. (1) satisfy the inequality

$$\|x(t)\| \le k\|x(0)\|e^{-\lambda t}, \quad \forall\, t \ge 0 \tag{2}$$

for all $\|x(0)\| < c$. In this case, the
equilibrium point $x = 0$ is said to be
exponentially stable. It is said to be globally
exponentially stable if the inequality is satisfied
for any initial state $x(0)$.

Linear Systems 

For the linear time-invariant system

$$\dot{x} = Ax \tag{3}$$

the stability properties of the origin can be
determined by the location of the eigenvalues of
$A$. The origin is stable if and only if all the
eigenvalues of $A$ satisfy
$\mathrm{Re}[\lambda_i] \le 0$ and for every
eigenvalue with $\mathrm{Re}[\lambda_i] = 0$ and
algebraic multiplicity $q_i \ge 2$,
$\mathrm{rank}(A - \lambda_i I) = n - q_i$, where
$n$ is the dimension of $x$ and $q_i$ is the
multiplicity of $\lambda_i$ as a zero of
$\det(\lambda I - A)$. The origin is globally
exponentially stable if and only if all eigenvalues
of $A$ have negative real parts; that is, $A$ is a
Hurwitz matrix. For linear systems, the notions of
asymptotic and exponential stability are equivalent
because the solution is formed of exponential modes.
Moreover, due to linearity, if the origin is
exponentially stable, then the inequality of Eq. (2)
will hold for all initial states.
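
As a small worked illustration of the rank condition (the two matrices are chosen here for illustration and do not come from the article), consider two systems with $n = 2$ that both have the eigenvalue $\lambda = 0$ with algebraic multiplicity $q = 2$. Only the first one satisfies $\mathrm{rank}(A - \lambda I) = n - q = 0$:

```latex
A_1 = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \qquad
\operatorname{rank}(A_1) = 0 = n - q
\;\Longrightarrow\; x(t) = x(0)
\quad \text{(stable, but not asymptotically stable)}

A_2 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad
\operatorname{rank}(A_2) = 1 \neq n - q
\;\Longrightarrow\; x_1(t) = x_1(0) + t\,x_2(0), \quad x_2(t) = x_2(0)
\quad \text{(unbounded for } x_2(0) \neq 0\text{, hence unstable)}
```

In the second case the repeated eigenvalue on the imaginary axis produces a term that grows linearly in $t$, which is what the rank condition rules out.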

Linearization 

Suppose the function $f(x)$ of Eq. (1) is
continuously differentiable in a domain $D$
containing the origin. The Jacobian matrix
$[\partial f/\partial x]$ is an $n \times n$ matrix
whose $(i, j)$ element is
$\partial f_i/\partial x_j$. Let $A$ be the Jacobian
matrix evaluated at the origin $x = 0$. It can be
shown that

$$f(x) = [A + G(x)]\,x, \quad \text{where } \lim_{x \to 0} G(x) = 0$$

This suggests that in a small neighborhood of the
origin we can approximate the nonlinear system
$\dot{x} = f(x)$ by its linearization about the
origin $\dot{x} = Ax$. Indeed, we can draw
conclusions about the stability of the origin as an
equilibrium point for the nonlinear system by
examining the eigenvalues of $A$. The origin of
Eq. (1) is exponentially stable if and only if $A$
is Hurwitz. It is unstable if
$\mathrm{Re}[\lambda_i] > 0$ for one or more of the
eigenvalues of $A$. If
$\mathrm{Re}[\lambda_i] \le 0$ for all $i$, with
$\mathrm{Re}[\lambda_i] = 0$ for some $i$, we cannot
draw a conclusion about the stability of the origin
of Eq. (1).
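
The linearization test can be sketched numerically. The example below is only an illustration under stated assumptions: the damped pendulum model, the damping value `b = 0.5`, and the helper names `pendulum`, `jacobian`, and `eig2` are all chosen here and are not part of the article. The Jacobian at the origin is approximated by central differences, and the eigenvalues of the resulting 2 x 2 matrix are read off its characteristic polynomial.

```python
import cmath
import math

def pendulum(x, b=0.5):
    """f(x) for a damped pendulum (assumed example); f(0) = 0."""
    x1, x2 = x
    return [x2, -math.sin(x1) - b * x2]

def jacobian(f, x, h=1e-6):
    """Central-difference approximation of the Jacobian [df/dx] at x."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        cols.append([(fp[i] - fm[i]) / (2 * h) for i in range(n)])
    # cols[j][i] approximates df_i/dx_j; transpose so that A[i][j] = df_i/dx_j.
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def eig2(A):
    """Eigenvalues of a 2x2 matrix from lambda^2 - tr*lambda + det = 0."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + disc) / 2, (tr - disc) / 2]

A = jacobian(pendulum, [0.0, 0.0])
eigs = eig2(A)
is_hurwitz = all(lam.real < 0 for lam in eigs)
```

For this model the Jacobian at the origin is close to `[[0, 1], [-1, -0.5]]`, both eigenvalues have real part near -0.25, and the origin of the nonlinear pendulum is therefore exponentially stable. Setting `b = 0` would put the eigenvalues on the imaginary axis, which is the inconclusive case.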


Lyapunov's Method 

Let $V(x)$ be a continuously differentiable scalar
function defined in a domain
$D \subset \mathbb{R}^n$ that contains the origin.
The function $V(x)$ is said to be positive definite
if $V(0) = 0$ and $V(x) > 0$ for $x \neq 0$. It is
said to be positive semidefinite if $V(x) \ge 0$ for
all $x$. A function $V(x)$ is said to be negative
definite or negative semidefinite if $-V(x)$ is
positive definite or positive semidefinite,
respectively. The derivative of $V$ along the
trajectories of Eq. (1) is given by


$$\dot{V}(x) = \sum_{i=1}^{n} \frac{\partial V}{\partial x_i}\,\dot{x}_i = \frac{\partial V}{\partial x}\,f(x)$$

where $[\partial V/\partial x]$ is a row vector
whose $i$th component is
$\partial V/\partial x_i$.

Lyapunov’s stability theorem states that the
origin is stable if there is a continuously
differentiable positive definite function $V(x)$ so
that $\dot{V}(x)$ is negative semidefinite, and it
is asymptotically stable if $\dot{V}(x)$ is negative
definite. A function $V(x)$ satisfying the
conditions for stability is called a Lyapunov
function. The surface $V(x) = c$, for some $c > 0$,
is called a Lyapunov surface or a level surface.

When $\dot{V}(x)$ is only negative semidefinite, we
may still conclude asymptotic stability of the
origin if we can show that no solution can stay
identically in the set $\{\dot{V}(x) = 0\}$, other
than the zero solution $x(t) \equiv 0$. Under this
condition, $V(x(t))$ must decrease toward 0, and
consequently $x(t)$ converges to zero as $t$ tends
to infinity. This extension of the basic theorem is
known as the invariance principle.

Lyapunov functions can be used to estimate the
region of attraction of an asymptotically stable
origin, that is, to find sets contained in the
region of attraction. Let $V(x)$ be a Lyapunov
function that satisfies the conditions of asymptotic
stability over a domain $D$. For a positive constant
$c$, let $\Omega_c$ be the component of
$\{V(x) < c\}$ that contains the origin in its
interior. The properties of $V$ guarantee that, by
choosing $c$ small enough, $\Omega_c$ will be
bounded and contained in $D$. Then, every trajectory
starting in $\Omega_c$ remains in $\Omega_c$ and
approaches the origin as $t \to \infty$. Thus,
$\Omega_c$ is an estimate of the region of
attraction. If $D = \mathbb{R}^n$ and $V(x)$ is
radially unbounded, that is, $\|x\| \to \infty$
implies that $V(x) \to \infty$, then any point
$x \in \mathbb{R}^n$ can be included in a bounded
set $\Omega_c$ by choosing $c$ large enough.
Therefore, the origin is globally asymptotically
stable if there is a continuously differentiable,
radially unbounded function $V(x)$ such that for all
$x \in \mathbb{R}^n$, $V(x)$ is positive definite
and $\dot{V}(x)$ is either negative definite or
negative semidefinite but no solution can stay
identically in the set $\{\dot{V}(x) = 0\}$ other
than the zero solution $x(t) \equiv 0$.
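
As a small illustration of Lyapunov's method and the invariance principle, the sketch below uses a damped pendulum with the energy-like function V(x) = (1 - cos x1) + x2^2/2. The model, the damping value, the initial state, and all function names are assumptions made here for illustration. Along trajectories the derivative of V equals -b*x2^2, which is only negative semidefinite; the only solution that stays in the set where it vanishes (near the origin) is the zero solution, so V should decay to zero.

```python
import math

def f(x, b=0.5):
    """Damped pendulum (assumed example): x1' = x2, x2' = -sin(x1) - b*x2."""
    return [x[1], -math.sin(x[0]) - b * x[1]]

def V(x):
    """Lyapunov function candidate; positive definite for |x1| < 2*pi."""
    return (1.0 - math.cos(x[0])) + 0.5 * x[1] ** 2

def V_dot(x, b=0.5):
    """[dV/dx] f(x); algebraically this equals -b * x2**2."""
    grad = [math.sin(x[0]), x[1]]
    fx = f(x, b)
    return grad[0] * fx[0] + grad[1] * fx[1]

def rk4_step(x, dt):
    """One classical Runge-Kutta step for x' = f(x)."""
    k1 = f(x)
    k2 = f([x[i] + 0.5 * dt * k1[i] for i in range(2)])
    k3 = f([x[i] + 0.5 * dt * k2[i] for i in range(2)])
    k4 = f([x[i] + dt * k3[i] for i in range(2)])
    return [x[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
            for i in range(2)]

x = [1.0, 0.0]            # initial state inside the level set V(x) < 2
values = [V(x)]
for _ in range(6000):     # 60 time units with dt = 0.01
    x = rk4_step(x, 0.01)
    values.append(V(x))
```

The list `values` is nonincreasing up to rounding error and ends close to zero, which is the behavior the invariance principle predicts for this example.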


Time-Varying Systems 

Equation (1) is time-invariant because $f$ does
not depend on $t$. The more general time-varying
system is represented by

$$\dot{x} = f(t, x) \tag{4}$$

In this case, we may allow the Lyapunov function
candidate $V$ to depend on $t$. Let $V(t, x)$ be a
continuously differentiable function defined for
all $t \ge 0$ and $x \in D$. The derivative of $V$
along the trajectories of Eq. (4) is given by

$$\dot{V}(t, x) = \frac{\partial V}{\partial t} + \frac{\partial V}{\partial x}\,f(t, x)$$

If there are positive definite functions $W_1(x)$,
$W_2(x)$, and $W_3(x)$ such that

$$W_1(x) \le V(t, x) \le W_2(x) \tag{5}$$

$$\dot{V}(t, x) \le -W_3(x) \tag{6}$$

for all $t \ge 0$ and all $x \in D$, then the
origin is uniformly asymptotically stable, where
“uniformly” indicates that the
$\varepsilon$-$\delta$ definition of stability and
the convergence of $x(t)$ to zero are independent of
the initial time $t_0$. Such uniformity annotation
is not needed with time-invariant systems since the
solution of a time-invariant state equation starting
at time $t_0$ depends only on the difference
$t - t_0$, which is not the case for time-varying
systems. If the inequalities of Eqs. (5) and (6)
hold globally and $W_1(x)$ is radially unbounded,
then the origin is globally uniformly asymptotically
stable. If $W_1(x) = k_1\|x\|^a$,
$W_2(x) = k_2\|x\|^a$, and $W_3(x) = k_3\|x\|^a$ for
some positive constants $k_1$, $k_2$, $k_3$, and
$a$, then the origin is exponentially stable.
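
A short worked example may help fix ideas. The scalar system below is a standard textbook-style illustration chosen here, not taken from the article; $g(t)$ is assumed to be continuous and nonnegative. A time-independent candidate $V$ already satisfies Eqs. (5) and (6):

```latex
\dot{x} = -\bigl[1 + g(t)\bigr]\,x^{3}, \qquad g(t) \ge 0 \ \text{for all } t \ge 0

V(x) = \tfrac{1}{2}x^{2}
\;\Longrightarrow\;
W_1(x) = W_2(x) = \tfrac{1}{2}x^{2}

\dot{V}(t, x) = \frac{\partial V}{\partial x}\,f(t, x)
             = -\bigl[1 + g(t)\bigr]\,x^{4}
             \le -x^{4} = -W_3(x)
```

The inequalities hold for all $x$ and $W_1$ is radially unbounded, so the origin is globally uniformly asymptotically stable. The exponential-stability condition is not met by this choice, because $W_3$ does not have the same power of $\|x\|$ as $W_1$ and $W_2$.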


Perturbed Systems 

Consider the system

$$\dot{x} = f(t, x) + g(t, x) \tag{7}$$

where $f$ and $g$ are continuous in $t$ and locally
Lipschitz in $x$, for all $t \ge 0$ and $x \in D$,
in which $D \subset \mathbb{R}^n$ is a domain that
contains the origin $x = 0$. Suppose $f(t, 0) = 0$
and $g(t, 0) = 0$ so that the origin is an
equilibrium point of Eq. (7). We think of the
system (7) as a perturbation of the nominal
system (4). The perturbation term $g(t, x)$ could
result from modeling errors, uncertainties, or
disturbances. In a typical situation, we do not know
$g(t, x)$, but we know some information about it,
like knowing an upper bound on $\|g(t, x)\|$.
Suppose the nominal system has an exponentially
stable equilibrium point at the origin; what can we
say about the stability of the origin as an
equilibrium point of the perturbed system? A natural
approach to address this question is to use a
Lyapunov function for the nominal system as a
Lyapunov function candidate for the perturbed
system.

Let $V(t, x)$ be a Lyapunov function that
satisfies

$$c_1\|x\|^2 \le V(t, x) \le c_2\|x\|^2 \tag{8}$$

$$\frac{\partial V}{\partial t} + \frac{\partial V}{\partial x}\,f(t, x) \le -c_3\|x\|^2 \tag{9}$$

$$\left\|\frac{\partial V}{\partial x}\right\| \le c_4\|x\| \tag{10}$$

for all $x \in D$ for some positive constants $c_1$,
$c_2$, $c_3$, and $c_4$. Suppose the perturbation
term $g(t, x)$ satisfies the linear growth bound

$$\|g(t, x)\| \le \gamma\|x\|, \quad \forall\, t \ge 0, \ \forall\, x \in D \tag{11}$$

where $\gamma$ is a nonnegative constant. We use
$V$ as a Lyapunov function candidate to investigate
the stability of the origin as an equilibrium point
for the perturbed system. The derivative of $V$
along the trajectories of Eq. (7) is given by

$$\dot{V}(t, x) = \frac{\partial V}{\partial t} + \frac{\partial V}{\partial x}\,f(t, x) + \frac{\partial V}{\partial x}\,g(t, x)$$

The first two terms on the right-hand side are the
derivative of $V(t, x)$ along the trajectories of
the nominal system, which is negative definite and
satisfies the inequality of Eq. (9). The third term,
$[\partial V/\partial x]g$, is the effect of the
perturbation. Using Eqs. (9) through (11), we obtain

$$\dot{V}(t, x) \le -c_3\|x\|^2 + \left\|\frac{\partial V}{\partial x}\right\| \|g(t, x)\| \le -c_3\|x\|^2 + c_4\gamma\|x\|^2$$

If $\gamma < c_3/c_4$, then

$$\dot{V}(t, x) \le -(c_3 - \gamma c_4)\|x\|^2, \quad (c_3 - \gamma c_4) > 0$$

which shows that the origin is an exponentially
stable equilibrium point of the perturbed
system (7).
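
The bound on the perturbation can be checked on a scalar example. The sketch below is an illustration under assumptions made here: the nominal system is x' = -a*x, the perturbation is g(t, x) = gamma*sin(t)*x, and V(x) = x^2/2, so that c1 = c2 = 1/2, c3 = a, and c4 = 1. With gamma < c3/c4, the estimate on the derivative of V gives |x(t)| <= |x(0)| exp(-(c3 - gamma*c4) t), which the simulation verifies numerically.

```python
import math

a, gamma = 1.0, 0.4   # assumed nominal decay rate and perturbation bound

def rhs(t, x):
    """Perturbed scalar system: nominal part -a*x plus g(t, x) = gamma*sin(t)*x."""
    return -a * x + gamma * math.sin(t) * x

# With V(x) = x**2 / 2: c3 = a and c4 = 1, so the condition is gamma < c3 / c4.
c3, c4 = a, 1.0
assert gamma < c3 / c4

def simulate(x0, dt=1e-3, t_end=10.0):
    """Classical fourth-order Runge-Kutta integration; returns a list of (t, x)."""
    t, x, out = 0.0, x0, [(0.0, x0)]
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(t, x)
        k2 = rhs(t + dt / 2, x + dt / 2 * k1)
        k3 = rhs(t + dt / 2, x + dt / 2 * k2)
        k4 = rhs(t + dt, x + dt * k3)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += dt
        out.append((t, x))
    return out

x0 = 2.0
traj = simulate(x0)
rate = c3 - gamma * c4   # V_dot <= -rate * x**2 = -2 * rate * V
bound_ok = all(abs(x) <= abs(x0) * math.exp(-rate * t) + 1e-9 for t, x in traj)
```

The guaranteed rate `c3 - gamma*c4` is conservative: the actual trajectory decays faster on average, since the perturbation here changes sign over time while the bound only uses its magnitude.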


Summary 

Lyapunov’s method is a powerful tool for studying
the stability of equilibrium points. However,
there are two drawbacks of the method. First,
there is no systematic procedure for finding
Lyapunov functions. Second, the conditions of the
theory are only sufficient; they are not necessary.
Failure of a Lyapunov function candidate to
satisfy the conditions for stability or asymptotic
stability does not mean that the origin is not stable
or asymptotically stable. These drawbacks have
been mitigated by a long history of using the
method in the analysis and design of engineering
systems, where various techniques for finding
Lyapunov functions for specific systems have
been determined.

Cross-References 

► Feedback Stabilization of Nonlinear Systems 

► Input-to-State Stability 

► Regulation and Tracking of Nonlinear Systems 

Recommended Reading 

For an introduction to Lyapunov’s stability
theory at the level of first-year graduate students,
the textbooks Khalil (2002), Sastry (1999),
Slotine and Li (1991), and Vidyasagar (2002)
are recommended. The books by Bacciotti and
Rosier (2005) and Haddad and Chellaboina
(2008) cover a wider set of topics at the
same introductory level. A deeper look into
the theory is provided in the monographs
Hahn (1967), Krasovskii (1963), Rouche et al.
(1977), Yoshizawa (1966), and Zubov (1964).
Lyapunov’s theory for discrete-time systems is
presented in Haddad and Chellaboina (2008) and
Qu (1998). The monograph Michel and Wang
(1995) presents Lyapunov’s stability theory for
general dynamical systems, including functional
and partial differential equations.
Abstract

Markov chains refer to stochastic processes
whose states change according to transition
probabilities determined only by the states of
the previous time step. They have been crucial
for modeling large-scale systems with random
behavior in various fields such as control,
communications, biology, optimization, and
economics. In this entry, we focus on their
recent application to the area of search engines,
namely, the PageRank algorithm employed at
Google, which provides a measure of importance
for each page in the web. We present several
studies carried out with control theoretic tools
such as aggregation, distributed randomized
algorithms, and PageRank optimization. Due
to the large size of the web, computational
issues are the underlying motivation of these
studies.


Keywords 

Aggregation; Distributed randomized algorithms; 
Markov chains; Optimization; PageRank; Search 
engines; World wide web 

Introduction 

For various real-world large-scale dynamical
systems, reasonable models describing highly
complex behaviors can be expressed as stochastic
systems, and one of the most well-studied classes
of such systems is that of Markov chains. A
characteristic feature of Markov chains is that
their behavior does not carry any memory. That is,
the probability distribution of the current state
of a chain depends only on the state at the previous
time step and not at all on the states prior to that
step (Kumar and Varaiya 1986; Norris 1997).

Recently, Markov chains have gained
renewed interest due to the extremely successful
applications in the area of web search. The
search engine of Google has been employing
an algorithm known as PageRank to assist
the ranking of search results. This algorithm
models the network of web pages as a Markov
chain whose states represent the pages that
web surfers with various interests visit in a
random fashion. The objective is to find an
order among the pages according to their
popularity and importance, and this is done by
focusing on the structure of hyperlinks among
pages.

In this entry, we first provide a brief overview
of the basics of Markov chains and then introduce
the problem of PageRank computation. We proceed
to provide further discussions on control theoretic
approaches dealing with PageRank problems. The
topics covered include aggregated Markov chains,
distributed randomized algorithms, and Markov
decision problems for link optimization.

Markov Chains 

In the simplest form, a Markov chain takes its 
states in a finite state space with transitions in 
the discrete-time domain. The transition from one 
state to another is characterized completely by the 
underlying probability distribution. 

Let $\mathcal{X}$ be a finite set given by
$\mathcal{X} := \{1, 2, \ldots, n\}$, which is
called the state space. Consider a stochastic
process $\{X_k\}_{k=0}^{\infty}$ taking values on
this set $\mathcal{X}$. Such a process is called a
Markov chain if it exhibits the following Markov
property:

$$\mathrm{Prob}\{X_{k+1} = j \mid X_k = i_k, X_{k-1} = i_{k-1}, \ldots, X_0 = i_0\} = \mathrm{Prob}\{X_{k+1} = j \mid X_k = i_k\},$$

where $\mathrm{Prob}\{\cdot \mid \cdot\}$ denotes
the conditional probability and
$k \in \mathbb{Z}_+$. That is, the state at the next
time step depends only on the current state and
not those of previous times.

Here, we consider the homogeneous case
where the transition probability is constant over
time. Thus, we have for each pair
$i, j \in \mathcal{X}$, the probability that the
chain goes from state $j$ to state $i$ at time $k$
expressed as

$$p_{ij} := \mathrm{Prob}\{X_k = i \mid X_{k-1} = j\}, \quad k \in \mathbb{Z}_+.$$

In the matrix form, $P := (p_{ij})$ is called the
transition probability matrix of the chain. It is
obvious that all entries of $P$ are nonnegative, and
for each $j$, the entries of the $j$th column of $P$
sum up to 1, i.e., $\sum_{i=1}^{n} p_{ij} = 1$ for
$j \in \mathcal{X}$. In this respect, the matrix $P$
is (column) stochastic (Horn and Johnson 1985).

In this entry, we assume that the Markov chain
is ergodic, meaning that for any pair of states,
the chain can make a transition from one to the
other over time. In this case, the chain and the
matrix $P$ are called irreducible. This property is
known to imply that $P$ has a simple eigenvalue
of 1. Thus, there exists a unique steady state
probability distribution $\pi \in \mathbb{R}^n$
given by

$$\pi = P\pi, \quad \mathbf{1}^T \pi = 1, \quad \pi_i > 0, \ \forall\, i \in \mathcal{X},$$

where $\mathbf{1} \in \mathbb{R}^n$ denotes a vector
with entries one. Note that in this distribution
$\pi$, all entries are positive.
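
The definitions above can be illustrated with a three-state chain. The matrix `P` and the function names below are made up for this sketch; the convention follows the text, with entry `P[i][j]` giving the probability of moving from state j to state i so that columns sum to one. The stationary distribution is approximated by repeated multiplication, which converges here because this particular chain is also aperiodic (an extra assumption beyond irreducibility).

```python
def is_column_stochastic(P, tol=1e-12):
    """Check nonnegativity and unit column sums of a square matrix."""
    n = len(P)
    nonneg = all(P[i][j] >= 0 for i in range(n) for j in range(n))
    sums_ok = all(abs(sum(P[i][j] for i in range(n)) - 1.0) < tol
                  for j in range(n))
    return nonneg and sums_ok

def stationary(P, iters=500):
    """Iterate pi <- P pi, starting from the uniform distribution."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(P[i][j] * pi[j] for j in range(n)) for i in range(n)]
    return pi

# p_ij = Prob{X_k = i | X_{k-1} = j}; each column sums to 1.
P = [[0.5, 0.2, 0.0],
     [0.5, 0.5, 0.3],
     [0.0, 0.3, 0.7]]
pi = stationary(P)
```

For this matrix the balance equations can be solved by hand and give the distribution (1/6, 5/12, 5/12), with all entries positive as stated above.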

Ranking in Search Engines: PageRank 
Algorithm 

At Google, PageRank is used to quantify the
importance of each web page based on the
hyperlink structure of the web (Brin and Page 1998;
Langville and Meyer 2006). A page is considered
important if (i) many pages have links pointing
to the page, (ii) such pages having links are
important ones, and (iii) the numbers of links
that such pages have are limited. Intuitively, these
requirements are reasonable. For a web page, its
incoming links can be viewed as votes supporting
the page, and moreover the quality of each vote
depends on the importance of the voting page as
well as on the number of votes that it casts. Even
if a minor page (with low PageRank) has many
outgoing links, its contribution to the linked pages
will not be substantial.

An interesting way to explain the PageRank
is through the random surfer model: The random
surfer starts from a randomly chosen page. Each
time visiting a page, he/she follows a hyperlink in
that page chosen at random with uniform
probability. Hence, if the current page $i$ has
$n_i$ outgoing links, then one of them is picked
with probability $1/n_i$. If it happens that the
current page has no outgoing link (e.g., a PDF
document), the surfer will use the back button. This
process will be repeated. The PageRank value for a
page represents the probability of the surfer
visiting the page. It is thus higher for pages
visited more often by the surfer.

It is now clear that PageRank is obtained by
describing the random surfer model as a Markov
chain and then finding its stationary distribution.
First, we express the network of web pages as
the directed graph
$\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where
$\mathcal{V} = \{1, 2, \ldots, n\}$ is the set of
nodes corresponding to web page indices while
$\mathcal{E} \subset \mathcal{V} \times \mathcal{V}$
is the set of edges for links among pages. Node $i$
is connected to node $j$ by an edge, i.e.,
$(i, j) \in \mathcal{E}$, if page $i$ has an
outgoing link to page $j$.

Let $x_i(k)$ be the probability that the random
surfer visits page $i$ at time $k$, and let $x(k)$
be the vector containing all $x_i(k)$. Given the
initial distribution $x(0)$, which is a probability
vector, i.e., $\sum_{i=1}^{n} x_i(0) = 1$, the
evolution of $x(k)$ can be expressed as

$$x(k + 1) = Ax(k). \tag{1}$$

The link matrix
$A = (a_{ij}) \in \mathbb{R}^{n \times n}$ is given
by $a_{ij} = 1/n_j$ if $(j, i) \in \mathcal{E}$ and
0 otherwise, where $n_j$ is the number of outgoing
links of page $j$. Note that this matrix $A$ is the
transition probability matrix of the random surfer.
Clearly, it is stochastic, and thus $x(k)$ remains a
probability vector so that
$\sum_{i=1}^{n} x_i(k) = 1$ for all $k$.
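
The construction of the link matrix can be sketched as follows. The four-page graph, the zero-based page numbering, and the names `link_matrix` and `step` are assumptions made for this illustration; every page is assumed to have at least one outgoing link and the edge list is assumed to contain no duplicates.

```python
def link_matrix(n, edges):
    """Return A with a_ij = 1/n_j if (j, i) is an edge and 0 otherwise.

    Pages are numbered 0..n-1 here (the text numbers them 1..n).
    An edge (src, dst) means that page src links to page dst.
    """
    out_degree = [0] * n
    for src, _dst in edges:
        out_degree[src] += 1
    A = [[0.0] * n for _ in range(n)]
    for src, dst in edges:
        A[dst][src] = 1.0 / out_degree[src]
    return A

def step(A, x):
    """One step of the recursion x(k+1) = A x(k)."""
    n = len(x)
    return [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]

edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 0), (3, 2)]
A = link_matrix(4, edges)
x1 = step(A, [0.25] * 4)   # one step from the uniform distribution
```

Each column of `A` sums to one, so `x1` is again a probability vector. Page 3 has no incoming links in this graph and therefore receives probability zero after a single step.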

As mentioned above, PageRank is the
stationary distribution of the process (1) under
the assumption that the limit exists. Hence,
the PageRank vector is given by
$x^* := \lim_{k \to \infty} x(k)$. In other words,
it is the solution of the linear equation

$$x^* = Ax^*, \quad x^* \in [0, 1]^n, \quad \mathbf{1}^T x^* = 1. \tag{2}$$

Notice that the PageRank vector $x^*$ is a
nonnegative unit eigenvector for the eigenvalue 1 of
$A$. Such a vector exists since the matrix $A$ is
stochastic, but may not be unique; the reason is
that $A$ is a reducible matrix since in the web, not
every pair of pages can be connected by simply
following links. To resolve this issue, a slight
modification is necessary in the random surfer
model.


The idea of the teleportation model is that
the random surfer, after a while, becomes bored
and stops following the hyperlinks. At such an
instant, the surfer “jumps” to another page not
directly connected to the one currently being
visited. This page can in fact be completely
unrelated in its domain and/or contents. All $n$
pages in the web have the same probability $1/n$ of
being reached by a jump.

The probability to make such a jump is denoted
by $m \in (0, 1)$. The original transition
probability matrix $A$ is now replaced with the
modified one $M \in \mathbb{R}^{n \times n}$ defined
by

$$M := (1 - m)A + \frac{m}{n}\,\mathbf{1}\mathbf{1}^T. \tag{3}$$

For the value of $m$, we take $m = 0.15$ as reported
in the original algorithm in Brin and Page (1998).
Notice that $M$ is a positive stochastic matrix. By
Perron’s theorem (Horn and Johnson 1985), the
eigenvalue 1 is of multiplicity 1 and is the unique
eigenvalue with the maximum modulus. Further,
the corresponding eigenvector is positive. Hence,
we redefine the vector $x^*$ in (2) by using $M$
instead of $A$ as follows:

$$x^* = Mx^*, \quad x^* \in [0, 1]^n, \quad \mathbf{1}^T x^* = 1.$$

Due to the large dimension of the link matrix
$M$, the computation of $x^*$ is difficult. The
solution employed in practice is based on the power
method given by

$$x(k + 1) = Mx(k) = (1 - m)Ax(k) + \frac{m}{n}\,\mathbf{1}, \tag{4}$$

where the initial vector $x(0) \in \mathbb{R}^n$ is
a probability vector. The second equality above
follows from the fact $\mathbf{1}^T x(k) = 1$ for
$k \in \mathbb{Z}_+$. For implementation, the form
on the far right-hand side is important, using only
the sparse matrix $A$ and not the dense matrix $M$.
This method asymptotically finds the PageRank
vector, that is, $x(k) \to x^*$ as $k \to \infty$.
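
A minimal sketch of the power method in its sparse form is given below. The graph, the stopping rule, and the names `pagerank` and `out_links` are assumptions made here for illustration; only adjacency lists are stored, so the dense matrix M is never formed. Every page is assumed to have at least one outgoing link.

```python
def pagerank(out_links, m=0.15, tol=1e-12, max_iter=1000):
    """Power method x(k+1) = (1-m) A x(k) + (m/n) 1 using adjacency lists.

    out_links[j] lists the pages that page j links to (zero-based indices).
    """
    n = len(out_links)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        new = [m / n] * n                      # teleportation term (m/n) 1
        for j, targets in enumerate(out_links):
            share = (1.0 - m) * x[j] / len(targets)
            for i in targets:                  # column j of A spreads x_j
                new[i] += share
        change = sum(abs(new[i] - x[i]) for i in range(n))
        x = new
        if change < tol:                       # stop once the 1-norm change is small
            break
    return x

out_links = [[1, 2], [2], [0], [0, 2]]
x_star = pagerank(out_links)
```

In this small graph page 3 has no incoming links, so its value settles at the teleportation share m/n = 0.0375, while page 2, which is linked by three pages, scores higher than page 1. Each iteration costs a number of operations proportional to the number of links rather than to n squared.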
Aggregation Methods for Large-Scale
Markov Chains

In dealing with large-scale Markov chains, 
it is often desirable to predict their dynamic 
behaviors from reduced-order models that are 
more computationally tractable. This enables us, 
for example, to analyze the system performance 
at a macroscale with some approximation under 
different operating conditions. Aggregation 
refers to partitioning or grouping the states so 
that the states in each group can be treated 
as a whole. The technique of aggregation is 
especially effective for Markov chains possessing 
sparse structures with strong interactions among 
states in the same group and weak interactions 
among states in different groups. Such methods 
have been extensively studied, motivated by 
applications in queueing networks, power 
systems, etc. (Meyer 1989). 

In the context of the PageRank problem, such
sparse interconnection can be expressed in the
link matrix $A$ with a block-diagonal structure
(after some coordinate change, if necessary). The
entries of the matrix $A$ are dense along its
diagonal in blocks, and those outside the blocks
take small values. More concretely, we write

$$A = I + B + \epsilon C, \tag{5}$$

where $B$ is a block-diagonal matrix given as
$B = \mathrm{diag}(B_{11}, B_{22}, \ldots, B_{NN})$;
$B_{ii}$ is the $h_i \times h_i$ matrix
corresponding to the $i$th group with $h_i$ member
pages for $i = 1, 2, \ldots, N$; and $\epsilon$ is a
small positive parameter. Here, the non-diagonal
entries of $B_{ii}$ are the same as those in the
same diagonal block of $A$, but the diagonal entries
are chosen such that $I + B_{ii}$ becomes stochastic
and thus take nonpositive values. Thus, both $B$ and
$C$ have column sums equal to zero. A small
$\epsilon$ indicates that states can be aggregated
into $N$ groups with strong interactions within the
groups, but connections among different groups are
weak. This class of Markov chains is known as nearly
completely decomposable. In general, however, it
is difficult to uniquely determine the form (5) for
a given chain.


To exploit the sparse structure in the
computation of stationary probability
distributions, one approach is to carry out
decomposition or aggregation of the chains. The
basic approach here is (i) to compute the local
stationary distributions for $I + B_{ii}$, (ii) to
find the global stationary distribution for a chain
representing the group interactions, and (iii) to
finally use the obtained vectors to compute an
exact or approximate distribution for the entire
chain; for details, see Meyer (1989). By
interpreting such methods from the control theoretic
viewpoints, in Phillips and Kokotovic (1981) and
Aldhaheri and Khalil (1991), singular perturbation
approaches have been developed. These methods lead
us to the two-time scale decomposition of
(controlled) Markov chain recursions.

In the case of PageRank computation, sparsity
is a relevant property since it is well known
that many links in the web are intra-host ones,
connecting pages within the same domains or
directories. However, in the real web, it is easy
to find pages that have only a few outlinks, but
some of them are external ones. Such pages
will prevent the link matrix from having small
$\epsilon$ when decomposed in the form (5). Hence,
the general aggregation methods outlined above are
not directly applicable.

An aggregation-based method suitable for
PageRank computation is proposed in Ishii et al.
(2012). There, the sparsity in the web is expressed
by the limited number of external links pointing
towards pages in other groups. For each page $i$,
the node parameter $\delta_i \in [0, 1]$ is given by

$$\delta_i := \frac{\#\,\text{external outgoing links}}{\#\,\text{total outgoing links}}.$$

Note that smaller $\delta_i$ implies sparser
networks. In this approach, for a given bound
$\delta$, the condition $\delta_i \le \delta$ is
imposed only in the case page $i$ belongs to a group
consisting of multiple members. Thus, a page forming
a group by itself is not required to satisfy the
condition. This means that we can regroup the pages
by first identifying pages that violate this
condition in the initial groups and then making them
separately as single groups. By repeating these
steps, it is always possible to obtain groups for a
given web. Once the grouping is settled, an
aggregation-based algorithm can be applied, which
computes an approximated PageRank vector. A
characteristic feature is the tradeoff between the
accuracy in PageRank computation and the node
parameter $\delta$. More accurate computation
requires a larger number of groups and thus a
smaller $\delta$.
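
The node parameter and the regrouping steps described above can be sketched as follows. This is a simplified reading of the procedure, not the algorithm of Ishii et al. (2012); the graph, the initial grouping, the bound `delta = 0.4`, and the names `node_parameter` and `regroup` are all hypothetical. Pages in multi-member groups that violate the bound are split off as single-page groups, and the check is repeated because a split can turn internal links of the remaining members into external ones.

```python
def node_parameter(page, out_links, group_of):
    """delta_i = (# external outgoing links) / (# total outgoing links)."""
    targets = out_links[page]
    external = sum(1 for t in targets if group_of[t] != group_of[page])
    return external / len(targets)

def regroup(out_links, group_of, delta):
    """Split off pages violating delta_i <= delta until none remain.

    group_of maps page -> group label; pages moved into one-member
    groups receive the fresh label ("single", page).
    """
    group_of = dict(group_of)
    changed = True
    while changed:
        changed = False
        sizes = {}
        for g in group_of.values():
            sizes[g] = sizes.get(g, 0) + 1
        # Only pages in groups with several members must satisfy the bound.
        violators = [p for p in out_links
                     if sizes[group_of[p]] > 1
                     and node_parameter(p, out_links, group_of) > delta]
        for p in violators:
            group_of[p] = ("single", p)
            changed = True
    return group_of

out_links = {0: [1, 2], 1: [0, 2], 2: [0, 3], 3: [4], 4: [3]}
initial = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b"}
groups = regroup(out_links, initial, delta=0.4)
```

In this example page 2 violates the bound first; once it is split off, half of the links of pages 0 and 1 become external, so they are split off in the next pass. Group "b" is untouched. The loop terminates because every pass moves at least one page into a single-page group, and single-page groups are never checked.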


Distributed Randomized 
Computation 

For large-scale computation, distributed
algorithms can be effective by employing multiple
processors to compute in parallel. There are
several methods of constructing algorithms to
find stationary distributions of large Markov
chains. In this section, motivated by the current
literature on multi-agent systems, sequential
distributed randomized approaches of gossip
type are described for the PageRank problem.

In gossip-type distributed algorithms, nodes 
make decisions and transmit information to 
their neighbors in a random fashion. That is, 
at any time instant, each node decides whether 
to communicate or not depending on a random 
variable. The random property is important to 
make the communication asynchronous so that 
simultaneous transmissions resulting in collisions 
can be avoided. Moreover, there is no need of any 
centralized decision maker or fixed order among 
pages. 

More precisely, each page $i \in \mathcal{V}$ is
equipped with a random process
$\eta_i(k) \in \{0, 1\}$ for $k \in \mathbb{Z}_+$.
If at time $k$, $\eta_i(k)$ is equal to 1, then page
$i$ broadcasts its information to its neighboring
pages connected by outgoing links. All pages
involved at this time renew their values based on
the latest available data. Here, $\eta_i(k)$ is
assumed to be an independent and identically
distributed (i.i.d.) random process, and its
probability distribution is given by
$\mathrm{Prob}\{\eta_i(k) = 1\} = \alpha$,
$k \in \mathbb{Z}_+$. Hence, all pages are given the
same probability $\alpha$ to initiate an update.

One of the proposed randomized approaches
is based on the so-called asynchronous iteration
algorithms for distributed computation of fixed
points in the field of numerical analysis
(Bertsekas and Tsitsiklis 1989). The distributed
update recursion is given as

$$x(k + 1) = M_{\eta_1(k), \ldots, \eta_n(k)}\,x(k), \tag{6}$$

where the initial state $x(0)$ is a probability
vector and the distributed link matrices
$M_{p_1, \ldots, p_n}$ are given as follows: the
$(i, j)$th entry is equal to
$(1 - m)a_{ij} + m/n$ if $p_i = 1$; 1 if $p_i = 0$
and $i = j$; and 0 otherwise. Clearly, these
matrices keep the rows of the original link matrix
$M$ in (3) for pages initiating updates. Other pages
just keep their previous values. Thus, these
matrices are not stochastic. From this update
recursion, the PageRank $x^*$ is probabilistically
obtained (in the mean square sense and with
probability one), where the convergence speed is
exponential in time $k$. Note that in this
scheme (6), due to the way the distributed link
matrices are constructed, each page needs to know
which pages have links pointing towards it. This
implies that popular pages linked by a number of
pages must have extra memory to keep the data of
such links.
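
A rough sketch in the spirit of such randomized row updates is given below. It is an illustration under assumptions made here, not the scheme of the cited works: each selected page recomputes its own value from the affine recursion x_i = (1 - m) * sum_j a_ij x_j + m/n, which is the row-wise form of the right-hand side of the power method; the affine form is used so that the fixed point is unique without renormalization. The graph, the probability `alpha = 0.5`, the seed, and the name `async_pagerank` are hypothetical.

```python
import random

def async_pagerank(out_links, m=0.15, alpha=0.5, steps=3000, seed=0):
    """Randomized row updates x_i <- (1-m) * sum_j a_ij x_j + m/n.

    At each step, page i updates with probability alpha (eta_i(k) = 1);
    all other pages keep their previous values.
    """
    rng = random.Random(seed)
    n = len(out_links)
    # in_links[i] holds (j, a_ij) for every page j that links to page i,
    # i.e., the incoming-link data each page must store.
    in_links = [[] for _ in range(n)]
    for j, targets in enumerate(out_links):
        for i in targets:
            in_links[i].append((j, 1.0 / len(targets)))
    x = [1.0 / n] * n
    for _ in range(steps):
        eta = [rng.random() < alpha for _ in range(n)]
        old = x[:]                      # all updates use the values from time k
        for i in range(n):
            if eta[i]:
                x[i] = (1.0 - m) * sum(a * old[j] for j, a in in_links[i]) + m / n
    return x

out_links = [[1, 2], [2], [0], [0, 2]]
x = async_pagerank(out_links)
```

The intermediate vectors are generally not probability vectors, since only some rows are refreshed at each step, but the iteration settles at the fixed point of the affine recursion, whose entries sum to one. The `in_links` table makes the memory requirement mentioned above explicit.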

Another recently developed approach (Ishii and
Tempo 2010; Zhao et al. 2013) has several
notable differences from the asynchronous
iteration approach above. First, the pages need
to transmit their states only over their outgoing
links; the information on such links is by
default available locally, and thus, pages are
not required to have the extra memory regarding
incoming links. Second, it employs stochastic
matrices in the update as in the centralized
scheme; this aspect is utilized in the convergence
analysis. As a consequence, it is established
that the PageRank vector $x^*$ is computed in
a probabilistic sense through the time average
of the states $x(0), \ldots, x(k)$ given by
$y(k) := \frac{1}{k + 1} \sum_{\ell=0}^{k} x(\ell)$.
The convergence speed in this case is of order
$1/k$.

PageRank Optimization 
via Hyperlink Designs 

For owners of websites, it is of particular interest
to raise the PageRank values of their web pages.
Especially in the area of e-business, this can
be critical for increasing the number of visitors
to their sites. The values of PageRank can be
affected by changing the structure of hyperlinks
in the owned pages. Based on the random surfer
model, intuitively, it makes sense to arrange the
links so that surfers will stay within the domain
of the owners as long as possible.

PageRank optimization problems have been
considered rigorously in, for example, de Kerchove
et al. (2008) and Fercoq et al. (2013).
In general, these are combinatorial optimization
problems since they deal with the issues on where
to place hyperlinks, and thus the computation for
solving them can be prohibitive especially when
the web data is large. However, the work Fercoq
et al. (2013) has shown that the problem can
be solved in polynomial time. In what follows,
we discuss a simplified discrete version of the
problem setup of this work.

Consider a subset
$\mathcal{V}_0 \subset \mathcal{V}$ of web pages
over which a webmaster has control. The objective is
to maximize the total PageRank of the pages in
this set $\mathcal{V}_0$ by finding the outgoing
links from these pages. Each page
$i \in \mathcal{V}_0$ may have constraints such as
links that must be placed within the page and those
that cannot be allowed. All other links, i.e., those
that one can decide to have or not, are the design
parameters. Hence, the PageRank optimization problem
can be stated as

$$\max\{U(x^*, M) : x^* = Mx^*,\ x^* \in [0, 1]^n,\ \mathbf{1}^T x^* = 1,\ M \in \mathcal{M}\},$$

where $U$ is the utility function
$U(x^*, M) := \sum_{i \in \mathcal{V}_0} x_i^*$ and
$\mathcal{M}$ represents the set of admissible
link matrices in accordance with the constraints
introduced above.
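
For a very small web, the discrete problem can be explored by brute force. The sketch below is only an illustration: it enumerates every admissible link set of one owned page and is exponential in the number of optional links, so it is not the polynomial-time method of Fercoq et al. (2013). The graph, the obligatory and optional links, and the names `pagerank` and `best_links` are hypothetical.

```python
from itertools import combinations

def pagerank(out_links, m=0.15, iters=300):
    """Power method with teleportation on adjacency lists (fixed iteration count)."""
    n = len(out_links)
    x = [1.0 / n] * n
    for _ in range(iters):
        new = [m / n] * n
        for j, targets in enumerate(out_links):
            for i in targets:
                new[i] += (1.0 - m) * x[j] / len(targets)
        x = new
    return x

def best_links(fixed, page, obligatory, optional, owned):
    """Try every admissible link set for `page`; return the best and all results.

    Each result is a pair (utility, links), where utility is the total
    PageRank of the owned pages.
    """
    results = []
    for r in range(len(optional) + 1):
        for extra in combinations(optional, r):
            links = sorted(obligatory + list(extra))
            out_links = [links if j == page else fixed[j]
                         for j in range(len(fixed))]
            x = pagerank(out_links)
            results.append((sum(x[i] for i in owned), links))
    return max(results), results

# Pages 0 and 1 are owned; only the outgoing links of page 0 are designed.
# fixed[0] is a placeholder because page 0's links are the design variable.
fixed = [None, [0], [0, 3], [2]]
best, all_results = best_links(fixed, page=0, obligatory=[1],
                               optional=[2, 3], owned=[0, 1])
best_utility, best_choice = best
```

In this example the best choice keeps only the obligatory internal link, which matches the intuition that surfers should stay within the owner's domain as long as possible: any extra link from page 0 to an external page sends additional probability mass to pages 2 and 3.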

In Fercoq et al. (2013), an extended continuous
problem is also studied where the set $\mathcal{M}$ of
link matrices is a polytope of stochastic matrices
and a more general utility function is employed.
The motivation for such a problem comes from
having weighted links so that webmasters can
determine which links should be placed in a more
visible location inside their pages to increase
clicks on those hyperlinks. Both discrete and
continuous problems are shown to be solvable
in polynomial time by modeling them as constrained
Markov decision processes with ergodic
rewards (see, e.g., Puterman 1994).
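
To make the discrete problem concrete, the following is a minimal sketch, not the polynomial-time method of Fercoq et al. (2013): it simply enumerates every on/off choice of the optional links on a tiny graph and keeps the one that maximizes the total PageRank of the controlled pages. The teleportation form $M = (1-m)A + (m/n)\mathbf{1}\mathbf{1}^T$ with $m = 0.15$, the uniform treatment of dangling pages, the four-page example graph, and the names `pagerank` and `best_links` are all assumptions made for illustration.

```python
from itertools import product


def pagerank(out_links, n, m=0.15, iters=200):
    """Power iteration for x* = M x*, assuming M = (1 - m) A + (m / n) 1 1^T."""
    x = [1.0 / n] * n
    for _ in range(iters):
        new = [m / n] * n  # teleportation term (x always sums to one)
        for j in range(n):
            # Assumption: a page without outgoing links spreads its value uniformly.
            targets = out_links[j] or range(n)
            share = (1.0 - m) * x[j] / len(targets)
            for i in targets:
                new[i] += share
        x = new
    return x


def best_links(required, optional, controlled, n):
    """Brute-force search over the optional links (exponential in their number)."""
    best_value, best_choice = -1.0, None
    for mask in product([False, True], repeat=len(optional)):
        chosen = [link for keep, link in zip(mask, optional) if keep]
        out_links = [
            sorted(set(required[j]) | {i for (src, i) in chosen if src == j})
            for j in range(n)
        ]
        x = pagerank(out_links, n)
        value = sum(x[i] for i in controlled)  # utility U(x*, M)
        if value > best_value:
            best_value, best_choice = value, chosen
    return best_value, best_choice


# Hypothetical example: pages 0 and 1 are controlled; link (j, i) goes from j to i.
required = [[1], [], [0], [2]]
optional = [(0, 2), (1, 0), (1, 3)]
value, links = best_links(required, optional, controlled=[0, 1], n=4)
print(links, round(value, 3))
```

In this toy example the search keeps only the link from page 1 back to page 0, which agrees with the random-surfer intuition that links should keep surfers inside the owned pages. The enumeration grows as $2^k$ in the number $k$ of optional links, which is exactly why the polynomial-time results cited above matter for realistic web data.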

Summary and Future Directions 

Markov chains form one of the simplest classes
of stochastic processes but have been found powerful
in their capability to model large-scale complex
systems. In this entry, we introduced them
mainly from the viewpoint of PageRank algorithms
in the area of search engines and with a
particular emphasis on recent works carried out
based on control theoretic tools. Computational
issues will remain in this area as major challenges,
and further studies will be needed. As we
have observed in PageRank-related problems, it
is important to pay careful attention to structures
of the particular problems.

Cross-References 

► Randomized Methods for Control of Uncertain 
Systems 
Abstract

Marine intervention requires the use of manipulators
mounted on support vehicles. Such systems,
defined as vehicle-manipulator systems,
exhibit specific mathematical properties and require
proper control design methodologies. This
article briefly discusses the mathematical model
within a control perspective as well as sensing
and actuation peculiarities.

Keywords 

Floating-base manipulators; Marine robotics; 
Underwater intervention; Underwater robotics 

Introduction 

In marine operations that require interaction
with the environment, an underwater
vehicle is usually equipped with one or more
manipulators; such systems are called underwater
vehicle-manipulator systems (UVMSs). A
UVMS holding six degree-of-freedom (DOF)
manipulators is illustrated in Fig. 1. The vehicle
carrying the manipulator may or may not be
connected to the surface; in the first case we have
a so-called remotely operated vehicle (ROV),
in the second an autonomous underwater
vehicle (AUV). ROVs, being physically linked
via the tether to an operator who can be on a
submarine or on a surface ship, receive power as
well as control commands. AUVs, on the other
hand, are supposed to be completely autonomous,
thus relying on onboard power systems and
intelligence.

Remotely controlled UVMSs represent the
state of the art in underwater manipulation, while
autonomous or semiautonomous UVMSs are still
in their embryonic stage. Around the world, only a few
experimental setups have been developed within
dedicated projects; see, e.g., the European
project Trident (2012).


Sensory System 

Any manipulation task requires that some variables
be measured; these may concern the internal
state of the system, such as the position and
orientation of the end effector and of the vehicle,
or the velocities. Others concern the surrounding
environment, as is the case for vision systems
or range measurements. Underwater sensing is
characterized by poorer performance than its
ground counterpart because of the physical
properties of water as the medium
carrying the electromagnetic or acoustic signals.

One of the major challenges in underwater
robotics is localization, owing to the absence
of a single proprioceptive sensor that measures
the vehicle position and the impossibility of using
the Global Navigation Satellite System (GNSS)
under water. The use of redundant multisensor
systems is thus common in order to perform sensor
fusion and give fault detection and tolerance
capabilities to the vehicle.
Mathematical Models of Marine Vehicle-Manipulator Systems, Fig. 1 Sketch of a UVMS; the inertial frame as
well as the frames attached to all the rigid bodies are highlighted


Localization 

A possible approach for AUV localization is to 
rely on inertial navigation systems (INSs); those 
are algorithms that implement dead reckoning 
techniques, i.e., the estimation of the position by 
properly merging and integrating measurements 
obtained with inertial and velocity sensors. Dead 
reckoning suffers from numerical drift due to the 
integration of sensor noise, as well as sensor bias 
and drift, and may be prone to the presence of 
external currents and model uncertainties. Since 
the variance of the estimation error grows with 
the distance traveled, this technique is only used 
for short dives. 

Several algorithms are based on the concept 
of trilateration. The vehicle measures its distance 
with respect to known positions and properly uses 
this information by applying geometric-based 
formulas. Under water, the technology for
trilateration is not based on electromagnetic
signals, which are strongly attenuated, but
on acoustics.
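
As an illustration of the geometric step, the sketch below solves the range equations $\|x - p_k\| = d_k$ by subtracting the first equation from the others, which gives the linear system $2(p_k - p_0)^T x = \|p_k\|^2 - \|p_0\|^2 - d_k^2 + d_0^2$, and then solving it in the least-squares sense. This is one common textbook formulation, not necessarily what commercial acoustic systems implement; the function names, the beacon layout, and the noise-free ranges are assumptions.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def trilaterate(beacons, ranges):
    """Least-squares position from ranges to beacons at known positions."""
    p0, d0 = beacons[0], ranges[0]
    dim = len(p0)
    rows, rhs = [], []
    for p, d in zip(beacons[1:], ranges[1:]):
        rows.append([2.0 * (p[k] - p0[k]) for k in range(dim)])
        rhs.append(sum(c * c for c in p) - sum(c * c for c in p0) - d * d + d0 * d0)
    # Normal equations (A^T A) x = A^T b of the linearized problem.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(dim)] for i in range(dim)]
    atb = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(dim)]
    return solve_linear(ata, atb)


# Hypothetical transponder positions (metres) and a vehicle position to recover.
beacons = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 100.0, 0.0),
           (0.0, 0.0, -50.0), (100.0, 100.0, -50.0)]
truth = (30.0, 40.0, -20.0)
ranges = [math.dist(truth, b) for b in beacons]
estimate = trilaterate(beacons, ranges)
```

With exact ranges the estimate reproduces the true position up to rounding; with noisy acoustic ranges the same least-squares step returns a compromise solution, and the beacons must not all lie in one plane if all three coordinates are to be recovered.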


Among the commercially available solutions, 
long, short, and ultrashort baseline systems have 
found widespread use. The differences are in 
the baseline wavelength, the required distance 
among the transponders, the accuracy, and the 
installation cost. Acoustic underwater positioning 
is commercially mature, and several companies 
offer a variety of products. 

In case of intervention, when the UVMS is
close to the target, rather than the absolute position
with respect to an inertial frame, it is crucial
to estimate the relative position with respect to
the target itself. In such a case, vision-based
systems may be considered.

Actuation 

Underwater vehicles are usually controlled 
by thrusters and/or control surfaces. Control 
surfaces, such as rudders and sterns, are typically 
used in vehicles working at cruise speed, 
i.e., torpedo-shaped vehicles usually used in 
monitoring or cable/pipeline inspection. In such
vehicles a main thruster is used together with at 
least one rudder and one stern. 

This configuration is unsuitable for UVMSs
since the force/moment provided by the control
surfaces is a function of the velocity and is
null in hovering, which is when manipulation is
typically performed.

The relationship between the force/moment
acting on the vehicle and the control input of the
thrusters is highly nonlinear. It is a function
of structural variables such as the density of the
water, the tunnel cross-sectional area, the tunnel
length, the volumetric flow rate between input
and output of the thrusters, and the propeller
diameter. The state of the dynamic system describing
the thrusters is constituted by the propeller
revolution, the speed of the fluid going into the
propeller, and the input torque.


Modeling 

UVMSs can be modeled as rigid bodies connected
to form a serial chain; the vehicle is the
floating base, while each link of the manipulator
represents an additional rigid body with one DOF,
typically the rotation around the corresponding
joint’s axis. Roughly speaking, modeling of a
UVMS is the effort to represent the physical
relationships of those bodies in order to measure
and control its end effector, typically involved in
a manipulation task.

The first step of modeling is the so-called direct
kinematics, which consists of computing the
position/orientation of the end effector with respect
to an inertial, i.e., world-fixed, frame. This is done
via geometric relationships that are functions of the
system kinematic parameters, typically the lengths of
the links, and the current system configuration,
i.e., the vehicle position/orientation and the joint
positions.
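
A deliberately reduced sketch of direct kinematics is given here for a planar case: a vehicle with pose $(x, y, \psi)$ carrying a serial arm with revolute joints. The planar restriction, the function name `end_effector_pose`, and the optional arm mounting offset are assumptions chosen to keep the geometry readable; a real UVMS needs the full three-dimensional chain of homogeneous transformations.

```python
import math


def end_effector_pose(vehicle_pose, joint_angles, link_lengths, base_offset=(0.0, 0.0)):
    """Planar direct kinematics of a floating base carrying a serial arm.

    vehicle_pose = (x, y, psi) in the inertial frame; joint angles are relative
    rotations; base_offset is the arm mounting point in the vehicle frame.
    """
    x, y, psi = vehicle_pose
    # Arm tip expressed in the vehicle (body-fixed) frame.
    px, py = base_offset
    angle = 0.0
    for q, length in zip(joint_angles, link_lengths):
        angle += q
        px += length * math.cos(angle)
        py += length * math.sin(angle)
    # Rotate into the inertial frame and add the vehicle position.
    c, s = math.cos(psi), math.sin(psi)
    return (x + c * px - s * py, y + s * px + c * py, psi + angle)


# Vehicle at (1, 2) heading along the inertial y axis, arm fully stretched.
pose = end_effector_pose((1.0, 2.0, math.pi / 2), (0.0, 0.0), (1.0, 1.0))
```

The result depends on the kinematic parameters (link lengths, mounting offset) and on the configuration (vehicle pose and joint positions), which is exactly the split described in the paragraph above.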

Velocities of each rigid body affect the following
rigid bodies and thus the end effector. For
example, a vehicle roll movement or the joint
velocity is projected into a linear and angular
end-effector velocity. This velocity transformation
is studied by the differential kinematics.


Analytic and/or geometric approaches may be
used to retrieve those relationships. The study
of the velocity-related equations is fundamental
to understand how to balance the movement between
vehicle and manipulator and, within the
manipulator, how to distribute it among the joints.
This topic is closely related to differential, and
inverse, kinematics for industrial robots.

The extension of Newton’s second law to
UVMSs leads to a number of nonlinear differential
equations that link together the system's
generalized forces and accelerations. Generalized
forces here means the forces and moments
acting on the vehicle
and the joint torques. Correspondingly, one is
interested in the vehicle linear and angular
accelerations and joint accelerations. Those equations
couple together all the DOFs of the structure, e.g.,
a force applied to the vehicle also causes acceleration
of the joints. Study of the dynamics is crucial
to design the controller.

It is not possible to neglect that the bodies are
moving in the water. The theory of fluid dynamics
is rather complex, and it is difficult to develop
a simple model for most of the hydrodynamic
effects. A rigorous analysis for incompressible
fluids would need to resort to the Navier-Stokes
equations (distributed fluid flow). However, most
of the hydrodynamic effects have no significant
influence in the range of the operative velocities
for UVMS intervention tasks. In particular, it
is necessary to model added masses, linear and
quadratic damping terms, and the buoyancy.


Control 

Not surprisingly, the mathematical model of a
UVMS shares most of the characteristics
of the models of industrial robots and space robots.
Once the physical differences are taken into
account, the control problems are
also similar:

• Kinematic control. The control problem is
given in terms of motion of the end effector
and needs to be transformed into the motion
of the vehicle and the manipulator. This is
often approached by resorting to inverse
differential kinematic algorithms. In particular,
algorithms for redundant systems need to
be considered since a UVMS always possesses
at least six DOFs. Moving a UVMS requires
handling additional variables besides
the end effector, such as the vehicle roll and
pitch to preserve energy, the robot manipulability,
the mechanical joint limits, or possible
directional sensing.

• Motion control. Low-level control algorithms
are designed to allow the system to track the
desired trajectory. UVMSs are characterized
by different dynamics between vehicle and
manipulator, uncertainty in the model parameter
knowledge, poor sensing performance, and
limit cycles in the thruster model. On the other
hand, the limited bandwidth of the closed-loop
system allows the use of simple control
approaches.

• Interaction control. Several applications require
exchange of forces with the environment.
A pure motion control algorithm is
not devised for such operation, and specific
force control algorithms, both direct and indirect,
may be necessary. Master/slave systems
or haptic devices may be used for this purpose,
while autonomous interaction control
is still in the research phase for the marine
environment.


Summary and Future Directions 

This article is aimed at giving a short overview
of the main mathematical and technological challenges
arising with UVMSs. All the components
of an underwater mission, perception, actuation,
and communication with the surface, are characterized
by poorer performance than
current industrial or advanced robotics
applications.

The underwater environment is hostile; for
example, the marine current provides disturbances
to be counteracted by the dynamic controller, and
swirling sand obscures vision-based
operations close to the sea bottom. Both teleoperated
and autonomous underwater missions
require a significant human effort in planning,
testing, and monitoring all the operations. Fault
detection and recovery policies are necessary in
each step to avoid loss of expensive hardware.

Future generations of UVMSs need to be autonomous,
to perceive and contextualize the environment,
to react to unplanned
situations, and to safely reschedule the tasks
of complex missions. Those characteristics are
shared by all the branches of service
robotics.

Cross-References 

► Advanced Manipulation for Underwater Sampling

► Mathematical Models of Ships and Underwater 
Vehicles 

► Redundant Robots 

Recommended Reading 

The book of Fossen (1994) is one of the first 
books dedicated to control problems of marine 
systems, both underwater and surface. The same 
author presents, in Fossen (2002), an updated 
and extended version of the topics developed in 
the first book and in Fossen (2011), a handbook 
on marine craft hydrodynamics and control. A 
short introductory chapter to marine robotics may 
be found in Antonelli et al. (2008). Robotics 
fundamentals are also useful and can be found 
in Siciliano et al. (2009). To the best of our
knowledge, Antonelli (2014) is the only monograph
devoted to addressing the specific problems
of underwater vehicle-manipulator systems.
Abstract

This entry describes the equations of motion 
of ships and underwater vehicles. Standard 
hydrodynamic models in the literature are 
reviewed and presented using the nonlinear 
robot-like vectorial notation of Fossen (1991, 
1994, 2011). The matrix-vector notation is highly 
advantageous when designing control systems 
since well-known system properties such as 
symmetry, skew-symmetry, and positiveness can 
be exploited in the design. 
Introduction

The subject of this entry is mathematical modeling
of ships and underwater vehicles. By ship
we mean “any large floating vessel capable of
crossing open waters,” as opposed to a boat,
which is generally a smaller craft. An underwater
vehicle is a “small vehicle that is capable of
propelling itself beneath the water surface as well
as on the water’s surface.” This includes unmanned
underwater vehicles (UUVs), remotely
operated vehicles (ROVs), autonomous underwater
vehicles (AUVs) and underwater robotic
vehicles (URVs).

This entry is based on Fossen (1991, 2011),
which contain a large number of standard
models for ships, rigs, and underwater vehicles.
There exist a large number of textbooks on
mathematical modeling of ships; see Rawson
and Tupper (1994), Lewandowski (2004), and
Perez (2005). For underwater vehicles, see
Allmendinger (1990), Sutton and Roberts (2006),
Inzartsev (2009), Antonelli (2010), and Wadoo
and Kachroo (2010). Some useful references
on ship hydrodynamics are Newman (1977),
Faltinsen (1990), and Bertram (2012).

Degrees of Freedom 

A mathematical model of marine craft is usually
represented by a set of ordinary differential
equations (ODEs) describing the motions in six
degrees of freedom (DOF): surge, sway, heave,
roll, pitch, and yaw.

Hydrodynamics 

In hydrodynamics it is common to distinguish 
between two theories: 

• Seakeeping theory: The motions of ships
at zero or constant speed in waves are analyzed
using hydrodynamic coefficients and
wave forces, which depend on the wave excitation
frequency and thus on the hull geometry
and mass distribution. For underwater vehicles
operating below the wave-affected zone, the
wave excitation frequency will not influence
the hydrodynamic coefficients.

• Maneuvering theory: The ship is moving in
restricted calm water - that is, in sheltered
waters or in a harbor. Hence, the maneuvering
model is derived for a ship moving at positive
speed under a zero-frequency wave excitation
assumption such that added mass and damping
can be represented by constant parameters.

Seakeeping models are typically used for
ocean structures and dynamically positioned
vessels. Several hundred ODEs are needed to
effectively represent a seakeeping model; see
Fossen (2011) and Perez and Fossen (2011a, b).

The remainder of this entry assumes maneuvering
theory, since this gives lower-order models
typically suited for controller-observer design.
Six ODEs are needed to describe the kinematics,
that is, the geometrical aspects of motion,
while the Newton-Euler equations represent an
additional six ODEs describing the forces and
moments causing the motion (kinetics).

Notation 

The equations of motion are usually represented 
using generalized position, velocity and forces 
(Fossen 1991, 1994, 2011) defined by the state 
vectors: 


$$\eta := [x, y, z, \phi, \theta, \psi]^T \tag{1}$$

$$\nu := [u, v, w, p, q, r]^T \tag{2}$$

$$\tau := [X, Y, Z, K, M, N]^T \tag{3}$$

where $\eta$ is the generalized position expressed in
the north-east-down (NED) reference frame $\{n\}$.
A body-fixed reference frame $\{b\}$ with axes:

$x_b$ - longitudinal axis (from aft to fore)
$y_b$ - transversal axis (to starboard)
$z_b$ - normal axis (directed downward)

is rotating about the NED reference frame $\{n\}$
with angular velocity $\omega = [p, q, r]^T$. The generalized
velocity vector $\nu$ and forces $\tau$ are both
expressed in $\{b\}$, and the 6-DOF states are defined
according to SNAME (1950):

• Surge position $x$, linear velocity $u$, force $X$

• Sway position $y$, linear velocity $v$, force $Y$

• Heave position $z$, linear velocity $w$, force $Z$

• Roll angle $\phi$, angular velocity $p$, moment $K$

• Pitch angle $\theta$, angular velocity $q$, moment $M$

• Yaw angle $\psi$, angular velocity $r$, moment $N$

Kinematics 

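The generalized velocities $\dot{\eta}$ in $\{n\}$ and $\nu$ in
$\{b\}$ satisfy the following kinematic
transformation (Fossen 1994, 2011):

$$\dot{\eta} = J(\eta)\nu \tag{4}$$

$$J(\eta) := \begin{bmatrix} R(\Theta) & 0_{3\times 3} \\ 0_{3\times 3} & T(\Theta) \end{bmatrix} \tag{5}$$

where $\Theta = [\phi, \theta, \psi]^T$ is the vector of Euler angles and

$$R(\Theta) = \begin{bmatrix}
c\psi c\theta & -s\psi c\phi + c\psi s\theta s\phi & s\psi s\phi + c\psi c\phi s\theta \\
s\psi c\theta & c\psi c\phi + s\phi s\theta s\psi & -c\psi s\phi + s\theta s\psi c\phi \\
-s\theta & c\theta s\phi & c\theta c\phi
\end{bmatrix} \tag{6}$$

with $s\cdot = \sin(\cdot)$, $c\cdot = \cos(\cdot)$ and $t\cdot = \tan(\cdot)$.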

The matrix $R$ is recognized as the Euler angle
rotation matrix $R \in SO(3)$ satisfying $RR^T =
R^TR = I$ and $\det(R) = 1$, which implies that $R$
is orthogonal. Consequently, the inverse rotation
matrix is given by $R^{-1} = R^T$. The Euler rates
$\dot{\Theta} = T(\Theta)\omega$ are singular for $\theta = \pm\pi/2$ since:

$$T(\Theta) = \begin{bmatrix}
1 & s\phi\, t\theta & c\phi\, t\theta \\
0 & c\phi & -s\phi \\
0 & s\phi/c\theta & c\phi/c\theta
\end{bmatrix}, \quad \theta \neq \pm\frac{\pi}{2} \tag{7}$$

Singularities can be avoided by using unit quaternions
instead (Fossen 1994, 2011).
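
The kinematic transformation (4) can be sketched in a few lines. The function names below are illustrative and are not taken from the MSS library or any other package, and the singularity threshold `1e-9` is an arbitrary assumption.

```python
import math


def rotation_matrix(phi, theta, psi):
    """Euler angle rotation matrix R(Theta) from {b} to {n}, as in (6)."""
    cphi, sphi = math.cos(phi), math.sin(phi)
    cth, sth = math.cos(theta), math.sin(theta)
    cpsi, spsi = math.cos(psi), math.sin(psi)
    return [
        [cpsi * cth, -spsi * cphi + cpsi * sth * sphi, spsi * sphi + cpsi * cphi * sth],
        [spsi * cth, cpsi * cphi + sphi * sth * spsi, -cpsi * sphi + sth * spsi * cphi],
        [-sth, cth * sphi, cth * cphi],
    ]


def euler_rate_matrix(phi, theta):
    """Transformation T(Theta) as in (7); undefined at theta = +/- pi/2."""
    cth = math.cos(theta)
    if abs(cth) < 1e-9:  # assumed numerical threshold for the singularity
        raise ValueError("Euler angle singularity: theta = +/- pi/2")
    cphi, sphi, tth = math.cos(phi), math.sin(phi), math.tan(theta)
    return [
        [1.0, sphi * tth, cphi * tth],
        [0.0, cphi, -sphi],
        [0.0, sphi / cth, cphi / cth],
    ]


def eta_dot(eta, nu):
    """Right-hand side of (4): eta_dot = J(eta) nu with block-diagonal J."""
    phi, theta, psi = eta[3:6]
    R = rotation_matrix(phi, theta, psi)
    T = euler_rate_matrix(phi, theta)
    linear = [sum(R[i][k] * nu[k] for k in range(3)) for i in range(3)]
    angular = [sum(T[i][k] * nu[3 + k] for k in range(3)) for i in range(3)]
    return linear + angular
```

For a vehicle heading east ($\psi = \pi/2$) with pure surge velocity, `eta_dot` returns a velocity along the east axis of $\{n\}$, as expected. A quaternion-based variant would avoid the exception raised at $\theta = \pm\pi/2$.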


Kinetics 

The rigid-body kinetics can be derived using the 
Newton-Euler formulation, which is based on 
Newton's second law. Following Fossen (1994, 
2011) this gives: 


$$M_{RB}\dot{\nu} + C_{RB}(\nu)\nu = \tau_{RB} \tag{8}$$

where $M_{RB}$ is the rigid-body mass matrix and $C_{RB}$
is the rigid-body Coriolis and centripetal matrix
due to the rotation of $\{b\}$ about the geographical
frame $\{n\}$. The generalized force vector $\tau_{RB}$
represents external forces and moments expressed in
$\{b\}$. In the nonlinear case:

$$\tau_{RB} = -M_A\dot{\nu} - C_A(\nu)\nu - D(\nu)\nu - g(\eta) + \tau \tag{9}$$

where the matrices $M_A$ and $C_A(\nu)$ represent
hydrodynamic added mass due to acceleration
$\dot{\nu}$ and Coriolis acceleration due to the rotation
of $\{b\}$ about the geographical frame $\{n\}$. The
potential and viscous damping terms are lumped
together into a nonlinear matrix $D(\nu)$ while
$g(\eta)$ is a vector of generalized restoring forces.
The control inputs are generalized forces given
by $\tau$.

Formulae (8) and (9) together with (4) are 
the fundamental equations when deriving the ship 
and underwater vehicle models. This is the topic 
for the next sections. 


Ship Model 

The ship equations of motion are usually 
represented in three DOFs by neglecting heave, 
roll and pitch. Combining (4), (8), and (9) 
we get: 


$$\dot{\eta} = R(\psi)\nu \tag{10}$$

$$M\dot{\nu} + C(\nu)\nu + D(\nu)\nu = \tau + \tau_{wind} + \tau_{wave} \tag{11}$$

where $\eta := [x, y, \psi]^T$, $\nu := [u, v, r]^T$ and

$$R(\psi) = \begin{bmatrix} c\psi & -s\psi & 0 \\ s\psi & c\psi & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{12}$$

is the rotation matrix in yaw. It is assumed that
wind and wave-induced forces $\tau_{wind}$ and $\tau_{wave}$
can be linearly superimposed. The system matrices
$M = M_{RB} + M_A$ and $C(\nu) = C_{RB}(\nu) +
C_A(\nu)$ are usually derived under the assumption
of port-starboard symmetry and that surge can
be decoupled from sway and yaw (Fossen 2011).
Moreover,

$$M = \begin{bmatrix}
m - X_{\dot{u}} & 0 & 0 \\
0 & m - Y_{\dot{v}} & m x_g - Y_{\dot{r}} \\
0 & m x_g - N_{\dot{v}} & I_z - N_{\dot{r}}
\end{bmatrix} \tag{13}$$

$$C_{RB}(\nu) = \begin{bmatrix}
0 & -mr & -m x_g r \\
mr & 0 & 0 \\
m x_g r & 0 & 0
\end{bmatrix} \tag{14}$$

$$C_A(\nu) = \begin{bmatrix}
0 & 0 & Y_{\dot{v}} v + Y_{\dot{r}} r \\
0 & 0 & -X_{\dot{u}} u \\
-Y_{\dot{v}} v - Y_{\dot{r}} r & X_{\dot{u}} u & 0
\end{bmatrix} \tag{15}$$

Hydrodynamic damping will, in its simplest
form, be linear:

$$D = \begin{bmatrix}
-X_u & 0 & 0 \\
0 & -Y_v & -Y_r \\
0 & -N_v & -N_r
\end{bmatrix} \tag{16}$$

while a nonlinear expression based on second-order
modulus functions describing quadratic
drag and cross-flow drag is:

$$D(\nu) = \begin{bmatrix}
-X_{|u|u}|u| & 0 & 0 \\
0 & -Y_{|v|v}|v| - Y_{|r|v}|r| & -Y_{|v|r}|v| - Y_{|r|r}|r| \\
0 & -N_{|v|v}|v| - N_{|r|v}|r| & -N_{|v|r}|v| - N_{|r|r}|r|
\end{bmatrix} \tag{17}$$

Other nonlinear representations are found in Fossen
(1994, 2011).

In the case of irrotational ocean currents, we
introduce the relative velocity vector:

$$\nu_r = \nu - \nu_c$$

where $\nu_c = [u_c^b, v_c^b, 0]^T$ is a vector of current
velocities in $\{b\}$. Hence, the kinetic model takes
the form:

$$\underbrace{M_{RB}\dot{\nu} + C_{RB}(\nu)\nu}_{\text{rigid-body forces}}
+ \underbrace{M_A\dot{\nu}_r + C_A(\nu_r)\nu_r + D(\nu_r)\nu_r}_{\text{hydrodynamic forces}}
= \tau + \tau_{wind} + \tau_{wave}$$
This model can be simplified if the ocean currents
are assumed to be constant and irrotational
in $\{n\}$. According to Fossen (2011, Property 8.1),
$M_{RB}\dot{\nu} + C_{RB}(\nu)\nu = M_{RB}\dot{\nu}_r + C_{RB}(\nu_r)\nu_r$ if
the rigid-body Coriolis and centripetal matrix satisfies
$C_{RB}(\nu_r) = C_{RB}(\nu)$. One parametrization
satisfying this is (14). Hence, the Coriolis and
centripetal matrix satisfies $C(\nu_r) = C_{RB}(\nu_r) +
C_A(\nu_r)$ and it follows that:

$$M\dot{\nu}_r + C(\nu_r)\nu_r + D(\nu_r)\nu_r = \tau + \tau_{wind} + \tau_{wave} \tag{18}$$

The kinematic equation (10) can be modified
to include the relative velocity $\nu_r$ according
to:

$$\dot{\eta} = R(\psi)\nu_r + [u_c^n, v_c^n, 0]^T \tag{19}$$

where the ocean current velocities $u_c^n$ = constant
and $v_c^n$ = constant in $\{n\}$. Notice that the body-fixed
velocities $\nu_c = R^T(\psi)[u_c^n, v_c^n, 0]^T$ will
vary with the heading angle $\psi$.

The maneuvering model presented in this 
entry is intended for controller-observer design, 
prediction, and computer simulations, as 
well as system identification and parameter 
estimation. A large number of application- 
specific models for marine craft are found in 
Fossen (2011, Chap. 7). 

Hydrodynamic programs compute mass, inertia,
potential damping and restoring forces, while
a more detailed treatment of viscous dissipative
forces (damping) and sea loads is found in
the extensive literature on hydrodynamics - see
Faltinsen (1990) and Newman (1977).
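
For readers who want to experiment without MATLAB, the following is a minimal sketch of the 3-DOF maneuvering model (10)-(11) with the matrices (13)-(16), integrated with forward Euler and without wind or wave forces. All numerical coefficients are hypothetical and chosen only so that the sketch runs; they do not describe any real vessel, and the function names are illustrative.

```python
import math

# Hypothetical coefficients for illustration only ("d" marks a dotted derivative).
P = {"m": 1000.0, "Iz": 5000.0, "xg": 0.5,
     "Xud": -100.0, "Yvd": -300.0, "Yrd": -20.0, "Nvd": -20.0, "Nrd": -800.0,
     "Xu": -200.0, "Yv": -500.0, "Yr": -50.0, "Nv": -50.0, "Nr": -1000.0}


def mass_matrix(p):
    """M = M_RB + M_A as in (13)."""
    return [[p["m"] - p["Xud"], 0.0, 0.0],
            [0.0, p["m"] - p["Yvd"], p["m"] * p["xg"] - p["Yrd"]],
            [0.0, p["m"] * p["xg"] - p["Nvd"], p["Iz"] - p["Nrd"]]]


def coriolis(nu, p):
    """C(nu) = C_RB(nu) + C_A(nu) as in (14) and (15)."""
    u, v, r = nu
    m, xg = p["m"], p["xg"]
    a = p["Yvd"] * v + p["Yrd"] * r
    b = p["Xud"] * u
    crb = [[0.0, -m * r, -m * xg * r], [m * r, 0.0, 0.0], [m * xg * r, 0.0, 0.0]]
    ca = [[0.0, 0.0, a], [0.0, 0.0, -b], [-a, b, 0.0]]
    return [[crb[i][j] + ca[i][j] for j in range(3)] for i in range(3)]


def damping(p):
    """Linear damping D as in (16)."""
    return [[-p["Xu"], 0.0, 0.0], [0.0, -p["Yv"], -p["Yr"]], [0.0, -p["Nv"], -p["Nr"]]]


def matvec(a, x):
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def step(eta, nu, tau, p, dt):
    """One forward-Euler step of (10) and (11)."""
    u, v, r = nu
    c, s = math.cos(eta[2]), math.sin(eta[2])
    eta_dot = [c * u - s * v, s * u + c * v, r]
    cv, dv = matvec(coriolis(nu, p), nu), matvec(damping(p), nu)
    nu_dot = solve(mass_matrix(p), [tau[i] - cv[i] - dv[i] for i in range(3)])
    return ([e + dt * d for e, d in zip(eta, eta_dot)],
            [n + dt * d for n, d in zip(nu, nu_dot)])


def simulate(tau, t_end=100.0, dt=0.05, p=P):
    eta, nu = [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]
    for _ in range(int(round(t_end / dt))):
        eta, nu = step(eta, nu, tau, p, dt)
    return eta, nu
```

With a constant surge force and the linear damping (16), the surge speed settles at $\tau_X / (-X_u)$ while sway and yaw stay at rest, which reflects the surge decoupling assumed in the text. Forward Euler is used only for brevity; a higher-order integrator would be the usual choice for serious simulation.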


Underwater Vehicle Model 

The 6-DOF underwater vehicle equations of 
motion follow from (4), (8), and (9) under the 
assumption that wave-induced motions can be 
neglected: 

$$\dot{\eta} = J(\eta)\nu \tag{20}$$

$$M\dot{\nu} + C(\nu)\nu + D(\nu)\nu + g(\eta) = \tau \tag{21}$$

with generalized position $\eta := [x, y, z, \phi, \theta, \psi]^T$
and velocity $\nu := [u, v, w, p, q, r]^T$. Assume that
the gravitational force acts through the center
of gravity (CG) defined by the vector $r_g :=
[x_g, y_g, z_g]^T$ with respect to the coordinate origin
$\{b\}$. Similarly, the buoyancy force acts through
the center of buoyancy (CB) defined by the vector
$r_b := [x_b, y_b, z_b]^T$. For most vehicles $y_g = y_b =
0$.

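For a port-starboard symmetrical vehicle with
homogeneous mass distribution, CG satisfying
$y_g = 0$ and products of inertia $I_{xy} = I_{yz} = 0$,
the system inertia matrix becomes:

$$M := \begin{bmatrix}
m - X_{\dot{u}} & 0 & -X_{\dot{w}} & 0 & m z_g - X_{\dot{q}} & 0 \\
0 & m - Y_{\dot{v}} & 0 & -m z_g - Y_{\dot{p}} & 0 & m x_g - Y_{\dot{r}} \\
-X_{\dot{w}} & 0 & m - Z_{\dot{w}} & 0 & -m x_g - Z_{\dot{q}} & 0 \\
0 & -m z_g - Y_{\dot{p}} & 0 & I_x - K_{\dot{p}} & 0 & -I_{zx} - K_{\dot{r}} \\
m z_g - X_{\dot{q}} & 0 & -m x_g - Z_{\dot{q}} & 0 & I_y - M_{\dot{q}} & 0 \\
0 & m x_g - Y_{\dot{r}} & 0 & -I_{zx} - K_{\dot{r}} & 0 & I_z - N_{\dot{r}}
\end{bmatrix} \tag{22}$$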
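where the hydrodynamic derivatives are defined
according to SNAME (1950). The Coriolis and
centripetal matrices are:

$$C_A(\nu) = \begin{bmatrix}
0 & 0 & 0 & 0 & -a_3 & a_2 \\
0 & 0 & 0 & a_3 & 0 & -a_1 \\
0 & 0 & 0 & -a_2 & a_1 & 0 \\
0 & -a_3 & a_2 & 0 & -b_3 & b_2 \\
a_3 & 0 & -a_1 & b_3 & 0 & -b_1 \\
-a_2 & a_1 & 0 & -b_2 & b_1 & 0
\end{bmatrix} \tag{23}$$

where

$$\begin{aligned}
a_1 &= X_{\dot{u}} u + X_{\dot{w}} w + X_{\dot{q}} q \\
a_2 &= Y_{\dot{v}} v + Y_{\dot{p}} p + Y_{\dot{r}} r \\
a_3 &= Z_{\dot{u}} u + Z_{\dot{w}} w + Z_{\dot{q}} q \\
b_1 &= K_{\dot{v}} v + K_{\dot{p}} p + K_{\dot{r}} r \\
b_2 &= M_{\dot{u}} u + M_{\dot{w}} w + M_{\dot{q}} q \\
b_3 &= N_{\dot{v}} v + N_{\dot{p}} p + N_{\dot{r}} r
\end{aligned} \tag{24}$$

and

$$C_{RB}(\nu) = \begin{bmatrix}
0 & -mr & mq & m z_g r & -m x_g q & -m x_g r \\
mr & 0 & -mp & 0 & m(z_g r + x_g p) & 0 \\
-mq & mp & 0 & -m z_g p & -m z_g q & m x_g p \\
-m z_g r & 0 & m z_g p & 0 & -I_{xz} p + I_z r & -I_y q \\
m x_g q & -m(z_g r + x_g p) & m z_g q & I_{xz} p - I_z r & 0 & -I_{xz} r + I_x p \\
m x_g r & 0 & -m x_g p & I_y q & I_{xz} r - I_x p & 0
\end{bmatrix} \tag{25}$$

Notice that this representation of $C_{RB}(\nu)$
only depends on the angular velocities $p$,
$q$, and $r$, and not the linear velocities
$u$, $v$, and $w$. This property will be exploited
when including drift due to ocean currents.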


Linear damping for a port-starboard symmetrical
vehicle takes the following form:

$$D = -\begin{bmatrix}
X_u & 0 & X_w & 0 & X_q & 0 \\
0 & Y_v & 0 & Y_p & 0 & Y_r \\
Z_u & 0 & Z_w & 0 & Z_q & 0 \\
0 & K_v & 0 & K_p & 0 & K_r \\
M_u & 0 & M_w & 0 & M_q & 0 \\
0 & N_v & 0 & N_p & 0 & N_r
\end{bmatrix} \tag{26}$$

Let $W = mg$ and $B = \rho g \nabla$ denote
the weight and buoyancy, where $m$ is the
mass of the vehicle including water in
free floating space, $\nabla$ the volume of fluid
displaced by the vehicle, $g$ the acceleration
of gravity (positive downward), and $\rho$
the water density. Hence, the generalized
restoring forces for a vehicle satisfying
$y_g = y_b = 0$ become (Fossen 1994,
2011):

$$g(\eta) = \begin{bmatrix}
(W - B)\, s\theta \\
-(W - B)\, c\theta\, s\phi \\
-(W - B)\, c\theta\, c\phi \\
(z_g W - z_b B)\, c\theta\, s\phi \\
(z_g W - z_b B)\, s\theta + (x_g W - x_b B)\, c\theta\, c\phi \\
-(x_g W - x_b B)\, c\theta\, s\phi
\end{bmatrix} \tag{27}$$

The expression for D can be extended to include 
nonlinear damping terms if necessary. Quadratic 
damping is important at higher speeds since 
the Coriolis and centripetal terms C(v)v can 
destabilize the system if only linear damping is 
used. 

In the presence of irrotational ocean currents,
we can rewrite (20) and (21) in terms of relative
velocity $\nu_r = \nu - \nu_c$ according to:

$$\dot{\eta} = J(\eta)\nu_r + [u_c^n, v_c^n, w_c^n, 0, 0, 0]^T \tag{28}$$

$$M\dot{\nu}_r + C(\nu_r)\nu_r + D(\nu_r)\nu_r + g(\eta) = \tau \tag{29}$$

where it is assumed that $C_{RB}(\nu_r) = C_{RB}(\nu)$,
which clearly is satisfied for (25). In addition, it
is assumed that $u_c^n$, $v_c^n$, and $w_c^n$ are constant. For
more details see Fossen (2011).
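
As a small worked example of the restoring term in (21), the sketch below evaluates the generalized restoring forces for a vehicle with $y_g = y_b = 0$, following the standard form in Fossen (1994, 2011). The function name and the numerical values are assumptions for illustration.

```python
import math


def restoring_forces(phi, theta, W, B, xg, zg, xb, zb):
    """Generalized restoring forces g(eta) for a vehicle with y_g = y_b = 0.

    W and B are weight and buoyancy; (xg, zg) and (xb, zb) locate CG and CB
    in {b}, with z positive downward.
    """
    sth, cth = math.sin(theta), math.cos(theta)
    sphi, cphi = math.sin(phi), math.cos(phi)
    return [
        (W - B) * sth,
        -(W - B) * cth * sphi,
        -(W - B) * cth * cphi,
        (zg * W - zb * B) * cth * sphi,
        (zg * W - zb * B) * sth + (xg * W - xb * B) * cth * cphi,
        -(xg * W - xb * B) * cth * sphi,
    ]


# Hypothetical neutrally buoyant vehicle with CG 5 cm below CB, rolled by 0.1 rad.
g_roll = restoring_forces(0.1, 0.0, W=1000.0, B=1000.0, xg=0.0, zg=0.05, xb=0.0, zb=0.0)
```

For a neutrally buoyant vehicle the three force components vanish and only moments remain; with CG below CB the roll component has the same sign as the roll angle, and because $g(\eta)$ sits on the left-hand side of (21) it acts to bring the vehicle back upright.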
Programs and Data

The Marine Systems Simulator (MSS) is a MATLAB/Simulink
library and simulator for marine
craft (http://www.marinecontrol.org). It includes
models for ships, underwater vehicles, and floating
structures.


Summary and Future Directions 

This entry has presented standard models for 
simulation of ships and underwater vehicles. It 
is recommended to consult Fossen (1994, 2011) 
for a more detailed description of marine craft 
hydrodynamics. 


Cross-References 

► Control of Networks of Underwater 
Vehicles 

► Control of Ship Roll Motion 

► Dynamic Positioning Control Systems for 
Ships and Underwater Vehicles 

► Underactuated Marine Control 
Systems 
Abstract

Mean Field Game (MFG) theory studies the existence
of Nash equilibria, together with the individual
strategies which generate them, in games
involving a large number of agents modeled by
controlled stochastic dynamical systems. This
is achieved by exploiting the relationship between
the finite and corresponding infinite limit
population problems. The solution of the infinite
population problem is given by the fundamental
MFG Hamilton-Jacobi-Bellman (HJB)
and Fokker-Planck-Kolmogorov (FPK) equations
which are linked by the state distribution of a
generic agent, otherwise known as the system’s
mean field.
Keywords

Fokker-Planck-Kolmogorov (FPK) equation;
Hamilton-Jacobi-Bellman (HJB) equation;
Nash equilibrium; Stochastic dynamical
systems

Introduction 

Large-population, dynamical, multi-agent, 
competitive, and cooperative phenomena occur 
in a wide range of designed and natural 
settings such as communication, environmental, 
epidemiological, transportation, and energy 
systems, and they underlie much economic 
and financial behavior. Analysis of such 
systems is intractable using the finite population 
game theoretic methods which have been 
developed for multi-agent control systems 
(see, e.g., Basar and Ho 1974; Basar and 
Olsder 1999; Ho 1980; and Bensoussan and 
Frehse 1984). The continuum population game 
theoretic models of economics (Aumann and 
Shapley 1974; Neyman 2002) are static, as, 
in general, are the large-population models 
employed in network games (Altman et al. 
2002) and transportation analysis (Correa 
and Stier-Moses 2010; Haurie and Marcotte 
1985; Wardrop 1952). However, dynamical (or 
sequential) stochastic games were analyzed in 
the continuum limit in the work of Jovanovic 
and Rosenthal (1988) and Bergin and Bernhardt 
(1992), where the fundamental mean field 
equations appear in the form of a discrete 
time dynamic programming equation and an 
evolution equation for the population state 
distribution. 

The mean field equations for dynamical games 
with large but finite populations of asymptotically 
negligible agents originated in the work of Huang 
et al. (2003, 2006, 2007) (where the framework 
was called the Nash Certainty Equivalence 
Principle) and independently in that of Lasry 
and Lions (2006a,b, 2007), where the now 
standard terminology of Mean Field Games 
(MFGs) was introduced. Independent of both
of these, the closely related notion of Oblivious
Equilibria for large-population dynamic games
was introduced by Weintraub et al. (2005) in
the framework of Markov Decision Processes
(MDPs).

One of the main results of MFG theory is that
in large-population stochastic dynamic games
individual feedback strategies exist for which any
given agent will be in a Nash equilibrium with
respect to the pre-computable behavior of the
mass of the other agents; this holds exactly in
the asymptotic limit of an infinite population and
with increasing accuracy for a finite population
of agents using the infinite population feedback
laws as the finite population size tends to infinity,
a situation which is termed an $\varepsilon$-Nash equilibrium.
This behavior is described by the solution
to the infinite population MFG equations which
are fundamental to the theory; they consist of
(i) a parameterized family of HJB equations (in
the nonuniform parameterized agent case) and
(ii) a corresponding family of McKean-Vlasov
(MV) FPK PDEs, where (i) and (ii) are linked
by the probability distribution of the state of
a generic agent, that is to say, the mean field.
For each agent, these yield (i) a Nash value
of the game, (ii) the best response strategy for
the agent, (iii) the agent’s stochastic differential
equation (SDE) (i.e., the MV-SDE pathwise
description), and (iv) the state distribution of such
an agent (via the MV FPK for the parameterized
individual).


Dynamical Agents 

In the diffusion-based models of large-population games, the state
evolution of a collection of $N$ agents $A_i$, $1 \le i \le N < \infty$,
is specified by a set of $N$ controlled stochastic differential equations
(SDEs) which in the important linear case take the form

$$dx_i(t) = [F_i x_i(t) + G_i u_i(t)]\,dt + D_i\,dw_i(t), \qquad 1 \le i \le N,$$

where $x_i \in \mathbb{R}^n$ is the state, $u_i \in \mathbb{R}^m$ the
control input, and $w_i$ the state Wiener process of the $i$th agent
$A_i$, where $\{w_i, 1 \le i \le N\}$ is a collection of $N$ independent
standard Wiener processes in $\mathbb{R}^r$ independent of all mutually
independent initial conditions. For simplicity, throughout this entry,
all collections of system initial conditions are taken to be independent
and zero mean and to have finite second moment.
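
A minimal simulation sketch of the linear agent dynamics above may help fix ideas. It assumes scalar states and an Euler-Maruyama time discretization, neither of which is prescribed by this entry; the names `simulate_linear_agents` and `zero_control` are hypothetical, and the control is taken to be a feedback on each agent's own state only.

```python
import math
import random


def simulate_linear_agents(F, G, D, x0, control, T=1.0, steps=1000, seed=0):
    """Euler-Maruyama for dx_i = (F_i x_i + G_i u_i) dt + D_i dw_i (scalar states)."""
    rng = random.Random(seed)
    dt = T / steps
    x = list(x0)
    for k in range(steps):
        t = k * dt
        for i in range(len(x)):
            u = control(i, t, x[i])              # feedback on the agent's own state
            dw = rng.gauss(0.0, math.sqrt(dt))   # independent Wiener increment
            x[i] += (F[i] * x[i] + G[i] * u) * dt + D[i] * dw
    return x


def zero_control(i, t, x):
    """Placeholder control law (assumption): no control action."""
    return 0.0


# Noise-free check: with D_i = 0 and u_i = 0 each state decays like exp(F_i * t).
final = simulate_linear_agents(F=[-1.0, -2.0], G=[1.0, 1.0], D=[0.0, 0.0],
                               x0=[1.0, 1.0], control=zero_control)
```

With nonzero `D` the same routine produces one sample path per agent; averaging the final states over a large population gives an empirical approximation of the population mean state.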

A simplified form of the general case treated in Huang et al. (2007) and
Nourian and Caines (2013) is given by the following set of controlled
SDEs which for each agent $A_i$ includes state coupling with all other
agents:

$$dx_i(t) = \frac{1}{N}\sum_{j=1}^{N} f(t, x_i(t), u_i(t), x_j(t))\,dt + \sigma\,dw_i(t), \qquad 1 \le i \le N,$$

where here, for the sake of simplicity, only the uniform
(non-parameterized) generic agent case is presented. The dynamics of a
generic agent in the infinite population limit of this system is then
described by the following controlled MV stochastic differential
equation:

$$dx(t) = f[t, x(t), u(t), \mu_t]\,dt + \sigma\,dw(t),$$

where $f[t,x,u,\mu_t] = \int_{\mathbb{R}^n} f(t,x,u,y)\,\mu_t(dy)$, with
the initial condition measure $\mu_0$ specified, where $\mu_t(\cdot)$
denotes the state distribution of the population at $t \in [0,T]$. The
dynamics used in the analysis in Lasry and Lions (2006a,b, 2007) and
Cardaliaguet (2012) are of the form
$dx_i(t) = u_i(t)\,dt + \sigma\,dw_i(t)$.

The dynamical evolution of the state $x_i$ of the $i$th agent $A_i$ in
the discrete time Markov Decision Processes (MDP)-based formulation of
the so-called anonymous sequential games (Bergin and Bernhardt 1992;
Jovanovic and Rosenthal 1988; Weintraub et al. 2005) is described by a
Markov state transition function, or kernel, of the form
$P_{t+1} := P(x_i(t+1) \mid x_i(t), x_{-i}(t), u_i(t), P_t)$.


Agent Performance Functions

In the basic finite population linear-quadratic diffusion case, the
agent $A_i$, $1 \le i \le N$, possesses a performance, or loss, function
of the form

$$J_i^N(u_i, u_{-i}) = E\int_0^T \left\{ \|x_i(t) - m_N(t)\|_Q^2 + \|u_i(t)\|_R^2 \right\} dt,$$

where we assume the cost coupling to be of the form
$m_N(t) := \gamma(\bar{x}_N(t) + \eta)$, $\eta \in \mathbb{R}^n$, where
$u_{-i}$ denotes all agents’ control laws except for that of the $i$th
agent and $\bar{x}_N$ denotes the population average state
$(1/N)\sum_{i=1}^{N} x_i$, and where here and below the expectation is
taken over an underlying sample space which carries all initial
conditions and Wiener processes.

For the nonlinear case introduced in the previous section, a
corresponding finite population mean field loss function is

$$J_i^N(u_i; u_{-i}) := E\int_0^T \left[ \frac{1}{N}\sum_{j=1}^{N} L(t, x_i(t), u_i(t), x_j(t)) \right] dt, \qquad 1 \le i \le N,$$

where $L$ is the nonlinear state cost-coupling function. Setting, by
possible abuse of notation,
$L[t,x,u,\mu_t] = \int_{\mathbb{R}^n} L(t,x,u,y)\,\mu_t(dy)$, the
infinite population limit of this cost function for a generic individual
agent $A$ is given by

$$J(u, \mu) := E\int_0^T L[t, x(t), u(t), \mu_t]\,dt,$$


which is the general expression for the infinite 
population individual performance functions 
appearing in Huang et al. (2006) and Nourian and 
Caines (2013) and which includes those of Lasry 
and Lions (2006a,b, 2007) and Cardaliaguet 
(2012). Exponentially discounted costs with
discount rate parameter $\rho$ are employed for
infinite time horizon performance functions in
Huang et al. (2003, 2007), while the sample path
limit of the long-range average is used for ergodic
MFG problems in Lasry and Lions (2006a, 2007) 
and Li and Zhang (2008) and in the analysis of 
adaptive MFG systems (Kizilkale and Caines 
2013). 

The Existence of Equilibria 

The objective of each agent is to find strategies
(i.e., control laws) which are admissible with
respect to information and other constraints and
which minimize its performance function. The
resulting problem is necessarily game theoretic
and consequently central results of the topic concern
the existence of Nash Equilibria and their
properties.

The basic linear-quadratic mean field problem has an explicit solution
characterizing a Nash equilibrium (see Huang et al. 2003, 2007).
Consider the scalar infinite time horizon discounted case, with
nonuniform parameterized agents $A_\theta$ with parameter distribution
$F(\theta)$, $\theta \in \mathcal{A}$, and dynamical parameters
identified as $a_\theta := F_\theta$, $b_\theta := G_\theta$, $Q := 1$,
$r := R$; then the so-called Nash Certainty Equivalence (NCE) equation
scheme generating the equilibrium solution takes the form

$$\rho s_\theta = \frac{ds_\theta}{dt} + a_\theta s_\theta - \frac{b_\theta^2}{r}\Pi_\theta s_\theta - x^*,$$

$$\frac{d\bar{x}_\theta}{dt} = \left(a_\theta - \frac{b_\theta^2}{r}\Pi_\theta\right)\bar{x}_\theta - \frac{b_\theta^2}{r} s_\theta, \qquad 0 \le t < \infty,$$

$$\bar{x}(t) = \int_{\mathcal{A}} \bar{x}_\theta(t)\,dF(\theta),$$

$$x^*(t) = \gamma(\bar{x}(t) + \eta),$$

$$\rho\Pi_\theta = 2a_\theta\Pi_\theta - \frac{b_\theta^2}{r}\Pi_\theta^2 + 1, \quad \Pi_\theta > 0 \qquad \text{(Riccati Equation)},$$

where the control action of the generic parameterized agent $A_\theta$
is given by
$u_\theta^0(t) = -\frac{b_\theta}{r}(\Pi_\theta x_\theta(t) + s_\theta(t))$,
$0 \le t < \infty$. $u_\theta^0$ is the optimal tracking feedback law
with respect to $x^*(t)$ which is an affine function of the mean field
term $\bar{x}(t)$, the mean with respect to the parameter distribution
$F$ of the $\theta \in \mathcal{A}$ parameterized state means of the
agents. Subject to the conditions for the NCE scheme to have a solution,
each agent is necessarily in a Nash equilibrium in all full information
causal (i.e., non-anticipative) feedback laws with respect to the
remainder of the agents when these are employing the law $u^0$.
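
As a hedged numerical illustration, the sketch below computes a time-invariant (steady-state) solution of the scalar NCE scheme for uniform agents, i.e., a single parameter value. It assumes the scheme in the form $\rho s = ds/dt + as - (b^2/r)\Pi s - x^*$, $d\bar{x}/dt = (a - (b^2/r)\Pi)\bar{x} - (b^2/r)s$, $x^* = \gamma(\bar{x} + \eta)$, $\rho\Pi = 2a\Pi - (b^2/r)\Pi^2 + 1$, and sets the time derivatives to zero. The function name `nce_steady_state`, the numerical values, and the check `|c| < 1` (a contraction-type condition adopted here for the sketch) are assumptions, not part of the entry.

```python
import math


def nce_steady_state(a, b, r, rho, gamma, eta):
    """Steady state (Pi, s, x_bar, x_star) of the scalar, uniform-agent NCE scheme."""
    b2r = b * b / r
    # Positive root of the Riccati equation rho*Pi = 2*a*Pi - b2r*Pi**2 + 1.
    Pi = ((2 * a - rho) + math.sqrt((2 * a - rho) ** 2 + 4 * b2r)) / (2 * b2r)
    beta1 = -a + b2r * Pi      # decay rate of the closed-loop mean dynamics
    beta2 = beta1 + rho        # rate appearing in the offset equation
    if beta1 <= 0:
        raise ValueError("closed-loop mean dynamics are not stable")
    c = gamma * b2r / (beta1 * beta2)
    if abs(c) >= 1:
        raise ValueError("assumed contraction-type condition |c| < 1 fails")
    x_bar = c * eta / (1 - c)          # solves x_bar = c * (x_bar + eta)
    x_star = gamma * (x_bar + eta)     # tracked signal
    s = -x_star / beta2                # steady-state offset
    return Pi, s, x_bar, x_star


# Illustrative (hypothetical) parameter values.
Pi, s, x_bar, x_star = nce_steady_state(a=1.0, b=1.0, r=1.0, rho=0.5,
                                        gamma=0.6, eta=0.25)
```

The resulting feedback law for a generic agent in this setting would be $u(t) = -(b/r)(\Pi x(t) + s)$, which uses only the agent's own state and the precomputed offset.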

It is an important feature of the best response
control law $u_\theta^0$ that its form depends only on
the parametric data of the entire set of agents,
and at any instant it is a feedback function
of only the state of the agent $A_\theta$ itself and
the deterministic mean field-dependent offset
$s_\theta$.

For the general nonlinear case, the MFG 
equations on [0, T] are given by the linked 
equations for (i) the performance function 
V for each agent in the continuum, (ii) the 
FPK for the MV-SDE for that agent, and 
(iii) the specification of the best response 
feedback law depending on the mean field 
measure $\mu_t$ and the agent’s state $x(t)$. In the
uniform agents case, these take the following 
form. 

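The Mean Field Game HJB-(MV) FPK Equations

$$\text{[MV-HJB]} \quad -\frac{\partial V(t,x)}{\partial t} = \inf_{u \in U}\left\{ f[t,x,u,\mu_t]\frac{\partial V(t,x)}{\partial x} + L[t,x,u,\mu_t] \right\} + \frac{\sigma^2}{2}\frac{\partial^2 V(t,x)}{\partial x^2},$$

$$V(T,x) = 0, \qquad (t,x) \in [0,T] \times \mathbb{R}^n,$$

$$\text{[MV-FPK]} \quad \frac{\partial \mu(t,x)}{\partial t} = -\frac{\partial\{ f[t,x,u(t),\mu_t]\,\mu(t,x) \}}{\partial x} + \frac{\sigma^2}{2}\frac{\partial^2 \mu(t,x)}{\partial x^2},$$

$$\text{[MV-BR]} \quad u(t) = \varphi(t, x(t) \mid \mu_t), \qquad (t,x) \in [0,T] \times \mathbb{R}^n.$$
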


The general nonlinear MFG problem is 
approached by different routes in Huang et al. 
(2006) and Nourian and Caines (2013), and Lasry 
and Lions (2006a,b, 2007) and Cardaliaguet 
(2012), respectively. In the former, the so-called 
probabilistic method solves the MFG equations 
directly. Subject to technical conditions, an 
iterated contraction argument establishes the 
existence of a solution to the HJB-(MV) FPK 
equations; the best response control laws are 
obtained from these MFG equations, and 
these are necessarily Nash equilibria within all 
causal feedback laws for the infinite population 
problem. In Lasry and Lions (2006a, 2007) the 
MFG equations on the infinite time interval (i.e., 
the ergodic case) are obtained as the limit of Nash 
equilibria for increasing finite populations, while 
in the expository notes of Cardaliaguet (2012) the 
analytic properties of solutions to the HJB-FPK 
equations on the finite interval are analyzed using 
PDE methods including the theory of viscosity 
solutions. 

In Huang et al. (2003, 2006, 2007), Nourian and Caines (2013), and
Cardaliaguet (2012), it is shown that subject to technical conditions,
the solutions to the HJB-FPK scheme yield $\varepsilon$-Nash solutions
for finite population MFGs in that for any $\varepsilon > 0$, there
exists a population size $N_\varepsilon$ such that for all larger
populations the use of the feedback law given by the MFG infinite
population scheme gives each agent a value to its performance function
within $\varepsilon$ of the infinite population Nash value.

A counterintuitive feature of these results is 
that, asymptotically in population size, observations
of the states of rival agents are of no value to
any given agent; this is in contrast to the situation 
in single-agent optimal control theory where the 
value of observations on an agent’s environment 
is in general positive. 


Current Developments and Open 
Problems 

There is now an extensive literature on Mean 
Field Games, the following being a sample: 
the mathematical literature has focused on 
the study of general classes of solutions to 
the fundamental HJB-FPK equations (see e.g., 
Cardaliaguet (2013)), while in systems and 
control, the theory of major-minor agent MFG 
problems (in economics terminology, atoms 
and continua) is being developed (Huang 2010; 
Nourian and Caines 2013; Nguyen and Huang 
2012), adaptive control extensions of the LQG 
theory have been carried out (Kizilkale and 
Caines 2013), and the risk-sensitive case has
been analyzed (Tembine et al. 2012). Much work
is now under way in the applications of MFG 
theory to economics, finance, distributed energy 
systems, and electrical power markets. Each 
of these areas has significant open problems, 
including the application of mathematical 
transport theory to HJB-FPK equations, the 
role of MFG theory in portfolio optimization, 
and the analysis of systems where the presence 
of partially observed major and minor agent 
states incurs mean field and agent state 
estimation. 

Cross-References 

► Dynamic Noncooperative Games 

► Game Theory: Historical Overview 

► Stochastic Dynamic Programming 

► Stochastic Linear-Quadratic Control 

► Stochastic Maximum Principle 

Abstract

Mechanism design is concerned with the design 
of strategic environments to achieve desired 
outcomes at equilibria of the resulting game. 
We briefly overview central ideas in mechanism 
design. We survey both objectives the mechanism 
designer may seek to achieve, as well as 
equilibrium concepts the designer may use 
to model agents. We conclude by discussing 
a seminal example of mechanism design 
at work: the Vickrey-Clarke-Groves (VCG) 
mechanisms. 

Keywords 

Game theory; Incentive compatibility; Vickrey- 
Clarke-Groves mechanisms 

Introduction 

Informally, mechanism design might be 
considered “inverse game theory.” In mechanism 
design, a principal (the “designer”) creates a 
system (the “mechanism”) in which strategic 
agents interact with each other. Typically, the 
goal of the mechanism designer is to ensure 
that at an “equilibrium” of the resulting strategic 
interaction, a “desirable” outcome is achieved. 
Examples of mechanism design at work include 
the following: 

1. The FCC chooses to auction spectrum among 
multiple competing, strategic bidders to maximize
the revenue generated. How should the
FCC design the auction? 

2. A search engine decides to run a market for 
sponsored search advertising. How should the 
market be designed? 

3. The local highway authority decides to charge 
tolls for certain roads to reduce congestion. 
How should the tolls be chosen? 


In each case, the mechanism designer is shaping
the incentives of participants in the system.
The mechanism designer must first define the 
desired objective and then choose a mechanism 
that optimizes that objective given a prediction of 
how strategic agents will respond. The theory of 
mechanism design provides guidance in solving 
such optimization problems. 

We provide a brief overview of some central 
concepts in mechanism design. In the first 
section, we delve into more detail on the structure 
of the optimization problem that a mechanism 
designer solves. In particular, we discuss two 
central features of this problem: (1) What 
is the objective that the mechanism designer 
seeks to achieve or optimize? (2) How does the 
mechanism designer model the agents, i.e., what 
equilibrium concept describes their strategic 
interactions? In the second section, we study 
a specific celebrated class of mechanisms, the 
Vickrey-Clarke-Groves mechanisms. 


Objectives and Equilibria 

A mechanism design problem requires two essential
inputs, as described in the introduction. First,
what is the objective the mechanism designer is 
trying to achieve or optimize? And second, what 
are the constraints within which the mechanism 
designer operates? On the latter question, perhaps 
the biggest “constraint” in mechanism design is 
that the agents are assumed to act rationally in 
response to whatever mechanism is imposed on 
them. In other words, the mechanism designer 
needs to model how the agents will interact with 
each other. Mathematically, this is modeled by 
a choice of equilibrium concept. For simplicity, 
we focus only on static mechanism design, i.e., 
mechanism design for settings where all agents 
act simultaneously. 

Objectives 

In this section we briefly discuss three objectives
the mechanism designer may choose to
optimize for: efficiency, revenue, and a fairness
criterion.
1. Efficiency. When the mechanism designer focuses on “efficiency,”
they are interested in ensuring that the equilibrium outcome of the
game they create is a Pareto efficient outcome. In other words, at an
equilibrium of the game, there should be no individual that can be made
strictly better off while leaving all others at least as well off as
they were before. The most important instantiation of the efficiency
criterion arises in quasilinear settings, i.e., settings where the
utility of all agents is measured in a common, transferable monetary
unit. In this case, it can be shown that achieving efficient outcomes is
equivalent to maximizing the aggregate utility of all agents in the
system. See Chap. 23 in Mas-Colell et al. (1995) for more details on
mechanism design for efficient outcomes.

2. Revenue. Efficiency may be a reasonable goal 
for an impartial social planner; on the other 
hand, in many applied settings, the mechanism 
designer is often herself a profit-maximizing 
party. In these cases, it is commonly the goal 
of the mechanism designer to maximize her 
own payoff from the mechanism itself. 

A common example of this scenario is in 
the design of optimal auctions. An auction is a 
mechanism for the sale of a good (or multiple 
goods) among many competing buyers. When 
the principal is self-interested, she may wish 
to choose the auction design that maximizes 
her revenue from sale; the celebrated paper 
of Myerson (1981) studies this problem in 
detail. 

3. Fairness. Finally, in many settings, the 
mechanism designer may be interested more 
in achieving a “fair” outcome - even if such 
an outcome is potentially not Pareto efficient. 
Fairness is subjective, and therefore, there 
are many potential objectives that might be 
viewed as fair by the mechanism designer. 
One common setting where the mechanism
designer strives for fair outcomes is in cost
sharing : in a canonical example, the cost 
of a project must be shared “fairly” among 
many participants. See Chap. 15 of Nisan 
et al. (2007) for more discussion of such 
mechanisms. 


Equilibrium Concepts 

In this section we briefly discuss a range of equilibrium concepts the
mechanism designer might use to model the behavior of players. From an
optimization viewpoint, mechanism design should be viewed as
maximization of the designer’s objective, subject to an equilibrium
constraint. The equilibrium concept used captures the mechanism
designer’s judgment about how the agents can be expected to interact
with each other, once the mechanism designer has fixed the mechanism.
Here we briefly discuss three possible equilibrium concepts that might
be used by the mechanism designer.

1. Dominant strategies. In dominant strategy 
implementation, the mechanism designer 
assumes that agents will play a (weak or strict) 
dominant strategy against their competitors. 
This equilibrium concept is obviously quite 
strong, as dominant strategies may not exist 
in general. However, the advantage is that 
when the mechanism possesses dominant 
strategies for each player, the prediction of 
play is quite strong. The Vickrey-Clarke- 
Groves mechanisms described below are 
central in the theory of mechanism design 
with dominant strategies. 

2. Bayesian equilibrium. In a Bayesian 
equilibrium, agents optimize given a common 
prior distribution about the other agents’ 
preferences. In Bayesian mechanism design, 
the mechanism designer chooses a mechanism 
taking into account that the agents will play 
according to a Bayesian equilibrium of the 
resulting game. This solution concept allows 
the designer to capture a lack of complete 
information among players, but typically 
allows for a richer family of mechanisms than 
mechanism design with dominant strategies. 
Myerson’s work on optimal auction design is 
carried out in a Bayesian framework (Myerson 
1981). 

3. Nash equilibrium. Finally, in a setting where 
the mechanism designer believes the agents 
will be quite knowledgeable about each 
other’s preferences, it may be reasonable to 
assume they will play a Nash equilibrium of 
the resulting game. Note that in this case, 
it is typically assumed the designer does not
know the utilities of agents at the time the
mechanism is chosen - even though agents do
know their own utilities at the time the resulting
game is actually played. See, e.g., Moore
(1992) for an overview of mechanism design
with Nash equilibrium as the solution concept.

The Vickrey-Clarke-Groves 
Mechanisms 

In this section, we describe a seminal example of mechanism design at
work: the Vickrey-Clarke-Groves (VCG) class of mechanisms. The key
insight behind VCG mechanisms is that by structuring payment rules
correctly, individuals can be incentivized to truthfully declare their
utility functions to the market and in turn achieve an efficient
allocation. VCG mechanisms are an example of mechanism design with
dominant strategies and with the goal of welfare maximization, i.e.,
efficiency. The presentation here is based on the material in Chap. 5 of
Berry and Johari (2011), and the reader is referred there for further
discussion. See also Vickrey (1961), Clarke (1971), and Groves (1973)
for the original papers discussing this class of mechanisms.

To illustrate the principle behind VCG mechanisms, consider a simple
example where we allocate a single resource of unit capacity among $R$
competing users. Each user’s utility is measured in terms of a common
currency unit; in particular, if the allocated amount is $x_r$ and the
payment to user $r$ is $t_r$, then her utility is $U_r(x_r) + t_r$; we
refer to $U_r$ as the valuation function, and let the space of valuation
functions be denoted by $\mathcal{U}$. For simplicity we assume the
valuation functions are continuous. In line with our discussion of
efficiency above, it can be shown that the Pareto efficient allocation
is obtained by solving the following:

$$\text{maximize} \quad \sum_r U_r(x_r) \qquad (1)$$

$$\text{subject to} \quad \sum_r x_r \le 1; \qquad (2)$$

$$x \ge 0. \qquad (3)$$



However, achieving the efficient allocation requires knowledge of the
utility functions; what can we do if these are unknown? The key insight
is to make each user act as if they are optimizing the aggregate
utility, by structuring payments appropriately. The basic approach in a
VCG mechanism is to let the strategy space of each user $r$ be the set
$\mathcal{U}$ of possible valuation functions and make a payment $t_r$
to user $r$ so that her net payoff has the same form as the social
objective (1). In particular, note that if user $r$ receives an
allocation $x_r$ and a payment $t_r$, the payoff to user $r$ is

$$U_r(x_r) + t_r.$$

On the other hand, the social objective (1) can be written as

$$U_r(x_r) + \sum_{s \ne r} U_s(x_s).$$

Comparing the preceding two expressions, the most natural means to align
user objectives with the social planner’s objectives is to define the
payment $t_r$ as the sum of the valuations of all users other than $r$.
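
As a hedged illustration of this alignment idea, the following Python sketch works with a finite list of candidate allocations rather than the continuous problem (1)-(3). On top of the sum of the other users' declared valuations, it subtracts the best welfare those users could obtain on their own; this offset (the Clarke pivot) does not depend on the user's own declaration and is one particular choice made for this sketch. The names `vcg`, `linear`, `payoff`, and the numerical valuations are hypothetical.

```python
def vcg(declared, allocations):
    """Return (chosen allocation, payments made TO each user) for declared valuations."""
    def welfare(x, skip=None):
        # Sum of declared valuations, optionally leaving out user `skip`.
        return sum(U(x[s]) for s, U in enumerate(declared) if s != skip)

    x_star = max(allocations, key=welfare)          # declared-welfare maximizer
    payments = []
    for r in range(len(declared)):
        others_at_x_star = welfare(x_star, skip=r)  # sum of others' declared values
        clarke_offset = -max(welfare(x, skip=r) for x in allocations)
        payments.append(others_at_x_star + clarke_offset)
    return x_star, payments


def linear(v):
    """Linear valuation U(x) = v * x (an illustrative assumption)."""
    return lambda amount: v * amount


# One indivisible unit, three users; each allocation gives the unit to one user.
ALLOCATIONS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
TRUE_VALUES = [5, 3, 1]


def payoff(r, bids):
    """True payoff of user r when users declare linear valuations with slopes `bids`."""
    x, t = vcg([linear(b) for b in bids], ALLOCATIONS)
    return TRUE_VALUES[r] * x[r] + t[r]


x_truthful, t_truthful = vcg([linear(v) for v in TRUE_VALUES], ALLOCATIONS)
```

In this example the unit goes to the user with the highest declared value, whose payment equals minus the second-highest declared value, so the sketch reduces to a second-price auction.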

A VCG mechanism first asks each user to declare a valuation function.
For each $r$, we use $\hat{U}_r$ to denote the declared valuation
function of user $r$ and use $\hat{U} = (\hat{U}_1, \ldots, \hat{U}_R)$
to denote the vector of declared valuations. Formally, given a vector of
declared valuation functions $\hat{U}$, a VCG mechanism chooses the
allocation $x(\hat{U})$ as an optimal solution to (1)-(3) given
$\hat{U}$, i.e.,

$$x(\hat{U}) \in \arg\max_{x \ge 0:\ \sum_r x_r \le 1} \sum_r \hat{U}_r(x_r). \qquad (4)$$

The payments are then structured so that

$$t_r(\hat{U}) = \sum_{s \ne r} \hat{U}_s(x_s(\hat{U})) + h_r(\hat{U}_{-r}). \qquad (5)$$

Here $h_r$ is an arbitrary function of the declared valuation functions
of users other than $r$, and various definitions of $h_r$ give rise to
variants of the VCG mechanisms. Since user $r$ cannot affect this term
through the choice of $\hat{U}_r$, she chooses $\hat{U}_r$ to maximize

$$U_r(x_r(\hat{U})) + \sum_{s \ne r} \hat{U}_s(x_s(\hat{U})).$$

Now note that given $\hat{U}_{-r}$, the above expression is bounded
above by

$$\max_{x \ge 0:\ \sum_r x_r \le 1} \left[ U_r(x_r) + \sum_{s \ne r} \hat{U}_s(x_s) \right].$$

But since $x(\hat{U})$ satisfies (4), user $r$ can achieve the preceding
maximum by truthfully declaring $\hat{U}_r = U_r$. Since this optimal
strategy does not depend on the valuation functions
$(\hat{U}_s, s \ne r)$ declared by the other users, we recover the
important fact that in a VCG mechanism, truthful declaration is a weak
dominant strategy for user $r$.

For our purposes, the interesting feature of the VCG mechanism is that
it elicits the true utilities from the users and in turn (because of the
definition of $x(\hat{U})$) chooses an efficient allocation. The feature
that truthfulness is a dominant strategy is known as incentive
compatibility: the individual incentives of users are aligned, or
“compatible,” with overall efficiency of the system. The VCG mechanism
achieves this by effectively paying each agent to tell the truth. The
significance of the approach is that this payment can be properly
structured even if the resource manager does not have prior knowledge of
the true valuation functions.


Summary and Future Directions

Mechanism design provides an overarching framework for the
“engineering” of economic systems. However, many significant challenges
remain. First, VCG mechanisms are not computationally tractable in
complex settings, e.g., combinatorial auctions (Hajek 2013); finding
computationally tractable yet efficient mechanisms is a very active area
of current research. Second, VCG mechanisms optimize for overall
welfare, rather than revenue, and finding simple mechanisms that
maximize revenue also presents new challenges. Finally, we have only
considered implementation in static environments. Most practical
mechanism design settings are dynamic. Dynamic mechanism design remains
an active area of fruitful research.


Cross-References

► Auctions

► Game Theory: Historical Overview

► Linear Quadratic Zero-Sum Two-Person Differential Games

Abstract

The process of developing control-oriented mathematical
models of physical systems is a complex
task, which in general implies a careful combination
of prior knowledge about the physics of
the system under study with information coming
from experimental data. In this article the role
of mathematical models in control system design
and the problem of developing compact control-oriented
models are discussed.

Keywords 

Analytical models; Computational modeling;
Continuous-time systems; Control-oriented modeling;
Discrete-time systems; Parameter-varying
systems; Simulation; System identification;
Time-invariant systems; Time-varying systems;

Introduction

The design of automatic control systems requires the availability of
some knowledge of the dynamics of the process to be controlled. In this
respect, current methods for control system synthesis can be classified
in two broad categories: model-free and model-based ones.

The former aim at designing (or tuning) controllers solely on the basis
of experimental data collected directly on the plant, without resorting
to mathematical models.

The latter, on the contrary, assume that suitable models of the plant to
be controlled are available, and rely on this information to work out
control laws capable of meeting the design requirements.

While the research on model-free design 
methods is a very active field, the vast majority 
of control synthesis methods and tools fall in 
the model-based category and therefore assume 
that knowledge about the plant to be controlled 
is encoded in the form of dynamic models of 
the plant itself. Furthermore, in an increasing 
number of application areas, control system 
performance is becoming a key competitive 
factor for the success of innovative, high-tech
systems. Consider, for example, high-performance
mechatronic systems (such as
robots); vehicles enhanced by active integrated
stability, suspension, and braking control;
aerospace systems; advanced energy conversion
systems. All the abovementioned applications 
possess at least one of the following features, 
which in turn call for accurate mathematical 
modeling for the design of the control system: 
closed-loop performance critically depends on 
the dynamic behavior of the plant; the system 
is made of many closely interacting subsystems; 
advanced control systems are required to obtain 
competitive performance, and these in turn 
depend on explicit mathematical models for their 
design; the system is safety critical and requires 
extensive validation of closed-loop stability and 
performance by simulation. 

Therefore, building control-oriented mathematical models of physical
systems is a crucial prerequisite to the design process itself (see,
e.g., Lovera (2014) for a more detailed treatment of this topic).

In the following, two aspects related to modeling for control system
synthesis will be discussed, namely, the role of models for control
system synthesis and the actual process of model building itself.

The Role of Models for Control 
System Synthesis 

Mathematical models play a number of different roles in the design of
control systems. In particular, different classes of mathematical models
are usually employed: detailed, high-fidelity models for system
simulation and compact models for control design. In this section the
two model classes are presented and their respective roles in the design
of control systems are described. Note, in passing, that although hybrid
system control is an interesting and emerging field, this entry will
focus on purely continuous-time physical models, with application to the
design of continuous-time or sampled-time control systems.

Detailed Models for System Simulation 

Detailed models play a double role in the control design process. On one
hand, they allow checking how good (or crude) the compact model is,
compared to a more detailed description, thus helping to develop good
compact models. On the other hand, they allow closed-loop performance
verification of the controlled system, once a controller design is
available. Indeed, verifying closed-loop performance using the same
simplified model that was used for control system design is not a sound
practice; conversely, verification performed with a more detailed model
is usually deemed a good indicator of the control system performance,
whenever experimental validation is not possible for some reason.

Object-oriented modeling (OOM) methodologies and equation-based,
object-oriented languages (EOOLs) provide very good support for the
development of such high-fidelity models, thanks to equation-based
modeling, acausal physical ports, hierarchical system composition, and
inheritance; see Tiller (2001) for a comprehensive overview. Any
continuous-time EOOL model is equivalent to the set of
differential-algebraic equations (DAEs)

$$F(x(t), \dot{x}(t), u(t), y(t), p, t) = 0, \qquad (1)$$

where $x$ is the vector of dynamic variables, $u$ is the vector of input
variables, $y$ is the vector of algebraic variables, $p$ is the vector
of parameters, and $t$ is the time. It is interesting to highlight that
the object-oriented approach (in particular, the use of replaceable
components) allows defining and managing families of models of the same
plant with different levels of complexity, by providing more or less
detailed implementations of the same abstract interfaces. This feature
of OOM allows the development of simulation models for different
purposes and with different degrees of detail throughout the entire life
of an engineering project, from preliminary design down to commissioning
and personnel training, all within a coherent framework.

In particular, when focusing on control systems verification (and
regardless of the actual control design methodology) once the controller
has been set up, an OOM tool can be used to run closed-loop simulations,
including both the plant and the controller model. Many OOM tools
provide model export facilities, which allow connecting a plant model
with only causal external connectors (actuator inputs and sensor
outputs) to a causal controller model in a causal simulation
environment. From a mathematical point of view, this corresponds to
reformulating (1) in state-space form, by means of analytical and/or
numerical transformations.

Finally, it is important to point out that 
physical model descriptions based on partial- 
differential equations (PDEs) can be handled in 
the OOM framework by means of discretization 
using finite volume, finite elements, or finite 
differences methods. 

Compact Models for Control Design 

The requirements for a control-oriented model 
can vary significantly from application to application.
Design models can be tentatively classified
in terms of two key features: complexity and 
accuracy. For a dynamic model, complexity can 
be measured in terms of its order; accuracy, on 
the other hand, can be measured using many 
different metrics (e.g., time-domain simulation 
or prediction error, frequency domain matching 
with the real plant, etc.) related to the capability 
of the model to reproduce the behavior of the true 
system in the operating conditions of interest. 

Broadly speaking, it can be safely stated that 
the required level of closed-loop performance 
drives the requirements on the accuracy and 
complexity of the design model. Similarly, 
it is intuitive that more complex models 
have the potential for being more accurate. 
So, one might be tempted to resort to very 
detailed mathematical representations of the 
plant to be controlled in order to maximize 
closed-loop performance. This consideration 
however is moderated by a number of additional 
requirements, which actually end up driving the 
control-oriented modeling process. First of all, 
present-day controller synthesis methods and 
tools have computational limitations in terms of 
the complexity of the mathematical models they 
can handle, so compact models representative of 
the dominant dynamics of the system under study 
are what is really needed. Furthermore, for many
synthesis methods (such as, e.g., LQG or $H_\infty$
synthesis), the complexity of the design model
has an impact on the complexity of the controller,
which in turn is constrained by implementation
issues. Last but not least, in engineering projects,
the budget of the control-oriented modeling 
activity is usually quite limited, so the achievable 
level of accuracy is affected by this limitation. 

It is clear from the above discussion that developing mathematical
models suitable for control system synthesis is a nontrivial task, which
amounts to the pursuit of a careful tradeoff between complexity and
accuracy. Furthermore, throughout the model development, one should keep
in mind the eventual control application of the model, so its
mathematical structure has to be compatible with currently available
methods and tools for control system analysis and design.

Control-oriented models are usually formulated
in state-space form:

\[
\begin{aligned}
\dot{x}(t) &= f(x(t), u(t), p, t) \\
y(t) &= g(x(t), u(t), p, t)
\end{aligned} \tag{2}
\]

where x is the vector of state variables, u is the
vector of system inputs (control variables and
disturbances), y is the vector of system outputs, p is
the vector of parameters, and t is the continuous
time. In the following, however, the focus will 
be on linear models, which constitute the starting 
point for most control law design methods and 
tools. In this respect, the main categories of 
models used in control system synthesis can be 
defined as follows. 

Linear Time-Invariant Models 
Linear time-invariant (LTI) models can be described
in state-space form as

\[
\begin{aligned}
\dot{x}(t) &= Ax(t) + Bu(t) \\
y(t) &= Cx(t) + Du(t)
\end{aligned} \tag{3}
\]

or, equivalently, using an input-output model
given by the (rational) transfer function

\[
G(s) = C(sI - A)^{-1}B + D, \tag{4}
\]

where s denotes the Laplace variable. In many
cases, the dynamics of systems in the form (2) in
the neighborhood of an equilibrium (trim) point
are approximated by (3) via analytical or numerical
linearization.
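
As an illustration of numerical linearization, the following minimal sketch approximates the matrices of (3) by central finite differences of f and g around a trim point. It assumes a time-invariant model with the parameters p folded into f and g; the function `linearize` and the damped-pendulum example plant are hypothetical and only serve to make the sketch runnable.

```python
import numpy as np

def linearize(f, g, x0, u0, eps=1e-6):
    """Central-difference Jacobians of f(x, u) and g(x, u) at the trim point (x0, u0)."""
    x0 = np.asarray(x0, dtype=float)
    u0 = np.asarray(u0, dtype=float)

    def jac(fun, z0, wrt_state):
        cols = []
        for i in range(z0.size):
            dz = np.zeros_like(z0)
            dz[i] = eps
            if wrt_state:
                hi, lo = fun(x0 + dz, u0), fun(x0 - dz, u0)
            else:
                hi, lo = fun(x0, u0 + dz), fun(x0, u0 - dz)
            cols.append((np.asarray(hi) - np.asarray(lo)) / (2.0 * eps))
        return np.column_stack(cols)

    A = jac(f, x0, True)    # df/dx
    B = jac(f, u0, False)   # df/du
    C = jac(g, x0, True)    # dg/dx
    D = jac(g, u0, False)   # dg/du
    return A, B, C, D

# Hypothetical plant: damped pendulum with torque input, angle as output.
def pendulum_f(x, u):
    return np.array([x[1], -9.81 * np.sin(x[0]) - 0.5 * x[1] + u[0]])

def pendulum_g(x, u):
    return np.array([x[0]])

# Linearize around the hanging equilibrium (x = 0, u = 0).
A, B, C, D = linearize(pendulum_f, pendulum_g, [0.0, 0.0], [0.0])
```

Central differences are used here only for simplicity; modeling tools typically rely on symbolic or automatic differentiation where available.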

If, on the contrary, the control-oriented model 
is obtained by linearization of the DAE system 
(1), then a generalized LTI (or descriptor) model 
in the form 

\[
\begin{aligned}
E\dot{x}(t) &= Ax(t) + Bu(t) \\
y(t) &= Cx(t) + Du(t)
\end{aligned} \tag{5}
\]

is obtained. Clearly, a generalized LTI model is 
equivalent to a conventional one as long as E 
is nonsingular. The generalized form, however, 
encompasses the wider class of linearized plants 
with a singular E. 

Linear Time-Varying Models 
In some engineering applications, the need may
arise to linearize the detailed model in the
neighborhood of a trajectory rather than around an
equilibrium point. Whenever this is the case, a
linear time-varying (LTV) model is obtained, in
the form

\[
\begin{aligned}
\dot{x}(t) &= A(t)x(t) + B(t)u(t) \\
y(t) &= C(t)x(t) + D(t)u(t).
\end{aligned} \tag{6}
\]

An important subclass is the one in which the
state-space matrices of the model are periodic in time,
which corresponds to a linear time periodic (LTP)
model. LTP models arise when considering the
linearization along periodic trajectories or, as
occurs in a number of engineering problems,
whenever rotating systems are considered (e.g.,
spacecraft, rotorcraft, wind turbines). Finally, it
is interesting to recall that (discrete-time) LTP
models are needed to model multi-rate sampled
data systems.

Linear Parameter-Varying Models
The control-oriented modeling problem can
also be formulated as that of simultaneously
representing all the linearizations of interest
for control purposes of a given nonlinear plant.
Indeed, in many control engineering applications 
a single control system must be designed to 
guarantee the satisfactory closed-loop operation 
of a given plant in many different operating
conditions (either equilibria or trajectories).
Many design techniques are now available for
this problem (see, e.g., Mohammadpour and
Scherer 2012), provided that a suitable model in
parameter-dependent form has been derived for
the system to be controlled. Linear parameter-varying
(LPV) models, described in state-space
form as

\[
\begin{aligned}
\dot{x}(t) &= A(p(t))x(t) + B(p(t))u(t) \\
y(t) &= C(p(t))x(t) + D(p(t))u(t)
\end{aligned} \tag{7}
\]

are linear models the state-space representation 
of which depends on a parameter vector p that 
can be time varying. The elements of vector p 
may or may not be measurable, depending on the 
specific problem formulation. The present state
of the art of LPV modeling can be briefly summarized
by defining two classes of approaches
(see Lopes dos Santos et al. (2011) for details).
Analytical methods are based on the availability of
reliable nonlinear equations for the dynamics of
the plant, from which suitable control-oriented
representations can be derived (by resorting to,
broadly speaking, suitable extensions of the familiar
notion of linearization, developed in order
to take into account off-equilibrium operation
of the system). Experimental methods are based
entirely on identification, i.e., they aim at deriving
LPV models for the plant directly from input/output
data. In particular, some LPV identification
techniques assume that one global identification
experiment can be performed in which both the
control input and the parameter vector are
(persistently) excited simultaneously, while others
aim at deriving a
parameter-dependent model on the basis of local
experiments only, i.e., experiments in which the
parameter vector is held constant and only the
control input is excited.

Modern control theory provides methods and 
tools to deal with design problems in which 
stability and performance have to be guaranteed 
also in the presence of model uncertainty, both 
for regulation around a specified operating point 
and for gain scheduled control system design. 
Therefore, modeling for control system synthesis 
should also provide methods to account for model 
uncertainty (both parametric and nonparametric) 
in the considered model class. 


Most of the existing control design literature
assumes that the plant model is given in the form
of a linear fractional transformation (LFT) (see,
e.g., Skogestad and Postlethwaite (2007) for an
introduction to LFT modeling of uncertainty and
Hecker et al. (2005) for a discussion of algorithms
and software tools). LFT models consist
of a feedback interconnection between a nominal
LTI plant and a (usually norm-bounded) operator
which represents model uncertainties, e.g., poorly
known or time-varying parameters, nonlinearities,
etc. A generic LFT interconnection of this kind
is depicted in Fig. 1, where the nominal plant
is denoted with P and the uncertainty block is
denoted with $\Delta$. The LFT formalism can also be
used to provide a structured representation for the
state-space form of LPV models, as depicted in
Fig. 2, where the block $\Delta(\alpha)$ takes into account
the presence of the uncertain parameter vector
$\alpha$, while the block $\Delta(p)$ models the effect of
the varying operating point, parameterized by the
vector of time-varying parameters p. Therefore,
LFT models can be used for the design of robust
and gain scheduling controllers; in addition they
can also serve as a basis for structured model
identification techniques, where the uncertain
parameters that appear in the feedback blocks are
estimated based on input/output data sequences.
The process of extracting uncertain/scheduling
parameters from the design model of the system
to be controlled is a highly complex one, in which
symbolic techniques play a very important role.
Tools already exist to perform this task (see, e.g.,
Hecker et al. 2005), while a recent overview of
the state of the art in this research area can be
found in Hecker and Varga (2006).

Model Building for Control System Synthesis, Fig. 1
Block diagram of the typical LFT interconnection adopted
in the robust control framework

Model Building for Control System Synthesis, Fig. 2
Block diagram of the typical LFT interconnection adopted
in the robust LPV control framework

Finally, it is important to point out that there is
a vast body of advanced control techniques which
are based on discrete-time models:

\[
\begin{aligned}
x(k+1) &= f(x(k), u(k), p, k) \\
y(k) &= g(x(k), u(k), p, k)
\end{aligned} \tag{8}
\]

where the integer time step k usually corresponds
to multiples of a sampling period $T_s$. Many
techniques are available to transform (2) into (8).
Furthermore, LTI, LTV, and LPV models can
be formulated in discrete time rather than in
continuous time.
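
As one possible illustration of the transformation from (2) to (8), the sketch below builds a one-step map by integrating f over one sampling period with a classical fourth-order Runge-Kutta step, holding the input constant over the period. The choice of RK4, the helper name `discretize_rk4`, and the scalar test plant are assumptions made for this example only; many other discretization techniques exist.

```python
import numpy as np

def discretize_rk4(f, Ts):
    """Return a map (x(k), u(k)) -> x(k+1) for dx/dt = f(x, u), u held constant over Ts."""
    def step(x, u):
        k1 = f(x, u)
        k2 = f(x + 0.5 * Ts * k1, u)
        k3 = f(x + 0.5 * Ts * k2, u)
        k4 = f(x + Ts * k3, u)
        return x + (Ts / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return step

# Hypothetical scalar plant dx/dt = -x + u, sampled with Ts = 0.1.
f = lambda x, u: -x + u
step = discretize_rk4(f, Ts=0.1)

x = np.array([1.0])
for k in range(10):          # simulate 10 samples (1 time unit) with zero input
    x = step(x, np.array([0.0]))
```

After ten steps the state should be close to the exact value exp(-1) of the continuous-time solution, which gives a quick sanity check of the discrete-time model.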

Building Models for Control System 
Synthesis 

The development of control-oriented models of 
physical systems is a complex task, which in 
general implies a careful combination of prior 
knowledge about the physics of the system under 
study with information coming from experimental
data. In particular, this process can follow
very different paths depending on the type of 
information which is available on the plant to be 
controlled. Such paths are typically classified in 
the literature as follows (see, e.g., Ljung (2008) 
for a more detailed discussion). 

White box modeling refers to the development 
of control-oriented models on the basis of first 
principles only. In this framework, one uses the 
available information on the plant to develop 
a detailed model using OOM or EOOL tools 
and subsequently works out a compact control- 
oriented model from it. If the adopted tool 
only supports simulation, then one can run 
simulations of the plant model, subject to suitably 
chosen excitation inputs (ranging from steps to 
persistently exciting input sequences such as,
e.g., pseudorandom binary sequences and sine
sweeps) and then reconstruct the dynamics by
means of system identification methods. Note
that in this way the structure/order selection stage 
of the system identification process provides 
effective means to manage the complexity versus 
accuracy tradeoff in the derivation of the compact 
model. A more direct approach, presently 
supported by many tools, is to directly compute 
the A, B, C, D matrices of the linearized system 
around specified equilibrium (trim) points, 
using symbolic and/or numerical linearization 
techniques. The result is usually a high-order 
linear system, which then can (sometimes 
must) be reduced to a low-order system by 
using model order reduction techniques (such 
as, e.g., balanced truncation). Model reduction
techniques (see Antoulas (2009) for an in-depth
treatment of this topic) make it possible to
automatically obtain approximated compact models
such as (3), starting from much more detailed simulation
models, by formulating specific approximation
bounds in control-relevant terms (e.g., percentage
errors of steady-state output values, norm-bounded
additive or multiplicative errors of
weighted transfer functions, or $L_2$-norm errors
of output transients in response to specified input
signals).

Black box modeling, on the other hand, corresponds
to situations in which the modeling
activity is entirely based on input-output data
collected on the plant (which therefore must be
already available), possibly in dedicated, suitably
designed, experiments (see Ljung 1999). Regardless
of the type of model to be built (i.e., linear or
nonlinear, time invariant or time varying, discrete 
time or continuous time), the black box approach 
consists of a number of well-defined steps. First 
of all the structure of the model to be identified 
must be defined: in the linear time-invariant case, 
this corresponds to the choice of the number of 
poles and zeros for an input-output model or 
to the choice of model order for a state-space 
representation; in the nonlinear case structure 
selection is a much more involved process in 
view of the much larger number of degrees of 
freedom which are potentially involved. Once 
a model structure has been defined, a suitable
cost function to measure the model performance
must be selected (e.g., time-domain simulation or
prediction error, frequency domain model fitting,
etc.) and the experiments to collect identification
and validation data must be designed. Finally,
the uncertain model parameters must be estimated
from the available identification dataset
and the model must be validated on the validation
dataset.

Grey box modeling (in various shades) corresponds
to the many possible intermediate cases
which can occur in practice, ranging from the 
white box approach to the black box one. As 
recently discussed in Ljung (2008), the critical 
issue in the development of an effective approach 
to control-oriented grey box modeling lies in 
the integration of existing methods and tools for 
physical systems modeling and simulation with 
methods and tools for parameter estimation. Such 
integration can take place in a number of different 
ways depending on the relative role of data and 
priors on the physics of the system in the specific 
application. A typical situation which occurs frequently
in applications is when a white box model
(developed by means of OOM or EOOL tools) 
contains parameters having unknown or uncertain 
numerical values (such as, e.g., damping factors 
in structural models, aerodynamic coefficients 
in aircraft models and so on). Then, one may 
rely on input-output data collected in dedicated 
experiments on the real system to refine the 
white box model by estimating the parameters 
using the information provided by the data. This 
process is typically dependent on the specific 
application domain as the type of experiment, 
the number of measurements, and the estimation 
technique must meet application-specific constraints
(see, e.g., Klein and Morelli (2006) for
an overview of grey box modeling in aerospace 
applications). 


Summary and Future Directions 

In this article the problem of model building
for control system synthesis has been considered.
An overview of the different uses
of mathematical models in control system
design has been provided and the process 
of building compact control-oriented models 
starting from prior knowledge about the system 
and/or experimental data has been discussed. 
Present-day modeling and simulation tools 
support advanced control system design in a 
much more direct way. In particular, while 
methods and tools for the individual steps in the 
modeling process (such as OOM, linearization 
and model reduction, parameter estimation) are 
available, an integrated environment enabling 
the pursuit of all the abovementioned paths 
to the development of compact control- 
oriented models is still a subject for future 
development. The availability of such a tool 
might further promote the application of 
advanced, model-based techniques that are 
currently limited by the model development 
process. 
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Abstract 

Model order reduction (MOR) is here understood 
as a computational technique to reduce the order 
of a dynamical system described by a set of ordinary
or differential-algebraic equations (ODEs
or DAEs) to facilitate or enable its simulation, 
the design of a controller, or optimization and 
design of the physical system modeled. It focuses 
on representing the map from inputs into the 
system to its outputs, while its dynamics are 
treated as a black box so that the large-scale 
set of describing ODEs/DAEs can be replaced 
by a much smaller set of ODEs/DAEs without 
sacrificing the accuracy of the input-to-output 
behavior. 


Keywords 

Balanced truncation; Interpolation-based methods;
Reduced-order models; SLICOT;
Truncation-based methods


Problem Description 

This survey is concerned with linear time-invariant
(LTI) systems in state-space form

\[
\begin{aligned}
E\dot{x}(t) &= Ax(t) + Bu(t), \\
y(t) &= Cx(t) + Du(t),
\end{aligned} \tag{1}
\]

where $E, A \in \mathbb{R}^{n \times n}$ are the system matrices,
$B \in \mathbb{R}^{n \times m}$ is the input matrix, $C \in \mathbb{R}^{p \times n}$ is the
output matrix, and $D \in \mathbb{R}^{p \times m}$ is the feedthrough
(or input-output) matrix. The size n of the matrix
A is often referred to as the order of the LTI
system. It mainly determines the amount of time
needed to simulate the LTI system.

Such LTI systems often arise from finite element
modeling using commercial software such
as ANSYS or NASTRAN, which results in a
second-order differential equation of the form

\[
\begin{aligned}
M\ddot{x}(t) + D\dot{x}(t) + Kx(t) &= Fu(t), \\
y(t) &= C_p x(t) + C_v \dot{x}(t),
\end{aligned}
\]

where the mass matrix M, the stiffness matrix K,
and the damping matrix D are square matrices in
$\mathbb{R}^{s \times s}$, $F \in \mathbb{R}^{s \times m}$, $C_p, C_v \in \mathbb{R}^{q \times s}$, $x(t) \in \mathbb{R}^{s}$,
$u(t) \in \mathbb{R}^{m}$, $y(t) \in \mathbb{R}^{q}$. Such second-order
differential equations are typically transformed to a
mathematically equivalent first-order differential
equation

\[
E\dot{z}(t) = Az(t) + Bu(t), \qquad y(t) = Cz(t),
\]

where $E, A \in \mathbb{R}^{2s \times 2s}$, $B \in \mathbb{R}^{2s \times m}$, $C \in \mathbb{R}^{q \times 2s}$,
$z(t) \in \mathbb{R}^{2s}$, $u(t) \in \mathbb{R}^{m}$, $y(t) \in \mathbb{R}^{q}$. Various
other linearizations have been proposed in the 
literature. 
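
For illustration, one common first-order form (an assumption made here for concreteness; it is only one of the possible linearizations) stacks position and velocity into $z(t) = [x(t)^T, \dot{x}(t)^T]^T$:

```latex
\[
\underbrace{\begin{bmatrix} I & 0 \\ 0 & M \end{bmatrix}}_{E}\dot{z}(t)
= \underbrace{\begin{bmatrix} 0 & I \\ -K & -D \end{bmatrix}}_{A} z(t)
+ \underbrace{\begin{bmatrix} 0 \\ F \end{bmatrix}}_{B} u(t),
\qquad
y(t) = \underbrace{\begin{bmatrix} C_p & C_v \end{bmatrix}}_{C} z(t).
\]
```

The first block row states the identity $\dot{x} = \dot{x}$, and the second block row reproduces $M\ddot{x} = -Kx - D\dot{x} + Fu$, so the two descriptions have the same solutions.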

The matrix E may be singular. In that case
the first equation in (1) defines a system of
differential-algebraic equations (DAEs); otherwise
it is a system of ordinary differential equations
(ODEs). For example, for
$E = \begin{bmatrix} J & 0 \\ 0 & 0 \end{bmatrix}$ with
a $j \times j$ nonsingular matrix J, only the first j
equations in the left-hand side expression in (1)
form ordinary differential equations, while the
last $n - j$ equations form homogeneous linear
equations. If further
$A = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix}$ and
$B = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix}$
with the $j \times j$ matrix $A_{11}$, the $j \times m$
matrix $B_1$, and a nonsingular matrix $A_{22}$, this is
easily seen: partitioning the state vector
$x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}$
with $x_1(t)$ of length j, the DAE
$E\dot{x}(t) = Ax(t) + Bu(t)$ splits into the algebraic equation
$0 = A_{22}x_2(t) + B_2u(t)$ and the ODE

\[
J\dot{x}_1(t) = A_{11}x_1(t) + \left(B_1 - A_{12}A_{22}^{-1}B_2\right)u(t).
\]

To simplify the description, only continuous-time
systems are considered here. The discrete-time
case can be treated mostly analogously; see,
e.g., Antoulas (2005).

An alternative way to represent LTI systems is
provided by the transfer function matrix (TFM),
a matrix-valued function whose elements are rational
functions. Assuming $x(0) = 0$ and taking
Laplace transforms in (1) yields $sEX(s) =
AX(s) + BU(s)$, $Y(s) = CX(s) + DU(s)$, where
$X(s)$, $Y(s)$, and $U(s)$ are the Laplace transforms
of the time signals $x(t)$, $y(t)$, and $u(t)$, respectively.
The map from inputs U to outputs Y is
then described by $Y(s) = G(s)U(s)$ with the
TFM

\[
G(s) = C(sE - A)^{-1}B + D, \qquad s \in \mathbb{C}. \tag{2}
\]
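
As a small illustration, the TFM can be evaluated pointwise by solving a linear system with $sE - A$ rather than forming the inverse. The function name `tfm` and the two-state example below are hypothetical and chosen so that the result can be checked by hand: here $G(s) = 1/(s+1) + 1/(s+2)$.

```python
import numpy as np

def tfm(E, A, B, C, D, s):
    """Evaluate G(s) = C (sE - A)^{-1} B + D via a linear solve (no explicit inverse)."""
    return C @ np.linalg.solve(s * E - A, B) + D

# Hypothetical example system with poles at -1 and -2.
E = np.eye(2)
A = np.array([[-1.0, 0.0], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 1.0]])
D = np.zeros((1, 1))

G0 = tfm(E, A, B, C, D, 0.0)   # DC gain, G(0) = 1 + 1/2
Gj = tfm(E, A, B, C, D, 1j)    # frequency response at 1 rad/s
```

The solve requires $sE - A$ to be nonsingular, i.e., s must not be a generalized eigenvalue of the pair (A, E).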

The aim of model order reduction is to find an
LTI system

\[
\hat{E}\dot{\hat{x}}(t) = \hat{A}\hat{x}(t) + \hat{B}u(t), \qquad
\hat{y}(t) = \hat{C}\hat{x}(t) + \hat{D}u(t) \tag{3}
\]

of reduced order $r \ll n$ such that the corresponding
TFM

\[
\hat{G}(s) = \hat{C}(s\hat{E} - \hat{A})^{-1}\hat{B} + \hat{D} \tag{4}
\]

approximates the original TFM (2). That is,
using the same input $u(t)$ in (1) and (3), we
want the output $\hat{y}(t)$ of the reduced-order
model (ROM) (3) to approximate the output $y(t)$
of (1) well enough for the application considered
(e.g., controller design). In general, one requires
$\|y(t) - \hat{y}(t)\| \le \varepsilon$ for all feasible inputs $u(t)$,
for (almost) all t in the time domain of interest,
and for a suitable norm $\|\cdot\|$. In control theory
one often employs the $\mathcal{L}_2$- or $\mathcal{L}_\infty$-norms on $\mathbb{R}$
or $[0, \infty)$, respectively, to measure time signals
or their Laplace transforms. In the situations
considered here, the $\mathcal{L}_2$-norms employed in
frequency and time domain coincide due to the
Paley-Wiener theorem (or Parseval's equation
or the Plancherel theorem, respectively); see
Antoulas (2005) and Zhou et al. (1996) for
details. As $Y(s) - \hat{Y}(s) = (G(s) - \hat{G}(s))U(s)$,
one can therefore consider the approximation
error of the TFM $\|G(\cdot) - \hat{G}(\cdot)\|$ measured in an
induced norm instead of the error in the output
$\|y(\cdot) - \hat{y}(\cdot)\|$.

Depending on the choice of the norm, different
MOR goals can be formulated. Typical choices
are (see, e.g., Antoulas (2005) for a more thorough
discussion)

• $\|G(\cdot) - \hat{G}(\cdot)\|_{\mathcal{H}_\infty}$, where

\[
\|F(\cdot)\|_{\mathcal{H}_\infty} = \sup_{s \in \mathbb{C}^+} \sigma_{\max}(F(s)).
\]

Here, $\sigma_{\max}$ is the largest singular value of
the matrix $F(s)$. This minimizes the maximal
magnitude of the frequency response of the
error system and by the Paley-Wiener theorem
bounds the $\mathcal{L}_2$-norm of the output error.

• $\|G(\cdot) - \hat{G}(\cdot)\|_{\mathcal{H}_2}$, where (with $\mathrm{i} = \sqrt{-1}$)

\[
\|F(\cdot)\|_{\mathcal{H}_2}^2 = \frac{1}{2\pi}\int_{-\infty}^{+\infty}
\operatorname{tr}\left(F(\mathrm{i}\omega)^* F(\mathrm{i}\omega)\right) d\omega.
\]

This ensures a small error
$\|y(\cdot) - \hat{y}(\cdot)\|_{\mathcal{L}_\infty(0,\infty)} = \sup_{t>0}\|y(t) - \hat{y}(t)\|_\infty$
(with $\|\cdot\|_\infty$ denoting the maximum norm of a vector)
uniformly over all inputs $u(t)$ having bounded
$\mathcal{L}_2$-energy, that is, $\int_0^\infty u(t)^T u(t)\,dt \le 1$; see
Gugercin et al. (2008).

Besides a small approximation error, one may 
impose additional constraints for the ROM. One 
might require certain properties (such as stability
and passivity) of the original systems to
be preserved. Rather than considering the full 
nonnegative real line in time domain or the full 
imaginary axis in frequency domain, one can 
also consider bounded intervals in both domains. 
For these variants, see, e.g., Antoulas (2005) and 
Obinata and Anderson (2001). 


Methods

There are a number of different methods to construct
ROMs, see, e.g., Antoulas (2005), Benner
et al. (2005), Obinata and Anderson (2001),
and Schilders et al. (2008). Here we concentrate
on projection-based methods which restrict
the full state $x(t)$ to an r-dimensional subspace
by choosing $\hat{x}(t) = W^*x(t)$, where W is an
$n \times r$ matrix. Here the conjugate transpose of
a complex-valued matrix Z is denoted by $Z^*$,
while the transpose of a matrix Y will be denoted
by $Y^T$. Choosing $V \in \mathbb{C}^{n \times r}$ such that $W^*V =
I \in \mathbb{R}^{r \times r}$ yields an $n \times n$ projection matrix $\Pi =
VW^*$ which projects onto the r-dimensional subspace
spanned by the columns of V along the
kernel of $W^*$. Applying this projection to (1), one
obtains the reduced-order LTI system (3) with

\[
\hat{E} = W^*EV, \quad \hat{A} = W^*AV, \quad \hat{B} = W^*B, \quad \hat{C} = CV \tag{5}
\]

and an unchanged $\hat{D} = D$. If $V = W$, $\Pi$ is
an orthogonal projector and is called a Galerkin
projection. If $V \neq W$, $\Pi$ is an oblique projector,
sometimes called a Petrov-Galerkin projection.

In the following, we will briefly discuss the
main classes of methods to construct suitable
matrices V and W: truncation-based methods and
interpolation-based methods. Other methods, in
particular combinations of the two classes discussed
here, can be found in the literature. In case
the original LTI system is real, it is often desirable
to construct a real reduced-order model. All of
the methods discussed in the following either do
construct a real reduced-order system or there is
a variant of the method which does. In order to
keep this exposition at a reasonable length, the
reader is referred to the cited literature.

Truncation Based Methods

The general idea of truncation is most easily
explained by modal truncation. For simplicity,
assume that $E = I$ and that A is diagonalizable,
$T^{-1}AT = D_A = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$. Further we
assume that the eigenvalues $\lambda_i \in \mathbb{C}$ of A can be
ordered such that

\[
\operatorname{Re}(\lambda_n) \le \operatorname{Re}(\lambda_{n-1}) \le \ldots \le \operatorname{Re}(\lambda_1) < 0 \tag{6}
\]


(i.e., all eigenvalues lie in the open left half
complex plane). This implies that the system is
stable. Let V be the $n \times r$ matrix consisting of
the first r columns of T and let $W^*$ be the first r
rows of $T^{-1}$, that is, $W = V(V^*V)^{-1}$. Applying
the transformation T to the LTI system (1) yields

\[
T^{-1}\dot{x}(t) = (T^{-1}AT)\,T^{-1}x(t) + (T^{-1}B)\,u(t) \tag{7}
\]
\[
y(t) = (CT)\,T^{-1}x(t) + Du(t) \tag{8}
\]

with

\[
T^{-1}AT = \begin{bmatrix} W^*AV & 0 \\ 0 & A_2 \end{bmatrix}, \qquad
T^{-1}B = \begin{bmatrix} W^*B \\ B_2 \end{bmatrix}
\]

and $CT = [\,CV \;\; C_2\,]$, where $W^*AV =
\operatorname{diag}(\lambda_1, \ldots, \lambda_r)$ and $A_2 = \operatorname{diag}(\lambda_{r+1}, \ldots, \lambda_n)$.
Preserving the r dominant poles (eigenvalues
with largest real part) by truncating the rest (i.e.,
discarding $A_2$, $B_2$, and $C_2$ from (7)) yields the
ROM as in (5). It can be shown that the error
bound

\[
\|G(\cdot) - \hat{G}(\cdot)\|_{\mathcal{H}_\infty} \le \|C_2\|\,\|B_2\|\,\frac{1}{|\operatorname{Re}(\lambda_{r+1})|}
\]

holds (Benner 2006). As eigenvalues contain only 
limited information about the system, this is not 
necessarily a meaningful reduced-order system. 
In particular, the dependence of the input-output 
relation on B and C is completely ignored. 
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This can be enhanced by more refined dominance 
measures taking B and C into account; see, e.g., 
Varga (1995) and Benner et al. (2011). 
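
The modal truncation procedure described above can be sketched in a few lines. This is a minimal illustration under the stated assumptions ($E = I$, A diagonalizable and stable); the function name `modal_truncation` and the diagonal test system are hypothetical. Complex eigenvalue pairs would yield a complex ROM here, so the example uses real eigenvalues only.

```python
import numpy as np

def modal_truncation(A, B, C, D, r):
    """Keep the r eigenvalues of A with largest real part (assumes E = I, A diagonalizable)."""
    lam, T = np.linalg.eig(A)
    order = np.argsort(-lam.real)       # sort by descending real part, as in (6)
    T = T[:, order]
    Tinv = np.linalg.inv(T)
    V = T[:, :r]                        # first r columns of T
    Wh = Tinv[:r, :]                    # W^*: first r rows of T^{-1}
    return Wh @ A @ V, Wh @ B, C @ V, D

# Hypothetical example: three real poles, the fastest one (-100) is truncated.
A = np.diag([-1.0, -10.0, -100.0])
B = np.ones((3, 1))
C = np.ones((1, 3))
D = np.zeros((1, 1))
Ar, Br, Cr, Dr = modal_truncation(A, B, C, D, r=2)

# DC gains of the full and reduced models (G(0) = -C A^{-1} B + D).
G0 = C @ np.linalg.solve(-A, B) + D
Gr0 = Cr @ np.linalg.solve(-Ar, Br) + Dr
```

For this example the error system is $1/(s+100)$, so the DC error of 0.01 coincides with the bound $\|C_2\|\|B_2\|/|\operatorname{Re}(\lambda_3)|$.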

More suitable reduced-order systems can be
obtained by balanced truncation. To introduce
this concept, we no longer need to assume A to
be diagonalizable, but we require the stability
of A in the sense of (6). For simplicity, we
assume $E = I$. For treatment of the DAE case
($E \neq I$), see Benner et al. (2005, Chap. 3).
Loosely speaking, a balanced representation
of an LTI system is obtained by a change of
coordinates such that the states which are hard
to reach are at the same time those which are
difficult to observe. This change of coordinates
amounts to an equivalence transformation of the
realization (A, B, C, D) of (1) called state-space
transformation as in (7), where T now is the matrix
representing the change of coordinates. The
new system matrices $(T^{-1}AT, T^{-1}B, CT, D)$
form a balanced realization of (1). Truncating in
this balanced realization the states that are hard
to reach and difficult to observe results in a ROM.

Consider the Lyapunov equations

\[
AP + PA^T + BB^T = 0, \qquad A^TQ + QA + C^TC = 0. \tag{9}
\]

The solution matrices P and Q are called controllability
and observability Gramians, respectively.
If both Gramians are positive definite, the
LTI system is minimal. This will be assumed
from here on in this section.

In balanced coordinates the Gramians P
and Q of a stable minimal LTI system satisfy
$P = Q = \operatorname{diag}(\sigma_1, \ldots, \sigma_n)$ with the Hankel
singular values $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_n > 0$. The
Hankel singular values are the positive square
roots of the eigenvalues of the product of the
Gramians PQ, $\sigma_k = \sqrt{\lambda_k(PQ)}$. They are
system invariants, i.e., they are independent of
the chosen realization of (1) as they are preserved
under state-space transformations.
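
To make the definitions concrete, the following sketch computes the Gramians from (9) and the Hankel singular values for a small example. The Lyapunov equations are solved by a dense Kronecker-product formulation, which is an assumption of this illustration and only practical for very small n; the helper names `lyap` and `hankel_singular_values` are hypothetical.

```python
import numpy as np

def lyap(A, rhs):
    """Solve A X + X A^T + rhs = 0 via Kronecker products (dense, small n only)."""
    n = A.shape[0]
    I = np.eye(n)
    K = np.kron(I, A) + np.kron(A, I)   # acts on the column-major vec(X)
    x = np.linalg.solve(K, -rhs.reshape(-1, order="F"))
    return x.reshape((n, n), order="F")

def hankel_singular_values(A, B, C):
    P = lyap(A, B @ B.T)        # controllability Gramian: A P + P A^T + B B^T = 0
    Q = lyap(A.T, C.T @ C)      # observability Gramian:  A^T Q + Q A + C^T C = 0
    ev = np.linalg.eigvals(P @ Q)
    return np.sort(np.sqrt(np.abs(ev.real)))[::-1]

# Hypothetical two-state example (stable and minimal).
A = np.array([[-1.0, 0.0], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 1.0]])

P = lyap(A, B @ B.T)
hsv = hankel_singular_values(A, B, C)

# Invariance check: a state-space transformation leaves the values unchanged.
T = np.array([[2.0, 1.0], [0.0, 1.0]])
Ti = np.linalg.inv(T)
hsv_transformed = hankel_singular_values(Ti @ A @ T, Ti @ B, C @ T)
```

In this symmetric example the two Gramians coincide, so the Hankel singular values equal the eigenvalues of P.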

Given the LTI system (1) in a non-balanced 
coordinate form and the Gramians P and Q 
satisfying (9), the transformation matrix T which 
yields an LTI system in balanced coordinates 
can be computed via the so-called square root 
algorithm as follows: 


• Compute the Cholesky factors S and R of the
Gramians such that $P = S^TS$, $Q = R^TR$.

• Compute the singular value decomposition of
$SR^T = \Phi\Sigma\Gamma^T$, where $\Phi$ and $\Gamma$ are orthogonal
matrices and $\Sigma$ is a diagonal matrix with
the Hankel singular values on its diagonal.
Then $T = S^T\Phi\Sigma^{-1/2}$ yields the balancing
transformation (note that
$T^{-1} = \Sigma^{1/2}\Phi^TS^{-T} = \Sigma^{-1/2}\Gamma^TR$).

• Partition $\Phi$, $\Sigma$, $\Gamma$ into blocks of corresponding
sizes,

\[
\Phi = [\,\Phi_1 \;\; \Phi_2\,], \qquad
\Sigma = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix}, \qquad
\Gamma = [\,\Gamma_1 \;\; \Gamma_2\,],
\]

with $\Sigma_1 = \operatorname{diag}(\sigma_1, \ldots, \sigma_r)$ and apply T
to (1) to obtain (7) with

\[
T^{-1}AT = \begin{bmatrix} W^TAV & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \qquad
T^{-1}B = \begin{bmatrix} W^TB \\ B_2 \end{bmatrix}, \tag{10}
\]

and $CT = [\,CV \;\; C_2\,]$ for $W = R^T\Gamma_1\Sigma_1^{-1/2}$ and
$V = S^T\Phi_1\Sigma_1^{-1/2}$. Preserving the r dominant
Hankel singular values by truncating the rest
yields the reduced-order model as in (5).

As $W^TV = I$, balanced truncation is an oblique
projection method. The reduced-order model is
stable with the Hankel singular values $\sigma_1, \ldots, \sigma_r$.
It can be shown that if $\sigma_r > \sigma_{r+1}$, the error bound

\[
\|G(\cdot) - \hat{G}(\cdot)\|_{\mathcal{H}_\infty} \le 2\sum_{k=r+1}^{n} \sigma_k \tag{11}
\]

holds. Given an error tolerance, this allows one to
choose the appropriate order r of the reduced
system in the course of the computations.

As the explicit computation of the balancing
transformation T is numerically hazardous, one
usually uses the equivalent balancing-free square
root algorithm (Varga 1991) in which orthogonal
bases for the column spaces of V and W are
computed. The ROM obtained in this way is no longer
balanced, but preserves all other properties (error
bound, stability). Furthermore, it is shown
in Benner et al. (2000) how to implement the
balancing-free square root algorithm using low-rank
approximations to S and R without ever
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having to resort to the square solution matrices 
P and Q of the Lyapunov equations (9). This 
yields an efficient algorithm for balanced trunca¬ 
tion for LTI systems with large dense matrices. 
For systems with large-scale sparse A efficient 
algorithms based on sparse solvers for (9) exist; 
see Benner (2006). 
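
The basic square root algorithm described above can be sketched as follows. This is a small dense illustration only: it computes V and W explicitly (not the balancing-free or low-rank variants), and it solves the Lyapunov equations (9) with a Kronecker-product formulation, which is an assumption made to keep the example self-contained. The names `lyap` and `balanced_truncation` and the four-state test system are hypothetical.

```python
import numpy as np

def lyap(A, rhs):
    """Solve A X + X A^T + rhs = 0 via Kronecker products (dense, small n only)."""
    n = A.shape[0]
    K = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    X = np.linalg.solve(K, -rhs.reshape(-1, order="F")).reshape((n, n), order="F")
    return 0.5 * (X + X.T)                         # symmetrize against rounding

def balanced_truncation(A, B, C, D, r):
    P = lyap(A, B @ B.T)                           # controllability Gramian
    Q = lyap(A.T, C.T @ C)                         # observability Gramian
    S = np.linalg.cholesky(P).T                    # P = S^T S
    R = np.linalg.cholesky(Q).T                    # Q = R^T R
    Phi, sigma, GammaT = np.linalg.svd(S @ R.T)    # S R^T = Phi diag(sigma) Gamma^T
    scale = np.diag(sigma[:r] ** -0.5)             # Sigma_1^{-1/2}
    V = S.T @ Phi[:, :r] @ scale
    W = R.T @ GammaT[:r, :].T @ scale
    return W.T @ A @ V, W.T @ B, C @ V, D, sigma

# Hypothetical stable, minimal test system of order 4, reduced to order 2.
A = np.diag([-1.0, -2.0, -3.0, -4.0])
B = np.ones((4, 1))
C = np.ones((1, 4))
D = np.zeros((1, 1))
r = 2
Ar, Br, Cr, Dr, sigma = balanced_truncation(A, B, C, D, r)

# Sample the error system on a frequency grid and compare with the bound (11).
omega = np.logspace(-3, 3, 400)
err = max(
    abs((C @ np.linalg.solve(1j * w * np.eye(4) - A, B)
         - Cr @ np.linalg.solve(1j * w * np.eye(r) - Ar, Br))[0, 0])
    for w in omega
)
bound = 2.0 * sigma[r:].sum()
```

Sampling on a grid only gives a lower estimate of the $\mathcal{H}_\infty$ error, so the comparison with `bound` is a plausibility check rather than a norm computation.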

By replacing the solution matrices P and Q 
of (9) by other pairs of positive (semi-)definite 
matrices characterizing alternative controllability 
and observability related system information, 
one obtains a family of model reduction methods 
including stochastic/bounded-real/positive-real 
balanced truncation. These can be used if further 
properties like minimum phase, passivity, etc. are 
to be preserved in the reduced-order model; for 
further details, see Antoulas (2005) and Obinata 
and Anderson (2001). 

Balanced truncation yields a good approximation
at high frequencies as $\hat{G}(\mathrm{i}\omega) \to G(\mathrm{i}\omega)$
for $\omega \to \infty$ (as $\hat{D} = D$), while the maximum
error is often attained for $\omega = 0$. For a perfect
match at zero and a good approximation for low
frequencies, one may employ the singular perturbation
approximation (SPA, also called balanced
residualization). In view of (7) and (10), balanced
truncation can be seen as partitioning $T^{-1}x$
according to (10) into $[x_1^T, x_2^T]^T$ and setting $x_2 = 0$
(i.e., $\dot{x}_2 = 0$ as well). For SPA, one only sets
$\dot{x}_2 = 0$, such that

\[
\begin{aligned}
\dot{x}_1 &= W^TAVx_1 + A_{12}x_2 + W^TBu, \\
0 &= A_{21}x_1 + A_{22}x_2 + B_2u.
\end{aligned}
\]

Solving the second equation for $x_2$ and inserting
it into the first equation yields

\[
\dot{x}_1 = \left(W^TAV - A_{12}A_{22}^{-1}A_{21}\right)x_1
+ \left(W^TB - A_{12}A_{22}^{-1}B_2\right)u.
\]

For the output equation, it follows

\[
\hat{y} = \left(CV - C_2A_{22}^{-1}A_{21}\right)x_1 + \left(D - C_2A_{22}^{-1}B_2\right)u.
\]

This reduced-order model makes use of the
information in the matrices $A_{12}$, $A_{21}$, $A_{22}$, $B_2$, and
$C_2$ discarded by balanced truncation. It fulfills
$\hat{G}(0) = G(0)$ and the error bound (11); moreover,
it preserves stability.
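
The residualization formulas above translate directly into code. The sketch below assumes the realization passed in is already partitioned as in (10) (ideally a balanced realization); the function name `residualize` and the test matrices are hypothetical. Since the DC-gain match $\hat{G}(0) = G(0)$ follows from the algebra alone, it can be checked even on an unbalanced realization, as done here; the error bound (11) and stability preservation, however, rely on balancing.

```python
import numpy as np

def residualize(A, B, C, D, r):
    """Singular perturbation approximation: set dx2/dt = 0 and eliminate x2."""
    A11, A12 = A[:r, :r], A[:r, r:]
    A21, A22 = A[r:, :r], A[r:, r:]
    B1, B2 = B[:r], B[r:]
    C1, C2 = C[:, :r], C[:, r:]
    X = np.linalg.solve(A22, np.hstack([A21, B2]))   # A22^{-1} [A21  B2]
    Ar = A11 - A12 @ X[:, :r]
    Br = B1 - A12 @ X[:, r:]
    Cr = C1 - C2 @ X[:, :r]
    Dr = D - C2 @ X[:, r:]
    return Ar, Br, Cr, Dr

# Hypothetical third-order example, reduced to order 2.
A = np.array([[-1.0, 0.5, 0.1],
              [0.2, -2.0, 0.3],
              [0.1, 0.4, -5.0]])
B = np.array([[1.0], [0.0], [1.0]])
C = np.array([[1.0, 1.0, 0.0]])
D = np.zeros((1, 1))
Ar, Br, Cr, Dr = residualize(A, B, C, D, r=2)

# DC gains G(0) = -C A^{-1} B + D of the full and reduced models.
G0 = -C @ np.linalg.solve(A, B) + D
Gr0 = -Cr @ np.linalg.solve(Ar, Br) + Dr
```

Note that, unlike truncation, the reduced feedthrough term `Dr` generally differs from D.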

Besides SPA, another related truncation 
method that is not based on projection is 
optimal Hankel norm approximation (HNA). The 
description of HNA is technically quite involved; 
for details, see Zhou et al. (1996) and Glover 
(1984). It should be noted that the so obtained 
ROM usually has similar stability and accuracy 
properties as for balanced truncation. 

Interpolation-Based Methods 

Another family of methods for MOR is based
on (rational) interpolation. The unifying feature
of the methods in this family is that the original
TFM (2) is approximated by a rational matrix
function of lower degree satisfying some
interpolation conditions (i.e., the original and
the reduced-order TFM coincide, e.g., $G(s_0) =
\hat{G}(s_0)$ at some predefined value $s_0$ for which
$A - s_0E$ is nonsingular). Computationally this
is usually realized by certain Krylov subspace
methods.

The classical approach is known under 
the name of moment-matching or Pade(-type) 
approximation. In these methods, the transfer 
functions of the original and the reduced order 
systems are expanded into power series, and 
the reduced-order system is then determined so 
that the first coefficients in the series expansions 
match. In this context, the coefficients of the 
power series are called moments, which explains 
the term moment matching. 

Classically the expansion of the TFM (2) in a 
power series about an expansion point So 

00 

G(s) = Mj(s 0 )(s - s 0 y (12) 
j= 0 

is used. The moments Mj (so), j = 0,1,2,..., 
are given by 

Mj(s 0 ) = -C [(A-soEr'Ey'iA-soEr'B. 

Consider the (block) Krylov subspace JCk (F, H ) 
= span {H,FH,F 2 H,...,F k ~ l H} for F = 
(.A - s 0 E)~ l E and H = -(A - s 0 E)~ l B with 
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an appropriately chosen expansion point So which 
may be real or complex. From the definitions 
of A,B, and E , it follows that F e IK WXW 
and H e K nxm , where IK = R or IK = C 
depending on whether sq is chosen in R or in 
C. Considering JCk(F,H) column by column, 
this leads to the observation that the number of 
column vectors in {H, FH, F 2 H,..., F k ~ l H} 
is given by r = m • k, as there are k blocks 
F j H e IK" xm ,y = 0,...,£ - 1. In the case 
when all r column vectors are linearly inde¬ 
pendent, the dimension of the Krylov subspace 
JCk(F , H) is m • k. Assume that a unitary basis 
for this block Krylov subspace is generated such 
that the column space of the resulting unitary 
matrix V e C nxr spans JCk(F, G). Applying the 
Galerkin projection TI = VV* to (1) yields a re¬ 
duced system whose TFM satisfies the following 
(Hermite) interpolation conditions at So- 

$$\hat{G}^{(j)}(s_0) = G^{(j)}(s_0), \quad j = 0, 1, \ldots, k-1.$$

That is, $G$ and $\hat{G}$ and their first $k - 1$ derivatives
coincide at $s_0$. Considering the power series
expansion (12) of the original and the reduced-order
TFM, this is equivalent to saying that at least the
first $k$ moments $\hat{M}_j(s_0)$ of the transfer function
$\hat{G}(s)$ of the reduced system (3) are equal to the
first $k$ moments $M_j(s_0)$ of the TFM $G(s)$ of the
original system (1) at the expansion point $s_0$:

$$\hat{M}_j(s_0) = M_j(s_0), \quad j = 0, 1, \ldots, k-1.$$

If further the $r$ columns of the unitary matrix $W$
span the block Krylov subspace $\mathcal{K}_k(F, H)$ for
$F = (A - s_0E)^{-T}E$ and $H = -(A - s_0E)^{-T}C^T$,
applying the Petrov-Galerkin projection $\Pi =
V(W^*V)^{-1}W^*$ to (1) yields a reduced system
whose TFM matches at least the first $2k$ moments
of the TFM of the original system.
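The difference between the one-sided (Galerkin) and the two-sided (Petrov-Galerkin) projection can be illustrated on a small example. The sketch below makes several assumptions that are not part of the text: a randomly generated real SISO system with $E = I$, the expansion point $s_0 = 0$, and hypothetical helper names (`krylov_basis`, `moments`, `project`). For the left subspace it uses the transposes of both $(A - s_0E)^{-1}$ and $E$, which makes no difference here because $E = I$. The reduced matrices are formed as $W^TAV$, $W^TB$, $CV$, $W^TEV$.

```python
import numpy as np

def krylov_basis(F_apply, h, k):
    """Orthonormal basis of span{h, F h, ..., F^{k-1} h} for one start vector.

    Each new direction is orthogonalized (Gram-Schmidt, applied twice)
    before F is applied again, so the raw vectors F^j h are never formed.
    """
    basis = [h[:, 0] / np.linalg.norm(h[:, 0])]
    for _ in range(k - 1):
        w = F_apply(basis[-1])
        for _ in range(2):
            for q in basis:
                w = w - (q @ w) * q
        basis.append(w / np.linalg.norm(w))
    return np.column_stack(basis)

def moments(A, B, C, E, s0, count):
    """Moments M_j(s0) of C (sE - A)^{-1} B (SISO), for inspection only."""
    K = A - s0 * E
    X = np.linalg.solve(K, B)
    out = []
    for _ in range(count):
        out.append((-C @ X).item())
        X = np.linalg.solve(K, E @ X)
    return np.array(out)

def project(A, B, C, E, V, W):
    """Reduced matrices (A_r, B_r, C_r, E_r); W = V is the Galerkin case."""
    return W.T @ A @ V, W.T @ B, C @ V, W.T @ E @ V

# Hypothetical test system (not from the text).
rng = np.random.default_rng(0)
n, k, s0 = 10, 3, 0.0
A = -np.diag(np.arange(1.0, n + 1)) + 0.1 * rng.standard_normal((n, n))
E = np.eye(n)
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))

K = A - s0 * E
# The sign of the start vector does not change the spanned subspace.
V = krylov_basis(lambda x: np.linalg.solve(K, E @ x), np.linalg.solve(K, B), k)
W = krylov_basis(lambda x: np.linalg.solve(K.T, E.T @ x), np.linalg.solve(K.T, C.T), k)

full = moments(A, B, C, E, s0, 2 * k)
one_sided = moments(*project(A, B, C, E, V, V), s0, 2 * k)
two_sided = moments(*project(A, B, C, E, V, W), s0, 2 * k)
print("one-sided error per moment:", np.abs(one_sided - full))
print("two-sided error per moment:", np.abs(two_sided - full))
```

In this setting the one-sided errors are expected to be at rounding level for the first $k$ moments only, while the two-sided errors should stay at rounding level for all $2k$ moments.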

Theoretically, the matrix $V$ (and $W$)
can be computed by explicitly forming the
columns which span the corresponding Krylov
subspace $\mathcal{K}_k(F, H)$ and using the Gram-Schmidt
algorithm to generate unitary basis vectors for
$\mathcal{K}_k(F, H)$. However, explicitly forming the moments (the
Krylov subspace blocks $F^jH$) is numerically
precarious and has to be avoided under all
circumstances. Instead, Krylov subspace methods
are recommended for computing an interpolation-based
ROM as described above. The unitary
basis of a (block) Krylov subspace can be
computed by employing a (block) Arnoldi or
(block) Lanczos method; see, e.g., Antoulas
(2005), Golub and Van Loan (2013), and Freund
(2003).
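To make the numerical argument concrete, the following sketch contrasts a simple block Arnoldi-type orthogonalization with the explicitly formed Krylov blocks. It is an illustration under stated assumptions, not a production implementation: the test system is randomly generated, $E = I$, $s_0 = 0$, no deflation is handled (all $m \cdot k$ Krylov columns are assumed linearly independent), and the function name `block_arnoldi` is hypothetical.

```python
import numpy as np

def block_arnoldi(F_apply, H, k):
    """Orthonormal basis for the block Krylov subspace K_k(F, H).

    Assumes no deflation occurs. F is applied to the newest orthonormal
    block only, so the powers F^j H are never formed explicitly.
    """
    Q, _ = np.linalg.qr(H)                  # orthonormalize the first block
    blocks = [Q]
    for _ in range(k - 1):
        Wb = F_apply(blocks[-1])
        for _ in range(2):                  # block Gram-Schmidt, repeated once
            for Qi in blocks:
                Wb = Wb - Qi @ (Qi.T @ Wb)
        Q, _ = np.linalg.qr(Wb)
        blocks.append(Q)
    return np.hstack(blocks)

# Hypothetical test system (not from the text).
rng = np.random.default_rng(1)
n, m, k = 60, 2, 8
A = -np.diag(np.linspace(1.0, 100.0, n)) + 0.05 * rng.standard_normal((n, n))
E = np.eye(n)
B = rng.standard_normal((n, m))
s0 = 0.0
K = A - s0 * E

def F_apply(X):
    return np.linalg.solve(K, E @ X)        # F X = (A - s0 E)^{-1} E X

H = -np.linalg.solve(K, B)
V = block_arnoldi(F_apply, H, k)

# Explicitly formed Krylov blocks, shown only for comparison.
cols, X = [], H
for _ in range(k):
    cols.append(X)
    X = F_apply(X)
explicit = np.hstack(cols)

orth_error = np.linalg.norm(V.T @ V - np.eye(m * k))
span_residual = np.linalg.norm(explicit - V @ (V.T @ explicit)) / np.linalg.norm(explicit)
print("orthogonality error of V:", orth_error)
print("condition number of explicit Krylov matrix:", np.linalg.cond(explicit))
print("explicit blocks outside span(V):", span_residual)
```

The explicit Krylov matrix is typically very ill-conditioned because its columns become nearly parallel, whereas the orthonormal basis spans the same subspace with a condition number of one.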

In the case when an oblique projection is 
to be used, it is not necessary to compute two 
unitary bases as above. An alternative is then to 
use the nonsymmetric Lanczos process (Golub 
and Van Loan 2013). It computes bi-unitary (i.e.,
$W^*V = I_r$) bases for the above-mentioned
Krylov subspaces and the reduced-order model
as a by-product of the Lanczos process. An
overview of the computational techniques for
moment matching and Padé approximation
summarizing the work of a decade is given in 
Freund (2003) and the references therein. 

In general, the discussed MOR approaches are
instances of rational interpolation. When the
expansion point is chosen to be $s_0 = \infty$,
the moments are called Markov parameters and
the approximation problem is known as partial
realization. If $s_0 = 0$, the approximation problem
is known as Padé approximation.

As the use of one single expansion point $s_0$
leads to good approximation only close to $s_0$,
it might be desirable to use more than one
expansion point. This leads to multipoint
moment-matching methods, also called rational Krylov
methods; see, e.g., Ruhe and Skoogh (1998),
Antoulas (2005), and Freund (2003).

In contrast to balanced truncation, these (rational)
interpolation methods do not necessarily preserve
stability. Remedies have been suggested;
see, e.g., Freund (2003).

The use of complex-valued expansion points
will lead to a complex-valued reduced-order
system (3). In some applications (in particular, if
the original system is real valued), this
is undesirable. In that case one can always use
complex-conjugate pairs of expansion points, as
then all computations can be done in real
arithmetic.
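A small sketch of this idea, under assumptions not made in the text (a random real SISO system, $E = I$, a one-sided Galerkin projection, and a single conjugate pair $s_0, \bar{s}_0$): the real and imaginary parts of the complex direction $(A - s_0E)^{-1}B$ are used as a real basis, so the reduced matrices stay real while the reduced TFM interpolates at both points of the pair.

```python
import numpy as np

# Hypothetical real test system (not from the text).
rng = np.random.default_rng(2)
n = 8
A = -np.diag(np.arange(1.0, n + 1)) + 0.1 * rng.standard_normal((n, n))
E = np.eye(n)
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))

def G(A, B, C, E, s):
    """Evaluate C (sE - A)^{-1} B at a (possibly complex) point s."""
    return (C @ np.linalg.solve(s * E - A, B)).item()

s0 = 0.5 + 2.0j
v = np.linalg.solve(A - s0 * E, B)      # complex direction belonging to s0

# Re(v) and Im(v) span, over the complex numbers, both v and conj(v);
# conj(v) is the direction belonging to conj(s0) because A, E, B are real.
V, _ = np.linalg.qr(np.hstack([v.real, v.imag]))

Ar, Br, Cr, Er = V.T @ A @ V, V.T @ B, C @ V, V.T @ E @ V

err_s0 = abs(G(Ar, Br, Cr, Er, s0) - G(A, B, C, E, s0))
err_conj = abs(G(Ar, Br, Cr, Er, np.conj(s0)) - G(A, B, C, E, np.conj(s0)))
print("interpolation error at s0 and conj(s0):", err_s0, err_conj)
```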

The methods just described provide good
approximation quality locally around the expansion
points. They do not aim at a global
approximation as measured by the $\mathcal{H}_2$- or $\mathcal{H}_\infty$-norms.
In Gugercin et al. (2008), an iterative procedure
is presented which determines locally optimal
expansion points w.r.t. the $\mathcal{H}_2$-norm approximation
under the assumption that the order $r$ of
the reduced model is prescribed and only 0th-
and 1st-order derivatives are matched. Also, for
multi-input multi-output systems (i.e., $m$ and $p$
in (1) are both larger than one), no full moment
matching is achieved, but only tangential
interpolation: $G(s_j)b_j = \hat{G}(s_j)b_j$, $c_j^*G(s_j) =
c_j^*\hat{G}(s_j)$, $c_j^*G'(s_j)b_j = c_j^*\hat{G}'(s_j)b_j$, for certain
vectors $b_j$, $c_j$ determined together with the optimal
$s_j$ by the iterative procedure.


Tools 

Almost all commercial software packages
for structural dynamics include modal
analysis/truncation as a means to compute a ROM.
Modal truncation and balanced truncation are 
available in the MATLAB® Control System 
Toolbox and the MATLAB® Robust Control 
Toolbox. 

Numerically reliable, well-tested, and efficient
implementations of many variants of balancing-based
MOR methods as well as Hankel
norm approximation and singular perturbation
approximation can be found in the Subroutine
Library In Control Theory (SLICOT,
http://www.slicot.org) (Varga 2001). Easy-to-use MATLAB
interfaces to the Fortran 77 subroutines from
SLICOT are available in the SLICOT Model
and Controller Reduction Toolbox
(http://slicot.org/matlab-toolboxes/basic-control); see
Benner et al. (2010). An implementation of
moment matching via the (block) Arnoldi
method is available in MOR for ANSYS®
(http://modelreduction.com/Software.html).

There exist benchmark collections containing
mainly LTI systems from various
applications. They provide systems in
computer-readable format which can easily be
used to test new algorithms and software:


• Oberwolfach Model Reduction Benchmark Collection
http://simulation.uni-freiburg.de/downloads/benchmark/

• NICONET Benchmark Examples
http://www.icm.tu-bs.de/NICONET/benchmodred.html

The MOR Wiki (http://morwiki.mpi-magdeburg.mpg.de/morwiki/)
is a platform for MOR
research and provides discussions of a number 
of methods, links to further software packages 
(e.g., MOREMBS and MORPACK), as well as 
additional benchmark examples. 

Summary and Future Directions 

MOR of LTI systems can now be considered
an established computational technique. Some
open issues remain and are currently being
investigated. These include methods yielding good
approximation in finite frequency or time intervals.
Though numerous approaches for these tasks
exist, methods with sharp local error bounds are
still desirable. A related problem is the reduction
of closed-loop systems and controller reduction.
Also, the generalization of the methods discussed
in this essay to descriptor systems (i.e., systems
with DAE dynamics), second-order systems, or
unstable LTI systems has only been partially
achieved. An important problem class receiving a
lot of current attention consists of (uncertain)
parametric systems. Here it is important to
preserve parameters as symbolic quantities in the
ROM. Most of the current approaches are based
in one way or another on interpolation. MOR for
nonlinear systems has also been a research topic
for decades. Still, the development of satisfactory
methods in the context of control design that have
computable error bounds and preserve
interesting system properties remains a challenging task.

Cross-References 

► Basic Numerical Methods and Software for 
Computer Aided Control Systems Design 

► Multi-domain Modeling and Simulation

Synonyms

MRAC 


Abstract 

The fundamentals and design principles of 
model reference adaptive control (MRAC) 
are described. The controller structure and 
adaptive algorithms are delineated. Stability and 
convergence properties are summarized. 

Keywords 

Certainty equivalence; Lyapunov-SPR design; 
MIT rule 


Introduction 

Model reference adaptive control (MRAC) is an 
important adaptive control approach, supported 
by rigorous mathematical analysis and effective 
design toolsets. It is made up of a feedback
control law that contains a controller $C(s, \theta_c)$
and an adjustment mechanism that generates the
controller parameter updates $\theta_c(t)$ online. While
different MRAC configurations can be found 
in the literature, the structure shown in Fig. 1 
is commonly used and includes all the basic 
components of an MRAC system. The prominent 
features of MRAC are that it incorporates a 
reference model which represents the desired
input-output behavior and that the controller
and adaptation law are designed to force the
response of the plant, $y_p$, to track that of the
reference model, $y_m$, for any given reference
input $r$.

Model Reference Adaptive Control, Fig. 1 Schematic of MRAC

Different approaches have been used to 
design MRAC, and each may lead to a different 
implementation scheme. The implementation 
schemes fall into two categories: direct and 
indirect MRAC. The former updates the
controller parameters $\theta_c$ directly using an
adaptive law, while the latter updates the plant
parameters $\theta_p$ first using an estimation algorithm
and then updates $\theta_c$ by solving, at each time $t$,
certain algebraic equations that relate $\theta_c$ with the
online estimates of $\theta_p$. In both direct and indirect
MRAC schemes, the controller structure is kept 
the same as that which would be used in the case 
that the plant parameters are known. 


MRC Controller Structure 

Consider the design objective of model reference 
control (MRC) for linear time-invariant systems: 
Given a reference model $M(s)$, find a control law
such that the closed-loop system is stable and
$y_p \to y_m$ as $t \to \infty$ for any bounded reference
signal $r$.

For the case of known plant parameters, the 
MRC objective can be achieved by designing 
the controller so that the closed-loop system 
has a transfer function equal to $M(s)$. This
is the so-called model matching condition.
To assure the existence of a causal controller 
that meets the model matching condition and 
guarantees internal stability of the closed- 
loop system, the following assumptions are 
essential: 

• A1. The plant has a stable inverse, and
the reference model is chosen to be
stable.

• A2. The relative degree of $M(s)$ is equal to or
greater than that of the plant $G_p(s)$. Herein,
the relative degree of a transfer function refers
to the difference between the orders of the
denominator and numerator polynomials.

It should be noted that these assumptions are
imposed on the MRC problem so that there is
enough structural flexibility in the plant and in
the reference model to meet the control
objectives. A1 is necessary for maintaining internal
stability of the system while meeting the model
matching condition, and A2 is needed to ensure
the causality of the controller. Both assumptions
are essential for non-adaptive applications when
the plant parameters are known, let alone for
the adaptive cases when the plant parameters are
unknown.
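For transfer functions given by polynomial coefficients, the two assumptions can be checked with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, coefficients are listed in descending powers of $s$, a "stable inverse" is interpreted as all plant zeros lying in the open left half plane, and plant numerator and denominator are assumed to be coprime.

```python
import numpy as np

def relative_degree(num, den):
    """Degree of the denominator minus degree of the numerator."""
    num = np.trim_zeros(np.atleast_1d(num).astype(float), "f")
    den = np.trim_zeros(np.atleast_1d(den).astype(float), "f")
    return (len(den) - 1) - (len(num) - 1)

def is_hurwitz(poly):
    """True if all roots of the polynomial lie in the open left half plane."""
    return bool(np.all(np.roots(poly).real < 0))

def check_mrc_assumptions(plant_num, plant_den, model_num, model_den):
    """Return (A1 holds, A2 holds) for plant G_p(s) and reference model M(s)."""
    a1 = is_hurwitz(plant_num) and is_hurwitz(model_den)
    a2 = relative_degree(model_num, model_den) >= relative_degree(plant_num, plant_den)
    return a1, a2

# Unstable but minimum-phase plant (s + 1)/((s - 3)(s + 2)), model 2/(s + 2).
print(check_mrc_assumptions([1, 1], [1, -1, -6], [2], [1, 2]))
# Plant with a zero at s = +1: A1 fails.
print(check_mrc_assumptions([1, -1], [1, 3, 2], [2], [1, 2]))
# Plant of relative degree 2 with a model of relative degree 1: A2 fails.
print(check_mrc_assumptions([1], [1, 3, 2], [2], [1, 2]))
```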

The reference model plays an important role 
in MRAC, as it will define the feasibility of 
MRAC design as well as the performance of 
the resulting closed-loop MRAC system. The 
reference model should reflect the desired closed- 
loop performance. Namely, any time-domain or 
frequency-domain specifications, such as time 
constant, damping ratio, natural frequency, bandwidth,
etc., should be properly reflected in the
chosen transfer function M(s). 

The controller structure for the MRAC is derived
with these assumptions for the known plant
case and extended to the adaptive case by
combining it with a proper adaptive law. Under
assumptions A1-A2, there exist infinitely many
control solutions $C$ that will achieve the MRC
design objective for a given plant transfer
function $G_p(s)$. Nonetheless, only those extendable
to MRAC with the simplest structure are of
interest. It is known that if a special controller
structure with the following parametrization is
imposed, then the solution to the model matching
condition, in terms of the ideal parameter $\theta^*$, will
be unique:

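$$u_p = \theta_1^{*T}\omega_1 + \theta_2^{*T}\omega_2 + \theta_3^{*}y_p + c_0^{*}r = \theta^{*T}\omega$$

where $\theta^* \in \mathbb{R}^{2n}$, $n$ is the order of the plant,

$$\theta^* = \begin{bmatrix} \theta_1^* \\ \theta_2^* \\ \theta_3^* \\ c_0^* \end{bmatrix}, \qquad \omega = \begin{bmatrix} \omega_1 \\ \omega_2 \\ y_p \\ r \end{bmatrix},$$

and $\omega_1, \omega_2 \in \mathbb{R}^{n-1}$ are signals internal to the
controller generated by stable filters (Ioannou and
Sun 1996).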

This MRC control structure is particularly
appealing for adaptive control development, as the
parameters appear linearly in the control law
expression, leading to a convenient linear parametric
model for adaptive algorithm development.

Adaptation Algorithm 

Design of adaptive algorithms for parameter
updating can be pursued through several different
approaches, thereby resulting in different MRAC
schemes. Three direct design approaches,
namely, the Lyapunov-SPR, the certainty
equivalence, and the MIT rule, will be briefly
described together with indirect MRAC.

Lyapunov-SPR Design 

One popular MRAC algorithm is derived using
Lyapunov’s direct method and the
Meyer-Kalman-Yakubovich (MKY) Lemma based on
the strictly positive real (SPR) argument. The
concept of SPR transfer functions originates from
network theory and is related to the driving point
impedance of dissipative networks. The MKY
Lemma states that given a stable transfer function
$M(s)$ and its realization $(A, B, C, d)$ where $d \geq 0$
and all eigenvalues of the matrix $A$ are in the
open left half plane: If $M(s)$ is SPR, then for
any given positive definite matrix $L = L^T > 0$,
there exists a scalar $\nu > 0$, a vector $q$, and a
$P = P^T > 0$ such that

$$A^TP + PA = -qq^T - \nu L$$

$$PB - C = \pm q\sqrt{2d}$$

By choosing $M(s)$ to be SPR, one can formulate
a Lyapunov function consisting of the state
tracking and parameter estimation errors and use the
MKY Lemma to define the adaptive law that will
force the derivative of the Lyapunov function to
be negative semidefinite. The resulting adaptive
law has the following simple form:

$$\dot{\theta} = -\Gamma e_1 \omega\,\operatorname{sign}(c_0^*)$$

where $\Gamma = \Gamma^T > 0$ is the adaptation gain matrix,
$e_1 = y_p - y_m$ is simply the tracking error,
and $c_0^* = k_m/k_p$ with $k_m$, $k_p$ being the
high-frequency gains of the transfer functions of the
reference model $M(s)$ and the plant $G_p(s)$,
respectively. This algorithm, however, applies only
to systems with relative degree equal to 0 or 1,
which is implied by the SPR condition imposed
on $M(s)$ and assumption A2.
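The adaptive law can be tried out on the simplest possible case. The sketch below is an illustration under assumptions that are not part of the text: a first-order plant $\dot{y}_p = a_p y_p + k_p u$ (so that $n = 1$, no filter signals $\omega_1, \omega_2$ are needed, and $\omega = [y_p, r]^T$), the SPR reference model $M(s) = 2/(s + 2)$, a scalar gain $\Gamma = \gamma I$, forward-Euler integration, and a hypothetical function name. The plant values are used only to simulate the plant; the adaptive law itself uses only the sign of $k_m/k_p$.

```python
import numpy as np

def simulate_lyapunov_mrac(T=60.0, dt=1e-3, gamma=5.0):
    a_p, k_p = 1.0, 2.0       # plant dy_p/dt = a_p*y_p + k_p*u (open-loop unstable)
    a_m, k_m = 2.0, 2.0       # reference model dy_m/dt = -a_m*y_m + k_m*r
    theta = np.zeros(2)       # controller parameters, u = theta^T omega
    y_p = y_m = 0.0
    errors = []
    for i in range(int(T / dt)):
        t = i * dt
        r = np.sin(t) + np.sin(3.0 * t)          # reference input
        omega = np.array([y_p, r])               # regressor
        u = theta @ omega
        e1 = y_p - y_m                           # tracking error
        errors.append(e1)
        # adaptive law: d(theta)/dt = -Gamma * e1 * omega * sign(c0*)
        theta = theta + dt * (-gamma * e1 * omega * np.sign(k_m / k_p))
        y_p = y_p + dt * (a_p * y_p + k_p * u)
        y_m = y_m + dt * (-a_m * y_m + k_m * r)
    # ideal parameters from the model matching condition of this scalar case
    theta_star = np.array([-(a_m + a_p) / k_p, k_m / k_p])
    return np.array(errors), theta, theta_star

errors, theta, theta_star = simulate_lyapunov_mrac()
print("max |e1| during first 5 s:", np.max(np.abs(errors[:5000])))
print("max |e1| during last 5 s: ", np.max(np.abs(errors[-5000:])))
print("final theta:", theta, "ideal theta:", theta_star)
```

For this scalar case, the function $V = e_1^2/2 + |k_p|\,\tilde{\theta}^T\tilde{\theta}/(2\gamma)$ with $\tilde{\theta} = \theta - \theta^*$ satisfies $\dot{V} = -a_m e_1^2 \leq 0$ along the continuous-time trajectories, which is why the tracking error is expected to decay.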

The Lyapunov-SPR-based MRAC design is 
mathematically elegant in its stability analysis but 
is restricted to a special class of systems. While 
it can be extended to more general cases with 
relative degrees equal to 2 and 3, the resulting 
control law and adaptive algorithm become much 
more complicated and cumbersome as efforts 
must be made to augment the control signal in 
such a way that the MKY Lemma is applicable to 
the “reformulated” reference model. 

Certainty Equivalence Design 

For more general cases with a high relative
degree, another design approach based on the “certainty
equivalence” (CE) principle is preferred, due to
the simplicity of its design as well as its
robustness properties in the presence of modeling
errors. This approach treats the design of the
adaptive law as a parameter estimation problem, with
the estimated parameters being the controller
parameter vector $\theta^*$. Using the specific linear
formulation of the control law and assuming
that $\theta^*$ satisfies the model matching condition,
one can show that the ideal controller parameter
satisfies the following parametric equation:

$$z = \theta^{*T}\omega_p$$

with

$$z = M(s)u_p, \qquad \omega_p = \begin{bmatrix} M(s)\omega_1 \\ M(s)\omega_2 \\ M(s)y_p \\ y_p \end{bmatrix}.$$

This parametric model allows one to derive
adaptive laws to estimate the unknown controller
parameter $\theta^*$ using standard parameter
identification techniques, such as the gradient and least
squares algorithms. The corresponding MRAC is
then implemented in the CE sense where the
unknown parameters are replaced by their estimated
values. It should be noted that a CE design does
not guarantee closed-loop stability of the
resulting adaptive system, and additional analysis has
to be carried out to establish closed-loop stability.


MIT Rule 

Besides the Lyapunov-SPR and CE approaches 
mentioned earlier, the direct MRAC problem can 
also be approached using the so-called MIT rule, 
an early form of MRAC developed in the 1950s- 
1960s in the Instrumentation Laboratory at MIT 
for flight control. The designer defines a cost 
function, e.g., a quadratic function of tracking 
error, and then adjusts parameters in the direction 
of steepest descent. The negative gradient of the 
cost function is usually calculated through the 
sensitivity derivative approach. The formulation 
is quite flexible, as different forms of MIT rule 
can be derived by changing the cost function 
following the same procedure and reusing the 
same sensitivity functions. Despite its
effectiveness in some practical applications, MRAC
systems designed with the MIT rule have had stability
and robustness issues.
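A classical textbook-style illustration of the MIT rule is the adaptation of a single feedforward gain. The sketch below makes assumptions that are not part of the text: the plant is $k\,G(s)$ and the reference model is $k_0\,G(s)$ with $G(s) = 1/(s + 1)$, the control is $u = \theta r$, the cost is $J = e^2/2$ with $e = y - y_m$, and the sensitivity derivative $\partial e/\partial\theta$ is taken as proportional to $y_m$ with the unknown factor $k/k_0$ absorbed into the adaptation gain $\gamma$. Integration is forward Euler and the function name is hypothetical.

```python
import numpy as np

def simulate_mit_rule(T=40.0, dt=1e-3, gamma=1.0):
    k, k0 = 1.0, 2.0          # plant gain (treated as unknown) and model gain
    theta = 0.0               # adjustable feedforward gain
    y = y_m = 0.0
    history = []
    for i in range(int(T / dt)):
        r = np.sin(i * dt)                # reference input
        u = theta * r
        e = y - y_m                       # model-following error
        # steepest descent on J = e^2/2 with sensitivity proportional to y_m
        dtheta = -gamma * e * y_m
        y += dt * (-y + k * u)            # plant:  dy/dt   = -y   + k*u
        y_m += dt * (-y_m + k0 * r)       # model:  dy_m/dt = -y_m + k0*r
        theta += dt * dtheta
        history.append(theta)
    return np.array(history)

theta_hist = simulate_mit_rule()
print("final gain:", theta_hist[-1], "ideal gain k0/k:", 2.0)
```

With a moderate adaptation gain the estimate is expected to settle near $k_0/k$; the rule gives no stability guarantee, and larger values of $\gamma$ or of the reference amplitude can behave very differently.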


Indirect MRAC 

While most of the MRAC systems are 
designed as direct adaptive systems, indirect 
MRAC systems can also be developed which
explicitly estimate the plant parameter $\theta_p$ as
an intermediate step. The adaptive law for an 
indirect MRAC includes two basic components: 
one for estimating the plant parameters and 
another for calculating the controller parameters 
based on the estimated plant parameters. This 
approach would be preferred if the plant transfer 
function is partially known, in which case 
the identification of the remaining unknown 
parameters represents a less complex problem. 
For example, if the plant has no zeros, the indirect
scheme estimates $n + 1$ parameters, while the
direct scheme has to estimate $2n$ parameters.
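The second component, the calculation of the controller parameters from the plant estimates, reduces to a pair of algebraic equations in the scalar case. The sketch below assumes a first-order plant $\dot{y}_p = a y_p + k u$, a reference model $\dot{y}_m = -a_m y_m + k_m r$, and the control law $u = \theta_y y_p + \theta_r r$; none of these specifics come from the text, and the function name and the threshold `k_min` are hypothetical. The guard against a very small estimated gain is only a simple safeguard for this sketch.

```python
def controller_from_estimates(a_hat, k_hat, a_m=2.0, k_m=2.0, k_min=1e-3):
    """Map plant estimates (a_hat, k_hat) to (theta_y, theta_r).

    Matching the closed loop a_hat + k_hat*theta_y = -a_m and
    k_hat*theta_r = k_m requires a division by the estimated gain.
    Returns None if the estimate is too small for that division.
    """
    if abs(k_hat) < k_min:
        return None
    theta_y = -(a_m + a_hat) / k_hat
    theta_r = k_m / k_hat
    return theta_y, theta_r

print(controller_from_estimates(1.0, 2.0))   # exact plant values
print(controller_from_estimates(0.4, 1.2))   # an intermediate estimate
print(controller_from_estimates(1.0, 0.0))   # no solution for this estimate
```

In an indirect scheme this mapping would be evaluated at every time step with the current output of the estimator.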

Indirect MRAC is a CE-based design. As such, 
the design is intuitive but the design process does 
not guarantee closed-loop stability, and separate 
analysis has to be carried out to establish stability. 
Except for systems with a low number of zeros, 
the “feasibility” problem could also complicate 
the matter, in the sense that the MRC problem 
may not have a solution for the estimated plant 
at some time instants even though the solution 
exists for the real plant. This problem is unique 
to the indirect design, and several mitigating 
solutions have been found at the expense of more 
complicated adaptation or control algorithms. 

Stability, Robustness, and Parameter 
Convergence 

Stability for MRAC often refers to the properties 
that all signals are bounded and tracking error 
converges to zero asymptotically. Robustness for 
adaptive systems implies that signal boundedness 
and tracking error convergence (to a small residue 
set) will be preserved in the presence of small 
perturbations such as disturbances, unmodeled
dynamics, and time-varying parameters. For
different MRAC schemes, different approaches are
used to establish their properties. 

For the Lyapunov-SPR-based MRAC systems, 
stability is established in the design process 
where the adaptive law is derived to enforce 
a Lyapunov stability condition. For CE-based 
designs, establishing stability for the closed- 
loop MRAC system is a nontrivial exercise 
for both direct and indirect schemes. Using 
properly normalized adaptive laws for parameter
estimation, however, stability can be proved
for direct and indirect MRAC schemes. For
MRAC systems designed with the MIT rule, local 
stability can be established under more restrictive 
conditions, such as when the parameters are close 
to the ideal ones. 

It should be noted that the adaptive control 
algorithm in the original form has been 
shown to have robustness issues, and extensive 
publications in the 1980s and 1990s were 
devoted to robust adaptive control in attempts 
to mitigate the problem. Many modifications 
have been proposed and shown to be effective 
in “robustifying” the MRAC; interested readers 
are referred to the article on ► Robust Adaptive 
Control for more details. 

Parameter convergence is not an intrinsic 
requirement for MRAC, as tracking error 
convergence can be achieved without parameter 
convergence. It has been shown, however, 
that parameter convergence could enhance 
robustness, particularly for indirect schemes. 
As in the case for parameter identification, a 
persistent excitation (PE) condition needs to 
be imposed on the regression signal to assure 
parameter convergence in MRAC. In general, 
PE is accomplished by properly choosing the 
reference input r. It can be established for most 
MRAC approaches that parameter convergence is 
achieved if, in addition to conditions required for 
stability, the reference input $r$ is sufficiently rich
of order $2n$, $r$ is bounded, and there is no
pole-zero cancellation in the plant transfer function. A
signal is said to be sufficiently rich of order $m$
if it contains at least $m/2$ distinct frequencies.
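The link between the number of frequencies and the level of excitation can be observed numerically. The sketch below is an illustration with choices that are not taken from the text: the regressor is built from $m$ delayed copies of the signal, $\phi(t) = [r(t), r(t - \tau), \ldots, r(t - (m-1)\tau)]^T$, the signal is a sum of unit-amplitude sinusoids, and excitation is measured by the smallest eigenvalue of the time-averaged matrix $\phi\phi^T$. The function name is hypothetical.

```python
import numpy as np

def min_excitation(freqs, m, tau=0.5, T=200.0, dt=0.01):
    """Smallest eigenvalue of the averaged Gram matrix of the delay regressor."""
    t = np.arange(0.0, T, dt)
    # row i holds the signal delayed by i*tau
    phi = np.stack([sum(np.sin(w * (t - i * tau)) for w in freqs) for i in range(m)])
    gram = phi @ phi.T / t.size
    return np.linalg.eigvalsh(gram)[0]

print("1 frequency,   m = 2:", min_excitation([1.0], 2))
print("1 frequency,   m = 3:", min_excitation([1.0], 3))
print("2 frequencies, m = 4:", min_excitation([1.0, 2.5], 4))
print("2 frequencies, m = 5:", min_excitation([1.0, 2.5], 5))
```

A signal with $q$ distinct frequencies excites at most $2q$ independent directions of this regressor, so the smallest eigenvalue drops to rounding level as soon as $m$ exceeds $2q$.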

Summary and Future Directions 

MRAC incorporates a reference model to capture 
the desired closed-loop responses and designs 
the control law and adaptation algorithm to force 
the output of the plant to follow the output of 
the reference model. Several different design 
approaches are available. Stability, robustness, 
and parameter convergence have been established 
for different MRAC designs with appropriate 
assumptions. 


MRAC was a very active and fruitful
research topic from the 1960s to the 1990s, and it
formed important foundations for modern
adaptive control theory. It also found many successful
applications ranging from chemical process
controls to automobile engine controls. More recent
efforts have been mostly devoted to integrating it
with other design approaches to treat nonstandard
MRAC problems for nonlinear and complex
dynamic systems.

Cross-References 

► Adaptive Control of Linear Time-Invariant 
Systems 

► Adaptive Control, Overview 

► History of Adaptive Control 

► Robust Adaptive Control 

Recommended Reading 

MRAC has been well covered in several
textbooks and research monographs. Åström and
Wittenmark (1994) presented different MRAC
schemes in a tutorial fashion. Narendra and
Annaswamy (1989) focused on stability of deterministic
MRAC systems. Ioannou and Sun (1996)
covered the detailed derivation and analysis of
different MRAC schemes and provided a unified
treatment for their stability and robustness
analysis. MRAC systems for discrete-time (Goodwin
and Sin 1984) and for nonlinear (Krstić et al.
1995) processes are also well explored.

Abstract

In many applications, e.g., in chemical process 
control, the purpose of control is to achieve an 
optimal performance of the controlled system 
despite the presence of significant uncertainties 
about its behavior and of external disturbances. 
Tracking of set points is often required for lower- 
level control loops, but at the system level in 
most cases, this is not the primary concern 
and may even be counterproductive. In this 
entry, the use of dynamic online optimization 
on a moving finite horizon to realize optimal 
system performance is discussed. By real-time
optimization, a performance-oriented
or economic cost criterion is minimized or 
maximized over a finite horizon while the usual 
control specifications enter as constraints but 
not as set points. This approach integrates the 
computation of optimal set-point trajectories and 
of the regulation to these trajectories. 

Keywords 

Model-predictive control (MPC); Integrated
optimization and control; Real-time optimization
(RTO); Performance optimizing control; Process 
control 

Introduction 

From a systems point of view, the purpose of 
automatic feedback control (and that of manual 
control as well) in many cases is not primarily to 
keep the controlled variables at their set points as 
well as possible or to track dynamic set-point 
changes but to operate the system such that 
its performance is optimized in the presence
of disturbances and uncertainties, exploiting the
information gained in real time from the available 
measurements. This holds generally for the 
higher control layers in the process industries but 
similarly for many other applications. Suppose 
that, for example, the availability of cooling 
water at a lower temperature than assumed as 
a worst case during plant design enables plant 
operation at a higher throughput. In this case, 
what sense does it make to enforce the nominal 
operating point by tight feedback control? For 
a combustion engine, the goal is to achieve the 
desired torque with minimum consumption of 
fuel. For a cooling system, the goal is to keep 
the temperature of the goods or of a room within 
certain bounds with minimum consumption of 
energy, possibly weighted against the wear of the 
equipment. To regulate some variables to their 
set points may help to achieve these goals but it 
is not the real performance target for the overall 
system. Feedback control loops therefore usually 
are part of control hierarchies that establish 
good performance of the overall system and the 
meeting of constraints on its operation. 

There are four main approaches to the
integration of feedback control with system performance
optimization:

- Choice of regulated variables such that,
implicitly via the regulation of these variables to
their set points, the performance of the overall
system is close to optimal (see the chapter on
► Control Structure Selection).

- Tracking of necessary conditions of optimality 
where variables which determine the optimal 
operating policy are kept at or close to their 
constraints. This is a widespread approach 
especially in chemical batch processes where,
e.g., the feeding of reactants is such that the
maximum cooling power available is used
(Finkler et al. 2014; see also the chapter
on ► Control and Optimization of Batch
Processes).

In these two approaches, the choice of the
optimal set points or constraints to be tracked is done
off-line, and they are then implemented by the
feedback layer of the process control hierarchy
(see the chapter on ► Control Hierarchy of Large
Processing Plants: An Overview).

- Combination of a regulatory (tracking)
feedback control with an optimization of the set
points or system trajectories (called real-time
optimization in the process industries) (see
the chapter on ► Real-Time Optimization of
Industrial Processes).

- Reformulation of model-predictive control 
such that the control target is not the tracking 
of references but the optimization of the 
system performance over a finite horizon, 
taking constraints of system variables or 
inputs into account directly within the 
online optimization. Here, the optimization 
is performed with a dynamic model, in 
contrast to the steady-state optimization in 
real-time optimization or in the choice of self- 
optimizing control structures. 

The first three approaches are currently state
of the art in the process industries. Tracking of
necessary conditions of optimality is usually
designed based on process insight rather than
on a rigorous analysis, and the same holds
for the selection of regulatory control structures.
The last one is the most challenging approach in
terms of the required models, algorithms, and
computing power, and its theoretical foundations
are still under development. On the other
hand, it also has the highest potential in terms
of the resulting performance of the controlled
system, and it is structurally simple and easier to
tune because the natural performance
specification does not have to be translated into controller
tunings, weights, etc. Therefore, the idea of direct
model-based performance optimizing control has
attracted much attention in process control in recent
years.

The four approaches above are discussed in
more detail below. We also provide some
historical notes and outline some areas of continuing
research.


Performance Optimization 
by Regulation to Fixed Set Points 

Morari et al. (1980) stated that the objective in
the synthesis of a control structure is “to
translate the economic objectives into process control
objectives.” A subgoal in this “translation” is to
select the regulatory control structure of a process
such that steady-state optimality of process
operations is realized to the maximum extent possible
by driving the selected controlled variables to
suitably chosen set points. A control structure
with this property was termed “self-optimizing
control” by Skogestad (2000). It should adjust
the manipulated variables by keeping a function
of the measured variables constant such that the
process is operated at the economically optimal
steady state in the presence of disturbances. From
a system point of view, a control structure that
yields nice transient responses and tight control
of the selected variables may be of little use or
even counterproductive if keeping the regulated
variables at their set points does not improve
the performance of the system. Ideally, in the
steady state, the performance obtained is similar to
what would be realized by optimizing the stationary
values of the operational degrees of freedom of
the system for known disturbances $d$ and a
perfect model. By regulating the controlled variables
to their set points at the steady state in the
presence of disturbances, a mapping $u = f(y_{\mathrm{set}}, d)$
is implicitly realized which should be an
approximation of the performance optimizing inputs
$u_{\mathrm{opt}}(d)$. The choice of the self-optimizing control
structure takes only the steady-state performance
into account, not the dynamic reaction of the
controlled plant. An extension of the approach to
include also the dynamic behavior can be found
in Pham and Engell (2011).

Tracking of Necessary Conditions 
of Optimality 

Very often, the optimal operation of a system in
a certain phase of its evolution or under certain
conditions is defined by some variables being
at their constraints. If these variables are known
and the conditions can be monitored, a switching
control structure can be built that keeps the
(possibly changing) set of critical variables at their
constraints despite inaccuracies of the model,
external disturbances, etc. In fact it turns out that
such control schemes can, in the case of varying
parameters and in the presence of disturbances,
perform as well as sophisticated model-based
optimization schemes (Finkler et al. 2013).

Performance Optimization 
by Steady-State Optimization 
and Regulation 

A well-established approach to create a link
between regulatory control and the optimization of
the performance of a system is to compute the
set points of the controllers by an optimization
layer. In process operations, this layer is called
real-time optimization (RTO) (see, e.g., Marlin
and Hrymak (1997) and the references therein).
An RTO system is a model-based, upper-level
control system that is operated in closed loop and
provides set points to the lower-level control
systems in order to maintain the process operation
as close as possible to the economic optimum.
It usually comprises an estimation of the plant
state and plant parameters from the measured
data and an economic or otherwise
performance-related optimization of the operating point using
a detailed nonlinear steady-state model.

As the RTO system employs a stationary
process model and the optimization is only
performed if the plant is approximately in a steady
state, the time between successive RTO steps
must be large enough for the plant to reach a new
steady state after the last commanded move. This
structure is based upon a separation of concerns
and of timescales between the RTO system and
the process control system. The RTO system
optimizes the system economics on a medium
timescale (shifts to days), while the control
system provides tracking and disturbance rejection
on shorter timescales from seconds to hours.


As an approximation to real-time optimization 
with a nonlinear rigorous plant model, in many 
MPC implementations nowadays, an optimiza¬ 
tion of the steady-state values based on the linear 
model that is used in the MPC controller is imple¬ 
mented. Then the gain matrix of the model must 
be estimated carefully to obtain good results. 

Performance Optimizing Control 

Model-predictive control has become the stan¬ 
dard solution for demanding control problems in 
the process industries (Qin and Badgwell 2003) 
and increasingly is used also in other domains. 
The core idea is to employ a model to predict the 
effect of the future manipulated variables on the 
future controlled variables over a finite horizon 
and to use optimization to determine sequences 
of inputs which minimize a cost function over the 
so-called prediction horizon. In the unconstrained 
case with linear plant model and a quadratic 
cost function, the optimal control moves can 
be computed by a closed-form solution. When 
constraints on inputs, outputs, and possibly also 
state variables are present, for a quadratic cost 
function and linear plant model, the optimization 
problem becomes a quadratic program (QP) that 
has to be solved in real time. 
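
The closed-form solution of the unconstrained case can be illustrated with a minimal sketch. It assumes a scalar plant x(i+1) = a x(i) + b u(i) with output y = x and a single input move that is then held constant over the prediction horizon (a control horizon of one). The plant, the function name `unconstrained_mpc_move`, and all numerical values are illustrative assumptions, not part of the formulation in the text. Setting the derivative of the quadratic cost to zero gives the move directly:

```python
def unconstrained_mpc_move(a, b, x, u_prev, y_ref, P, gamma=1.0, alpha=0.0):
    """Closed-form optimal input move for an assumed scalar linear plant.

    Plant: x(i+1) = a*x(i) + b*u(i), y = x.
    The input u = u_prev + du is held constant over the horizon P.
    Cost: sum_i gamma*(y_ref - y(k+i))**2 + alpha*du**2.
    """
    num = 0.0
    den = alpha
    s = 0.0     # step-response coefficient s_i = b*(1 + a + ... + a**(i-1))
    free = x    # free response a**i * x
    for _ in range(P):
        s = a * s + b
        free = a * free
        # Predicted output: y(k+i) = free + s*(u_prev + du)
        e = y_ref - free - s * u_prev   # predicted error if du = 0
        num += gamma * s * e
        den += gamma * s * s
    return num / den


# Receding-horizon use: only the first move is applied, then re-optimized.
x, u = 0.0, 0.0
for _ in range(60):
    u += unconstrained_mpc_move(0.9, 0.1, x, u, y_ref=1.0, P=10, alpha=0.01)
    x = 0.9 * x + 0.1 * u
```

With input or output constraints the same cost no longer has an explicit minimizer, which is why a quadratic program has to be solved at every sampling instant.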

When the system dynamics are nonlinear 
and linear models are only sufficiently accurate 
within narrow operation bands, as is the case 
in many chemical processes, nonlinear model 
predictive control which is based on nonlinear 
models of the process dynamics provides 
superior performance and therefore has met 
increasing interest both in theory and in practice. 
The classical formulation of nonlinear model-predictive
tracking control (TC) is

\[
\min_{u} \; \varphi_{TC}(y, u)
\]

\[
\varphi_{TC} = \sum_{n=1}^{N} \left( \sum_{i=1}^{P} \gamma_{n,i}
  \left( y_{n,\mathrm{ref}}(k+i) - \bar{y}_n(k+i) \right)^2 \right)
  + \sum_{l=1}^{R} \left( \sum_{j=1}^{M} \alpha_{l,j}\, \Delta u_l^2(k+j) \right)
\]

s.t.

\[
\begin{aligned}
x(i+1) &= f(x(i), z(i), u(i), i), & i &= k, \ldots, k+P \\
0 &= g(x(i), z(i), u(i), i), & i &= k, \ldots, k+P \\
y(i+1) &= h(x(i+1), u(i)), & i &= k, \ldots, k+P \\
x_{\min} &\le x(i) \le x_{\max}, & i &= k, \ldots, k+P \\
y_{\min} &\le y(i) \le y_{\max}, & i &= k, \ldots, k+P \\
u_{\min} &\le u(i) \le u_{\max}, & i &= k, \ldots, k+M \\
-\Delta u_{\min} &\le \Delta u(i) \le \Delta u_{\max}, & i &= k, \ldots, k+M \\
u(i) &= u(i-1) + \Delta u(i), & i &= k, \ldots, k+M \\
u(i) &= u(k+M), & \forall\, i &> k+M.
\end{aligned}
\]

Here $f$ and $g$ represent the plant model in the form of a
system of differential-algebraic equations and $h$ is the
output function. $P$ is the length of the prediction horizon
and $M$ is the length of the control horizon. $y_1, \ldots, y_N$
are the predicted control outputs, $u_1, \ldots, u_R$ are the
control inputs, and $\alpha$ and $\gamma$ represent the weights
on the control inputs and the control outputs, respectively.
$y_{\mathrm{ref}}$ refers to the set point or the desired output
trajectory, and $\bar{y}(i)$ are the corrected model predictions.
$N$ is the number of controlled outputs, and $R$ is the number
of control inputs. Compensation for plant-model mismatch and
unmeasured disturbances is usually done using the bias
correction equations:

\[
d(k) = y^{\mathrm{meas}}(k) - y(k),
\]

\[
\bar{y}(k+i) = y(k+i) + d(k), \quad i = k, \ldots, k+P.
\]

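The idea of direct performance optimizing control (POC) is
to replace this formulation by a performance-related
objective function:

\[
\min_{u} \; \varphi_{POC}(y, u)
\]

\[
\varphi_{POC} = \sum_{l=1}^{R} \left( \sum_{j=1}^{M} \alpha_{l,j}\, \Delta u_l^2(k+j) \right)
  + \sum_{i=1}^{P} \beta_i\, \psi(k+i)
\]

Here $\psi(k+i)$ represents the value of the performance cost
criterion at the time step $k+i$, and the $\beta_i$ are the
corresponding weights.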


The optimization of the future control moves is 
subject to the same constraints as before. In ad¬ 
dition, instead of reference tracking, constraints 
are formulated for all outputs that are critical for 
the operation of the system or its performance, 
e.g., product quality specifications or limitations 
of the equipment. In contrast to reference track¬ 
ing, these constraints usually are one-sided (in¬ 
equalities) or define operation bands. By this 
formulation, e.g., the production revenues can be 
maximized online over a finite horizon, consid¬ 
ering constraints on product purities and waste 
stream impurities. Feedback enters into the com¬ 
putation by the initialization of the model with 
a new initial state that is estimated from the 
available measurements of system variables and 
by the bias correction. Thus, direct performance 
optimizing control realizes an online optimiza¬ 
tion of all operational degrees of freedom in a 
feedback structure without tracking of a priori 
fixed set points or reference trajectories. The reg¬ 
ularization term that penalizes control moves is 
added to the purely economic objective function 
to obtain smoother solutions. 
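
The structure of such a controller can be sketched for a deliberately small case. The sketch assumes a scalar plant model x(i+1) = a x(i) + b u(i) with y = x, an economic cost ψ = −price · u (more throughput earns more), a one-sided constraint y ≤ y_max on the bias-corrected prediction, and a single input value held over the horizon. A grid search stands in for the nonlinear programming solver that a real implementation would use. All names and numbers are hypothetical.

```python
def predict(a, b, x0, u, P):
    """Predict y(k+1..k+P) for the assumed model x+ = a*x + b*u, y = x."""
    ys, x = [], x0
    for _ in range(P):
        x = a * x + b * u
        ys.append(x)
    return ys


def poc_input(a, b, x0, u_prev, d, P, y_max, u_grid, price=1.0, alpha=0.1):
    """Choose the constant input minimizing economic cost plus move penalty.

    Cost: alpha*(u - u_prev)**2 + sum_i psi(k+i), with psi = -price*u.
    Constraint: corrected prediction y(k+i) + d <= y_max for all i,
    where d is the bias term (measurement minus model output).
    """
    best_u, best_cost = None, float("inf")
    for u in u_grid:
        ys = predict(a, b, x0, u, P)
        if any(y + d > y_max for y in ys):
            continue  # violates the one-sided output constraint
        cost = alpha * (u - u_prev) ** 2 - price * u * P
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u


u_grid = [0.25 * i for i in range(9)]   # candidate inputs 0, 0.25, ..., 2
u_nominal = poc_input(0.5, 0.5, 0.0, 0.0, 0.0, 5, 1.0, u_grid)
u_biased = poc_input(0.5, 0.5, 0.0, 0.0, 0.2, 5, 1.0, u_grid)
```

No set point appears anywhere: the optimizer simply selects the largest candidate input for which the corrected prediction stays below y_max, and a positive bias d (the plant output is higher than the model predicts) makes it back off automatically.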

This approach has several advantages over a 
combined steady-state optimization/ linear MPC 
scheme: 

• Immediate reaction to disturbances, no wait¬ 
ing for the plant to reach a steady state is 
required. 

• “Overregulation” is avoided - no variables are 
forced to fixed set points and all degrees of 
freedom can be used to improve the (eco¬ 
nomic) performance of the plant. 

• Performance goals and process constraints do 
not have to be mapped to a control cost that 
defines a compromise between different goals. 
In this way, the formulation of the optimiza¬ 
tion problem and the tuning are facilitated 
compared to achieving good performance by 
tuning of the weights of a tracking formula¬ 
tion. 

• More constraints than available manipulated 
variables can be handled as well as more 
manipulated variables than variables that have 
to be regulated. 

• No inconsistency arises from the use of differ¬ 
ent models on different layers. 

• The overall scheme is structurally simple. 

Similar to any NMPC controller that is designed
for reference tracking, a successful implementation
will require careful engineering such that as
many uncertainties as possible are compensated
by simple feedback controllers and only the key
dynamic variables are handled by the optimizing
controller based on a rigorous model of the essential
dynamics and of the stationary relations of the
plant without too much detail.


History and Examples 

The idea of economic or performance optimizing
control originated from the process control community.
In one of the first papers on directly integrating
economic considerations into model-predictive
control, Zanin et al. (2000) proposed to achieve
a better economic performance by adding an economic
term to a classical tracking performance
criterion and applied this to the control of a
fluidized bed catalytic cracker. Helbig et al. (2000)
discussed different ways to integrate optimization
and feedback control, including direct dynamic
optimization, for the example of a semi-batch
reactor. Toumi and Engell (2004) and Erdem
et al. (2004) demonstrated online performance
optimizing control schemes for simulated moving
bed (SMB) chromatographic separations at
lab scale. SMB processes are periodic processes
and constitute prototypical examples where
additional degrees of freedom can be used to
simultaneously optimize system performance and
meet product specifications. Bartusiak (2005)
already reported industrial applications of carefully
engineered performance optimizing NMPC
controllers.

Direct performance optimizing control 
was suggested as a promising general new 
control paradigm for the process industries by 
Rolandi and Romagnoli (2005), Engell (2006, 
2007), Rawlings and Amrit (2009), and others. 
Meanwhile it has been demonstrated in many 
simulation studies that direct optimization of 
a performance criterion can lead to superior 
economic performance compared to classical 
tracking (N)MPC, e.g., Ochoa et al. (2010) for a 


bioethanol process and Idris and Engell (2012) 
for a reactive distillation column. 


Further Issues 

Modeling and Robustness 

In a direct performance optimizing control ap¬ 
proach, sufficiently accurate dynamic nonlinear 
process models are needed. While in the pro¬ 
cess industries, nonlinear steady-state models are 
nowadays available for many processes because 
they are built and used extensively in the pro¬ 
cess design phase, there is still a considerable 
additional effort required to formulate, imple¬ 
ment, and validate nonlinear dynamic process 
models. The effort for rigorous or semi-rigorous 
modeling usually dominates the cost of an ad¬ 
vanced control project. The alternative approach 
to use black-box or gray-box models as pro¬ 
posed frequently in nonlinear model-predictive 
control may be effective for regulatory control 
where the model only has to capture the es¬ 
sential dynamic features of the plant near an 
operating point, but it seems to be less suitable 
for optimizing control where the optimal plant 
performance is aimed at and hence the best sta¬ 
tionary values of the inputs and of the controlled 
variables have to be computed by the controller. 
As increasingly so-called operator training sim¬ 
ulators are built in parallel to the construction 
of a plant and are continuously used and up¬ 
dated after the commissioning phase, it seems 
attractive to use the models contained in the 
simulators also for online optimization. However, 
the model formulations often are not suitable for 
this purpose. 

Model inaccuracies always have to be taken
into account. They not only lead to suboptimal
performance but can also cause constraints,
even on measured variables, to be violated
in the future because of an insufficient back-off
from the constraints. A new approach to deal with
uncertainties about model parameters and future
influences on the process is multistage scenario-based
optimization with recourse. Here the model
uncertainties are represented by a set of scenarios
of parameter variations, and the future availability
of additional information is taken into account.
It has been demonstrated that this is an effective
tool to handle model uncertainties and to
automatically generate the necessary back-off without
being overly conservative (Lucia et al. 2013).

State Estimation 

For the computation of economically optimal 
process trajectories based upon a rigorous non¬ 
linear process model, the state variables of the 
system at the beginning of the prediction horizon 
must be known. As not all states will be measured 
in a practical application, state estimation is a 
key ingredient of a performance optimizing controller.
Extended Kalman filters are the standard
solution used in the process industries; if the
nonlinearities are significant, unscented Kalman
filters or particle filters may be used. A novel
approach is to formulate the state estimation 
problem also as an optimization problem on a 
moving horizon (Rao et al. 2003). The estima¬ 
tion of some important varying unknown model 
parameters can be included in this formulation. 
As accurate state estimation is at least as critical 
for the performance of the closed-loop system 
as the exact tuning of the optimizing controller, 
more attention should be paid to the investigation 
of the performance of state estimation schemes 
in realistic situations with non-negligible model- 
plant mismatch. 
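
For orientation, one predict/update cycle of an extended Kalman filter can be written in a few lines for a scalar state. This is a generic textbook sketch under the assumption of a scalar model x(k+1) = f(x, u) + noise, y = h(x) + noise. The function name `ekf_step` and its argument list are illustrative, not taken from any particular library.

```python
def ekf_step(x_est, P_est, u, y_meas, f, dfdx, h, dhdx, Q, R):
    """One scalar extended Kalman filter cycle.

    f, h       : state transition and output functions (assumed known)
    dfdx, dhdx : their derivatives with respect to the state
    Q, R       : process and measurement noise variances
    """
    # Prediction with the nonlinear model and the linearized variance
    x_pred = f(x_est, u)
    F = dfdx(x_est, u)
    P_pred = F * P_est * F + Q
    # Measurement update around the predicted state
    H = dhdx(x_pred)
    K = P_pred * H / (H * P_pred * H + R)
    x_new = x_pred + K * (y_meas - h(x_pred))
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new
```

The ratio of Q to R decides how strongly the estimate follows the measurement rather than the model; with model-plant mismatch this tuning becomes part of the overall controller design.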

Stability 

Optimization of a cost function over a finite hori¬ 
zon in general neither assures optimality of the 
complete trajectory beyond this horizon nor sta¬ 
bility of the closed-loop system. Closed-loop sta¬ 
bility has been addressed extensively in the the¬ 
oretical research in nonlinear model-predictive 
control. Stability can be assured by a proper 
choice of the stage cost within the prediction 
horizon and the addition of a cost on the ter¬ 
minal state and the restriction of the terminal 
state to a suitable set. In performance optimizing 
MPC, there is no a priori known steady state 
to which the trajectory should converge, and the 
economic cost function may not satisfy the usual 
conditions for closed-loop stability, e.g., because 
it only involves some of the inputs. In recent
years, important results on formulations that
guarantee closed-loop stability have nonetheless
been obtained, involving terminal constraints or
a quasi-infinite horizon (Angeli et al. 2012; Diehl
et al. 2011; Grüne 2013).

Reliability and Transparency 

Nowadays quite large nonlinear dynamic optimization
problems can be solved in real time,
not only for slow processes as they are found in
the chemical industry but also in mechatronics
and automotive control. Computation time therefore
no longer prohibits the application of a performance
optimizing control scheme to complex systems.
A practically very important limiting issue, however,
is that of reliability and transparency. It is
difficult to guarantee that a nonlinear optimizer 
will provide a solution which at least satisfies 
the constraints and gives a reasonable perfor¬ 
mance for all possible input data. While for an 
RTO scheme an inspection of the commanded 
set points by the operators usually will be fea¬ 
sible, this is less likely to be realistic in a dy¬ 
namic situation. Hence, automatic result filters 
are necessary as well as a backup scheme that 
stabilizes the process in the case where the result 
of the optimization is not considered safe. In 
the process industries, the operators will con¬ 
tinue to supervise the operation of the plant in 
the foreseeable future, so a control scheme that 
includes performance optimizing control must be 
structured into modules, the outputs of which can 
still be understood by the operators so that they 
build up trust in the optimization. Good operator 
interfaces that display the predicted moves and 
the predicted reaction of the plant and enable 
comparisons with the operators’ intuitive strate¬ 
gies are believed to be essential for practical 
success. 

Effort vs. Performance 

The gain in performance by a more sophisticated
control scheme always has to be traded against
the increase in cost due to the complexity of
the control scheme: a complex scheme will
not only incur implementation costs, but
it will also need more maintenance by better
qualified people than a simple one. If a carefully
chosen standard regulatory control layer leads
to a close-to-optimal operation, there is no need
for optimizing control. If the disturbances that
affect profitability and cannot be handled well
by the regulatory layer (in terms of economic
performance) are slow, the combination of
regulatory control and RTO is sufficient. In a more
dynamic situation or for complex nonlinear
multivariable plants, the idea of direct performance
optimizing control should be explored and
implemented if significant gains can be realized in
simulations.


Cross-References 

► Control and Optimization of Batch 
Processes 

► Control Hierarchy of Large Processing Plants: 
An Overview 

► Control Structure Selection 

► Economic Model Predictive Control 

► Extended Kalman Filters 

► Model-Predictive Control in Practice 

► Moving Horizon Estimation 

► Particle Filters 

► Real-Time Optimization of Industrial Processes

Abstract

This entry describes how models can be formed
from the basic principles of physics and the other
fields of science. Use can be made of similarities
between different domains, which leads to the
concepts of bond graphs and, more abstractly, to
port-controlled Hamiltonian systems. The class
of models is naturally extended to differential
algebraic equation (DAE) models. The concepts
described here form a natural basis for parameter
identification in gray box models.


Keywords

Bond graph; Differential algebraic equation
(DAE); Differential algebra; Gray box model;
Hamiltonian; Physical analogy; Physical
modeling


Introduction

The approach to the modeling of dynamic
systems depends on how much is known about
the system. When the internal mechanisms are
known, it is natural to model them using known
relationships from physics, chemistry, biology,
etc. Often the result is a model of the following
form:

\[
\frac{dx}{dt} = f(x, u; \theta), \quad y = h(x, u; \theta) \tag{1}
\]

where $u$ is the input, $y$ is the output, and the
state $x$ contains internal physical variables, while
$\theta$ contains parameters. Typically all of these
are vectors. The model is known as a state
space model. In many cases some elements in
$\theta$ are unknown and have to be determined using
parameter estimation. When used in connection
with system identification, these models are
sometimes referred to as gray box models (in
contrast to black box models) to indicate that
some degree of physical knowledge is assumed.
In ► System Identification: An Overview, various
connections between physical models and
parameter estimation are discussed.


Overview of Physical Modeling 

Since modeling covers such a wide variety of
physical systems, there are no universal systematic
principles. However, a few concepts have
wide application. One of them is the preservation
of certain quantities like energy, leading to
balance equations. A simple example is given by
the heating of a body. If $W$ is the energy stored
as heat, $P_1$ an external power input, and $P_2$ the
heat loss to the environment per time unit, energy
balance gives

\[
\frac{dW}{dt} = P_1 - P_2 \tag{2}
\]

To get a complete model, one needs also
constitutive relations, i.e., relations between relevant
physical variables. For instance, one might know
that the stored energy is proportional to the
temperature $T$, $W = CT$, and that the energy loss is
from black body radiation, $P_2 = kT^4$. The model
is then

\[
C \frac{dT}{dt} = P_1 - kT^4 \tag{3}
\]

The model is now an ordinary differential equation
with state variable $T$, input variable $P_1$, and
parameters $C$ and $k$.
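
Model (3) is simple enough to simulate directly. The following sketch integrates it with the explicit Euler method. The function name and the parameter values (a heat capacity of 1000 J/K, a radiation coefficient equal to the Stefan-Boltzmann constant for a unit black surface, and 1 kW of heating power) are assumptions chosen for illustration only.

```python
def simulate_heated_body(C, k, P1, T0, dt, steps):
    """Integrate C*dT/dt = P1 - k*T**4 with the explicit Euler method."""
    T = T0
    history = [T]
    for _ in range(steps):
        T += dt * (P1 - k * T**4) / C
        history.append(T)
    return history


# Assumed values: C in J/K, k in W/K^4, P1 in W, temperatures in K.
history = simulate_heated_body(C=1000.0, k=5.67e-8, P1=1000.0,
                               T0=300.0, dt=1.0, steps=2000)
T_steady = (1000.0 / 5.67e-8) ** 0.25   # from P1 = k*T**4
```

The simulated temperature approaches the steady state obtained by setting the right-hand side of (3) to zero, which is a quick consistency check on both the balance equation and the constitutive relations.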


Physical Analogies and General 
Structures 

Physical Analogies 

Physicists and engineers have noted that modeling
in different areas of physics often gives
very similar models. The term “analogies” is
often used in modeling to describe this fact. Here
we will show some analogies between electrical
and mechanical phenomena. Consider the electric
circuit given in Fig. 1. An ideal voltage source is
connected in series with an inductor, a resistor,
and a capacitor. Using $u$ and $v$ to denote the
voltages over the voltage source and capacitor,
respectively, and $i$ to denote the current, a
mathematical model is

\[
\begin{aligned}
C \frac{dv}{dt} &= i \\
L \frac{di}{dt} + Ri + v &= u
\end{aligned}
\tag{4}
\]

The first equation uses the definition of capacitance
and the second one uses Kirchhoff’s voltage
law. Compare this to the mechanical system of
Fig. 2 where an external force $F$ is applied to
a mass $m$ that is also connected to a damper $b$
and a spring with spring constant $k$. If $S$ is the
elongation force of the spring and $w$ the velocity
of the mass, a system model is

\[
\begin{aligned}
\frac{dS}{dt} &= kw \\
m \frac{dw}{dt} + bw + S &= F
\end{aligned}
\tag{5}
\]

Modeling of Dynamic Systems from First Principles,
Fig. 2 Mechanical system

Here the first equation uses the definition of
spring constant and the second one uses Newton’s
2nd law. The models are seen to be the
same with the following correspondences between
time-varying quantities

\[
u \leftrightarrow F, \quad i \leftrightarrow w, \quad v \leftrightarrow S \tag{6}
\]

and between parameters

\[
C \leftrightarrow 1/k, \quad L \leftrightarrow m, \quad R \leftrightarrow b \tag{7}
\]

Note that the products (voltage) × (current) and
(force) × (velocity) give the power.

Bond Graphs

The bond graph is a tool to do systematic modeling
based on the analogies of the previous section.
The basic element is the bond

      e
   ───────⇀
      f

formed by a half arrow showing the direction of
positive energy flow. Two variables are associated
with the bond, the effort variable $e$ and the flow
variable $f$. The product $ef$ of these variables
gives the power. In the electric domain $e$ is
voltage and $f$ is current. For mechanical systems
$e$ is force, while $f$ is velocity. Bond graph theory
has three basic components to describe storage
and dissipation of energy. The relations

\[
\alpha \frac{de}{dt} = f, \quad \beta \frac{df}{dt} = e, \quad \gamma f = e \tag{8}
\]

are known as C, I, and R elements, respectively.
Input signals are modeled by elements called
effort sources $S_e$ or flow sources $S_f$, respectively.
A bond graph describes the energy flow between
these elements. When the energy flow is split, it
can either be at s junctions where the flows are
equal and the efforts are added or at a p junction
where efforts are equal and flows are additive.
The model (5), for instance, can be described
by the bond graph in Fig. 3. The graph shows
how the energy from the external force is split
into the acceleration of the mass, the elongation
of the spring, and dissipation into the damper. The
splitting of the energy flow is accomplished by an
s element, meaning that the velocity is the same
for all elements but that the forces are added:

\[
F = N + S + T \tag{9}
\]

Modeling of Dynamic Systems from First Principles,
Fig. 3 Bond graph for mechanical or electric system

Here $T$ and $N$ denote the forces associated with
the damper and the mass, respectively. From (8)
it follows that

\[
\frac{1}{k} \frac{dS}{dt} = w, \quad m \frac{dw}{dt} = N, \quad bw = T \tag{10}
\]

Together (9) and (10) give the same model as (5).
Using the correspondences (6), (7) it is seen that
the same bond graph can also represent the electric
circuit (4). An overview of bond graph modeling
is given in Rosenberg and Karnopp (1983).
A general overview of modeling, including bond
graphs and the connection with identification, can
be found in Ljung and Glad (1994b).

Port-Controlled Hamiltonian Systems

Many physical processes can be modeled as
Hamiltonian systems. This means that there are
state variables $x$, a scalar function $H$, and a skew
symmetric matrix $M$ so that the system dynamics
is

\[
\frac{dx}{dt} = M \nabla H(x) \tag{11}
\]

The function $H$ is called the Hamiltonian of the
system. To be useful in a control context, this
model class has to be extended to handle inputs
and dissipation phenomena. To give an example
the mechanical system used above is considered
again.

Introduce $x_1$ as the length of the spring so that
$dx_1/dt = w$. If $H_1$ is the energy stored in the
spring, then the following relations hold:

\[
H_1(x_1) = \frac{k x_1^2}{2}, \quad \frac{\partial H_1}{\partial x_1} = k x_1 = S \tag{12}
\]

Introducing $x_2$ for the momentum and $H_2$ as the
kinetic energy, one has $dx_2/dt = N$ and

\[
H_2(x_2) = \frac{x_2^2}{2m}, \quad \frac{\partial H_2}{\partial x_2} = \frac{x_2}{m} = w \tag{13}
\]

Let $H = H_1 + H_2$ be the total energy. Then the
following relation holds:

\[
\frac{dx}{dt} = \left( \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}
  - \begin{bmatrix} 0 & 0 \\ 0 & b \end{bmatrix} \right)
  \begin{bmatrix} \partial H/\partial x_1 \\ \partial H/\partial x_2 \end{bmatrix}
  + \begin{bmatrix} 0 \\ 1 \end{bmatrix} F \tag{14}
\]

This model is a special case of

\[
\frac{dx}{dt} = (M - R) \nabla H(x) + Bu \tag{15}
\]

where $M$ is a skew symmetric and $R$ a nonnegative
definite matrix, respectively. The model type
is called a port-controlled Hamiltonian system
with dissipation. Without external input ($B = 0$)
and dissipation ($R = 0$), it reduces to an
ordinary Hamiltonian system of the form (11).
For systems generated by simple bond graphs, it
can be shown that the junction structure gives the
skew symmetric $M$, while the R elements give
the matrix $R$. The storage of energy in I and C
elements is reflected in $H$. The reader is directed
to Duindam et al. (2009) for a description of the
port Hamiltonian approach to modeling.
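
The port-controlled Hamiltonian form of the mass-spring-damper example can be checked numerically. The sketch below evaluates the right-hand side (M − R)∇H + BF for the two-state model and the resulting energy rate, which should equal the supplied power wF minus the dissipated power bw². Function names and test values are illustrative assumptions. By the correspondences between electrical and mechanical quantities, the same code describes the series circuit if k is replaced by 1/C, m by L, b by R, and F by u.

```python
def grad_H(x, k, m):
    """Gradient of H(x) = k*x1**2/2 + x2**2/(2*m): returns (S, w)."""
    x1, x2 = x
    return (k * x1, x2 / m)


def pch_rhs(x, F, k, m, b):
    """Right-hand side dx/dt = (M - R) grad H(x) + B*F for the example."""
    g = grad_H(x, k, m)
    M = ((0.0, 1.0), (-1.0, 0.0))   # skew-symmetric interconnection
    R = ((0.0, 0.0), (0.0, b))      # nonnegative definite dissipation
    B = (0.0, 1.0)                  # the force acts on the momentum
    return tuple(
        sum((M[i][j] - R[i][j]) * g[j] for j in range(2)) + B[i] * F
        for i in range(2)
    )


def energy_rate(x, F, k, m, b):
    """dH/dt = grad H . dx/dt along the trajectory."""
    g = grad_H(x, k, m)
    dx = pch_rhs(x, F, k, m, b)
    return g[0] * dx[0] + g[1] * dx[1]
```

Because M is skew symmetric it drops out of the energy rate, which is why the stored energy can only change through the damper and the external port.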


Component-Based Models and 
Modeling Languages 

Since engineering systems are usually assembled
from components, it is natural to treat their
mathematical models in the same way. This is the idea
behind block-oriented models where the output
of one model is connected to the input of another
one:

   u1 → [block 1] → y1 = u2 → [block 2] → y2

A nice feature of this block connection is that
the state space description is preserved. Suppose
the individual models are of the form (1)

\[
\frac{dx_i}{dt} = f_i(x_i, u_i), \quad y_i = h_i(x_i, u_i), \quad i = 1, 2 \tag{16}
\]

Then the connection $u_2 = y_1$ immediately gives
the state space model

\[
\frac{d}{dt} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
  = \begin{bmatrix} f_1(x_1, u_1) \\ f_2(x_2, h_1(x_1, u_1)) \end{bmatrix},
  \quad y_2 = h_2(x_2, h_1(x_1, u_1)) \tag{17}
\]

with input $u_1$, output $y_2$, and state $(x_1, x_2)$.
This fact is the basis of block-oriented modeling
and simulation tools like the MATLAB-based
Simulink. Unfortunately the preservation
of the state space structure does not extend to
more general connections of systems. Consider,
for instance, two pieces of rotating machinery
described by

\[
J_i \frac{d\omega_i}{dt} = -b_i \omega_i + M_i, \quad i = 1, 2 \tag{18}
\]

where $\omega_i$ is the angular velocity, $M_i$ the external
torque, $J_i$ the moment of inertia, and $b_i$ the
damping coefficient. Suppose the pieces are joined
together so that they rotate with the same angular
velocity. The mathematical model would then be

\[
\begin{aligned}
J_1 \frac{d\omega_1}{dt} &= -b_1 \omega_1 + M_1 \\
J_2 \frac{d\omega_2}{dt} &= -b_2 \omega_2 + M_2 \\
\omega_1 &= \omega_2 \\
M_1 &= -M_2
\end{aligned}
\tag{19}
\]

This is no longer a state space model of the
form (1), but a mixture of dynamic and static
relationships, usually referred to as a differential
algebraic equation (DAE). The difference from
the connection of blocks in block diagrams is
that now the connection is not between an input
and an output. Instead there are the equations
$\omega_1 = \omega_2$ and $M_1 = -M_2$ that destroy the state
space structure. There exist modeling languages
like Modelica (Fritzson 2000; Tiller 2001) or
SimMechanics in MATLAB (MathWorks 2002)
that accept this more general type of model. It
is then possible to form model libraries of basic
components that can be interconnected in very
general ways to form models of complex systems.
However, this more general structure poses some
challenges when it comes to analysis and simulation
that are described in the next section.
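
The contrast between the two kinds of connection can be made concrete in a short sketch. The first function composes two state-space blocks in series, which again yields a pair of functions of state-space form. The second handles the rigidly coupled rotors: here the coupling torque is not an input but an unknown, and it has to be eliminated by hand by adding the two differential equations. The helper names and the numerical values are assumptions made for illustration.

```python
def series(f1, h1, f2, h2):
    """Series connection u2 = y1 of two state-space blocks."""
    def f(x, u1):
        x1, x2 = x
        return (f1(x1, u1), f2(x2, h1(x1, u1)))

    def h(x, u1):
        x1, x2 = x
        return h2(x2, h1(x1, u1))

    return f, h


def coupled_rotors(J1, b1, J2, b2, omega):
    """Rigid coupling: omega1 = omega2 = omega and M1 = -M2.

    Adding the two differential equations removes the torques and gives
    the common acceleration; the first equation then yields M1.
    """
    domega = -(b1 + b2) * omega / (J1 + J2)
    M1 = J1 * domega + b1 * omega
    return domega, M1


# Two assumed first-order blocks connected in series
f, h = series(lambda x, u: -x + u, lambda x, u: x,
              lambda x, u: -2.0 * x + u, lambda x, u: 3.0 * x)
dx = f((1.0, 2.0), 0.5)
y2 = h((1.0, 2.0), 0.5)

# Two assumed rotors joined on a common shaft
domega, M1 = coupled_rotors(1.0, 1.0, 3.0, 1.0, omega=2.0)
```

In the series case the composition is purely mechanical and could be automated for any pair of blocks. In the coupled case the elimination step depends on the structure of the equations, which is the work that equation-based modeling tools take over.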


Differential Algebraic Equations 
(DAE) 

This model (19) is a special case of the general
differential algebraic equation

\[
F(dz/dt, z, u) = 0 \tag{20}
\]

A good description of both theory and numerical
properties of such equations is given in Kunkel
and Mehrmann (2006). In many cases it is possible
to split the variables and equations in such a
way that the following structure is achieved:

\[
F_1(dz_1/dt, z_1, z_2, u) = 0, \quad F_2(z_1, z_2, u) = 0 \tag{21}
\]

If $z_2$ can be solved from the second equation
and substituted into the first one, and if $dz_1/dt$
can then be solved from the first equation, the
problem is reduced to an ordinary differential
equation in $z_1$. Often, however, the situation is
not as simple as that. For the example (19) an
addition of the first two equations gives

\[
(J_1 + J_2) \frac{d\omega_1}{dt} = -(b_1 + b_2)\, \omega_1 \tag{22}
\]

which is a standard first-order system description.
Note, however, that in order to arrive at this result,
the relation $\omega_1 = \omega_2$ has to be differentiated. This
DAE thus includes an implicit differentiation.

In the general case one can investigate how many
times (20) has to be differentiated in order to get
an explicit expression for $dz/dt$. This number is
called the (differentiation) index. Both theoretical
analysis and practical experience show that the
numerical difficulties encountered when solving
a DAE increase with increasing index; see, e.g.,
the classical reference Brenan et al. (1987). It
turns out that mechanical systems in particular
give high-index models when constructed by
joining components, and this has been an obstacle
to the use of DAE models. For linear DAE models
the role of the index can be seen more easily. A
linear model is given by

\[
E \frac{dz}{dt} + Fz = Gu \tag{23}
\]

where the matrix $E$ is singular (if $E$ is invertible,
multiplication with $E^{-1}$ from the left gives an
ordinary differential equation). The system can
be transformed by multiplying with $P$ from the
left and changing variables with $z = Qw$ ($P$, $Q$
nonsingular matrices). The transformed model is
now

\[
PEQ \frac{dw}{dt} + PFQ w = PGu \tag{24}
\]

If $\lambda E + F$ is nonsingular for some value of the
scalar $\lambda$, then it can be shown that there is a
choice of $P$ and $Q$ such that (24) takes the form

\[
\begin{bmatrix} I & 0 \\ 0 & N \end{bmatrix}
\begin{bmatrix} dw_1/dt \\ dw_2/dt \end{bmatrix}
+ \begin{bmatrix} -A & 0 \\ 0 & I \end{bmatrix}
\begin{bmatrix} w_1 \\ w_2 \end{bmatrix}
= \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} u \tag{25}
\]

where $N$ is a nilpotent matrix, i.e., $N^k = 0$ for
some positive integer $k$. The smallest such $k$ turns
out to be the index of the DAE. The transformed
model (25) thus contains an ordinary differential
equation:

\[
\frac{dw_1}{dt} = A w_1 + B_1 u \tag{26}
\]

Using the nilpotency of $N$, the equation for $w_2$
can be rewritten:

\[
w_2 = B_2 u - N B_2 \frac{du}{dt} + \cdots + (-N)^{k-1} B_2 \frac{d^{k-1} u}{dt^{k-1}} \tag{27}
\]

This expression shows that an index $k > 1$
implies differentiation of the input (unless $N B_2$
happens to be zero). This in turn implies potential
difficulties, e.g., if $u$ is a measured signal.

Identification of DAE Models

The extended use of DAE models in modern
modeling tools also means that there is a need
to use these models in system identification. To
fully use system identification theory, one needs
a stochastic model of disturbances. The inclusion
of such disturbances leads to a class of models
described as stochastic differential algebraic
equations. The treatment of such models leads to some
interesting problems. In the previous section it
was seen that DAE models often contain implicit
differentiations of external signals. If a DAE
model is to be well posed, this differentiation
must not affect signals modeled as white noise.
In Gerdin et al. (2007), conditions are given that
guarantee that stochastic DAEs are well posed.
There it is also described how a maximum
likelihood estimate can be made for DAE models,
laying the basis for parameter estimation.

Differential Algebra

For the case where models consist of polynomial
equations, it is possible to manipulate them in
a very systematic way. The model (20) is then
generalized to

\[
F(d^n z/dt^n, \ldots, dz/dt, z) = 0 \tag{28}
\]

where $z$ is now a vector containing an arbitrary
mix of inputs, outputs, and internal variables.
There is then a theory based on Ritt (1950) that
allows the transformation of (28) to a standard form
where the properties of the system can be easily
determined. The process is similar to the use of
Gröbner bases but also includes the possibility of
differentiating equations. Of particular interest to
identification is the possibility of determining the
identifiability of parameters with these tools. The
model is then of the form

\[
F(d^m y/dt^m, \ldots, dy/dt, y, d^n z/dt^n, \ldots, dz/dt, z; \theta) = 0 \tag{29}
\]

where $y$ contains measured signals, $z$ contains
unmeasured variables, and $\theta$ is a vector of
parameters to be identified, while $F$ is a vector of
polynomials in these variables. It was shown in
Ljung and Glad (1994a) that there is an algorithm
giving for each parameter $\theta_k$ a polynomial:

\[
g_k(d^m y/dt^m, \ldots, dy/dt, y; \theta_k) = 0 \tag{30}
\]

This relation can be regarded as a polynomial
in $\theta_k$ where all coefficients are expressed in
measured quantities. The local or global
identifiability will then be determined by the number of
solutions. If $\theta_k$ is unidentifiable, then no equation
of the form (30) will exist, and this fact will also
be demonstrated by the output of the algorithm.
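
How the number of solutions decides between local and global identifiability can be seen in a small worked illustration. The two first-order models below are assumed toy examples with a single measured signal and no unmeasured variables; they are not output of the cited algorithm.

```latex
\begin{align*}
\frac{dy}{dt} = -\theta\, y
  &\;\Longrightarrow\; g(\dot y, y; \theta) = \dot y + \theta\, y = 0
  &&\text{(one solution } \theta = -\dot y / y \text{ for } y \neq 0\text{)} \\
\frac{dy}{dt} = -\theta^{2} y
  &\;\Longrightarrow\; g(\dot y, y; \theta) = \dot y + \theta^{2} y = 0
  &&\text{(two solutions } \theta = \pm\sqrt{-\dot y / y}\text{)}
\end{align*}
```

In the first model the polynomial is linear in the parameter, so it is globally identifiable. In the second, a parameter value and its negative produce exactly the same output, so the parameter can be determined only locally.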

Summary and Future Directions 

There is no general method to derive models from 
first principles. However, modeling techniques 
based on bond graphs or port-controlled Hamilto¬ 
nian systems offer a systematic approach for large 
model classes. Modeling languages like Modelica 
make the practical work with modeling much 
easier. A fundamental problem that comes up is
that models are not necessarily in state space form
but are so-called differential algebraic equation
(DAE) models. Much of the future work is expected
to deal with the handling of DAE models
and with the development of modeling languages.

Cross-References 

► Nonlinear System Identification: An Overview 
of Common Approaches 

► System Identification: An Overview 

Recommended Reading 

A classical book on physical modeling is Rosenberg and Karnopp (1983) with emphasis on bond graph techniques. The physical modeling and identification perspectives are tied together in Ljung and Glad (1994b). A good reference for Hamiltonian techniques is Duindam et al. (2009). The Modelica modeling language is treated in Tiller (2001) and Fritzson (2000). The former emphasizes the physical modeling point of view; the latter also gives details of the language itself.

Abstract

Petri net is a generic term used to designate a broad family of related formalisms for discrete event views of (dynamic) systems (DES), all sharing some basic relevant features, such as minimality in the number of primitives, locality of the states and actions (with consequences for model construction), or temporal realism. The global state of a system is obtained by the juxtaposition of the different local states. We should initially distinguish between autonomous formalisms and those extended by interpretation. Models in the latter group are obtained by restricting the underlying autonomous behaviors by means of constraints that can be related to different kinds of external events, in particular to time. This article first describes place/transition nets (PT-nets), by default simply called Petri nets (PNs). Other formalisms are then mentioned. As a system theory modeling paradigm for concurrent DES, Petri nets are used in a wide variety of application fields.

Keywords 

Condition/event nets (CE-nets); Continuous Petri 
nets (CPNs); Diagrams; Fluidization; Grafcet; 
Hybrid Petri nets (HPNs); Marking Petri nets; 
Place/transition nets (PT-nets); High-level Petri 
nets (HLPNs) 

Introduction 

Petri nets (PNs) are able to model concurrent and distributed DES (► Models for Discrete Event Systems: An Overview). They constitute a powerful family of formalisms with different expressive purposes and power. They may be applied to, inter alia, modeling, logical analysis, performance evaluation, parametric optimization, dynamic control (minimum makespan, supervisory control, or other kinds), diagnosis, and implementation issues (possibly fault tolerant). Hybrid and continuous PNs are particularly useful when some parts of the system are highly populated. Being multidisciplinary, formalisms belonging to the Petri nets paradigm may cover several phases of the life cycle of complex DES.

A Petri net can be represented as a bipartite directed graph provided with arc inscriptions; alternatively, this structure can be represented in algebraic form using some matrices. As in the case of differential equations, an initial condition or state should be defined in order to represent a dynamic system. This is done by means of an initial distributed state. The English translation of Carl Adam Petri’s seminal work, presented in 1962, is Petri (1966).


Untimed Place/Transition Net Systems

A place/transition net (PT-net) can be viewed as $\mathcal{N} = (P, T, Pre, Post)$, where:

• P and T are disjoint and finite nonempty sets of places and transitions, respectively.

• Pre and Post are $|P| \times |T|$ sized, natural-valued (zero included), incidence matrices. The net is said to be ordinary if Pre and Post are valued on {0,1}. Weighted arcs permit the abstract modeling of bulk services and arrivals.

A PT-net is a structure. The Pre (Post) function defines the connections from places to transitions (transitions to places). Those two functions can alternatively be defined as weighted flow relations (nets as graphs). Thus, PT-nets can be represented as bipartite directed graphs with places (p, using circles) and transitions (t, using bars or rectangles) as nodes: $\mathcal{N} = (P, T, F, W)$, where $F \subseteq (P \times T) \cup (T \times P)$ is the flow relation (set of directed arcs, with $\mathrm{dom}(F) \cup \mathrm{range}(F) = P \cup T$), and $W : F \to \mathbb{N}^{+}$ assigns a natural weight to each arc.

The net structure represents the static part of the DES model. Furthermore, a “distributed state” is defined over the set of places, known as the marking. This is “numerically quantified” (not in an arbitrary alphabet, as in automata), associating natural values to the local state variables, the places. If a place p has a value v (m(p) = v), it is said to have v tokens (frequently depicted in graphic terms with v black dots or just the number inside the place). The places are “state variables,” while the markings are their “values”; the global state is defined through the concatenation of local states. The net structure, provided with an initial marking, to be denoted as $(\mathcal{N}, m_0)$, is a Petri net system, or marked Petri net.

Modeling, Analysis, and Control with Petri Nets, Fig. 1 Most basic PN constructions: The logical OR is present around places, in choices (or branches) and attributions (or meets); the logical AND is formed around transitions, in joins (or waits or rendezvous, RV) and forks (or splits)

Modeling, Analysis, and Control with Petri Nets, Fig. 2 Only transitions b and c are initially enabled. The results of firing b or c are shown subsequently

The last two basic PN constructions in Fig. 1 (join and fork) do not appear in finite-state machines; moreover, the arcs may be valued with natural numbers. The dynamic behavior of the net system (trajectories with changes in the marking) is produced by the firing of transitions, “local operations” that follow very simple rules.

Markings in net systems evolve according to
the following firing (or occurrence) rules (see
Fig. 2):

• A transition is said to be enabled at a given 
marking if each input place has at least as 
many tokens as the weight of the arc joining 
them. 

• The firing or occurrence of an enabled transition is an instantaneous operation that removes from (adds to) each input (output) place a number of tokens equal to the weight of the arc joining the place (transition) to the transition (place).
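
The two rules above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the two-place net, the names `PRE`, `POST`, `enabled`, and `fire`, and the list-of-lists matrix layout are all assumptions introduced here.

```python
# Hypothetical two-place, two-transition net (not taken from the figures).
# PRE[p][t]  = weight of the arc from place p to transition t (0 if none).
# POST[p][t] = weight of the arc from transition t to place p (0 if none).
PRE = [[1, 0],   # p0 feeds t0 with weight 1
       [0, 2]]   # p1 feeds t1 with weight 2
POST = [[0, 1],  # t1 puts 1 token into p0
        [2, 0]]  # t0 puts 2 tokens into p1


def enabled(pre, marking, t):
    """True if every input place holds at least the arc weight in tokens."""
    return all(marking[p] >= pre[p][t] for p in range(len(marking)))


def fire(pre, post, marking, t):
    """Return the marking reached by firing t; the input is not modified."""
    if not enabled(pre, marking, t):
        raise ValueError(f"transition {t} is not enabled at {marking}")
    return [marking[p] - pre[p][t] + post[p][t] for p in range(len(marking))]


m0 = [1, 0]
m1 = fire(PRE, POST, m0, 0)   # t0 consumes 1 token from p0, produces 2 in p1
m2 = fire(PRE, POST, m1, 1)   # t1 consumes 2 tokens from p1, produces 1 in p0
print(m1, m2)                 # [0, 2] [1, 0]
```

Returning a new list instead of mutating the marking keeps earlier markings available, which is convenient when several enabled transitions are explored from the same state.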

The precondition of a transition can be seen as the resources required for the transition to be fired. The weight of the arc from a place to a transition represents the number of resources to be consumed. The post-condition defines the number of resources produced by the firing of the transition. This is made explicit by the weights of the arcs from the transition to the places. Three important observations should be taken into account:

• The underlying logic in the firing of a transition is non-monotonic! It is a consumption/production logic.

• Enabled transitions are never forced to fire: This is a form of non-determinism.

• An occurrence sequence is a sequence of fired transitions $\sigma = t_1 \ldots t_k$. In the evolution from $m_0$, the reached marking $m$ can be easily computed as:

$$m = m_0 + C \cdot \boldsymbol{\sigma}, \tag{1}$$

where $C = Post - Pre$ is the token flow matrix (incidence matrix if $\mathcal{N}$ is self-loop free) and $\boldsymbol{\sigma}$ the firing count vector corresponding to $\sigma$. Thus $m$ and $\boldsymbol{\sigma}$ are vectors of natural numbers. The previous equation is the state-transition equation (frequently known as the fundamental or, simply, state equation). Nevertheless, two important remarks should be made:

• It represents a necessary but not sufficient condition for reachability; the problem is that the existence of a $\boldsymbol{\sigma}$ does not guarantee that a corresponding sequence $\sigma$ is firable from $m_0$; thus, certain solutions - called spurious (Silva et al. 1998) - are not reachable. This implies that - except in certain net system subclasses - only semi-decision algorithms can usually be derived.

• All variables are natural numbers, which implies computational complexity.
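
The gap between the state equation and true reachability can be made concrete with a small sketch. The four-place net below is a hypothetical example built for this purpose (it does not appear in the figures), and the helper names are assumptions. The code evaluates the state equation for a firing count vector and compares the result with the set of markings found by exhaustive search.

```python
# Hypothetical self-loop-free net with 4 places and 2 transitions:
#   t0: p0 + p1 -> p2        t1: p2 -> p0 + p3
PRE = [[1, 0], [1, 0], [0, 1], [0, 0]]
POST = [[0, 1], [0, 0], [1, 0], [0, 1]]
N_PLACES, N_TRANS = 4, 2

# Token flow matrix C = Post - Pre.
C = [[POST[p][t] - PRE[p][t] for t in range(N_TRANS)] for p in range(N_PLACES)]


def state_equation(m0, sigma):
    """m = m0 + C . sigma for a firing count vector sigma."""
    return [m0[p] + sum(C[p][t] * sigma[t] for t in range(N_TRANS))
            for p in range(N_PLACES)]


def reachable_markings(m0, limit=1000):
    """Exhaustive search; only adequate for small bounded systems."""
    seen, frontier = {tuple(m0)}, [tuple(m0)]
    while frontier and len(seen) < limit:
        m = frontier.pop()
        for t in range(N_TRANS):
            if all(m[p] >= PRE[p][t] for p in range(N_PLACES)):
                nxt = tuple(m[p] + C[p][t] for p in range(N_PLACES))
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return seen


m0 = [0, 1, 0, 0]                 # p0 is empty, so t0 can never start
m = state_equation(m0, [1, 1])    # pretend t0 and t1 each fire once
reach = reachable_markings(m0)
print(m, tuple(m) in reach)       # [0, 0, 0, 1] False
```

The vector `[0, 0, 0, 1]` is a nonnegative integer solution of the state equation, yet no transition is enabled at the initial marking, so it is a spurious solution in the sense described above.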

It should be pointed out that in finite-state machines,
the state is a single variable taking values
in a symbolic unstructured set, while in PT-net 
systems, it is structured as a vector of nonnegative 
integers. This allows analysis techniques that do 
not require the enumeration of the state space. 

At a structural level, observe that the negation is missing in Fig. 1; its inclusion leads to the so-called inhibitor arcs, an extension in expressive power. In its most basic form, if the place at the origin of an inhibitor arc is marked, it “inhibits” the enabling of the target transition. PT-net systems can model infinite-state systems, but not Turing machines. PT-net systems provided with inhibitor arcs (or priorities on the firing of transitions) can.

With this conceptually simple formalism, it is not difficult to express basic synchronization schemas (Fig. 3). All the illustrated examples use joins. When weights are allowed in the arcs, another kind of synchronization appears: Several copies of the same resource are needed (or produced) in a single operation. Because concurrency and synchronization can be expressed, it is possible to build cooperation and competition relationships when viewing the system at a higher level.

Analysis and Control of Untimed PT Models

The behavior of a concurrent (possibly distributed) system is frequently difficult to understand and control. Thus, misunderstandings and mistakes are frequent during the design cycle. A way of cutting down the cost and duration of the design process is to express in a formalized way properties that the system should enjoy and to use formal proof techniques. Errors can then be detected close to the moment they are introduced, reducing their propagation to subsequent stages. The goal in verification is to ensure that a given system is correct with respect to its specification (perhaps expressed in temporal-logic terms) or to a certain set of predetermined properties.

Among the most basic qualitative properties of “net systems” are the following: (1) reachability of a marking from a given one; (2) boundedness, characterizing finiteness of the state space; (3) liveness, related to potential fireability of all transitions starting on an arbitrary reachable marking (deadlock-freeness is a weaker condition in which only global infinite fireability of the net system model is guaranteed, even if some transitions no longer fire); (4) reversibility, characterizing recoverability of the initial marking from any reachable one; and (5) mutual exclusion of two places, dealing with the impossibility of reaching markings in which both places are simultaneously marked.
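
For small bounded systems, several of these properties can be checked by brute force on the set of reachable markings. The sketch below assumes a hypothetical two-process mutual exclusion net (place and transition names are invented here) and a node budget `max_nodes` as a crude guard against unbounded systems; it is an illustration, not an efficient algorithm.

```python
from collections import deque

# Hypothetical two-process mutual exclusion net.
# Places:      0 idle1, 1 crit1, 2 idle2, 3 crit2, 4 R (shared resource)
# Transitions: 0 enter1, 1 exit1, 2 enter2, 3 exit2
PRE = [[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1],
       [1, 0, 1, 0]]
POST = [[0, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0],
        [0, 1, 0, 1]]


def reachability_graph(pre, post, m0, max_nodes=10_000):
    """Breadth-first construction: marking -> list of (transition, successor)."""
    n_places, n_trans = len(pre), len(pre[0])
    graph, queue = {}, deque([tuple(m0)])
    while queue:
        m = queue.popleft()
        if m in graph:
            continue
        if len(graph) >= max_nodes:
            raise RuntimeError("node budget exceeded; system may be unbounded")
        graph[m] = []
        for t in range(n_trans):
            if all(m[p] >= pre[p][t] for p in range(n_places)):
                nxt = tuple(m[p] - pre[p][t] + post[p][t] for p in range(n_places))
                graph[m].append((t, nxt))
                queue.append(nxt)
    return graph


graph = reachability_graph(PRE, POST, [1, 0, 1, 0, 1])
deadlock_free = all(graph[m] for m in graph)       # every marking has a successor
mutex = all(not (m[1] and m[3]) for m in graph)    # crit1, crit2 never both marked
print(len(graph), deadlock_free, mutex)            # 3 True True
```

Termination of the search within the budget shows boundedness for this initial marking only; liveness and reversibility would need additional graph analysis (for example, strongly connected components), which is omitted here.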

All the above are behavioral properties, which depend on the net system $(\mathcal{N}, m_0)$. In practice, sometimes problems with a net model are rooted in the net structure; thus, the study of the structural counterpart of certain behavioral properties may be of interest. For example, a “net” is structurally bounded if it is bounded for any initial marking; a “net” is structurally live if an initial marking exists that makes the net system live (otherwise stated, structural non-liveness reflects non-liveness for arbitrary initial markings, a pathology of the net).

Modeling, Analysis, and Control with Petri Nets, Fig. 3 Basic synchronization schemes: (1) Join or rendezvous, RV; (2) Semaphore, S; (3) Mutual exclusion semaphore (mutex), R, representing a shared resource; (4) Symmetric RV built with two semaphores; (5) Asymmetric RV built with two semaphores (master/slave); (6) Fork-join (or par-begin/par-end); (7) Non-recursive subprogram (places i and j cannot be simultaneously marked - must be in mutex - to remember the returning point; for simplicity, it is assumed that the subprogram is single input/single output); (8) Guard (a self-loop from a place through a transition); its role is like a traffic light: If at least one token is present at the place, it allows the evolution, but it is not consumed. Synchronizations can also be modeled by the weights associated to the arcs going to transitions

Basic techniques to analyze net systems include: (1) enumeration, in its most basic form based on the construction of a reachability graph (a sequentialized view of the behavior). If the net system is not bounded, a finite coverability graph can be constructed, at the cost of losing some information; (2) transformation, based on an iterative rewriting process in which a net system enjoys a certain property if and only if a transformed (“simpler” to analyze) one also does. If the new system is easier to analyze, and the transformation is computationally cheap, the process may be extremely interesting; (3) structural, based on graph properties, or on mathematical programming techniques rooted in the state equation; (4) simulation, particularly interesting for gaining certain confidence about the absence of certain pathological behaviors. Analysis strategies combining all these kinds of techniques are extremely useful in practice.

Reachability techniques provide sequentialized views for a particular initial marking. Moreover, they suffer from the so-called state explosion problem. The reduction of this computational issue leads to techniques such as stubborn set methods (smaller, equally informative reachability graphs), and also to non-sequentialized views such as those based on unfoldings. Transformation techniques are extremely useful, but not complete (i.e., not all net systems can be reduced in a practical way). In most cases, structural techniques only provide necessary or sufficient conditions (e.g., a sufficient condition for deadlock-freeness, a necessary condition for reversibility, etc.), but not a full characterization. As already pointed out, a limitation of methods based on the state equation for analyzing net systems is the existence of non-reachable solutions (the so-called spurious solutions). In this context, three kinds of related notions that must be differentiated are the following: (1) some natural vectors (left and right annullers of the token flow matrix, C: P-semiflows and T-semiflows), (2) some invariant laws (token conservation and repetitive behaviors), and (3) some peculiar subnets (conservative and consistent components, generated by the subsets of nodes in the P- and T-semiflows, respectively).
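
The first two notions can be illustrated with a short check. The sketch assumes a hypothetical two-process mutual exclusion net given directly by its token flow matrix; the candidate vectors `y` and `x` are supplied by hand, since computing a generating set of semiflows is beyond the scope of this example.

```python
# Hypothetical two-process mutual exclusion net, token flow matrix C = Post - Pre.
# Rows (places):         idle1, crit1, idle2, crit2, R
# Columns (transitions): enter1, exit1, enter2, exit2
C = [[-1,  1,  0,  0],
     [ 1, -1,  0,  0],
     [ 0,  0, -1,  1],
     [ 0,  0,  1, -1],
     [-1,  1, -1,  1]]


def is_p_semiflow(y, C):
    """Left annuller: y >= 0, y != 0 and y . C = 0."""
    n_trans = len(C[0])
    return (any(y) and all(v >= 0 for v in y)
            and all(sum(y[p] * C[p][t] for p in range(len(C))) == 0
                    for t in range(n_trans)))


def is_t_semiflow(x, C):
    """Right annuller: x >= 0, x != 0 and C . x = 0."""
    return (any(x) and all(v >= 0 for v in x)
            and all(sum(row[t] * x[t] for t in range(len(x))) == 0 for row in C))


y = [0, 1, 0, 1, 1]     # crit1 + crit2 + R
x = [1, 1, 0, 0]        # fire enter1 once and exit1 once
m0 = [1, 0, 1, 0, 1]
bound = sum(yi * mi for yi, mi in zip(y, m0))   # conserved weighted token count
print(is_p_semiflow(y, C), is_t_semiflow(x, C), bound)   # True True 1
```

Because `y` annuls `C` from the left, premultiplying the state equation by `y` gives m(crit1) + m(crit2) + m(R) = 1 for every solution, which is a token conservation law and proves the mutual exclusion of crit1 and crit2 without enumerating markings. The vector `x` annuls `C` from the right, so firing enter1 and exit1 once each returns to the starting marking, a repetitive behavior.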

More than analysis, control leads to synthesis problems. The idea is to constrain the given system so that it fulfills a specification (e.g., to enforce certain mutual exclusion properties). Technically speaking, the idea is to “add” some elements in order to constrain the behavior in such a way that a correct execution is obtained. Questions related to control, observation, diagnosis, or identification are all areas of ongoing research.

With respect to classical control theory, there are two main differences: Models are DES and untimed (autonomous, fully nondeterministic, possibly labeling the transitions in order to be able to consider the PN languages). Let us remark that for control purposes, the transitions should be partitioned into controllable (when enabled, you can either force or block the firing) and uncontrollable (if enabled, the firing is nondeterministic). A natural approach to synthesize a control is to start by modeling the plant dynamics (by means of a PN, P) and adopting a specification for the desired closed-loop system (S). The goal is to compute a controller (L) such that S equals the parallel composition of P and L; in other words, controllers (called “supervisors”) are designed to ensure that only behaviors consistent with the specification may occur. The previous equality is not always possible, and the goal is usually relaxed to minimally limit the behavior within the specified legality (i.e., to compute maximally permissive controllers). For an approach in the framework of finite-state machines and regular languages, see ► Supervisory Control of Discrete-Event Systems. The synthesis in the framework of Petri nets, with goals such as enforcing some generalized mutual exclusion constraints on markings or avoiding deadlocks, can be efficiently approached by means of the so-called structure theory, based on the direct exploitation of the structure of the net model (using graph or mathematical programming theories and algorithms, where the initial marking is a parameter).
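
One widely used structure-based construction, not spelled out in this entry and sketched here under stated assumptions, enforces a generalized mutual exclusion constraint w · m ≤ k by adding a single monitor place whose row in the token flow matrix is -w · C and whose initial marking is k - w · m0. The sketch assumes that all transitions are controllable and observable; the plant net and the names `WEIGHTS`, `BOUND`, and `monitor_place` are hypothetical.

```python
# Hypothetical uncontrolled plant: two independent processes, no shared resource.
# Rows (places):         idle1, crit1, idle2, crit2
# Columns (transitions): enter1, exit1, enter2, exit2
C_PLANT = [[-1,  1,  0,  0],
           [ 1, -1,  0,  0],
           [ 0,  0, -1,  1],
           [ 0,  0,  1, -1]]
M0_PLANT = [1, 0, 1, 0]

# Specification: WEIGHTS . m <= BOUND, here m(crit1) + m(crit2) <= 1.
WEIGHTS = [0, 1, 0, 1]
BOUND = 1


def monitor_place(c_plant, m0, weights, bound):
    """Token flow row and initial marking of the added monitor place."""
    n_trans = len(c_plant[0])
    row = [-sum(weights[p] * c_plant[p][t] for p in range(len(weights)))
           for t in range(n_trans)]
    init = bound - sum(w * m for w, m in zip(weights, m0))
    if init < 0:
        raise ValueError("the initial marking already violates the constraint")
    return row, init


row, init = monitor_place(C_PLANT, M0_PLANT, WEIGHTS, BOUND)
print(row, init)    # [-1, 1, -1, 1] 1
```

In this example the computed place is consumed by enter1 and enter2, restored by exit1 and exit2, and initially holds one token, so the construction recovers the familiar mutex semaphore. With uncontrollable transitions the constraint generally has to be transformed first, which this sketch does not attempt.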

Similarly, transitions (or places) can be partitioned into observable and unobservable. Many observability problems may be of interest; for example, observing the firing of a subset of transitions to compute the subset of markings in which the system may be. Related to observability, diagnosis is the process of detecting a failure (any deviation of a system from its intended behavior) and identifying the cause of the abnormality. Diagnosability, like observability or controllability, is a logical criterion. If a model is diagnosable with respect to a certain subset of possible faults (i.e., it is possible to detect the occurrence of those faults in finite time), a diagnoser can be constructed (see section “Diagnosis and Diagnosability Analysis of Petri Nets” in ► Diagnosis of Discrete Event Systems). Identification of DES is also a question that has required attention in recent years. In general, the starting point is a behavioral observation, the goal being to construct a PN model that generates the observed behavior, either from examples/counterexamples of its language or from the structure of a reachability graph. So the results are derived models, not human-made models (i.e., not made by designers).

The Petri Nets Modeling Paradigm 

Throughout the life cycle of DES, designers may deal with basic modeling, analysis, and synthesis from different perspectives together with implementation and operation issues. Thus, the designer may be interested in expressing the basic structure, understanding untimed possible behaviors, checking logic properties on the model when provided with some timing (e.g., in order to guarantee that a certain reaction is possible before 3 ms; something relevant in real-time systems), computing some performance indices on timed models (related to the throughput in the firing of a given transition, or to the length of a waiting line, expressed as the number of tokens in a place), computing a schedule or control that optimizes a certain objective function, decomposing the model in order to prepare an efficient implementation, efficiently determining redundancies in order to increase the degree of fault tolerance, etc. For these different tasks, different formalisms may be used. Nevertheless, it seems desirable to have a family of related formalisms rather than a collection of “unrelated” or weakly related formalisms. The expected advantages would include coherence among models usable in different phases, economy in the transformations, and synergy in the development of models and theories.


Other Untimed PN Formalisms: Levels of Expressive Power

PT-net systems are more powerful than condition/event (CE) systems, roughly speaking the basic seminal formalism of Carl Adam Petri in which places can be marked only with zero or one token (Boolean marking). CE-systems can model only finite-state systems. As already said, “extensions” of the expressive power of untimed PT-net systems to the level of Turing machines are obtained by adding inhibitor arcs or priorities to the firing of transitions.

An important idea is adding the notion of individuals to tokens (e.g., from anonymous to labeled or colored tokens). Information in tokens allows the objects to be named (they are no longer indistinguishable) and dynamic associations to be created. Moving from PT-nets to so-called high-level PNs (HLPNs) is something like “moving from assembler to high-level programming languages,” or, at the computational level, like “moving from pure numerical to a symbolic level.” There are many proposals in this respect, the most important being predicate/transition nets and colored PNs. Sometimes, this type of abstraction has the same theoretical expressiveness as PT-net systems (e.g., colored PNs if the number of colors is finite); in other words, high-level views may lead to more compact and structured models, while keeping the same theoretical expressive power of PT-nets (i.e., we can speak of “abbreviations,” not of “extensions”). In other cases, object-oriented concepts from computer programming are included in certain HLPNs. The analysis of HLPNs can be approached with techniques based on enumeration, transformation, or structural considerations and simulation, generalizing those developed for PT-net systems.


Extending Net Systems with External Events and Time: Nonautonomous Formalisms

When dealing with net systems that interact with some specific environment, the marking evolution rule must be slightly modified. This can be done in an enormous number of ways, considering external events and logical conditions as inputs to the net model, in particular some depending on time. The same interpretation given to a graph in order to define a finite-state diagram can be used to define a marking diagram, a formalism in which the key point is to recognize that the state is now numerical (for PT-net systems) and distributed. For example, drawing a parallel with Moore automata, the transitions should be labeled with logical conditions and events, while unconditional actions are associated to the places. If a place is marked, the associated actions are emitted.

Even if only time-based interpretations are considered, there are a large number of successful proposals for formalisms. For example, it should be specified if time is associated to the firing of transitions (T-timing may be atomic or in three phases), to the residence of tokens in places (P-timed), to the arcs of the net, as tags to the tokens, etc. Moreover, even for T-timed models, there are many ways of defining the timing: time intervals, stochastic or possibilistic forms, and the deterministic case as a particular one. If the firing of transitions follows exponential pdfs, and the conflict resolution follows the race policy (i.e., in a conflict, the transition that finishes its task first fires; this is not a preselection policy), the underlying Markov chain described is isomorphic to the reachability graph (due to the Markovian “memoryless” property). Moreover, the addition of immediate transitions (whose firing is instantaneous) enriches the practical modeling possibilities, possibly complicating the analysis techniques. Timed models are used to compute minimum and maximum time delays (when time intervals are provided, in real-time problems) or performance figures (throughput, utilization rate of resources, average number of tokens - clients - in services, etc.). For performance evaluation, there is an array of techniques to compute bounds, approximated values, or exact values, sometimes generalizing those that are used in certain queueing network classes of models. Simulation techniques are frequently very helpful in practice to produce an educated guess about the expected performance.

Time constraints on Petri nets may change logical properties of models (e.g., mutual exclusion constraints, deadlock-freeness, etc.), calling for new analysis techniques. For example, certain timings on transitions can transform a live system into a non-live one (if deterministic times are associated with the transitions of the net system in Fig. 2, together with a race policy, and the time associated with transition c is smaller than that of transition a, then transition b cannot be fired after firing transition d; thus the system is non-live, while the untimed model was live). By the addition of some time constraints, the transformation of a non-live model into a live one is also possible. So additional analysis techniques need to be considered, redefining the state, which now depends also on time and not just on the marking.

Finally, as in any DES, the optimal control of timed Petri net models (scheduling, sequencing, etc.) may be approached by techniques such as dynamic programming or perturbation analysis (presented in the context of queueing networks and Markov chains, see ► Perturbation Analysis of Discrete Event Systems). In practice, those problems are frequently approached by means of some partially heuristic strategies. For the diagnosis of timed Petri nets, see ► Diagnosis of Discrete Event Systems. Of course, all these tasks can be done with HLPNs.


Fluid and Hybrid PN Models 

Different ideas may lead to different kinds of hybrid PNs. One is to fluidize (here to relax the natural numbers of discrete markings into the nonnegative reals) the firing of transitions that are enabled “most of the time.” Then the relaxed model has discrete and continuous transitions, thus also discrete and continuous places. If all transitions are fluidized, the PN system is said to be fluid or continuous, even if technically it is a hybrid one. In this approach, the main goal is to try to overcome the state explosion problem inherent to enumeration techniques. Proceeding in that way, some computationally NP-hard problems may become much easier to solve, possibly in polynomial time. In other words, fluidization is an abstraction that tries to make tractable certain real-scale DES problems (► Discrete Event Systems and Hybrid Systems, Connections Between).

When transitions are timed with the so-called infinite server semantics, the PN system can be observed as a time differentiable piecewise affine system. Thus, even if the relaxation “simplifies” computations, it should be taken into account that continuous PNs with infinite server semantics are able to simulate Turing machines. From a different perspective, the steady-state throughput of a given transition may be non-monotonic with respect to the firing rates or the initial marking (e.g., if faster or more machines are used, the uncontrolled system may be slower); moreover, due to the important expressive power, discontinuities may even appear with respect to continuous design parameters such as firing rates.
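
A minimal numerical sketch may help to see where the piecewise affine structure comes from. It assumes the commonly used definition of infinite server semantics, in which the flow of a transition is its rate times the minimum, over its input places, of the marking divided by the arc weight; this definition, the two-place net, the rates, and the forward Euler scheme are all assumptions introduced here.

```python
# Hypothetical continuous net: a two-place cycle.
#   t0: p0 -> p1 with rate 1.0      t1: p1 -> p0 with rate 2.0
PRE = [[1, 0],
       [0, 1]]
POST = [[0, 1],
        [1, 0]]
RATES = [1.0, 2.0]


def flows(m):
    """f[t] = rate[t] * min over input places p of m[p] / Pre[p][t]."""
    result = []
    for t, rate in enumerate(RATES):
        degrees = [m[p] / PRE[p][t] for p in range(len(m)) if PRE[p][t] > 0]
        result.append(rate * min(degrees))   # assumes every transition has an input place
    return result


def simulate(m0, dt=0.001, steps=20_000):
    """Forward Euler integration of dm/dt = C . f(m), with C = Post - Pre."""
    m = list(m0)
    for _ in range(steps):
        f = flows(m)
        m = [m[p] + dt * sum((POST[p][t] - PRE[p][t]) * f[t] for t in range(len(f)))
             for p in range(len(m))]
    return m


m_final = simulate([3.0, 0.0])
print([round(v, 3) for v in m_final])   # approximately [2.0, 1.0]
```

Within a region where the same input place attains the minimum for each transition, the dynamics are linear in the marking; the `min` operator switches between such regions, which gives the piecewise affine behavior. In this example each transition has a single input place, so only one region exists and the marking settles where the two flows balance.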

An alternative way to define hybrid Petri nets is as a generalization of hybrid automata: The net system is a DES, but sets of differential equations are associated to the marking of places. If a place is marked, the corresponding differential equations contribute to define its evolution.

Summary and Future Directions 

Petri nets designate a broad family of related DES formalisms (a modeling paradigm), each one specifically tailored to approach certain problems. Conceptual simplicity coexists with powerful modeling, analysis, and synthesis capabilities. From a control theory perspective, much work remains to be done for both untimed and timed formalisms (remember, there are many different ways of timing), particularly when dealing with optimal control of timed models. In engineering practice, approaches to the latter class of problems frequently use heuristic strategies. From a broader perspective, future research directions include improvements required to deal with controllability and the design of controllers, with observability and the design of observers, with diagnosability and the design of diagnosers, and with identification. This work is not limited to the strict DES framework, but also applies to analogous problems relating to relaxations into hybrid or fluid approximations (particularly useful when high populations are considered). Systems are increasingly distributed in nature, which introduces new constraints, a subject requiring serious attention. In all cases, in contrast to firing-language approaches, the so-called structure theory of Petri nets should gain more interest.

Cross-References 

► Applications of Discrete-Event Systems 

► Diagnosis of Discrete Event Systems 

► Discrete Event Systems and Hybrid Systems, 
Connections Between 

► Models for Discrete Event Systems: An 
Overview 

► Perturbation Analysis of Discrete Event 
Systems 

► Supervisory Control of Discrete-Event Systems 

Recommended Reading 

Topics related to PNs are considered in well over a hundred thousand papers and reports. The first generation of books concerning this field is Brauer (1980), immediately followed by Starke (1980), Peterson (1981), Brams (1983), Reisig (1985), and Silva (1985). The fact that they are written in English, French, German, and Spanish is proof of the rapid dissemination of this knowledge. Most of these books deal essentially with PT-net systems. Complementary surveys are Murata (1989), Silva (1993), and David and Alla (1994), the latter also considering some continuous and hybrid models. Concerning high-level PNs, Jensen and Rozenberg (1991) is a selection of papers covering the main developments during the 1980s. Jensen and Kristensen (2009) focuses on state space methods and simulation where elements of timed models are taken into account, but performance evaluation of stochastic systems is not covered. Approaching the present day, relevant works written with complementary perspectives include, inter alia, Girault and Valk (2003), Diaz (2009), David and Alla (2010), and Seatzu et al. (2013). The consideration of time in nets with an emphasis on performance and performability evaluation is addressed in monographs such as Ajmone Marsan et al. (1995), Bause and Kritzinger (1996), Balbo and Silva (1998), and Haas (2002), while timed models under different fuzzy interpretations are the subject of Cardoso and Camargo (1999). Structure-based approaches to controlling PN models are the main subject of Iordache and Antsaklis (2006) and Chen and Li (2013). Different kinds of hybrid PN models are studied in Di Febbraro et al. (2001), Villani et al. (2007), and David and Alla (2010), while a broad perspective about modeling, analysis, and control of continuous (untimed and timed) PNs is provided by Silva et al. (2011).

DiCesare et al. (1993) and Desrochers and Al-Jaar (1995) are devoted to the applications of PNs to manufacturing systems. A comprehensive updated introduction to business process systems and PNs can be found in van der Aalst and Stahl (2011). Special volumes dealing with other monographic topics are, for example, Billington et al. (1999), Agha et al. (2001), and Cortadella et al. (2002). An application domain for Petri nets emerging over the last two decades is systems biology, a model-based approach devoted to the analysis of biological systems (Koch et al. 2011; Wingender 2011). Furthermore, it should be pointed out that Petri nets have also been employed in many other application domains (e.g., from logistics to musical systems).

For an overall perspective of the field over the five decades that have elapsed since the publication of Carl Adam Petri’s PhD thesis, including historical, epistemological, and technical aspects, see Silva (2013).

Abstract

This entry provides a brief description of model 
predictive control (MPC) technology and how it 
is used in practice. The emphasis here is on refining
and chemical plant applications where the
technology has achieved its greatest acceptance. 
After a short description of what MPC is and 
how it fits into the hierarchy of control functions, 
the basic algorithm is presented as a sequence of 
three optimization problems. The steps required 
for a successful application are then outlined, 
followed by a summary and outline of likely 
future directions for MPC technology. 


Keywords 

Computer control; Mathematical programming; 
Predictive control 


Introduction 

Model predictive control (MPC) refers to a class 
of computer control algorithms that utilize an 
explicit mathematical model to predict future 
process behavior. At each control interval, in the 
most general case, an MPC algorithm solves a 
sequence of three nonlinear programs to answer 
the following essential questions: where is the
process heading (state estimation), where should
the process go (steady-state target optimization),
and what is the best sequence of control (input)
adjustments to send it to the right place (dynamic
optimization). The first control (input) adjustment
is implemented, and then the entire calculation
sequence is repeated at the subsequent
control cycles.

MPC technology arose first in the context of 
petroleum refinery and power plant control problems
(Cutler and Ramaker 1979; Richalet et al.
1978). Specific needs that drove the development 
of MPC technology include the requirement for 
economic optimization and strict enforcement 
of safety and equipment constraints. Promising 
early results led to a wave of successful industrial 
applications, sparking the development of several 
commercial offerings (Qin and Badgwell 2003) 
and generating intense interest from the academic 
community (Mayne et al. 2000). Today MPC 
technology permeates the refining and chemical 
industries and has gained increasing acceptance 
in a wide variety of areas including chemicals,
automotive, aerospace, and food processing applications.
The total number of MPC applications
worldwide was estimated in 2003 to be 4,500 
(Qin and Badgwell 2003). 

MPC Control Hierarchy 

In a modern chemical plant or refinery, MPC 
is part of a multilevel hierarchy, as illustrated 
in Fig. 1. Moving from the top level to the 
bottom, the control functions execute at a 
higher frequency but cover a smaller geographic 
scope. At the bottom level, referred to as 
Level 0, proportional-integral-derivative (PID) 
controllers execute several times a second within 
distributed control system (DCS) hardware. 
These controllers adjust individual valves to 
maintain desired flows, pressures, levels, and 
temperatures. 

At Level 1, MPC runs once a minute
to perform dynamic constraint control for
an individual processing unit, such as a crude
distillation unit or a fluid catalytic cracker (Gary
et al. 2007). It typically utilizes a linear dynamic
model identified directly from process step-test
data. The MPC has the job of holding the unit
at the best economic operating point in the
face of dynamic disturbances and operational
constraints.

Level 3: Global Economic Optimization (every day)
Level 2: Local Economic Optimization (every hour)
Level 1: Unit Dynamic Constraint Control (every minute)
Level 0: Unit Basic Dynamic Control (every second)

Model-Predictive Control in Practice, Fig. 1 Hierarchy of control functions in a refinery/chemical plant

At Level 2, a real-time optimizer (RTO) runs
hourly to calculate optimal steady-state targets
for a collection of processing units. It uses a
rigorous first-principles steady-state model to calculate
targets for key operating variables such
as unit temperatures and feed rates. These are
typically passed down to several MPCs for implementation.

At Level 3, planning and scheduling functions
are carried out daily to optimize economics
for an entire chemical plant or refinery. Simple
steady-state models are typically used at this
level, with some nonlinear but mostly linear connections
between model inputs and outputs. Key
operating targets and economic data are typically
passed to several RTO applications for
implementation.

Note that a different mathematical model of
the process is used at each level of the hierarchy.
These models must be reconciled in some manner
with current plant operation and with each
other in order for the overall system to function
properly.


MPC Algorithms 

MPC algorithms function in much the same way
that an experienced human operator would approach
a control problem. Figure 2 illustrates the
flow of information for a typical MPC implementation.
At each control interval, the algorithm
compares the current model output prediction
$y_p$ to the measured output $y_m$ and passes the
prediction error $e$ and control (input) $u$ to a
state estimator, which estimates the dynamic state
$x$. The most commonly used methods for state
estimation can be viewed as special cases of
an optimization-based formulation called moving
horizon estimation (MHE) (Rawlings and
Mayne 2009). The state estimate $x$, which includes
an estimate of the process disturbances
$d$, is then passed to a steady-state optimizer to
determine the best operating point for the unit.
The steady-state optimizer must also consider
operator-entered output and control (input) targets
$y_t$ and $u_t$. The steady-state state and control
(input) targets $x_s$ and $u_s$ are then passed, along
with the state estimate $x$, to a dynamic optimizer
to compute the best trajectory of future control
(input) adjustments. The first computed control
(input) adjustment is then implemented and the
entire calculation sequence is repeated at the
next control interval. The various commercial
MPC algorithms differ in such details as the
mathematical form of the dynamic model and
the specific formulations of the state estimation,
steady-state optimization, and dynamic optimization
problems (Qin and Badgwell 2003).

In the general case, the MPC algorithm must 
solve the three optimization problems outlined 
above at each control interval. For the case of 
linear models and reasonable tuning parameters, 
these problems take the form of a convex 
quadratic program (QP) with a constant, positive- 
definite Hessian. As such, they can be solved 
relatively easily using standard optimization 
codes. For the case of a linear state-space model, 
the structure can be exploited even further to 
develop a specialized solution algorithm using an 
interior point method (Rao et al. 1998). 
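
As a rough illustration of the dynamic optimization step and of the receding-horizon idea, the sketch below treats a hypothetical scalar linear model $x_{k+1} = a x_k + b u_k$ with no constraints. In that special case the quadratic program can be solved exactly by a backward recursion on the quadratic cost-to-go, so no optimization library is needed. The model coefficients, horizon, weight `rho`, and function names are assumptions chosen for illustration; an industrial MPC would also carry constraints, a state estimator, and a steady-state target calculation.

```python
def first_move(x0, target, a=0.9, b=0.5, horizon=10, rho=0.1):
    """First input of the sequence minimizing
    sum_{k=1..horizon} (x_k - target)^2 + rho * sum_{k=0..horizon-1} u_k^2
    for the hypothetical model x_{k+1} = a*x_k + b*u_k (no constraints)."""
    # Cost-to-go at a stage is p*x^2 + 2*s*x + const; start at the terminal stage.
    p, s = 1.0, -target
    for _ in range(horizon - 1):
        gain = a * b * p / (rho + b * b * p)
        # Right-hand sides use the old p and s (tuple assignment).
        p, s = 1.0 + a * a * p - a * b * p * gain, -target + (a - b * gain) * s
    # Minimizer over u of rho*u^2 + p*(a*x0 + b*u)^2 + 2*s*(a*x0 + b*u).
    return -(a * b * p * x0 + b * s) / (rho + b * b * p)


def receding_horizon(x0=0.0, target=1.0, steps=30, a=0.9, b=0.5):
    """Apply only the first optimal move at each interval, then re-optimize."""
    x, history = x0, [x0]
    for _ in range(steps):
        u = first_move(x, target, a, b)
        x = a * x + b * u          # the plant is assumed to match the model exactly
        history.append(x)
    return history
```

Only the first computed move is applied at each interval, and the optimization is repeated from the newly reached state. Because the input itself is penalized in this simplified cost, the state settles slightly short of the target.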

For the case of nonlinear models, these 
problems take the form of a nonlinear program 
(NLP) for which the solution domain is no longer 
convex, greatly complicating the numerical 
solution. A typical strategy is to iterate on 
a linearized version of the problem until 
convergence (Biegler 2010).

Implementation 

The combined experience of thousands of MPC 
applications in the process industries has led to 
a near consensus on the steps required for a 
successful implementation: 

• Justification - make the economic case for the application.
• Pre-test - design the control and test sensors and actuators.
• Step-test - generate process response data.
• Modeling - develop model from process response data.
• Configuration - configure the software and test preliminary tuning by simulation.
• Commissioning - turn on and test the controller.
• Post-audit - measure and certify economic performance.
• Sustainment - monitor and maintain the application.

The most expensive of these steps, both in
terms of engineering time and lost production, is
the generation of process response data through
the step test. This is accomplished, in principle,
by making significant adjustments to each variable
that will be adjusted by the MPC while
operating open loop to prevent compensating
control action. This will necessarily cause abnormal
movement in key operating variables, which
may lead to lower throughput and off-spec products.
Significant progress has been made in recent
years to minimize these difficulties through
the use of approximate closed-loop step testing
(Darby and Nikolaou 2012).

Once the application has been commissioned,
it is critical to set up an aggressive
monitoring and sustainment program. MPC
application benefits can fall off quickly
due to changes in the process operation
and as new personnel interact with it. New
constraint variables may need to be added
and key sections of the model may need
to be updated as time goes on. The mathematical
problem of MPC monitoring remains
a topic of current academic research
(Zagrobelny et al. 2012).

Note that the implementation steps outlined 
above must be carried out by a carefully selected 
project team that typically includes, in addition 
to the MPC expert, an engineer with detailed 
knowledge of the process and an operator with 
significant relevant experience. 

Summary and Future Directions 

Model predictive control is now a mature technology
in the process industries. A representative
MPC algorithm in this domain includes a state
estimator, a steady-state optimizer, and a dynamic
optimizer, running once a minute. A successful
MPC application usually starts with a careful
economic justification, includes significant participation
from process engineers and operators,
and is maintained with an aggressive sustainment
program. Many thousands of such applications
are currently operating around the world, generating
billions of dollars per year in economic
benefits.

Likely future directions for MPC practice
include increasing use of nonlinear
models, improved state estimation through
unmeasured disturbance modeling (Pannocchia
and Rawlings 2003), and development of
more efficient numerical solution methods
(Zavala and Biegler 2009).

Cross-References 

► Distributed Model Predictive Control 

► Nominal Model-Predictive Control 

► Optimization Algorithms for Model Predictive 
Control 

► Tracking Model Predictive Control 

Recommended Reading 

The first descriptions of MPC technology appear
in papers by Richalet et al. (1978) and Cutler
and Ramaker (1979). A detailed summary of the
history of MPC technology development, as well
as a survey of commercial offerings through 2003,
can be found in the review article by Qin and
Badgwell (2003). Darby and Nikolaou (2012) present
a more recent summary of MPC practice.
Textbook descriptions of
MPC theory and design, suitable for classroom
use, include Rawlings and Mayne (2009) and
Maciejowski (2002). The book by Ljung (1999)
provides a good summary of methods for identifying
dynamic models from test data. Theoretical
properties of MPC are analyzed in a highly cited
paper by Mayne and coworkers (2000). Guidelines
for designing disturbance models so as to
achieve offset-free control can be found in Pannocchia
and Rawlings (2003). Numerical solution
strategies for the nonlinear programs found in
MPC are discussed in the book by Biegler (2010).
An efficient interior-point method for solving the
linear MPC dynamic optimization is described
in Rao et al. (1998). A promising algorithm for
solving the nonlinear MPC dynamic optimization
is outlined in Zavala and Biegler (2009).
A data-based method for tuning Kalman filters,
which are often used for MPC state estimation,
is described in Odelson et al. (2006). A new
method for monitoring the performance of MPC
is summarized in Zagrobelny et al. (2012). A
readable summary of refining operations can be
found in Gary et al. (2007).

Abstract

This article provides an introduction to discrete 
event systems (DES) as a class of dynamic 
systems with characteristics significantly 
distinguishing them from traditional time-driven 
systems. It also overviews the main modeling 
frameworks used to formally describe the 
operation of DES and to study problems related 
to their control and optimization.

Introduction

Discrete event systems (DES) form an important
class of dynamic systems. The term was introduced
in the early 1980s to describe a DES in
terms of its most critical feature: the fact that
its behavior is governed by discrete events which
occur asynchronously over time and which are
solely responsible for generating state transitions.
In between event occurrences, the state of a DES
is unaffected. Examples of such behavior abound
in technological environments, including computer
and communication networks, manufacturing
systems, transportation systems, logistics,
and so forth. The operation of a DES is largely
regulated by rules which are often unstructured
and frequently human-made, as in initiating or
terminating activities and scheduling the use of
resources through controlled events (e.g., turning
equipment “on”). On the other hand, their operation
is also subject to uncontrolled randomly
occurring events (e.g., a spontaneous equipment
failure) which may or may not be observable
through sensors. It is worth pointing out that the
term “discrete event dynamic system” (DEDS) is
also commonly used to emphasize the importance
of the dynamical behavior of such systems (Cassandras
and Lafortune 2008; Ho 1991).

There are two aspects of a DES that define its 
behavior: 

1. The variables involved are both continuous
and discrete, sometimes purely symbolic, i.e.,
nonnumeric (e.g., describing the state of a
traffic light as “red” or “green”). This renders
traditional mathematical models based on differential
(or difference) equations inadequate
and related methods based on calculus of limited
use.

2. Because of the asynchronous nature of events
that cause state transitions in a DES, it is
neither natural nor efficient to use time as a
synchronizing element driving its dynamics.
It is for this reason that DES are often referred
to as event driven, to contrast them with classical
time-driven systems based on the laws
of physics; in the latter, as time evolves, state
variables such as position, velocity, temperature,
voltage, etc., also continuously evolve. In
order to capture event-driven state dynamics,
however, different mathematical models are
necessary.

In addition, uncertainties are inherent in 
the technological environments where DES are 
encountered. Therefore, associated mathematical 
models and methods for analysis and control 
must incorporate such uncertainties. Finally, 
complexity is also inherent in DES of practical 
interest, usually manifesting itself in the form of 
combinatorially explosive state spaces. Although 
purely analytical methods for DES design, 
analysis, and control are limited, they have 
still enabled reliable approximations of their 
dynamic behavior and the derivation of useful 
structural properties and provable performance 
guarantees. Much of the progress made in this 
field, however, has relied on new paradigms 
characterized by a combination of mathematical 
techniques, computer-based tools, and effective 
processing of experimental data. 

Event-driven and time-driven system components
are often viewed as coexisting and
interacting and are referred to as hybrid systems
(separately considered in the Encyclopedia, 
including the article ►Discrete Event Systems 
and Hybrid Systems, Connections Between). 
Arguably, most contemporary technological 
systems are combinations of time-driven 
components (typically, the physical parts of a 
system) and event-driven components (usually, 
the computer-based controllers that collect data 
from and issue commands to the physical parts). 

Event-Driven vs. Time-Driven 
Systems 

In order to explain the difference between time-driven
and event-driven behavior, we begin
with the concept of “event.” An event should
be thought of as occurring instantaneously and
causing transitions from one system state value
to another. It may be identified with an action
(e.g., pressing a button), a spontaneous natural
occurrence (e.g., a random equipment failure), or
the result of conditions met by the system state
(e.g., the fluid level in a tank exceeds a given
value). For the purpose of developing a model
for DES, we will use the symbol $e$ to denote an
event. Since a system is generally affected by
different types of events, we assume that we can
define a discrete event set $\mathcal{E}$ with $e \in \mathcal{E}$.

In a classical system model, the “clock” is
what drives a typical state trajectory: with every
“clock tick” (which may be thought of as an
“event”), the state is expected to change, since
continuous state variables continuously change
with time. This leads to the term time driven.
In the case of time-driven systems described by
continuous variables, the field of systems and
control has based much of its success on the use
of well-known differential-equation-based models,
such as

\[
\dot{x}(t) = f(x(t), u(t), t), \qquad x(t_0) = x_0 \tag{1}
\]

\[
y(t) = g(x(t), u(t), t), \tag{2}
\]

where (1) is a (vector) state equation with initial
conditions specified and (2) is a (vector) output
equation. As is common in system theory, $x(t)$
denotes the state of the system, $y(t)$ is the output,
and $u(t)$ represents the input, often associated
with controllable variables used to manipulate
the state so as to attain a desired output. Common
physical quantities such as position, velocity,
temperature, pressure, flow, etc., define state
variables in (1). The state generally changes as
time changes, and, as a result, the time variable $t$
(or some integer $k = 0, 1, 2, \ldots$ in discrete time)
is a natural independent variable for modeling
such systems.
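
A minimal sketch of how a model of the form (1)-(2) is advanced by the clock, assuming a forward-Euler discretization with a fixed step `dt`. The first-order example system, the unit-step input, and all function names are illustrative assumptions rather than part of the model itself.

```python
def simulate_time_driven(f, g, u, x0, t0, t_end, dt):
    """Forward-Euler integration of dx/dt = f(x, u, t) with output y = g(x, u, t)."""
    n_steps = int(round((t_end - t0) / dt))
    x, outputs = x0, []
    for k in range(n_steps):
        t = t0 + k * dt
        outputs.append((t, g(x, u(t), t)))
        x = x + dt * f(x, u(t), t)   # the state is updated at every clock tick
    return x, outputs


# Hypothetical example: dx/dt = -x + u, y = x, unit-step input from x(0) = 0.
x_final, ys = simulate_time_driven(
    f=lambda x, u, t: -x + u,
    g=lambda x, u, t: x,
    u=lambda t: 1.0,
    x0=0.0, t0=0.0, t_end=5.0, dt=0.01,
)
```

Here time is the independent variable: the loop index alone determines when the state changes, which is the behavior the term "time driven" refers to.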

In contrast, in a DES, time no longer serves
the purpose of driving such a system and may
no longer be an appropriate independent variable.
Instead, at least some of the state variables are
discrete, and their values change only at certain
points in time through instantaneous transitions
which we associate with “events.” If a clock
is used, consider two possibilities: (i) At every
clock tick, an event $e$ is selected from the event
set $\mathcal{E}$ (if no event takes place, we use a “null
event” as a member of $\mathcal{E}$ such that it causes no
state change), and (ii) at various time instants (not
necessarily known in advance or coinciding with
clock ticks), some event $e$ “announces” that it
is occurring. Observe that in (i) state transitions
are synchronized by the clock which is solely
responsible for any possible state transition. In
(ii), every event $e \in \mathcal{E}$ defines a distinct process
through which the time instants when $e$ occurs
are determined. State transitions are the result of
combining these asynchronous concurrent event
processes. Moreover, these processes need not
be independent of each other. The distinction
between (i) and (ii) gives rise to the terms time-driven
and event-driven systems, respectively.

Comparing state trajectories of time-driven
and event-driven systems is useful in understanding
the differences between the two and setting
the stage for DES modeling frameworks. Thus, in
Fig. 1, we observe the following: (i) For the time-driven
system shown, the state space $\mathcal{X}$ is the set
of real numbers $\mathbb{R}$, and $x(t)$ can take any value
from this set. The function $x(t)$ is the solution
of a differential equation of the general form
$\dot{x}(t) = f(x(t), u(t), t)$, where $u(t)$ is the input.
(ii) For the event-driven system, the state space is
some discrete set $\mathcal{X} = \{s_1, s_2, s_3, s_4\}$. The sample
path can only jump from one state to another
whenever an event occurs. Note that an event
may take place, but not cause a state transition,
as in the case of $e_1$. There is no immediately
obvious analog to $\dot{x}(t) = f(x(t), u(t), t)$, i.e., no
mechanism to specify how events might interact
over time or how their time of occurrence might
be determined. Thus, a large part of the early
developments in the DES field has been devoted
to the specification of an appropriate mathematical
model containing the same expressive power
as (1)-(2) (Baccelli et al. 1992; Cassandras and
Lafortune 2008; Glasserman and Yao 1994).

Models for Discrete Event Systems: An Overview, Fig. 1 Comparison of time-driven and event-driven state trajectories

We should point out that a time-driven 
system with continuous state variables, usually 
modeled through (1)-(2), may be abstracted
as a DES through some form of discretization 
in time and quantization in the state space. 
We should also point out that discrete event 
systems should not be confused with discrete 
time systems. The class of discrete time systems 
contains both time-driven and event-driven 
systems. 

Timed and Untimed Models 
of Discrete Event Systems 

Returning to Fig. 1, instead of constructing the
piecewise constant function $x(t)$ as shown, it is
convenient to simply write the timed sequence of
events $\{(e_1, t_1), (e_2, t_2), (e_3, t_3), (e_4, t_4), (e_5, t_5)\}$,
which contains the same information as the
state trajectory. Assuming that the initial state
of the system is known and that
the system is “deterministic” in the sense that
the next state after the occurrence of an event is
unique, we can recover the state of the system
at any point in time and reconstruct the DES
state trajectory. The set of all possible timed
sequences of events that a given system can ever
execute is called the timed language model of
the system. The word “language” comes from
the fact that we can think of the event set $\mathcal{E}$ as an
“alphabet” and of (finite) sequences of events as
“words” (Hopcroft and Ullman 1979). We can
further refine such a model by adding statistical
information regarding the set of state trajectories
(sample paths) of the system. Let us assume that
probability distribution functions are available
about the “lifetime” of each event type $e \in \mathcal{E}$,
that is, the elapsed time between successive
occurrences of this particular $e$. A stochastic
timed language is a timed language together with
associated probability distribution functions for
the events.

Stochastic timed language modeling is the
most detailed in the sense that it contains event
information in the form of event occurrences and
their orderings, information about the exact times
at which the events occur (not only their relative
ordering), and statistical information about successive
occurrences of events. If we delete the
timing information from a timed language, we
obtain an untimed language, or simply language,
which is the set of all possible orderings of
events that could happen in the given system.
For example, the untimed sequence corresponding
to the timed sequence of events in Fig. 1 is
$\{e_1, e_2, e_3, e_4, e_5\}$.
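
The two operations just described, recovering the state from a timed sequence and deleting the timing information, can be sketched as follows. The event times, the transition table, and the initial state `"s2"` are hypothetical values invented for this example; they are not read off Fig. 1.

```python
from bisect import bisect_right


def state_at(t, x0, timed_seq, f):
    """State at time t of a deterministic DES, given its timed event sequence."""
    times = [tk for _, tk in timed_seq]
    n_occurred = bisect_right(times, t)      # events with occurrence time <= t
    x = x0
    for event, _ in timed_seq[:n_occurred]:
        x = f(x, event)
    return x


def untimed(timed_seq):
    """Project a timed sequence onto its untimed sequence of events."""
    return [event for event, _ in timed_seq]


# Hypothetical data: five events, four states, made-up occurrence times.
timed_seq = [("e1", 1.0), ("e2", 2.5), ("e3", 3.0), ("e4", 4.2), ("e5", 6.0)]
table = {("s2", "e1"): "s2",   # an event that causes no state change
         ("s2", "e2"): "s4", ("s4", "e3"): "s1",
         ("s1", "e4"): "s3", ("s3", "e5"): "s2"}


def f(x, e):
    return table[(x, e)]
```

With these assumptions, `state_at(5.0, "s2", timed_seq, f)` replays the first four events, and `untimed(timed_seq)` keeps only their ordering.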

Untimed and timed languages represent 
different levels of abstraction at which DES 
are modeled and studied. The choice of the 
appropriate level of abstraction clearly depends 
on the objectives of the analysis. In many 
instances, we are interested in the “logical 
behavior” of the system, that is, in ensuring that 
all the event sequences it can generate satisfy 
a given set of specifications, e.g., maintaining 
a precise ordering of events. In this context, 
the actual timing of events is not required, 
and it is sufficient to model only the untimed 
behavior of the system. Supervisory control,
discussed in the article ► Supervisory
Control of Discrete-Event Systems, is the term
established for describing the systematic means
(i.e., enabling or disabling events which are
controllable) by which the logical behavior
of a DES is regulated to achieve a given
specification (Cassandras and Lafortune 2008;
Moody and Antsaklis 1998; Ramadge and
Wonham 1987).

On the other hand, we may be interested in
event timing in order to answer questions such
as the following: “How much time does the system
spend at a particular state?” or “Can this
sequence of events be completed by a particular
deadline?” More generally, event timing
is important in assessing the performance of a
DES, often measured through quantities such as
throughput or response time. In these instances,
we need to consider the timed language model
of the system. Since DES frequently operate in a
stochastic setting, an additional level of complexity
is introduced, necessitating the development
of probabilistic models and related analytical
methodologies for design and performance analysis
based on stochastic timed language models.
Sample path analysis and perturbation analysis,
discussed in the entry ►Perturbation Analysis
of Discrete Event Systems, refer to the study of
sample paths of DES, focusing on the extraction
of information for the purpose of efficiently
estimating performance sensitivities of the system
and, ultimately, achieving online control and
optimization (Cassandras and Lafortune 2008;
Glasserman 1991; Ho and Cao 1991; Ho and
Cassandras 1983).

These different levels of abstraction are complementary,
as they address different issues about
the behavior of a DES. Although the language-based
approach to DES modeling is attractive,
it is by itself not convenient to address verification,
controller synthesis, or performance issues.
This motivates the development of discrete event
modeling formalisms which represent languages
in a manner that highlights structural information
about the system behavior and can be used to
address analysis and controller synthesis issues.
Next, we provide an overview of three major
modeling formalisms which are used by most
(but not all) system and control theoretic methodologies
pertaining to DES. Additional modeling
formalisms encountered in the computer science
literature include process algebras (Baeten and
Weijland 1990) and communicating sequential
processes (Hoare 1985).

Automata

A deterministic automaton, denoted by $G$, is a
six-tuple

\[
G = (\mathcal{X}, \mathcal{E}, f, \Gamma, x_0, X_m),
\]

where $\mathcal{X}$ is the set of states, $\mathcal{E}$ is the finite set of
events associated with the transitions in $G$, and
$f : \mathcal{X} \times \mathcal{E} \to \mathcal{X}$ is the transition function;
specifically, $f(x, e) = y$ means that there is
a transition labeled by event $e$ from state $x$ to
state $y$ and, in general, $f$ is a partial function
on its domain. $\Gamma : \mathcal{X} \to 2^{\mathcal{E}}$ is the active event
function (or feasible event function); $\Gamma(x)$ is the
set of all events $e$ for which $f(x, e)$ is defined
and it is called the active event set (or feasible
event set) of $G$ at $x$. Finally, $x_0$ is the initial
state and $X_m \subseteq \mathcal{X}$ is the set of marked states.
The terms state machine and generator (which
explains the notation $G$) are also used to describe
the above object. Moreover, if $\mathcal{X}$ is a finite set, we
call $G$ a deterministic finite-state automaton. A
nondeterministic automaton is defined by means
of a relation over $\mathcal{X} \times \mathcal{E} \times \mathcal{X}$ or, equivalently, a
function from $\mathcal{X} \times \mathcal{E}$ to $2^{\mathcal{X}}$.

The automaton $G$ operates as follows. It starts
in the initial state $x_0$, and upon the occurrence of
an event $e \in \Gamma(x_0) \subseteq \mathcal{E}$, it makes a transition to
state $f(x_0, e) \in \mathcal{X}$. This process then continues
based on the transitions for which $f$ is defined.
Note that an event may occur without changing
the state, i.e., $f(x, e) = x$. It is also possible that
two distinct events occur at a given state causing
the exact same transition, i.e., for $a, b \in \mathcal{E}$,
$f(x, a) = f(x, b) = y$. What is interesting
about the latter fact is that we may not be able
to distinguish between events $a$ and $b$ by simply
observing a transition from state $x$ to state $y$.

For the sake of convenience, $f$ is always
extended from domain $\mathcal{X} \times \mathcal{E}$ to domain
$\mathcal{X} \times \mathcal{E}^*$, where $\mathcal{E}^*$ is the set of all finite strings
of elements of $\mathcal{E}$, including the empty string
(denoted by $\varepsilon$); the $*$ operation is called the
Kleene closure. This is accomplished in the following
recursive manner: $f(x, \varepsilon) := x$ and
$f(x, se) := f(f(x, s), e)$ for $s \in \mathcal{E}^*$ and
$e \in \mathcal{E}$. The (untimed) language generated by
$G$ and denoted by $\mathcal{L}(G)$ is the set of all strings
in $\mathcal{E}^*$ for which the extended function $f$ is
defined. The automaton model above is one instance
of what is referred to as a generalized
semi-Markov scheme (GSMS) in the literature of
stochastic processes. A GSMS is viewed as the
basis for extending automata to incorporate an
event timing structure as well as nondeterministic
state transition mechanisms, ultimately leading
to stochastic timed automata, discussed in the
sequel.
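
A small sketch of these definitions, assuming the partial transition function is stored as a dictionary keyed by (state, event) pairs. The class name, method names, and the machine example (states `idle`, `busy`, `down`) are hypothetical and serve only to illustrate the active event set, the extended transition function, and membership in the generated language.

```python
class Automaton:
    """Deterministic automaton with a partial transition function."""

    def __init__(self, transitions, x0, marked=()):
        self.f = dict(transitions)    # partial function: (state, event) -> state
        self.x0 = x0
        self.marked = set(marked)

    def active_events(self, x):
        """Active event set at x: all events e for which f(x, e) is defined."""
        return {e for (state, e) in self.f if state == x}

    def run(self, string, x=None):
        """Extended transition function f(x, s); returns None where undefined."""
        x = self.x0 if x is None else x
        for e in string:              # f(x, se) = f(f(x, s), e)
            if (x, e) not in self.f:
                return None
            x = self.f[(x, e)]
        return x                      # the empty string leaves x unchanged

    def generates(self, string):
        """True if the string belongs to the language generated from x0."""
        return self.run(string) is not None


# Hypothetical machine that can start a job, finish it, fail, and be repaired.
G = Automaton(
    {("idle", "start"): "busy", ("busy", "finish"): "idle",
     ("busy", "fail"): "down", ("down", "repair"): "idle"},
    x0="idle", marked={"idle"},
)
```

For instance, `G.generates(["start", "finish", "start"])` holds, whereas `G.generates(["finish"])` does not, because `finish` is not active in the initial state.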

Let us allow for generally countable sets $\mathcal{X}$
and $\mathcal{E}$ and leave out of the definition any consideration
for marked states. Thus, we begin with
an automaton model $(\mathcal{X}, \mathcal{E}, f, \Gamma, x_0)$. We extend
the modeling setting to timed automata by incorporating
a “clock structure” associated with
the event set $\mathcal{E}$ which now becomes the input
from which a specific event sequence can be
deduced. The clock structure (or timing structure)
associated with an event set $\mathcal{E}$ is a set
$\mathbf{V} = \{v_i : i \in \mathcal{E}\}$ of clock (or lifetime) sequences

\[
v_i = \{v_{i,1}, v_{i,2}, \ldots\}, \quad i \in \mathcal{E}, \quad v_{i,k} \in \mathbb{R}^+, \quad k = 1, 2, \ldots
\]


Timed Automaton. A timed automaton is defined
as a six-tuple

\[
(\mathcal{X}, \mathcal{E}, f, \Gamma, x_0, \mathbf{V}),
\]

where $\mathbf{V} = \{v_i : i \in \mathcal{E}\}$ is a clock structure
and $(\mathcal{X}, \mathcal{E}, f, \Gamma, x_0)$ is an automaton. The automaton
generates a state sequence $x' = f(x, e')$
driven by an event sequence $\{e_1, e_2, \ldots\}$ generated
through

\[
e' = \arg\min_{i \in \Gamma(x)} \{y_i\} \tag{3}
\]

with the clock values $y_i$, $i \in \mathcal{E}$, defined by

\[
y_i' =
\begin{cases}
y_i - y^* & \text{if } i \neq e' \text{ and } i \in \Gamma(x) \\
v_{i,N_i+1} & \text{if } i = e' \text{ or } i \notin \Gamma(x)
\end{cases}
\qquad i \in \Gamma(x'), \tag{4}
\]

where the interevent time $y^*$ is defined as

\[
y^* = \min_{i \in \Gamma(x)} \{y_i\} \tag{5}
\]

and the event scores $N_i$, $i \in \mathcal{E}$, are defined by

\[
N_i' =
\begin{cases}
N_i + 1 & \text{if } i = e' \text{ or } i \notin \Gamma(x) \\
N_i & \text{otherwise}
\end{cases}
\qquad i \in \Gamma(x'). \tag{6}
\]

In addition, initial conditions are $y_i = v_{i,1}$ and
$N_i = 1$ for all $i \in \Gamma(x_0)$. If $i \notin \Gamma(x_0)$, then $y_i$
is undefined and $N_i = 0$.

A simple interpretation of this elaborate definition
is as follows. Given that the system is at
some state $x$, the next event $e'$ is the one with
the smallest clock value among all feasible events
$i \in \Gamma(x)$. The corresponding clock value, $y^*$,
is the interevent time between the occurrence of
$e$ and $e'$, and it provides the amount by which
the time, $t$, moves forward: $t' = t + y^*$. Clock
values for all events that remain active in state $x'$
are decremented by $y^*$, except for the triggering
event $e'$ and all newly activated events, which are
assigned a new lifetime $v_{i,N_i+1}$. Event scores are
incremented whenever a new lifetime is assigned
to them. It is important to note that the “system
clock” $t$ is fully controlled by the occurrence of
events, which cause it to move forward; if no
event occurs, the system remains at the last state
observed.

Comparing $x' = f(x, e')$ to the state equation
(1) for time-driven systems, we see that
the former can be viewed as the event-driven
analog of the latter. However, the simplicity of
$x' = f(x, e')$ is deceptive: unless an event sequence
is given, determining the triggering event
$e'$ which is required to obtain the next state $x'$
involves the combination of (3)-(6). Therefore,
the analog of (1) as a “canonical” state equation
for a DES requires all Eqs. (3)-(6). Observe that
this timed automaton generates a timed language,
thus extending the untimed language generated
by the original automaton $G$.
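
The event-driven state equation, i.e., Eqs. (3)-(6) taken together, can be sketched as a short simulation loop. The function below is one possible reading of those equations, not a reference implementation: lifetime sequences are stored as zero-indexed lists (so $v_{i,N_i+1}$ is `clocks[i][score[i]]`), and the single-server queue used as an example, with arrival event `"a"` and departure event `"d"`, together with its lifetime values, is a hypothetical illustration.

```python
def simulate_timed_automaton(f, gamma, x0, clocks, n_events):
    """Return [(event, time, state), ...] generated from a clock structure.

    f(x, e) -> next state; gamma(x) -> set of active events at x;
    clocks[i] is the lifetime sequence of event i (assumed long enough).
    """
    x, t = x0, 0.0
    score = {i: 0 for i in clocks}          # event scores N_i
    y = {}                                  # clock values y_i of active events
    for i in gamma(x0):                     # initial conditions
        y[i] = clocks[i][0]
        score[i] = 1
    trace = []
    for _ in range(n_events):
        e_next = min(gamma(x), key=y.get)   # Eq. (3): smallest clock value
        y_star = y[e_next]                  # Eq. (5): interevent time
        x_next = f(x, e_next)
        t += y_star                         # event occurrences drive the clock
        y_new = {}
        for i in gamma(x_next):
            if i != e_next and i in gamma(x):
                y_new[i] = y[i] - y_star    # Eq. (4): event stays active
            else:
                y_new[i] = clocks[i][score[i]]   # Eq. (4): new lifetime
                score[i] += 1                    # Eq. (6)
        x, y = x_next, y_new
        trace.append((e_next, t, x))
    return trace


# Hypothetical queue: the state is the queue length; "d" is active only if x > 0.
queue_trace = simulate_timed_automaton(
    f=lambda x, e: x + 1 if e == "a" else x - 1,
    gamma=lambda x: {"a"} if x == 0 else {"a", "d"},
    x0=0,
    clocks={"a": [1.0, 1.5, 1.5, 3.0, 2.0], "d": [2.0, 0.5, 1.5, 1.0]},
    n_events=5,
)
```

Ties between clock values are not handled specially in this sketch; the example lifetimes are chosen so that no ties occur.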

In the definition above, the clock structure $\mathbf{V}$
is assumed to be fully specified in a deterministic
sense, and so are the state transitions dictated by
$x' = f(x, e')$. The sequences $\{v_{i,k}\}$, $i \in \mathcal{E}$, can
be extended to be specified only as stochastic
sequences through distribution functions denoted
by $F_i$, $i \in \mathcal{E}$. Thus, the stochastic clock structure
(or stochastic timing structure) associated with
an event set $\mathcal{E}$ is a set of distribution functions
$F = \{F_i : i \in \mathcal{E}\}$ characterizing the stochastic
clock sequences

$$\{V_{i,k}\} = \{V_{i,1}, V_{i,2}, \ldots\}, \quad i \in \mathcal{E}, \quad V_{i,k} \in \mathbb{R}_+, \quad k = 1, 2, \ldots$$

It is usually assumed that each clock sequence
consists of random variables which are independent
and identically distributed (iid) and that
all clock sequences are mutually independent.
Thus, each $\{V_{i,k}\}$ is completely characterized by
a distribution function $F_i(t) = P[V_i \le t]$. There
are, however, several ways in which a clock structure
can be extended to include situations where
elements of a sequence $\{V_{i,k}\}$ are correlated or
two clock sequences are interdependent. As for
state transitions which may be nondeterministic
in nature, such behavior is modeled through state
transition probabilities as explained next.

Stochastic Timed Automaton. We can extend
the definition of a timed automaton by viewing
the state, event, and all event scores and clock
values as random variables denoted respectively
by capital letters $X$, $E$, $N_i$, and $Y_i$, $i \in \mathcal{E}$. Thus,
a stochastic timed automaton is a six-tuple

$$(\mathcal{X}, \mathcal{E}, \Gamma, p, p_0, F),$$

where $\mathcal{X}$ is a countable state space; $\mathcal{E}$ is a countable
event set; $\Gamma(x)$ is the active event set (or feasible
event set); $p(x'; x, e')$ is a state transition
probability defined for all $x, x' \in \mathcal{X}$, $e' \in \mathcal{E}$ and
such that $p(x'; x, e') = 0$ for all $e' \notin \Gamma(x)$; $p_0$ is
the probability mass function $P[X_0 = x]$, $x \in \mathcal{X}$,
of the initial state $X_0$; and $F$ is a stochastic clock
structure. The automaton generates a stochastic
state sequence $\{X_0, X_1, \ldots\}$ through a transition
mechanism (based on observations $X = x$, $E' = e'$):

$$X' = x' \text{ with probability } p(x'; x, e') \tag{7}$$

and it is driven by a stochastic event sequence
$\{E_1, E_2, \ldots\}$ generated exactly as in (3)-(6) with
random variables $E$, $Y_i$, and $N_i$, $i \in \mathcal{E}$, instead
of deterministic quantities and with $\{V_{i,k}\} \sim F_i$
($\sim$ denotes “with distribution”). In addition,
initial conditions are $X_0 \sim p_0(x)$, $Y_i = V_{i,1}$, and
$N_i = 1$ if $i \in \Gamma(X_0)$. If $i \notin \Gamma(X_0)$, then $Y_i$ is
undefined and $N_i = 0$.

It is conceivable for two events to occur at
the same time, in which case we need a priority
scheme to overcome a possible ambiguity in the
selection of the triggering event in (3). In practice,
it is common to expect that every distribution
function $F_i$ in the clock structure is absolutely
continuous over $[0, \infty)$ (so that its density function
exists) and has a finite mean. This implies that two
events can occur at exactly the same time only with
probability 0.

A stochastic process $\{X(t)\}$ with state space
$\mathcal{X}$ which is generated by a stochastic timed
automaton $(\mathcal{X}, \mathcal{E}, \Gamma, p, p_0, F)$ is referred to as a
generalized semi-Markov process (GSMP). This
process serves as the basis for many of the sample
path analysis methods for DES (see Cassandras
and Lafortune 2008; Glasserman 1991; Ho and
Cao 1991).
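
A sample path of a GSMP can be generated by replacing the fixed lifetime sequences with random draws. The sketch below is only illustrative: the function name `simulate_gsmp`, the representation of $p(x'; x, e')$ as a function returning a dictionary of probabilities, and the queue with exponential lifetimes are assumptions made here. Because lifetimes are drawn on demand from iid distributions, the event scores $N_i$ are not tracked explicitly.

```python
import random


def simulate_gsmp(p0, gamma, p, samplers, n_events, seed=0):
    """Generate one sample path [(time, event, state), ...] of a GSMP.

    p0          -> dict {state: probability} for the initial state
    gamma(x)    -> iterable of feasible events in state x
    p(x, e)     -> dict {next_state: probability}, i.e. p(x'; x, e')
    samplers[i] -> function rng -> lifetime drawn from F_i
    """
    rng = random.Random(seed)

    def draw(pmf):
        states, weights = zip(*pmf.items())
        return rng.choices(states, weights=weights)[0]

    x, t = draw(p0), 0.0                      # X_0 ~ p_0
    clock = {i: samplers[i](rng) for i in gamma(x)}   # Y_i = V_{i,1}
    path = [(t, None, x)]
    for _ in range(n_events):
        if not clock:
            break
        e = min(clock, key=clock.get)         # triggering event, as in (3)
        y_star = clock[e]                     # interevent time, as in (5)
        x_next = draw(p(x, e))                # (7)
        t += y_star
        clock = {
            # still-active events keep their residual lifetime, as in (4)
            i: clock[i] - y_star if (i != e and i in clock) else samplers[i](rng)
            for i in gamma(x_next)
        }
        x = x_next
        path.append((t, e, x))
    return path


# Hypothetical example: single-server queue with exponential lifetimes.
gsmp_path = simulate_gsmp(
    p0={0: 1.0},
    gamma=lambda x: ["a"] if x == 0 else ["a", "d"],
    p=lambda x, e: {x + 1: 1.0} if e == "a" else {x - 1: 1.0},
    samplers={"a": lambda r: r.expovariate(1.0), "d": lambda r: r.expovariate(1.5)},
    n_events=50,
    seed=1,
)
```

With continuous lifetime distributions, ties in `min` occur with probability 0, so no priority scheme is included in this sketch.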

Petri Nets 

An alternative modeling formalism for DES is 
provided by Petri nets, originating in the work
of C. A. Petri in the early 1960s. Like an automaton,
a Petri net (Peterson 1981) is a device
that manipulates events according to certain rules. 
One of its features is the inclusion of explicit 
conditions under which an event can be enabled. 
The Petri net modeling framework is the subject 
of the article ► Modeling, Analysis, and Control 
with Petri Nets. Thus, we limit ourselves here to 
a brief introduction. First, we define a Petri net 
graph, also called the Petri net structure. Then, 
we adjoin to this graph an initial state, a set of 
marked states, and a transition labeling function, 
resulting in the complete Petri net model, its 
associated dynamics, and the languages that it 
generates and marks. 

Petri Net Graph. A Petri net is a directed
bipartite graph with two types of nodes, places
and transitions, and arcs connecting them. Events
are associated with transition nodes. In order
for a transition to occur, several conditions may
have to be satisfied. Information related to these
conditions is contained in place nodes. Some
such places are viewed as the “input” to a transition;
they are associated with the conditions
required for this transition to occur. Other places
are viewed as the output of a transition; they
are associated with conditions that are affected
by the occurrence of this transition. A Petri net
graph is formally defined as a weighted directed
bipartite graph $(P, T, A, w)$ where $P$ is the finite
set of places (one type of node in the graph), $T$
is the finite set of transitions (the other type of
node in the graph), $A \subseteq (P \times T) \cup (T \times P)$
is the set of arcs with directions from places to
transitions and from transitions to places in the
graph, and $w : A \to \{1, 2, 3, \ldots\}$ is the weight
function on the arcs. Let $P = \{p_1, p_2, \ldots, p_n\}$
and $T = \{t_1, t_2, \ldots, t_m\}$. It is convenient to
use $I(t_j)$ to represent the set of input places to
transition $t_j$. Similarly, $O(t_j)$ represents the set
of output places from transition $t_j$. Thus, we have
$I(t_j) = \{p_i \in P : (p_i, t_j) \in A\}$ and
$O(t_j) = \{p_i \in P : (t_j, p_i) \in A\}$.

Petri Net Dynamics. Tokens are assigned to
places in a Petri net graph in order to indicate
the fact that the condition described by that
place is satisfied. The way in which tokens are
assigned to a Petri net graph defines a marking.
Formally, a marking $x$ of a Petri net graph
$(P, T, A, w)$ is a function $x : P \to \mathbb{N} = \{0, 1, 2, \ldots\}$.
Marking $x$ defines the row vector
$x = [x(p_1), x(p_2), \ldots, x(p_n)]$, where $n$ is the number
of places in the Petri net. The $i$th entry of this
vector indicates the (nonnegative integer) number
of tokens in place $p_i$, $x(p_i) \in \mathbb{N}$. In Petri
net graphs, a token is indicated by a dark dot
positioned in the appropriate place. The state of
a Petri net is defined to be its marking vector
$x$. The state transition mechanism of a Petri net
is captured by the structure of its graph and by
“moving” tokens from one place to another. A
transition $t_j \in T$ in a Petri net is said to be
enabled if

$$x(p_i) \ge w(p_i, t_j) \quad \text{for all } p_i \in I(t_j). \tag{8}$$
In words, transition $t_j$ in the Petri net is enabled
when the number of tokens in $p_i$ is at least as
large as the weight of the arc connecting $p_i$ to
$t_j$, for all places $p_i$ that are input to transition
$t_j$. When a transition is enabled, it can occur or
fire. The state transition function of a Petri net is
defined through the change in the state of the Petri
net due to the firing of an enabled transition. The
state transition function, $f : \mathbb{N}^n \times T \to \mathbb{N}^n$, of
Petri net $(P, T, A, w, x)$ is defined for transition
$t_j \in T$ if and only if (8) holds. Then, we set
$x' = f(x, t_j)$ where

$$x'(p_i) = x(p_i) - w(p_i, t_j) + w(t_j, p_i), \quad i = 1, \ldots, n. \tag{9}$$

Here a weight is taken to be zero whenever the
corresponding arc does not belong to $A$.

An “enabled transition” is therefore equivalent to
a “feasible event” in an automaton. But whereas
in automata the state transition function enumerates
all feasible state transitions, here the state
transition function is based on the structure of
the Petri net. Thus, the next state defined by (9)
explicitly depends on the input and output places
of a transition and on the weights of the arcs connecting
these places to the transition. According
to (9), if $p_i$ is an input place of $t_j$, it loses as many
tokens as the weight of the arc from $p_i$ to $t_j$; if it
is an output place of $t_j$, it gains as many tokens as
the weight of the arc from $t_j$ to $p_i$. Clearly, it is
possible that $p_i$ is both an input and output place
of $t_j$.
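
The enabling condition (8) and the firing rule (9) translate directly into code. The following is a minimal sketch; the names `is_enabled` and `fire`, the dictionary encoding of arc weights, and the two-place net used as an example are assumptions introduced for illustration.

```python
def is_enabled(x, t, w_in):
    """Condition (8): every input place of t holds at least w(p, t) tokens."""
    return all(x[p] >= k for p, k in w_in[t].items())


def fire(x, t, w_in, w_out):
    """Rule (9): return the marking reached by firing the enabled transition t."""
    if not is_enabled(x, t, w_in):
        raise ValueError(f"transition {t} is not enabled")
    x_next = dict(x)
    for p, k in w_in[t].items():     # input places lose w(p, t) tokens
        x_next[p] -= k
    for p, k in w_out[t].items():    # output places gain w(t, p) tokens
        x_next[p] = x_next.get(p, 0) + k
    return x_next


# Hypothetical net: t1 consumes 2 tokens from p1 and puts 1 in p2;
# t2 consumes 1 token from p2 and puts 1 in p1.
w_in = {"t1": {"p1": 2}, "t2": {"p2": 1}}    # weights w(p, t)
w_out = {"t1": {"p2": 1}, "t2": {"p1": 1}}   # weights w(t, p)

x0 = {"p1": 3, "p2": 0}
x1 = fire(x0, "t1", w_in, w_out)
x2 = fire(x1, "t2", w_in, w_out)
```

Arcs that are absent from the dictionaries contribute weight zero, which is how this sketch handles places that are not connected to the firing transition.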

In general, it is entirely possible that, after 
several transition firings, the resulting state is 
x = [0,..., 0] or that the number of tokens in 
one or more places grows arbitrarily large after 
an arbitrarily large number of transition firings. 
The latter phenomenon is a key difference from
automata, since finite-state automata have only a
finite number of states, by definition. In contrast,
a finite Petri net graph may result in a Petri net 
with an unbounded number of states. It should 
be noted that a finite-state automaton can always 
be represented as a Petri net; on the other hand, 
not all Petri nets can be represented as finite-state 
automata. 

Similar to timed automata, we can define
timed Petri nets by introducing a clock structure,
except that now a clock sequence $\mathbf{v}_j$ is associated
with a transition $t_j$. A positive real number,
$v_{j,k}$, assigned to $t_j$ has the following meaning:
when transition $t_j$ is enabled for the $k$th time,
it does not fire immediately, but incurs a firing
delay given by $v_{j,k}$; during this delay, tokens are
kept in the input places of $t_j$. Not all transitions
are required to have firing delays. Thus, we
partition $T$ into subsets $T_0$ and $T_D$, such that
$T = T_0 \cup T_D$. $T_0$ is the set of transitions always
incurring zero firing delay, and $T_D$ is the set
of transitions that generally incur some firing
delay. The latter are called timed transitions. The
clock structure (or timing structure) associated
with a set of timed transitions $T_D \subseteq T$ of a
marked Petri net $(P, T, A, w, x)$ is a set
$\mathbf{V} = \{\mathbf{v}_j : t_j \in T_D\}$ of clock (or lifetime) sequences
$\mathbf{v}_j = \{v_{j,1}, v_{j,2}, \ldots\}$, $t_j \in T_D$, $v_{j,k} \in \mathbb{R}_+$,
$k = 1, 2, \ldots$ A timed Petri net is a six-tuple
$(P, T, A, w, x, \mathbf{V})$, where $(P, T, A, w, x)$ is a
marked Petri net and $\mathbf{V} = \{\mathbf{v}_j : t_j \in T_D\}$ is
a clock structure. It is worth mentioning that
this general structure allows for a variety of
behaviors in a timed Petri net, including the
possibility of multiple transitions being enabled
at the same time or an enabled transition being
preempted by the firing of another, depending
on the values of the associated firing delays.
The need to analyze and control such behavior
in DES has motivated the development of a
considerable body of analysis techniques for
Petri net models, which have proven
particularly suitable for this purpose (Moody and
Antsaklis 1998; Peterson 1981).

Dioid Algebras 

Another modeling framework is based on developing
an algebra using two operations: $\min\{a, b\}$
(or $\max\{a, b\}$) for any real numbers $a$ and $b$ and
addition ($a + b$). The motivation comes from
the observation that the operations “min” and
“+” are the only ones required to develop the
timed automaton model. Similarly, the operations
“max” and “+” are the only ones used in developing
the timed Petri net models described
above. The operations are formally named addition
and multiplication and denoted by $\oplus$ and $\otimes$
respectively. However, their actual meaning (in
terms of regular algebra) is different. For any two
real numbers $a$ and $b$, we define

$$\text{Addition:} \quad a \oplus b = \max\{a, b\} \tag{10}$$

$$\text{Multiplication:} \quad a \otimes b = a + b. \tag{11}$$

This dioid algebra is also called a (max, +)
algebra (Baccelli et al. 1992; Cuninghame-Green
1979). If we consider a standard linear discrete
time system, its state equation is of the form

$$x(k + 1) = Ax(k) + Bu(k),$$

which involves (regular) multiplication ($\times$) and
addition (+). It turns out that we can use a
(max, +) algebra with DES, replacing the (+, $\times$)
algebra of conventional time-driven systems, and
obtain a representation similar to the one
above, thus paralleling to a considerable extent
the analysis of classical time-driven linear systems.
We should emphasize, however, that this
particular representation is only possible for a
subset of DES. Moreover, while conceptually
this offers an attractive way to capture the event
timing dynamics in a DES, from a computational
standpoint, one still has to confront the
complexity of performing the “max” operation
when numerical information is ultimately needed
to analyze the system or to design controllers for
its proper operation.
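
To make the parallel with the linear state equation concrete, the sketch below iterates an autonomous recursion of the form $x(k+1) = A \otimes x(k)$, where the matrix-vector product uses (10) and (11) in place of regular addition and multiplication. This form, the function name `maxplus_matvec`, and the numerical matrix are assumptions chosen for illustration; the entry itself does not specify a particular model.

```python
NEG_INF = float("-inf")   # neutral element of (+) = max; absorbing for (x) = +


def oplus(a, b):
    """Dioid addition (10): a (+) b = max{a, b}."""
    return max(a, b)


def otimes(a, b):
    """Dioid multiplication (11): a (x) b = a + b."""
    return a + b


def maxplus_matvec(A, x):
    """(A (x) x)_i = max_j (A[i][j] + x[j])."""
    result = []
    for row in A:
        acc = NEG_INF
        for a_ij, x_j in zip(row, x):
            acc = oplus(acc, otimes(a_ij, x_j))
        result.append(acc)
    return result


# Hypothetical 2x2 matrix of delays; x(k) could represent, e.g., event times.
A = [[3.0, 7.0],
     [2.0, 4.0]]
x = [0.0, 0.0]
trajectory = [x]
for _ in range(3):
    x = maxplus_matvec(A, x)
    trajectory.append(x)
```

Each step costs one “max” over every row, which is the computational burden mentioned above when numerical results are needed.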

Control and Optimization of Discrete 
Event Systems 

The various control and optimization methodologies
developed to date for DES depend on the
modeling level appropriate for the problem of
interest.

Logical Behavior. Issues such as ordering 
events according to some specification or 
ensuring the reachability of a particular state are 
normally addressed through the use of automata 
and Petri nets (Chen and Lafortune 1991; Moody 
and Antsaklis 1998; Ramadge and Wonham 
1987). Supervisory control theory provides
a systematic framework for formulating and
solving problems of this type; comprehensive
coverage can be found in Cassandras and
Lafortune (2008). Logical behavior issues are
also encountered in the diagnosis of partially 
observed DES, a topic covered in the article 
► Diagnosis of Discrete Event Systems. 

Event Timing. When timing issues are introduced,
timed automata and timed Petri nets are
invoked for modeling purposes. Supervisory control
in this case becomes significantly more complicated.
An important class of problems, however,
does not involve the ordering of individual
events, but rather the requirement that selected
events occur within a given “time window” or
with some desired periodic characteristics. Models
based on the algebraic structure of timed Petri
nets or the (max, +) algebra provide convenient
settings for formulating and solving such problems
(Baccelli et al. 1992; Glasserman and Yao
1994).

Performance Analysis. As in classical control 
theory, one can define a performance (or cost) 
function as a means for quantifying system 
behavior. This approach is particularly crucial 
in the study of stochastic DES. Because of 
the complexity of DES dynamics, analytical 
expressions for such performance metrics in 
terms of controllable variables are seldom 
available. This has motivated the use of 
simulation and, more generally, the study 
of DES sample paths; these have proven to 
contain a surprising wealth of information for 
control purposes. The theory of perturbation 
analysis presented in the article ►Perturbation 
Analysis of Discrete Event Systems has provided 
a systematic way of estimating performance 
sensitivities with respect to system parameters 
(Cassandras and Lafortune 2008; Cassandras and 
Panayiotou 1999; Glasserman 1991; Ho and Cao 
1991). 

Discrete Event Simulation. Because of the 
aforementioned complexity of DES dynamics, 
simulation becomes an essential part of DES 
performance analysis (Law and Kelton 1991). 
Discrete event simulation can be defined as a 
systematic way of generating sample paths of
a DES by means of a timed automaton or its 
stochastic counterpart. The same process can be 
carried out using a Petri net model or one based 
on the dioid algebra setting. 

Optimization. Optimization problems can be 
formulated in the context of both untimed and 
timed models of DES. Moreover, such problems 
can be formulated in both a deterministic and a 
stochastic setting. In the latter case, the ability 
to efficiently estimate performance sensitivities 
with respect to controllable system parameters 
provides a powerful tool for stochastic gradient- 
based optimization (when one can define 
derivatives) (Vazquez-Abad et al. 1998). 

A treatment of all such problems from an
application-oriented standpoint, along with further
details on the use of the modeling frameworks
discussed in this entry, can be found in the
article ► Applications of Discrete-Event Systems.
Monotone Systems in Biology

Abstract

Mathematical models arising in biology may
sometimes exhibit the remarkable feature of
preserving the ordering of their solutions with
respect to initial data: in words, the “more” of
x (the state variable) at time 0, the more of it
at all subsequent times. Similar monotonicity
properties may also be exhibited with
respect to input levels. When this is the case,
important features of the system’s dynamics can
be inferred on the basis of purely qualitative
or relatively basic quantitative knowledge of
the system’s characteristics. We will discuss
how monotonicity-related tools can be used
to analyze and design biological systems
with prescribed dynamical behaviors such
as global stability, multistability, or periodic
oscillations.


Keywords 

Feedback interconnections; Monotone dynamics; 
Monotonicity checks 


Introduction 

Ordinary differential equations of a scalar
unknown, under suitable assumptions for uniqueness
of solutions, trivially enjoy the property that any
pair of ordered initial conditions (according to
the standard $\le$ order defined for real numbers)
gives rise to ordered solutions at all positive times
(as well as negative, though this is less relevant
for the developments that follow). Monotone
systems are a special but significant class of
dynamical models, possibly evolving in high-dimensional
or even infinite-dimensional state
spaces, that are nevertheless characterized by
a similar property holding with respect to a
suitably defined notion of partial order. They
became the focus of considerable interest in
mathematics after a series of seminal papers
by Hirsch (1985, 1988) provided the basic
definitions as well as deep results showing
how generic convergence properties of their
solutions are expected under suitable technical
assumptions. Shortly before that, Smale’s
construction (Smale 1976) had already highlighted
how specific solutions could instead exhibit
arbitrary behavior (including periodic or chaotic).
Further results along these lines provide insight
into which sets of extra assumptions allow one
to strengthen generic convergence to global
convergence, including, for instance, existence of
positive first integrals (Banaji and Angeli 2010;
Mierczynski 1987), tridiagonal structure (Smillie
1984), or positive translation invariance (Angeli
and Sontag 2008a).

While these tools were initially developed
with applications in ecology, epidemiology,
chemistry, or economics in mind, it was the
increased importance of mathematical
modeling in molecular biology and the
subsequent rapid development of systems biology
as an emerging independent field of investigation
that made them particularly relevant to biology.
The paper Angeli and Sontag (2003) first
introduced the notion of monotone control
systems, including input and output variables,
which is of interest when looking at systems
arising from the interconnection of monotone
modules. Small-gain theorems and related
conditions were developed to study both positive
(Angeli and Sontag 2004b) and negative (Angeli
and Sontag 2003) feedback interconnections by
relating their asymptotic behavior to properties
of the discrete iterations of a suitable map,
called the steady-state characteristic of the
system.

In particular, convergence of this map is related
to convergent solutions for the original continuous
time system; on the other hand, specific
negative feedback interconnections can instead
give rise to oscillations as a result of Hopf
bifurcations as in Angeli and Sontag (2008b) or
to relaxation oscillators as highlighted in Gedeon
and Sontag (2007).

A parallel line of investigation, originating in
the work of Volpert et al. (1994), exploited the
specific features of models arising in biochemistry
by focusing on structural conditions for
monotonicity of chemical reaction networks
(Angeli et al. 2010; Banaji 2009). Monotonicity is
only one of the possible tools adopted in the
study of dynamics for this class of models in
the related field of chemical reaction network
theory.
Mathematical Preliminaries

To illustrate the main tools of monotone dynamics,
we consider the following systems defined
on partially ordered input, state, and output
spaces. Namely, along with the sets $U$, $X$, $Y$
(which denote input, state, and output space,
respectively), we consider corresponding partial
orders $\succeq_U$, $\succeq_X$, $\succeq_Y$. A typical way of defining
a partial order on a set $S$ embedded in some
Euclidean space $E$, $S \subset E$, is to first identify
a cone $K$ of positive vectors which belong to
$E$. A cone in this context is any closed convex
set which is preserved under multiplication by
nonnegative scalars and such that $K \cap -K = \{0\}$.
Accordingly we may denote $s_1 \succeq_S s_2$ whenever
$s_1 - s_2 \in K$. A typical choice of $K$ in the case
of finite-dimensional $E = \mathbb{R}^n$ is the positive
orthant ($K = [0, +\infty)^n$), in which case $\succeq$ can be
interpreted as componentwise inequalities. More
general orthants are also very useful in several
applications, as are more exotic cones, smooth
or polyhedral, according to the specific model
considered. When dealing with input signals, we
let $\mathcal{U}$ denote the set of locally essentially bounded
and measurable functions of time. In particular,
we inherit a partial order on $\mathcal{U}$ from the partial
order on $U$ according to the following definition:

$$u_1(\cdot) \succeq_{\mathcal{U}} u_2(\cdot) \;\Leftrightarrow\; u_1(t) \succeq_U u_2(t) \quad \forall\, t \in \mathbb{R}.$$

When obvious from the context, we do not
emphasize the space to which variables belong
and simply write $\succeq$. Strict order notions are also
of interest and especially relevant for some of
the deepest implications of the theory. We let
$s_1 \succ s_2$ denote $s_1 \succeq s_2$ and $s_1 \neq s_2$, while for
partial orders induced by positivity cones, we let
$s_1 \gg s_2$ denote $s_1 - s_2 \in \mathrm{int}(K)$.

A dynamical system is for us a continuous
map $\varphi : \mathbb{R} \times X \to X$ which fulfills the properties
$\varphi(0, x) = x$ for all $x \in X$ and $\varphi(t_2, \varphi(t_1, x)) =
\varphi(t_1 + t_2, x)$ for all $t_1, t_2 \in \mathbb{R}$. Sometimes, when
solutions are not globally defined (for instance, if
the system is defined through a set of nonlinear
differential equations), it is enough to restrict the
definitions that follow to the domain of existence
of solutions.


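Definition 1 A monotone system $\varphi$ is one that
fulfills the following:

$$\forall\, x_1, x_2 \in X : x_1 \succeq x_2 \;\Rightarrow\; \varphi(t, x_1) \succeq \varphi(t, x_2) \quad \forall\, t \ge 0. \tag{1}$$

A system $\varphi$ is strongly monotone when the following
holds:

$$\forall\, x_1, x_2 \in X : x_1 \succ x_2 \;\Rightarrow\; \varphi(t, x_1) \gg \varphi(t, x_2) \quad \forall\, t > 0. \tag{2}$$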

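A control system is characterized by two continuous
mappings: $\varphi : \mathbb{R} \times X \times \mathcal{U} \to X$ and the
readout map $h : X \times U \to Y$.

Definition 2 A control system is monotone if

$$\forall\, u_1, u_2 \in \mathcal{U} : u_1 \succeq_{\mathcal{U}} u_2, \;\; \forall\, x_1, x_2 \in X : x_1 \succeq_X x_2, \;\; \forall\, t \ge 0 : \quad \varphi(t, x_1, u_1) \succeq_X \varphi(t, x_2, u_2) \tag{3}$$

and

$$\forall\, u_1, u_2 \in U : u_1 \succeq_U u_2, \;\; \forall\, x_1, x_2 \in X : x_1 \succeq_X x_2, \quad h(x_1, u_1) \succeq_Y h(x_2, u_2). \tag{4}$$

Notice that for any ordered state and input pairs
$x_1, x_2$, $u_1, u_2$, the signals $y_1$ and $y_2$ defined
as $y_1(t) := h(\varphi(t, x_1, u_1), u_1(t))$ and
$y_2(t) := h(\varphi(t, x_2, u_2), u_2(t))$ also fulfill, thanks to
Definition 2, $y_1(t) \succeq_Y y_2(t)$ (for all $t \ge 0$).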

A system which is monotone with respect to
the positive orthant is called cooperative. If a
system is cooperative after reversing the direction
of time, it is called competitive. Checking if
a mathematical model specified by differential
equations is monotone with respect to the partial
order induced by some cone $K$ is not too difficult.
In particular, monotonicity, in its most basic
formulation (1), simply amounts to a check of
positive invariance of the set $\Gamma := \{(x_1, x_2) \in
X^2 : x_1 \succeq x_2\}$ for a system formed by two copies
of $\varphi$ in parallel. This can be assessed without
explicit knowledge of solutions, for instance, by
using the notion of tangent cones and Nagumo’s
theorem (Angeli and Sontag 2003). Sufficient
conditions also exist to assess strong monotonicity,
for instance, in the case of orthant cones.
Finding whether there exists an order (as induced,
for instance, by a suitable cone $K$) which can
make a system monotone is instead a harder task
which normally entails a good deal of insight into
the system’s dynamics.

It is worth mentioning that for the special case
of linear systems, monotonicity is just equivalent
to invariance of the cone $K$, as incremental
properties (referred to pairs of solutions) are just
equivalent to their non-incremental counterparts
(referred to the 0 solution). In this respect, a
substantial amount of theory exists starting from
classical works such as the Perron-Frobenius theory
on positive and cone-preserving maps; this is,
however, outside the scope of this entry, and the
interested reader may refer to Farina and Rinaldi
(2000) for a recent book on the subject.

Monotone Dynamics 

We divide this section into three parts: first we summarize
the main tools for checking monotonicity
with respect to orthant cones, then we recall some
of the main consequences of monotonicity for the
long-term behavior of solutions, and finally we
study interconnections of monotone systems.

Checking Monotonicity 

Orthant cones and the partial orders they induce
play a major role in biology applications. In fact,
for systems described by equations

$$\dot{x} = f(x) \tag{5}$$

with $X \subset \mathbb{R}^n$ open and $f : X \to \mathbb{R}^n$ of class $C^1$,
the following characterization holds:

Proposition 1 The system $\varphi$ induced by the set
of differential equations (5) is cooperative if and
only if the Jacobian $\partial f / \partial x$ is a Metzler matrix for all
$x \in X$.

We recall that $M$ is Metzler if $m_{ij} \ge 0$ for all $i \neq j$.
Let $\Lambda = \mathrm{diag}[\lambda_1, \ldots, \lambda_n]$ with $\lambda_i \in \{-1, 1\}$
and assume that the orthant $O = \Lambda [0, +\infty)^n$.
It is straightforward to see that $x_1 \succeq_O x_2 \Leftrightarrow
\Lambda x_1 \succeq \Lambda x_2$, where $\succeq$ denotes the partial order
induced by the positive orthant, while $\succeq_O$ denotes
the order induced by $O$. This means that
we may check monotonicity with respect to $O$
by performing a simple change of coordinates
$z = \Lambda x$. As a corollary:

Proposition 2 The system $\varphi$ induced by the set
of differential equations (5) is monotone with
respect to $\succeq_O$ if and only if $\Lambda \frac{\partial f}{\partial x} \Lambda$ is a Metzler
matrix for all $x \in X$.
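
The conditions of Propositions 1 and 2 are easy to test numerically at a given point. The sketch below checks the Metzler property of a Jacobian sample and of its conjugate $\Lambda J \Lambda$; the function names and the numerical Jacobian (a hypothetical two-species mutual-inhibition pattern) are assumptions made here. A check at one point is only a necessary test: the propositions require the property for all $x \in X$.

```python
def is_metzler(J, tol=0.0):
    """True if all off-diagonal entries of the square matrix J are >= -tol."""
    n = len(J)
    return all(J[i][j] >= -tol for i in range(n) for j in range(n) if i != j)


def conjugate_by_signs(J, lam):
    """Return Lambda * J * Lambda for Lambda = diag(lam), lam[i] in {-1, +1}."""
    n = len(J)
    return [[lam[i] * J[i][j] * lam[j] for j in range(n)] for i in range(n)]


def is_orthant_monotone_at(J, lam):
    """Pointwise version of the condition in Proposition 2."""
    return is_metzler(conjugate_by_signs(J, lam))


# Hypothetical Jacobian sample: each species inhibits the other.
J = [[-1.0, -2.0],
     [-3.0, -1.0]]

cooperative_here = is_metzler(J)                       # positive orthant
monotone_mixed = is_orthant_monotone_at(J, [1, -1])    # orthant with lambda = (1, -1)
```

In this example the sample is not Metzler, but it becomes Metzler after the change of coordinates $z = \Lambda x$ with $\Lambda = \mathrm{diag}[1, -1]$.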

Notice that the conditions of Propositions 1 and 2
can be expressed in terms of sign constraints on
off-diagonal entries of the Jacobian; in biological
terms a sign constraint on an off-diagonal entry
amounts to asking that a particular species (meaning
chemical compound or otherwise) consistently
exhibit throughout the considered model’s
state space either an excitatory or inhibitory effect
on some other species of interest. Qualitative
diagrams showing effects of species on each other
are commonly used by biologists to understand
the working principles of biomolecular networks.

Remarkably, Proposition 2 also has an interesting
graph theoretical interpretation if one
thinks of $\mathrm{sign}(\partial f / \partial x)$ as the adjacency matrix of a
graph with nodes $x_1, \ldots, x_n$ corresponding to the
state variables of the system.

Proposition 3 The system $\varphi$ induced by the set
of differential equations (5) is monotone with
respect to $\succeq_O$ for some orthant $O$ if and only if
the directed graph of adjacency matrix
$\mathrm{sign}(\partial f / \partial x)$ (neglecting diagonal entries) has
only undirected loops with an even
number of negative edges.

This means in particular that $\partial f / \partial x$ must be sign
symmetric (no predator-prey-type interactions)
and in addition that a similar parity property has
to hold on undirected loops of arbitrary length.
Sufficient conditions for strong monotonicity are
also known, for instance, in terms of irreducibility
of the Jacobian matrix (Kamke’s condition;
see Hirsch and Smith 2003).
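
The parity condition on undirected loops can be tested by a two-coloring of the sign graph: one looks for signs $\lambda_i \in \{-1, 1\}$ with $\lambda_i \lambda_j$ equal to the sign of every nonzero off-diagonal entry. The sketch below implements this idea with a breadth-first search; the function name `find_orthant` and the three sign patterns are illustrative assumptions, and the sign matrix is assumed to be constant over the state space.

```python
from collections import deque


def find_orthant(S):
    """Given a sign matrix S (entries -1, 0, +1; S[i][j] = sign of df_i/dx_j),
    return lam in {-1, +1}^n such that lam[i] * S[i][j] * lam[j] >= 0 for all
    i != j, or None if some undirected loop has an odd number of negative edges.
    """
    n = len(S)
    lam = [0] * n                        # 0 means "not yet assigned"
    for start in range(n):
        if lam[start] != 0:
            continue
        lam[start] = 1                   # free choice in each connected component
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(n):
                if j == i:
                    continue
                for s in (S[i][j], S[j][i]):   # edges are treated as undirected
                    if s == 0:
                        continue
                    want = lam[i] if s > 0 else -lam[i]
                    if lam[j] == 0:
                        lam[j] = want
                        queue.append(j)
                    elif lam[j] != want:       # odd parity or sign asymmetry
                        return None
    return lam


mutual_inhibition = [[0, -1], [-1, 0]]              # two negative edges
predator_prey = [[0, -1], [1, 0]]                   # not sign symmetric
three_repressors = [[0, 0, -1], [-1, 0, 0], [0, -1, 0]]   # loop with 3 negative edges
```

A returned vector `lam` defines the matrix $\Lambda$ of Proposition 2; a result of `None` means no orthant order makes this sign pattern monotone.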

Asymptotic Dynamics 

As previously mentioned, several important implications
of monotonicity are with respect to
asymptotic dynamics. Let $\mathcal{E}$ denote the set of
equilibria of $\varphi$. The following result is due to
Hirsch (1985).

Theorem 1 Let $\varphi$ be a strongly monotone system
with bounded solutions. There exists a zero
measure set $Q$ such that each solution starting in
$X \setminus Q$ converges toward $\mathcal{E}$.

Global convergence results can be achieved for
important classes of monotone dynamics. For
instance, when increasing conservation laws are
present (see Banaji and Angeli 2010):

Theorem 2 Let $X \subset K \subset \mathbb{R}^n$ be any two proper
cones. Let $\varphi$ on $X$ be strongly monotone with
respect to the partial order induced by $K$ and
preserving a $K$-increasing first integral. Then
every bounded solution converges.

Dually to first integrals, positive translation invariance
of the dynamics also provides grounds
for global convergence (see Angeli and Sontag
2008a):

Theorem 3 If a system is strongly monotone and
fulfills $\varphi(t, x_0 + hv) = \varphi(t, x_0) + hv$ for all $h \in \mathbb{R}$
and some $v \gg 0$, then all solutions with bounded
projections in $v^{\perp}$ converge.

The class of tridiagonal cooperative systems has
also been investigated as a remarkable
class of globally convergent dynamics; see Smillie
(1984). These arise from differential equations
$\dot{x} = f(x)$ when $\partial f_i / \partial x_j = 0$ for all $|i - j| > 1$.

Finally, it is worth emphasizing how significant
results on singular perturbations are for biological
systems, which are often subject to phenomena
evolving at different timescales (Gedeon
and Sontag 2007; Wang and Sontag 2008).

Interconnected Monotone Systems 

Results on interconnected monotone SISO systems
are surveyed in Angeli and Sontag (2004a).
The main tool used in this context is the notion of
input-state and input-output steady-state characteristic.

Definition 3 A control system admits a well-defined
input-state characteristic if for all
constant inputs $u$ there exists a unique globally
asymptotically stable equilibrium $k_x(u)$ and
the map $k_x(u)$ is continuous. If moreover the
equilibrium is hyperbolic, then $k_x$ is called a
non-degenerate characteristic. The input-output
characteristic is defined as $k_y(u) = h(k_x(u))$.

Let $\varphi$ be a system with a well-defined input-output
characteristic $k_y$; we may define the iteration

$$u_{k+1} = k_y(u_k). \tag{6}$$

It is clear that fixed points of (6) correspond to
input values (and therefore to equilibria through
the characteristic map $k_x$) of the closed-loop
system derived by considering the unity feedback
interconnection $u = y$. What is remarkable for
monotone systems is that both in the case of
positive and negative feedback and in a precise
sense, stability properties of the fixed points of
the discrete iteration (6) are matched by stability
properties of the corresponding associated solutions
of the original continuous time system. See
Angeli and Sontag (2004b) for the case of positive
feedback interconnections and Angeli et al.
(2004) for applications of such results to synthesis
and detection of multistability in molecular
biology.

Multistability, in particular, is an important
dynamical feature of specific cellular systems and
can be achieved, with a good degree of robustness
with respect to different types of uncertainties, by
means of positive feedback interconnections of
monotone subsystems. The typical input-output
characteristic $k_y$ giving rise to such behavior is,
in the SISO case, that of a sigmoidal function
intersecting the diagonal $u = y$ in 3 points. Two
of the fixed points, namely, $u_1$ and $u_3$ (see Fig. 1),
are asymptotically stable for (6), and the corresponding
equilibria of the original continuous
time monotone system are also asymptotically
stable with a basin of attraction which covers
almost all initial conditions. The fixed point $u_2$
is unstable and the corresponding equilibrium is
also unstable (under suitable technical assumptions
on the non-degeneracy of the I-O characteristic).
Extensions of similar criteria to the MIMO case
are presented in Enciso and Sontag (2005).
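
The role of the iteration (6) in detecting bistability can be illustrated numerically. The sketch below uses a hypothetical sigmoidal characteristic, the Hill function $k_y(u) = u^2 / (K^2 + u^2)$ with $K^2 = 3/16$, chosen here only because its fixed points can be computed by hand: $u = 0$, $u = 0.25$, and $u = 0.75$. The function names are likewise assumptions.

```python
def iterate(k_y, u0, n_steps=200):
    """Apply the discrete iteration (6), u_{k+1} = k_y(u_k), n_steps times."""
    u = u0
    for _ in range(n_steps):
        u = k_y(u)
    return u


def hill(u, K2=3.0 / 16.0):
    """Hypothetical increasing sigmoidal characteristic u^2 / (K^2 + u^2)."""
    return u * u / (K2 + u * u)


# Fixed points solve u = hill(u): u = 0 or u^2 - u + K^2 = 0, i.e. 0.25 and 0.75.
low = iterate(hill, 0.20)    # starts below the middle fixed point
high = iterate(hill, 0.30)   # starts above the middle fixed point
```

Initial inputs on either side of the middle fixed point converge to different limits, which is the discrete counterpart of the two stable equilibria described above. The slope of this characteristic is 1.5 at $u = 0.25$ and 0.5 at $u = 0.75$, consistent with the former being unstable and the latter stable for the iteration.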
Negative feedback normally destroys monotonicity.
As a result, the likelihood of complex
dynamical behavior is considerably increased.
Nevertheless, input-output characteristics can still
provide useful insight into the system’s dynamics at
least in the case of low feedback gain or, for
high feedback gains, in the presence of sufficiently
large delays. For instance, unity negative
feedback interconnection of a SISO monotone
system may give rise to a unique and globally
asymptotically stable fixed point of (6), thanks
to the decreasing input-output characteristic,
as shown in Fig. 2. Under such
circumstances a small-gain result applies and
global asymptotic stability of the corresponding
equilibrium is guaranteed regardless of arbitrary
input delays in the systems. See Angeli and
Sontag (2003) for the simplest small-gain theorem
developed in the context of SISO negative
feedback interconnections of monotone systems
and Enciso and Sontag (2006) for generalizations
to systems with multiple inputs as well as delays.
A generalization of small-gain results to the case
of MIMO systems which are neither in a positive
nor negative feedback configuration is presented
in Angeli and Sontag (2011).


When the iteration (6) has an unstable fixed
point and, for instance, converges to a period-2
solution, one may expect the onset of
oscillations around the equilibrium through a
Hopf bifurcation, provided sufficiently large
delays in the input channels are allowed. This
situation is analyzed in Angeli and Sontag
(2008b) and illustrated through the study of
Goldbeter’s classical model of the Drosophila
circadian rhythm.
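
The two negative feedback regimes can be contrasted on the discrete iteration alone. The sketch below again assumes that (6) is the scalar iteration u_{k+1} = k(u_k) and uses two hypothetical decreasing characteristics: a shallow one, for which the iteration contracts to its unique fixed point (the small-gain situation), and a steep one, for which the fixed point u = 1 is unstable and the iterates settle on a period-2 orbit. It does not simulate the continuous-time system or its delays.

```python
def shallow(u: float) -> float:
    """Hypothetical decreasing characteristic with small slope at its fixed point."""
    return 1.0 / (1.0 + u)


def steep(u: float) -> float:
    """Hypothetical decreasing characteristic whose fixed point u = 1 has slope -2."""
    return 2.0 / (1.0 + u**4)


def orbit_tail(k, u0: float, steps: int = 500):
    """Iterate u <- k(u) for `steps` steps, then return three consecutive iterates."""
    u = u0
    for _ in range(steps):
        u = k(u)
    u1 = k(u)
    u2 = k(u1)
    return u, u1, u2


# Small-gain case: consecutive iterates coincide at the unique fixed point.
s0, s1, s2 = orbit_tail(shallow, 0.9)

# Unstable fixed point: iterates alternate between two distinct values.
p0, p1, p2 = orbit_tail(steep, 0.9)
```

In the second case the iterates repeat every two steps while consecutive iterates stay far apart, which is the period-2 behavior mentioned in the text.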

Summary and Future Directions 

Verifying that a control system preserves some
ordering of initial conditions has important
and far-reaching implications for its dynamics.
The emergence of specific behaviors can often
be inferred on the basis of purely qualitative
knowledge (as in the case of Hirsch’s generic
convergence theorem) or with additional basic
quantitative knowledge (as in the case of positive
and negative feedback interconnections of monotone
systems). For these reasons, applications
of monotone systems theory in molecular biology
are gradually emerging: for instance, in the
study of MAPK cascades or circadian oscillations,
as well as in Chemical Reaction Network
Theory. Generally speaking, while monotonicity
as a whole cannot be expected in large networks,
experimental data show that the number of negative
feedback loops in biological regulatory networks
is significantly lower than in a random
signed graph of comparable size (Maayan et al.
2008).

Analyzing the properties of monotone dynam¬ 
ics may potentially lead to better understanding 
of the key regulatory mechanisms of complex 
networks as well as the development of bottom- 
up approaches for the identification of meaning¬ 
ful submodules in biological networks. Poten¬ 
tial research directions may include both novel 
computational tools and specific applications to 
systems biology, for instance: 

• Algorithms for detection of monotonicity with 
respect to exotic orders (such as arbitrary 
polytopic cones or even state-dependent 
cones) 

• Application of monotonicity-based ideas to 
control synthesis (see, for instance, Aswani 
and Tomlin (2009) where the special class of 
piecewise affine systems is considered) 

Cross-References 

► Deterministic Description of Biochemical Net¬ 
works 

► Spatial Description of Biochemical Networks 

► Stochastic Description of Biochemical 
Networks 

Recommended Reading 

For readers interested in the mathematical details 
of monotone systems theory we recommend the 
following: 

Smith H (1995) Monotone dynamical systems: 
an introduction to the theory of competitive 
and cooperative systems. Mathematical sur¬ 
veys and monographs, vol 41. AMS, Provi¬ 
dence 

A more recent technical survey of aspects related 
to asymptotic dynamics of monotone systems is 


Hirsch MW, Smith H (2005) Monotone
dynamical systems (Chapter 4). In: Canada A,
Drabek P, Fonda A (eds) Handbook of
differential equations: ordinary differential
equations, vol 2. Elsevier

Abstract

The fundamental idea behind symbolic control 
is to mitigate the complexity of a dynamic sys¬ 
tem by limiting the set of available controls to 
a typically finite collection of symbols. Each 
symbol represents a control law that may be 
either open or closed loop. With these symbols, 
a simpler description of the motion of the system 
can be created, thereby easing the challenges 
of analysis and control design. In this entry,
we provide a high-level description of symbolic
control; briefly discuss its history, connections,
and applications; and provide a few insights into
where the field is going.

Keywords 

Abstraction; Complex systems; Formal methods 


Introduction 

Systems and control theory is a powerful paradigm
for analyzing, understanding, and controlling dynamic
systems. Traditional tools in the field for
developing and analyzing control laws, however, 
face significant challenges when one needs to 
deal with the complexity that arises in many 
practical, real-world settings such as the control 
of autonomous, mobile systems operating in un¬ 
certain and changing physical environments. This 
is particularly true when the tasks to be achieved 
are not easily framed in terms of motion to a point 
in the state space. One of the primary goals of 
symbolic control is to mitigate this complexity 
by abstracting some combination of the system 
dynamics, the space of control inputs, and the 
physical environment to a simpler, typically fi¬ 
nite, model. 

This fundamental idea, namely, that of ab¬ 
stracting away the complexity of the underlying 
dynamics and environment, is in fact a quite 
natural one. Consider, for example, how you 
give instructions to another person wanting to 
go to a point of interest. It would be absurd 
to provide details at the level of their actua¬ 
tors, namely, with commands to their individual 
muscles (or to carry the example to an even 
more absurd extreme, to the dynamic components 
that make up those muscles). Rather, very high- 
level commands are given, such as “follow the 
road,” “turn right,” and so on. Each of these 
provides a description of what to do with the 
understanding that the person can carry out those 
commands in their own fashion. Similarly, the en¬ 
vironment itself is abstracted, and only elements 
meaningful to the task at hand are described. 
Thus, continuing the example above, rather than 
providing metric information or a detailed map,
the instructions may use environmental features 
to determine when particular actions should be 
terminated and the next begun, such as “follow 
the road until the second intersection, then turn 
right.” 

Underlying the idea of symbolic control is 
the notion that rich behaviors can result from 
simple actions. This premise was used in many 
early robots and can be traced back at least to 
the ideas of Norbert Wiener on cybernetics (see 
Arkin 1998). It is at the heart of the behavior- 
based approach to robotics (Brooks 1986). Sim¬ 
ilar ideas can also be seen in the development 
of a high-level language (G-codes) for Computer 
Numerically Controlled (CNC) machines. The 
key technical ideas in the more general setting 
of symbolic control for dynamic systems can be 
traced back to Brockett (1988) which introduced 
ideas of formalizing a modular approach to pro¬ 
gramming motion control devices through the 
development of a Motion Description Language 
(MDL). 

The goal of the present work is to introduce the 
interested reader to the general ideas of symbolic 
control as well as to some of its application 
areas and research directions. While it is not a 
survey paper, a few select references are provided 
throughout to point the reader in hopefully fruit¬ 
ful directions into the literature. 


Models and Approaches 

There are at least two related but distinct ap¬ 
proaches to symbolic control. Both begin with a 
mathematical description of the system, typically 
given as an ordinary differential equation of the 
form 

$$\dot{x} = f(x, u, t), \qquad y = h(x, t) \tag{1}$$

where x is a vector describing the state of the
system, y is the output of the sensors of the
system, and u is the control input.

Under the first approach to symbolic control,
the focus is on reducing the complexity of the
space of possible control signals by limiting the
system to a typically finite collection of control
symbols. Each of these symbols represents a
control law that may be open loop or may utilize
output feedback. For example, follow the road 
could be a feedback control law that uses sensor 
measurements to determine the position relative 
to the road and then applies steering commands 
so that the system stays on the road while simul¬ 
taneously maintaining a constant speed. There 
are, of course, many ways to accomplish the 
specifics of this task, and the details will depend 
on the particular system. Thus, an autonomous 
four-wheeled vehicle equipped with a laser range 
finder, an autonomous motorcycle equipped with 
ultrasonic sensors, or an autonomous aerial vehicle
with a camera would each carry out the command
in its own way, and each would follow a very
different trajectory. They would all, however,
satisfy the notion of follow the road. Description 
of the behavior of the system can then be given in 
terms of the abstract symbols rather than in terms 
of the details of the trajectories. 

Typically each of these symbols describes an 
action that at least conceptually is simple. In 
order to generate rich motions to carry out com¬ 
plex tasks, the system is switched between the 
available symbols. Switching conditions are of¬ 
ten referred to as interrupts. Interrupts may be 
purely time-based (e.g., apply a given symbol 
for T seconds) or may be expressed in terms 
of symbols representing certain environmental 
conditions. These may be simple functions of
the measurements (e.g., interrupt when an in¬ 
tersection is detected) or may represent more 
complicated scenarios with history and dynamics 
(e.g., interrupt after the second intersection is 
detected). Just as the input symbols abstract away 
the details of the control space and of the mo¬ 
tion of the system, the interrupt symbols abstract 
away the details of the environment. For example, 
intersection has a clear high-level meaning but 
a very different sensor “signature” for particular 
systems. 

Motion Description Languages and Symbolic Control,
Fig. 1 Simple example of symbolic control with a
focus on abstracting the inputs. Two systems, a snakelike
robot and an autonomous car, are given a high-level plan
in terms of symbols for navigating a right-hand turn.
The systems each interpret the same symbols in their
own ways, leading to different trajectories due both to
differences in dynamics and to different sensor cues
caused, for example, by the parked vehicle encountered
by the car in this scenario

As a simple illustrative example, consider a
collection of control symbols designed for moving
along a system of roads, {follow road, turn
right, turn left}, and a collection of interrupt
symbols for triggering changes in such a setting,
{in intersection, clear of intersection}. Suppose
there are two vehicles that can each interpret 
these symbols, an autonomous car and a snake¬ 
like robot, as illustrated in Fig. 1. It is reasonable 
to assume that the control symbols each describe 
relatively complex dynamics that allow, for ex¬ 
ample, for obstacle avoidance while carrying out 
the action. Figure 1 illustrates a possible situation 
where the two systems carry out the plan defined 
by the symbolic sequence: 

(Follow the road UNTIL in intersection) 

(Turn right UNTIL clear of intersection) 

The intent of this plan is for the system to nav¬ 
igate a right-hand turn. As shown in the figure, 
the actual trajectories followed by the systems 
can be markedly different due in part to sys¬ 
tem dynamics (the snakelike robot undulates, 
while the car does not) as well as to different 
sensor responses (when the car goes through, 
there is a parked vehicle that it must navigate 
around, while the snakelike robot found a clear 
path during its execution). Despite these dif¬ 
ferences, both systems achieve the goal of the 
plan. 
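
The plan above can be read as a small program: a list of (control symbol, interrupt symbol) pairs that each system interprets with its own control laws. The following is a minimal sketch of such an interpreter. The toy vehicle model, the state variables `s` and `heading`, the numerical thresholds, and all function names are hypothetical. For brevity both vehicles share the same interrupt tests, whereas in practice each system would detect an intersection through its own sensors.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
Plan = List[Tuple[str, str]]  # (control symbol, interrupt symbol) pairs


def execute(plan: Plan,
            controls: Dict[str, Callable[[State], State]],
            interrupts: Dict[str, Callable[[State], bool]],
            state: State,
            max_steps: int = 10_000):
    """Apply each control symbol until its interrupt fires; return final state and step counts."""
    state = dict(state)
    trace = []
    for symbol, interrupt in plan:
        steps = 0
        while not interrupts[interrupt](state):
            if steps >= max_steps:
                raise RuntimeError(f"{interrupt!r} never fired during {symbol!r}")
            state = controls[symbol](state)
            steps += 1
        trace.append((symbol, steps))
    return state, trace


def make_vehicle(speed: float, turn_rate: float) -> Dict[str, Callable[[State], State]]:
    """Hypothetical vehicle: 's' is distance along the road, 'heading' is in degrees."""
    return {
        "follow_road": lambda x: {**x, "s": x["s"] + speed},
        "turn_right": lambda x: {**x, "heading": x["heading"] - turn_rate},
    }


interrupts = {
    "in_intersection": lambda x: x["s"] >= 10.0,
    "clear_of_intersection": lambda x: x["heading"] <= -90.0,
}

plan = [("follow_road", "in_intersection"),
        ("turn_right", "clear_of_intersection")]
start = {"s": 0.0, "heading": 0.0}

# The same symbolic plan, interpreted by two different systems.
car_state, car_trace = execute(plan, make_vehicle(2.0, 30.0), interrupts, start)
snake_state, snake_trace = execute(plan, make_vehicle(0.5, 10.0), interrupts, start)
```

Both runs end with the same heading, but the number of steps spent under each symbol differs, mirroring the point that the symbolic description is shared while the trajectories are not.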

The collection of control and interrupt
symbols can be thought of as a language for
describing and specifying motion and is used
to write programs that can be compiled into
an executable for a specific system. Different
rules for doing this can be established that 
define different languages, analogous to different 
high-level programming languages such as 
C++, Java, or Python. Further details can 
be found in, for example, Manikonda et al. 
(1998). 

Under the second approach, the focus is on 
representing the dynamics and state space (or 
environment) of the system in an abstract, sym¬ 
bolic way. The fundamental idea is to lump all 
the states in a region into a single abstract el¬ 
ement and to then represent the entire system 
with a finite number of these elements. Control 
laws are then defined that steer all the states 
in one element into some state in a different 
region. The symbolic control system is then the 
finite set of elements representing regions to¬ 
gether with the finite set of controllers for moving 
between them. It can be thought of essentially 
as a graph (or more accurately as a transition 
system) in which the nodes represent regions in 
the state space and the edges represent achiev¬ 
able transitions between them. The goal of this 
abstraction step is for the two representations to 
be equivalent (or at least approximately equiv¬ 
alent) in that any motion that can be achieved 
in one can be achieved in the other (in an appropriate
sense). Planning and analysis can then
be done on the (simpler) symbolic model. Further
details on such schemes can be found in,
for example, Tabuada (2006) and Bicchi et al.
(2006).

Motion Description Languages and Symbolic Control,
Fig. 2 Simple example of symbolic control with a
focus on abstracting the system dynamics and environment.
The initial environment (left image) is segmented
into different regions and simple controllers developed
for moving from region to region. The image shows two
possible controllers: one that actuates the robot through
a tight slither pattern to move forward by one region and
one that twists the robot to face the cell to the left before
slithering across and then reorienting. The combination of
regions and actions yields a symbolic abstraction (center
image) that allows for planning to achieve specific goals,
such as moving through the right-hand turn. Executing
this plan leads to a physical trajectory of the system (right
image). Panel labels: Abstraction; Execution of plan

As an illustrative example, consider as before 
a snakelike robot moving through a right-hand 
turn. In a simplified view of this second approach 
to symbolic control, one begins by dividing the 
environment up into regions and then defining 
controllers to steer the robot from region to region 
as illustrated in the left image in Fig. 2. This 
yields the symbolic model shown in the center 
of Fig. 2. A plan is then developed on this model 
to move from the initial position to the final 
position. This planning step can take into account 
restrictions on the motion and subgoals of the 
task. Here, for example, one may want the robot
to stay to the right of the double yellow line, that is,
in its lane of traffic. The plan
$R_2 \to R_4 \to R_6 \to R_8 \to R_9 \to R_{10}$
is one sequence that drives
the system around the turn while satisfying the 
lane requirement. Each transition in the sequence 
corresponds to a control law. The plan is then 
executed by applying the sequence of control 
laws, resulting in the actual trajectory shown in 
the right image in Fig. 2. 
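
Planning on the symbolic model amounts to a search over a finite transition system. The sketch below uses breadth-first search to find a shortest region sequence, with the lane requirement expressed as a set of forbidden regions. The region names and the adjacency structure are hypothetical and are not read off Fig. 2; each edge stands for one region-to-region controller.

```python
from collections import deque

# Hypothetical transition system: region -> regions reachable with one controller.
EDGES = {
    "R1": ["R3"],
    "R2": ["R4", "R1"],
    "R3": ["R5"],
    "R4": ["R6", "R7"],
    "R5": ["R7"],
    "R6": ["R8"],
    "R7": ["R10"],
    "R8": ["R9"],
    "R9": ["R10"],
    "R10": [],
}


def plan_regions(start, goal, forbidden=frozenset()):
    """Breadth-first search for a shortest region sequence that avoids `forbidden`."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        region = queue.popleft()
        if region == goal:
            path = []
            while region is not None:
                path.append(region)
                region = parents[region]
            return path[::-1]
        for nxt in EDGES[region]:
            if nxt not in parents and nxt not in forbidden:
                parents[nxt] = region
                queue.append(nxt)
    return None  # no admissible sequence exists


# Regions assumed to lie on the wrong side of the lane marking.
wrong_lane = {"R1", "R3", "R5", "R7"}

lane_plan = plan_regions("R2", "R10", forbidden=wrong_lane)
free_plan = plan_regions("R2", "R10")
```

With the lane restriction the search returns the longer sequence that stays in lane; without it, the shorter sequence that cuts across the forbidden region is found. Executing either plan would mean applying the controller attached to each transition in turn.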


Applications and Connections 

The fundamental idea behind symbolic control, 
namely, mitigating complexity by abstracting a 
system, its environment, and even the tasks to be 
accomplished into a simpler but (approximately) 
equivalent model, is a natural and a powerful one. 
It has clear connections to both hybrid systems 
(Brockett 1993; Egerstedt 2002) and to quantized 
control (Bicchi et al. 2006), and the tools from 
those fields are often useful in describing and 
analyzing systems with symbolic representations 
of the control and of the dynamics. Symbolic 
control is not, however, strictly a subcategory of 
either field, and it provides a unique set of tools 
for the control and analysis of dynamic systems. 

Brockett’s original MDL was intended to 
serve as a tool for describing and planning 
robot motion. Inspired in part by this, languages 
for motion continue to be developed. Some 
of these extend and provide a more formal 
basis for motion programming (Manikonda 
et al. 1998) and interconnection of dynamic 
systems into a single whole (Murray et al. 
1992), while some are designed for specialized 
dynamics or applications such as flight vehicles 
(Frazzoli et al. 2005), self-assembly (Klavins
2007), and other areas. In addition to studying 
standard systems and control theoretic ideas, 
including notions of reachability (Bicchi et al. 
2002) and stability (Tarraf et al. 2008), the 
framework of symbolic control introduces 
interesting questions such as how to understand 
the reduction of complexity that can be achieved 
for a given collection of symbols (Egerstedt and 
Brockett 2003). 

While there are many application areas of 
symbolic control, the one that is perhaps most 
active is that of motion planning for autonomous 
mobile robots (Belta et al. 2007). As illustrated 
in Figs. 1 and 2, symbolic control allows the 
planning problem (i.e., the determination of how 
to achieve a desired task) to be separated from 
the complexities of the dynamics. The approach 
has been particularly fertile when coupled with 
symbolic descriptions of the tasks to be achieved. 
While point-to-point commands are useful, and
can often be thought of as symbols themselves
from which to build more complicated com¬ 
mands, most tasks that one would want mobile 
robots to carry out involve combinations of spa¬ 
tial goals (move to a certain location), sequencing 
(first do this and then do that) or other tempo¬ 
ral requirements (repeatedly visit a collection of 
regions), as well as safety or other restrictions 
(avoid obstacles or regions that are dangerous 
for the robot to traverse). Such tasks can be 
described using a variety of temporal logics. 
These are, essentially, logic systems that include 
rules related to time in addition to the standard 
Boolean operators. A task specified in this way can
be combined with a symbolic description of a system,
and automated tools can then be used both to check
whether the system is able to perform the desired task
and to design plans that ensure the system will do
so (Fainekos et al. 2009). To ensure that results
on the abstract, symbolic system are valid on 
the original dynamic system, methods exist for 
guaranteeing the equivalence of the two mod¬ 
els, in an appropriate sense (Girard and Pappas 
2007). 
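
To give a flavor of how such task specifications are checked, the sketch below evaluates three simple requirements (reach a goal, always avoid unsafe regions, and visit one region before another) on a finite sequence of symbolic states. This is a deliberately simplified, finite-trace stand-in for temporal logic model checking: the helper names and region labels are hypothetical, and requirements over infinite runs such as "repeatedly visit" are outside its scope.

```python
from typing import Callable, Sequence

Prop = Callable[[str], bool]  # proposition evaluated on one symbolic state


def always(p: Prop, trace: Sequence[str]) -> bool:
    """Safety: p holds in every state of the trace."""
    return all(p(r) for r in trace)


def eventually(p: Prop, trace: Sequence[str]) -> bool:
    """Reachability: p holds in at least one state of the trace."""
    return any(p(r) for r in trace)


def then(p: Prop, q: Prop, trace: Sequence[str]) -> bool:
    """Sequencing: some state satisfies p and the same or a later state satisfies q."""
    for i, r in enumerate(trace):
        if p(r):
            return eventually(q, trace[i:])
    return False


unsafe = {"R1", "R3", "R5", "R7"}   # hypothetical regions to avoid


def satisfies_task(trace: Sequence[str]) -> bool:
    """Task: reach R10, never enter an unsafe region, and pass through R6 before R10."""
    return (eventually(lambda r: r == "R10", trace)
            and always(lambda r: r not in unsafe, trace)
            and then(lambda r: r == "R6", lambda r: r == "R10", trace))


good = satisfies_task(["R2", "R4", "R6", "R8", "R9", "R10"])
bad = satisfies_task(["R2", "R4", "R7", "R10"])
```

Dedicated tools work on the full transition system rather than on a single trace, so they can both verify that a task is achievable and synthesize a plan that achieves it.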


Summary and Future Directions 

Symbolic control proceeds from the basic goal 
of mitigating the complexity of dynamic systems, 
especially in real-world scenarios, to yield a sim¬ 
plification of the problems of analysis and control 
design. It builds upon results from diverse fields 
while also contributing new ideas to those areas, 
including hybrid system theory, formal languages 
and grammars, and motion planning. There are 
many open, interesting questions that are the 
subject of ongoing investigations as well as the 
genesis of future research. 

One particularly fruitful direction is that of 
combining symbolic control with stochasticity. 
Systems that operate in the real world are subject 
to noise with respect both to their inputs (noisy 
actuators) and to their outputs (noisy sensors). 
Recent work along these lines can be found in the 
formal methods approach to motion planning and 
in hybrid systems (Abate et al. 2011; Lahijanian 
et al. 2012). The fundamental idea is to use 
a Markov chain, Markov decision process, or 
similar model as the symbolic abstraction and 
then, as in all symbolic control, to do the analysis 
and planning on this simpler model. 
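
As an illustration of planning on a stochastic abstraction, the sketch below runs value iteration on a very small Markov decision process to compute the maximum probability of reaching a goal region together with a policy that attains it. The states, actions, and transition probabilities are hypothetical and serve only to show the mechanics.

```python
# Hypothetical MDP abstraction: state -> action -> list of (probability, next state).
MDP = {
    "R0": {
        "safe": [(0.9, "R1"), (0.1, "R0")],
        "risky": [(0.5, "goal"), (0.5, "fail")],
    },
    "R1": {
        "go": [(0.8, "goal"), (0.2, "fail")],
    },
}


def max_reach_probability(mdp, goal="goal", sweeps=200):
    """Value iteration for the maximum probability of eventually reaching `goal`."""
    value = {s: 0.0 for s in mdp}
    policy = {}

    def v(state):
        # Absorbing states: the goal has value 1, anything else outside the MDP has value 0.
        return 1.0 if state == goal else value.get(state, 0.0)

    for _ in range(sweeps):
        for s, actions in mdp.items():
            scores = {a: sum(p * v(n) for p, n in outcomes)
                      for a, outcomes in actions.items()}
            policy[s] = max(scores, key=scores.get)
            value[s] = scores[policy[s]]
    return value, policy


value, policy = max_reach_probability(MDP)
```

In this toy model the cautious action from `R0` is preferred: retrying the transition to `R1` and then proceeding reaches the goal with higher probability than the direct but risky action.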

Another interesting direction is to address 
questions of optimality with respect to the 
symbols and abstractions for a given dynamic 
system. Of course, the notion of “optimal” must 
be made clear, and there are several reasonable 
notions one could define. There is a clear 
trade-off between the complexity of individual 
symbols, the number of symbols used in the 
motion “alphabet,” and the complexity in terms 
of, say, average number of symbols required to 
code programs that achieve a given set of tasks. 
The complexity of a necessary alphabet is also 
related to the variety of tasks the system might 
need to perform. An autonomous vacuuming 
robot is likely to need far fewer symbols in its 
library than an autonomous vehicle that must 
operate in everyday traffic conditions and respond 
to unusual events such as traffic jams. The 
question of the “right” set of symbols can also 
be of use in efficient descriptions of motion in
domains such as dance (Baillieul and Ozcimder
2012).

It is intuitively clear that to handle complex 
scenarios and environments, a hierarchical ap¬ 
proach is likely needed. Organizing symbols into 
progressively higher levels of abstraction should 
allow for more efficient reasoning, planning, and 
reaction to real-world settings. Such structures 
already appear in existing works, such as in 
the behavior-based approach of Brooks (1986), 
in the extended Motion Description Language 
in Manikonda et al. (1998), and in the Spatial 
Semantic Hierarchy of Kuipers (2000). Despite 
these efforts, there is still a need for a rigorous 
approach for analyzing and designing symbolic 
hierarchical systems. 

The final direction discussed here is that of the 
connection of symbolic control to emergent be¬ 
havior in large groups of dynamic agents. There 
are a variety of intriguing examples in nature in 
which large numbers of agents following sim¬ 
ple rules produce large-scale, coherent behavior, 
including in fish schools and termite and ant 
colonies (Johnson 2002). How can one predict 
the global behavior that will emerge from a large 
collection of independent agents following sim¬ 
ple rules (symbols)? How can one design a set of 
symbols to produce a desired collective behavior? 
While there has been some work in symbolic con¬ 
trol for self-assembling systems (Klavins 2007), 
this general topic remains a rich area for research. 

Cross-References 

► Multi-vehicle Routing 

► Robot Motion Control 

► Walking Robots 

► Wheeled Robots 

Recommended Reading 

Brockett’s original paper (Brockett 1988) is a surprisingly
short but informative read. More thorough
descriptions can be found in Manikonda
et al. (1998) and Egerstedt (2002). An excellent
description of symbolic control in robotics, particularly
in the context of temporal logics and
formal methods, can be found in Belta et al.
(2007). There are also several related articles in
a 2011 special issue of the IEEE Robotics and
Automation Magazine (Kress-Gazit 2011).

Abstract

In this chapter we review motion planning algorithms
for ships, rigs, and autonomous marine
vehicles. Motion planning includes path and trajectory
generation, and it ranges from optimized
route planning (off-line long-range path generation
through operations research methods) to reactive
on-line trajectory reference generation, as
given by the guidance system. Crucial to the marine
systems case is the presence of external environmental
forces (sea state, currents, winds) that
drive the optimized motion generation process.

Keywords 

Configuration space; Dynamic programming; 
Grid search; Guidance controller; Guidance sys¬ 
tem; Maneuvering; Motion plan; Optimization 
algorithms; Path generation; Route planning; 
Trajectory generation; World space 

Introduction 

Marine control systems include primarily ships
and rigs moving on the sea surface, but also
underwater systems, manned (submarines) or unmanned,
and the notion can be extended to any
kind of off-shore moving platform.

A motion plan determines which
motions are appropriate for the marine system to
reach a goal, or a target/final state (LaValle 2006).
Most often, the final state corresponds to a geographical
location or destination, to be reached by
the system while respecting constraints of a physical
and/or economic nature. Motion planning
in marine systems hence starts from route plan¬ 
ning , and then it covers desired path generation 
and trajectory generation. Path generation in¬ 
volves the determination of an ordered sequence 
of states that the system has to follow; trajectory 
generation requires that the states in a path are 
reached at a prescribed time. 

Route, path, and trajectory can be generated 
off-line or on-line, exploiting the feedback from 
the system navigation and/or from external 
sources (weather forecast, etc.). In the feedback 
case, planning overlaps with the guidance system , 
i.e., the continuous computation of the reference 
(desired) state to be used as reference input by 
the motion control system (Fossen 2011). 

Formal Definitions and Settings 

Definitions and classifications as in Goerzen et al. 
(2010) and Petres et al. (2007) are followed 
throughout the section. The marine systems under
consideration live in a physical space referred
to as the world space (e.g., a submarine lives in
a 3-D Euclidean space). A configuration q is a
vector of variables that define the position and
orientation of the system in the world space. The set
of all possible configurations is the configuration
space, or C-space. The vector of configurations
and configuration rates of change is the state
of the system, $x = [q^T \; \dot{q}^T]^T$, and the set of
all the possible states is the state space. The
kino-dynamic model associated with the system is
represented by the system state equations. The
regions of C-space free from obstacles are called
C-free.

The path planning problem consists in determining
a curve $\gamma: [0,1] \to$ C-free, $s \mapsto \gamma(s)$,
with $\gamma(0)$ corresponding to the initial configuration
and $\gamma(1)$ corresponding to the goal configuration.
Both initial and goal configurations
are in C-free. The trajectory planning problem
consists in determining a curve $\gamma$ and a time
law $t \mapsto s(t)$ s.t. $\gamma(s) = \gamma(s(t))$. In both
cases, either the path or the trajectory must be
compatible with the system state equations. In
the following, definitions of motion algorithm
properties are given referring to path planning,
but the same definitions can be easily extended
to trajectory planning.

A motion planning algorithm is complete if
it finds a path when one exists, and returns a
proper flag when no path exists. The algorithm is
optimal when it provides the path that minimizes
some cost function J. The (strictly positive) cost
function J is isotropic when it depends only
on the system configuration ($J = J(q)$), or
anisotropic when it depends also on an external
force field f (e.g., sea currents, sea state, weather
perturbations) ($J = J(q, f)$). The cost function
J induces a pseudometric in the configuration
space; the distance d between configurations $q_1$
and $q_2$ through the path $\gamma$ is the “cost-to-go” from
$q_1$ to $q_2$ along $\gamma$:

$$d(q_1, q_2) = \int_0^1 J(\gamma_{q_1 q_2}(s), f)\, ds \tag{1}$$

An optimal motion planning problem is:


- Static if there is perfect knowledge of the 
environment at any time, dynamic otherwise 

- Time-invariant when the environment does 
not evolve (e.g., coastline that limits the C- 
free subspace), time-variant otherwise (e.g., 
other systems - ships, rigs - in navigation) 

- Differentially constrained if the system state 
equations act as a constraint on the path, 
differentially unconstrained otherwise 

In practice, optimal motion planning problems 
are solved numerically through discretization of 
the C-space. Resolution completeness/optimality 
of an algorithm implies the achievement of the 
solution as the discretization interval tends to 
zero. Probabilistic completeness/optimality im¬ 
plies that the probability of finding the solution 
tends to 1 as the computation time approaches 
infinity. Complexity of the algorithm refers to the 
computational time required to find a solution as 
a function of the dimension of the problem. 
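
The cost-to-go in Eq. (1) is easily approximated once the path parameter is discretized. The sketch below applies the midpoint rule to a straight eastbound leg and compares an isotropic cost with an anisotropic one. It makes several assumptions that are not in the text: J is taken as travel time per unit of the parameter s (so the leg length is folded into J), the current acts only along the track, and the leg length, nominal speed, and current field are hypothetical.

```python
from typing import Callable, Tuple

Point = Tuple[float, float]

LENGTH = 10.0          # length of the straight leg (hypothetical units)
NOMINAL_SPEED = 5.0    # speed through the water (hypothetical units)


def cost_to_go(path: Callable[[float], Point],
               J: Callable[[Point], float],
               n: int = 1000) -> float:
    """Midpoint-rule approximation of the integral of J(path(s)) over s in [0, 1]."""
    h = 1.0 / n
    return h * sum(J(path((i + 0.5) * h)) for i in range(n))


def path(s: float) -> Point:
    """Straight path from (0, 0) to (LENGTH, 0), parameterized by s in [0, 1]."""
    return (LENGTH * s, 0.0)


def current(q: Point) -> float:
    """Hypothetical along-track current: following on the first half, opposing on the second."""
    return 1.0 if q[0] < 5.0 else -1.0


def J_isotropic(q: Point) -> float:
    """Travel time per unit s, ignoring the environment."""
    return LENGTH / NOMINAL_SPEED


def J_anisotropic(q: Point) -> float:
    """Travel time per unit s when the current adds to or subtracts from the speed."""
    return LENGTH / (NOMINAL_SPEED + current(q))


d_iso = cost_to_go(path, J_isotropic)
d_aniso = cost_to_go(path, J_anisotropic)
```

Even though the current averages to zero along the leg, the anisotropic cost is larger than the isotropic one, because the ship spends more time in the opposing stretch than it gains in the following one.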

The scale of the motion with respect to the scale of the
system defines the specific setting of the problem.
In cargo ship route planning from one port call
to the next, the problem is stated first as a static,
time-invariant, differentially unconstrained path
planning problem; once a large-scale route is thus
determined, it can be refined on smaller scales,
e.g., by smoothing it, to make it compatible with ship
maneuverability. Maneuvering the same cargo
ship in the approaches to a harbor has to be
cast as a dynamic, time-variant, differentially
constrained trajectory planning problem.

Large-Scale, Long-Range Path 
Planning 

Route determination is a typical long-range path 
planning problem for a marine system. The 
geographical map is discretized into a grid, and 
the optimal path between the approaches of the 
starting and destination ports is determined as a 
sequence of adjacent grid nodes. The problem 
is taken as time-invariant and differentially 
unconstrained, at least in the first stages of 
the procedure. It is assumed that the ship will 
cruise at its own (constant) most economical 
speed to optimize bunker consumption, the major 
source of operating costs (Wang and Meng
2012). Navigation constraints (e.g., allowed ship 
traffic corridors for the given ship class) are 
considered, as well as weather forecasts and 
predicted/prevailing currents and winds. The 
cost-to-go Eq. (1) is built between adjacent nodes 
either in terms of time to travel or in terms of 
operating costs, both computed correcting the 
nominal speed with the environmental forces. 
Optimality is defined in terms of shortest 
time/minimum operating cost; the anisotropy 
introduced by sea/weather conditions is the 
driving element of the optimization, making 
the difference with respect to straightforward 
shortest route computation. The approach is 
iterated, starting with a coarse grid and then 
increasing grid resolution in the neighborhood of 
the previously found path. 

The most widely used optimization approach 
for surface ships is dynamic programming 
(LaValle 2006); alternatively, since the deter¬ 
mination of the optimal path along the grid nodes 
is equivalent to a search over a graph, the A* 
algorithm is applied (Delling et al. 2009). As the 
discretization grid gets finer, system dynamics are 
introduced, accounting for ship maneuverability 
and allowing for deviation from the constant 
ship speed assumption. Dynamic programming
allows system dynamics to be included at any
desired level of resolution; however, when system
dynamics are considered, the problem dimension
grows from 2-D to 3-D (2-D space plus time).
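
A minimal sketch of grid-based route search with an anisotropic cost is given below. It uses Dijkstra's algorithm in place of the dynamic programming and A* methods named in the text; on a graph with nonnegative edge costs it yields the same optimal cost. The grid size, the nominal speed, the current field, and the simplification that speed over ground equals nominal speed plus the along-track current component are all assumptions made for illustration.

```python
import heapq

V = 3.0            # nominal speed through the water (hypothetical units)
NX, NY = 5, 2      # grid dimensions (columns, rows)


def current(node):
    """Hypothetical east-west current: opposing on row 0, following on row 1."""
    return (-2.0, 0.0) if node[1] == 0 else (2.0, 0.0)


def travel_time(a, b):
    """Time to cross a unit edge from a to b, using the current at the departure node."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    cx, cy = current(a)
    speed = V + cx * dx + cy * dy
    return 1.0 / speed if speed > 0 else float("inf")


def neighbors(node):
    """Four-connected neighbors inside the grid."""
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        m = (node[0] + dx, node[1] + dy)
        if 0 <= m[0] < NX and 0 <= m[1] < NY:
            yield m


def shortest_time_route(start, goal):
    """Dijkstra search for the minimum-time node sequence from start to goal."""
    best = {start: 0.0}
    parent = {start: None}
    heap = [(0.0, start)]
    while heap:
        t, node = heapq.heappop(heap)
        if node == goal:
            break
        if t > best[node]:
            continue  # stale queue entry
        for m in neighbors(node):
            tm = t + travel_time(node, m)
            if tm < best.get(m, float("inf")):
                best[m] = tm
                parent[m] = node
                heapq.heappush(heap, (tm, m))
    route, node = [], goal
    while node is not None:
        route.append(node)
        node = parent[node]
    return route[::-1], best[goal]


route, time_cost = shortest_time_route((0, 0), (4, 0))
```

The geometrically shortest route runs straight along row 0, but the minimum-time route detours through row 1 to exploit the following current, which is the effect of anisotropy described in the text.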

In the case of underwater navigation, path 
planning takes place in a 3-D world space, and 
the inclusion of system dynamics makes it a 
4-D problem; moreover, bathymetry has to be 
included as an additional constraint to shape 
the C-free subspace. Dynamic programming
may become infeasible due to the increase
in dimensionality. Computationally feasible
algorithms for this case include global search
strategies with probabilistic optimality, such as genetic
algorithms (Alvarez et al. 2004), or improved
grid-search methods with resolution optimality,
such as FM* (Petres et al. 2007).

Environmental force fields are intrinsically dynamic
fields; moreover, the prediction of such
fields at the moment of route planning may be
updated as the ship is in transit along the route.
The path planning algorithms can (and often must) be rerun,
over a grid in the neighborhood of the nominal
path, each time new environmental information
becomes available. A kinodynamic model of the
ship must be included, allowing for deviation
from the established path and ship speed variation
around the nominal most economical speed.
The latter case is particularly important: increasing
or decreasing the speed to avoid a weather perturbation
while keeping the same route may indeed
result in a reduced operating cost with respect
to path modifications at constant speed.
This implies that the timing over the path must be
specified. Dynamic programming is well suited
for this transition from path to trajectory generation,
and it is still the most commonly used
approach to trajectory (re)planning in reaction to
updates of the environmental predictions.

When discretizing the world space, the minimum
grid size should still be large enough to
allow for ship maneuvering between grid nodes.
This is required for safety, to allow evasive maneuvering
when other ships are at close range,
and for the generation of smooth, dynamics-compliant
trajectories between grid points. This
latter aspect bridges motion planning with guidance.


Trajectory Planning, Maneuvering 
Generation, and Guidance Systems 

Once a path has been established over a spatial
grid, a continuous reference has to be generated,
linking the nodes over the grid. The generation
of the reference trajectory has to take into account
all the relevant dynamic properties and
constraints of the marine system, so that the
reference motion is feasible. In this scenario, the
path/trajectory nodes are way-points, and the trajectory
generation connects the way-points along
the route. The approaches to trajectory generation
can be divided between those that do not compute
explicitly in advance the whole trajectory and
those that do.

Motion Planning for Marine Control Systems, Fig. 1 Generation of a reference trajectory with a system model and
a guidance controller (Adapted from Fossen (2011))

Among the approaches that do not need explicit
trajectory computation between way-points,
the most common is the line-of-sight (LOS)
guidance law (Pettersen and Lefeber 2001). LOS
guidance can be considered a path generation, 
more than a trajectory generation, since it does 
not impose a time law over the path; it computes 
directly the desired ship reference heading 
on the basis of the current ship position and 
the previous and next way-point positions. A 
review of other guidance approaches can be 
found in Breivik and Fossen (2008), where 
maneuvering along the path and steering around 
the way-points are also discussed. From such a 
set of different maneuvers, a library of motion 
primitives can be built (Greytak and Hover 
2010), so that any motion can be specified as 
a sequence of primitives. While each primitive 
is feasible by construction, an arbitrary sequence 
of primitives may not be feasible. An optimized 
search algorithm (dynamic programming, A*) is 
again needed to determine the optimal feasible 
maneuvering sequence. 
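
As an illustration of how a LOS law turns the current position and the previous and next way-points into a heading reference, the following sketch uses the common lookahead-based variant. The function name `los_heading`, the `lookahead` distance parameter, and the planar (x, y) convention are assumptions made for this example; the cited references should be consulted for the formulations actually used there.

```python
import math


def los_heading(pos, wp_prev, wp_next, lookahead):
    """Desired heading (rad) from a lookahead-based line-of-sight law.

    pos, wp_prev, wp_next are (x, y) tuples; lookahead > 0 is a tuning
    distance (hypothetical parameter, typically a few ship lengths).
    """
    # Direction of the straight leg joining the two way-points.
    alpha = math.atan2(wp_next[1] - wp_prev[1], wp_next[0] - wp_prev[0])
    # Signed cross-track error: lateral offset of the ship from the leg.
    dx, dy = pos[0] - wp_prev[0], pos[1] - wp_prev[1]
    cross_track = -dx * math.sin(alpha) + dy * math.cos(alpha)
    # Aim at a point `lookahead` ahead on the leg, then wrap to (-pi, pi].
    chi = alpha + math.atan2(-cross_track, lookahead)
    return math.atan2(math.sin(chi), math.cos(chi))
```

No time law appears anywhere in the computation, which is why LOS guidance is better regarded as path generation than as trajectory generation.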

Path/trajectory planning that explicitly computes
the desired motion between two way-points may
or may not include a system dynamic model.
In the latter case, a sufficiently smooth curve
that connects two way-points is generated, for
instance, as splines or as Dubins paths (LaValle
2006). Curve generation parameters must be set
so that the required smoothness is guaranteed.
After curve generation, a trim velocity is
imposed over the path (path planning), or a time
law is imposed, e.g., smoothly varying the system
reference velocity with the local curvature radius.

Planners that do use a system dynamic model
are described in Fossen (2011) as part of the
guidance system. In practice, the dynamic model
is used in simulation, with a (simulated) feedback
controller (guidance controller), the next
way-point as input, and the (simulated) system
position and velocity as output. The simulated
results are feasible maneuvers by construction
and can be given as reference position/velocity to
the physical control system (Fig. 1).


Summary and Future Directions 

Motion planning for marine control systems employs
methodological tools that range from operations
research to guidance, navigation, and control
systems. A crucial role in marine applications
is played by the anisotropy induced by the
dynamically changing environmental conditions
(weather, sea state, winds, currents - the external
force fields). The quality of the plan will depend
on the quality of environmental information and
predictions.

While motion planning can be considered a 
mature issue for ships, rigs, and even standalone 
autonomous vehicles, current and future research 
directions will likely focus on the following 
items: 

- Coordinated motion planning and obstacle 
avoidance for teams of autonomous surface 
and underwater vehicles (Aguiar and Pascoal 
2012; Casalino et al. 2009) 

- Naval traffic regulation compliant maneuvering
in restricted spaces and collision evasion
maneuvering (Tam and Bucknall 2010) 

- Underwater intervention robotics (Antonelli 
2006; Sanz et al. 2010) 

Cross-References

► Control of Networks of Underwater Vehicles 

► Mathematical Models of Marine Vehicle-Manipulator
Systems

► Mathematical Models of Ships and Underwater 
Vehicles 

► Underactuated Marine Control Systems 


Recommended Reading 

Motion planning is extensively treated in LaValle 
(2006), while the essential reference on marine 
control systems is the book by Fossen (2011). 
Goerzen et al. (2010) reviews motion planning 
algorithms in terms of computational properties. 
The book Antonelli (2006) includes the treatment 
of planning and control in intervention robots. 
The papers Breivik and Fossen (2008) and Tam
et al. (2009) survey both terminology
and guidance design for open- and
closed-space maneuvering. In particular, Tam et al.
(2009) links motion planning to navigation rules.

Abstract

Motion planning refers to the design of an open-loop
or feedforward control to realize prescribed
desired paths for the system states or outputs. For
distributed-parameter systems described by partial
differential equations (PDEs), this requires
taking into account the spatial-temporal system dynamics.
Here, flatness-based techniques provide
a systematic inversion-based motion planning approach,
which is based on the parametrization of
any system variable by means of a flat or basic
output. With this, the motion planning problem
can be solved rather intuitively, as is illustrated for
linear and semilinear PDEs.

Keywords 

Basic output; Flatness; Formal integration; Formal
power series; Trajectory assignment; Trajectory
planning; Transition path

Introduction 

Motion planning or trajectory planning refers to
the design of an open-loop control to realize
prescribed desired temporal or spatial-temporal
paths for the system states or outputs. Examples
include smart structures with embedded actuators
and sensors such as adaptive optics in telescopes,
adaptive wings or smart skins, thermal
and reheating processes in steel industry, and
deep drawing, start-up, shutdown, or transitions
between operating points in chemical engineering,
as well as multi-agent deployment and formation
control (see, e.g., the overview in Meurer
2013).

For the solution of the motion planning and
tracking control problem for finite-dimensional
linear and nonlinear systems, differential flatness
as introduced in Fliess et al. (1995) has
evolved into a well-established inversion-based
technique. Differential flatness implies that any
system variable can be parametrized in terms of
a flat or a so-called basic output and its time
derivatives up to a problem-dependent order. As
a result, the assignment of a suitable desired
trajectory for the flat output directly yields the
respective state and input trajectories to realize
the prescribed motion. Flatness can be adapted to
systems governed by partial differential equations
(PDEs). For this, different techniques have been
developed utilizing operational calculus or spectral
theory for linear PDEs, (formal) power series
for linear PDEs and PDEs involving polynomial
nonlinearities, as well as formal integration for
semilinear PDEs using a generalized Cauchy-Kowalevski
approach. To illustrate the principal
ideas and the evolving research results starting
with Fliess et al. (1997), different
techniques are subsequently introduced based on selected
example problems. For this, the exposition is primarily
restricted to parabolic PDEs with a brief
discussion of motion planning for hyperbolic
PDEs before concluding with possible future research
directions.

Linear PDEs 

In the following, a scalar linear diffusion-reaction
equation is considered in the state variable $x(z,t)$
with boundary control $u(t)$ governed by

$\partial_t x(z,t) = \partial_z^2 x(z,t) + r\,x(z,t)$   (1a)
$\partial_z x(0,t) = 0, \quad x(1,t) = u(t)$   (1b)
$x(z,0) = 0.$   (1c)

This PDE describes a wide variety of thermal and
fluid systems including heat conduction and tubular
reactors. Herein, $r \in \mathbb{R}$ refers to the reaction
coefficient, and the initial state is without loss of
generality assumed to be zero. In order to solve the
motion planning problem for (1), a feedforward
control $t \mapsto u^*(t)$ is determined to realize a
finite-time transition between the initial state and
a final stationary state $x_T^s(z)$ to be imposed for
$t \ge T$.

Formal Power Series 

By making use of the formal power series expansion
of the state variable

$x(z,t) \to \hat{x}(z,t) = \sum_{n=0}^{\infty} \hat{x}_n(t)\,\frac{z^n}{n!}$   (2)

the evaluation of (1) results in the 2nd-order
recursion

$\hat{x}_n(t) = \partial_t \hat{x}_{n-2}(t) - r\,\hat{x}_{n-2}(t), \quad n \ge 2$   (3a)
$\hat{x}_1(t) = 0.$   (3b)

In order to be able to solve (3) for $\hat{x}_n(t)$, it
is hence required to impose $\hat{x}_0(t) = x(0,t)$.
Denoting $y(t) = x(0,t)$ or respectively

$\hat{x}_0(t) = y(t)$   (3c)

implies

$\hat{x}_{2n}(t) = (\partial_t - r)^n \circ y(t), \quad \hat{x}_{2n+1}(t) = 0.$   (4)
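
As a worked illustration of the differential parametrization, the first nonzero coefficients can be written out explicitly by expanding the operator power binomially. This only unrolls the recursion with the same symbols; the hat notation for the series coefficients is an assumption about the original typesetting.

```latex
\begin{aligned}
\hat{x}_0(t) &= y(t), \\
\hat{x}_2(t) &= \dot{y}(t) - r\,y(t), \\
\hat{x}_4(t) &= \ddot{y}(t) - 2r\,\dot{y}(t) + r^2 y(t), \\
\hat{x}_{2n}(t) &= \sum_{k=0}^{n} \binom{n}{k} (-r)^{n-k}\, \partial_t^k y(t).
\end{aligned}
```

Each coefficient of order $2n$ therefore involves time derivatives of $y(t)$ up to order $n$, which is why the growth of these derivatives governs the convergence of the series.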

Hence, any series coefficient in (2) can be differentially
parametrized by means of $y(t)$. Taking
into account the inhomogeneous boundary condition
in (1b), i.e.,

$u(t) = x(1,t) = \sum_{n=0}^{\infty} \frac{\hat{x}_n(t)}{n!} = \sum_{n=0}^{\infty} \frac{\hat{x}_{2n}(t)}{(2n)!}$   (5)

yields that $y(t) = x(0,t)$ can be considered as a
flat or basic output. In particular, by prescribing
a suitable trajectory $t \mapsto y^*(t) \in C^{\infty}(\mathbb{R})$ for
$y(t)$, the evaluation of (5) yields the feedforward
control $u^*(t)$ which is required to realize the
spatial-temporal path $x^*(z,t)$ obtained from the
substitution of $y^*(t)$ into (2) with coefficients
parametrized by (4). This, however, relies on the
uniform convergence of (2) in view of (4) with at
least a unit radius of convergence in $z$. For this,
the notion of a Gevrey class function is needed
(Rodino 1993).

Definition 1 (Gevrey class) The function $y(t)$
is in $G_{D,\alpha}(\Lambda)$, the Gevrey class of order $\alpha$ in
$\Lambda \subset \mathbb{R}$, if $y(t) \in C^{\infty}(\Lambda)$ and for every closed
subset $\Lambda'$ of $\Lambda$ there exists a $D > 0$ such that

$|\partial_t^n y(t)| \le D^{n+1} (n!)^{\alpha}.$

The set $G_{D,\alpha}(\Lambda)$ forms a linear vector space and
a ring with respect to the arithmetic product of
functions which is closed under the standard rules
of differentiation. Gevrey class functions of order
$\alpha < 1$ are entire and are analytic if $\alpha = 1$.

Theorem 1 Let $y(t) \in G_{D,\alpha}(\mathbb{R})$ for $\alpha < 2$,
then the formal power series (2) with coefficients
(4) converges uniformly with infinite radius of
convergence.

The proof of this result can be found, e.g., in
Laroche et al. (2000) and Lynch and Rudolph
(2002) and relies on the analysis of the recursion
(3) taking into account the assumptions on the
function $y(t)$.

Trajectory Assignment

To apply these results for the solution of the
motion planning problem to achieve finite-time
transitions between stationary profiles, it is crucial
to properly assign the desired trajectory $y^*(t)$
for the basic output $y(t)$. For this, observe that
stationary profiles $x^s(z) = x^s(z; y^s)$ are due
to the flatness property (Classically stationary
solutions are to be defined in terms of stationary
input values $x^s(1) = u^s$.) governed by

$0 = \partial_z^2 x^s(z) + r\,x^s(z)$   (6a)
$\partial_z x^s(0) = 0, \quad x^s(0) = y^s.$   (6b)

Hence, assigning different $y^s$ results in different
stationary profiles $x^s(z; y^s)$. The connection between
an initial stationary profile $x_0^s(z; y_0^s)$ and a
final stationary profile $x_T^s(z; y_T^s)$ is achieved by
assigning $y^*(t)$ such that

$y^*(0) = y_0^s, \quad y^*(T) = y_T^s$
$\partial_t^n y^*(0) = 0, \quad \partial_t^n y^*(T) = 0, \quad n \ge 1.$

This implies that $y^*(t)$ has to be locally nonanalytic
at $t \in \{0, T\}$ and in view of the previous
discussion has thus to be a Gevrey class function
of order $\alpha \in (1,2)$. For specific problems different
functions have been suggested fulfilling these
properties. In the following, the ansatz

$y^*(t) = y_0^s + (y_T^s - y_0^s)\,\Phi_{T,\gamma}(t)$   (7a)

is used with

$\Phi_{T,\gamma}(t) = \begin{cases} 0, & t \le 0 \\ \dfrac{\int_0^t h_{T,\gamma}(\tau)\,d\tau}{\int_0^T h_{T,\gamma}(\tau)\,d\tau}, & t \in (0,T) \\ 1, & t \ge T \end{cases}$   (7b)

for $h_{T,\gamma}(t) = \exp\left(-[t/T\,(1 - t/T)]^{-\gamma}\right)$ if $t \in (0,T)$
and $h_{T,\gamma}(t) = 0$ else. It can be shown that
(7b) is a Gevrey class function of order $\alpha = 1 + 1/\gamma$
(Fliess et al. 1997). Alternative functions are
presented, e.g., in Rudolph (2003).

Motion Planning for PDEs, Fig. 1 Simulated spatial-temporal transition
path (top) and applied flatness-based feedforward control $u^*(t)$ and desired
trajectory $y^*(t)$ (bottom) for (1)
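
The transition function (7b) has no closed form, so in practice it is evaluated by quadrature. The following is a minimal sketch under stated assumptions: the function names `bump`, `transition`, and `desired_output`, the composite trapezoidal rule, and the number of quadrature intervals are choices made for this example and are not taken from the cited works.

```python
import math


def bump(t, T, gamma):
    """h_{T,gamma}(t): smooth bump supported on (0, T), zero elsewhere."""
    if t <= 0.0 or t >= T:
        return 0.0
    s = t / T
    return math.exp(-((s * (1.0 - s)) ** (-gamma)))


def _trapezoid(upper, T, gamma, n):
    """Composite trapezoidal rule for the integral of the bump on [0, upper]."""
    step = upper / n
    total = 0.5 * (bump(0.0, T, gamma) + bump(upper, T, gamma))
    total += sum(bump(i * step, T, gamma) for i in range(1, n))
    return total * step


def transition(t, T=1.0, gamma=2.0, n=2000):
    """Phi_{T,gamma}(t) as in (7b): 0 for t <= 0, 1 for t >= T."""
    if t <= 0.0:
        return 0.0
    if t >= T:
        return 1.0
    return _trapezoid(t, T, gamma, n) / _trapezoid(T, T, gamma, n)


def desired_output(t, y0, yT, T=1.0, gamma=2.0):
    """y*(t) as in (7a): rest-to-rest transition from y0 to yT in time T."""
    return y0 + (yT - y0) * transition(t, T, gamma)
```

Because the bump and all its derivatives vanish at both ends of the interval, the resulting trajectory leaves and reaches the stationary values with all time derivatives equal to zero, as the trajectory assignment requires.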


Simulation Example 

In order to illustrate the results of the motion
planning procedure described above, let $r = -1$
in (1). The differential parametrization (4) of the
series coefficients is evaluated for the desired
trajectory $y^*(t)$ defined in (7) for $y_0^s = 0$ and
$y_T^s = 1$ with the transition time $T = 1$ and
the parameter $\gamma = 2$. With this, the finite-time
transition between the zero initial stationary
profile $x_0^s(z) = 0$ and the final stationary profile
$x_T^s(z) = y_T^s \cosh(z)$ is realized along the
trajectory $x(0,t) = y^*(t)$. The corresponding
feedforward control and spatial-temporal transition
path are shown in Fig. 1.
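
A quick consistency check of this example, given here as a worked calculation rather than as part of the original exposition: for $t \ge T$ the desired trajectory is constant, so every time derivative in the parametrization (4) vanishes and the series for the input collapses to a numerical series.

```latex
\begin{aligned}
(\partial_t - r)^n \circ y^*(t) &= (-r)^n\, y_T^s \quad \text{for } t \ge T, \\
u^*(t) &= \sum_{n=0}^{\infty} \frac{(-r)^n\, y_T^s}{(2n)!}
        \;\overset{r=-1}{=}\; y_T^s \sum_{n=0}^{\infty} \frac{1}{(2n)!}
        = y_T^s \cosh(1) \approx 1.543\, y_T^s .
\end{aligned}
```

This agrees with the final stationary profile evaluated at the actuated boundary, $x_T^s(1) = y_T^s \cosh(1)$.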

Extensions and Generalizations 

The previous considerations constitute a first 
systematic approach to solve motion planning 
problems of systems governed by PDEs. The 
underlying techniques can be, however, further 
generalized to address coupled systems of 
PDEs, certain classes of nonlinear PDEs (see 
also section “Semilinear PDEs”), or in-domain 
control. 

While the application of formal power series
is restricted to boundary-controlled diffusion-convection-reaction
systems, the approach can be
combined with so-called resummation techniques
to overcome convergence issues such as slowly
converging or even divergent series expansions
(Laroche et al. 2000; Meurer and Zeitz 2005).

Flatness-based techniques for motion planning
can also be embedded into an operator theoretic
context using semigroup theory by restricting
the analysis to so-called Riesz spectral operators.
This enables the analysis of coupled systems of linear
PDEs with both boundary and in-domain control
in single and multiple spatial coordinates within
a common framework (Meurer 2011, 2013). In
addition, experimental results for flexible beam
and plate structures with embedded piezoelectric
actuators confirm the applicability of this design
approach and the achievable high tracking accuracy
when transiently shaping the deflection
profile (Schrock et al. 2013).

Semilinear PDEs 


Flatness can be extended to semilinear PDEs.
This is subsequently illustrated for the diffusion-reaction
system

$\partial_t x(z,t) = \partial_z^2 x(z,t) + r(x(z,t))$   (8a)
$\partial_z x(0,t) = 0, \quad x(1,t) = u(t)$   (8b)
$x(z,0) = 0$   (8c)

with boundary input $u(t)$. Similar to the previous
section, the motion planning problem refers to the
determination of a feedforward control $t \mapsto u^*(t)$
to realize finite-time transitions starting at the
initial profile $x_0^s(z) = x(z,0) = 0$ to achieve a
final stationary profile $x_T^s(z)$ for $t \ge T$.

Formal Power Series 

If $r(x(z,t))$ is a polynomial in $x(z,t)$ or an
analytic function, then similar to the previous section,
formal power series can be applied to solve
the motion planning problem. This, however, relies
on the successive evaluation of Cauchy's
product formula. As an example, consider

$r(x(z,t)) = r_1 x(z,t) + r_2 x^2(z,t),$

then the formal power series ansatz (2) results in
the recursion

$\hat{x}_n(t) = \partial_t \hat{x}_{n-2}(t) - r_1 \hat{x}_{n-2}(t) - r_2 \sum_{j=0}^{n-2} \binom{n-2}{j} \hat{x}_j(t)\,\hat{x}_{n-2-j}(t), \quad n \ge 2$   (9a)
$\hat{x}_1(t) = 0.$   (9b)

Similar to the linear setting in the section "Linear
PDEs" above, the recursion can be solved for
$\hat{x}_n(t)$ by imposing $\hat{x}_0(t) = x(0,t)$ or respectively

$\hat{x}_0(t) = y(t).$   (9c)

As a result, also in this nonlinear setting any
series coefficient can be expressed in terms of
$y(t)$ and its time derivatives. Hence, $y(t) = x(0,t)$
denotes a basic output for the semilinear
PDE (8). The uniform series convergence can
be analyzed by restricting any trajectory $y(t)$ to
a certain Gevrey order $\alpha$ while simultaneously
restricting the absolute values of d, $r_1$ and $r_2$
(Dunbar et al. 2003; Lynch and Rudolph 2002).
These restrictions can be approached using, e.g.,
resummation techniques to sum slowly converging
or divergent series to a meaningful limit.
The reader is therefore referred to Meurer and
Zeitz (2005) or Meurer and Krstic (2011), with
the latter introducing a PDE-based approach for
formation control of multi-agent systems.
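
To make the role of Cauchy's product formula concrete, the first coefficients for the quadratic reaction term can be unrolled by hand. This is a worked sketch that assumes the recursion takes the form obtained by inserting the series (2) into (8), i.e., with the product sum running over $j = 0, \dots, n-2$ with binomial weights.

```latex
\begin{aligned}
\hat{x}_0 &= y, \qquad \hat{x}_1 = 0, \\
\hat{x}_2 &= \dot{y} - r_1 y - r_2 y^2, \\
\hat{x}_3 &= \partial_t \hat{x}_1 - r_1 \hat{x}_1
             - r_2\left(\hat{x}_0 \hat{x}_1 + \hat{x}_1 \hat{x}_0\right) = 0, \\
\hat{x}_4 &= \partial_t \hat{x}_2 - r_1 \hat{x}_2
             - r_2\left(\hat{x}_0 \hat{x}_2 + 2\hat{x}_1^2 + \hat{x}_2 \hat{x}_0\right)
           = \partial_t \hat{x}_2 - r_1 \hat{x}_2 - 2 r_2\, y\, \hat{x}_2 .
\end{aligned}
```

As in the linear case the odd coefficients vanish, but the even ones now contain products of $y(t)$ and its derivatives, which is why bounds on the magnitude of the trajectory enter the convergence analysis.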

Formal Integration 

A generalization of these results has been recently
suggested in Schorkhuber et al. (2013) by
making use of an abstract Cauchy-Kowalevski
theorem in Gevrey classes. In order to illustrate
this, solve (8a) for $\partial_z^2 x(z,t)$ and formally integrate
with respect to $z$ taking into account the
boundary conditions (8b). This yields the implicit
solution

$x(z,t) = x(0,t) + \int_0^z \int_0^p \left[\partial_t x(q,t) - r(x(q,t))\right] dq\,dp$   (10a)
$u(t) = x(1,t),$   (10b)

which can be used to develop a flatness-based
design systematics for motion planning given
semilinear PDEs. For this, introduce

$y(t) = x(0,t),$   (11)

and rewrite (10) in terms of the sequence of
functions $(x^{(n)}(z,t))_{n=0}^{\infty}$ according to

$x^{(0)}(z,t) = y(t)$   (12a)
$x^{(n+1)}(z,t) = x^{(0)}(z,t) + \int_0^z \int_0^p \left[\partial_t x^{(n)}(q,t) - r(x^{(n)}(q,t))\right] dq\,dp.$   (12b)

From this, it is obvious that $y(t)$ denotes a basic
output differentially parametrizing the state
variable $x(z,t) = \lim_{n\to\infty} x^{(n)}(z,t)$ and the
boundary input $u(t) = \lim_{n\to\infty} x^{(n)}(1,t)$ provided that
the limit exists as $n \to \infty$. As is shown in
Schorkhuber et al. (2013) by making use of
scales of Banach spaces in Gevrey classes and
abstract Cauchy-Kowalevski theory, the convergence
of the parametrized sequence of functions
$(x^{(n)}(z,t))_{n=0}^{\infty}$ can be ensured in some compact
subset of the domain $z \in [0,1]$. Besides its
general setup this approach provides an iteration
scheme, which can be directly utilized for
a numerically efficient solution of the motion
planning problem.
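
The structure of the iteration can be tried out on the stationary special case, where the time derivative in the integrand drops out and the scheme reduces to a fixed-point iteration for the stationary profile. The sketch below is only an illustration under that simplifying assumption; the function names, the uniform grid, and the cumulative trapezoidal rule are choices made for this example and do not reproduce the numerical method of the cited reference.

```python
import math


def cumulative_trapezoid(values, dz):
    """Running integral of sampled values on a uniform grid, starting at 0."""
    out = [0.0]
    for a, b in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * (a + b) * dz)
    return out


def stationary_profile(r, y_s, n_grid=201, n_iter=40):
    """Iterate x <- y_s + double integral of -r(x) on z in [0, 1].

    This is the formal-integration iteration with the time derivative set
    to zero, so the limit solves 0 = x'' + r(x), x'(0) = 0, x(0) = y_s.
    """
    dz = 1.0 / (n_grid - 1)
    x = [y_s] * n_grid                      # zeroth iterate: constant y_s
    for _ in range(n_iter):
        integrand = [-r(v) for v in x]
        inner = cumulative_trapezoid(integrand, dz)   # integral over q
        outer = cumulative_trapezoid(inner, dz)       # integral over p
        x = [y_s + o for o in outer]
    return x


# Linear check case r(x) = -x: the stationary profile is y_s * cosh(z).
profile = stationary_profile(lambda x: -x, y_s=1.0)
boundary_input = profile[-1]                # stationary input u = x(1)
```

For the linear reaction used in the check, the iterates are the partial sums of the cosine hyperbolic series up to discretization error, which gives a simple way to validate an implementation before moving to nonlinear reaction terms.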

Simulation Example 

Let the reaction be subsequently described by

$r(x(z,t)) = \sin(2\pi x(z,t)).$   (13)

The iterative scheme (12) is evaluated for the
desired trajectory $y^*(t)$ defined in (7) for $y_0^s = 0$
and $y_T^s = 1$ with the transition time $T = 1$ and
the parameter $\gamma = 1$, i.e., the desired trajectory
is of Gevrey order $\alpha = 2$. The resulting feedforward
control $u^*(t)$ and the spatial-temporal
transition path resulting from the numerical solution
of the PDE are depicted in Fig. 2. The
desired finite-time transition between the zero
initial stationary profile $x_0^s(z) = 0$ and the final
stationary profile $x_T^s(z)$ determined by

$0 = \partial_z^2 x^s(z) + r(x^s(z))$   (14a)
$\partial_z x^s(0) = 0, \quad x^s(0) = y^s$   (14b)

is clearly achieved along the prescribed path
$y^*(t)$.

Extensions and Generalizations 

Generalizations of the introduced formal integration
approach to solve motion planning problems
for systems of coupled PDEs are, e.g., provided
in Schorkhuber et al. (2013). Moreover, linear
diffusion-convection-reaction systems with spatially
and time-varying coefficients defined on
a higher-dimensional parallelepipedon are addressed
in Meurer and Kugi (2009) and Meurer
(2013).

Hyperbolic PDEs 

Hyperbolic PDEs exhibiting wavelike dynamics
require the development of a design systematics
explicitly taking into account the finite
speed of wave propagation. For linear hyperbolic
PDEs, operational calculus has been successfully
applied to determine the state and input
parametrizations in terms of the basic output
and its advanced and delayed arguments (Petit
and Rouchon 2001, 2002; Rouchon 2001;
Rudolph and Woittennek 2008; Woittennek and
Rudolph 2003). In addition, the method of characteristics
can be utilized to address both linear
and quasi-linear hyperbolic PDEs. Herein, a suitable
change of coordinates makes it possible to reformulate
the PDE in a normal form, which can be (formally)
integrated in terms of a basic output. With
this, an efficient numerical procedure can also
be developed to solve motion planning problems
for hyperbolic PDEs (Woittennek and Mounier
2010).

Summary and Future Directions 

Motion planning constitutes an important design
step when solving control problems for systems
governed by PDEs. This is particularly due to
the increasing demands on quality, accuracy, and
efficiency, which require turning away from the
pure stabilization of an operating point toward
the realization of specific start-up, transition, or
tracking tasks. In view of these aspects, future research
directions might deepen and further evolve
the following:

- Semi-analytic design techniques taking into
account suitable approximation schemes for
complex-shaped spatial domains

- Nonlinear PDEs and coupled systems of nonlinear
PDEs with boundary and in-domain
control

- Applications arising, e.g., in aeroelasticity,
micromechanical systems, fluid flow, and
fluid-structure interaction.

Motion Planning for PDEs, Fig. 2 Simulated spatial-temporal transition
path (top) and applied flatness-based feedforward control $u^*(t)$ and desired
trajectory $y^*(t)$ (bottom) for (8) with (13)


Cross-References 

► Boundary Control of 1-D Hyperbolic Systems 

► Boundary Control of Korteweg-de Vries and 
Kuramoto-Sivashinsky PDEs 

► Control of Fluids and Fluid-Structure Interactions

Abstract

A basic model due to Sharp, which is useful in
the analysis of motorcycle behavior and control,
is developed. This model is based on linearization
of a bicycle model introduced by Whipple, but is
augmented with a tire model in which the lateral
tire force depends in a dynamic fashion on tire behavior.
This model is used to explain some of the
important characteristics of motorcycle behavior.
The significant dynamic modes exhibited by this
model are capsize, weave, and wobble.

Keywords 

Bicycle; Capsize; Counter-steering; Motorcycle; 
Single-track vehicle; Tire model; Weave; Wobble 

Introduction 

The bicycle is mankind's ultimate solution
to the quest for a human-powered vehicle
(Herlihy 2006). The motorcycle just makes riding
more fun. Bicycles, motorcycles, scooters, and
mopeds are all examples of single-track vehicles
and have similar dynamics. The dynamics of a
motorcycle are considerably more complicated
than those of a four-wheel vehicle such as a
car. The first obvious difference in behavior
is stability. An unattended upright stationary
motorcycle is basically an inverted pendulum
and is unstable about its normal upright position,
whereas a car has no stability issues in the
same configuration. Another difference is that a
motorcycle must lean when cornering. Although
a car leans a little due to suspension travel, there
is no necessity for it to lean in cornering. A
perfectly rigid car would not lean. Furthermore,
beyond low speeds, the steering behavior of a
motorcycle is not intuitive like that of a car.
To turn a car right, the driver simply turns the
steering wheel right; on a motorcycle, the rider
initially turns the handlebars to the left. This is
called counter-steering.


A Basic Model 

To obtain a basic motorcycle model, we start
with four rigid bodies: the rear frame (which
includes a rigidly attached rigid rider), the front
frame (which includes the handlebars and front forks), the
rear wheel, and the front wheel; see Fig. 1. We
assume that both frames and wheels have a plane
of symmetry which is vertical when the bike is
in its nominal upright configuration. The front
frame can rotate relative to the rear frame about
the steering axis; the steering axis is in the plane
of symmetry of each frame, and in the nominal
upright configuration of the bike, the angle it
makes with the vertical is called the rake angle
or caster angle and is denoted by $\epsilon$. The rear
wheel rotates relative to the rear frame about an
axis perpendicular to the rear plane of symmetry
and is symmetrical with respect to this axis. The
same relationship holds between the front wheel
and the front frame. Although each wheel can
be three dimensional, we model the wheels as
terminating in a knife edge at their boundaries
and contacting the ground at a single point. Points
$Q$ and $P$ are the points on the ground in contact
with the front and rear wheels, respectively.



Motorcycle Dynamics and Control, Fig. 1 Basic 
model 


Each of the above four bodies is described by
its mass, mass center location, and a 3 x 3 inertia
matrix. Two other important parameters are
the wheelbase $w$ and the trail $c$. The wheelbase
is the distance between the contact points of the
two wheels in the nominal configuration, and the
trail is the distance from the front wheel contact
point $Q$ to the intersection $S$ of the steering axis
with the ground. The trail is normally positive,
that is, $Q$ is behind $S$. The point $G$ locates the
mass center of the complete bike in its nominal
configuration, whereas $G_A$ is the location of the
mass center of the front assembly (front frame
and wheel).


Description of Motion 

Considering a right-handed reference frame $e = (e_1, e_2, e_3)$
with origin $O$ fixed in the ground, the
bike motion can be described by the location of
the rear wheel contact point $P$ relative to $O$,
the orientation of the rear frame relative to $e$,
and the orientation of the front frame relative
to the rear frame; see Fig. 2. Assuming the bike
is moving along a horizontal plane, the location
of $P$ is usually described by Cartesian coordinates
$x$ and $y$. Let reference frame $b$ be fixed
in the rear frame with $b_1$ and $b_3$ in the plane
of symmetry and $b_1$ along the nominal $P$-$S$
line; see Fig. 2. Using this reference frame, the
orientation of the rear frame is described by a
3-1-2 Euler angle sequence which consists of a
yaw rotation by $\psi$ about the 3-axis followed by
a lean (roll) rotation by $\phi$ about the 1-axis and
finally by a pitch rotation by $\theta$ about the 2-axis.
The orientation of the front frame relative to the
rear frame can be described by the steer angle $\delta$.
Assuming both wheels remain in contact with the
ground, the pitch angle $\theta$ is not independent; it is
uniquely determined by $\delta$ and $\phi$. In considering
small perturbations from the upright nominal
configuration, the variation in pitch is usually
ignored. Here we consider it to be zero. Also the
dynamic behavior of the bike is independent of
the coordinates $x$, $y$, and $\psi$. These coordinates
can be obtained by integrating the velocity of $P$
and $\dot{\psi}$.

Motorcycle Dynamics and Control, Fig. 2 Description
of motion

The Whipple Bicycle Model 

The “simplest” model which captures all the 
salient features of a single track vehicle for a 
basic understanding of low-speed dynamics and 
control is that originally due to Whipple (1899). 
We consider the linearized version of this model 
which is further expounded on in Meijaard et al. 
(2007). The salient feature of this model is that 
there is no slip at each wheel. This means that 
the velocity of the point on the wheel which 
is instantaneously in contact with the ground is 
zero; this is illustrated in Fig. 3 for the rear wheel. 
No slip implies that there is no sideslip, which
means that the velocity of the wheel contact
point ($v^P$ in Fig. 3) is parallel to the intersection
of the wheel plane with the ground plane; the
wheel contact point is the point moving along the
ground which is in contact with the wheel.

The rest of this entry is based on linearization
of motorcycle dynamics about an equilibrium
configuration corresponding to the bike traveling
upright in a straight line at constant forward
speed $v := v^P$, the speed of the rear wheel
contact point $P$. In the linearized system, the
longitudinal dynamics are independent of the
lateral dynamics, and in the absence of driving
or braking forces, the speed $v$ is constant.


With no sideslip at both wheels, kinematical
considerations (see Fig. 4) show that the yaw
rate $\dot{\psi}$ is determined by $\delta$; specifically, for small
angles we have the following linearized relationship:

$\dot{\psi} = \nu v \delta + \mu \dot{\delta}$   (1)

where $\nu = c_\epsilon / w$, $c_\epsilon = \cos\epsilon$, and $\mu = c\,c_\epsilon / w$ is
the normalized mechanical trail. In Fig. 4, $\delta_f = c_\epsilon \delta$
is the effective steer angle; it is the angle
between the intersections of the front wheel plane
and the rear frame plane with the ground. Thus
we can completely describe the lateral bike dynamics
with the roll angle $\phi$ and the steer angle
$\delta$. To obtain the above relationship, first note that,
as a consequence of no sideslip, $v^P = v\,b_1$ and
$v^Q$ is perpendicular to $f_2$. Taking the dot product
of the expression

$v^Q = v^P + (w + c)\,\dot{\psi}\, b_2 - c\,(\dot{\psi} + \dot{\delta}_f)\, f_2$

with $f_2$ while noting that $b_1 \cdot f_2 = -\sin\delta_f$ and
$b_2 \cdot f_2 = \cos\delta_f$ results in

$0 = -v \sin\delta_f + (w + c)\,\dot{\psi} \cos\delta_f - c\,(\dot{\psi} + \dot{\delta}_f).$

Linearization about $\delta = 0$ yields the desired
result.

The relationship in (1) also holds for four
wheel vehicles. There one can achieve a desired
constant yaw rate $\dot{\psi}_d$ by simply letting the steer
angle $\delta = \dot{\psi}_d / (\nu v)$. However, as we shall see, a
motorcycle with its steering fixed at a constant
steer angle is unstable. Neglecting gyroscopic
terms, it is simply an inverted pendulum. With its
steering free, a motorcycle can be stable over a
certain speed range and, if unstable, can be easily
stabilized above a very low speed by most people.
Trials riders can stabilize a motorcycle at any
speed including zero.

Motorcycle Dynamics and Control, Fig. 4 Some kinematics

pendulum with accelerating support point

To help understand the effect of steer angle on bike behavior, we initially ignore the mass and inertia of the front assembly along with gyroscopic effects, and we assume that the $b_1$ axis is a principal axis of inertia of the rear frame with moment of inertia $I_{xx}$. Angular momentum considerations about the $b_1$ axis and linearization result in

$$I_{xx}\ddot\phi + m h a_B = m g h \phi + (N_f c_\epsilon c)\,\delta \tag{2}$$

where $N_f$ is the normal force (vertical and upwards) on the front wheel and $a_B$ is the lateral acceleration (perpendicular to the rear frame) of point $B$, which is the projection of $G$ onto the $b_1$ axis. By considering a moment balance about the pitch axis $b_2$ through $P$, one can obtain that $N_f = mgb/w$. Notice that, with the steering fixed at $\delta = 0$, Eq. (2) is the equation of motion of a simple inverted pendulum whose support axis is accelerating horizontally with acceleration $a_B$. This is illustrated in Fig. 5.

Basic kinematics reveal that $a_B = v\dot\psi + b\ddot\psi$ and, recalling relationship (1), Eq. (2) now yields the lean equation:

$$I_{xx}\ddot\phi - m g h \phi = -m_{\phi\delta}\,\ddot\delta - c_{\phi\delta}\, v\, \dot\delta - k_{\phi\delta}(v)\,\delta \tag{3}$$

where $m_{\phi\delta} = \mu m h b > 0$, $c_{\phi\delta} = m h (\mu + b\nu) > 0$, and $k_{\phi\delta}(v) = -\mu m g b + m h \nu v^2$. Note that $v$ is a constant parameter corresponding to the nominal speed of the rear wheel contact point.

With $\delta = 0$, we have a system whose behavior is characterized by two real eigenvalues: $\pm\sqrt{mgh/I_{xx}}$. This system is unstable due to the positive eigenvalue $\sqrt{mgh/I_{xx}}$. For $v$ sufficiently large, the coefficient $k_{\phi\delta}(v)$ is positive and one can readily show that the above system can be stabilized with positive feedback $\delta = K\phi$ provided $K > mgh/k_{\phi\delta}(v)$. This helps explain the stabilizing effect of a rider turning the handlebars in the direction the bike is falling. Actually, the rider input is a steer torque $T_\delta$ about the steer axis.
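The stabilization condition can be checked numerically. Substituting $\delta = K\phi$ into the lean equation (3) gives the characteristic polynomial $(I_{xx} + m_{\phi\delta}K)s^2 + c_{\phi\delta} v K s + (k_{\phi\delta}(v)K - mgh) = 0$. The following Python sketch computes its roots; all parameter values (mass, geometry, inertia, steer axis angle) are hypothetical and chosen only for illustration, and the function names are not taken from the entry.

```python
import cmath
import math

# Hypothetical parameters (illustrative only, not from the entry).
G = 9.81                      # gravitational acceleration, m/s^2
M, H, B = 200.0, 0.6, 0.5     # mass m (kg), height h (m), distance b (m)
W, C, EPS = 1.4, 0.1, 0.45    # wheelbase w (m), trail c (m), steer axis angle (rad)
IXX = 80.0                    # roll moment of inertia I_xx, kg m^2
NU = math.cos(EPS) / W        # nu = c_eps / w
MU = C * math.cos(EPS) / W    # mu = c * c_eps / w


def k_phi_delta(v):
    """Speed-dependent coefficient k_phi_delta(v) of the lean equation (3)."""
    return -MU * M * G * B + M * H * NU * v ** 2


def closed_loop_eigenvalues(gain, v):
    """Roots of the lean equation (3) with the static feedback delta = gain * phi."""
    m_pd = MU * M * H * B
    c_pd = M * H * (MU + B * NU)
    a2 = IXX + m_pd * gain
    a1 = c_pd * v * gain
    a0 = k_phi_delta(v) * gain - M * G * H
    root = cmath.sqrt(a1 * a1 - 4.0 * a2 * a0)
    return (-a1 + root) / (2.0 * a2), (-a1 - root) / (2.0 * a2)


if __name__ == "__main__":
    speed = 10.0
    k_min = M * G * H / k_phi_delta(speed)   # threshold gain mgh / k_phi_delta(v)
    for gain in (0.0, 0.5 * k_min, 2.0 * k_min):
        print(gain, closed_loop_eigenvalues(gain, speed))
```

With these assumed numbers, a gain below the threshold leaves one eigenvalue in the right half-plane, while a gain above it moves both eigenvalues into the left half-plane.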

To explain why an uncontrolled motorcycle can be stable or easily stabilized, one also has to look at the effect that $\phi$ has on $\delta$; in general, a lean perturbation results in the front assembly turning in the same direction, that is, a positive perturbation of $\phi$ results in a positive change in $\delta$.

The lean equation also explains why a motorcycle must lean when cornering above a certain speed. Suppose the motorcycle is in a right-hand corner of radius $R$ at some constant speed $v$: in this scenario, $\dot\psi = v/R$ and, with $\delta$ constant, (1) implies that $\delta = \dot\psi/(\nu v) = 1/(\nu R)$; with $\delta$ and $\phi$ constant, the lean equation now requires that $\phi = k_{\phi\delta}(v)\,\delta/(mgh) = k_{\phi\delta}(v)/(mgh\nu R)$. For higher speeds, $k_{\phi\delta}(v) \approx m h \nu v^2$; hence $\phi \approx v^2/(gR)$. Since $a_B = v^2/R$, the lean angle $\phi$ is approximately $a_B/g$. Hence, to corner with a lateral acceleration $a_B = v^2/R$, the motorcycle must lean at an angle of approximately $a_B/g$.
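As a small worked example of this estimate, the sketch below compares the steady lean angle required by the lean equation (3) with the high-speed approximation $v^2/(gR)$. The geometry and mass values are hypothetical, and the function names are introduced here only for illustration.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def approximate_lean_angle(v, radius):
    """High-speed approximation phi ~ v^2 / (g R), in radians."""
    return v ** 2 / (G * radius)


def steady_lean_angle(v, radius, m, h, b, w, c, eps):
    """Steady lean angle phi = k_phi_delta(v) * delta / (m g h) with delta = 1 / (nu R)."""
    nu = math.cos(eps) / w
    mu = c * math.cos(eps) / w
    k = -mu * m * G * b + m * h * nu * v ** 2
    delta = 1.0 / (nu * radius)
    return k * delta / (m * G * h)


if __name__ == "__main__":
    # Hypothetical motorcycle: 30 m/s through a 200 m radius corner.
    exact = steady_lean_angle(30.0, 200.0, m=200.0, h=0.6, b=0.5, w=1.4, c=0.1, eps=0.45)
    approx = approximate_lean_angle(30.0, 200.0)
    print(math.degrees(exact), math.degrees(approx))
```

For these assumed values the two angles differ by well under one percent, because the speed-independent term $-\mu m g b$ in $k_{\phi\delta}(v)$ is small compared with $m h \nu v^2$.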

The lean equation can also help explain counter-steering; that is, at speeds above a reasonably low speed, one can initiate a turn by turning the handlebars in the direction opposite to that in which one wants to go; to turn right, one initially turns the handlebars to the left. See Limebeer and Sharp (2006) for further discussion.

Taking into account the mass and inertia of the front assembly, gyroscopic effects, and cross products of inertia of the rear frame, one can show (see Meijaard et al. 2007) that the lean equation (3) still holds with

$$m_{\phi\delta} = \mu I_{xz} + I_{A\epsilon x}$$
$$c_{\phi\delta} = \mu m h + \nu I_{xz} + \mu S_T + c_\epsilon S_F$$
$$k_{\phi\delta}(v) = k_{0\phi\delta} + k_{2\phi\delta}\, v^2$$
$$k_{0\phi\delta} = -S_A g, \qquad k_{2\phi\delta} = \nu (m h + S_T)$$

Here $I_{xx}$ is the moment of inertia of the total motorcycle and rider in the nominal configuration about the $b_1$ axis, and $I_{xz}$ is the inertia cross product with respect to the $b_1$ and $b_3$ axes. The term $I_{A\epsilon x}$ is the front assembly inertia cross product with respect to the steering axis and the $b_1$ axis; see Meijaard et al. (2007) for a further description of this parameter. Also, $S_A = \mu m b + m_A u_A$, where $m_A$ is the mass of the front assembly (front wheel and front frame) and $u_A$ is the offset of the mass center of the front assembly from the steering axis, that is, the distance of this mass center from the steering axis; see Fig. 1. The terms $S_F = I_{Fyy}/r_F$ and $S_T = I_{Ryy}/r_R + I_{Fyy}/r_F$ are gyroscopic terms due to the rotation of the front and rear wheels, where $r_F$ and $r_R$ are the radii of the front and rear wheels, while $I_{Fyy}$ and $I_{Ryy}$ are the moments of inertia of the front and rear wheels about their axles. It is assumed that the mass center of each wheel is located at its geometric center.

By considering an angular momentum balance about a vertical axis through $P$, one can obtain an expression for the lateral force at the front wheel. Angular momentum considerations about the steering axis for the front assembly and linearization then yield the steer equation:

$$m_{\delta\phi}\ddot\phi + m_{\delta\delta}\ddot\delta + v c_{\delta\phi}\dot\phi + v c_{\delta\delta}\dot\delta + k_{\delta\phi}\phi + k_{\delta\delta}(v)\,\delta = T_\delta \tag{4}$$

where

$$m_{\delta\phi} = m_{\phi\delta}$$
$$m_{\delta\delta} = I_{A\epsilon\epsilon} + 2\mu I_{A\epsilon z} + \mu^2 I_{zz}$$
$$k_{\delta\phi} = k_{0\phi\delta}$$
$$k_{\delta\delta}(v) = k_{0\delta\delta} + k_{2\delta\delta}\, v^2$$
$$k_{0\delta\delta} = -s_\epsilon S_A g, \qquad k_{2\delta\delta} = \nu (S_A + s_\epsilon S_F)$$
$$c_{\delta\phi} = -(\mu S_T + c_\epsilon S_F)$$
$$c_{\delta\delta} = \mu (S_A + \nu I_{zz}) + \nu I_{A\epsilon z}$$

Here, $I_{zz}$ is the moment of inertia of the total motorcycle and the rider in the nominal configuration about the $b_3$ axis, $I_{A\epsilon\epsilon}$ is the moment of inertia of the front assembly about the steering axis, and $I_{A\epsilon z}$ is the front assembly inertia cross product with respect to the steering axis and the vertical axis through $P$. The lean equation (3) combined with the steer equation (4) provides an initial model for motorcycle dynamics. This is a linear model with the nominal speed $v$ as a constant parameter and the rider's steering torque $T_\delta$ as an input.

Modes of Whipple Model 

At $v = 0$, the linearized Whipple model (3)-(4) has two pairs of real eigenvalues: $\pm p_1, \pm p_2$ with $p_2 > p_1 > 0$; see Fig. 6. The pair $\pm p_1$ roughly describes inverted pendulum behavior of the whole bike with fixed steering, while $\pm p_2$ describes inverted pendulum behavior of the front assembly with the rear frame fixed upright. As $v$ increases, the real eigenvalues corresponding to $p_1$ and $p_2$ meet and from there on form a complex conjugate pair of eigenvalues which result in a single oscillatory mode called the weave mode. Initially the weave mode is unstable, but it is stable above a certain speed $v_w$, and for large speeds, its eigenvalues are roughly a linear function of $v$; thus it becomes more damped and its frequency increases with speed. The eigenvalue corresponding to $-p_2$ remains real and becomes more negative with speed; this is called the castor mode, because it roughly corresponds to the front assembly castoring about the steer axis. The eigenvalue corresponding to $-p_1$ also remains real but increases, eventually becoming slightly positive above some speed $v_c$, resulting in an unstable system; the corresponding mode is called the capsize mode. Thus the bike is stable in the autostable speed range $(v_w, v_c)$ and unstable outside this speed range. However, above $v_c$, the unstable capsize mode is easily stabilized by a rider and usually without conscious effort. This is because the unstable capsize eigenvalue is very small, so the mode diverges only slowly (Åström et al. 2005).

Motorcycle Dynamics and Control, Fig. 6 Variation of eigenvalues of Whipple model with speed $v$

Sharp71 Model 

The Whipple bicycle model is not applicable 
at higher speeds. In particular, it does not con¬ 
tain a wobble mode which is common to bi¬ 
cycle and motorcycle behavior at higher speeds 
(Sharp 1971). A wobble mode is characterized 
mainly by oscillation of the front assembly about 
the steering axis and can sometimes be unstable. 
Also, in a real motorcycle, the damping and 
frequency of the weave mode do not continually 
increase with speed; the damping usually starts 
to decrease after a certain speed; sometimes this 
mode even becomes unstable. At higher speeds, 
one must depart from the simple non-slipping 
wheel model. In the Whipple model, the lateral 
force F on a wheel is simply that force which 
is necessary to maintain the non-holonomic con¬ 
straint which requires the velocity of the wheel 
contact point to be parallel to the wheel plane, 
that is, no sideslip. Actual tires on wheels slip in the longitudinal and lateral directions, and the lateral force depends on slip in the lateral direction, that is, sideslip. This lateral slip is defined by the slip angle $\alpha$, which is the angle between the contact point velocity and the intersection of the wheel plane and the ground; see Fig. 7.

Motorcycle Dynamics and Control, Fig. 7 Lateral force, slip angle, and camber angle

The lateral force also depends on the tire camber angle, which is the roll angle of the tire; motorcycle tires can achieve large camber angles in cornering; modern MotoGP racing motorcycles can achieve camber angles of nearly 65°. Thus an initial linear model of a tire lateral force is given by

$$F = N(-k_\alpha \alpha + k_\phi \phi) \tag{5}$$

where $N$ is the normal force on the tire, $k_\alpha > 0$ is called the tire cornering stiffness, and $k_\phi > 0$ is called the camber stiffness. Modifying the above Whipple model with the tire force model results in the appearance of the wobble mode. Since lateral forces do not instantaneously respond to changes in slip angle and camber, the dynamic model

$$\frac{\sigma}{v}\dot F + F = N(-k_\alpha \alpha + k_\phi \phi) \tag{6}$$

is usually used, where $\sigma > 0$ is called the relaxation length of the tire. This yields more realistic behavior (Sharp 1971). In this model the weave mode damping eventually decreases at higher speeds and the frequency does not continually increase. The frequency of the wobble mode is higher than that of the weave mode and its damping decreases at higher speeds.
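The relaxation model (6) is a first-order lag with time constant $\sigma/v$, so the lateral force builds up faster at higher speed. The sketch below integrates (6) with a forward-Euler step for a constant slip angle and camber angle; the stiffness values, relaxation length, and load are hypothetical, and the function names are illustrative only.

```python
import math


def steady_state_lateral_force(normal_force, k_alpha, alpha, k_phi, phi):
    """Right-hand side of (5) and (6): N * (-k_alpha * alpha + k_phi * phi)."""
    return normal_force * (-k_alpha * alpha + k_phi * phi)


def simulate_lateral_force(v, sigma, f_ss, t_end, dt=1e-5):
    """Forward-Euler integration of (sigma / v) * dF/dt + F = f_ss from F(0) = 0.

    Returns the list of force samples, one per time step (including t = 0).
    """
    force = 0.0
    samples = [force]
    for _ in range(int(round(t_end / dt))):
        force += dt * (v / sigma) * (f_ss - force)
        samples.append(force)
    return samples


if __name__ == "__main__":
    # Hypothetical tire: 1500 N load, 1 degree of slip, 20 degrees of camber.
    f_ss = steady_state_lateral_force(1500.0, 12.0, math.radians(1.0),
                                      1.0, math.radians(20.0))
    v, sigma = 25.0, 0.25          # speed (m/s) and relaxation length (m)
    tau = sigma / v                # time constant of the lag, s
    samples = simulate_lateral_force(v, sigma, f_ss, t_end=tau)
    print(f_ss, samples[-1] / f_ss)   # fraction of f_ss reached after one time constant
```

After one time constant $\sigma/v$ the force has reached roughly 63% of its steady-state value, as expected for a first-order lag.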
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Further Models 

To obtain nonlinear models, resort is usually 
made to multi-body simulation codes. In recent 
years, several researchers have used such codes to 
make nonlinear models which take into account 
other features such as frame flexibility, rider 
models, and aerodynamics; see Cossalter (2006), 
Cossalter and Lot (2002), Sharp and Limebeer 
(2001), and Sharp et al. (2004). The nonlinear 
behavior of the tires is usually modeled with 
a version of the Magic formula; see Pacejka 
(2006), Sharp et al. (2004), and Cossalter et al. 
(2003). Another line of research is to use some 
of these models to obtain optimal trajectories for 
high performance; see Saccon et al. (2012). 


Summary and Future Directions 

We have presented a basic linearized model of a 
motorcycle or bicycle useful for the understand¬ 
ing and control of these two wheeled vehicles. 
It seems that inclusion of further features in the 
model and the consideration of full nonlinear 
behavior require the use of multibody simulation 
software. Future research will consider models 
which will include the engine, transmission, and 
an active pilot. Autonomous control of these 
vehicles will also be considered. 


Cross-References 
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► Transmission 

► Vehicle Dynamics Control 
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Synonyms 

MHE 


Abstract 

Moving horizon estimation (MHE) is a state esti¬ 
mation method that is particularly useful for non¬ 
linear or constrained dynamic systems for which 
few general methods with established properties 
are available. This entry explains the concept of 
full information estimation and introduces mov¬ 
ing horizon estimation as a computable approx¬ 
imation of full information. The basic design 
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methods for ensuring stability of MHE are pre¬ 
sented. The relationships of full information and 
MHE to other state estimation methods such 
as Kalman filtering and statistical sampling are 
discussed. 

Keywords 

Full information estimation; Kalman filtering; 
Statistical sampling 

Introduction 

In state estimation, we consider a dynamic system from which measurements are available. In discrete time, the system description is

$$x^+ = f(x, w), \qquad y = h(x) + v \tag{1}$$

The state of the system is $x \in \mathbb{R}^n$, the measurement is $y \in \mathbb{R}^p$, and the notation $x^+$ means $x$ at the next sample time. A control input $u$ may be included in the model, but it is considered a known variable, and its inclusion is irrelevant to state estimation, so we suppress it in the model under consideration here. We receive measurement $y$ from the sensor, but the process disturbance, $w \in \mathbb{R}^g$; measurement disturbance, $v \in \mathbb{R}^p$; and system initial state, $x(0)$, are considered unknown variables.

The goal of state estimation is to construct or 
estimate the trajectory of x from only the mea¬ 
surements y. Note that for control purposes, we 
are usually interested in the estimate of the state 
at the current time, $T$, rather than the entire trajectory over the time interval $[0, T]$. In the moving horizon estimation (MHE) method, we use
optimization to achieve this goal. We have two 
sources of error: the state transition is affected 
by an unknown process disturbance (or noise), 
w, and the measurement process is affected by 
another disturbance, v. In the MHE approach, we 
formulate the optimization objective to minimize 
the size of these errors thus finding a trajectory of 
the state that comes close to satisfying the (error- 
free) model while still fitting the measurements. 


First, we define some notation necessary to distinguish the system variables from the estimator variables. We have already introduced the system variables $(x, w, y, v)$. In the estimator optimization problem, these have corresponding decision variables, which we denote by the Greek letters $(\chi, \omega, \eta, \nu)$. The relationships between these variables are

$$\chi^+ = f(\chi, \omega), \qquad y = h(\chi) + \nu \tag{2}$$

and they are depicted in Fig. 1. Notice that $\nu$ measures the gap between the model prediction $\eta = h(\chi)$ and the measurement $y$. The optimal decision variables are denoted $(\hat x, \hat w, \hat y, \hat v)$, and these optimal decisions are the estimates provided by the state estimator.

Full Information Estimation 

The full information objective function is

$$V_T(\chi(0), \boldsymbol{\omega}) = \ell_x(\chi(0) - \bar x_0) + \sum_{i=0}^{T-1} \ell_i(\omega(i), \nu(i)) \tag{3}$$

subject to (2), in which $T$ is the current time, $\boldsymbol{\omega}$ is the estimated sequence of process disturbances, $(\omega(0), \ldots, \omega(T-1))$, $y(i)$ is the measurement at time $i$, and $\bar x_0$ is the prior, i.e., available, value of the initial state. Full information here means that we use all the data on time interval $[0, T]$ to estimate the state (or state trajectory) at time $T$. The stage cost $\ell_i(\omega, \nu)$ penalizes the model disturbance and the fitting error, the two error sources that we reconcile in all state estimation problems.

The full information estimator is then defined as the solution to

$$\min_{\chi(0),\, \boldsymbol{\omega}} V_T(\chi(0), \boldsymbol{\omega}) \tag{4}$$

The solution to the optimization exists for all $T \in \mathbb{I}_{\ge 0}$ under mild continuity assumptions and choice of stage cost. Many choices of (positive, continuous) stage costs $\ell_x(\cdot)$ and $\ell_i(\cdot)$ are possible, providing a rich class of estimation problems that can be tailored to different applications. Because the system model (1) and cost function (3) are so general, it is perhaps best to start off by specializing them to see the connection to some classic results.

Moving Horizon Estimation, Fig. 1 The state, measured output, and disturbance variables appearing in the state estimation optimization problem. The state trajectory (gray circles in lower half) is to be reconstructed given the measurements (black circles in upper half)


Related Problem: The Kalman Filter 

If we specialize to the linear dynamic model $f(x, w) = Ax + Gw$, $h(x) = Cx$, and let $x(0)$, $w$, and $v$ be independent, normally distributed random variables, the classic Kalman filter is known to be the statistically optimal estimator, i.e., the Kalman filter produces the state estimate that maximizes the conditional probability of $x(T)$ given $y(0), \ldots, y(T)$. The full information estimator is equivalent to the Kalman filter given the linear model assumption and the following choice of quadratic stage costs

$$\ell_x(\chi(0) - \bar x_0) = \tfrac{1}{2}\, \|\chi(0) - \bar x_0\|^2_{P_0^{-1}}$$

$$\ell_i(\omega, \nu) = \tfrac{1}{2} \left( \|\omega\|^2_{Q^{-1}} + \|\nu\|^2_{R^{-1}} \right)$$

in which random variable $x(0)$ is assumed to have mean $\bar x_0$ and variance $P_0$, and random variables $w$ and $v$ are assumed zero mean with variances $Q$ and $R$, respectively. The Kalman filter is also a recursive solution to the state estimation problem so that only the current mean $\hat x$ and variance $P$ of the conditional density are required to be stored, instead of the entire history of measurements $y(i)$, $i = 0, \ldots, T$. This computational efficiency is critical for success in online application for processes with short time scales requiring fast processing.
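To make the recursive structure concrete, here is a minimal sketch of the Kalman filter for a scalar system $x^+ = ax + w$, $y = cx + v$ (that is, $A = a$, $G = 1$, $C = c$). Only the current estimate and variance are carried from one sample to the next. The function name and the numerical values are illustrative assumptions, not part of the entry.

```python
import math


def kalman_filter(measurements, a, c, q, r, x0_bar, p0):
    """Scalar Kalman filter for x+ = a x + w, y = c x + v.

    q and r are the variances of w and v; (x0_bar, p0) is the prior on x(0).
    Returns the filtered estimates and the final one-step-ahead (prior)
    mean and variance.
    """
    x_prior, p_prior = x0_bar, p0
    filtered = []
    for y in measurements:
        # Measurement update: blend the prior with the new measurement.
        gain = p_prior * c / (c * c * p_prior + r)
        x_post = x_prior + gain * (y - c * x_prior)
        p_post = (1.0 - gain * c) * p_prior
        filtered.append((x_post, p_post))
        # Time update: propagate mean and variance through the model.
        x_prior = a * x_post
        p_prior = a * a * p_post + q
    return filtered, x_prior, p_prior


if __name__ == "__main__":
    ys = [1.0, 0.8, 1.3, 0.9, 1.1]
    estimates, x_next, p_next = kalman_filter(ys, a=0.9, c=1.0, q=0.1, r=0.5,
                                              x0_bar=0.0, p0=1.0)
    for x_hat, p in estimates:
        print(round(x_hat, 4), round(p, 4))
```

The storage requirement is two numbers regardless of how many measurements have been processed, which is the efficiency the text refers to.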

But if we consider nonlinear models, the max¬ 
imization of conditional density is usually an 
intractable problem, especially in online appli¬ 
cations. So, MHE becomes a natural alternative 
for nonlinear models or if an application calls for 
hard constraints to be imposed on the estimated 
variables. 
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Moving the Horizon 

An obvious problem with solving the full information optimization problem is that the number of decision variables grows linearly with time $T$, which quickly renders the problem intractable for continuous processes that have no final time. A natural alternative to full information is to consider instead a finite moving horizon of the most recent $N$ measurements. Figure 2 displays this idea. The initial condition $\chi(0)$ is now replaced by the initial state in the horizon, $\chi(T-N)$, and the decision variable sequence of process disturbances is now just the last $N$ variables $\boldsymbol{\omega} = (\omega(T-N), \ldots, \omega(T-1))$. Now, the big question remaining is what to do about the neglected, past data. This question is strongly related to what penalty to use on the initial state in the horizon $\chi(T-N)$. If we make this initial state a free variable, that is equivalent to completely discounting the past data. If we wish to retain some of the influence of the past data and keep the moving horizon estimation problem close to the full information problem, then we must choose an appropriate penalty for the initial state. We discuss this problem next.

Arrival Cost. When time is less than or equal to the horizon length, $T \le N$, we can simply do full information estimation. So we assume throughout that $T > N$. For $T > N$, we express the MHE objective function as

$$\hat V_T(\chi(T-N), \boldsymbol{\omega}) = \Gamma_{T-N}(\chi(T-N)) + \sum_{i=T-N}^{T-1} \ell_i(\omega(i), \nu(i))$$

subject to (2). The MHE problem is defined to be

$$\min_{\chi(T-N),\, \boldsymbol{\omega}} \hat V_T(\chi(T-N), \boldsymbol{\omega}) \tag{5}$$

in which $\boldsymbol{\omega} = \{\omega(T-N), \ldots, \omega(T-1)\}$ and the hat on $V$ distinguishes the MHE objective function from full information. The designer must now choose this prior weighting $\Gamma_k(\cdot)$ for $k > N$.

Moving Horizon Estimation, Fig. 2 Schematic of the moving horizon estimation problem

To think about how to choose this prior weighting, it is helpful to first think about solving the full information problem by breaking it into two non-overlapping sequences of decision variables: the decision variables in the time interval corresponding to the neglected data $(\omega(0), \ldots, \omega(T-N-1))$ and those in the time interval corresponding to the considered data in the horizon $(\omega(T-N), \ldots, \omega(T-1))$. If we optimize over the first sequence of variables and store the solution as a function of the terminal state $\chi(T-N)$, we have defined what is known as the arrival cost. This is the optimal cost to arrive at a given state value.

Definition 1 (arrival cost) The (full information) arrival cost is defined for $k \ge 1$ as

$$Z_k(x) = \min_{\chi(0),\, \boldsymbol{\omega}} V_k(\chi(0), \boldsymbol{\omega}) \tag{6}$$

subject to (2) and $\chi(k; \chi(0), \boldsymbol{\omega}) = x$.

Notice the terminal constraint that $\chi$ at time $k$ ends at value $x$. Given this arrival cost function, we can then solve the full information problem by optimizing over the remaining decision variables. What we have described is simply the dynamic programming strategy for optimizing over a sum of stage costs with a dynamic model (Bertsekas 1995).

We have the following important equivalence. 

Lemma 1 (MHE and full information estimation) The MHE problem (5) is equivalent to the full information problem (4) for the choice $\Gamma_k(\cdot) = Z_k(\cdot)$ for all $k > N$ and $N \ge 1$.

Using dynamic programming to decompose the full information problem into an MHE problem with an arrival cost penalty is conceptually important to understand the structure of the problem, but it does not yet provide us with an implementable estimation strategy because we cannot compute and store the arrival cost when the model is nonlinear or other constraints are present in the problem. But if we are not too worried about the optimality of the estimator and are mainly interested in other properties, such as stability of the estimator, we can find simpler design methods for choosing the weighting $\Gamma_k(\cdot)$. We address this issue next.
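Lemma 1 can be illustrated numerically in the one case where the arrival cost is easy to store: a linear model with quadratic costs. There the arrival cost is, up to a constant, a quadratic centered at the Kalman filter's one-step-ahead estimate with weight equal to the inverse of its variance, a standard result that this sketch takes as given. The scalar model $x^+ = ax + w$, $y = cx + v$, the function names, and all numbers below are illustrative assumptions.

```python
def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    sol = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * sol[k] for k in range(i + 1, n))
        sol[i] = (aug[i][n] - tail) / aug[i][i]
    return sol


def window_estimate(y_window, a, c, q, r, x_prior, p_prior):
    """Minimize a quadratic prior on the first state plus the stage costs
    over the window; returns the estimate of the state at the window's end.
    Unknowns: z = [chi(first), omega(first), ..., omega(last)]."""
    n = len(y_window) + 1
    first = [1.0] + [0.0] * (n - 1)
    rows = [(1.0 / p_prior, first, x_prior)]        # prior weighting
    state = first[:]                                # chi(i) as a linear map of z
    for i, y in enumerate(y_window):
        omega_row = [0.0] * n
        omega_row[i + 1] = 1.0
        rows.append((1.0 / q, omega_row, 0.0))                 # disturbance penalty
        rows.append((1.0 / r, [c * s for s in state], y))      # fitting error
        state = [a * s for s in state]
        state[i + 1] += 1.0                         # chi(i+1) = a chi(i) + omega(i)
    normal = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for weight, row, target in rows:                # weighted normal equations
        for j in range(n):
            rhs[j] += weight * row[j] * target
            for k in range(n):
                normal[j][k] += weight * row[j] * row[k]
    z = solve_linear(normal, rhs)
    return sum(s * zj for s, zj in zip(state, z))


def kalman_prior(y_past, a, c, q, r, x0_bar, p0):
    """One-step-ahead Kalman mean and variance after processing y_past."""
    x, p = x0_bar, p0
    for y in y_past:
        gain = p * c / (c * c * p + r)
        x, p = a * (x + gain * (y - c * x)), a * a * (1.0 - gain * c) * p + q
    return x, p


def mhe_estimate(y_all, horizon, a, c, q, r, x0_bar, p0):
    """MHE over the last `horizon` measurements with the arrival cost as prior."""
    split = max(len(y_all) - horizon, 0)
    x_prior, p_prior = kalman_prior(y_all[:split], a, c, q, r, x0_bar, p0)
    return window_estimate(y_all[split:], a, c, q, r, x_prior, p_prior)


if __name__ == "__main__":
    ys = [1.0, 0.8, 1.3, 0.9, 1.1, 1.4, 0.7, 1.0]
    model = dict(a=0.9, c=1.0, q=0.1, r=0.5, x0_bar=0.0, p0=1.0)
    print("full information:", window_estimate(ys, 0.9, 1.0, 0.1, 0.5, 0.0, 1.0))
    for n_horizon in (1, 3, 5):
        print("MHE, N =", n_horizon, mhe_estimate(ys, n_horizon, **model))
```

With the arrival cost as prior weighting, every horizon length reproduces the full information estimate to within rounding error. In the nonlinear or constrained case no such closed-form arrival cost is available, which is why simpler prior weightings are of interest.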


Estimator Properties: Stability 

An estimator is termed stable if small disturbances $(w, v)$ lead to small estimate errors $\hat x - x$ as time increases. Precise definitions of this basic idea are available elsewhere (Rawlings and Ji 2012), but this basic notion is sufficient for the purposes of this overview. In applications, properties such as stability and insensitivity to model errors are usually more important than optimality. It is possible for a filter to be optimal and still not stable. In the linear system context, this cannot happen for “nice” systems. Such nice systems are classified as detectable. Again, the precise definition of detectability for the linear case is available in standard references (Kwakernaak and Sivan 1972). Defining detectability for nonlinear systems is a more delicate affair, but useful definitions are becoming available for the nonlinear case as well (Sontag and Wang 1997).

If we lower our sights and do not worry if MHE is equivalent to full information estimation and require only that it be a stable estimator, then the key result is that the prior penalty $\Gamma_k(\cdot)$ need only be chosen smaller than the arrival cost as shown in Fig. 3. See Rawlings and Mayne (2009, Theorem 4.20) for a precise statement of this result. Of course this condition includes the flat arrival cost, which does not penalize the initial state in the horizon at all. So neglecting the past data completely leads to a stable estimator for detectable systems. If we want to improve on this performance, we can increase the prior penalty, and we are guaranteed to remain stable as long as we stay below the upper limit set by the arrival cost.

Moving Horizon Estimation, Fig. 3 Arrival cost $Z_k(x)$, underbounding prior weighting $\Gamma_k(x)$, and MHE optimal value $\hat V_k^0$; for all $x$ and $k > N$, $Z_k(x) \ge \Gamma_k(x) \ge \hat V_k^0$, and $Z_k(\hat x(k)) = \Gamma_k(\hat x(k)) = \hat V_k^0$

Related Problem: Statistical Sampling 

MHE is based on optimizing an objective func¬ 
tion that bears some relationship to the condi¬ 
tional probability of the state (trajectory) given 
the measurements. As discussed in the section 
on the Kalman filter, if the system is linear with 
normally distributed noise, this relationship can 
be made exact, and MHE is therefore an optimal 
statistical estimator. But in the nonlinear case, 
the objective function is chosen with engineering 
judgment and is only a surrogate for the condi¬ 
tional probability. By contrast, sampling methods 
such as particle filtering are designed to sam¬ 
ple the conditional density also in the nonlinear 
case. The mean and variance of the samples then 
provide estimates of the mean and variance of 
the conditional density of interest. In the limit 
of infinitely many samples, these methods are 
exact. The efficiency of the sampling methods depends strongly on the model and the dimension $n$ of the state vector, however. The efficiency of the sampling strategy is particularly important
for online use of state estimators. Rawlings and 
Bakshi (2006) and Rawlings and Mayne (2009, 
pp. 329-355) provide some comparisons of par¬ 
ticle filtering with MHE and also describe some 
hybrid methods combining MHE and particle 
filtering. 

Summary and Future Directions 

MHE is one of few state estimation methods that 
can be applied to nonlinear models for which 
properties such as estimator stability can be es¬ 
tablished (Rao et al. 2003; Rawlings and Mayne 
2009). The required online solution of an opti¬ 
mization problem is computationally demanding 
in some applications but can provide signifi¬ 
cant benefits in estimator accuracy and rate of 
convergence (Patwardhan et al. 2012). Current 
topics for MHE theoretical research include treat¬ 
ing bounded rather than convergent disturbances 
and establishing properties of suboptimal MHE 
(Rawlings and Ji 2012). The current main focus 
for MHE applied research involves reducing the 
online computational complexity to reliably han¬ 
dle challenging large dimensional, nonlinear ap¬ 
plications (Kuhl et al. 2011; Lopez-Negrete and 
Biegler 2012; Zavala and Biegler 2009; Zavala 
et al. 2008). 

Cross-References 

► Bounds on Estimation 

► Estimation, Survey on 

► Extended Kalman Filters 

► Nonlinear Filters 

► Particle Filters 


Recommended Reading 

Moving horizon estimation has by this point 
a fairly extensive literature; a recent overview 
is provided in Rawlings and Mayne (2009, 
pp. 356-357). The following references provide 


either (i) general background required to 
understand MHE theory and its relationship to 
other methods or (ii) computational methods for 
solving the real-time MHE optimization problem 
or (iii) challenging nonlinear applications that 
demonstrate benefits and probe the current limits 
of MHE implementations. 
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Multi-domain Modeling and 
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Germany 

Abstract 

One starting point for the analysis and design 
of a control system is the block diagram 
representation of a plant. Since it is nontrivial to 
convert a physical model of a plant into a block 
diagram, this can be performed manually only 
for small plant models. Based on research from 
the last 35 years, more and more mature tools 
are available to achieve this transformation fully 
automatically. As a result, multi-domain plants, 
for example, systems with electrical, mechanical, 
thermal, and fluid parts, can be modeled in a 
unified way and can be used directly as input- 
output blocks for control system design. An 
overview of the basic principles of this approach
is given. This approach also makes it possible to use
nonlinear, multi-domain plant models directly in
a controller. Finally, the low-level “Functional
Mockup Interface” standard for exchanging
multi-domain models between many different
modeling and simulation environments is sketched.

Keywords 

Block diagram; Bond graph; Differential- 
algebraic equation (DAE) system; Flow variable; 


FMI for Co-Simulation; FMI for Model 
Exchange; Functional Mockup Interface; Inverse 
models; Modelica; Object-oriented modeling; 
Potential variable; Stream variable; Symbolic 
transformation; VHDL-AMS 

Introduction 

Methods and tools for control system analysis 
and design usually require an input-output block 
diagram description of the plant to be controlled. 
Apart from small systems, it is nontrivial to de¬ 
rive such models from first principles of physics. 
Methods and tools have long been available
to construct such models automatically for one
domain, for example, a mechanical model, an
electronic circuit, or a hydraulic circuit. These
domain-specific methods and tools are, however, only
of limited use for the modeling of multi-domain 
systems. 

In the dissertation (Elmqvist 1978), a suitable approach for multi-domain, object-oriented modeling was developed by introducing a modeling language to define models on a high level based on first principles. The resulting DAE (differential-algebraic equation) systems are automatically transformed with suitable algorithms into a block diagram description with input and output signals based on ODEs (ordinary differential equations).

In 1978, computers were not powerful enough to apply this method to larger systems. This changed in the 1990s: the technology was substantially improved, many different modeling languages appeared (and also disappeared), and the technology was introduced in commercial simulation environments.

In Table 1, an overview of the most important standards, languages, and tools in the year 2013 for multi-domain modeling is given:

• The Modelica language is a standard from The Modelica Association (Modelica Association 2012). The first version was released in 1997. A large free library is also provided, with about 1,300 model components from many domains. There are several software tools supporting this modeling language and the free Modelica Standard Library. The examples of this entry are mostly provided from this standard.

Multi-domain Modeling and Simulation, Table 1 Multi-domain modeling and simulation environments

Tool name                  Web (accessed December 2013)

Environments based on the Modelica Standard (https://www.Modelica.org)
CyModelica                 http://cydesign.com/
Dymola                     http://www.dymola.com/
JModelica.org              http://www.jmodelica.org/
LMS Imagine.Lab AMESim     http://www.lmsintl.com/LMS-Imagine-Lab-AMESim
MapleSim                   http://www.maplesoft.com/products/maplesim
MWorks                     http://en.tongyuan.cc/
OpenModelica               https://openmodelica.org/
SimulationX                http://www.itisim.com/simulationx/
Wolfram SystemModeler      http://www.wolfram.com/system-modeler/

Environments based on the VHDL-AMS Standard (http://www.eda.org/twiki/bin/view.cgi/P10761)
ANSYS Simplorer            http://www.ansys.com/Products
Saber                      http://www.synopsys.com/Systems/Saber
SMASH                      http://www.dolphin.fr/medal/products/smash/smashoverview.php
SystemVision               http://www.mentor.com/products/sm/systemvision
Virtuoso AMS designer      http://www.cadence.com

Environments with vendor-specific multi-domain modeling languages
EcosimPro                  http://www.ecosimpro.com/
gPROMS                     http://www.psenterprise.com/gproms
OpenMAST                   http://www.openmast.org/
Simscape                   https://www.mathworks.com/products/simscape

Environments based on the Bond Graph Methodology
20-sim                     http://www.20sim.com/

The following registered trademarks are referenced:

Registered trademark       Owner of trademark
AMESim                     IMAGINE SA
ANSYS                      ANSYS Inc.
Dymola                     Dassault Systèmes AB
EcosimPro                  Empresarios Agrupados A.I.E.
gPROMS                     Process Systems Enterprise Limited
MATLAB                     The MathWorks Inc.
Modelica                   Modelica Association
Saber                      Sabremark Limited partnership
SimulationX                ITI GmbH
Simulink                   The MathWorks Inc.
SystemVision               Mentor Graphics Corporation
Virtuoso                   Cadence Design


• The VHDL-AMS language is a standard from 
IEEE (IEEE 1076.1-2007 2007), first released 


in 1999. It is an extension of the widely used 
VHDL hardware description language. This 
language is especially used in the electronics 
community. 

• There are several vendor-specific modeling
languages, notably Simscape from MathWorks
as an extension to Simulink, as well
as MAST, the underlying modeling language
of Saber (Mantooth and Vlach 1992). In 2004,
MAST was published as OpenMAST under
an open source license.

• Bond graphs (see, e.g., Karnopp et al. 2012)
are a special graphical notation to define
multi-domain systems based on energy flow.
The notation was invented in 1959 by Henry M. Paynter.

In the section “Modeling Language Principles”,
the principles of multi-domain modeling
based on a modeling language are summarized.
In the section “Models for Control Systems”,
it is shown how such models can be used not
only for simulation but also as components in
nonlinear control systems. Finally, the section
“The Functional Mockup Interface” gives an overview
of a low-level standard for the exchange of
multi-domain systems.


Modeling Language Principles 

Schematics: The Graphical View 

Modelers nowadays require an easy-to-use
graphical environment to build up models. With
very few exceptions, multi-domain environments 
define models by schematic diagrams. A typical 
example is given in Fig. 1, showing a simple 
direct-current electrical motor in Modelica. 

In the lower left part, the electrical circuit 
diagram of the DC motor is visible, consisting 
mainly of the armature resistance and inductance 
of the motor, a voltage source, and component 
“emf” to model in an idealized way the electromotoric
forces in the air gap. On the lower right
part, the motor inertia, a gear box, and a load 
inertia are present. In the upper part, the heat 
transfer of the resistor losses to the environment 
is modeled with lumped elements. 


A component, like a resistor, rotational inertia, 
or convective heat transfer, is shown as an icon 
in the diagram. On the border of a component, 
small rectangular or circular signs are present 
representing the “physical ports.” Ports are connected
by lines and model the (idealized) physical
or signal interaction between ports of different 
components, for example, the flow of electrical 
current or heat or the rigid mechanical coupling. 

Components are built up hierarchically from 
other components. On the lowest level, components
are described textually with the respective
modeling language (see section “Component
Equations”). 

Coupling Components by Ports 

The ports define how a component can interact
with other components. A port contains (a) a
definition of the variables that describe the interface
and (b) a definition of the way in which a tool can
automatically construct the equations of connections. A
typical scenario is shown in Fig. 2 where the ports
of the three components A, B, C are connected
together at one point P:


Multi-domain Modeling and Simulation, Fig. 1 Modelica schematic of DC motor with mechanical load and heat
losses

When cutting away the connection lines, the
resulting system consists of three decoupled
components A, B, C and a new component around
P describing the infinitesimally small connection
point. The balance equations and the boundary
conditions of the respective domain must hold at
all these components. When drawing the connection
lines, enough information must be available
in the port definitions so that the tool can
construct the equations of the infinitesimally small
connection points automatically.

To summarize, the component developer is
responsible that the balance equations and boundary
conditions are fulfilled for every component
(A, B, C in Fig. 2), and the tool is responsible that
the balance equations and boundary conditions
are also fulfilled at the points where the components
are connected together (P in Fig. 2). As a
consequence, the balance equations and boundary
conditions are fulfilled in the overall model
containing all components and all connections.

Multi-domain Modeling and Simulation, Fig. 2
Cutting the connections around the connection point P
results in three decoupled components A, B, C and a new
component around P describing the infinitesimally small
connection point

In order that a tool can automatically construct
the equations at a connection point, every port
variable needs to be associated to a port variable
type. In Table 2, some port variable types of
Modelica are shown. In this table it is assumed
that $u_1, u_2, \ldots, u_n, y$, $v_1, v_2, \ldots, v_n$,
$f_1, f_2, \ldots, f_n$, and $s_1, s_2, \ldots, s_n$ are
corresponding port variables
from different components that are connected
together at the same point P.

Port variable types “input” and “output” define 
the “usual” signal connections in block diagrams. 

“Potential variables” and “flow variables” are 
used to define standard physical connections. For 
example, an electrical port contains the electrical 
potential and the electrical current at the port, 
and when connecting electrical ports together, 
the electrical potentials are identical and the sum 
of the electrical currents is zero, according to 
Table 2. This corresponds exactly to Kirchhoff’s 
voltage and current laws. 
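
The rule for potential and flow variables can be sketched in a few lines of Python. This is only an illustration of how a tool might assemble the equations of one connection point; the function name `connection_equations` and the convention that every port carries a potential `v` and a flow `i` are assumptions made for this sketch, not part of any Modelica tool.

```python
def connection_equations(ports):
    """Return the equations for ports connected at one point.

    `ports` is a list of port names such as "A.p". Every port is assumed
    (hypothetically) to carry a potential variable `v` and a flow variable `i`.
    """
    if len(ports) < 2:
        raise ValueError("a connection needs at least two ports")
    # Potential variables: all connected potentials are identical.
    equations = [f"{a}.v = {b}.v" for a, b in zip(ports, ports[1:])]
    # Flow variables: the flows into all connected ports sum to zero.
    equations.append("0 = " + " + ".join(f"{p}.i" for p in ports))
    return equations
```

For three ports connected at P, the sketch yields two potential equations and one flow balance, i.e., n equations for n connected ports.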

“Stream variables” are used to describe the
connection semantics of intensive quantities in
bidirectional fluid flow, such as specific enthalpy
or mass fraction. Here, the idealized balance
equation at a connection point states, for example,
that the sum of the port enthalpy flow rates is
zero and the port enthalpy flow rate is computed
as the product of the mass flow rate (a flow
variable $f_i$) and the directional specific enthalpy
$\hat{s}_i$, which is either the (yet unknown) mixing-specific
enthalpy $s_{\mathrm{mix}}$ when the flow is from
the connection point to the port or the specific
enthalpy $s_i$ in the port when the flow is from
the port to the connection point. More details


Multi-domain Modeling and Simulation, Table 2 Some port variable types in Modelica

Port variable type                    Connection semantics

Input variables $u_i$,                $u_1 = u_2 = \ldots = u_n = y$ (exactly one output variable
output variable $y$                   can be connected to $n$ input variables)

Potential variables $v_i$             $v_1 = v_2 = \ldots = v_n$

Flow variables $f_i$                  $0 = \sum_i f_i$

Stream variables $s_i$ (with          $0 = \sum_i f_i \hat{s}_i$, where $\hat{s}_i = s_{\mathrm{mix}}$ if $f_i > 0$
associated flow variables $f_i$)      and $\hat{s}_i = s_i$ if $f_i \le 0$ (with $0 = \sum_i f_i$)

Multi-domain Modeling and Simulation, Table 3 Some port definitions from Modelica

Domain                                Port variables

Electrical analog                     Electrical potential in [V] (pot.)
                                      Electrical current in [A] (flow)

Elec. multiphase                      Vector of electrical ports

Electrical quasi-stationary           Complex elec. potential (pot.)
                                      Complex elec. current (flow)

Magnetic flux tubes                   Magnetic potential in [A] (pot.)
                                      Magnetic flux in [Wb] (flow)

Translational (1-dim. mechanics)      Distance in [m] (pot.)
                                      Cut-force in [N] (flow)

Rotational (1-dim. mechanics)         Absolute angle in [rad] (pot.)
                                      Cut-torque in [Nm] (flow)

2-dim. mechanics                      Position in x-direction in [m] (pot.)
                                      Position in y-direction in [m] (pot.)
                                      Absolute angle in [rad] (pot.)
                                      Cut-force in x-direction in [N] (flow)
                                      Cut-force in y-direction in [N] (flow)
                                      Cut-torque in z-direction in [Nm] (flow)

3-dim. mechanics                      Position vector in [m] (pot.)
                                      Transformation matrix in [1] (pot.)
                                      Cut-force vector in [N] (flow)
                                      Cut-torque vector in [Nm] (flow)

1-dim. heat transfer                  Temperature in [K] (pot.)
                                      Heat flow rate in [W] (flow)

1-dim. thermo-fluid pipe flow         Pressure in [Pa] (pot.)
                                      Mass flow rate in [kg/s] (flow)
                                      Spec. enthalpy in [J/kg] (stream)
                                      Mass fractions in [1] (stream)


$$0 = i_p + i_n$$
$$u = v_p - v_n$$
$$C \, \frac{du}{dt} = i_p$$



and explanations are available from Franke et al. 
(2009). In Table 3 some of the port definitions are 
shown that are defined in the Modelica Standard 
Library. 
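
As a hedged illustration of the stream-variable semantics, the balance $0 = \sum_i f_i \hat{s}_i$ can be solved for the mixing value once the flow directions are known. The following Python sketch assumes the sign convention stated in the text (a positive flow runs from the connection point into the port); the function name `mixing_value` is hypothetical, and the zero-flow case, which real tools have to regularize, is simply rejected here.

```python
def mixing_value(flows, stream_values):
    """Solve 0 = sum(f_i * s_hat_i) for the mixing value s_mix.

    flows[i] > 0 means flow from the connection point into port i (which then
    sees s_mix); flows[i] <= 0 means flow from port i into the connection
    point (which carries the port's own value stream_values[i]).
    """
    scale = max(1.0, max(abs(f) for f in flows))
    if abs(sum(flows)) > 1e-9 * scale:
        raise ValueError("flow variables must sum to zero at the connection")
    outflow = sum(-f for f in flows if f < 0)  # total flow leaving the ports
    if outflow == 0.0:
        raise ValueError("no flow: s_mix is not determined by the balance")
    carried = sum(-f * s for f, s in zip(flows, stream_values) if f < 0)
    return carried / outflow
```

The result is the flow-weighted average of the values delivered by the ports that feed the connection point, which is what an ideal mixing balance would be expected to give.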

Component Equations 

Implementing a component in a modeling language
means to (a) define the ports of the component
and (b) provide the equations describing
the relationships between the port variables. For
example, an electrical capacitor with constant
capacitance C can be defined by the equations on
the right side of Fig. 3.

Such a component has two ports, the pins
“p” and “n,” and the port variables are the electrical
currents $i_p$, $i_n$ flowing into the respective
ports and the electrical potentials $v_p$, $v_n$ at the
ports. The first component equation states that
if the current $i_p$ at port “p” is positive, then the
current $i_n$ at port “n” is negative (therefore, the
current flowing into “p” is flowing out of “n”).


Multi-domain Modeling and Simulation, Fig. 3 Equations of a capacitor component


Furthermore, the two remaining equations state
that the derivative of the difference of the port
potentials is proportional to the current flowing
into port “p.”

One important question is how many equations
are needed to describe such a component.
For an input-output block, this is simple: all input
variables are known, and for all other variables,
one equation per unknown is needed. Counting
equations for physical components, such as
a capacitor, is more involved: the requirement
that any type of component connections shall
always result in identical numbers of unknowns
and equations of the overall system leads to the
following counting rule (for a proof, see Olsson
et al. 2008):

1. The number of potential and the number of
flow variables in a port must be identical.

2. Input variables and variables that appear
differentiated are treated as known variables.

3. The number of equations of a component must
be equal to the number of unknowns minus the
number of flow variables.

In the example of the capacitor, there are 5 unknowns
($i_p$, $i_n$, $v_p$, $v_n$, $du/dt$) and 2 flow variables
($i_p$, $i_n$). Therefore, $5 - 2 = 3$ equations are needed
to define this component.
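
The third rule is simple arithmetic, and a minimal Python sketch makes the bookkeeping explicit. The function name `required_equations` and the variable-name strings are hypothetical and serve only to mirror the capacitor example above.

```python
def required_equations(unknowns, flow_variables):
    """Counting rule: number of equations = unknowns - flow variables."""
    missing = set(flow_variables) - set(unknowns)
    if missing:
        raise ValueError(f"flow variables not among the unknowns: {sorted(missing)}")
    return len(set(unknowns)) - len(set(flow_variables))


# Capacitor: u appears differentiated and is therefore treated as known,
# so der(u) is the unknown associated with it.
capacitor_unknowns = ["p.i", "n.i", "p.v", "n.v", "der(u)"]
capacitor_flows = ["p.i", "n.i"]
```

Calling `required_equations(capacitor_unknowns, capacitor_flows)` reproduces the count of three equations derived in the text.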

Modeling languages are used to provide a textual
description of the ports and of the equations
in a specific syntax. For example, in Modelica
the capacitor from Fig. 3 can be defined as shown
in Fig. 4:

type Voltage = Real(unit="V");
type Current = Real(unit="A");

connector Pin
  Voltage v;       // potential variable
  flow Current i;  // flow variable
end Pin;

model Capacitor
  parameter Real C(unit="F");
  Pin p, n;
  Voltage u;
equation
  0 = p.i + n.i;
  u = p.v - n.v;
  C*der(u) = p.i;
end Capacitor;

Multi-domain Modeling and Simulation, Fig. 4 Modelica model of capacitor component

In VHDL-AMS the capacitor model can be
defined as shown in Fig. 5.

One difference between Modelica and VHDL-AMS
is that in Modelica all equations need to be
explicitly given and port variables (such as p.i)
can be directly accessed in the model (Fig. 4).
In VHDL-AMS (and some other modeling
languages), by contrast, port variables cannot be
accessed in a model. Instead, the “quantity .. across
.. through .. to ..” construction implicitly defines the
relationships between the port variables, which
correspond to the Modelica equations “0 =
p.i + n.i” and “u = p.v - n.v.”

Simulation of Multi-domain Systems 

Collecting all the component equations of
a multi-domain system model together with
all connection equations results in a DAE
(differential-algebraic equation) system:

$$0 = f(\dot{x}, x, w, y, u, t) \qquad (1)$$

where $t \in \mathbb{R}$ is time, $x(t) \in \mathbb{R}^{n_x}$ are
variables appearing differentiated, $w(t) \in \mathbb{R}^{n_w}$ are
algebraic variables, $y(t) \in \mathbb{R}^{n_y}$ are outputs,
$u(t) \in \mathbb{R}^{n_u}$ are inputs, and $f \in \mathbb{R}^{n_x + n_w + n_y}$ are
the DAE equations. Equation (1) can be solved
numerically with an integrator for DAE systems;
see, for example, Brenan et al. (1996). For DAEs
that are linear in their unknowns, a complete
theory for solvability is available based on matrix
pencils (see, e.g., Brenan et al. 1996) and also
reliable software for their analysis (Varga 2000).

Unfortunately, only certain classes of nonlinear
DAEs can be directly solved numerically


subtype voltage is real;
subtype current is real;
nature electrical is
  voltage across
  current through
  electrical_ref reference;

entity CapacitorInterface is
  generic (C: real);
  port (terminal p, n: electrical);
end entity CapacitorInterface;

architecture SimpleCapacitor of CapacitorInterface is
  -- u is the voltage across, i the current through the branch from p to n
  quantity u across i through p to n;
begin
  i == C * u'dot;
end architecture SimpleCapacitor;


Multi-domain Modeling and Simulation, Fig. 5 VHDL-AMS model of capacitor component

in a reliable way. Domain-specific software,
as, e.g., for mechanical systems, transforms
the underlying DAE into a form that can be
more reliably solved, using domain-specific
knowledge. This is performed by differentiating
certain equations of the DAE analytically and
utilizing special integration methods for the
resulting overdetermined set of differential-algebraic
equations. Multi-domain simulation
software uses the following approaches:

(a) The DAE (1) is directly solved numerically
using an implicit integration method, such
as a linear multistep method. Typically, all
VHDL-AMS simulators use this approach.

(b) The DAE (1) is symbolically transformed into
a form that is equivalent to a set of ODEs
(ordinary differential equations), and then
either explicit or implicit ODE or DAE
integration methods are used to numerically
solve the transformed system. The transformation
is based on the algorithms of Pantelides
(1988) and of Mattsson and Söderlind
(1993) and might require analytic differentiation
of equations. Typically, all Modelica-based
simulators, but also EcosimPro, use
this approach.

For many models both approaches can be applied
successfully. There are, however, systems where
approach (a) succeeds and approach (b) fails, or vice
versa.
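
Approach (a) can be illustrated with a deliberately small Python sketch: the implicit (backward) Euler method, the simplest implicit linear multistep method, applied directly to the residual form (1) of a series RC circuit. The circuit, its parameter values, and all function names are assumptions chosen for this sketch; production DAE integrators use variable step sizes, higher-order formulas, and sparse linear algebra.

```python
def residual(x_dot, x, w, t, R=1.0, C=1.0, V=1.0):
    """DAE 0 = f(x_dot, x, w, t) of a series RC circuit.

    x is the capacitor voltage (differentiated), w the current (algebraic).
    """
    return (C * x_dot - w,   # capacitor: C * du/dt = i
            x + R * w - V)   # Kirchhoff voltage law: u + R*i = V


def implicit_euler_step(x_old, w_guess, t_new, h, tol=1e-12, max_iter=20):
    """Solve 0 = f((x - x_old)/h, x, w, t_new) for (x, w) by Newton's method."""
    x, w = x_old, w_guess
    eps = 1e-7
    for _ in range(max_iter):
        f1, f2 = residual((x - x_old) / h, x, w, t_new)
        if abs(f1) + abs(f2) < tol:
            break
        # Finite-difference Jacobian of the 2x2 system in the unknowns (x, w).
        fx1, fx2 = residual((x + eps - x_old) / h, x + eps, w, t_new)
        fw1, fw2 = residual((x - x_old) / h, x, w + eps, t_new)
        a, b = (fx1 - f1) / eps, (fw1 - f1) / eps
        c, d = (fx2 - f2) / eps, (fw2 - f2) / eps
        det = a * d - b * c
        # Newton update via Cramer's rule.
        x -= (f1 * d - b * f2) / det
        w -= (a * f2 - f1 * c) / det
    return x, w


def simulate(h=0.01, steps=1000):
    """Integrate from a consistent start (u = 0, i = V/R) with fixed step h."""
    x, w, t = 0.0, 1.0, 0.0
    for _ in range(steps):
        t += h
        x, w = implicit_euler_step(x, w, t, h)
    return x, w
```

The algebraic variable is never eliminated by hand: differential and algebraic equations are solved together at every step, which is the essential feature of approach (a).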

DAEs (1) derived from modeling languages
usually have a large number of equations but with
only a few unknowns in every equation. In order
to solve DAEs of this kind efficiently, with either
(a) or (b), typically graph theory and/or sparse
matrix methods are utilized. For method (b) the
fundamental algorithms have been developed in
Elmqvist (1978) and later improved in further
publications. For a recent survey and comparison
of some of the algorithms, see Frenkel et al.
(2012).

Solving the DAE (1) means to solve an initial
value problem. In order that this can be
performed, a consistent set of initial variables
$\dot{x}_0 = \dot{x}(t_0)$, $x_0 = x(t_0)$, $w_0 = w(t_0)$, $y_0 = y(t_0)$,
$u_0 = u(t_0)$ has to be determined first at
the initial time $t_0$. In general, this is a nontrivial
task. For example, often (1) shall start in steady
state, that is, it is required that $\dot{x}_0 = 0$ and
therefore at the initial time (1) is required to
satisfy

$$0 = f(0, x_0, w_0, y_0, u_0, t_0) \qquad (2)$$

Equation (2) is a nonlinear algebraic system of
equations in the unknowns $x_0$, $w_0$, $y_0$, $u_0$. These
are $n_x + n_w + n_y$ equations for $n_x + n_w + n_y + n_u$
unknowns. Therefore, $n_u$ further conditions must
be provided (usually some elements of $u_0$ and/or
$y_0$ are fixed to desired physical values). Solving
(2) for the unknowns is also called “DC operating
point calculation” or “trimming.” Nonlinear
equation solvers are based on iterative methods
that usually require a sufficiently accurate initial
guess for all unknowns. In a large multi-domain
system model, this is not practical, and therefore,
methods are needed to solve (2) even if generic
guess values in a library are provided that might
be far from the solution of the system at hand.

For analog electronic circuit simulations, a
large body of theory, algorithms, and software is
available to solve (2) based on homotopy methods.
The basic idea is to solve a sequence of nonlinear
algebraic equation systems by starting with
an easy to solve simplified system, characterized
by the homotopy parameter $\lambda = 0$. This system
is continuously “deformed” until the desired one
is reached at $\lambda = 1$. The solution at iteration $i$
is used as guess value for iteration $i + 1$, and at
every iteration, the solution is usually computed
with a Newton-Raphson method.

The simplest such approach is “source stepping”:
the initial guess values of all electrical
components are set to “zero voltage” and/or “zero
current.” All (voltage and current) sources start
at zero, and their values are gradually increased
until the desired source values are reached. This
method may not converge, typically due to the
severe nonlinearities at switching thresholds in
logical circuits.
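
Source stepping can be sketched in Python for a single nonlinear equation: the operating point of a diode in series with a resistor and a voltage source. The circuit, the parameter values (R, saturation current, thermal voltage), and all function names are illustrative assumptions, not taken from any simulator.

```python
import math


def diode_residual(v, v_source, R=1000.0, i_sat=1e-12, v_t=0.025):
    """Current balance at the diode: resistor current minus diode current."""
    return (v_source - v) / R - i_sat * (math.exp(v / v_t) - 1.0)


def diode_residual_dv(v, R=1000.0, i_sat=1e-12, v_t=0.025):
    """Derivative of diode_residual with respect to the diode voltage v."""
    return -1.0 / R - i_sat / v_t * math.exp(v / v_t)


def newton(v, v_source, tol=1e-12, max_iter=50):
    """Newton-Raphson iteration for the diode voltage at a fixed source value."""
    for _ in range(max_iter):
        step = diode_residual(v, v_source) / diode_residual_dv(v)
        v -= step
        if abs(step) < tol:
            return v
    raise RuntimeError("Newton iteration did not converge")


def source_stepping(v_source_target, n_steps=10):
    """Ramp the source from 0 to its target; reuse each solution as next guess."""
    v = 0.0  # exact solution for a zero source (lambda = 0)
    for k in range(1, n_steps + 1):
        lam = k / n_steps  # homotopy parameter in (0, 1]
        v = newton(v, lam * v_source_target)
    return v
```

Each Newton solve starts from the solution of the previous, slightly easier problem, which is the essence of the homotopy idea described above. For this smooth single-diode example a direct Newton solve would also work; the benefit of the ramp shows up in larger circuits.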

There are several more involved approaches,
called “probability one homotopy” methods. For
these method classes, proofs exist that they converge
with probability one (so practically always).
These algorithms can only be applied for
certain classes of DAEs; see, for example, the
“Variable Stimulus Probability One Homotopy”
from Melville et al. (1993).

Although strong results exist for analog electrical
circuit simulators, it is difficult to generalize
them to the large class of multi-domain systems
covered by a modeling language. In Modelica
a “homotopy” operator was introduced into the
language (Sielemann et al. 2011) in order that a
library developer can formulate simple homotopy
methods like the “source stepping” in a component
library. A generalization of probability
one methods for multi-domain systems was developed
in the dissertation of Sielemann (2012)
and was successfully applied to air distribution
systems described as 1-dim. thermo-fluid pipe
flow.


Models for Control Systems 

Models for Analysis 

The multi-domain models from section “Modeling
Language Principles” can be utilized to
evaluate the properties of a control system by
simulation. Also control systems can be designed
by nonlinear optimization where at every
optimization step one or several simulations of a plant
model are executed. Furthermore, modeling
environments usually provide a means to linearize
the nonlinear DAE (1) of the underlying model
around an operating point:

$$x(t) \approx x_{\mathrm{op}} + \Delta x(t), \quad w(t) \approx w_{\mathrm{op}} + \Delta w(t),$$
$$y(t) \approx y_{\mathrm{op}} + \Delta y(t), \quad u(t) \approx u_{\mathrm{op}} + \Delta u(t) \qquad (3)$$

resulting in

$$\Delta \dot{x}_{\mathrm{red}} = A \, \Delta x_{\mathrm{red}} + B \, \Delta u$$
$$\Delta y = C \, \Delta x_{\mathrm{red}} + D \, \Delta u$$

where $\Delta x_{\mathrm{red}}$ is a vector consisting of elements of
the vector $\Delta x$, the vector $\Delta w$ is eliminated by
exploiting the algebraic constraints, and A, B, C,
D are constant matrices. Simulation tools provide
linear analysis and synthesis methods on this
linearized system and/or export it for usage in an
environment like Matlab, Maple, Mathematica,
or Python.
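
A minimal Python sketch of numerical linearization is given below. It assumes that the algebraic variables have already been eliminated, so that the model is available as an explicit state-space function; the pendulum plant, its damping coefficient, and all function names are hypothetical. Modeling tools typically obtain the matrices from the DAE itself, often with symbolic derivatives rather than finite differences.

```python
import math


def pendulum(x, u):
    """Hypothetical plant x_dot = f(x, u): damped pendulum with torque input."""
    angle, speed = x
    return [speed, -math.sin(angle) - 0.1 * speed + u[0]]


def output(x, u):
    """Hypothetical output equation y = g(x, u): the angle is measured."""
    return [x[0]]


def jacobian(func, x_op, u_op, wrt, eps=1e-6):
    """Central-difference Jacobian of func w.r.t. x (wrt=0) or u (wrt=1)."""
    args = [list(x_op), list(u_op)]
    n_rows = len(func(*args))
    n_cols = len(args[wrt])
    J = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        plus = [list(a) for a in args]
        minus = [list(a) for a in args]
        plus[wrt][j] += eps
        minus[wrt][j] -= eps
        f_plus, f_minus = func(*plus), func(*minus)
        for i in range(n_rows):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * eps)
    return J


def linearize(x_op, u_op):
    """Return the matrices A, B, C, D at the operating point (x_op, u_op)."""
    A = jacobian(pendulum, x_op, u_op, wrt=0)
    B = jacobian(pendulum, x_op, u_op, wrt=1)
    C = jacobian(output, x_op, u_op, wrt=0)
    D = jacobian(output, x_op, u_op, wrt=1)
    return A, B, C, D
```

At the hanging equilibrium (zero angle, zero speed, zero torque), the sketch returns a system matrix close to [[0, 1], [-1, -0.1]], as expected from differentiating the pendulum equations by hand.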


Multi-domain models might also be used 
directly in nonlinear Kalman filters, moving 
horizon estimators, or nonlinear model predictive 
control. For example, the company ABB is using 
moving horizon estimation and nonlinear model 
predictive control based on Modelica models 
to significantly improve the start-up process of 
power plants (Franke and Doppelhamer 2006). 

Inverse Models 

A large body of literature exists about the theory
of nonlinear control systems that are based on
inverse plant models; see, for example, Isidori
(1995). Methods such as feedback linearization,
nonlinear dynamic inversion, or flat systems use
an inverse plant model in the control loop. However,
a major obstacle is how to automatically
utilize an inverse plant model in a controller without
being forced to manually set up the equations
in the needed form, which is not practical for
larger systems. Modeling languages can solve
this problem as discussed below.

Nonlinear inverse models can be utilized in
various ways in a control system. The simplest
approach, as a feed forward controller, is shown in
Fig. 6.

Multi-domain Modeling and Simulation, Fig. 6
Controller with inverse plant model in the feed forward
path. The inverse plant model usually also needs
derivatives of $y_{\mathrm{ref}}$ as inputs. These derivatives are
provided by appropriate low-pass filters

Under the assumption that the models of the
plant and of the inverse plant are completely
identical and start at the same initial state,
the control error $e$ is zero by construction
and $y = T(s) \, y_{\mathrm{ref}}$, where $T$ is a diagonal matrix
with the transfer functions of the low-pass filters
on the diagonal (so $y \approx y_{\mathrm{ref}}$ for reference signals
that have a frequency spectrum below the cutoff
frequency of the low-pass filters). Since in practice
the assumption is usually not fulfilled, there will
be a nonzero control error $e$, and the feedback
controller has to cope with it. This controller
structure with a nonlinear inverse plant model
has the advantage that the feed forward part is
useful over the complete operating range of the
plant.

Various other structures with nonlinear plant
models are discussed in Looye et al. (2005),
such as compensation controllers, feedback
linearization controllers, and nonlinear disturbance
observers.

It turns out that nonlinear inverse plant 
models can be generated automatically with 
the techniques that have been developed for 
modeling languages; see section “Modeling 
Language Principles”. In particular, constructing 
an inverse model from (1) means that the inputs 
u are defined to be outputs, so they are no longer 
knowns but unknowns, and outputs y are defined 
to be inputs, so they are no longer unknowns 
but knowns. The resulting system is still a 
DAE and can therefore be handled as any other 
DAE. 

Therefore, defining an inverse model with a
modeling language just requires exchanging the
definition of input and output signals. In Modelica,
this can be graphically performed with the
nonstandard input-output block from Fig. 7.

This block has two inputs and two outputs and
is described by the equations

$$u_1 = u_2; \qquad y_1 = y_2$$

From a block diagram point of view this looks
strange. However, from a DAE point of view,
this just states constraints between two input and
two output signals. In Fig. 8, it is shown how this
block can be used to invert a simple second-order
system.

The output of the low-pass filter is connected 
to the output of the second-order system and 
therefore this model computes the input of the 
second-order system, from the input of the filter. 

A Modelica environment will generate from
this type of definition the inverse model, thereby
differentiating equations analytically and solving
for algebraic variables of the model in a different way
than for a simulation model. The whole transformation
is nontrivial, but it is just the standard method
used by Modelica tools for any other type of
DAE system.
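
The exchange of knowns and unknowns can be illustrated by hand for a linear second-order system; a Modelica tool would derive the inverse automatically, whereas this Python sketch writes it down manually. The plant parameters, the sinusoidal reference with analytic derivatives (standing in for the low-pass filter outputs), and all function names are assumptions made for the sketch.

```python
import math

OMEGA, ZETA = 2.0, 0.7  # hypothetical plant parameters


def plant_rhs(state, u):
    """Forward model: y'' + 2*zeta*omega*y' + omega^2*y = omega^2*u."""
    y, y_dot = state
    return (y_dot, OMEGA**2 * (u - y) - 2.0 * ZETA * OMEGA * y_dot)


def inverse_plant(y, y_dot, y_ddot):
    """Same equation with y (and its derivatives) known and u unknown."""
    return (y_ddot + 2.0 * ZETA * OMEGA * y_dot + OMEGA**2 * y) / OMEGA**2


def reference(t):
    """Smooth reference with analytic derivatives (stands in for the filter)."""
    return math.sin(t), math.cos(t), -math.sin(t)


def simulate_feedforward(t_end=5.0, h=0.01):
    """Drive the forward model with u from the inverse model, using RK4."""
    def f(t, s):
        return plant_rhs(s, inverse_plant(*reference(t)))

    steps = round(t_end / h)
    t, s = 0.0, (0.0, 1.0)  # start on the reference: y(0) = 0, y'(0) = 1
    for _ in range(steps):
        k1 = f(t, s)
        k2 = f(t + h / 2, tuple(a + h / 2 * b for a, b in zip(s, k1)))
        k3 = f(t + h / 2, tuple(a + h / 2 * b for a, b in zip(s, k2)))
        k4 = f(t + h, tuple(a + h * b for a, b in zip(s, k3)))
        s = tuple(a + h / 6 * (p + 2 * q + 2 * r + w)
                  for a, p, q, r, w in zip(s, k1, k2, k3, k4))
        t += h
    return t, s[0]
```

Because plant and inverse are identical and start in the same state, the simulated output follows the reference up to integration error, which mirrors the idealized argument made for the feed forward structure above.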

The question arises whether a solution of the 
inverse model exists, is unique, and whether the 
model is stable (otherwise, it cannot be applied 
in a control system). In general, a nonlinear 
inverse model consists of linear and/or nonlinear 
algebraic equation systems and of linear and/or 
nonlinear differential equations. Therefore, from 
a formal point of view, the same theorems as for 
a general DAE apply; see, for example, Brenan 
et al. (1996). Furthermore, all these equations 
need to be solved with a numerical method. 
For some classes of systems, it can be shown 
that mathematically a unique solution exists and 
that the system is stable. However, in general, 
one cannot expect that it is possible to provide 
such a proof for complex inverse plant models. 
Still, inverse plant models generated automatically
by a Modelica tool have been applied successfully,
e.g., to robots, satellites, aircraft, vehicles,
and thermo-fluid systems.

The Functional Mockup Interface 

Many different types of simulation environments
are in use. One cannot expect that a generic
approach as sketched in section “Modeling
Language Principles” will replace all these
environments with their rich set of domain-specific
knowledge, analysis, and synthesis
features. Practically all simulation environments
provide a vendor-specific interface in order
that a user can import components that are not
describable by the simulation environment itself.
Typically, this requires providing a component
as a set of C or Fortran functions with a particular
calling interface. In the control community,
the most widely used approach of this kind is
the S-Function interface from MathWorks,
where Simulink is used as integration platform,
and model components from other environments
are imported as S-Functions.

Multi-domain Modeling and Simulation, Fig. 7 Modelica InverseBlockConstraint block

Multi-domain Modeling and Simulation, Fig. 8 Inversion of a second-order system in Modelica

In 2010 the vendor-independent standard
“Functional Mockup Interface 1.0” was
published (FMI Group 2010). This is a low-level
standard for the exchange of models
between different simulation environments. The
standard allows exchanging either only the model
equations (called “FMI for Model Exchange”) or
the model equations with an embedded solver
(called “FMI for Co-Simulation”). The standard
was quickly adopted by many simulation
environments, and as of 2013 more than
40 tools support it (for a current list of
tools, see https://www.fmi-standard.org/tools).
In particular, nearly all Modelica environments
can export Modelica models in this format, and
therefore, Modelica multi-domain models can be
imported into other environments with low effort.

A software component which implements the
FMI is called Functional Mockup Unit (FMU).
An FMU consists of one zip-file with extension
“.fmu” containing all necessary components to
utilize the FMU either for Model Exchange, for
Co-Simulation, or for both. The following summary
is an adapted version from Blochwitz et al.
(2012):

1. An XML-file contains the definition of all
exposed variables of the FMU, as well as
other model information. It is then possible to
run the FMU on a target system without this
information, i.e., without unnecessary overhead.
Furthermore, this allows determining all
properties of an FMU from a text file, without
actually loading and running the FMU.

2. A set of C-functions is provided to execute
model equations for the Model Exchange case
and to simulate the equations for the
Co-Simulation case. These C-functions can be
provided either in binary form for different
platforms or in source code. The different
forms can be included in the same model zip-file.

3. Further data can be included in the FMU zip-file,
especially a model icon (bitmap file),
documentation files, maps and tables needed
by the model, and/or all object libraries or
DLLs that are utilized.
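
The first item can be sketched with the Python standard library: the variable definitions are read from the XML file inside the zip archive without loading any model code. The file name `modelDescription.xml` and the attributes `modelName`, `fmiVersion`, and `ScalarVariable/@name` follow the FMI standard; the helper names and the demo archive are hypothetical, and the demo XML is a stripped-down fragment that is not schema-complete.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_model_description(fmu):
    """Return (model name, FMI version, variable names) of an FMU.

    `fmu` is a path or a binary file object. Only the XML file is read;
    no model code is loaded or executed.
    """
    with zipfile.ZipFile(fmu) as archive:
        with archive.open("modelDescription.xml") as xml_file:
            root = ET.parse(xml_file).getroot()
    names = [v.get("name") for v in root.iter("ScalarVariable")]
    return root.get("modelName"), root.get("fmiVersion"), names


def make_demo_fmu():
    """Build a minimal, hypothetical FMU-like archive in memory for the demo."""
    xml = (
        '<fmiModelDescription fmiVersion="1.0" modelName="Capacitor">'
        "<ModelVariables>"
        '<ScalarVariable name="u" valueReference="0"/>'
        '<ScalarVariable name="p.i" valueReference="1"/>'
        "</ModelVariables>"
        "</fmiModelDescription>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("modelDescription.xml", xml)
    buffer.seek(0)
    return buffer
```

Passing the path of a real `.fmu` file to `read_model_description` should work the same way, since the function relies only on the zip layout and the XML attributes named above.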


Summary and Future Directions 

Multi-domain modeling based on a DAE description
and defined with a modeling language is an
established approach, and many tools support it.
This makes it convenient to define plant models
from many domains for the design and evaluation
of control systems. Furthermore, nonlinear
inverse plant models can be easily constructed
with the same methodology and can be utilized
in various ways in nonlinear control systems.

Current research focuses on the support of
the complete life cycle: defining requirements
of a system formally on a “high level,”
considerably improving testing by checking these
requirements automatically when evaluating a
system design by simulations, and providing
complete tool chains from nonlinear multi-domain
models to embedded systems. The
latter will allow convenient and fast target code
generation of nonlinear controllers, extended and
unscented Kalman filters, optimization-based
controllers, or moving horizon estimators.

Furthermore, the methodology itself is being
improved further. For example, in 2012, Modelica was
extended with language elements to define multirate
sampled data systems in a precise way, as
well as state machines.


Cross-References 

► Computer-Aided Control Systems Design: Introduction and Historical Overview

► Extended Kalman Filters

► Feedback Linearization of Nonlinear Systems

► Interactive Environments and Software Tools for CACSD

► Model Building for Control System Synthesis

Abstract

Dynamic processes, both continuous and 
batch, are characterised by autocorrelated 
measurements which are allied to the effects 
of process dynamics and disturbances. The 
common multivariate statistical process control 
(MSPC) approaches have been to use principal 
component analysis (PCA) or projection to latent 
structures (PLS) to build a model that captures the 
simultaneous correlations amongst the variables, 
but that ignores the serial correlation in the 
data during normal operations. Under such 
conditions it is difficult to perform efficient fault 
detection and diagnosis. An alternative approach 
to account for the process dynamics in MSPC is 
to use multiresolution analysis (MRA) by way 
of wavelet decomposition. Here, the individual
measurements are decomposed into different
scales (or frequencies) and the signals in each
decomposed scale are then used for MSPC, which
provides an indirect way of handling process
dynamics.

Keywords 

Multiresolution analysis; Partial least squares 
(PLS); Principal component analysis (PCA); 
Projection to latent structures (PLS); Wavelet 
transform 

Definition 

Multiscale principal component analysis 
(MSPCA) and its extension multiscale projection 
to latent structures (MSPLS) combine the 
abilities of these multivariate tools to de-correlate
the variables by extracting linear
relationships with that of wavelet analysis, to 
extract deterministic features and approximately 
de-correlate autocorrelated measurements. 
Multiscale modeling makes use of the wavelet 
transform which allows a signal (measurement) 
to be viewed in multiple resolutions with each 
resolution representing a different frequency. 
That is, wavelet transform allows complex 
information to be decomposed into basic 
components at different positions and scales. 
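
To make the idea of viewing a signal at several resolutions concrete, the following Python sketch implements the orthonormal Haar wavelet transform, the simplest wavelet. Using the Haar wavelet is an assumption of this sketch; the text does not prescribe a particular wavelet family, and the function names are hypothetical. The signal length is assumed to be divisible by 2 raised to the number of levels.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (a + b) for a, b in zip(signal[0::2], signal[1::2])]
    detail = [s * (a - b) for a, b in zip(signal[0::2], signal[1::2])]
    return approx, detail


def haar_decompose(signal, levels):
    """Return [detail_1, ..., detail_L, approx_L], finest scale first."""
    scales, approx = [], list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        scales.append(detail)
    scales.append(approx)
    return scales


def haar_reconstruct(scales):
    """Invert haar_decompose, starting from the coarsest approximation."""
    approx = list(scales[-1])
    s = 1.0 / math.sqrt(2.0)
    for detail in reversed(scales[:-1]):
        rebuilt = []
        for a, d in zip(approx, detail):
            rebuilt += [s * (a + d), s * (a - d)]
        approx = rebuilt
    return approx
```

Each entry of the returned list holds the coefficients of one scale, so a separate monitoring model could be fitted to every scale; the transform is orthonormal, so the signal energy is preserved and the original signal can be recovered exactly up to rounding.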

Motivation and Background 

One of the drawbacks of the conventional
PCA (or PLS)-based MSPC is that although
the PCA/PLS model captures the correlations
among the variables, it ignores the serial
(auto)correlation in the process variables and
measurements. One way to overcome this issue
is to include time-lagged variables in the PCA
or PLS model. In this way, PCA and PLS will
explicitly model both the correlations among
the variables and the serial correlations in the
individual variables. The impact is an increase in
the number of principal components required, but
the multivariate monitoring model will be able
to detect any changes in the serial correlation of
the variables as well as changes in relationships
among the variables. This article focuses on
multiscale-multiway PCA using batchwise
data unfolding. However, the methodology
can equally be applied to PLS-based process
performance monitoring (MSPC).

In multivariate statistical process control
(MSPC), the multivariate statistical techniques
of principal component analysis (PCA) and
projection to latent structures, or partial least
squares (PLS), together with monitoring metrics
based on Hotelling’s $T^2$ (directly related to
the Mahalanobis distance, which monitors the
fit of new observations to the model space)
and the squared prediction error (SPE) or Q
statistic (which monitors the residual space, i.e.,
model mismatch) are used to simultaneously monitor
the process variables (Kourti and MacGregor
1996; Qin 2003). A recent survey provides an
excellent state-of-the-art review of the methods 
and applications of data-driven fault detection 
and diagnosis that have been developed over the 
last two decades (Qin 2012). 
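As a minimal sketch of the two monitoring metrics, the Python fragment below fits a PCA model to autoscaled data and evaluates Hotelling's $T^2$ and the Q (SPE) statistic for new observations. The function names `fit_pca` and `t2_and_q` and the dictionary layout are illustrative assumptions; control limits are not computed here.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit a PCA model to autoscaled data X (samples x variables)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    # Rows of Vt are the loading vectors; S**2 / (n - 1) are the score variances
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T
    eigvals = S[:n_components] ** 2 / (X.shape[0] - 1)
    return {"mean": mean, "std": std, "P": P, "eigvals": eigvals}

def t2_and_q(model, X_new):
    """Return Hotelling's T^2 and the Q (SPE) statistic for each row of X_new."""
    Z = (np.atleast_2d(X_new) - model["mean"]) / model["std"]
    scores = Z @ model["P"]
    # T^2: squared scores weighted by the inverse score variances (model space)
    t2 = np.sum(scores ** 2 / model["eigvals"], axis=1)
    # Q: squared norm of the part of Z not explained by the model (residual space)
    residual = Z - scores @ model["P"].T
    q = np.sum(residual ** 2, axis=1)
    return t2, q
```

When all components are retained the residual vanishes, so Q is zero and only $T^2$ carries information.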

Process measurements typically exhibit multiscale
behavior as a consequence of representing
the cumulative effect of a number of underlying
process phenomena including process dynamics,
measurement noise, and disturbances. Methodologies
are therefore required to address
(i) the multiscale nature of process data and
(ii) the inability of some existing algorithms to
handle autocorrelation. One approach is through
the use of multiresolution analysis and wavelets
(Mallat 1998). Informative discussions and
application studies related to using multiresolution
analysis and wavelet decompositions to enhance
PCA-based process monitoring and fault detection
have been presented, for example, by Bakshi
(1998), Misra et al. (2002), Aradhye et al. (2003), 
Lu et al. (2003), Yoon and MacGregor (2004) and 
Reis and Saraiva (2006). Yoon and MacGregor 
in their comprehensive MSPCA study discussed 
their approach in the context of other multiscale 
approaches and illustrated the methodology using 
simulated data from a continuous stirred-tank 
reactor system. A major contribution of the paper 
was to extend fault isolation methods based on 
contribution plots to multiscale PCA approaches. 
Although some 9 years old, Ganesan et al. (2004)
provided a review of wavelet-based multiscale
statistical process monitoring.

The Approach 

Multiresolution analysis (MRA) provides the
theoretical basis for the derivation of a
computationally efficient algorithm for the wavelet
transform (Mallat 1998). MRA allows the
dynamic aspects of the data to be taken into
account in MSPC. The individual signals are
decomposed into different scales (frequencies), 
and data in each decomposed scale are then used 
for MSPC which provides an indirect approach 
to handling process dynamics. Multiscale MSPC 
(MSPCA) enables the simultaneous extraction 
of process correlations across data as well as 
accounting for autocorrelation within sensor 
data. In this way, it captures correlations among 
the process variables made by various events 
occurring at different scales. 

MSPCA calculates the principal components
of wavelet coefficients at each scale and combines
these at the relevant wavelet scales. Due
to its multiscale nature, MSPCA is very useful
for the modeling of data containing contributions
from events whose behavior changes over
both time and frequency. Process monitoring by
MSPCA, and process prediction by MSPLS, involves
combining those scales where significant
events are detected. Approximate de-correlation
of wavelet coefficients also makes MSPCA effective
for the monitoring of autocorrelated measurements.

The Algorithm 

Wavelets are a family of basis functions that 
provide a mapping from the time domain to the 
time-frequency domain. They can be used to 
decompose the signal into different resolutions by 
projecting onto the corresponding wavelet basis 
functions using multiresolution analysis (MRA). 
A wavelet set is constructed from a fundamental
basis function, the mother wavelet, by a
process of translation and dilation. Wavelet
analysis provides
methodologies for the extraction of the time and
frequency content of a signal. Conventional
frequency analysis based on the Fourier transform
consists of decomposing a signal into sine waves
of different frequencies. Wavelet analysis decomposes
the original signal in a similar manner. The
major difference is that while Fourier analysis
uses sine waves of infinite length, multiresolution
analysis uses waveforms of finite length. The
finite length of the wavelets allows them to describe
local events in both the time and frequency
domain.

The wavelet transform, an extension to the 
Fourier transform, projects the original signal 
down onto wavelet basis functions, providing a 
mapping from the time domain to the timescale 
plane. The wavelet functions, which are localized 
in the time and frequency domain, are obtained 
from a single prototype wavelet, the mother 
wavelet, by dilation and translation. The wavelet
set is defined as

$$\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{t-b}{a}\right)$$

where $\psi$ is the mother wavelet function, $a$ the
dilation parameter, and $b$ the translation parameter,
and the factor $1/\sqrt{|a|}$ is used to ensure that
each wavelet function has the same energy as the
mother wavelet. The discrete wavelet transform
with dyadic dilation and translation is used in this
overview. A definition of continuous and discrete
wavelet transforms can be found in Daubechies
(1992). In the discrete case, the dilation and
translation parameters are discretized as $a = a_0^m$
and $b = k b_0 a_0^m$. If $a_0 = 2$ and $b_0 = 1$, a dyadic
dilation and translation is carried out; however,
$a_0$ and $b_0$ are not restricted to these values. The
discrete wavelet form, which is widely used in
process monitoring and chemical signal analysis,
is

$$\psi_{mk}(t) = a_0^{-m/2}\,\psi\!\left(a_0^{-m} t - k b_0\right)$$
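The role of the normalizing factor can be checked numerically. The sketch below assumes the standard definition $\psi_{a,b}(t) = |a|^{-1/2}\,\psi((t-b)/a)$ and uses the Haar function as the mother wavelet purely for convenience; the helper names `haar_mother`, `wavelet`, and `energy` are hypothetical.

```python
import numpy as np

def haar_mother(t):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    t = np.asarray(t, dtype=float)
    up = np.where((t >= 0.0) & (t < 0.5), 1.0, 0.0)
    down = np.where((t >= 0.5) & (t < 1.0), 1.0, 0.0)
    return up - down

def wavelet(psi, a, b):
    """Return the dilated and translated wavelet psi_{a,b} as a function of t."""
    if a == 0:
        raise ValueError("dilation parameter a must be nonzero")
    return lambda t: psi((np.asarray(t, dtype=float) - b) / a) / np.sqrt(abs(a))

def energy(f, lo, hi, n=400_000):
    """Approximate the integral of f(t)**2 over [lo, hi] with a midpoint rule."""
    dt = (hi - lo) / n
    t = lo + (np.arange(n) + 0.5) * dt
    return float(np.sum(f(t) ** 2) * dt)
```

For the dyadic case ($a_0 = 2$, $b_0 = 1$), `wavelet(haar_mother, 2**m, k * 2**m)` gives the member of the family at scale index `m` and location index `k`, and its energy stays equal to that of the mother wavelet.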

A recursive algorithm for wavelet decomposition
and the reconstruction of a discrete signal
of dyadic length is often used (Mallat 1998)
and is known as the pyramid algorithm. The
fast discrete wavelet decomposition consists of
three components: low-pass filters L(n), high-pass
filters H(n), and dyadic decimation. By
passing the input signal through this pair of 
filters, the projection of the original signal 
onto the scaling and wavelet functions for the 
multiresolution analysis is performed. Dyadic 
decimation, or down-sampling, removes every 
odd member of a sequence, thus halving the 
original number of samples. The low-pass filter 
resembles a moving average, while the high-pass 
filter extracts the detailed information contained 
in the signal. The discrete wavelet transform 
operates by taking a sequence of values, applying 
L(n) and H(n), and then repeating this same
procedure to the approximation coefficients. In 
this way, the original signal vector is smoothed 
and halved through L, and the vector of 
approximation coefficients is again smoothed 
and halved through L. Successive application of 
the low-pass filter results in the approximation 
coefficients, becoming an increasingly smooth 
version of the original signal. At the same time 
as smoothing the signal, each iteration extracts 
the high frequency information in the data. 
The repeated application of L, followed by H,
is, in effect, a band-pass filter. The result of
applying the high-/low-pass filters to a signal is
a set of coefficients describing the details of
the signal, $D_j$, and a second set describing the
approximations of the signal, $A_j$. The original
signal $x$ can then be represented by

$$x(t) = \sum_{j=1}^{L} D_j(t) + A_L(t)$$

where $D_j$ and $A_j$ are referred to as the $j$th level
wavelet details and approximation, respectively.
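The pyramid algorithm and the additive representation of the signal can be illustrated with the simplest wavelet. The sketch below uses the Haar filters (an assumption made here for brevity; the article itself discusses the Daubechies family) and hypothetical function names. It returns the time-domain details $D_1, \dots, D_L$ and the approximation $A_L$, whose sum recovers the original signal.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_step(a):
    """One pyramid level: low-/high-pass filtering plus dyadic decimation."""
    even, odd = a[0::2], a[1::2]
    approx = (even + odd) / SQRT2   # low-pass: scaled pairwise average
    detail = (even - odd) / SQRT2   # high-pass: scaled pairwise difference
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert one pyramid level (up-sampling and synthesis filtering)."""
    a = np.empty(2 * approx.size)
    a[0::2] = (approx + detail) / SQRT2
    a[1::2] = (approx - detail) / SQRT2
    return a

def haar_decompose(x, levels):
    """Return (a_L, [d_1, ..., d_L]) coefficient vectors for a dyadic-length x."""
    x = np.asarray(x, dtype=float)
    if levels < 1 or x.size % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    details, a = [], x
    for _ in range(levels):
        a, d = haar_step(a)      # repeat on the approximation coefficients
        details.append(d)
    return a, details

def to_time_domain(coeff, level, is_detail):
    """Reconstruct one set of coefficients with all other coefficients zeroed."""
    zeros = np.zeros_like(coeff)
    a = haar_inverse_step(zeros, coeff) if is_detail else haar_inverse_step(coeff, zeros)
    for _ in range(level - 1):
        a = haar_inverse_step(a, np.zeros_like(a))
    return a

def haar_mra(x, levels):
    """Return ([D_1(t), ..., D_L(t)], A_L(t)) with x(t) = sum_j D_j(t) + A_L(t)."""
    a_L, details = haar_decompose(x, levels)
    D = [to_time_domain(d, j + 1, True) for j, d in enumerate(details)]
    A_L = to_time_domain(a_L, levels, False)
    return D, A_L
```

For the Haar case the approximation $A_L$ is simply the mean of the signal over blocks of $2^L$ samples, which makes the "increasingly smooth version" interpretation easy to verify by hand.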

Figure 1 shows schematically the
multiresolution-based wavelet decomposition.

Multiscale Multivariate Statistical Process
Control, Fig. 1 Schematic of multiresolution
wavelet decomposition

One of the most popular choices of wavelets
is the Daubechies family. These
wavelets are compactly supported in the time
domain and have good frequency domain
decay. Moreover, Daubechies wavelets (DaubN)
possess differing degrees of smoothness,
determined by the number of vanishing moments N.
This makes it possible to match the wavelet
smoothness to the smoothness of the signals to
be analyzed. The signal can then be decomposed
into its contributions from multiple scales
as a weighted sum of dyadically discretized
orthonormal basis functions:

$$x(t) = \sum_{m=1}^{L}\sum_{k=1}^{N} d_{mk}\,\psi_{mk}(t) + \sum_{k=1}^{N} a_{Lk}\,\varphi_{Lk}(t)$$

where $x(t)$ represents the process measurements,
$d_{mk}$ represents the wavelet or detail signal
coefficient at scale $m$ and location $k$, and $a_{Lk}$
represents the scaled signal or scaling function
coefficient of $\varphi(t)$ at the coarsest scale $L$ and
location $k$. The scaling function, or father
wavelet, $\varphi_{mk}$ captures the low-frequency content
of the original signal that is not captured by
wavelets at the corresponding or finer scales.

The wavelet transformation is applied to
decompose a multivariate signal X into its
approximation, $A_1$ to $A_L$, and detail, $D_1$ to $D_L$,
coefficients for the first to $L$th level, respectively.
For more information, see Bakshi (1998), Misra
et al. (2002) and Aradhye et al. (2003). Figure 2
shows a schematic representation of a typical
MSPCA multivariate statistical process control
scheme.
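A heavily simplified sketch of the first MSPCA stage is given below: each variable is decomposed with a wavelet transform and a separate PCA model is fitted to the coefficient matrix at every scale. The Haar wavelet, the mean-centering of each coefficient block, and the names `haar_coefficients` and `pca_per_scale` are assumptions made for illustration; the selection of significant scales and the final reconstruction step of the published schemes are not shown.

```python
import numpy as np

def haar_coefficients(X, levels):
    """Column-wise Haar DWT of X (samples x variables).

    Returns the coefficient matrices [D_1, ..., D_L, A_L].
    """
    A = np.asarray(X, dtype=float)
    if levels < 1 or A.shape[0] % (2 ** levels) != 0:
        raise ValueError("number of samples must be divisible by 2**levels")
    blocks = []
    for _ in range(levels):
        even, odd = A[0::2], A[1::2]
        blocks.append((even - odd) / np.sqrt(2.0))   # detail coefficients
        A = (even + odd) / np.sqrt(2.0)              # approximation coefficients
    blocks.append(A)
    return blocks

def pca_per_scale(X, levels, n_components):
    """Fit one PCA model to the wavelet coefficients at each scale."""
    models = []
    for C in haar_coefficients(X, levels):
        Cc = C - C.mean(axis=0)
        _, S, Vt = np.linalg.svd(Cc, full_matrices=False)
        eigvals = S ** 2 / max(C.shape[0] - 1, 1)
        models.append({"loadings": Vt[:n_components].T,
                       "eigvals": eigvals[:n_components]})
    return models
```

Because every variable is transformed with the same orthonormal wavelet, the relationships among the variables are preserved within each scale, which is what allows a PCA model per scale to be meaningful.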

An example of the application of multiway-multiscale
MPCA to a benchmark fed-batch
fermentation process (Birol et al.
http://www.chee.iit.edu-control/software.html) was
presented by Alawi and Morris (2007). The
application used a combination of multiblock
statistical modeling approaches together with
multiscale-multiway batch monitoring. Figure 3
shows the multiscale-multiway monitoring
scheme for process monitoring and fault
detection. At every time point, the batch process
variables are decomposed into scales in the
wavelet domain and then reconstructed back
to the time domain. The scales/details and the
approximations are collected into separate matrices
(blocks). Multiblock PCA is then applied
to the wavelet details and approximation. Fault
detection based on the $T^2_s$ and $Q_s$ statistics was
used along with contribution plots incorporating
confidence bounds to enhance fault diagnosis.

Figure 4 compares the monitoring statistics
for multiscale-multiway PCA and conventional
multiway PCA for a slowly drifting sensor
fault, showing the potential of multiscale
MPCA (MSPCA) to detect subtle process and
sensor faults faster than conventional
multiway MSPC. It is noted that sensor drift is
confined to one scale band at low frequency. It has
been observed that multiscale approaches appear
to provide little improvement if a fault effect is
spread over more than one frequency band or
the fault effect occurs mainly in a scale with
dominant variance. Thus, the monitoring method
that gives the best detection and identification
of faults will depend on the fault characteristics,
with multiscale approaches providing an advantage
when the faults are localized in frequency or
appear in scales that normally have small
variance.

Multiscale Multivariate Statistical Process Control, Fig. 2 Schematic representation of multiscale PCA-based
MSPC

Multiscale Multivariate Statistical Process Control, Fig. 3 Multiscale-multiway batch process monitoring scheme

Other Applications of Multiscale 
MPCA 

A number of extensions have been proposed. For
example, multiscale PLS approaches have been
developed, e.g., Teppola and Minkkinen (2000)
and Lee et al. (2009). Nonlinear approaches have
also been explored. For example, Lee et al. (2004)
proposed a batch monitoring approach using
multiway kernel principal component analysis,
Shao et al. (1999) proposed a wavelet-based
nonlinear PCA algorithm, Choi et al. (2008)
described a study of a kernel-based MSPCA
algorithm for nonlinear multiscale monitoring,
and most recently Zhang and Ma (2011) compared
fault diagnosis of nonlinear processes using
multiscale KPCA and multiscale KPLS. Wavelet
multiscale approaches have also been widely 
discussed in spectroscopic data processing (Shao 
et al. 2004). 
Abstract

Multi-vehicle routing problems in systems and
control theory are concerned with the design of
control policies to coordinate several vehicles
moving in a metric space, in order to complete
spatially localized, exogenously generated tasks
in an efficient way. Control policies depend on
several factors, including the definition of the
tasks, of the task generation process, of the vehicle
dynamics and constraints, of the information
available to the vehicles, and of the performance
objective. Ensuring the stability of the system,
i.e., the uniform boundedness of the number of
outstanding tasks, is a primary concern. Typical
performance objectives are represented by measures
of quality of service, such as the average
or worst-case time a task spends in the system
before being completed or the percentage of tasks
that are completed before certain deadlines. The
scalability of the control policies to large groups
of vehicles often drives the choice of the information
structure, requiring distributed computation.

Keywords 

Cooperative control; Decentralized control; Dynamic
routing; Networked robots; Task allocation

Introduction 

Multi-vehicle routing problems in systems and 
control theory are concerned with the design of 
control policies to coordinate several vehicles 
moving in a metric space, in order to complete 
spatially localized, exogenously generated tasks 
in an efficient way. Key features of the problem
are that tasks arrive sequentially over time
and planning algorithms should provide control
policies (in contrast to preplanned routes) that
prescribe how the routes should be updated as a
function of those inputs that change in real time.
This problem is usually referred to as dynamic
vehicle routing (DVR). In DVR problems, ensuring
the stability of the system, i.e., the uniform
boundedness of the number of outstanding tasks,
is a primary concern.

Motivation and Background 

As a motivating example, consider the following
scenario: a team of unmanned aerial vehicles
(UAVs) is responsible for investigating possible
threats over a region of interest. As each possible
threat is detected, by intelligence, high-altitude
or orbiting platforms, or by ground sensor networks,
one of the UAVs must visit its location
and investigate the cause of the alarm, in order
to enable an appropriate response if necessary.
Performing this task may require the UAV not
only to fly to the possible threat’s location but
also to spend additional time on site. The objective
is to minimize the average time between the
appearance of a possible threat and the time one
of the UAVs completes the close-range inspection
task. Variations may include priority levels, time
windows during which the inspection task must
be completed, and sensors with limited range.

In order to perform the required mission, the
UAVs (or, more generally, mission control)
need to repeatedly solve three coupled
decision-making problems:

1. Task allocation: which UAV shall pursue 
each task? What policy is used to assign tasks 
to UAVs? How often should the assignment 
be revised? 

2. Service scheduling: given the list of tasks to 
be pursued, what is the most efficient ordering 
of these tasks? 

3. Loitering paths: what should UAVs without 
pending assignments do? 

The optimization process must take into account,
for example, algebraic or differential constraints
(such as obstacle avoidance or bounded curvature,
respectively), sensing constraints, communication
constraints, and energy constraints.
Furthermore, one might require a decentralized
control architecture.

DVR problems, including the above UAV 
routing problem, are generally intractable due 
to their multifaceted combinatorial, differential, 
and stochastic nature, and consequently solution 
approaches have been devised that look either 
at heuristic algorithms or at approximation 
algorithms with some guarantee on their 
performance. 

Related Problems 

DVR problems represent the dynamic counterpart
of the well-known static vehicle routing
problem (VRP), whereby (i) a team of $n$ vehicles
is required to service a set of $n_T$ “static” tasks in
a metric space, (ii) each task requires a certain
amount of on-site service, and (iii) the goal is to
compute a set of routes that minimizes the cost
of servicing the tasks; see Toth and Vigo (2001)
for a thorough introduction to this problem. The
VRP is static in the sense that vehicle routes 
are computed assuming that no new tasks arrive. 
The VRP is an important research topic in the 
operations research community. 

Approaches for Multi-vehicle Routing 

Broadly speaking, there are three main 
approaches available in the literature to tackle 
dynamic vehicle routing problems. The first 
approach relies on heuristic algorithms. In the 
second approach, called “competitive analysis 
approach,” routing policies are designed to 
minimize the worst-case ratio between their 
performance and the performance of an optimal 
off-line algorithm which has a priori knowledge 
of the entire input sequence. In the third 
approach, the routing problem is embedded 
within the framework of queueing theory. 
Routing policies are then designed to stabilize 
the system in terms of uniform boundedness of 
the number of outstanding tasks and to minimize 
typical queueing-theoretical cost functions such 
as the expected time the tasks remain in the queue. 
Since the generation of tasks and the motion of the
vehicles take place within a Euclidean space, one can
refer to this third approach as “spatial queueing
theory.”

Heuristic Approach 

The main aspect of the heuristic approach is that
routing algorithms are evaluated primarily via
numerical, statistical and experimental studies, and
formal performance guarantees are not available.
A naive, yet reasonable approach to design a
heuristic algorithm for DVR would be to adapt
classic queueing policies. However, perhaps
surprisingly, this adaptation is not at all
straightforward. For example, routing algorithms based on a
first-come-first-served policy, whereby tasks are
fulfilled in the order in which they arrive, are
unable to stabilize the system for all stabilizable task
arrival rates, in the sense that with such routing
algorithms the average number of tasks grows over
time without bound, even though there exist
alternative routing algorithms that would maintain the
number of tasks uniformly bounded (Bertsimas
and van Ryzin 1991).
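The instability of first-come-first-served routing can be observed in a small simulation. The sketch below makes several simplifying assumptions that are not part of the source: a single vehicle with unit speed in the unit square, zero on-site service time, a vehicle that waits in place when idle, and a greedy nearest-neighbor rule as the alternative policy. The function name `simulate` and its arguments are hypothetical.

```python
import math
import random

def simulate(policy, arrival_rate, n_tasks, speed=1.0, seed=0):
    """Mean time between arrival and completion for one vehicle in the unit square.

    policy is "fcfs" (serve in order of arrival) or "nearest" (serve the
    pending task closest to the current vehicle position).
    """
    rng = random.Random(seed)
    # Spatio-temporal Poisson process: exponential gaps, uniform locations
    t, tasks = 0.0, []
    for _ in range(n_tasks):
        t += rng.expovariate(arrival_rate)
        tasks.append((t, rng.random(), rng.random()))

    clock, pos = 0.0, (0.5, 0.5)
    pending, next_arrival, served, total_delay = [], 0, 0, 0.0
    while served < n_tasks:
        # Admit every task that has arrived by the current time
        while next_arrival < n_tasks and tasks[next_arrival][0] <= clock:
            pending.append(tasks[next_arrival])
            next_arrival += 1
        if not pending:
            clock = tasks[next_arrival][0]   # idle until the next arrival
            continue
        if policy == "fcfs":
            idx = 0
        elif policy == "nearest":
            idx = min(range(len(pending)),
                      key=lambda i: math.dist(pos, pending[i][1:]))
        else:
            raise ValueError("unknown policy")
        arrival, x, y = pending.pop(idx)
        clock += math.dist(pos, (x, y)) / speed
        pos = (x, y)
        total_delay += clock - arrival
        served += 1
    return total_delay / n_tasks
```

With an arrival rate of 3 tasks per unit time, the mean distance between two uniformly drawn points in the unit square (roughly 0.52) exceeds the mean inter-arrival time, so the first-come-first-served backlog keeps growing with the horizon, while the nearest-neighbor rule benefits from shorter trips as the number of pending tasks increases.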

The most widely applied approach is 
to combine static routing methods (e.g., 
VRP-like methods, nearest neighbor strategies, 
or genetic algorithms) and sequential
re-optimization, where the re-optimization horizon
is chosen heuristically. In particular, greedy 
nearest neighbor strategies, whose formal 
characterization still represents an open problem, 
are known to perform particularly well in some 
notable cases (Bertsimas and van Ryzin 1991). 
However, the joint selection of a static routing 
method and of the re-optimization horizon in 
the presence of vehicle and task constraints 
(e.g., differential motion constraints, or task 
priorities) makes the application of this approach 
far from trivial. For example, one can show that 
an erroneous selection of the re-optimization 
horizon can lead to pathological scenarios where 
no task ever receives service (Pavone 2010). 
Additionally, performance criteria in dynamic 
settings commonly differ from those of the 
corresponding static problems. For example, in 
a dynamic setting, the time needed to complete 
a task may be a more important factor than the 
total vehicle travel cost. 


Competitive Analysis Approach 

The distinctive feature of the competitive analysis
approach is the method used to evaluate an
algorithm’s performance, which is called competitive
analysis. In competitive analysis, the performance
of a (causal) algorithm is compared
to the performance of a corresponding off-line
algorithm (i.e., a non-causal algorithm that has
a priori knowledge of the entire input) in the
worst-case scenario. Specifically, an algorithm is
$c$-competitive if its cost on any problem instance
is at most $c$ times the cost of an optimal off-line
algorithm:

$$\mathrm{Cost}_{\text{causal}}(I) \le c \, \mathrm{Cost}_{\text{optimal off-line}}(I),$$
for all problem instances $I$.
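The definition can be made concrete with a small helper that evaluates the largest cost ratio over a finite set of test instances. This is only an empirical check, under the assumption that the costs of the causal and optimal off-line algorithms are already available for each instance; the name `empirical_competitive_ratio` is hypothetical. The value returned is a lower bound on any $c$ for which the algorithm is $c$-competitive, since the true ratio is a supremum over all instances.

```python
def empirical_competitive_ratio(causal_costs, offline_costs):
    """Largest ratio cost_causal / cost_offline over the given instances."""
    if len(causal_costs) != len(offline_costs) or not causal_costs:
        raise ValueError("need two non-empty cost lists of equal length")
    if any(c <= 0 for c in offline_costs):
        raise ValueError("off-line costs must be positive")
    return max(on / off for on, off in zip(causal_costs, offline_costs))
```

For instance, costs of 3, 10, and 4 for the causal algorithm against 2, 5, and 4 for the off-line optimum give ratios of 1.5, 2, and 1, so no competitive ratio smaller than 2 is consistent with these instances.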

In the recent past, several dynamic vehicle 
routing problems have been successfully studied 
in this framework, under the name of the online 
traveling repairman problem (Jaillet and Wagner 
2006), and many interesting insights have been 
obtained. However, the competitive analysis 
approach has some potential disadvantages. First, 
competitive analysis is a worst-case analysis; 
hence, the results are often overly pessimistic 
for normal problem instances, and potential 
statistical information about the problem (e.g., 
knowledge of the spatial distribution of future 
tasks) is often neglected. Second, the worst- 
case analysis usually requires a finite horizon 
problem formulation, which precludes the 
study of useful properties such as stability. 
Third, competitive analysis is used to bound 
the performance relative to an optimal off-line 
algorithm, which, by being non-causal, does not 
belong to the feasible set of routing algorithms 
one is optimizing over. Hence, with this approach 
one minimizes the “cost of causality” in 
the worst-case scenario, but not necessarily 
the worst-case cost (which would require 
comparison with an optimal causal routing 
algorithm). Finally, many important real-world 
constraints for DVR, such as time windows, 
priorities, differential constraints on vehicle
motion, and the requirement of teams to fulfill
a task, have so far proved to be too complex 
to be considered in the competitive analysis 
framework (Golden et al. 2008, page 206). Some 
of these drawbacks have been recently addressed 
by Van Hentenryck et al. (2009) where a 
combined stochastic and competitive analysis 
approach is proposed for a general class of 
combinatorial optimization problems and is 
analyzed under some technical assumptions. 

Spatial Queueing Theory 

Spatial queueing theory embeds the dynamic vehicle
routing problem within the framework of
queueing theory. Spatial queueing theory consists
of three main steps, namely, development of a
spatial queueing model, establishment of fundamental
limitations of performance, and design of
algorithms with performance guarantees. More
specifically, the formulation of a model entails
detailing four main aspects:

1. A model for the dynamic component of the
environment: this is usually achieved by assuming
that new events are generated (either
adversarially or stochastically) by an exogenous
process.

2. A model for targets/tasks: tasks are usually
modeled as points in a physical environment
distributed according to some (possibly unknown)
distribution, might require a certain
level of on-site service time, and can be subject
to a variety of constraints, e.g., time windows,
priorities, etc.

3. A model for the vehicles and their motion:
besides their number, one needs to specify
whether the vehicles are subject to algebraic
(e.g., obstacles) or differential (e.g.,
minimum turning radius) constraints, sensing
constraints, and fuel constraints. Also, the
control could be centralized (i.e., coordinated
by a central station) or decentralized and
subject to communication constraints.

4. Performance criterion: examples include the
minimization of the waiting time before service,
loss probabilities, expectation-variance
analysis, etc.

Once the model is formulated, one seeks 
to characterize fundamental limitations of 
performance (in the form of lower bounds for 
the best achievable cost); the purpose of this step 
is essentially twofold: it allows the quantification 
of the degree of optimality of a routing algorithm 
and provides structural insights into the problem. 
As for the last step, the design of a routing 
algorithm usually relies on a careful combination 
of static routing methods with sequential
re-optimization. Desirable properties for the static
methods are the following: (i) the static problem 
can be solved (at least approximately) in 
polynomial time and (ii) the static method is 
amenable to a statistical characterization (this is 
essential for the computation of performance 
bounds). Formal performance guarantees on 
a routing algorithm are then obtained by 
quantifying the ratio between an upper bound on
the cost delivered by that algorithm and a lower 
bound for the best achievable cost. Such a ratio, 
being an estimate of the degree of optimality of 
the algorithm, should be close to one and possibly 
independent of system parameters. The proposed 
algorithms are finally evaluated via numerical, 
statistical and experimental studies, including 
Monte-Carlo comparisons with alternative 
approaches. 

An interesting feature of this approach is that 
the performance analysis usually yields scaling 
laws for the quality of service in terms of model 
data, which can be used as useful guidelines 
to select system parameters when feasible (e.g., 
number of vehicles). 

In order to make the model tractable, the arrival
process of tasks is assumed stationary (with
possibly unknown parameters) with statistically
independent arrival times. These assumptions,
however, can be unrealistic in some scenarios, in
which case the competitive analysis approach
may represent a better alternative. From a
technical standpoint, one should note that spatial
queueing models are inherently different from
traditional, nonspatial queueing models. The
main reason is that in spatial queueing models,
the “service time” per task has both a travel
and an on-site component. Although the on-site
service requirements can often be modeled
as statistically independent, the travel times
are inherently statistically coupled. Hence, in
contrast to standard queueing models, service
times in spatial queueing models are statistically
dependent, and this deeply affects the solution to
the problem.

Pioneering work in this context is that of Bertsimas
and van Ryzin (1991), who introduced
queueing methods to solve the baseline DVR
problem (a vehicle moves along straight lines and
visits tasks whose time of arrival, location, and
on-site service are stochastic; information about
task location is communicated to the vehicle upon
task arrival). The next section provides an overview
of the application of spatial queueing theory to
this simplified DVR problem, referred to in
the literature as the dynamic traveling repairman
problem (DTRP).


Applying Spatial Queueing Theory 
to DVR Problems 

Spatial Queueing Theory Workflow for 
DTRP 

Model 

The DTRP, which, incidentally, captures well the
salient features of the UAV scenario outlined
in the Motivation and Background section, can be
modeled as follows. In a geographical region $Q$ of area $A$,
a dynamic process generates spatially localized
tasks. The process generating tasks is modeled
as a spatio-temporal Poisson process, i.e., (i) the
time between consecutive generation instants has
an exponential distribution with intensity
$\lambda > 0$ and (ii) upon arrival, the locations of tasks
are independently and uniformly distributed in
$Q$. The location of the new tasks is assumed
to be immediately available to a team of $n$ servicing
vehicles. The vehicles provide service in
$Q$, traveling at a speed at most equal to $v$; the
vehicles are assumed to have unlimited fuel and
task-servicing capabilities. Each task requires an
independent and identically distributed amount of
on-site service with finite mean duration $s > 0$.
A task is completed when one of the vehicles
moves to its location and performs its on-site
service. The objective is to design a routing policy
that maximizes the quality of service delivered
by the vehicles, i.e., minimizes the average
steady-state time delay $T$ between the generation of a
task and the time it is completed (in general, in
a dynamic setting, the focus is on the quality
of service as perceived by the “end user,” rather
than, for example, fuel economies achieved by
the vehicles). Other quantities of interest are the
average number $\bar{n}_T$ of tasks waiting to be completed
and the waiting time $W$ of a task before its
location is reached by a vehicle. These quantities,
however, are related according to $T = W + s$
(by definition) and by Little’s law, stating that
$\bar{n}_T = \lambda W$, for stable queues.
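As a quick numerical illustration of these two relations, consider the following worked example; the numbers are assumed purely for illustration, and the symbols $\bar{n}_T$, $W$, $T$, $s$, and $\lambda$ denote the average number of waiting tasks, the waiting time, the system time, the mean on-site service time, and the arrival intensity, respectively.

```latex
\[
\lambda = 2,\quad W = 3,\quad s = 0.5
\;\Longrightarrow\;
\bar{n}_T = \lambda W = 6,
\qquad
T = W + s = 3.5 .
\]
```

In words: if tasks arrive at a rate of 2 per unit time and each waits 3 time units before a vehicle reaches it, then on average 6 tasks are waiting, and each task spends 3.5 time units in the system.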

The system is considered stable if the expected
number of waiting tasks is uniformly bounded at
all times or, equivalently, if tasks are removed
from the system at least at the same rate at
which they are generated. In the case at hand,
the time to complete a task is the sum of the
time to reach its location (which depends on the
routing policy) plus the time spent at that location
in on-site service (which is independent of the
routing policy). Since, by definition, the service
time is no shorter than the on-site service time
$s$, a (weaker) necessary condition for stability
is $\varrho := \lambda s / n < 1$; the quantity $\varrho$ measures
the fraction of time the vehicles are performing
on-site service. Remarkably, it turns out that this
is also a sufficient condition for stability, in the
sense that, if this condition is satisfied, one can
find a stabilizing policy. Note that this stability
condition is independent of the size and shape of
$Q$ and of the speed of the vehicles.
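The stability condition lends itself to a simple sizing calculation. The sketch below assumes the load factor is the arrival intensity times the mean on-site service time divided by the number of vehicles, as stated above; the function names are hypothetical.

```python
import math

def load_factor(arrival_rate, mean_service_time, n_vehicles):
    """Fraction of time the vehicles spend in on-site service."""
    if arrival_rate <= 0 or mean_service_time <= 0 or n_vehicles < 1:
        raise ValueError("rates and times must be positive, n_vehicles >= 1")
    return arrival_rate * mean_service_time / n_vehicles

def min_vehicles_for_stability(arrival_rate, mean_service_time):
    """Smallest integer n for which the load factor is strictly below one."""
    if arrival_rate <= 0 or mean_service_time <= 0:
        raise ValueError("rates and times must be positive")
    # Smallest integer strictly greater than arrival_rate * mean_service_time
    return math.floor(arrival_rate * mean_service_time) + 1
```

For example, with 4 tasks per unit time and a mean on-site service time of 0.5, two vehicles give a load factor of exactly one, so at least three vehicles are needed. Neither the area of the region nor the vehicle speed enters the calculation.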


Fundamental Limitations of Performance
To derive lower bounds, the main difficulty consists
in bounding (possibly in a statistical sense)
the amount of time spent to reach a target location.
The derivation of these bounds becomes
simpler in asymptotic regimes, i.e., looking at
cases when $\varrho \to 0^+$ and $\varrho \to 1^-$, which are often
called “light-load” and “heavy-load” conditions,
respectively.

Consider first the case in which $\varrho \to 0^+$
(light-load regime). A set of $n$ points is called the
$n$-median of $Q$ if it globally minimizes the expected
distance between a random point sampled
uniformly from $Q$ and the closest point in such
set. In other words, the $n$-median of $Q$ globally
minimizes the function

$$H_n(p_1, p_2, \dots, p_n) := \mathbb{E}\left[\min_{k \in \{1,\dots,n\}} \|p_k - q\|\right] = \frac{1}{A} \int_{Q} \min_{k \in \{1,\dots,n\}} \|p_k - q\| \, dq.$$

Let $H_n^*$ be the global minimum of this function.
Geometric considerations show that $H_n^*$ scales
proportionally to $\sqrt{A/n}$.

Incidentally, the $n$-median of $Q$ induces a
Voronoi partition that is called Median Voronoi
Tessellation, whose importance will become clear
in the next section. Recall that the Voronoi diagram
of $Q$ induced by points $(p_1, \dots, p_n)$ is
defined by

$$V_i = \left\{ q \in Q : \|q - p_i\| \le \|q - p_j\|, \ \forall j \ne i, \ j \in \{1, \dots, n\} \right\},$$

where $V_i$ is the region associated with the $i$-th
“generator” point $p_i$ (see also ► Optimal Deployment
and Spatial Coverage). The distance $H_n^*$
certainly provides a lower bound on the expected
distance traveled by a vehicle to reach a task, and
hence one obtains the lower bound

$$T \ge \frac{H_n^*}{v} + s.$$

This lower bound is tight in light-load conditions
($\varrho \to 0^+$), as will be seen in the next section.
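The quantities in the light-load bound can be estimated numerically. The sketch below assumes the region is the unit square and evaluates, by Monte Carlo sampling, the expected distance from a uniformly drawn point to the nearest of a given set of generator points; it does not search for the minimizing configuration. The function names and default sample size are illustrative choices.

```python
import numpy as np

def expected_nearest_distance(generators, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the expected distance to the nearest generator
    for points drawn uniformly from the unit square."""
    rng = np.random.default_rng(seed)
    q = rng.random((n_samples, 2))
    p = np.asarray(generators, dtype=float).reshape(-1, 2)
    d = np.linalg.norm(q[:, None, :] - p[None, :, :], axis=2)
    return float(d.min(axis=1).mean())

def nearest_generator(generators, q):
    """Index of the Voronoi region (nearest generator) containing point q."""
    p = np.asarray(generators, dtype=float).reshape(-1, 2)
    return int(np.argmin(np.linalg.norm(p - np.asarray(q, dtype=float), axis=1)))

def light_load_lower_bound(h_star, speed, mean_service_time):
    """Lower bound on the system time: h_star / speed + mean_service_time."""
    return h_star / speed + mean_service_time
```

For a single generator at the center of the unit square the estimate is close to 0.383, which agrees with the closed-form mean distance from the center of a unit square, $(\sqrt{2} + \ln(1 + \sqrt{2}))/6$. Placing two generators at the centers of the two half-squares reduces the value, in line with the decrease of the bound as vehicles are added.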

Consider now the case in which $\varrho \to 1^-$
(heavy load). Let $D$ be the average travel distance
per task for some routing policy. By using arguments
from geometrical probability (independent
of algorithms), one can show that
$D \ge \beta_2 \sqrt{A} / \sqrt{2 \bar{n}_T}$ as $\varrho \to 1^-$, where $\beta_2$ is a constant
that will be specified later. As discussed, for
stability, one needs $s + D/v < n/\lambda$. Combining
the stability condition with the bound on the
average travel distance per task, one obtains

$$s + \frac{\beta_2 \sqrt{A}}{v \sqrt{2 \bar{n}_T}} \le \frac{n}{\lambda}.$$

Since, by Little’s law, $\bar{n}_T = \lambda W$ and $T = W + s$,
one finally obtains (recall that $\varrho = \lambda s / n$):

$$T \ge \frac{\beta_2^2}{2} \, \frac{A}{v^2} \, \frac{\lambda}{n^2 (1 - \varrho)^2} + s, \quad (\text{as } \varrho \to 1^-).$$

A salient feature of the above lower bound is
that it scales inverse-quadratically with the number
of vehicles, i.e., as $1/n^2$ (as opposed to the
$1/\sqrt{n}$ scaling law one has in light-load conditions);
note, however, that congestion effects are not
included in this model. This bound also shows that the
quality of service, which is proportional to $1/(1 - \varrho)^2$,
degrades much faster as the target load increases
than in nonspatial queueing systems (where the
growth rate is proportional to $1/(1 - \varrho)$).
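
To get a feel for how quickly the heavy-load bound grows, the short Python sketch below evaluates the right-hand side of the bound for a few arrival rates. The function name and the numerical values are illustrative assumptions; the constant 0.712 is the value of $\beta_2$ quoted in the next section, and the bound is only meaningful asymptotically as $\varrho \to 1^-$.

```python
def heavy_load_lower_bound(lam, s, n, area, v, beta2=0.712):
    """Right-hand side of the heavy-load lower bound on the system time T.

    lam: task arrival rate, s: mean on-site service time, n: number of
    vehicles, area: area A of the region Q, v: vehicle speed.
    """
    rho = lam * s / n                      # load factor
    if not 0.0 <= rho < 1.0:
        raise ValueError("load factor must satisfy 0 <= rho < 1")
    return beta2**2 * area * lam / (2 * v**2 * n**2 * (1 - rho)**2) + s

# Single vehicle, unit area, unit speed, unit service time (hypothetical values).
for lam in (0.5, 0.9, 0.99):
    print(lam, heavy_load_lower_bound(lam, s=1.0, n=1, area=1.0, v=1.0))
```

Moving the arrival rate from 0.9 to 0.99 multiplies the travel-related term by roughly one hundred, which is the $1/(1 - \varrho)^2$ growth discussed above.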

Design of Routing Algorithms
The design of an optimal light-load policy essentially
relies on mimicking the proof strategy
employed for the light-load lower bound. Specifically,
a routing policy whereby (1) one vehicle
is assigned to each of the n median locations
of Q, (2) new tasks are assigned to the nearest
median location and its corresponding vehicle,
and (3) each vehicle services tasks according to
a first-come-first-served policy is asymptotically
optimal, i.e.,

$$T \to \frac{H_n^*}{v} + s \quad (\text{as } \varrho \to 0^+).$$

Note that under this strategy “regions of dominance”
are implicitly assigned to vehicles according
to a Median Voronoi Tessellation.

The heavy-load case is more challenging.
Consider, first, the following single-vehicle
routing policy, based on a partition of Q into
$p \ge 1$ subregions $\{Q_1, Q_2, \ldots, Q_p\}$ of equal
area $A/p$. Such a partition can be obtained, e.g.,
as sectors centered at the median of Q. Define a
cyclic ordering for the subregions, such that, e.g.,
if the vehicle is in region $Q_i$, the “next” region is
$Q_j$, where j follows i in the cyclic ordering (in
other words, $j = (i + 1) \bmod p$).

1. If there are no outstanding tasks, move to
the median of the region Q.

2. Otherwise, visit the “next” subregion;
subregions with no tasks are skipped.
Compute a minimum-length path from
the vehicle’s current position through all
the outstanding tasks in that subregion.
Complete all tasks on this path, ignoring
new tasks generated in the meantime.
Repeat.

The problem of computing the shortest path
through a number of points is related to the well-known
traveling salesman problem (TSP). While
the TSP is a prototypically hard combinatorial
optimization problem, it is well known that the
Euclidean version of the problem can be approximated
efficiently (Vazirani 2001). Furthermore,
the length $\mathrm{ETSP}(n_T)$ of a Euclidean TSP through
$n_T$ points independently and uniformly sampled
in Q is known to satisfy the following property:

$$\lim_{n_T \to \infty} \frac{\mathrm{ETSP}(n_T)}{\sqrt{n_T}} = \beta_2 \sqrt{A} \quad \text{almost surely},$$

where $\beta_2 \approx 0.712$ is a constant (the same
$\beta_2$ constant that appeared in the previous section)
(Steele 1990).

It can be shown that, using the above routing
policy, the average system time T satisfies

$$T \le \gamma(p) \, \frac{\lambda A}{v^2 (1 - \varrho)^2} + s \quad (\text{as } \varrho \to 1^-),$$

where $\gamma(1) = \beta_2^2$ and $\gamma(p) \to \beta_2^2/2$ for large p.
These results critically exploit the statistical characterization
of the length of an optimal TSP tour.
Hence, the proposed policy achieves a quality
of service that is arbitrarily close to the optimal
one, in the asymptotic regime of heavy load (and,
indeed, also of light load).
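
The decision logic of the single-vehicle policy can be sketched in Python as follows. This is only an illustration under stated assumptions: the subregions are taken to be equal-angle sectors around a given center (which have equal area only for suitably symmetric regions, such as a disk or a square cut into four quadrants), and a greedy nearest-neighbor ordering stands in for the minimum-length path, so it does not reproduce the performance guarantees quoted above. All function names are hypothetical.

```python
import math

def sector_index(point, center, p):
    """Index (0..p-1) of the equal-angle sector around center that contains point."""
    angle = math.atan2(point[1] - center[1], point[0] - center[0]) % (2 * math.pi)
    return min(int(angle / (2 * math.pi / p)), p - 1)

def greedy_path(start, tasks):
    """Nearest-neighbor ordering: a cheap stand-in for the minimum-length path."""
    path, current, remaining = [], start, list(tasks)
    while remaining:
        nxt = min(remaining, key=lambda t: math.dist(current, t))
        remaining.remove(nxt)
        path.append(nxt)
        current = nxt
    return path

def next_batch(position, last_region, outstanding, center, p):
    """One decision step: returns (region, ordered tasks), or (None, [center])."""
    if not outstanding:
        return None, [center]                    # step 1: wait at the median
    for offset in range(1, p + 1):               # step 2: cyclic order, skip empty
        region = (last_region + offset) % p
        batch = [t for t in outstanding if sector_index(t, center, p) == region]
        if batch:
            return region, greedy_path(position, batch)
    raise RuntimeError("every outstanding task must lie in some sector")

tasks = [(0.9, 0.9), (0.6, 0.7), (0.1, 0.9)]
print(next_batch((0.5, 0.5), 3, tasks, (0.5, 0.5), 4))
```

Tasks that arrive while a batch is being served are simply left in `outstanding` and picked up the next time their sector comes up in the cyclic order.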

The above single-vehicle routing policies can
be fairly easily lifted to an efficient multi-vehicle
routing policy. The key idea (akin to the one in the
light-load case) is to (1) partition the workspace
into n regions of dominance (with disjoint interiors
and whose union is Q), (2) assign one vehicle
to each region, and (3) have each vehicle follow
a single-vehicle routing policy within its own region.
This approach leads to the following multi-vehicle
routing policy for the DTRP:


1. Partition Q into n regions of dominance 
of equal area and assign one vehicle to 
each region. 

2. Each vehicle executes a single-vehicle 
DTRP policy in its own subregion. 


Using the single-vehicle routing policy described
above within each region, the average system time T
in heavy load satisfies

$$T \le \gamma(p) \, \frac{\lambda A}{v^2 n^2 (1 - \varrho)^2} \quad (\text{as } \varrho \to 1^-).$$

Hence, by comparing this result with the corresponding
lower bound, one concludes that a
simple partitioning strategy leads to a multi-vehicle
routing policy whose performance is arbitrarily
close to the optimal one in heavy load.

Mode of Implementation 

The scalability of the control policies to large
groups of vehicles often requires a distributed
implementation of multi-vehicle routing strategies.
For the DTRP, a distributed implementation
can be obtained by devising decentralized
algorithms for environment partitioning. In the
solution proposed in Pavone (2010), power diagrams
are the key geometric concept to obtain,
in a decentralized fashion, partitions suitable for
both the light-load case (requiring, as seen before,
a Median Voronoi Tessellation) and the heavy-load
case (requiring an equal-area partition). The
power diagram of Q is defined as

$$V_i = \{ q \in Q : \|q - p_i\|^2 - w_i \le \|q - p_j\|^2 - w_j, \ \forall j \ne i, \ j \in \{1, \ldots, n\} \},$$

where $(p_i, w_i) \in Q \times \mathbb{R}$ are a set of “power
points” and $V_i$ is the subregion associated with
the i-th power point. Note that power diagrams
are a generalization of Voronoi diagrams: when
all weights are equal, the power diagram and the
Voronoi diagram are identical. The basic idea,
then, is to associate to each vehicle i a virtual
power point, which is an artificial (or logical)
variable whose value is locally controlled by the
i-th vehicle. The cell $V_i$ becomes the region of
dominance for vehicle i, and each vehicle updates
its own power point according to a decentralized
gradient-descent law with respect to a coverage
function (► Optimal Deployment and Spatial
Coverage), until the desired partition is achieved.
The reader is referred to Pavone (2010) for more
details.
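
The effect of the weights on the cells can be illustrated with a small Python sketch. It assumes that Q is the unit square and approximates cell areas on a regular grid; the function names are hypothetical, and the sketch only evaluates the definition of the power diagram, not the decentralized update law of Pavone (2010).

```python
def power_cell(q, points, weights):
    """Index i of the power cell V_i containing q (ties go to the lowest index)."""
    costs = [(q[0] - p[0])**2 + (q[1] - p[1])**2 - w
             for p, w in zip(points, weights)]
    return min(range(len(costs)), key=costs.__getitem__)

def cell_area_fractions(points, weights, grid=50):
    """Approximate share of the unit square covered by each power cell."""
    counts = [0] * len(points)
    for i in range(grid):
        for j in range(grid):
            q = ((i + 0.5) / grid, (j + 0.5) / grid)   # grid-cell midpoint
            counts[power_cell(q, points, weights)] += 1
    return [c / grid**2 for c in counts]

generators = [(0.25, 0.5), (0.75, 0.5)]
print(cell_area_fractions(generators, [0.0, 0.0]))   # equal weights: Voronoi cells
print(cell_area_fractions(generators, [0.1, 0.0]))   # larger weight enlarges cell 0
```

Raising the weight of a power point enlarges its cell without moving the generator, which is the degree of freedom that makes equal-area partitions reachable.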

Extensions and Discussion 

By integrating additional ideas from dynamics,
teaming, and distributed algorithms, the spatial
queueing theory approach has been recently applied
to scenarios with complex models for the
tasks such as time constraints, service priorities,
translating tasks, and adversarial generation;
has been extended to address aspects concerning
robotic implementation such as complex vehicle
dynamics, limited sensing range, and team
forming; and has even been tailored to integrate
humans in the design space; see Bullo et al.
(2011) and references therein. Despite the significant
modeling differences, the “workflow” is
essentially the same as in the DTRP: a queueing
model that captures the salient features of the
problem at hand, characterization of the fundamental
limitations of performance, and design of
algorithms with provable performance bounds.
The last step, as for the DTRP, often involves
lifting a single-vehicle policy to a multi-vehicle
policy through the strategy of environment partitioning.
Within this context, a number of partitioning
schemes and corresponding decentralized
partitioning algorithms relevant to a large variety
of DVR problems are discussed in Pavone et al.
(2009).

This workflow efficiently and transparently
decouples the three decision-making problems
mentioned in the Introduction, i.e., “task
allocation,” “service scheduling,” and “loitering
paths.” In fact, task allocation is addressed via
the strategy of environment partitioning, service
scheduling is addressed by applying a single-vehicle
routing policy within the individual
regions of dominance, and the loitering-path
problem reduces to placing the vehicles at or around
specific points within the dominance regions
(e.g., the median). Note, however, that in some
important cases, e.g., DVR problems where
goods have to be transported from a pickup
location to a delivery location or where vehicles
are differentially constrained and operate in a
“congested” workspace, multi-vehicle policies
that rely on static partitions perform poorly or
are not even feasible (Pavone et al. 2009), and
task allocation and service scheduling need to be
addressed as tightly coupled problems.

Through spatial queueing theory one is
usually able to characterize the performance
of multi-vehicle routing policies in asymptotic
regimes. To ensure “satisfactory” performance
under general operation conditions, a common
strategy is to consider heuristic modifications
to a baseline asymptotically efficient routing
policy in such a way that, on the one hand,
asymptotic performance is preserved, and, on the
other hand, light- and heavy-load performances
are “smoothly” and efficiently blended in the
intermediate load case. The interested reader
can find more information in Bullo et al.
(2011).

Summary and Future Directions 

The three main approaches available to tackle
DVR problems are (i) heuristic algorithms, (ii)
competitive analysis, and (iii) spatial queueing
theory. Broadly speaking, the competitive analysis
approach is well suited when worst-case
guarantees are sought, e.g., because there is not
enough statistical information about the problem
at hand. Spatial queueing theory represents a
powerful alternative in cases where it is possible
to leverage statistical information and one
seeks average-case guarantees. Finally, for some
problems the complexity of the model makes
an analytical treatment very difficult, in which
case the only option is to resort to a heuristic
approach (possibly relying on insights derived
by applying competitive analysis and/or spatial
queueing theory to a simplified version of the
problem).

Future directions include the extension of the 
three aforementioned approaches to increasingly 
complex problem setups, for example, higher- 
fidelity vehicle dynamics and environments 
and sophisticated sensing and communication 
constraints, novel applications (e.g., search and 
rescue missions, map maintenance, and pursuit- 
evasion), and inclusion of game-theoretical tools 
to address adversarial scenarios. Specifically, for
the spatial queueing theory approach, key future
directions include the problem of addressing
optimality of performance in intermediate
regimes (current optimality results are only
available in either the light-load or the heavy-load
regime), online estimation of the statistical
parameters (e.g., spatial distribution of the tasks),
and formulations that take into account second-
order moments and large-deviation probabilities.

Cross-References 

► Averaging Algorithms and Consensus 

► Flocking in Networked Systems 

► Networked Systems 

► Optimal Deployment and Spatial Coverage 

► Particle Filters 


Bibliography 

Bertsimas DJ, van Ryzin GJ (1991) A stochastic and
dynamic vehicle routing problem in the Euclidean
plane. Oper Res 39:601-615

Bullo F, Frazzoli E, Pavone M, Savla K, Smith SL (2011)
Dynamic vehicle routing for robotic systems. Proc
IEEE 99(9):1482-1504

Golden B, Raghavan S, Wasil E (2008) The vehicle
routing problem: latest advances and new challenges.
Volume 43 of operations research/computer science
interfaces. Springer, New York

Jaillet P, Wagner MR (2006) Online routing problems:
value of advanced information and improved competitive
ratios. Transp Sci 40(2):200-210

Pavone M (2010) Dynamic vehicle routing for robotic
networks. PhD thesis, Department of Aeronautics and
Astronautics, Massachusetts Institute of Technology

Pavone M, Savla K, Frazzoli E (2009) Sharing the load.
IEEE Robot Autom Mag 16(2):52-61

Steele JM (1990) Probabilistic and worst case analyses
of classical problems of combinatorial optimization in
Euclidean space. Math Oper Res 15(4):749

Toth P, Vigo D (eds) (2001) The vehicle routing problem.
Monographs on discrete mathematics and applications.
SIAM, Philadelphia. ISBN:0898715792

Van Hentenryck P, Bent R, Upfal E (2009) Online stochastic
optimization under time constraints. Ann Oper Res
177(1):151-183

Vazirani V (2001) Approximation algorithms. Springer,
New York




N 


Nash equilibrium 

► Strategic Form Games and Nash Equilibrium 


Network Games 

R. Srikant 

Department of Electrical and Computer 
Engineering and the Coordinated Science Lab, 
University of Illinois at Urbana-Champaign, 
Champaign, IL, USA 

Abstract 

Game theory plays a central role in studying systems
with a number of interacting players competing
for a common resource. A communication
network serves as a prototypical example of such
a system, where the common resource is the network,
consisting of nodes and links with limited
capacities, and the players are the computers,
web servers, and other end hosts who want to
transfer information over the shared network. In
this entry, we present several examples of game-theoretic
interaction in communication networks
and present a simple mathematical model to study
one such instance, namely, resource allocation in
the Internet.


Keywords and Phrases 

Congestion games; Network economics; Price-taking
users; Routing games; Strategic users


Introduction 

A communication network can be viewed as a
collection of resources shared by a set of competing
users. If the network were totally unregulated,
then each user would attempt to grab as many
resources in the network as possible, resulting in
poor network performance, a situation commonly
referred to as the tragedy of the commons (Hardin
1968). In reality, there is a carefully designed
set of network protocols and pricing mechanisms
which provide incentives to users to act in a
socially responsible manner. Since game theory
is the mathematical discipline which studies the
interactions between selfish users, it is a natural
tool to use to design these network control
mechanisms. We now provide a few examples
of network problems which naturally lend themselves
to game-theoretic analysis. Later, we will
elaborate on the game-theoretic formulation of
one of these examples.

• Resource Allocation: A network such
as the Internet is a collection of links,
where each link has a limited data-carrying
capacity, usually measured in bits per second.
The Internet is shared by billions of users,
and the actions of these users have to be
regulated so that they share the resources in
the network in a fair manner. Equivalently,
this problem can be viewed as one in which
a network designer has to design a collection
of protocols so that the users of the network
can equitably allocate the available resources
among themselves without the intervention of
a central authority. Such protocols are built
into every computer connected to the Internet
today, to allow for seamless operation of the
network. The problem of designing such
protocols can be posed as a game-theoretic
problem in which the players are the network
and the traffic sources using the network
(Kelly 1997).

• Routing Games: Finding appropriate routes
for each user’s data traffic is a particular form
of the resource allocation mentioned above. However,
routing has applications beyond communication
networks (with the other major application
area being transportation networks),
so it is useful to discuss routing separately.
In communication networks, each user may
attempt to find the minimum-delay route for
its traffic, with help from the network, to
minimize the delay experienced by its packets.
In a transportation network, each automobile
on the road attempts to take the path of least
congestion through the network. An active
area of research in game theory is one which
tries to understand the impact of individual
user decisions on the global performance of
the network (Roughgarden 2005). An interesting
result in this regard is the Braess paradox,
an example of a road transportation
network in which the addition of a road
leads to increased delays when each user selfishly
chooses a route to minimize its delay. Of
course, if routes are chosen to minimize the
overall delay experienced in the network, such
a paradox will not arise.

• Peer-to-Peer Applications: Many studies have
indicated that file sharing between users (also
known as peers) directly, without using a
centralized web site such as YouTube, is a
dominant source of traffic in the Internet.
For such a peer-to-peer service to work,
each peer should not only download files
from others, but should also be willing to
sacrifice some of its resources to upload files
to others. Naturally, peers would prefer to
only download and not upload to minimize
their resource usage. The design of incentive
schemes to induce users to both download
and upload files is another example of a
game-theoretic problem in a network (Qiu
and Srikant 2004).

• Network Economics: In addition to end-user
interaction, Internet service providers (ISPs)
have to interact with each other to allow their
customers access to all the web sites in the
world. For example, one ISP may have a
customer who wants to access a web site
connected to another ISP. In this case, the data
traffic must cross ISP boundaries, and thus,
one ISP has to transport data destined for a
customer of another ISP. Thus, ISPs must be
willing to contribute resources to satisfy the
needs of customers who do not directly pay
them. In such a situation, ISPs must have bilateral
agreements (commonly known as peering
agreements) to ensure that the selfish interest
of each ISP to minimize its resource usage
is aligned with the needs of its customers.
Again, game theory is the right tool to study
such inter-ISP interactions (Courcoubetis and
Weber 2003).

• Spectrum Sharing: Large portions of the radio
spectrum are severely underutilized. Typically,
portions of the spectrum are assigned
to a primary user, but the primary user does
not use it most of the time. There has been
a surge of interest recently in the concept of
cognitive radio, whereby radios are aware
of the presence or absence of the primary
user, and when the primary user is absent,
another radio can use the spectrum to transmit
its data. When there are many users and the
available spectrum is split into many channels,
it is impossible for users to perfectly coordinate
their transmissions to achieve maximum
network utilization. In these situations, game-theoretic
protocols which take into account the
noncooperative behavior of the users can be
designed to allow secondary users to access
the available channels as efficiently as possible
(Saad et al. 2009).

In the next section, we will elaborate on one of
the applications above, namely, resource allocation
in the Internet, and show how game-theoretic
modeling can be used to design fair resource
sharing.

Resource Allocation and Game 
Theory 

Consider a network consisting of L links, with
link $l$ having capacity $c_l$. Suppose that there are R
users sharing the network, with each user r being
characterized by a set of links which connect the
user’s source to its destination. Since each user
uses a fixed route in our model, we will use r to
denote both the user and the route used by the
user. We use the notation $l \in r$ to denote that
link $l$ is a part of route r. Let $x_r$ denote the rate
at which user r transmits data. Thus, we have the
following natural constraints, which state that the
total data rate on any link must be less than or
equal to the capacity of the link:

$$\sum_{r: l \in r} x_r \le c_l, \quad \forall l. \qquad (1)$$

Associated with each user is a concave utility
function $U_r(x_r)$, which is the utility that user
r derives by transmitting data at rate $x_r$. The
network utility maximization problem is to solve

$$\max_{x \ge 0} \sum_r U_r(x_r), \qquad (2)$$

subject to the constraint (1). In (2), x denotes the
vector $(x_1, x_2, \ldots, x_R)$ and $x \ge 0$ means that
each component of x must be greater than or
equal to zero. Note that the goal of the network
in (2) is to maximize the sum of the utilities of
the users in the network.

Let $p_l$ be the Lagrange multiplier corresponding
to the capacity constraint in (1) for link $l$.
Then the Lagrangian for the problem is given by

$$L(x, p) = \sum_r U_r(x_r) - \sum_l p_l (y_l - c_l), \qquad (3)$$

where we have used the notation $y_l := \sum_{r: l \in r} x_r$
to denote the total data rate on link $l$. If p is
known, then the optimal x can be calculated by
solving

$$\max_{x \ge 0} L(x, p).$$

Notice that the optimal solution for each $x_r$ can
be obtained by solving

$$\max_{x_r \ge 0} U_r(x_r) - q_r x_r, \qquad (4)$$

where $q_r = \sum_{l \in r} p_l$. Thus, if the Lagrange
multipliers are known, then the network utility
maximization can be interpreted as a game in
the following manner. Suppose that the network
charges each user $q_r$ dollars for every bit transmitted
by user r through the network. Then, $q_r x_r$
is the dollars per second spent by the user if $x_r$ is
measured in bits per second. Interpreting $U_r(x_r)$
as the dollars per second that the user is willing
to pay for transmitting at rate $x_r$, the optimization
problem in (4) is the problem faced by user
r, which wants to maximize its net utility, i.e.,
utility minus cost. Thus, the individual optimal
solution for each user is also the solution to
the network utility maximization problem. The
above game-theoretic interpretation of the network
utility maximization problem is somewhat
trivial since, given the $p_l$’s or $q_r$’s, there is no
interaction between the users. Of course, this
interpretation relies on the ability of the network
to compute p. We next present a scheme to compute
p, which couples the users closely and thus
allows for a richer game-theoretic interpretation.
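
A small Python sketch may help make the per-user problem (4) concrete. It assumes, purely for illustration, the concave utility $U_r(x_r) = w_r \log(1 + x_r)$, for which the maximizer of (4) is $x_r = \max(0, w_r/q_r - 1)$; the routes, link prices, and function names are hypothetical.

```python
def route_prices(routes, link_prices):
    """q_r = sum of the link prices p_l over the links l on route r."""
    return [sum(link_prices[l] for l in route) for route in routes]

def best_rate(weight, price):
    """Maximizer of weight*log(1 + x) - price*x over x >= 0 (assumed utility)."""
    return max(0.0, weight / price - 1.0)

routes = [[0, 1], [0], [1]]      # user 0 crosses both links, users 1 and 2 one each
link_prices = [0.2, 0.3]         # hypothetical Lagrange multipliers p_l
q = route_prices(routes, link_prices)
rates = [best_rate(1.0, qr) for qr in q]
print(q, rates)
```

Given the prices, each user computes its rate from its own route price alone, which is why this interpretation involves no interaction between the users.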

Suppose that the network wants to compute p
but does not have access to the utility functions
of the users. The network asks each user r to bid
an amount $w_r$ which is interpreted as the dollars
per second that the user is willing to pay. The
network then assumes that user r’s utility function
is $w_r \log x_r$ and solves the network utility
maximization. While this choice of utility function
may seem arbitrary, the resulting solution x
has a number of attractive properties, including a
form of fairness called proportional fairness. The
proportionally fair resource allocation solution
to (4) is given by

$$\frac{w_r}{x_r} = q_r. \qquad (5)$$

The network then allocates rate $x_r$ to user r and
charges $q_r$ dollars per bit. From (5), the amount
charged to user r per second is $w_r$, thus satisfying
the original interpretation of $w_r$. Knowing that the
network charges users in this manner, how might
a user choose its bid $w_r$? Recall that user r’s goal
is to solve (4). Substituting from (5), the problem
in (4) can be rewritten as

$$\max_{w_r \ge 0} U_r\left(\frac{w_r}{q_r}\right) - w_r. \qquad (6)$$

Thus, the users’ problem of selecting w can be
viewed as a game, with each user’s objective
given by (6). Note that $q_r$ is given by (5) and
thus depends on all the $w_r$’s. Depending upon the
application, the game can be solved under one of
two assumptions:

• Price-Taking Users: Under this assumption,
users are assumed to take the price $q_r$ as given,
i.e., they do not attempt to infer the impact of
their actions on the price. This is a reasonable
assumption in a large network such as the Internet,
where the impact of a single user on the
link prices is negligible, and it is practically
impossible for any user to infer the impact
of its decisions on the prevailing price of the
network resources. When the users are price
taking, the socially optimal solution, i.e., the
solution to the network utility maximization
problem, coincides with the Nash equilibrium
of the game. To see this, note that the solution
to (6) is given by

$$U_r'\left(\frac{w_r}{q_r}\right) = q_r,$$

under the assumption that the utility function
is differentiable and the solution is bounded
away from zero. Using (5), this equation reduces
to

$$U_r'(x_r) = q_r,$$

which maximizes the Lagrangian (3). It is not
difficult to see that the complementary slackness
equations in the Karush-Kuhn-Tucker
conditions are satisfied since the constraints
for (2) and the proportionally fair solution
are the same. Thus, under the price-taking
assumption, the equilibrium solution of the game
is the same as the socially optimal
solution provided the network computes $q_r$ using
the proportionally fair resource allocation
formulation.

• Strategic Users: In networks where the number
of users is small, it may be possible for
each user to know the topology of the network,
and thus, each user may be able to solve for
the proportionally fair resource allocation if
it has access to other users’ bids. In other
words, it may be possible to compute a Nash
equilibrium by taking into account the impact
of the $w_r$’s on the $q_r$’s. When the users are
strategic, the socially optimal solution could
be quite different from the Nash equilibrium.
The ratio of the network utility under the
socially optimal solution to the network utility
under a Nash equilibrium is called the price of
anarchy.
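
To illustrate the strategic case, the following Python sketch computes a Nash equilibrium and the resulting price of anarchy for a deliberately simple setting that is an assumption of this example rather than part of the model above: a single link of capacity C shared by two users with linear utilities $U_r(x_r) = a_r x_r$. On a single link, (5) gives $x_r = C w_r / \sum_s w_s$, and a strategic user's best response to the other bids (with total W) is $w_r = \max(0, \sqrt{a_r C W} - W)$. The function names are hypothetical.

```python
import math

def nash_bids(slopes, capacity, rounds=200):
    """Sequential best-response iteration for the single-link bidding game."""
    bids = [0.1] * len(slopes)                 # arbitrary positive starting bids
    for _ in range(rounds):
        for r, a in enumerate(slopes):
            others = sum(bids) - bids[r]
            # Maximizer of a*C*w/(w + others) - w over w >= 0.
            bids[r] = max(0.0, math.sqrt(a * capacity * others) - others)
    return bids

def price_of_anarchy(slopes, capacity):
    """Socially optimal utility divided by the utility at the computed equilibrium."""
    bids = nash_bids(slopes, capacity)
    total = sum(bids)
    rates = [capacity * w / total for w in bids]
    nash_utility = sum(a * x for a, x in zip(slopes, rates))
    optimal_utility = max(slopes) * capacity   # give the whole link to the best user
    return optimal_utility / nash_utility

print(price_of_anarchy([1.0, 0.5], capacity=1.0))
```

In this two-user example the user with the lower marginal utility still obtains a share of the link at equilibrium, so the ratio exceeds one; best-response iteration is used here only because it happens to converge for this instance.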

There is a rich literature associated with both
interpretations of the network congestion game.
In the case of price-taking users, much of the
emphasis in the literature has been on designing
distributed algorithms to achieve the socially optimal
solution (Shakkottai and Srikant 2007). In
the case of strategic users, the focus has been on
characterizing the price of anarchy (Johari and
Tsitsiklis 2004; Yang and Hajek 2007).
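
As one hedged illustration of such a distributed scheme for price-taking users, the Python sketch below uses a standard dual (price-update) iteration, chosen here for illustration and not taken from the references: each link raises its price when its load exceeds its capacity and lowers it otherwise, and each user sets its rate from (5). The step size, iteration count, topology, and function name are assumptions.

```python
def proportionally_fair(routes, capacities, bids, step=0.05, iterations=5000):
    """Approximate the proportionally fair rates by iterating on link prices."""
    prices = [1.0] * len(capacities)           # initial link prices p_l
    rates = [0.0] * len(routes)
    for _ in range(iterations):
        # User side: route price q_r and rate x_r = w_r / q_r, as in (5).
        q = [sum(prices[l] for l in route) for route in routes]
        rates = [w / qr for w, qr in zip(bids, q)]
        # Link side: total load y_l, then a price step toward y_l = c_l.
        loads = [0.0] * len(capacities)
        for route, x in zip(routes, rates):
            for l in route:
                loads[l] += x
        prices = [max(p + step * (y - c), 1e-9)
                  for p, y, c in zip(prices, loads, capacities)]
    return rates, prices

routes = [[0, 1], [0], [1]]                    # user 0 uses both links
rates, prices = proportionally_fair(routes, [1.0, 1.0], [1.0, 1.0, 1.0])
print(rates, prices)
```

In this two-link example the user that crosses both links ends up with half the rate of each single-link user, since it pays the sum of two link prices for every bit.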

Summary and Future Directions 

We have presented a number of applications 
which involve the interactions of selfish users 
over a network. For the resource allocation 
application, we have also described how simple 
mathematical models can be used to provide 
incentives for users to act in a socially optimal 
manner. In particular, we have shown that, under 
the reasonable price-taking assumption and an 
appropriate computation of link prices, selfish 
users automatically maximize network utility. In 
the case where the users are strategic, the goal is 
to characterize the price of anarchy. 
Moving forward, two areas which require considerable
further research are the following: (i)
inter-ISP routing and (ii) spectrum sharing. The
Internet is a fairly reliable network, and any
unreliability often arises due to routing issues
among ISPs. As mentioned in the introduction,
peering arrangements between ISPs are necessary
to make sure that ISPs carry each other’s traffic
and are appropriately compensated for it, either
through reciprocal traffic-carrying agreements or
actual monetary transfer. Thus, the policy that
an ISP uses to route traffic may be governed by
these peering agreements. The more complicated
these policies are, the more chances there are
for routing misconfigurations that lead to service
interruptions. This interplay between policies and
technology in the form of routing algorithms is an
interesting topic for further study.

Cognitive radios and spectrum sharing are
expected to be significant technological components
of future wireless networks. Designing
algorithms for selfish radios to share the available
spectrum while respecting the rights of the
primary user of the spectrum is a challenge that
requires considerable further attention. This area
of research requires one to combine sensing technologies
to sense the presence of other users
with game-theoretic models to ensure fair channel
access to the secondary users, subject to the
constraint that the primary user should not be
affected by the presence of the secondary users.


Cross-References 
Abstract

When shared, band-limited, real-time communication
networks are employed in a control system
to exchange information between spatially distributed
components, such as controllers, actuators,
and sensors, the system is categorized as a networked
control system (NCS). The primary advantages
of a NCS are reduced complexity and wiring,
reduced design and implementation cost, ease of
system maintenance and modification, and efficient
data sharing. In addition, this unique architecture
creates a way to connect the cyberspace
to the physical space for remote operation of systems.
The NCS architecture allows for performing
more complex tasks, but also requires taking
the network effects into account when designing
control laws and stability conditions. In this entry,
we review significant results on the architecture
and stability analysis of a NCS. The results presented
address communication network-induced
challenges such as time delays, scheduling, and
information packet dropouts.
Introduction

From the washing machine, air conditioner, and
microwave oven to the telephone, stereo, and
automobile, embedded computers are present in
the modern home. In a factory environment, there
are thousands of networked smart sensors and
actuators with embedded processors, working to
complete a coordinated task. The trend in manufacturing
plants, homes, buildings, aircraft, and
automobiles is toward distributed networking.
This trend can be inferred from many proposed
or emerging network standards, such as controller
area network (CAN) for automotive and
industrial automation, BACnet for building automation,
PROFIBUS and WorldFIP fieldbus for
process control, and IEEE 802.11 and Bluetooth
wireless standards for applications such
as mobile sensor networks, HVAC systems, and
unmanned aerial vehicles.

The traditional dedicated point-to-point wired
connection in control systems has been successfully
implemented in industry for decades. With
the advance of communication network and hardware
technologies, it is common to integrate the
communication network into the control system
to replace the dedicated point-to-point connection
to achieve reduced weight and power, lower cost,
simpler installation and maintenance, and higher
reliability, to name a few advantages. For example,
a typical new automobile has two controller
area networks (CANs): a high-speed one in front
of the firewall for the engine, transmission, and
traction control and a low-speed one for locks,
windows, and other devices (Johansson et al.
2005).

The conventional definition of a networked
control system (NCS) is as follows: When a
feedback control system is closed via a communication
channel, which may be shared with
other nodes outside the control system, then the
control system is called a NCS. A NCS can also
be described as a feedback control system where
the control loops are closed through a real-time
communication network.


Architecture of Networked Control 
Systems 

The architecture of a NCS consists of a band-limited,
digital communication network physically
and electronically integrated with a spatially
distributed control system, operated on a given
plant. Digital information, such as controller signals,
actuator signals, sensor signals, and operator
input, is transmitted via the network. The components
connected by the network include all nodes
of the control system, such as the supervisory
(or “network owner”) computer, controller software
and hardware, actuators, and sensors. In this
structure, the feedback control system’s loops are
closed over the shared communication network.

The communication network can be wired or
wireless and may be shared with other unrelated
nodes outside the control system. As illustrated in
Fig. 1, the shared communication channel, which
multiplexes signals from the sensors to the controllers
and/or from the controllers to the actuators,
serves many other uses besides control. Each
of the system components directly connected to
the network via a network interface is denoted
a physical node. Besides the network interface,
the sensor and actuator nodes are typically smart
nodes with embedded microprocessors. Sometimes,
the controller is collocated with the smart
actuator. Several key issues make networked control
systems distinct from traditional control systems
(Hespanha et al. 2007; Yang 2006).

Band-Limited Channels 

Bandwidth limitation of the shared communication
channel requires all nodes in the network to share
the common network resource (e.g., by time sharing
or frequency sharing) without interfering with each
other.

Sampling and Delays 

Networked Control Systems: Architecture and Stability Issues, Fig. 1 A general networked control system (NCS)
architecture

In a NCS, the plant outputs are sampled by
the sensors, which can convert continuous-time
analog signals to digital signals; perform preprocessing,
filtering, and encoding; and package the
data signal so that it is ready for transmission.
After the sending node wins medium access and
the packet is transmitted over the network, the packet
containing the sampled data signal arrives at the
receiver side, which could be a controller or a 
smart actuator with a controller collocated with 
it. The receiver unpacks and decodes the signal. 
This process is quite different from the traditional 
periodic sampling in digital control. The overall 
delay between sampling and the eventual decod¬ 
ing of the transmitted packet at the receiver can be 
time varying and random due to both the network 
access delay (i.e., the time it takes for a shared 
network to accept the data) and the transmission 
delay (i.e., the time during which data are in 
transit inside the network). This also depends on 
the highly variable network conditions, such as 
congestion and channel quality. In some NCSs, 
the data transmitted are time stamped, which 
means that the receiver may have an estimate of 
the delay’s duration and could take appropriate 
corrective action. Given the rapid advance of embedded
computation and communication hardware
technology today, the transmission delay in
many embedded systems can be neglected when
compared with the magnitude of network access
delay.

Packet Dropouts 

It is possible in a NCS that a packet may be 
lost while it is in transit through the network. 
A packet that contains important sampled data
or control signals may be dropped occasionally due
to transmission errors of the physical network
link, message collisions, or node failures, to
name a few causes. Overflow in a queue or buffer
under network congestion can also lead to packet loss.
Thus, the use of queues is not favored by NCSs
in general. Packet dropouts also happen if the
receiver discards outdated arrivals that have long
delays. Many network protocols, such as TCP, are
equipped with transmission-retry mechanisms
that guarantee the eventual delivery of
packets. These protocols, unfortunately, are not
appropriate for a NCS since the retransmission 
of old sensor data or calculated control signals 
is generally not very useful when new, time- 
critical data are available. Using selected old 
data for estimation or prediction is an exception, 
where old data may be packaged with the new 
data in one packet. It is advantageous to discard
the old, un-transmitted data and transmit a new 
packet if and when it becomes available. In this 
way, the controller always receives fresh data for 
its control calculation, and the actuator always 
executes the up-to-date command to control the 
plant. 
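
The discard-old-data policy described above can be sketched as a single-slot transmit buffer. This is a minimal illustration under stated assumptions, not part of any particular network stack: the class name `LatestValueSlot` and its methods are hypothetical, and a real node would also have to handle medium access and time stamps.

```python
class LatestValueSlot:
    """Single-slot transmit buffer: a new sample overwrites any unsent one."""

    def __init__(self):
        self._pending = None   # sample waiting for network access, if any
        self.discarded = 0     # count of stale samples that were overwritten

    def offer(self, sample):
        """Store a fresh sample, discarding an older one that was never sent."""
        if self._pending is not None:
            self.discarded += 1
        self._pending = sample

    def take(self):
        """Called when the node wins network access; returns the freshest sample."""
        sample, self._pending = self._pending, None
        return sample
```

Because there is no queue, the receiver never sees a sample older than the most recent one produced before the node gained access to the network.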

Modeling Errors, Uncertainties, and 
Disturbances 

In a distributed NCS, modeling errors, uncertain¬ 
ties, and disturbances always exist when using 
the mathematical model to describe the physical 
process. These factors may lead to a major impact 
on the overall system performance and cause 
failure in fulfilling the desired objectives. Wang 
and Hovakimyan (2013) proposed a reference 
model-based architecture to decouple the design 
of controller and communication schemes. A ref¬ 
erence model is introduced in each subsystem 
as a bridge to build the connection between the 
real system and an ideal model, free of uncer¬ 
tainties. The closeness between the real system 
and the reference model is associated only with 
plant uncertainties, and the difference between 
the reference model and the ideal model is only 
in the communication constraints. 


Stability of Networked Control 
Systems 

The stability of a control system is often ex¬ 
tremely important and is generally a safety re¬ 
quirement. Examples include the control of rock¬ 
ets, robots, airplanes, automobiles, or ships. In¬ 
stability in any one of these systems can result 
in an unimaginable accident and loss of life. 
The stability of a general dynamical system with
no input can be described with Lyapunov
stability criteria. For a linear system, two commonly
used notions are the following: the system is
asymptotically stable if its impulse response
approaches zero as time approaches infinity, and it is
bounded-input bounded-output (BIBO) stable if
every bounded input produces a bounded output.

When sensors, controllers, and actuators 
are not colocated and use a shared network 
to communicate, the feedback loop of a NCS 
is closed over the network. Network-induced
variable delays and packet dropouts can degrade
the performance of a NCS. For example, the 
NCS may have a longer settling time or bigger 
overshoot in the step response. Furthermore, 
the NCS may become unstable when delays 
and/or packet dropouts exceed a certain range. 
Designers choosing to use a NCS architecture, 
however, are motivated not by performance but 
by cost, maintenance, and reliability gains. 

Band-Limited Channels 

Inspired by Shannon’s results on the maximum 
bit rate that a communication channel can carry 
reliably, a significant research effort has been 
devoted to the problem of determining the min¬ 
imum bit rate that is needed to stabilize a system 
through feedback over a finite capacity channel 
(Baillieul 1999; Nair and Evans 2000; Tatikonda 
and Mitter 2004; Wong and Brockett 1999; Bail¬ 
lieul and Antsaklis 2007). This has been solved 
exactly for linear plants, but only conservative re¬ 
sults have been obtained for nonlinear plants. The 
data-rate theorem that quantifies a fundamental 
relationship between unstable physical systems 
and the rate at which information must be pro¬ 
cessed in order to stably control them was proved 
independently under a variety of assumptions. 
Minimum bit rate and quantization become especially
important for networks designed to carry
very small packets with little overhead, because
encoding measurements or actuation signals with
fewer bits can save network bandwidth.

Most of the NCS stability results presented 
here, however, are based on the observation that 
the channel can transmit a finite number of pack¬ 
ets per unit of time (packet rate) and each packet 
can carry a certain number of bits in the data
field. The packets on a real-time control network
typically are frequent and have small data segments
compared to their headers. For example, a
CAN II packet with a single 16-bit data sample
has a fixed 64 bits of overhead associated with the
identifier, control field, CRC, ACK field, and
frame delimiter, resulting in 20 % utilization (16 data
bits out of 80), and this utilization can never exceed
50 % (the data field length is limited to 64 bits).
Thus, the quantization effects imposed by the communication
networks are generally ignored.
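
The utilization figures can be checked with a short calculation. The sketch below assumes that utilization means data bits divided by total frame bits, and it takes the 64-bit overhead quoted in the text as a fixed constant; the function name `frame_utilization` is hypothetical.

```python
def frame_utilization(data_bits, overhead_bits=64):
    """Fraction of a frame occupied by payload (assumed definition)."""
    return data_bits / (data_bits + overhead_bits)


# One 16-bit sample per frame, and the largest 64-bit data field.
single_sample = frame_utilization(16)
full_payload = frame_utilization(64)
```

Under this definition a frame carrying one 16-bit sample is mostly header, and even a full 64-bit data field only reaches one half.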





Network-Induced Delays 

A significant number of results have attempted to
characterize an upper bound on the sampling
or transmission interval for which stability
of the NCS can be guaranteed. This upper bound
is sometimes called the maximum allowable
transfer interval (MATI) (Walsh 2001a). These
results implicitly attempt to minimize the packet 
rate or schedule the traffic of the control network 
that is needed to stabilize a system through 
feedback. The general approach is to design 
the controller using established techniques, 
considering the network to be transparent, and 
then to analyze the effect of the network on 
closed-loop system performance and stability. 

Networked Control Systems: Architecture and Stability Issues, Fig. 2 A NCS architecture with one-channel feedback (controller collocated with actuator)

The NCS with a linear time-invariant (LTI)
plant/controller pair and one-channel feedback
(see Fig. 2) can be modeled by the following
continuous-time system, where x includes the
states of the plant and the controller,
$x(t) = [x_p(t), x_c(t)]^T$:

$$\dot{x} = Ax + B\hat{y}, \qquad y = Cx \tag{1}$$

$$\hat{y}(t) = \begin{cases} y(k-1), & t \in [t_k,\, t_k + \tau_k) \\ y(k), & t \in [t_k + \tau_k,\, t_{k+1}) \end{cases} \tag{2}$$

The signal y is a vector of sensor measurements
and $\hat{y}$ is the input to a continuous-time controller
collocated with the actuators. Alternatively, $\hat{y}$
can be viewed as the input to the actuators and
y as the desired control signal computed by a
controller collocated with the sensors. The signal
y(t) is sampled at times $\{t_k : k \in \mathbb{N}\}$ and
the samples $y(k) := y(t_k)$ are sent through the
network. But the samples arrive at the destination
after a (possibly variable) delay of $\tau_k$, where
we assume that the network delays are always
smaller than one sampling interval. For periodic
sampling and constant delays, a necessary and
sufficient condition for exponential stability of
the NCS (Eqs. 1 and 2) was derived (Zhang et al.
2001). By using the augmented state space model
and based on the stability of nonlinear hybrid
systems, they also proved a sufficient condition
for stability of the NCS in the time-invariant
case.
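
For periodic sampling with a constant delay shorter than the sampling interval, the augmented-state idea can be illustrated numerically. The sketch below is not the construction of Zhang et al. (2001). It assumes a scalar plant dx/dt = a x + b u with held static feedback u_k = -K x(kh), and all names (`a`, `b`, `K`, `h`, `tau`, `is_stable`) are hypothetical. The closed loop is stable when the spectral radius of the augmented matrix is below one.

```python
import cmath
import math


def augmented_matrix(a, b, K, h, tau):
    """Return the 2x2 matrix mapping z_k = [x(kh), u_{k-1}] to z_{k+1}.

    Assumes a scalar plant dx/dt = a*x + b*u with a != 0, sampling period h,
    constant delay 0 <= tau < h, and held feedback u_k = -K * x(kh) that
    reaches the actuator tau seconds after each sampling instant.
    """
    if a == 0:
        raise ValueError("this sketch assumes a != 0")
    if not 0 <= tau < h:
        raise ValueError("delay must satisfy 0 <= tau < h")
    phi = math.exp(a * h)
    # Effect of the new control u_k, active on [kh + tau, (k+1)h).
    gamma0 = b * (math.exp(a * (h - tau)) - 1.0) / a
    # Effect of the old control u_{k-1}, active on [kh, kh + tau).
    gamma1 = b * (phi - math.exp(a * (h - tau))) / a
    return [[phi - gamma0 * K, gamma1], [-K, 0.0]]


def spectral_radius(m):
    """Largest eigenvalue magnitude of a 2x2 matrix via the quadratic formula."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return max(abs((trace + root) / 2.0), abs((trace - root) / 2.0))


def is_stable(a, b, K, h, tau):
    """True if the sampled closed loop with delay tau is exponentially stable."""
    return spectral_radius(augmented_matrix(a, b, K, h, tau)) < 1.0
```

Sweeping `h` or `tau` with `is_stable` gives a rough picture of how large the sampling interval or delay can become for this toy loop before stability is lost.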

If we now assume the sampling intervals are 
constant and the computation and transmission 
delays are negligible, then the variable network 
access delays serve as the main source of delays 
in a NCS (Lin et al. 2003, 2005). Using average 
dwell time results for discrete switched systems, 
Zhai et al. (2002) provided conditions such that 
NCS stability is guaranteed. Also, the authors 
consider robust disturbance attenuation analysis 
for this class of NCSs. 

When the network delay is not constant or 
when the signal y(t) is sampled in a nonperiodic 
fashion, the system (1) and (2) is not time invari¬ 
ant and one needs a Lyapunov-based argument to 
prove its stability. Zhang and Branicky (2001) derived
a sufficient condition to ensure that the NCS in
Fig. 2 is exponentially stable. They also proposed
a randomized algorithm to find the largest value 
of sampling interval for which stability can be 
guaranteed. 

For a model-based NCS with state and output
feedback, an explicit model of the plant is used 
to produce an estimate of the plant state behavior 
between transmission times (Montestruque 
and Antsaklis 2004). Sufficient conditions for 
Lyapunov stability are derived for a model-based 
NCS when the controller/actuator is updated 
with the sensor information at nonconstant time 
intervals. For a NCS whose transmission times are
driven by a stochastic process, either independent
and identically distributed or Markov-chain
driven, sufficient conditions for almost sure
stability and mean-square stability are
introduced. Onat et al. (2011)
adapted the above stability results to model-based
predictive NCSs with realistic structure
assumptions.

Networked Control Systems: Architecture and Stability Issues, Fig. 3 A NCS architecture with two-channel feedback

Control Network Scheduler 

In a general multi-input/multi-output (MIMO)
NCS with two-channel feedback, both the sampled
plant output and the controller output are transmitted
via a network (see Fig. 3). Because of the
network, only the reported output $\hat{y}(t)$ is available
to the controller and its prediction processes;
similarly, only $\hat{u}(t)$ is available to the actuators on
the plant. We label the network-induced error

$$e(t) := [\hat{y}(t), \hat{u}(t)]^T - [y(t), u(t)]^T$$

and the combined state of controller and plant
$x(t) = [x_p(t), x_c(t)]^T$. The state of the entire
NCS is given by $z(t) = [x(t), e(t)]^T$. Following
this general approach, the controller is designed
using established techniques without considering
the presence of the network.

The behavior of the network-induced error
e(t) is mainly determined by the architecture
of the NCS and the scheduling strategy. In the
special case of one-packet transmission, there is
only one node transmitting data on the network;
therefore, the entire vector e(t) is set to zero at
each transmission time. For multiple nodes transmitting
measured outputs y(t) and/or computed
inputs u(t), the transmission order of the nodes
depends on the scheduling strategy chosen for
the NCS. In other words, the scheduling strategy
decides which components of e(t) are set to zero
at the transmission times.

Static and dynamic schedulers (a.k.a. protocols)
are the two main categories used in a NCS.
When the network resource or transmission order
is pre-allocated or determined before run-time,
the scheduler is called static; round-robin
scheduling is an example. A dynamic scheduler determines
the network allocation while the system
runs. A novel dynamic network scheduler, try-once-discard
(TOD), and several variations were
introduced for wired and wireless NCSs (Walsh
and Ye 2001; Ye et al. 2001). For linear and
nonlinear NCSs with the new dynamic and commonly
used static schedulers, an analytic proof
of global exponential stability of a MIMO NCS
was provided (Walsh 2001a; Walsh et al. 2001b).
Simulation and experiment results showed that
the dynamic schedulers outperform static schedulers
in terms of NCS performance, e.g., a bigger
MATI.
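
The difference between a static and a dynamic scheduler can be sketched as a rule for choosing which node transmits and therefore which component of the network-induced error is reset. The sketch assumes one scalar error per node and a simplified reading of TOD in which the node with the largest absolute error wins access (any weighting of the errors is omitted); the function names are hypothetical.

```python
def round_robin(transmission_index, num_nodes):
    """Static scheduler: the order is fixed before run-time."""
    return transmission_index % num_nodes


def try_once_discard(errors):
    """Dynamic scheduler (simplified TOD): largest current error wins access."""
    return max(range(len(errors)), key=lambda i: abs(errors[i]))


def transmit(errors, node):
    """Reset the error component of the node that transmitted."""
    updated = list(errors)
    updated[node] = 0.0
    return updated


# Example with three nodes and their current network-induced errors.
errors = [0.1, -0.7, 0.3]
static_choice = round_robin(0, len(errors))    # picks node 0 regardless of errors
dynamic_choice = try_once_discard(errors)      # picks the node with the largest |error|
after_dynamic = transmit(errors, dynamic_choice)
```

The static rule ignores the error values entirely, whereas the dynamic rule spends the next transmission where the controller's information is most out of date.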

Nesic and Teel (2004a, b) generalize the above
results by considering a nonlinear NCS with
external disturbances and a more general class of
protocols (or schedulers). They considered a new
class of Lyapunov uniformly globally asymptotically
stable (UGAS) protocols in a NCS. It is
shown that if a controller designed without
taking the network into account yields input-to-state
stability (ISS) with respect to external
disturbances (not necessarily with respect to the
network-induced error), then the same controller
will achieve semi-global practical ISS for
the NCS when implemented via the network
with a Lyapunov UGAS protocol. Moreover, the
ISS gain is preserved. The adjustable parameter
with respect to which semi-global practical ISS
is achieved is the MATI between transmission
times. The authors also studied the input-output
$L_p$ stability of a NCS for a large class of network
scheduling protocols. It is shown that polling,
static protocols, and dynamic protocols such as
TOD belong to this class. Results in Nesic and
Teel (2004a) provide a unifying framework for
generating new scheduling protocols that preserve
$L_p$ stability properties of the system, if
a design parameter is chosen to be sufficiently
small. The most general version of these results
can also be used to model a NCS with data packet
dropouts. The proof technique used is based on
the small gain theorem and lends itself to an easy
interpretation.

A framework for analyzing the stability of
a general nonlinear NCS with disturbances in
the setting of $L_p$ stability was provided by Tabbara
et al. (2007). Their presentation provides
sharper results for both gain and MATI than
previously obtainable and details the property of
uniformly persistently exciting scheduling protocols.
This class of protocols was shown to
lead to stability for high enough transmission
rates. This was a natural property to demand,
especially in the design of wireless scheduling
protocols. The property is used directly in a novel
proof technique based on the notions of vector
comparison and (quasi)-monotone systems. Through
simulations and analytical and numerical comparisons,
it is verified that the uniform persistence of
excitation property of protocols is, in some sense,
the “finest” property that can be extracted from
wireless scheduling protocols.

Delays and Packet Dropouts 

Packet dropouts can be modeled as either 
stochastic or deterministic phenomena. For a 
one-channel feedback NCS, Zhang and Branicky 
(2001) consider a deterministic dropouts model, 
with packet dropouts occurring at an asymptotic 
rate. Stability conditions were studied for a NCS 
with deterministic and stochastic dropouts (Seiler 
and Sengupta 2005). 

Sometimes, the NCS is characterized as
a continuous-time delay differential equation
(DDE) with the time-varying delay $\tau(t)$. One
important advantage is that the equations are
still valid even when the delays exceed the sampling
interval. Researchers successfully used the
Lyapunov-Krasovskii (Yue et al. 2004) and the
Razumikhin theorems (Yu et al. 2004) to study
the stability of a NCS that is modeled as DDEs.

Summary and Future Directions 

This article introduced the concept of a net¬ 
worked control system and its general architec¬ 
ture. Several key issues specific to a NCS, such as 
band-limited channels, network-induced delays, 
and information packet dropouts, were explained. 
The stability condition of a NCS with various net¬ 
work effects was discussed with several common 
modeling techniques. 


In terms of future directions, there has been 
significant effort in analyzing networked control 
systems with variable sampling rate, but most 
results investigate the stability for a given worst- 
case interval between consecutive sampling 
times, leading to conservative results. An open 
area of research would be to look at methods that 
take into account a stochastic characterization for 
the inter-sampling times. Substantial work has 
also been devoted to determining the stability 
of a NCS, as described in this article. Possible 
open areas of research would be to consider 
design issues related to the joint stability and 
performance of the system. The design and 
development of controllers for a NCS is also an 
open area of research. In designing a controller 
for a NCS, one has to take into account the 
challenges introduced by the communication 
network. Only afterward can analysis of the 
whole system take place. 

Networked Control Systems: Estimation and Control Over Lossy Networks

Abstract

This entry discusses optimal estimation and con¬ 
trol for lossy networks. Conditions for stability 
are provided both for two-link and multiple-link 
networks. The online adaptation of network re¬ 
sources (controlled communication) is also con¬ 
sidered. 

Introduction

Networked control systems (NCSs) are spatially
distributed systems in which the communication
between sensors, actuators, and controllers
occurs through a shared band-limited digital
communication network. In this entry, we 
consider the problem of estimation and control 
over such networks. 

A significant difference between NCSs and 
standard digital control is the possibility that 
data may be lost while in transit through the 
network. Typically, packet dropouts result from 
transmission errors in physical network links 
(which is far more common in wireless than 
in wired networks) or from buffer overflows 
due to congestion. Long transmission delays 
sometimes result in packet reordering, which 
essentially amounts to a packet dropout if the 
receiver discards “outdated” arrivals. Reliable 
transmission protocols, such as TCP, guarantee 
the eventual delivery of packets. However, these 
protocols are not appropriate for NCSs since 
the retransmission of old data is generally not 
useful. Another important difference between 
NCSs and standard digital control systems is 
that, due to the nature of network traffic, delays 
in the control loop may be time varying and 
nondeterministic. 

In this entry, we concentrate on the problem of 
control and estimation in the presence of packet 
losses, leaving other important features of NCSs 
(such as quantization and random delays) to be 
addressed in other entries of this encyclopedia. 
Consequently, we assume that the network can be 
viewed as a channel that can carry real numbers 
without distortion, but that some of the messages 
may be lost. This network model is appropriate 
when the number of bits in each data packet is 
sufficiently large so that quantization effects can 
be ignored, but packet dropouts cannot. For more 
general channel models, see, for example, Imer 
and Basar (2005). 

This entry also does not address network transmission
delays explicitly. In general, network
delays have two components: one that is due to
the time spent transmitting packets and another
due to the time packets spend in buffers waiting
to be transmitted. Delays due to packet transmission
present little variation and may be modeled
as constants. For control design purposes, these
delays may be incorporated into the plant model.
Delays due to buffering depend on the network
traffic and are typically random; they can be analyzed
using the techniques developed in Antunes
et al. (2012).

Notation and Basic Definitions. Throughout
the entry, $\mathbb{R}$ stands for real numbers and $\mathbb{N}$ for
nonnegative integers. For a given matrix $A \in \mathbb{R}^{n \times n}$
and vector $x \in \mathbb{R}^n$, $\|x\| := \sqrt{x'x}$ denotes
the Euclidean norm of x, and $\lambda(A)$ the set of
eigenvalues of A. Random variables are generally
denoted in boldface. For a random variable
y, E[y] stands for the expectation of y.


Two-Link Networks 

Here, we consider a control/estimation problem 
when all network effects can be modeled using 
two erasure channels: one from the sensor to the 
controller and the other from the controller to the 
actuator (see Fig. 1). 

We restrict our attention to a linear time-invariant
(LTI) plant with intermittent observation
and control packets:

$$x_{k+1} = A x_k + \nu_k B u_k + w_k, \tag{1a}$$
$$y_k = \theta_k C x_k + v_k, \tag{1b}$$

$\forall k \in \mathbb{N}$, $x_k, w_k \in \mathbb{R}^n$, $y_k, v_k \in \mathbb{R}^p$, where
$(x_0, w_k, v_k)$ are mutually independent, zero-mean
Gaussian with covariance matrices $(P_0, R_w, R_v)$,
and $\theta_k, \nu_k \in \{0, 1\}$ are i.i.d. Bernoulli random
variables with $\Pr\{\theta_k = 1\} = \theta$ and $\Pr\{\nu_k = 1\} = \nu$.
The variable $\theta_k$ models the packet loss
between sensor and controller, whereas $\nu_k$ models
the packet loss between controller and actuator.
When there is a packet drop from controller
to actuator, we set the actuator’s output to zero.
Different strategies, such as holding the control
input, could still be modeled using (1) by augmenting
the state vector.

Networked Control Systems: Estimation and Control Over Lossy Networks, Fig. 1 Control system with two network links

The information available to the controller up
to time k is given by the information set:

$$\mathcal{I}_k = \{P_0\} \cup \{y_\ell, \theta_\ell : \ell \le k\} \cup \{\nu_\ell : \ell \le k-1\}.$$

Here, we make an important assumption that
acknowledgment packets from the actuator
are always received by the controller so that
$\nu_\ell$, $\ell \le k-1$, is available at time k to the remote
estimator.

Optimal Estimation with Remote 
Computation 

The optimal mean-square estimate of $x_k$, given
the information known to the remote estimator at
time k, is given by

$$\hat{x}_{k|k} := E[x_k \mid \mathcal{I}_k].$$

This estimate can be computed recursively using
the following time-varying Kalman filter (TVKF)
(Sinopoli et al. 2004):

$$\hat{x}_{0|-1} = 0, \tag{2a}$$
$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + \theta_k F_k (y_k - C\hat{x}_{k|k-1}), \tag{2b}$$
$$\hat{x}_{k+1|k} = A\hat{x}_{k|k} + \nu_k B u_k, \tag{2c}$$

with the gain matrix $F_k$ calculated recursively as
follows

$$F_k = P_k C'(C P_k C' + R_v)^{-1},$$
$$P_{k+1} = A P_k A' + R_w - \theta_k A F_k (C P_k C' + R_v) F_k' A'.$$

Each $P_k$ corresponds to the estimation error covariance
matrix

$$P_k = E[(x_k - \hat{x}_{k|k-1})(x_k - \hat{x}_{k|k-1})'].$$
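
One step of the recursion can be written out for a scalar plant to show how the arrival indicators enter. This is a minimal sketch that assumes all matrices reduce to scalars (`a`, `b`, `c`, `rw`, `rv`); the function name `tvkf_step` and its argument order are hypothetical.

```python
def tvkf_step(xhat_pred, P, y, theta, nu, u, a, b, c, rw, rv):
    """One step of the time-varying Kalman filter for a scalar plant.

    xhat_pred : predicted estimate of x_k given information up to k-1
    P         : error covariance of xhat_pred
    theta, nu : 1 if the measurement / control packet arrived, else 0
    Returns (filtered estimate, next prediction, next covariance).
    """
    gain = P * c / (c * P * c + rv)
    # Measurement update is applied only when the sensor packet arrives.
    xhat = xhat_pred + theta * gain * (y - c * xhat_pred)
    # The control acts on the plant only when the actuator packet arrives.
    xhat_next = a * xhat + nu * b * u
    # Covariance shrinks only on steps with a received measurement.
    P_next = a * P * a + rw - theta * a * gain * (c * P * c + rv) * gain * a
    return xhat, xhat_next, P_next
```

When `theta` is 0 the covariance update reduces to the open-loop prediction a*P*a + rw, which is what allows the covariance to grow during bursts of lost measurements.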

For this estimator, there exists a critical
value $\theta_c$ for the packet arrival probability $\theta$, below which
the expected estimation error covariance becomes
unbounded:

Theorem 1 (Sinopoli et al. 2004) Assume that
$(A, R_w^{1/2})$ is controllable, $(A, C)$ is observable,
and A is unstable. Then there exists a critical
value $\theta_c \in (0, 1]$ such that

$$E[P_k] \le M, \ \forall k \in \mathbb{N} \iff \theta > \theta_c,$$

where M is a positive definite matrix that may
depend on $P_0$. Furthermore, the critical value $\theta_c$
satisfies $\theta_{\min} \le \theta_c \le \theta_{\max}$, where the lower bound
is given by

$$\theta_{\min} = 1 - \frac{1}{(\max\{|\lambda(A)|\})^2}, \tag{3}$$

and the upper bound is given by the solution to
the following (quasi-convex) optimization problem:

$$\theta_{\max} = \min\{\theta \ge 0 : \Psi_\theta(Y, Z) > 0,\ 0 \le Y \le I \text{ for some } Y, Z\},$$

where

$$\Psi_\theta(Y, Z) = \begin{bmatrix} Y & \sqrt{\theta}\,(YA + ZC) & \sqrt{1-\theta}\,YA \\ \sqrt{\theta}\,(A'Y + C'Z') & Y & 0 \\ \sqrt{1-\theta}\,A'Y & 0 & Y \end{bmatrix}.$$

Remark 1 In some special cases, the lower
bound in (3) is tight in the sense that $\theta_c = \theta_{\min}$.
The largest class of systems known for which
this occurs is that of nondegenerate systems
defined in Mo and Sinopoli (2012). Examples of
systems in this class include (1) those for which
the matrix C is invertible and (2) those with a
detectable pair (A, C) and such that the matrix
A is diagonalizable with unstable eigenvalues
having distinct absolute values.
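
The lower bound in (3) depends only on the largest eigenvalue magnitude of A, so it is easy to evaluate once the eigenvalues are known. The sketch below assumes the eigenvalues are supplied by the caller (computing them is outside its scope); the function name `theta_min` is hypothetical.

```python
def theta_min(eigenvalues):
    """Lower bound (3) on the critical arrival probability.

    eigenvalues : iterable of real or complex eigenvalues of A
    """
    rho = max(abs(ev) for ev in eigenvalues)
    if rho < 1.0:
        raise ValueError("the bound is stated for an unstable A")
    return 1.0 - 1.0 / rho ** 2


# Example: a plant whose fastest unstable mode has magnitude 2.
bound = theta_min([2.0, 0.5])
```

For the example plant, measurements must arrive with probability greater than this bound for the expected error covariance to have any chance of remaining bounded.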

Optimal Control with Remote
Computation

From a control perspective, one may also be
interested in finding control sequences $\{u_k\}$,
as functions of the information set $\mathcal{I}_k$,
which minimize cost functions of the form

$$J = \lim_{N \to \infty} \frac{1}{N} E\left[\sum_{k=0}^{N-1} \left(x_k' W x_k + \nu_k u_k' U u_k\right) \,\middle|\, \mathcal{I}_k\right].$$

Theorem 2 (Schenato et al. 2007) Assume that
$(A, B)$ and $(A, R_w^{1/2})$ are controllable, $(A, C)$
and $(A, W^{1/2})$ are observable, and A is unstable.
Then, finite control costs J are achievable if
and only if $\theta > \theta_c$ and $\nu > \nu_c$, where the
critical value $\nu_c$ is given by the (quasi-convex)
optimization problem

$$\nu_c = \min\{\nu \ge 0 : \Psi_\nu(Y, Z) > 0,\ 0 \le Y \le I \text{ for some } Y, Z\},$$

where

$$\Psi_\nu(Y, Z) = \begin{bmatrix} Y & Y & \sqrt{\nu}\,Z U^{1/2} & \sqrt{\nu}\,(YA' + ZB') & \sqrt{1-\nu}\,YA' \\ Y & W^{-1} & 0 & 0 & 0 \\ \sqrt{\nu}\,U^{1/2} Z' & 0 & I & 0 & 0 \\ \sqrt{\nu}\,(AY + BZ') & 0 & 0 & Y & 0 \\ \sqrt{1-\nu}\,AY & 0 & 0 & 0 & Y \end{bmatrix}.$$

Moreover, under the above conditions, the
separation principle holds in the sense that the
optimal control is given by

$$u_k = -(B'SB + U)^{-1} B'SA\,\hat{x}_{k|k},$$

where $\hat{x}_{k|k}$ is an optimal state estimate given
by (2) and the matrix S is the solution to the
modified algebraic Riccati equation (MARE)

$$S = A'SA + W - \nu A'SB(B'SB + U)^{-1}B'SA.$$

Solutions to the MARE may be obtained iteratively
when $\nu > \nu_c$.
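
The iterative solution of the MARE can be sketched for a scalar plant by repeatedly applying the right-hand side as a fixed-point map. This is an illustration under stated assumptions rather than a general solver: all quantities are scalars, the iteration starts from S = W, and the names `solve_scalar_mare`, `tol`, and `blowup` are hypothetical.

```python
def solve_scalar_mare(a, b, w, u, nu, tol=1e-12, max_iter=10000, blowup=1e12):
    """Iterate S <- a*S*a + w - nu*(a*S*b)^2 / (b*S*b + u) for scalars.

    Returns the fixed point, or None if the iteration grows past `blowup`
    or fails to settle within `max_iter` steps.
    """
    s = w
    for _ in range(max_iter):
        s_next = a * a * s + w - nu * (a * s * b) ** 2 / (b * b * s + u)
        if s_next > blowup:
            return None  # iteration diverges for this arrival probability
        if abs(s_next - s) <= tol * max(1.0, abs(s)):
            return s_next
        s = s_next
    return None


# Unstable scalar plant (a = 2) with two different control arrival probabilities.
s_high = solve_scalar_mare(2.0, 1.0, 1.0, 1.0, nu=0.9)   # settles to a finite value
s_low = solve_scalar_mare(2.0, 1.0, 1.0, 1.0, nu=0.5)    # iteration blows up
```

In this toy example the iteration settles when control packets arrive often enough and diverges when they do not, which mirrors the role of the critical value in the theorem.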

Estimation with Local Computation 

To reduce the gap between the bounds $\theta_{\min}$ and
$\theta_{\max}$ on the critical value $\theta_c$
in Theorem 1 and to allow for larger probabilities
of drop, one may choose to compute state
estimates at the sensor and transmit those to
the controller/actuator. This scheme is motivated
by the growing number of smart sensors with
embedded processing units that are capable of
local computation. For the LTI plant

$$x_{k+1} = A x_k + B u_k + w_k,$$
$$y_k = C x_k + v_k,$$

the smart sensor can compute locally an optimal
state estimate using a standard stationary
Kalman filter and transmit this estimate to the
controller. We model packet dropouts as before
using the process $\theta_k$ and assume that the process
$\theta_k$ is known to the smart sensor by means of a
perfect acknowledgment mechanism. This allows
the sensor to know exactly and to use it in the 
Kalman filter. 

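Let $\tilde{x}_{k|k} = E[x_k \mid y_\ell, \theta_\ell, \ell \le k]$ denote the local
estimates transmitted by the sensor. Using the
messages successfully received up to time k, the
remote estimator computes the optimal estimate

$$\hat{x}_{k|k-1} = E[x_k \mid \theta_\ell, \tilde{x}_{\ell|\ell}, \ell \le k-1]$$

recursively by

$$\hat{x}_{0|-1} = 0,$$
$$\hat{x}_{k|k} = (1 - \theta_k)\,\hat{x}_{k|k-1} + \theta_k\,\tilde{x}_{k|k}, \quad k \in \mathbb{N},$$
$$\hat{x}_{k+1|k} = A\hat{x}_{k|k} + B u_k.$$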
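
The remote recursion amounts to adopting the sensor's estimate whenever its packet arrives and otherwise keeping the prediction. A minimal scalar sketch follows; the function name `remote_update` and the use of `None` for a lost packet are assumptions made for illustration.

```python
def remote_update(xhat_pred, local_estimate, received, a, b, u):
    """One step of the remote estimator for a scalar plant.

    xhat_pred      : remote prediction of x_k from information up to k-1
    local_estimate : sensor's estimate of x_k (ignored if the packet was lost)
    received       : True if the sensor packet arrived at time k
    Returns (estimate at time k, prediction for time k+1).
    """
    xhat = local_estimate if received else xhat_pred
    return xhat, a * xhat + b * u
```

No gain computation is needed at the remote side because the sensor has already done the filtering.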

Notice that now we are applying the TVKF
to estimate $x_k$, which is fully observable. Since
$\theta_{\min}$ and $\theta_{\max}$ in Theorem 1 are equal for fully
observable processes (Schenato et al. 2007), the
local computation scheme grants a minimal critical
value $\theta_c$ as stated in the theorem below.

Theorem 3 Assume that $(A, R_w^{1/2})$ is controllable,
$(A, C)$ is observable, and A is unstable.
Then the critical value $\theta_c$ is given by $\theta_{\min}$ in (3),
i.e.,

$$E[P_k] \le M, \ \forall k \in \mathbb{N} \iff \theta > \theta_{\min},$$

where M is a positive definite matrix that may
depend on $P_0$.

Drops in the Acknowledgement Packets 

When there are drops in the acknowledgment
channel from the actuator to the controller, the
controller does not always know $\nu_k$, and therefore,
it might not always have access to the
control inputs that are actually applied to the
plant. In this case, the posterior state probability
becomes a Gaussian mixture distribution
with infinitely many components, and the separation
principle no longer holds (Schenato et al.
2007). This makes the estimation and control
problems computationally more difficult, and,
due to the smaller information set, some degradation
in the control performance
should be expected. For this reason, it is generally
a good design choice to keep controller and actua¬ 
tor collocated when drops in the acknowledgment 
channels are significant. 

Buffering 

As an alternative to the approach described in
section “Estimation with Local Computation” to
use local computation at a smart sensor to allow
for larger probabilities of drop, the designer may
also consider the transmission of a sequence
of previous measurements $y_k, y_{k-1}, \ldots, y_{k-N}$
in each packet. This approach is motivated
by the fact that often data packets can carry
much more than one vector of measured outputs.
When N is reasonably large, one should expect
similar estimation/control performances as in
the approach described in section “Estimation
with Local Computation”, but with a reduced
computational effort at the sensor.

Analogously, an improvement to zeroing or
simply holding the control input in case of
packet drops between controller and actuator
is for the controller to transmit a control
sequence $u_k, u_{k+1}, \ldots, u_{k+N}$ that contains not
only the control to be used at the current
time instant but also a few future controls
$u_{k+1}, u_{k+2}, \ldots, u_{k+N}$. In the case of packet
drops between controller and actuator, the
actuator can use previously received “future”
control inputs in lieu of the one contained in the
lost packet. The sequence of future control inputs
may be obtained, e.g., by an optimal receding
horizon control strategy (Gupta et al. 2006).
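
The actuator-side buffering can be sketched as follows. The class name `BufferedActuator` is hypothetical, a lost packet is represented by `None`, and falling back to zero once the buffer is exhausted is an assumption chosen to match the zeroing strategy mentioned earlier.

```python
class BufferedActuator:
    """Applies buffered 'future' controls when the current packet is lost."""

    def __init__(self):
        self._buffer = []   # remaining controls from the last received packet

    def step(self, packet):
        """Return the control to apply at this time instant.

        packet : list [u_k, u_{k+1}, ..., u_{k+N}] if received, or None if lost
        """
        if packet is not None:
            # A fresh packet replaces whatever was left of the old sequence.
            self._buffer = list(packet)
        if self._buffer:
            return self._buffer.pop(0)
        return 0.0  # buffer exhausted: fall back to zeroing the input
```

A longer horizon N lets the actuator ride through longer bursts of losses, at the price of larger packets and of applying controls that were computed from older information.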

Estimation with Markovian Drops 

When $\theta_k$ is a Markov process, we no longer
have a separation principle, and the optimal controller
may depend on the drop sequence. Yet,
optimal state estimates are obtained using the
same TVKF presented earlier. Below, we give
conditions for the stability of the error covariance
when drops are governed by the Gilbert-Elliot
model: $\Pr\{\theta_{k+1} = j \mid \theta_k = i\} = p_{ij}$, $i, j \in \{0, 1\}$.

Theorem 4 (Mo and Sinopoli 2012) Assume
that $(A, R_w^{1/2})$ is controllable, $A$ is unstable, and
the system given by the pair $(A, C)$ is nondegenerate
as discussed in Remark 1. Moreover,
suppose that the transition probabilities for the
Gilbert-Elliot model satisfy $p_{01}, p_{10} > 0$. Then
the expected error covariance $\mathbb{E}[P_k]$ is uniformly
bounded if

\[ p_{01} > p_{\min} \]

and it is unbounded for some initial condition if
$p_{01} < p_{\min}$.
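
As a rough illustration of the Gilbert-Elliot model, the following Python sketch samples a two-state Markov drop sequence and computes its stationary distribution. The function names and the seeded generator are assumptions made for this example only; the meaning of the two states (received versus dropped) follows whatever convention is adopted for $\theta_k$ elsewhere in this entry.

```python
import random


def stationary_distribution(p01, p10):
    """Stationary probabilities (pi_0, pi_1) of a two-state chain with
    Pr{theta_{k+1}=1 | theta_k=0} = p01 and Pr{theta_{k+1}=0 | theta_k=1} = p10."""
    total = p01 + p10
    return (p10 / total, p01 / total)


def simulate_drops(p01, p10, steps, theta0=1, seed=0):
    """Sample theta_0, ..., theta_{steps-1} from the Gilbert-Elliot chain."""
    rng = random.Random(seed)   # seeded only to make the sketch reproducible
    theta = theta0
    sequence = []
    for _ in range(steps):
        sequence.append(theta)
        if theta == 0:
            theta = 1 if rng.random() < p01 else 0
        else:
            theta = 0 if rng.random() < p10 else 1
    return sequence


pi0, pi1 = stationary_distribution(0.3, 0.1)
samples = simulate_drops(0.3, 0.1, steps=50000)
empirical_pi1 = sum(samples) / len(samples)
```

Unlike Bernoulli drops, consecutive values of the sampled sequence are correlated, which is why the condition in Theorem 4 involves a transition probability rather than only the long-run fraction of time spent in each state.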

Networks with Multiple Links 

We now consider feedback loops that are closed 
over a network of communication links, each of 
which drops packets according to a Bernoulli 
process. The sensor communicates with a controller
across the network, and we assume that
controller and actuator are collocated. The network
may be represented by a graph $\mathcal{G}$ with nodes
in the set $V$ and edges in the set $\mathcal{E}$, where edges
are drawn between two communicating nodes.
We denote by $p_{ij}$ the probability of a drop when
node $i$ transmits to node $j$. Drops are assumed to
be independent across links and time.

To maximize robustness with respect to drops, 
sensors use a Kalman filter to compute an optimal 
estimate for the state of the process based on 
their measurements and transmit this estimate 
across the network. When the sensors do not 
have access to the process input, they can take 
advantage of the linearity of the Kalman filter: 
as the output of a Kalman filter is the sum of a 
term due to measurements with another term due 
to control inputs, sensors may compute only the 
contribution due to measurements and transmit 
it to the controller, which can subsequently add 
the contribution due to the control inputs. This 
guarantees that optimal state estimates can still 
be computed at the control node, even when the 
sensors do not know the control input (Gupta 
et al. 2009). 

The communication in the network goes as
follows. Sensors time stamp their estimates and
broadcast them to all nodes in their communication
ranges. After receiving information from
their neighbors, nodes compare time stamps and
keep only the most recent estimates. These estimates
are then broadcast to all neighboring nodes.
When the controller receives new information,
the optimal Kalman estimate is reconstructed,
taking into account the total transmission delay
(learned from the packet time stamps), and a
standard LQG control can be used (Gupta et al.
2009).

To determine whether or not this procedure
results in a stable closed loop, one defines a cut
$C = (S, T)$ to be a partition of the node set
$V$ such that the sensor node is in $S$ and the
controller node is in $T$. The cut-set is then defined
as the set of edges $(i, j) \in \mathcal{E}$ such that $i \in S$
and $j \in T$, i.e., the set of edges that connect
the sets $S$ and $T$. The max-cut probability is then
defined as

\[ p_{\text{max-cut}} = \max_{\text{all cuts } (S,T)} \; \prod_{(i,j) \in S \times T} p_{ij}. \]

The above maximization can be rewritten as a
minimization over the sums of $-\log p_{ij}$, which
leads to a linear program known as the minimum
cut problem in network optimization theory
(Cook 1995).

Theorem 5 (Gupta et al. 2009) Assume that
$R_w, R_v > 0$, that $(A, B)$ is stabilizable, that
$(A, C)$ is observable, and that $A$ is unstable. Then
the control and communication policy described
above is optimal for quadratic costs, and the
expected state covariance is bounded if and only
if

\[ p_{\text{max-cut}} \, \big( \max\{|\lambda(A)|\} \big)^2 < 1. \]
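
For small networks, the max-cut probability can be evaluated by simply enumerating all cuts, as in the Python sketch below. This is a brute-force illustration rather than the minimum-cut linear program mentioned above. The function names, the dictionary of directed link drop probabilities, and the four-node example are assumptions of the sketch, and the stability check reads the condition of Theorem 5 with the spectral radius squared.

```python
import math
from itertools import combinations


def max_cut_probability(nodes, drop_prob, sensor, controller):
    """Maximize, over all cuts (S, T) with sensor in S and controller in T,
    the product of the drop probabilities of the links going from S to T."""
    others = [n for n in nodes if n not in (sensor, controller)]
    best = 0.0
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            S = {sensor, *extra}
            cut = [p for (i, j), p in drop_prob.items() if i in S and j not in S]
            best = max(best, math.prod(cut))   # an empty cut-set gives product 1
    return best


def covariance_bounded(p_max_cut, eigenvalues):
    """Check p_max_cut * (max |lambda(A)|)^2 < 1 for the given eigenvalues of A."""
    rho = max(abs(lam) for lam in eigenvalues)
    return p_max_cut * rho ** 2 < 1


# Hypothetical network: sensor "s", relays "a" and "b", controller "c".
links = {("s", "a"): 0.1, ("s", "b"): 0.2, ("a", "c"): 0.5, ("b", "c"): 0.4}
p_mc = max_cut_probability(["s", "a", "b", "c"], links, "s", "c")
stable = covariance_bounded(p_mc, [2.0, 0.5])
```

In this example the worst cut separates the controller from both relays, so the two links into the controller form the bottleneck of the network. The enumeration grows exponentially with the number of nodes, which is the reason the minimum-cut formulation is preferred in practice.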

Estimation with Controlled 
Communication 

To actively reduce network traffic and power
consumption, sensor measurements may not be
sent to the remote estimator at every time step. In
addition, one may have the ability to somewhat
control the probability of packet drops by varying
the transmit power or by transmitting copies of
the same message through multiple channel realizations.
This is known as controlled communication,
and it allows the designer to establish a
trade-off between communication and estimation
performance.

We consider the local estimation scenario described
in section “Estimation with Local Computation”
with the difference that the Bernoulli
drops are now modulated as follows

\[ \theta_k = \begin{cases} 1 & \text{with prob. } \Lambda_k \\ 0 & \text{with prob. } 1 - \Lambda_k \end{cases} \]

where the sensor is free to choose $\Lambda_k \in [0, p_{\max}]$
as a function of the information available up to
time $k$. With its choice, the sensor incurs a
communication cost $c(\Lambda_k)$ at time $k$, where $c(\cdot)$
is some increasing function that may represent,
for example, the energy needed in order to transmit
with a probability of drop equal to $\Lambda_k$. Note
that transmission scheduling, where $\Lambda_k$ is either
0 or $p_{\max}$, is a special case of this framework.

In order to choose $\Lambda_k$, the sensor considers
the estimation error $e_k$ between
the local and the remote estimators. This error
evolves according to

\[ e_{k+1} = \begin{cases} d_k & \text{with prob. } \Lambda_k \\ A e_k + d_k & \text{with prob. } 1 - \Lambda_k \end{cases} \]

where $d_k$ is the innovations process arising from
the standard Kalman filter in the smart sensor.

Our objective is to find a “communication
policy” that minimizes the long-term average cost

\[ J := \lim_{K \to \infty} \frac{1}{K} \, \mathbb{E}\left[ \sum_{k=0}^{K-1} \|e_k\|^2 + \lambda c(\Lambda_k) \right], \qquad \lambda > 0, \tag{4} \]

which penalizes a linear combination of the remote
estimation error variance $\mathbb{E}[\|e_k\|^2]$ and
the average communication cost $\mathbb{E}[c(\Lambda_k)]$. In
this context, a communication policy should be
understood as a rule that selects $\Lambda_k$ as a function
of the information available to the sensor.

When

\[ (1 - p_{\max}) \max\{|\lambda(A)|\}^2 < 1, \]

there exists an optimal communication policy that
chooses $\Lambda_k$ as a function of $e_k$, which may be
computed via dynamic programming and value
iteration (Mesquita et al. 2012). While this procedure
can be computationally difficult, it is often
possible to obtain suboptimal but reasonable
performance with rollout policies such as the
following one:

\[ \Lambda_k = \arg\min_{\Lambda \in [0, p_{\max}]} \left[ (p_{\max} - \Lambda) \, e_k' A' H A e_k + \lambda c(\Lambda) \right] \tag{5} \]

where $H$ is the positive semidefinite solution to
the Lyapunov equation $(1 - p_{\max}) A' H A - H = -I$
(Mesquita et al. 2012).

When computing $e_k$ and $\Lambda_k$ in (5) is computationally
too costly for the sensor, one may
prefer to make $\Lambda_k$ a function of the number
of consecutive dropped packets $\ell_k$. In this case,
minimizing $J$ in (4) is equivalent to minimizing
the cost

\[ J := \lim_{K \to \infty} \frac{1}{K} \, \mathbb{E}\left[ \sum_{k=0}^{K-1} \operatorname{trace}(\Sigma_{\ell_k}) + \lambda c(\Lambda_k) \right] \]

where

\[ \Sigma_{\ell} := \sum_{m=0}^{\ell} A'^{m} R_d A^{m}. \]

Since $\ell_k$ belongs to a countable set, one can very
efficiently solve this optimization using dynamic
programming (Mesquita et al. 2012).
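
A minimal Python sketch of the rollout policy (5) is given below. It is a sketch under stated assumptions: the Lyapunov equation is solved by a plain fixed-point iteration, which converges when $(1 - p_{\max}) \max\{|\lambda(A)|\}^2 < 1$, the minimization over $[0, p_{\max}]$ is done on a uniform grid, and the helper names, the linear cost, and the scalar example are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def solve_lyapunov(A, p_max, iters=2000):
    """Iterate H <- I + (1 - p_max) A' H A, whose fixed point solves
    (1 - p_max) A' H A - H = -I."""
    n = len(A)
    eye = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    At = transpose(A)
    H = [row[:] for row in eye]
    for _ in range(iters):
        AHA = matmul(matmul(At, H), A)
        H = [[eye[i][j] + (1.0 - p_max) * AHA[i][j] for j in range(n)] for i in range(n)]
    return H


def rollout_probability(e, A, H, p_max, weight, cost, grid=101):
    """Grid search for the minimizer of (p_max - L) e'A'HAe + weight * cost(L)."""
    Ae = [sum(A[i][j] * e[j] for j in range(len(e))) for i in range(len(A))]
    q = sum(Ae[i] * H[i][j] * Ae[j] for i in range(len(Ae)) for j in range(len(Ae)))
    candidates = [p_max * g / (grid - 1) for g in range(grid)]
    return min(candidates, key=lambda L: (p_max - L) * q + weight * cost(L))


A = [[1.2]]
H = solve_lyapunov(A, p_max=0.5)
quiet = rollout_probability([0.0], A, H, p_max=0.5, weight=1.0, cost=lambda L: L)
urgent = rollout_probability([10.0], A, H, p_max=0.5, weight=1.0, cost=lambda L: L)
```

With a linear cost the objective is linear in the candidate probability, so the policy switches between 0 and $p_{\max}$ depending on the size of the error, which recovers the transmission-scheduling special case.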

Summary and Future Directions 

Most positive results in the subject rely on the 
assumption of perfect acknowledgments and on 
actuators and controllers being collocated. Future 
research should address ways of circumventing 
these assumptions. 

Cross-References 

► Data Rate of Nonlinear Control Systems and 
Feedback Entropy 

► Information and Communication Complexity 
of Networked Control Systems 

► Networked Control Systems: Architecture and 
Stability Issues 

► Networked Systems 

► Quantized Control and Data Rate Constraints 
Abstract

This entry provides a brief overview on networked
systems from a systems and control perspective.
We pay special attention to the nature
of the interactions among agents; the critical
role played by information sharing, dissemination,
and aggregation; and the distributed control
paradigm to engineer the behavior of networked
systems.


Keywords 
Introduction

Networked systems appear in numerous scientific 
and engineering domains, including communication
networks (Toh 2001), multi-robot networks
(Arkin 1998; Balch and Parker 2002), sensor 
networks (Santi 2005; Schenato et al. 2007), 
water irrigation networks (Cantoni et al. 2007), 
power and electrical networks (Chow 1982; 
Chiang et al. 1995; Dorfler et al. 2013), camera 
networks (Song et al. 2011), transportation 
networks (Ahuja et al. 1993), social networks 
(Jackson 2010), and chemical and biological 
networks (Kuramoto 1984; Strogatz 2003). 
Their applications are pervasive, ranging from 
environmental monitoring, ocean sampling, 
and marine energy systems, through search 
and rescue missions, high-stress deployment in 
disaster recovery, health monitoring of critical 
infrastructure to science imaging, the smart grid, 
and cybersecurity. 

The rich nature of networked systems makes
it difficult to provide a definition that, at the
same time, is comprehensive enough to capture
their variety and simple enough to be expressive
of their main features. With this in mind, we
loosely define a networked system as a “system of
systems,” i.e., a collection of agents that interact
with each other. These groups might be heterogeneous,
composed of human, biological, or engineered
agents possessing different capabilities
regarding mobility, sensing, actuation, communication,
and computation. Individuals may have
objectives of their own or may share a common
objective with others, which in turn might be adversarial
with respect to another subset of agents.

In a networked system, the evolutions of the
states of individual agents are coupled. Coupling
might be the result of the physical interconnection
among the agents, the consequence of the implementation
of coordination algorithms where
agents use information about each other, or a
combination of both. There is diversity too in the
nature of agents themselves and the interactions
among them, which might be cooperative, adversarial,
or belong to the rich range between the
two. Due to changes in the state of the agents, the
network, or the environment, interactions among
agents may be changing and dynamic. Such interactions
may be structured across different layers,
which themselves might be organized in a hierarchical
fashion. Networked systems may also
interact with external entities that specify high-level
commands that trickle down through the
system all the way to the agent level.

A defining characteristic of a networked system
is the fact that information, understood in a
broad sense, is sparse and distributed across the
agents. As such, different individuals have access
to information of varying degrees of quality.
As part of the operation of the networked system,
mechanisms are in place to share, transmit,
and/or aggregate this information. Some information
may be disseminated throughout the whole
network or, in some cases, all information can
be made centrally available at a reasonable cost.
In other scenarios, however, the latter might turn
out to be too costly, infeasible, or undesirable
because of privacy and security considerations.
Individual agents are the basic unit for decision
making, but decisions might be made from intermediate
levels of the networked system all the
way to a central planner. The combination of information
availability and decision-making capabilities
gives rise to an ample spectrum of possibilities
between the centralized control paradigm,
where all information is available at a central
planner who makes the decisions, and the fully
distributed control paradigm, where individual
agents only have access to the information shared
by their neighbors in addition to their own.

Perspective from Systems 
and Control 

There are many aspects that come into play when 
dealing with networked systems regarding computation,
processing, sensing, communication,
planning, motion control, and decision making. 
This complexity makes their study challenging 
and fascinating and explains the interest that, 
with different emphases, they generate in a large 
number of disciplines. In biology, scientists 
analyze synchronization phenomena and self-organized
swarming behavior in groups with
distributed agent-to-agent interactions (Okubo
1986; Parrish et al. 2002; Conradt and Roper 
2003; Couzin et al. 2005). In robotics, engineers 
design algorithmic solutions to help multivehicle 
networks and embedded systems coordinate 
their actions and perform challenging spatially 
distributed tasks (Arkin 1998; Committee on 
Networked Systems of Embedded Computers 
2001; Balch and Parker 2002; Howard et al. 2006;
Kumar et al. 2008). Graph theorists and applied 
mathematicians study the role played by the 
interconnection among agents in the emergence 
of phase transition phenomena (Bollobas 2001; 
Meester and Roy 2008; Chung 2010). This 
interest is also shared in communication and 
information theory, where researchers strive to 
design efficient communication protocols and 
examine the effect of topology control on group 
connectivity and information dissemination 
(Zhao and Guibas 2004; Giridhar and Kumar 
2005; Lloyd et al. 2005; Santi 2005; Franceschetti
and Meester 2007). Game theorists study the gap 
between the performance achieved by global, 
network-wide optimizers and the configurations 
that result from selfish agents interacting locally 
in social and economic systems (Roughgarden 
2005; Nisan et al. 2007; Easley and Kleinberg 
2010; Marden and Shamma 2013). In mechanism 
design, researchers seek to align the objectives 
of individual self-interested agents with the 
overall goal of the network. Static and mobile 
networked systems and their applications to the 
study of natural phenomena in oceans (Paley 
et al. 2008; Graham and Cortes 2012; Zhang 
and Leonard 2010; Das et al. 2012; Ouimet and 
Cortes 2013), rivers (Ru and Martinez 2013; 
Tinka et al. 2013), and the environment (DeVries 
and Paley 2012) also raise exciting challenges in 
estimation theory, computational geometry, and 
spatial statistics. 

The field of systems and control brings a
comprehensive approach to the modeling, analysis,
and design of networked systems. Emphasis
is put on the understanding of the general
principles that explain how specific collective
behaviors emerge from basic interactions; the
establishment of models, abstractions, and tools
that allow us to reason rigorously about complex
interconnected systems; and the development
of systematic methodologies that help engineer 
their behavior. The ultimate goal is to establish a 
science for integrating individual components 
into complex, self-organizing networks with 
predictable behavior. To realize the “power 
of many” and expand the realm of what is 
possible to achieve beyond the individual agent 
capabilities, special care is taken to obtain 
precise guarantees on the stability properties 
of coordination algorithms, understand the 
conditions and constraints under which they 
work, and characterize their performance and 
robustness against a variety of disturbances and 
disruptions. 

Research Issues - and How the Entries 
in the Encyclopedia Address Them 

Given the key role played by agent-to-agent interactions
in networked systems, the Encyclopedia
entries ►Graphs for Modeling Networked Interactions
and ►Dynamic Graphs, Connectivity
of deal with how their nature and effect can be
modeled through graphs. This includes diverse 
aspects such as deterministic and stochastic 
interactions, static and dynamic graphs, state- 
dependent and time-dependent neighboring 
relationships, and connectivity. The importance 
of maintaining a certain level of coordination 
and consistency across the networked system is 
manifested in the various entries that deal with 
coordination tasks that are, in some way or another,
related to some form of agreement. These
include consensus (►Averaging Algorithms 
and Consensus), formation control (►Vehicular 
Chains), cohesiveness, flocking (►Flocking 
in Networked Systems), synchronization 
(►Oscillator Synchronization), and distributed 
optimization (►Distributed Optimization). A 
great deal of work (e.g., see ► Optimal Deployment
and Spatial Coverage and ► Multi-vehicle
Routing) is also devoted to the design of
cooperative strategies that achieve spatially 
distributed tasks such as optimal coverage, space 
partitioning, vehicle routing, and servicing. These 
entries explore the optimal placement of agents, 


the optimal tuning of sensors, and the distributed 
optimization of network resources. The entry 
► Estimation and Control over Networks explores 
the impact that communication channels may 
have on the execution of estimation and control 
tasks over networks of sensors and actuators. 
A strong point of commonality among the 
contributions is the precise characterization of the 
scalability of coordination algorithms, together 
with the rigorous analysis of their correctness 
and stability properties. Another focal point is 
the analysis of the performance gap between 
centralized and distributed approaches in regard 
to the ultimate network objective. 

Further information about other relevant 
aspects of networked systems can be found 
throughout this Encyclopedia. Among these, we 
highlight the synthesis of cooperative strategies 
for data fusion, distributed estimation, and 
adaptive sampling, the analysis of the network 
operation under communication constraints (e.g., 
limited bandwidth, message drops, delays, and 
quantization), the treatment of game-theoretic 
scenarios that involve interactions among 
multiple players and where security concerns 
might be involved, distributed model predictive 
control, and the handling of uncertainty, 
imprecise information, and events via discrete- 
event systems and triggered control. 

Summary and Future Directions 

In conclusion, this entry has illustrated ways
in which systems and control can help us design
and analyze networked systems. We have
focused on the role that information and agent
interconnection play in shaping their behavior.
We have also emphasized the increasingly
rich set of methods and techniques that allow
one to provide correctness and performance guarantees.
The field of networked systems is vast
and the amount of work impossible to survey in
this brief entry. The reader is invited to further
explore additional topics beyond the ones mentioned
here. The monographs (Ren and Beard
2008; Bullo et al. 2009; Mesbahi and Egerstedt
2010; Alpcan and Başar 2010), edited
volumes (Kumar et al. 2004; Shamma 2008;
Saligrama 2008), and manuscripts (Olfati-Saber
et al. 2007; Baillieul and Antsaklis 2007; Leonard
et al. 2007; Kim and Kumar 2012), together
with the references provided in the Encyclopedia
entries mentioned above, are a good starting point
to undertake this enjoyable effort. Given the big
impact that networked systems have, and will
continue to have, in our society, from energy and
transportation, through human interaction and
healthcare, to biology and the environment, there
is no doubt that the coming years will witness
the development of more tools, abstractions, and
models that allow one to reason rigorously about
intelligent networks and of techniques that help
design truly autonomous and adaptive networks.
Abstract

There has been great interest recently in “universal
model-free controllers” that do not need
a mathematical model of the controlled plant,
but mimic the functions of biological processes
to learn about the systems they are controlling
online, so that performance improves automatically.
Neural network (NN) control has had two
major thrusts: approximate dynamic programming,
which uses NN to approximately solve the
optimal control problem, and NN in closed-loop
feedback control.

Keywords

Adaptive control; Learning systems; Neural networks;
Optimal control; Reinforcement learning

Neural Feedback Control

The objective is to design NN feedback controllers
that cause a system to follow, or track,
a prescribed trajectory or path. Consider the dynamics
of an $n$-link robot manipulator

\[ M(q)\ddot{q} + V_m(q, \dot{q})\dot{q} + G(q) + F(\dot{q}) + \tau_d = \tau \tag{1} \]

with $q(t) \in \mathbb{R}^n$ the joint variable vector, $M(q)$
an inertia matrix, $V_m$ a centripetal/Coriolis matrix,
$G(q)$ a gravity vector, and $F(\cdot)$ representing
friction terms. Bounded unknown disturbances
and modeling errors are denoted by $\tau_d$, and the
control input torque is $\tau(t)$. The sliding mode
control approach (Slotine and Li 1987) can be
generalized to NN control systems. Given a desired
trajectory $q_d \in \mathbb{R}^n$, define the tracking error
$e(t) = q_d(t) - q(t)$ and the sliding variable error
$r = \dot{e} + \Lambda e$ with $\Lambda = \Lambda^T > 0$. Define the
nonlinear robot function

\[ f(x) = M(q)(\ddot{q}_d + \Lambda \dot{e}) + V_m(q, \dot{q})(\dot{q}_d + \Lambda e) + G(q) + F(\dot{q}), \]

with the known vector $x(t)$ of measured signals
selected as $x = [e^T \; \dot{e}^T \; q_d^T \; \dot{q}_d^T \; \ddot{q}_d^T]^T$.

NN Controller for Continuous-Time
Systems

The NN controller is designed based on functional
approximation properties of NN as shown
in Lewis et al. (1999). Thus, assume that $f(x)$
can be approximated by $\hat{f}(x) = \hat{W}^T \sigma(\hat{V}^T x)$
with $\hat{V}, \hat{W}$ the estimated NN weights. Select the
control input $\tau = \hat{W}^T \sigma(\hat{V}^T x) + K_v r - v$ with
$K_v$ a symmetric positive definite gain and $v(t)$
a robustifying function. This NN control structure
is shown in Fig. 1. The outer proportional-derivative
(PD) tracking loop guarantees robust
behavior. The inner loop containing the NN is
known as a feedback linearization loop, and the
NN effectively learns the unknown dynamics
online to cancel the nonlinearities of the system.
Let the estimated sigmoid Jacobian be
$\hat{\sigma}' = \left. \frac{d\sigma(z)}{dz} \right|_{z = \hat{V}^T x}$.
Then, the NN weight tuning laws are
provided by

\[ \dot{\hat{W}} = F \hat{\sigma} r^T - F \hat{\sigma}' \hat{V}^T x r^T - \kappa F \|r\| \hat{W}, \]

\[ \dot{\hat{V}} = G x \left( \hat{\sigma}'^T \hat{W} r \right)^T - \kappa G \|r\| \hat{V}, \]

with any constant symmetric matrices $F, G > 0$,
and scalar tuning parameter $\kappa > 0$.

NN Controller for Discrete-Time Systems

Most feedback controllers today are implemented
on digital computers. This requires the specification
of control algorithms in discrete time or
digital form (Lewis et al. 1999). To design such
controllers, one may consider the discrete-time
dynamics $x_{k+1} = f(x_k) + g(x_k) u_k$ with unknown
functions $f(\cdot), g(\cdot)$. The digital NN controller
derived in this situation has the form of a
feedback linearization controller shown in Fig. 1.
One can derive tuning algorithms, for a discrete-time
neural network controller with $L$ layers, that
guarantee system stability and robustness (Lewis
et al. 1999). For the $i$-th layer, the weight updates
are of the form

\[ \hat{W}_i(k+1) = \hat{W}_i(k) - \alpha_i \hat{\phi}_i(k) \hat{y}_i^T(k) - \Gamma \left\| I - \alpha_i \hat{\phi}_i(k) \hat{\phi}_i^T(k) \right\| \hat{W}_i(k), \]

where $\hat{\phi}_i(k)$ are the output functions of layer $i$,
$0 < \Gamma < 1$ is a design parameter, and

\[ \hat{y}_i(k) = \begin{cases} \hat{W}_i^T \hat{\phi}_i(k) + K_v r(k) & \text{for } i = 1, \ldots, L-1, \\ r(k+1) & \text{for } i = L, \end{cases} \]

with $r(k)$ a filtered error.
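
To make the structure of the continuous-time controller more concrete, the Python sketch below evaluates the control law $\tau = \hat{W}^T \sigma(\hat{V}^T x) + K_v r - v$ for a single joint and performs one Euler step of a deliberately simplified tuning law. The simplifications are assumptions of this sketch and not the full laws of Lewis et al. (1999): the first-layer weights are held fixed, so the Jacobian term drops out, the gain matrix is taken as a scalar multiple of the identity, and all function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_outputs(V_hat, x):
    """sigma(V^T x); V_hat[j][h] connects input j to hidden unit h."""
    n_hidden = len(V_hat[0])
    return [sigmoid(sum(V_hat[j][h] * x[j] for j in range(len(x))))
            for h in range(n_hidden)]


def nn_torque(W_hat, V_hat, x, r, Kv, v=0.0):
    """Scalar control input: NN estimate of f(x), plus PD term Kv*r, minus robustifying term v."""
    sig = hidden_outputs(V_hat, x)
    return sum(w * s for w, s in zip(W_hat, sig)) + Kv * r - v


def tune_output_weights(W_hat, V_hat, x, r, F_gain, kappa, dt):
    """One Euler step of the simplified law dW/dt = F*sigma*r - kappa*F*|r|*W
    (first-layer weights fixed, F = F_gain * I; both are assumptions of this sketch)."""
    sig = hidden_outputs(V_hat, x)
    return [w + dt * (F_gain * s * r - kappa * F_gain * abs(r) * w)
            for w, s in zip(W_hat, sig)]


V_hat = [[0.0, 0.0]]      # one input, two hidden units
W_hat = [0.0, 0.0]        # output weights initialized at zero
tau = nn_torque(W_hat, V_hat, x=[1.0], r=0.3, Kv=2.0)
W_next = tune_output_weights(W_hat, V_hat, x=[1.0], r=1.0, F_gain=1.0, kappa=0.0, dt=0.1)
```

With zero initial weights the controller reduces to the PD term $K_v r$, and the weights only start to move once the filtered error $r$ is nonzero.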

Feedforward Neurocontroller 

Industrial, aerospace, DoD, and MEMS assembly 
systems have actuators that generally contain 
deadzone, backlash, and hysteresis. Since these 
actuator nonlinearities appear in the feedforward 
loop, the NN compensator must also appear in 
the feedforward loop. This design is significantly 
more complex than for feedback NN controllers. 
Details are given in Lewis et al. (2002). Feedforward
controllers can offset the effects of deadzone
if properly designed. It can be shown that
a NN deadzone compensator has the structure 
shown in Fig. 2. 
Neural Control and Approximate Dynamic Programming, Fig. 1 Neural network robot controller



The NN compensator consists of two NNs.
NN II is in the direct feedforward control loop,
and NN I is not directly in the control loop but
serves as an observer to estimate the (unmeasured)
applied torque $\tau(t)$. The feedback stability
and performance of the NN deadzone compensator
have been rigorously proven using nonlinear
stability proof techniques. The two NNs were each
selected as having one tunable layer, namely, the
output weights. The activation functions were set
as a basis by selecting fixed random values for
the first-layer weights. To guarantee stability, the
output weights of the inversion NN II (subscript
$i$ denotes weights and sigmoids of the inversion)
and the estimator NN I should be tuned respectively
as

\[ \dot{\hat{W}}_i = T \sigma_i(V_i^T w) r^T \hat{W}^T \sigma'(V^T u) V^T - k_1 T \|r\| \hat{W}_i - k_2 T \|r\| \|\hat{W}_i\| \hat{W}_i, \]

\[ \dot{\hat{W}} = -S \sigma'(V^T u) V^T \hat{W}_i \sigma_i(V_i^T w) r^T - k_1 S \|r\| \hat{W}, \]

with design matrices $T, S > 0$ and tuning gains
$k_1, k_2$.

Approximate Dynamic Programming
for Feedback Control

The current status of work in approximate dynamic
programming (ADP) for feedback control
is given in Lewis and Liu (2012). ADP is a
form of reinforcement learning based on an actor/critic
structure. Reinforcement learning (RL)
is a class of methods used in machine learning
to methodically modify the actions of an agent
based on observed responses from its environment
(Sutton and Barto 1998). The actor/critic
structures are RL systems that have two learning
structures: A critic network evaluates the performance
of a current action policy, and based on
that evaluation, an actor structure updates the action
policy as shown in Fig. 3. Adaptive optimal
controllers (Lewis et al. 2012b) have been proposed
by adding optimality criteria to an adaptive
controller or adding adaptive characteristics to an
optimal controller.

Optimal Adaptive Control of Discrete-Time
Nonlinear Systems

Consider a class of discrete-time systems described
by the deterministic nonlinear dynamics
in the affine state space difference equation form

\[ x_{k+1} = f(x_k) + g(x_k) u_k \tag{2} \]

with state $x_k \in \mathbb{R}^n$ and control input $u_k \in \mathbb{R}^m$.
A deterministic control policy is defined
as a function from state space to control space
$\mathbb{R}^n \to \mathbb{R}^m$. That is, for every state $x_k$, the policy
defines a control action $u_k = h(x_k)$ as a feedback
controller. Define a deterministic cost function
that yields the value function:

\[ V(x_k) = \sum_{i=k}^{\infty} \gamma^{i-k} r(x_i, u_i), \]

with $0 < \gamma < 1$ a discount factor, $Q(x_k), R > 0$,
and $u_k = h(x_k)$ a prescribed feedback control
policy. The optimal value is given by Bellman’s
optimality equation:

\[ V^*(x_k) = \min_{h(\cdot)} \big( r(x_k, h(x_k)) + \gamma V^*(x_{k+1}) \big), \]

which is the discrete-time Hamilton-Jacobi-Bellman
(HJB) equation. Two forms of RL can be
based on policy iteration (PI) and value iteration
(VI). For temporal difference learning, PI is
written as follows in terms of the deterministic
Bellman equation.

Neural Control and Approximate Dynamic Programming, Fig. 3 RL with an actor/critic structure

Algorithm 1 PI for discrete-time systems
1: procedure
2:   Given admissible policies $h_0(x_k)$
3:   while $\| V^{h_i} - V^{h_{i-1}} \| \ge \epsilon_{ac}$ do
4:     Solve for the value $V_{i+1}(x)$ using
         $V_{i+1}(x_k) = r(x_k, h_i(x_k)) + \gamma V_{i+1}(x_{k+1})$
5:     Update the control policy $h_{i+1}(x_k)$ using
         $h_{i+1}(x_k) = \arg\min_{h(\cdot)} \big( r(x_k, h(x_k)) + \gamma V_{i+1}(x_{k+1}) \big)$
6:     $i := i + 1$
7:   end while
8: end procedure
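
The two steps of Algorithm 1 can be illustrated on a small problem with finitely many states and controls, where the value function is stored in a table instead of a neural network. The Python sketch below is only such an illustration: the tabular setting, the stopping rule (the policy no longer changes, rather than a threshold on the value change), and the toy chain with stage cost $x^2 + u^2$ are assumptions of the example.

```python
def policy_iteration(states, actions, step, reward, gamma=0.9, tol=1e-9):
    """Tabular policy iteration for deterministic dynamics x_{k+1} = step(x_k, u_k)."""
    policy = {x: actions[0] for x in states}   # with gamma < 1 any initial policy has finite value
    value = {x: 0.0 for x in states}

    def q(x, u):
        # One-step cost plus discounted value of the successor state.
        return reward(x, u) + gamma * value[step(x, u)]

    while True:
        # Policy evaluation: iterate V(x) = r(x, h(x)) + gamma * V(x') to convergence.
        while True:
            delta = 0.0
            for x in states:
                new_v = q(x, policy[x])
                delta = max(delta, abs(new_v - value[x]))
                value[x] = new_v
            if delta < tol:
                break
        # Policy improvement: greedy control with respect to the evaluated value.
        changed = False
        for x in states:
            best = min(actions, key=lambda u: q(x, u))
            if q(x, best) < q(x, policy[x]) - 1e-12:
                policy[x] = best
                changed = True
        if not changed:
            return policy, value


# Toy chain: states 0..4, controls move left, stay, or move right (clamped at the ends).
states = [0, 1, 2, 3, 4]
actions = [-1, 0, 1]
policy, value = policy_iteration(
    states, actions,
    step=lambda x, u: min(4, max(0, x + u)),
    reward=lambda x, u: float(x * x + u * u),
)
```

In this toy problem the learned policy drives every state toward the origin and then applies zero control there.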


In Algorithm 1, $\epsilon_{ac}$ is a small number that checks the
algorithm convergence. Value iteration is similar,
but the policy evaluation procedure is performed
as $V_{i+1}(x_k) = r(x_k, h_i(x_k)) + \gamma V_i(x_{k+1})$. In
value iteration, we can select any initial control
policy, not necessarily admissible or stabilizing.
In the control system shown in Fig. 3, the critic
and the actor NNs are tuned online using the
observed data $(x_k, x_{k+1}, r(x_k, h_i(x_k)))$ along the
system trajectory. The critic and actor are tuned
sequentially in both the PI and the VI. That is, the
weights of one neural network are held constant,
while the weights of the other are tuned until
convergence. This procedure is repeated until
both neural networks have converged. Thus, the
controller learns the optimal controller online.
The convergence of value iteration using two
neural networks for the discrete-time nonlinear
system (2) is proven in Al-Tamimi et al. (2008).
Design of an ADP controller that uses only output
feedback is given in Lewis and Vamvoudakis
(2011).

Optimal Adaptive Control of
Continuous-Time Nonlinear Systems

RL is considerably more difficult for continuous-time
systems than for discrete-time systems, and
fewer results are available. This subsection will
provide the formulation of the optimal control problem
followed by an offline PI algorithm provided
in Abu-Khalaf and Lewis (2005) that will give
us the structure for the proposed online algorithms
that follow. Consider the following nonlinear
time-invariant affine in the input dynamical
system given by

\[ \dot{x}(t) = f(x(t)) + g(x(t)) u(t); \quad x(0) = x_0 \tag{3} \]

with $x(t) \in \mathbb{R}^n$, $f(x(t)) \in \mathbb{R}^n$, $g(x(t)) \in \mathbb{R}^{n \times m}$,
and control input $u(t) \in \mathbb{R}^m$. We assume
that $f(0) = 0$, $f(x) + g(x)u$ is Lipschitz
continuous on a set $\Omega \subseteq \mathbb{R}^n$ that contains the
origin and that the system is stabilizable on $\Omega$,
that is, there exists a continuous control function
$u(t) \in U$ such that the system is asymptotically
stable on $\Omega$. Define the infinite horizon integral
cost $\forall t \ge 0$


\[ V(x(t)) = \int_t^{\infty} r(x(\tau), u(\tau)) \, d\tau, \tag{4} \]

where $r(x, u) = Q(x) + u^T R u$,
with $Q(x)$ positive definite and $R \in \mathbb{R}^{m \times m}$
a symmetric positive definite matrix. For any
admissible control policy, if the associated cost
(4) is $C^1$, then an infinitesimal version is the
Bellman equation, and the optimal cost function
$V^*(x)$ is defined by

\[ V^*(x_0) = \min_u \int_0^{\infty} r(x, u) \, d\tau, \]

which satisfies the HJB equation. By employing
the stationarity condition, the optimal control
function for the given problem is

\[ u^*(x) = -\frac{1}{2} R^{-1} g^T(x) \frac{\partial V^*(x)}{\partial x}. \tag{5} \]

Inserting the optimal control (5) into the Bellman
equation, one obtains the formulation for the HJB
equation in terms of $\partial V^* / \partial x$ and with boundary
condition $V^*(0) = 0$

\[ 0 = r(x, u^*) + \left( \frac{\partial V^*(x)}{\partial x} \right)^T \big( f(x) + g(x) u^* \big), \tag{6} \]

which for the linear case becomes the well-known
Riccati equation. In order to find the
optimal control solution for the problem, one
needs to solve the HJB equation (6) for the value
function and then substitute in (5) to obtain the
optimal control. However, due to the nonlinear
nature of the HJB equation, finding its solution is
generally difficult or impossible. The following
PI algorithm is an iterative algorithm for solving
optimal control problems and will give us the
structure for the online learning algorithm.


Algorithm 2 PI for continuous-time systems
1: procedure
2:   Given admissible policies $u^{(0)}$
3:   while $\| V^{u^{(i)}} - V^{u^{(i-1)}} \| \ge \epsilon_{ac}$ do
4:     Solve for the value $V^{u^{(i)}}$ using Bellman’s equation
         $Q(x) + \left( \frac{\partial V^{u^{(i)}}}{\partial x} \right)^T \big( f(x) + g(x) u^{(i)} \big) + u^{(i)T} R u^{(i)} = 0, \quad V^{u^{(i)}}(0) = 0$
5:     Update the control policy $u^{(i+1)}$ using
         $u^{(i+1)} = -\frac{1}{2} R^{-1} g^T(x) \frac{\partial V^{u^{(i)}}}{\partial x}$
6:     $i := i + 1$
7:   end while
8: end procedure

A PI algorithm that solves online the HJB
equation without full information of the plant
dynamics is proposed in Vrabie et al. (2009),
where the Bellman equation is proved to be
equivalent to the integral reinforcement learning
(IRL) form, with an optimal value given for some
$T > 0$ as

\[ V^*(x(t)) = \min_u \left( \int_t^{t+T} r(x(\tau), u(\tau)) \, d\tau + V^*(x(t+T)) \right). \]


Therefore, the temporal difference error for
continuous-time systems can be defined as

\[ e(t : t+T) = \rho(t : t+T) + V(x(t+T)) - V(x(t)), \]

with $\rho(t : t+T) = \int_t^{t+T} r(x(\tau), u(\tau)) \, d\tau$, without
any information of the plant dynamics. The
IRL controller just given tunes the critic neural
network to determine the value while holding
the control policy fixed. The IRL algorithm can
be implemented online by RL techniques using
value function approximation $\hat{V}(x) = W_1^T \phi(x)$
in a critic approximator network. Using that approximation
in the PI algorithm, one can use
batch least squares or recursive least squares
to update the value function, and then on convergence
of the value parameters, the action is
updated. The implementation of the IRL optimal
adaptive control algorithm is shown in Fig. 4.
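
The IRL idea can be illustrated on a scalar linear plant $\dot{x} = ax + bu$ with utility $r(x, u) = q x^2 + R u^2$ and a quadratic critic $V(x) = p x^2$. The Python sketch below is only an illustration under these assumptions: the closed-form trajectory plays the role of measurements taken from the real plant, the critic parameter is identified from a single data interval rather than by batch or recursive least squares, and all names are hypothetical. The learner itself uses the measured states, the integral reinforcement, and the input gain $b$, but not the drift coefficient $a$.

```python
import math


def measure_interval(a, b, k, q, R, x0, T):
    """Stand-in for the plant: closed-loop state after time T under u = -k*x and
    the integral reinforcement rho accumulated over [0, T]."""
    ac = a - b * k
    if ac >= 0:
        raise ValueError("the policy u = -k*x is not stabilizing")
    xT = x0 * math.exp(ac * T)
    rho = (q + R * k * k) * (xT * xT - x0 * x0) / (2.0 * ac)
    return xT, rho


def irl_policy_iteration(a, b, q, R, k0, x0=1.0, T=0.1, iters=20):
    """Alternate critic updates (from data) and actor updates (from the critic)."""
    k, p = k0, None
    for _ in range(iters):
        xT, rho = measure_interval(a, b, k, q, R, x0, T)
        # Critic: solve p*x0^2 = rho + p*xT^2 for p (zero temporal difference error).
        p = rho / (x0 * x0 - xT * xT)
        # Actor: u = -(1/2) R^{-1} b dV/dx = -(b*p/R) x, i.e. the new gain is b*p/R.
        k = b * p / R
    return p, k


p, k = irl_policy_iteration(a=1.0, b=1.0, q=1.0, R=1.0, k0=2.0)
riccati_residual = 2.0 * 1.0 * p - (1.0 * p) ** 2 / 1.0 + 1.0
```

For this example the iteration converges to the stabilizing solution of the scalar Riccati equation $2ap - b^2 p^2 / R + q = 0$, which is consistent with the remark that the HJB equation reduces to the Riccati equation in the linear case.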


[Figure: block diagram; recoverable labels: critic block "Run RLS or use batch L.S. to identify value of current control", "Control", "System", "w/ MEMORY"]

Neural Control and Approximate Dynamic Programming, Fig. 4 Hybrid optimal adaptive controller based on IRL






























The work in Vamvoudakis and Lewis (2010) 
presents a way of finding the optimal control 
solution in a synchronous manner along with 
stability and convergence guarantees but with 
known dynamics. This procedure is more nearly 
in line with accepted practice in adaptive control. 

A synchronous online learning algorithm that does not require
knowledge of the drift dynamics is proposed in Vamvoudakis et al. (2013).

Learning in Games 

Reinforcement learning techniques have been applied to design adaptive
controllers that converge to the solution of two-player zero-sum games
in Vamvoudakis and Lewis (2012) and Vrabie et al. (2012), of
multiplayer nonzero-sum games in Vamvoudakis et al. (2012a), and of
Stackelberg games in Vamvoudakis et al. (2012b). In these cases, the
adaptive control structure has multiple loops, with action networks and
critic networks for each player. The adaptive controller for zero-sum
games finds the solution to the H-infinity control problem online in
real time. This adaptive controller does not require any system
dynamics information.

Summary and Future Directions 

This entry discusses some neuro-inspired adaptive control techniques.
These controllers have multi-loop, multi-timescale structures and can
learn the solutions to Hamilton-Jacobi design equations such as the
Riccati equation online without knowing the full dynamical model of the
system. A method known as Q learning allows the learning of optimal
control solutions online, in the discrete-time case, for completely
unknown systems. Q learning has not yet been fully investigated for
continuous-time systems.

Cross-References 

► Adaptive Control, Overview 

► Stochastic Games and Learning 

► Optimal Control and the Dynamic Programming Principle
Abstract

Model-predictive control is a controller design method which
synthesizes a sampled-data feedback controller from the iterative
solution of open-loop optimal control problems. We describe the basic
functionality of MPC controllers, their properties regarding
feasibility, stability, and performance, and the assumptions needed in
order to rigorously ensure these properties in a nominal setting.

Keywords 

Recursive feasibility; Sampled-data feedback; 
Stability 

Introduction 

Model-predictive control (MPC) is a method for the optimization-based
control of linear and nonlinear dynamical systems. While the literal
meaning of “model-predictive control” applies to virtually every
model-based controller design method, nowadays the term commonly refers
to control methods in which pieces of open-loop optimal control
functions or sequences are put together in order to synthesize a
sampled-data feedback law. As such, it is often used synonymously with
“receding horizon control.”

The concept of MPC was first presented in Propoi (1963) and was
reinvented several times already in the 1960s. Due to the lack of
sufficiently fast computer hardware, these ideas initially had little
impact. This changed during the 1970s, when MPC was successfully used
in chemical process control. At that time, MPC was mainly applied to
linear systems with quadratic cost and linear constraints, since for
this class of problems algorithms were sufficiently fast for real-time
implementation, at least for the typically relatively slow dynamics of
process control systems. The 1980s then saw the development of theory
and increasingly sophisticated concepts for linear MPC, while in the
1990s nonlinear MPC (often abbreviated as NMPC) attracted the attention
of the MPC community. After the year 2000, several gaps in the analysis
of nonlinear MPC without terminal constraints and costs were closed,
and increasingly fast algorithms were developed. Together with the
progress in hardware, this has considerably broadened the possible
applications of both linear and nonlinear MPC.

In this entry, we explain the functionality of 
nominal MPC along with its most important 
properties and the assumptions needed to 
rigorously ensure these properties. We also 
give some hints on the underlying proofs. The 
term nominal MPC refers to the assumption 
that the mismatch between our model and the 
real plant is sufficiently small to be neglected 
in the following considerations. If this is not 
the case, methods from robust MPC must be 
used (►Robust Model-Predictive Control). We 
describe all concepts for nonlinear discrete time 
systems, noting that the basic results outlined in 
this entry are conceptually similar for linear and 
for continuous-time systems. 
Model-Predictive Control

In this entry, we discuss MPC for discrete-time control systems of the
form

\[ x_u(j+1) = f(x_u(j), u(j)), \qquad x_u(0) = x_0 \tag{1} \]

with state $x_u(j) \in X$, initial condition $x_0 \in X$, and control
input sequence $u = (u(0), u(1), \ldots)$ with $u(k) \in U$, where the
state space $X$ and the control value space $U$ are normed spaces. For
control systems in continuous time, one may apply the discrete-time
approach to a sampled-data model of the system. Alternatively,
continuous-time versions of the concepts and results from this entry
are available in the literature; see, e.g., Findeisen and Allgöwer
(2002) or Mayne et al. (2000).

The core of any MPC scheme is an optimal control problem of the form

\[ \text{minimize } J_N(x_0, u) \tag{2} \]

w.r.t. $u = (u(0), \ldots, u(N-1))$ with

\[ J_N(x_0, u) := \sum_{j=0}^{N-1} \ell(x_u(j), u(j)) + F(x_u(N)) \tag{3} \]

subject to the constraints

\[ u(j) \in \mathbb{U}, \quad x_u(j) \in \mathbb{X} \quad \text{for } j = 0, \ldots, N-1, \qquad x_u(N) \in \mathbb{X}_0, \tag{4} \]

for control constraint set $\mathbb{U} \subseteq U$, state constraint
set $\mathbb{X} \subseteq X$, and terminal constraint set
$\mathbb{X}_0 \subseteq \mathbb{X}$. The function
$\ell : X \times U \to \mathbb{R}$ is called stage cost or running cost;
the function $F : X \to \mathbb{R}$ is referred to as terminal cost. We
assume that for each initial value $x_0 \in \mathbb{X}$, the optimal
control problem (2) has a solution and denote the corresponding
minimizing control sequence by $u^*$. Algorithms for computing $u^*$
are discussed in ► Optimization Algorithms for Model Predictive Control
and ► Explicit Model Predictive Control.



The key idea of MPC is to compute the values $\mu_N(x)$ of the MPC
feedback law $\mu_N$ from the open-loop optimal control sequences
$u^*$. To formalize this idea, consider the closed-loop system

\[ x_{\mu_N}(k+1) = f\big(x_{\mu_N}(k), \mu_N(x_{\mu_N}(k))\big). \tag{5} \]

In order to evaluate $\mu_N$ along the closed-loop solution, given an
initial value $x_{\mu_N}(0) \in \mathbb{X}$, we iteratively perform the
following steps.

Basic MPC Loop

1. Set $k := 0$.

2. Solve (2)-(4) for $x_0 = x_{\mu_N}(k)$; denote the optimal control
   sequence by $u^* = (u^*(0), \ldots, u^*(N-1))$.

3. Set $\mu_N(x_{\mu_N}(k)) := u^*(0)$, compute $x_{\mu_N}(k+1)$
   according to (5), set $k := k + 1$, and go to Step 2.

Due to its ability to handle constraints and possibly nonlinear
dynamics, MPC has become one of the most popular modern control methods
in industry (► Model-Predictive Control in Practice). While various
variants of this basic scheme are discussed in the literature, here we
restrict ourselves to this most widely used basic MPC scheme.
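The basic MPC loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an implementation from the cited literature: the scalar dynamics `f`, the quadratic stage cost, the gridded control set `U_GRID`, the state bounds, and the brute-force enumeration in `solve_ocp` are all hypothetical choices made so that the example is self-contained. No terminal cost or terminal constraint is used, i.e., $F \equiv 0$ and $\mathbb{X}_0 = \mathbb{X}$.

```python
import itertools

# Hypothetical problem data (assumptions made for illustration only).
U_GRID = (-1.0, -0.5, 0.0, 0.5, 1.0)   # gridded control constraint set
X_MIN, X_MAX = -5.0, 5.0               # state constraint set [-5, 5]


def f(x, u):
    """Assumed discrete-time dynamics x(j+1) = x(j) + u(j)."""
    return x + u


def stage_cost(x, u):
    """Assumed stage cost penalizing the distance to x* = 0."""
    return x * x + u * u


def solve_ocp(x0, horizon):
    """Solve (2)-(4) by enumerating all control sequences on the grid."""
    best_cost, best_seq = float("inf"), None
    for seq in itertools.product(U_GRID, repeat=horizon):
        x, cost, feasible = x0, 0.0, True
        for u in seq:
            if not X_MIN <= x <= X_MAX:    # state constraint at stage j
                feasible = False
                break
            cost += stage_cost(x, u)
            x = f(x, u)
        # The terminal state must lie in X_0, which equals X here.
        if feasible and X_MIN <= x <= X_MAX and cost < best_cost:
            best_cost, best_seq = cost, seq
    if best_seq is None:
        raise ValueError("optimal control problem is infeasible")
    return best_seq, best_cost


def mpc_closed_loop(x0, horizon, steps):
    """Steps 1-3 of the basic MPC loop; returns the closed-loop states."""
    x, trajectory = x0, [x0]
    for _ in range(steps):
        u_star, _cost = solve_ocp(x, horizon)   # Step 2
        x = f(x, u_star[0])                     # Step 3: apply first element
        trajectory.append(x)
    return trajectory


print(mpc_closed_loop(x0=3.0, horizon=3, steps=8))
```

Enumeration scales exponentially in the horizon and is used here only to keep the sketch free of external solvers; practical schemes rely on the optimization algorithms referenced in the cross-references.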

When analyzing an MPC scheme, three properties are important:

• Recursive feasibility, i.e., the property that the constraints (4)
  can be satisfied in Step 2 at each sampling instant

• Stability, i.e., in particular convergence of the closed-loop
  solutions $x_{\mu_N}(k)$ to a desired equilibrium $x^*$ as
  $k \to \infty$

• Performance, i.e., appropriate quantitative properties of
  $x_{\mu_N}(k)$

Here we discuss these three issues for two widely used MPC variants:

1. MPC with terminal constraints and costs

2. MPC with neither terminal constraints nor costs

In variant 1, $F$ and $\mathbb{X}_0$ in (3) and (4) are specifically
designed in order to guarantee proper performance of the closed loop.
In variant 2, we set $F \equiv 0$ and $\mathbb{X}_0 = \mathbb{X}$.
Thus, the choice of $\ell$ and $N$ in (3) is the most important part of
the design procedure.
Recursive Feasibility

Since the ability to handle constraints is one of the key features of
MPC, it is important to ensure that the constraints
$x_{\mu_N}(k) \in \mathbb{X}$ and $\mu_N(x_{\mu_N}(k)) \in \mathbb{U}$
are satisfied for all $k \ge 0$. However, beyond constraint
satisfaction, the stronger property $x_{\mu_N}(k) \in \mathbb{X}_N$ is
required, where $\mathbb{X}_N$ denotes the feasible set for horizon $N$,

\[ \mathbb{X}_N := \{ x \in \mathbb{X} \mid \text{there exists } u \text{ such that (4) holds} \}. \]

The property $x \in \mathbb{X}_N$ is called feasibility of $x$.
Feasibility of $x = x_{\mu_N}(k)$ is a prerequisite for the MPC
feedback being well defined, because the nonexistence of such an
admissible control sequence $u$ would imply that solving (2) under the
constraints (4) in Step 2 of the MPC iteration is impossible.

Since for $k \ge 0$ the state
$x_{\mu_N}(k+1) = f(x_{\mu_N}(k), u^*(0))$ is determined by the
solution of the previous optimal control problem, the usual way to
address this problem is via the notion of recursive feasibility. This
property demands the existence of a set $A \subseteq \mathbb{X}$ such
that:

• For each $x_0 \in A$, the problem (2)-(4) is feasible.

• For each $x_0 \in A$ and the optimal control $u^*$ from (2) to (4),
  the relation $f(x_0, u^*(0)) \in A$ holds.

It is not too difficult to see that this property implies
$x_{\mu_N}(k) \in A$ for all $k \ge 1$ if $x_{\mu_N}(0) \in A$.

For terminal-constrained problems, recursive feasibility is usually
established by demanding that the terminal constraint set
$\mathbb{X}_0$ is viable or controlled forward invariant. This means
that for each $x \in \mathbb{X}_0$, there exists $u \in \mathbb{U}$
with $f(x,u) \in \mathbb{X}_0$. Under this assumption, it is quite
straightforward to prove that the feasible set $A = \mathbb{X}_N$ is
also recursively feasible (Grüne and Pannek 2011, Lemma 5.11). Note
that viability of $\mathbb{X}_0$ is immediate if
$\mathbb{X}_0 = \{x^*\}$ and $x^* \in \mathbb{X}$ is an equilibrium,
i.e., a point for which there exists $u^* \in \mathbb{U}$ with
$f(x^*, u^*) = x^*$. This setting is referred to as equilibrium
terminal constraint.
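The viability condition can be probed numerically on a grid. The following sketch is hypothetical in every detail (the unstable scalar dynamics `f`, the control interval, the candidate intervals, and the helper names); a grid test of this kind can only give evidence for viability, or refute it, and is not a proof.

```python
def is_viable_on_grid(states, controls, f, in_set):
    """True if every grid state has a grid control whose successor stays in the set."""
    return all(any(in_set(f(x, u)) for u in controls) for x in states)


def f(x, u):
    """Assumed unstable scalar dynamics x(j+1) = 2*x(j) + u(j)."""
    return 2.0 * x + u


def interval(radius, tol=1e-9):
    """Membership test for the candidate set [-radius, radius]."""
    return lambda x: -radius - tol <= x <= radius + tol


def grid(radius, points=41):
    """Equidistant grid on [-radius, radius]."""
    return [-radius + 2.0 * radius * i / (points - 1) for i in range(points)]


controls = [i / 10 for i in range(-10, 11)]   # gridded control set [-1, 1]

# [-1, 1] passes the test: u = -1 brings x = 1 back to f(1, -1) = 1.
viable_small = is_viable_on_grid(grid(1.0), controls, f, interval(1.0))
# [-2, 2] fails: from x = 2 every admissible control gives f(x, u) >= 3.
viable_large = is_viable_on_grid(grid(2.0), controls, f, interval(2.0))
print(viable_small, viable_large)
```

In this assumed example, a terminal constraint set of $[-1, 1]$ would be an admissible design choice, whereas $[-2, 2]$ would not be viable.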

For MPC without terminal constraints, the most straightforward way to
ensure recursive feasibility is to assume that the state constraint set
$\mathbb{X}$ is viable (Grüne and Pannek 2011, Theorem 3.5). However,
checking viability, and even more so constructing a viable state
constraint set, is in general a very difficult task. Hence, other
methods for establishing recursive feasibility are needed. One method
is to assume that the sequence of feasible sets $\mathbb{X}_N$,
$N \in \mathbb{N}$, becomes stationary for some $N_0$, i.e., that
$\mathbb{X}_{N+1} = \mathbb{X}_N$ holds for all $N \ge N_0$. Under this
assumption, recursive feasibility of $\mathbb{X}_{N_0}$ follows; see
Kerrigan (2000, Theorem 5.3). However, like viability, stationarity is
difficult to verify.

For this reason, a conceptually different approach to ensure recursive
feasibility was presented in Grüne and Pannek (2011, Theorem 8.20); a
similar approach for linear systems can be found in Primbs and Nevistic
(2000). The approach is suitable for stabilizing MPC problems in which
the stage cost $\ell$ penalizes the distance to a desired equilibrium
$x^*$ (cf. section “Stability”). Assuming the existence, but not the
knowledge, of a viable neighborhood $\mathcal{N}$ of $x^*$, one can
show that any initial point $x_0$ for which the corresponding open-loop
optimal solution satisfies $x_{u^*}(j) \in \mathcal{N}$ for some
$j \le N$ is contained in a recursively feasible set. The fact that
$\ell$ penalizes the distance to $x^*$ then implies
$x_{u^*}(j) \in \mathcal{N}$ for suitable initial values. Together,
these properties yield the existence of recursively feasible sets $A_N$
which become arbitrarily large as $N$ increases.


Stability 

Stability in the sense of this entry refers to the fact that a
prespecified equilibrium $x^* \in \mathbb{X}$, typically a desired
operating point, is asymptotically stable for the MPC closed loop for
all initial values in some set $S$. This means that the solutions
$x_{\mu_N}(k)$ starting in $S$ converge to $x^*$ as $k \to \infty$ and
that solutions starting close to $x^*$ remain close to $x^*$ for all
$k \ge 0$. Note that this setting can be extended to time-varying
reference solutions; see ► Tracking Model Predictive Control.

In order to enforce this property, we assume that the stage cost $\ell$
penalizes the distance to the equilibrium $x^*$ in the following sense:
$\ell$ satisfies

\[ \ell(x^*, u^*) = 0 \quad \text{and} \quad \alpha_1(|x|) \le \ell(x, u) \tag{6} \]

for all $x \in \mathbb{X}$ and $u \in \mathbb{U}$. Here $\alpha_1$ is a
$\mathcal{K}_\infty$ function, i.e., a continuous function
$\alpha_1 : [0, \infty) \to [0, \infty)$ which is strictly increasing,
unbounded, and satisfies $\alpha_1(0) = 0$. With $|x|$, we denote the
norm on $X$. In this entry, we exclusively discuss stage costs $\ell$
satisfying (6). More general settings using appropriate detectability
conditions are discussed in Rawlings and Mayne (2009, Sect. 2.7) or
Grimm et al. (2005) in the context of stabilizing MPC. Even more
general $\ell$ are allowed in the context of economic MPC; see the
► Economic Model Predictive Control article.

In the case of terminal constraints and terminal costs, a compatibility
condition between $\ell$ and $F$ is needed on $\mathbb{X}_0$ in order
to ensure stability. More precisely, we demand that for each
$x \in \mathbb{X}_0$ there exists a control value $u \in \mathbb{U}$
such that $f(x, u) \in \mathbb{X}_0$ and

\[ F(f(x, u)) - F(x) \le -\ell(x, u) \tag{7} \]

holds. Observe that the condition $f(x,u) \in \mathbb{X}_0$ is again
the viability condition which we already imposed for ensuring recursive
feasibility. Note that (7) is trivially satisfied for $F \equiv 0$ in
the case of $\mathbb{X}_0 = \{x^*\}$ by choosing $u = u^*$.

Stability is now concluded by using the optimal value function

\[ V_N(x_0) := \inf_{u \text{ s.t. (4)}} J_N(x_0, u) \]

as a Lyapunov function. This will yield stability on
$S = \mathbb{X}_N$, as $\mathbb{X}_N$ is exactly the set on which $V_N$
is defined. In order to prove that $V_N$ is a Lyapunov function, we
need to check that $V_N$ is bounded from below and above by
$\mathcal{K}_\infty$ functions $\alpha_1$ and $\alpha_2$ and that $V_N$
is strictly decaying along the closed-loop solution.

The first amounts to checking

\[ \alpha_1(|x|) \le V_N(x) \le \alpha_2(|x|) \tag{8} \]

for all $x \in \mathbb{X}_N$. The lower bound follows immediately from
(6) (with the same $\alpha_1$), and the upper bound can be ensured by
conditions on the problem data (see, e.g., Rawlings and Mayne 2009,
Sect. 4.5; Grüne and Pannek 2011, Sect. 5.3).

For ensuring that $V_N$ is strictly decreasing along the closed-loop
solutions, we need to prove

\[ V_N(f(x, \mu_N(x))) \le V_N(x) - \ell(x, \mu_N(x)). \tag{9} \]

In order to prove this inequality, one uses on the one hand the dynamic
programming principle stating that

\[ V_{N-1}(f(x, \mu_N(x))) = V_N(x) - \ell(x, \mu_N(x)). \tag{10} \]

On the other hand, one shows that (7) implies

\[ V_{N-1}(x) \ge V_N(x) \tag{11} \]

for all $x \in \mathbb{X}_N$. Inserting (11) with $f(x, \mu_N(x))$ in
place of $x$ into (10) then immediately yields (9). Details of this
proof can be found, e.g., in Mayne et al. (2000), Rawlings and Mayne
(2009), or Grüne and Pannek (2011). The survey Mayne et al. (2000) is
probably the first paper which develops the conditions needed for this
proof in a systematic way; a continuous-time version of these results
can be found in Fontes (2001).

Summarizing, for MPC with terminal constraints and costs, under the
conditions (6)-(8), we obtain asymptotic stability of $x^*$ on
$S = \mathbb{X}_N$.

For MPC without terminal constraints and costs, i.e., with
$\mathbb{X}_0 = \mathbb{X}$ and $F \equiv 0$, these conditions can
never be satisfied, as (7) will immediately imply $\ell(x,u) = 0$ for
all $x \in \mathbb{X}$, contradicting (6). Moreover, without terminal
constraints and costs, one cannot expect (9) to be true. This is
because without terminal constraints, the inequality
$V_{N-1}(x) \le V_N(x)$ holds, which together with the dynamic
programming principle implies that if (9) holds, then it holds with
equality. This, however, would imply that $\mu_N$ is the infinite
horizon optimal feedback law, which, though not impossible, is very
unlikely to hold.
Thus, we need to relax (9). In order to do so, instead of (9), we
assume the relaxed inequality

\[ V_N(f(x, \mu_N(x))) \le V_N(x) - \alpha\, \ell(x, \mu_N(x)) \tag{12} \]

for some $\alpha > 0$ and all $x \in \mathbb{X}$, which is still enough
to conclude asymptotic stability of $x^*$ if (6) and (8) hold. The
existence of such an $\alpha$ can be concluded from bounds on the
optimal value function $V_N$. Assuming the existence of constants
$\gamma_K > 0$ such that the inequality

\[ V_K(x) \le \gamma_K \min_{u \in \mathbb{U}} \ell(x, u) \tag{13} \]

holds for all $K = 1, \ldots, N$ and $x \in \mathbb{X}$, there are
various ways to compute $\alpha$ from $\gamma_1, \ldots, \gamma_N$; see
Grüne (2012, Sect. 3). The best possible estimate for $\alpha$, whose
derivation is explained in detail in Grüne and Pannek (2011, Chap. 6),
yields

\[ \alpha = 1 - \frac{(\gamma_N - 1) \prod_{i=2}^{N} (\gamma_i - 1)}{\prod_{i=2}^{N} \gamma_i - \prod_{i=2}^{N} (\gamma_i - 1)}. \tag{14} \]

Though not immediately obvious, a closer look at this term reveals
$\alpha \to 1$ as $N \to \infty$ if the $\gamma_K$ are bounded. Hence,
$\alpha > 0$ for sufficiently large $N$.

Summarizing the second part of this section, for MPC without terminal
constraints and costs, under the conditions (6), (8), and (13),
asymptotic stability follows on $S = \mathbb{X}$ for all optimization
horizons $N$ for which $\alpha > 0$ holds in (14). Note that the
condition (13) implicitly depends on the choice of $\ell$. A judicious
choice of $\ell$ can considerably reduce the size of the horizon $N$
for which $\alpha > 0$ holds, see Grüne and Pannek (2011, Sect. 6.6),
and thus the computational effort for solving (2)-(4).

Performance

Performance of MPC controllers can be measured in many different ways.
As the MPC controller is derived from successive solutions of (2), a
natural quantitative way to measure its performance is to evaluate the
infinite horizon functional corresponding to (3) along the closed loop,
i.e.,

\[ J_\infty^{cl}(x_0, \mu_N) := \sum_{k=0}^{\infty} \ell\big(x_{\mu_N}(k), \mu_N(x_{\mu_N}(k))\big) \]

with $x_{\mu_N}(0) = x_0$. This value can then be compared with the
optimal infinite horizon value

\[ V_\infty(x_0) := \inf_{u :\, u(k) \in \mathbb{U},\ x_u(k) \in \mathbb{X}} J_\infty(x_0, u), \]

where

\[ J_\infty(x_0, u) := \sum_{k=0}^{\infty} \ell(x_u(k), u(k)). \]

To this end, for MPC with terminal constraints and costs, by induction
over (9) and using nonnegativity of $\ell$, it is fairly easy to
conclude the inequality

\[ J_\infty^{cl}(x_0, \mu_N) \le V_N(x_0) \]

for all $x_0 \in \mathbb{X}_N$. However, due to the conditions on the
terminal cost in (7), $V_N$ may be considerably larger than $V_\infty$,
and an estimate relating these two functions is in general not easy to
derive (Grüne and Pannek 2011, Examples 5.18 and 5.19). However, it is
possible to show that under the same assumptions guaranteeing
stability, the convergence

\[ V_N(x) \to V_\infty(x) \]

holds for $N \to \infty$ (Grüne and Pannek 2011, Theorem 5.21). Hence,
we recover approximately optimal infinite horizon performance for
sufficiently large horizon $N$.

For MPC without terminal constraints and costs, the inequality
$V_N(x_0) \le V_\infty(x_0)$ is immediate; however, (9) will typically
not hold. As a remedy, we can use (12) in order to derive an estimate.
Using induction over (12), we arrive at the estimate

\[ J_\infty^{cl}(x_0, \mu_N) \le V_N(x_0)/\alpha \le V_\infty(x_0)/\alpha. \]

Since $\alpha \to 1$ as $N \to \infty$, also in this case we obtain
approximately optimal infinite horizon performance for sufficiently
large horizon $N$.
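The estimate (14) for $\alpha$ is easy to evaluate numerically once bounds $\gamma_1, \ldots, \gamma_N$ are available. The sketch below is a hypothetical helper, not code from the cited references; it assumes $\gamma_i \ge 1$ for all $i$ (so that the denominator in (14) is positive) and uses a constant bound $\gamma_i = \gamma$ purely for illustration.

```python
import math


def suboptimality_index(gammas):
    """Evaluate alpha from (14) for gammas = [gamma_1, ..., gamma_N], N >= 2.

    Assumes gamma_i >= 1, which keeps the denominator positive.
    """
    if len(gammas) < 2:
        raise ValueError("need at least gamma_1 and gamma_2")
    tail = gammas[1:]                                  # gamma_2, ..., gamma_N
    prod_gamma = math.prod(tail)
    prod_gamma_minus_one = math.prod(g - 1.0 for g in tail)
    numerator = (gammas[-1] - 1.0) * prod_gamma_minus_one
    return 1.0 - numerator / (prod_gamma - prod_gamma_minus_one)


def smallest_stabilizing_horizon(gamma, max_horizon=100):
    """Smallest N with alpha > 0 for the constant bound gamma_i = gamma."""
    for horizon in range(2, max_horizon + 1):
        if suboptimality_index([gamma] * horizon) > 0.0:
            return horizon
    return None


for horizon in (2, 3, 4, 8):
    print(horizon, suboptimality_index([2.0] * horizon))
print(smallest_stabilizing_horizon(3.0))
```

With a constant bound, $\alpha$ increases toward 1 as the horizon grows, and a larger $\gamma$ pushes the first horizon with $\alpha > 0$ upward, which is the trade-off between the choice of $\ell$ and the required horizon described in the text.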


Summary and Future Directions 

MPC is a controller design method which uses the iterative solution of
open-loop optimal control problems in order to synthesize a
sampled-data feedback controller $\mu_N$. The advantages of MPC are its
ability to handle constraints, the rigorously provable stability
properties of the closed loop, and its approximate optimality
properties. Assumptions needed in order to rigorously ensure these
properties, together with the corresponding mathematical arguments,
have been outlined in this entry, both for MPC with terminal
constraints and costs and without. Among the disadvantages of MPC are
the computational effort and the fact that the resulting feedback is a
full state feedback, thus necessitating the use of a state estimator to
reconstruct the state from output data (► Moving Horizon Estimation).

Future directions include the application of MPC to more general
problems than set point stabilization or tracking, the development of
efficient algorithms for large-scale problems, including those
originating from discretized infinite-dimensional control problems, and
the understanding of the opportunities and limitations of MPC in
increasingly complex environments; see also ► Distributed Model
Predictive Control.


Cross-References 

► Distributed Model Predictive Control 

► Economic Model Predictive Control 

► Explicit Model Predictive Control 

► Model-Predictive Control in Practice 

► Moving Horizon Estimation 

► Optimization Algorithms for Model Predictive 
Control 

► Robust Model-Predictive Control 

► Stochastic Model Predictive Control 

► Tracking Model Predictive Control 


Recommended Reading 

MPC in the form known today was first described in Propoi (1963) and is
now covered in several monographs, two recent ones being Rawlings and
Mayne (2009) and Grüne and Pannek (2011). More information on
continuous-time MPC can be found in the survey by Findeisen and
Allgöwer (2002). The now standard framework for stability and
feasibility of MPC with stabilizing terminal constraints is presented
in Mayne et al. (2000); for a continuous-time version, see Fontes
(2001). Stability of MPC without terminal constraints was proved in
Grimm et al. (2005) under very general conditions; for a comparison of
various such results, see Grüne (2012). Feasibility without terminal
constraints is discussed in Kerrigan (2000) and Primbs and Nevistic
(2000).
Abstract

We consider the control of nonlinear systems in which parameters are
uncertain and may vary. For such systems the control must adapt to the
parameter change to deliver closed-loop performance, such as asymptotic
stability or tracking. We give a concise description of available
methods and discuss basic adaptive stabilization results, which can be
used as building blocks for complex adaptive control problems.


Keywords 

Adaptive stabilization; Linear parameterization; 
Lyapunov function; Nonlinear parameterization; 
Output feedback 


Introduction 

The adaptive control problem, namely, the problem of designing a
feedback controller which contains an adaptation mechanism to
counteract changes in the parameters of the system to be controlled, is
of significant importance in applications. In almost all systems,
physical parameters are subject to changes. These may be triggered, for
example, by changes in temperature (the volume of a liquid/gas), aging
(the friction coefficient of a mechanical system), or normal operation
(the mass of the fuel of an aircraft changes during flight, the center
of mass of a vehicle is affected by its load).

While adaptive control is naturally associated with the notion of
estimation, i.e., the parameters of a system have to be identified to
design a controller, it may be possible to design adaptive controllers
which do not rely on a complete parameter estimation: it is sufficient
to estimate the effect of the parameters on the control signal.

Adaptive control is different from robust control. In the simplest
possible occurrence, the aim of robust control is to design a control
law guaranteeing performance specifications for a given range of
parameter values. Robust control thus requires some a priori
information on the parameter. Adaptive control does not require any
a priori information on the parameter, although any such information
can be exploited in the controller design, but requires a parameterized
model: a model which contains information on the way the parameters
affect the dynamics of the system.

The adaptive control problem for general nonlinear systems can be
formulated as follows. Consider a nonlinear system described by
equations of the form

\[ \dot{x} = F(x, u, \theta), \qquad y = H(x, \theta), \tag{1} \]

where $x(t) \in \mathbb{R}^n$ denotes the state of the system,
$u(t) \in \mathbb{R}^m$ denotes the input of the system,
$\theta \in \mathbb{R}^q$ denotes the constant unknown parameter,
$y(t) \in \mathbb{R}^p$ denotes the measured output, and
$F : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^q \to \mathbb{R}^n$
and $H : \mathbb{R}^n \times \mathbb{R}^q \to \mathbb{R}^p$ are smooth
mappings. While we focus on continuous-time systems, similar
considerations apply to discrete-time systems. In what follows, for
simplicity, we mostly assume that $y = x$: the whole state of the
system is available for control design.

The adaptive control problem consists in finding, if possible, a
dynamic control law described by equations of the form

\[ \dot{\hat\theta} = w(x, \hat\theta, r), \tag{2} \]

\[ u = v(x, \hat\theta, r), \tag{3} \]

with $r(t) \in \mathbb{R}^s$ an exogenous (reference) signal and
$w : \mathbb{R}^n \times \mathbb{R}^q \times \mathbb{R}^s \to \mathbb{R}^q$
and
$v : \mathbb{R}^n \times \mathbb{R}^q \times \mathbb{R}^s \to \mathbb{R}^m$
smooth mappings, such that the closed-loop system, described by the
equations

\[ \dot{x} = F(x, v(x, \hat\theta, r), \theta), \qquad \dot{\hat\theta} = w(x, \hat\theta, r), \tag{4} \]

has specific properties. For example, one could require that all
trajectories be bounded and the $x$-component of the state converge to
a given value $x^*$ (this is the so-called adaptive regulation
requirement) or that the input-output behavior of the system from the
input $r$ to some user-defined output signal coincide with a given
reference model (this is the so-called model reference adaptive control
requirement).

A natural way to characterize design specifications for the adaptive
control problem and to facilitate its solution is to assume the
existence of a known parameter controller, described by the equation

\[ u = v^*(x, \theta, r), \tag{5} \]

such that the nonadaptive closed-loop system
$\dot{x} = F(x, v^*(x, \theta, r), \theta)$ satisfies given design
specifications. In this perspective, the adaptive control problem boils
down to the design of the update law (2) and of the feedback law (3)
such that the behavior of the adaptive closed-loop system matches that
of the nonadaptive closed-loop system
$\dot{x} = F(x, v^*(x, \theta, r), \theta)$.

The above description suggests a design method for the feedback law:
one could replace $\theta$ with $\hat\theta$ in Eq. (5). This design is
often known as certainty equivalence design and lends itself to the
interpretation that $\hat\theta$ is an estimate of $\theta$. Naturally,
one could also modify the feedback law, replacing $\theta$ with
$\hat\theta$ and adding $x$-dependent terms: this is often called a
redesign. Redesign may be guided by various considerations; for
example, it may be based on the use of a specific Lyapunov function
(yielding the so-called Lyapunov redesign), on structural properties of
the system, or on robustness constraints.

The interpretation of $\hat\theta$ as an estimate of $\theta$ leads to
two similar approaches for the design of the update law. The former,
pursued in the so-called indirect adaptive control, relies on the
design of a parameter estimator, for example, using recursive
least-squares methods. This approach has its roots in identification
theory and has been studied in depth for linear systems. The latter
relies on the observation that the design of an update law is
equivalent to the design of a (reduced-order) observer for the extended
system

\[ \dot{x} = F(x, u, \theta), \qquad \dot{\theta} = 0, \]

with output $y = x$. This approach has its roots in the theory of
nonlinear observer design.

The approaches described so far rely on a sort of separation principle:
the update law and the feedback law are designed separately. While this
approach may be adequate for linear systems, for nonlinear systems it
is often necessary to design the update law and the feedback law in one
step, i.e., the selection of the feedback law depends upon the
selection of the update law and vice versa. To illustrate this design
method, and provide some explicit adaptive control design tools, we
focus on a special class of nonlinear systems: systems which are
linearly parameterized in the unknown parameter.

Linearly Parameterized Systems 

Consider the system (1) and assume the mapping $F$ is affine in the
parameter $\theta$ and in the control $u$, namely,

\[ F(x, u, \theta) = f_0(x) + g(x)u + f_1(x)\theta, \tag{6} \]

with $f_0 : \mathbb{R}^n \to \mathbb{R}^n$,
$g : \mathbb{R}^n \to \mathbb{R}^{n \times m}$, and
$f_1 : \mathbb{R}^n \to \mathbb{R}^{n \times q}$ smooth mappings. For
this class of systems, under additional assumptions, it is possible to
provide systematic adaptive control design tools. We provide two formal
results: additional results (depending on the specific assumptions
imposed on the system) may be derived. In both cases the focus is on
the adaptive stabilization problem: the goal of the adaptive controller
is to render a given equilibrium stable, in the sense of Lyapunov, and
to guarantee convergence of the $x$-component of the state (recall that
the state of the adaptive closed-loop system is the vector
$(x, \hat\theta)$).
Theorem 1 Consider the system (6) and a point $x^*$. Assume there exist
a known parameter controller

\[ u = v_0(x) + v_1(x)\theta, \]

with $v_0 : \mathbb{R}^n \to \mathbb{R}^m$ and
$v_1 : \mathbb{R}^n \to \mathbb{R}^{m \times q}$ smooth mappings, and a
positive definite and radially unbounded function
$V : \mathbb{R}^n \to \mathbb{R}$, such that $V(x^*) = 0$ and

\[ \frac{\partial V}{\partial x} f^*(x, \theta) < 0 \]

for all $x \ne x^*$, where
$f^*(x,\theta) = f_0(x) + f_1(x)\theta + g(x)\big(v_0(x) + v_1(x)\theta\big)$
is the closed-loop vector field under the known parameter controller.
Then the update law

\[ \dot{\hat\theta} = -\left( \frac{\partial V}{\partial x}\, g(x)\, v_1(x) \right)^{T} \]

and the feedback law

\[ u = v_0(x) + v_1(x)\hat\theta \]

are such that all trajectories of the closed-loop system are bounded
and $\lim_{t \to \infty} x(t) = x^*$.

Theorem 2 Consider the system (6) and a point $x^*$. Assume there
exists a known parameter controller $u = v(x, \theta)$ such that the
closed-loop system

\[ \dot{x} = f^*(x, \theta), \]

where $f^*(x, \theta) = f_0(x) + f_1(x)\theta + g(x)v(x, \theta)$, has
a globally asymptotically stable equilibrium at $x^*$. Assume, in
addition, that there exists a mapping
$\beta : \mathbb{R}^n \to \mathbb{R}^q$ such that all trajectories of
the system

\[ \dot{z} = -\left( \frac{\partial \beta}{\partial x} f_1(x) \right) z, \qquad \dot{x} = f^*(x, \theta) + g(x)\big(v(x, \theta + z) - v(x, \theta)\big) \tag{7} \]

are bounded and satisfy

\[ \lim_{t \to \infty} \big[ g(x(t)) \big( v(x(t), \theta + z(t)) - v(x(t), \theta) \big) \big] = 0. \]

Then the update law

\[ \dot{\hat\theta} = -\frac{\partial \beta}{\partial x} \Big( f_0(x) + f_1(x)\big(\hat\theta + \beta(x)\big) + g(x)\, v\big(x, \hat\theta + \beta(x)\big) \Big) \tag{8} \]

and the feedback law

\[ u = v\big(x, \hat\theta + \beta(x)\big) \]

are such that all trajectories of the closed-loop system are bounded
and $\lim_{t \to \infty} x(t) = x^*$.

The stability properties of the adaptive closed-loop system in
Theorem 1 can be studied with the Lyapunov function
$W(x, \hat\theta) = V(x) + \frac{1}{2}\|\hat\theta - \theta\|^2$,
whereas a Lyapunov analysis for the adaptive closed-loop system of
Theorem 2 can be carried out, under additional assumptions, via a
Lyapunov function of the form
$W(x, \hat\theta) = V(x) + \frac{1}{2}\|\hat\theta - \theta + \beta(x)\|^2$.
This suggests that in Theorem 1 $\hat\theta$ plays the role of the
estimate of $\theta$, whereas in Theorem 2 such a role is played by
$\hat\theta + \beta(x)$. Note that in neither theorem is the parameter
estimate required to converge to the true value of the parameters,
although in Theorem 2 the feedback law is required to converge, along
trajectories, to the known parameter controller. This has a very
important, possibly counterintuitive, consequence: the asymptotic
nonadaptive controller $u = v(x, \hat\theta_\infty)$, where
$\hat\theta_\infty = \lim_{t \to \infty} \hat\theta(t)$, provided the
limit exists, is not in general a stabilizing controller for system
(6).

Example 1 Consider the nonlinear system described by the equation
$\dot{x} = u + \theta x^2$, with $x(t) \in \mathbb{R}$,
$u(t) \in \mathbb{R}$, and $\theta \in \mathbb{R}$. A known parameter
controller satisfying the assumptions of Theorem 1 (with
$V(x) = x^2/2$) and of Theorem 2 (with $\beta(x) = x$) is
$u = -x - \theta x^2$. The resulting update laws and feedback laws are

\[ \dot{\hat\theta}_1 = x^3, \qquad u_1 = -x - \hat\theta_1 x^2, \]

and

\[ \dot{\hat\theta}_2 = x, \qquad u_2 = -x - (\hat\theta_2 + x) x^2, \]

respectively, where the subscripts “1” and “2” refer to the
constructions in Theorems 1 and 2, respectively.
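The two adaptive controllers of Example 1 can be compared in a short simulation. This is a minimal sketch under stated assumptions: the value `THETA = 2.0` of the unknown parameter, the initial conditions, the forward Euler scheme, and the step size are hypothetical choices made only for illustration; the update and feedback laws are the ones given in the example.

```python
THETA = 2.0   # hypothetical "true" parameter; the controllers never read it


def simulate(design, x0=1.0, theta_hat0=0.0, dt=1e-3, steps=20000):
    """Forward Euler simulation of dx/dt = u + THETA*x^2 with adaptation.

    design == 1: update law and feedback law from Theorem 1.
    design == 2: update law and feedback law from Theorem 2 (beta(x) = x).
    Returns the final state x and the final estimate theta_hat.
    """
    x, theta_hat = x0, theta_hat0
    for _ in range(steps):
        if design == 1:
            u = -x - theta_hat * x ** 2
            theta_hat_dot = x ** 3
        else:
            u = -x - (theta_hat + x) * x ** 2
            theta_hat_dot = x
        x_dot = u + THETA * x ** 2          # plant; THETA unknown to the law
        x, theta_hat = x + dt * x_dot, theta_hat + dt * theta_hat_dot
    return x, theta_hat


x1, theta_hat1 = simulate(design=1)
x2, theta_hat2 = simulate(design=2)
print("Theorem 1 design: x =", x1, " theta_hat =", theta_hat1)
print("Theorem 2 design: x =", x2, " theta_hat =", theta_hat2)
```

In both runs the state is driven toward $x^* = 0$ while the estimate settles at some constant, which need not equal the assumed `THETA`; this is consistent with the remark that neither theorem requires convergence of the parameter estimate.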

The basic building blocks in Theorems 1 and 2 can be exploited
repeatedly to design adaptive controllers for systems with a specific
structure, for example, for systems described by the equations

\[
\begin{aligned}
\dot{x}_1 &= x_2 + \varphi_1(x_1)^{T}\theta, \\
\dot{x}_2 &= x_3 + \varphi_2(x_1, x_2)^{T}\theta, \\
&\ \ \vdots \\
\dot{x}_i &= x_{i+1} + \varphi_i(x_1, \ldots, x_i)^{T}\theta, \\
&\ \ \vdots \\
\dot{x}_n &= u + \varphi_n(x_1, \ldots, x_n)^{T}\theta,
\end{aligned}
\tag{9}
\]

with $x_i(t) \in \mathbb{R}$, for $i = 1, \ldots, n$,
$u(t) \in \mathbb{R}$, $\varphi_i : \mathbb{R}^i \to \mathbb{R}^q$, for
$i = 1, \ldots, n$, smooth mappings, and $\theta \in \mathbb{R}^q$.
Note that the last of the equations (9) can be replaced by

\[ \dot{x}_n = \vartheta u + \varphi_n(x_1, \ldots, x_n)^{T}\theta, \]

with $\vartheta \in \mathbb{R}$, provided its sign is known (this
condition may be removed using the so-called Nussbaum gain). The
parameter $\vartheta$ is often referred to as the high-frequency gain
of the system: a terminology borrowed from linear systems theory.

Output Feedback Adaptive Control 

A key feature of the parameterized systems described so far is that they are linearly parameterized in $\theta$. The linear parameterization allows one to develop systematic design tools, such as those given in Theorems 1 and 2. Such results, however, require full information on the state of the system. If only partial information on the state is available, one has to combine an estimator of the state with an update law. Such a combination requires either strong assumptions on the system or very specific structures. For example, it is feasible if the system is not only linearly parameterized in the parameter $\theta$ but also linearly parameterized in the unmeasured states, namely, if it is described by equations of the form

$$\begin{aligned}
\dot{x}_1 &= x_2 + \psi_1(x_1) + \varphi_1(x_1)^T \theta, \\
\dot{x}_2 &= x_3 + \psi_2(x_1) + \varphi_2(x_1)^T \theta, \\
&\;\;\vdots \\
\dot{x}_i &= x_{i+1} + \psi_i(x_1) + \varphi_i(x_1)^T \theta + b_i u, \\
&\;\;\vdots \\
\dot{x}_{n-1} &= x_n + \psi_{n-1}(x_1) + \varphi_{n-1}(x_1)^T \theta + b_{n-1} u, \\
\dot{x}_n &= \psi_n(x_1) + \varphi_n(x_1)^T \theta + b_n u, \\
y &= x_1,
\end{aligned}$$

with $x_i(t) \in \mathbb{R}$, for $i = 1, \ldots, n$, $u(t) \in \mathbb{R}$, $y(t) \in \mathbb{R}$, $\varphi_i : \mathbb{R} \to \mathbb{R}^q$, and $\psi_i : \mathbb{R} \to \mathbb{R}$, for $i = 1, \ldots, n$, smooth mappings, $\theta \in \mathbb{R}^q$, and $b = [b_i, \cdots, b_{n-1}, b_n]^T$ unknown, but such that the sign of $b_n$ is known and the polynomial $b_n s^{n-i} + b_{n-1} s^{n-i-1} + \cdots + b_i$ has all roots with negative real part (this implies that the system, with input $u$ and output $y$, is minimum phase).

Nonlinear Parameterized Systems 

Adaptive control of nonlinearly parameterized systems is an open area of research. The design of adaptive controllers often relies on structural assumptions, for example, the existence of a monotonic parameterization, as in the system described by the equation

$$\dot{x} = F(x, u) + \Phi(x, \theta),$$

with $x(t) \in \mathbb{R}^n$, $u(t) \in \mathbb{R}^m$, $\theta \in \mathbb{R}^q$, and $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ and $\Phi : \mathbb{R}^n \times \mathbb{R}^q \to \mathbb{R}^n$ smooth mappings and such that, for all $x$, the mapping $\Phi$ satisfies the monotonicity condition

$$(\theta_a - \theta_b)^T \big(\Phi(x, \theta_a) - \Phi(x, \theta_b)\big) > 0,$$

for all $\theta_a \neq \theta_b$. Alternatively, the design may exploit the so-called over-parameterization, for example, the equation of the system

$$\dot{x} = u + \psi_1(x) \sin\theta + \psi_2(x) \cos\theta,$$

with $x(t) \in \mathbb{R}$, $u(t) \in \mathbb{R}$, and $\theta \in \mathbb{R}$, may be rewritten in over-parameterized form as

$$\dot{x} = u + \psi_1(x)\,\theta_1 + \psi_2(x)\,\theta_2,$$

with $\theta_i \in \mathbb{R}$, for $i = 1, 2$. Note that the over-parameterized form overlooks the important information that $\theta_1^2 + \theta_2^2 = 1$.
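The over-parameterization above can be checked with a few lines of code. This is only an illustrative sketch: the regressors `psi1` and `psi2` and all numerical values are hypothetical choices, not taken from the text.

```python
import math

def psi1(x):  # hypothetical regressor, chosen only for illustration
    return x

def psi2(x):  # hypothetical regressor, chosen only for illustration
    return x * x

def rhs_original(x, u, theta):
    # Nonlinearly parameterized right-hand side.
    return u + psi1(x) * math.sin(theta) + psi2(x) * math.cos(theta)

def rhs_overparam(x, u, theta1, theta2):
    # Linearly (over-)parameterized right-hand side.
    return u + psi1(x) * theta1 + psi2(x) * theta2

theta = 0.7
theta1, theta2 = math.sin(theta), math.cos(theta)
x, u = 1.5, -0.3

same = math.isclose(rhs_original(x, u, theta), rhs_overparam(x, u, theta1, theta2))
on_circle = math.isclose(theta1 ** 2 + theta2 ** 2, 1.0)
# A pair such as (0.9, 0.9) is admissible for the over-parameterized model
# but corresponds to no theta, since 0.81 + 0.81 != 1.
off_circle = math.isclose(0.9 ** 2 + 0.9 ** 2, 1.0)
print(same, on_circle, off_circle)  # prints: True True False
```

The two models agree whenever the pair of new parameters lies on the unit circle, but an update law designed for the over-parameterized form is free to produce estimates off the circle.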

Summary and Future Directions 

The problem of adaptive stabilization for nonlinear systems has been discussed. Two conceptual building blocks for the design of stabilizing adaptive controllers have been presented, and classes of systems for which these blocks allow the explicit design of adaptive controllers have been given. The role of parameter convergence, or lack thereof, has been briefly discussed together with connections between adaptive and observer designs. The difficulties associated with non-full state measurement and with nonlinear parameterization have also been briefly highlighted. Several problems have not been discussed, for example, model reference adaptive control, robust adaptive control, universal adaptive controllers, and the use of projections to incorporate prior knowledge on the parameter. Details on these can be found in the bibliography below.


Cross-References 

► Adaptive Control, Overview 

► History of Adaptive Control 

► Stochastic Adaptive Control 

► Switching Adaptive Control 
Abstract

Nonlinear filters estimate the state of dynamical systems given noisy measurements related to the state vector. In theory, such filters can provide optimal estimation accuracy for nonlinear measurements with nonlinear dynamics and non-Gaussian noise. However, in practice, the actual performance of nonlinear filters is limited by the curse of dimensionality. There are many different types of nonlinear filters, including the extended Kalman filter, the unscented Kalman filter, and particle filters.
Description of Nonlinear Filters

Nonlinear filters are algorithms that estimate the state vector (x) of a nonlinear dynamical system given measurements of nonlinear functions of the state vector corrupted by noise. Such filters also quantify the uncertainty in the resulting estimate of the state vector (e.g., using the error covariance matrix). Some nonlinear filters compute the entire probability density of the state vector conditioned on the set of measurements available, rather than computing a point estimate of the state vector (e.g., conditional mean or maximum likelihood). For some applications the conditional probability density of x is highly non-Gaussian (e.g., strongly multimodal). Even if the measurement noise, the process noise, and the initial uncertainty in x are all Gaussian, the conditional density of x can be non-Gaussian, owing to the nonlinearities in the dynamics or measurements. The dynamical systems can evolve in continuous time or discrete time, and the measurements can be made in continuous time or at discrete times. The most popular nonlinear filter in practical applications is the extended Kalman filter (EKF), but there are many other families of nonlinear filters, including particle filters, unscented Kalman filters (UKFs), batch least squares, exact finite-dimensional filters, Gaussian sum filters, cubature Kalman filters, etc. Table 1 summarizes the most popular nonlinear filters. The theory for nonlinear filters is relatively simple (see Ho and Lee 1964), but the crucial practical issue is computational complexity, even today with fast, inexpensive modern computers, e.g., graphics processing units (GPUs). See Ristic et al. (2004) for a book which is both accessible to engineers and thorough.


Bayesian Formulation of Filtering Problem

The Bayesian approach to nonlinear filters is by far the most popular formulation of the problem (see Ho and Lee 1964), and it has virtually eliminated all other competing theories, because it is simple, general, systematic, and useful. All ten nonlinear filters listed in Table 1 are Bayesian. The Bayesian approach uses a model of the dynamics of x as well as a model of the measurements. For example, discrete-time dynamics and measurement models are typically of the form

$$x(t_{k+1}) = f(x(t_k), t_k) + w(t_k)$$

$$z(t_k) = h(x(t_k), t_k) + v(t_k)$$

in which $x(t_k)$ is the d-dimensional state vector at time $t_k$, $z(t_k)$ is the m-dimensional measurement vector at time $t_k$, v is the measurement noise, and w is the so-called process noise. Both v and w are often modeled as Gaussian zero-mean random processes with statistically independent values at distinct discrete times, but these models could be highly non-Gaussian with statistically correlated random values. The initial probability density of x before any measurements are available is also used in the Bayesian formulations. Real physical systems are most commonly modeled as evolving in continuous time using Ito stochastic differential equations:

$$dx = f(x(t), t)\,dt + dw$$

However, most engineers would rather think of the above Ito equation as an ordinary differential equation driven by Gaussian white noise:

$$dx/dt = f(x(t), t) + dw/dt$$

Mathematicians prefer the Ito equation to avoid the embarrassment that the time derivative of w(t) does not exist. For details of stochastic calculus, see Jazwinski (1998). Such mathematical subtleties rarely cause any trouble in practical engineering applications. We emphasize, however, that it is important to correctly model continuous-time random processes for the evolution of the state vector (x) in many practical applications. Similarly, one can model measurements in continuous time using Ito calculus:

$$dz = h(x(t), t)\,dt + dv$$

Most engineers consider continuous-time measurement models impractical and unnecessarily complicated mathematically, because digital computers always require discrete-time measurements and there are no practical analog computers that can be used for nonlinear filtering, owing to the overwhelming superiority of digital computers in terms of accuracy, stability, dynamic range, and flexibility. Nevertheless, there are many papers published by researchers using continuous-time measurement models. But the vast majority of practical papers on nonlinear filters use discrete-time measurement models for obvious reasons. This contrasts sharply with the practical importance of correctly modeling continuous-time random processes for the evolution of the state vector (x).

Nonlinear Filters, Table 1 Summary of nonlinear filters

| Nonlinear filter | Conditional probability density | Computational complexity | Comments | References |
|---|---|---|---|---|
| 1. Extended Kalman filter (EKF) | Gaussian | $d^3$ | Gives good accuracy for many practical applications but can be highly suboptimal in difficult problems | Gelb et al. (1974) |
| 2. Unscented Kalman filter (UKF) | Gaussian | $d^3$ | Often the UKF beats the EKF, but sometimes the EKF is better than the UKF; see Noushin (2008) for details | Julier and Uhlmann (2003) |
| 3. Batch least squares | Gaussian | $d^3$ | Often beats the EKF accuracy but can fail for multimodal or other strongly non-Gaussian densities | Sorenson (1980) |
| 4. Particle filter | Arbitrary | Varies from $d^3$ to exponential in d, depending on many features of the problem | Often beats the EKF accuracy but can fail due to the curse of dimensionality and particle degeneracy and ill-conditioning | Doucet (2011) |
| 5. Cubature Kalman filter | Gaussian | $d^3$ | Sometimes beats the EKF and UKF for difficult nonlinear non-Gaussian problems, but not always | Haykin (2010) |
| 6. Gaussian sum | Arbitrary | Varies from $d^3$ to exponential in d, depending on many features of the problem | Beats the EKF for certain difficult nonlinear non-Gaussian problems | Sorenson (1988) |
| 7. Exact finite-dimensional filters | Exponential family | $d^3$ | Beats the EKF for certain difficult nonlinear non-Gaussian problems | Daum (2005) |
| 8. Implicit particle filters | Arbitrary | Suffers from the curse of dimensionality (i.e., computation time grows exponentially in d) | Only low-dimensional numerical examples have been published so far | Chorin (2009) |
| 9. Particle flow filter | Arbitrary | Faster than standard particle filters by many orders of magnitude for high-dimensional problems (but unfortunately there is no explicit formula for computation time) | Beats the EKF by orders of magnitude for certain difficult nonlinear non-Gaussian problems | Daum (2013) |
| 10. Numerical solution of Fokker-Planck equation | Arbitrary | Suffers from the curse of dimensionality (i.e., computation time grows exponentially in d) | Beats the EKF by orders of magnitude for certain difficult nonlinear non-Gaussian problems | Ristic (2004) |
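As a rough illustration of the modeling choices in this section, the sketch below simulates a scalar continuous-time Ito model for the state with an Euler-Maruyama scheme and then takes measurements only at discrete times. The drift, the measurement function, the noise intensities `q` and `r`, and the step sizes are all hypothetical values chosen for the example.

```python
import math
import random

def simulate_sde(f, x0, q, dt, n_steps, rng):
    """Euler-Maruyama discretization of the scalar Ito equation dx = f(x, t) dt + dw.

    The Brownian increment over one step has variance q*dt, so its standard
    deviation scales with sqrt(dt), not with dt.
    """
    x, t, path = x0, 0.0, [x0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(q * dt))
        x = x + f(x, t) * dt + dw
        t += dt
        path.append(x)
    return path

def sample_measurements(path, h, r, every, rng):
    """Discrete-time measurements z(t_k) = h(x(t_k)) + v(t_k), with v ~ N(0, r)."""
    return [h(x) + rng.gauss(0.0, math.sqrt(r)) for x in path[::every]]

rng = random.Random(0)
f = lambda x, t: -x ** 3          # hypothetical nonlinear drift
h = lambda x: math.atan(x)        # hypothetical nonlinear measurement function
path = simulate_sde(f, x0=1.0, q=0.1, dt=0.01, n_steps=1000, rng=rng)
z = sample_measurements(path, h, r=0.01, every=100, rng=rng)
print(len(path), len(z))  # prints: 1001 11
```

The state is propagated on a fine time grid, whereas the measurement list contains one value per sampling instant, mirroring the continuous-time dynamics and discrete-time measurement combination favored in the text.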

Nonlinear Filter Algorithms 

There is no universally best nonlinear filter for all applications, and there is much debate about which is the best nonlinear filter for any given application. Even if we knew the best nonlinear filter for a given computer, the answer could be very different for a different computer; in particular, some filters can exploit massively parallel processing architectures, whereas others cannot. Research and development of nonlinear filters should continue rapidly for the foreseeable future. More generally, there is no universal theory of computational complexity for practical algorithms of this type; perhaps the closest approximation to such a theory is "information-based complexity" (IBC); e.g., see Traub and Werschulz (1998) and Dick et al. (2013). The estimation accuracy of x and the computational complexity of the nonlinear filter are intimately connected, as shown below for particle filters.

There is no useful way to quantify the computational complexity of nonlinear filters without also quantifying estimation accuracy of x. This contrasts with standard computational complexity theory (e.g., P vs. NP) because we are interested in approximations rather than exact solutions. This is the basic idea of IBC. In practice, engineers compare the estimation accuracy and computational complexity of different nonlinear filters using Monte Carlo simulations for specific applications and specific computers.

The most active area of current research in nonlinear filters is focused on particle filters, which have the promise of optimal accuracy for essentially any nonlinear filter problem, at the cost of very high computational complexity for high-dimensional problems. In the early days (1994-2004), researchers often asserted that particle filters "beat the curse of dimensionality," but it is well known today that this assertion is wrong (e.g., see Daum 2005). Unfortunately, there is no useful theory of computational complexity for particle filters, but rather the currently available theory gives asymptotic bounds on accuracy with generic "constants." Such bounds on the variance of estimation error are generally of the form c/N in which N is the number of particles and c is the generic so-called constant. But we know that the so-called constant actually varies by many orders of magnitude depending on the specifics of the problem, including the following: (1) dimension of the state vector being estimated, (2) uncertainty in the initial state vector, (3) measurement accuracy, (4) stability of the dynamical system that describes the time evolution of the state vector, (5) geometry of the conditional probability densities (e.g., unimodal, log-concave, multimodal, etc.), (6) Lipschitz constants of the log probability densities, (7) curvature of the nonlinear dynamics and measurements, (8) ill-conditioning of the Fisher information matrix for the estimation problem, etc. Moreover, there are no tight bounds on the so-called constant c for practical nonlinear filter problems, but rather the best bounds for simple MCMC problems are known to be 30 orders of magnitude too large; see Dick et al. (2013).
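To make the role of the number of particles N concrete, the following is a minimal sketch of a bootstrap particle filter for a scalar problem. It is one simple member of the particle filter family, not a method prescribed by the text; the dynamics `f`, the measurement function `h`, the noise levels, and the multinomial resampling step are all assumptions made for the example.

```python
import math
import random

def f(x):           # hypothetical nonlinear dynamics
    return 0.9 * x + 0.5 * math.sin(x)

def h(x):           # hypothetical nonlinear (monotone) measurement function
    return x + 0.1 * x ** 3

def bootstrap_pf(zs, n_particles, q_std, r_std, rng):
    """Bootstrap particle filter: predict, weight by likelihood, resample."""
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # initial density N(0, 1)
    estimates = []
    for z in zs:
        # Prediction: push every particle through the dynamics plus process noise.
        particles = [f(p) + rng.gauss(0.0, q_std) for p in particles]
        # Unnormalized Bayes' rule: weight each particle by the likelihood p(z | x).
        weights = [math.exp(-0.5 * ((z - h(p)) / r_std) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:  # all likelihoods underflowed; fall back to uniform weights
            weights, total = [1.0] * n_particles, float(n_particles)
        weights = [w / total for w in weights]
        # Conditional-mean estimate from the weighted particles.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Multinomial resampling to limit weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates

rng = random.Random(1)
q_std, r_std = 0.3, 0.5
x, xs, zs = 0.5, [], []
for _ in range(50):
    x = f(x) + rng.gauss(0.0, q_std)
    xs.append(x)
    zs.append(h(x) + rng.gauss(0.0, r_std))

est = bootstrap_pf(zs, n_particles=500, q_std=q_std, r_std=r_std, rng=rng)
rmse = math.sqrt(sum((e - t) ** 2 for e, t in zip(est, xs)) / len(xs))
print(f"RMSE with 500 particles: {rmse:.3f}")
```

Repeating the run for several values of `n_particles` and several random seeds is the kind of Monte Carlo comparison described above; in a one-dimensional problem such as this a few hundred particles are typically adequate, which is not representative of high-dimensional problems.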
Discrete-Time Measurement Models

Research papers on nonlinear filters are often mathematically abstract, but advanced math is not required for practical engineering applications (e.g., see Ho and Lee 1964). In particular, one can avoid the advanced stochastic mathematics used for continuous-time measurements by using discrete-time measurements, which is the practical case of interest anyway, owing to the use of digital computers to implement such algorithms. The notion that continuous-time measurements result in simpler, better, or more elegant results for nonlinear filters is misleading; for example, we have the elegant innovation theory for continuous-time measurements (Kailath 1970), but this theory is not applicable for discrete-time measurements, likewise with the elegant formula for propagating the conditional mean for continuous-time measurements (the so-called Fujisaki-Kallianpur-Kunita formula). More generally, the simple discrete-time version of Bayes' rule suffices for practical real-world engineering applications; there is rarely a need to employ the more complex continuous-time version. The discrete-time formula for Bayes' rule is simply

$$p(x(t_k), t_k \mid Z_k) = \frac{p(x(t_k), t_k \mid Z_{k-1})\, p(z_k \mid x(t_k))}{p(z_k \mid Z_{k-1})}$$

in which

$p(x(t_k), t_k \mid Z_k)$ = probability density of x at time $t_k$ conditioned on $Z_k$; this is also called the "a posteriori probability density"

$x(t)$ = state vector of the dynamical system at time t

$Z_k$ = set of all measurements up to and including time $t_k$

$z_k$ = measurement vector at time $t_k$

$p(z_k \mid x(t_k))$ = probability density of $z_k$ conditioned on $x(t_k)$; this is also called the "likelihood"

$p(A \mid B)$ = probability density of A conditioned on B

This is all one needs to know about Bayes' rule for practical engineering applications of nonlinear filtering; see Ho and Lee (1964). Bayes' rule is a simple formula that multiplies two probability densities and normalizes the product by dividing by $p(z_k \mid Z_{k-1})$. In most applications, there is no need to normalize the density, and hence, Bayes' rule for the unnormalized conditional density is even simpler:

$$p(x(t_k), t_k \mid Z_k) = p(x(t_k), t_k \mid Z_{k-1})\, p(z_k \mid x(t_k))$$

We see that Bayes' rule for the unnormalized conditional density is simply a multiplication of two densities (i.e., the likelihood and the prior).
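The discrete-time Bayes' rule above can be sketched directly on a grid of state values. In the example below the grid, the Gaussian prior, the measurement function `h`, and the noise level `r_std` are illustrative assumptions; a linear measurement is used only so that the result can be compared with the known closed-form posterior mean of 0.8 for this particular case.

```python
import math

def bayes_update(grid, prior, z, h, r_std):
    """Pointwise Bayes' rule on a grid: posterior is prior times likelihood, normalized."""
    # Gaussian likelihood p(z | x), with its constant factor dropped.
    likelihood = [math.exp(-0.5 * ((z - h(x)) / r_std) ** 2) for x in grid]
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    dx = grid[1] - grid[0]
    # Normalizing constant, proportional to p(z_k | Z_{k-1}).
    evidence = sum(unnormalized) * dx
    return [u / evidence for u in unnormalized]

dx = 0.01
grid = [-8.0 + i * dx for i in range(1601)]
prior = [math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi) for x in grid]  # N(0, 1)

posterior = bayes_update(grid, prior, z=1.0, h=lambda x: x, r_std=0.5)
mean = sum(x * p for x, p in zip(grid, posterior)) * dx
print(f"posterior mean = {mean:.3f}")  # prints: posterior mean = 0.800
```

Replacing `h` by a nonlinear function requires no change to `bayes_update`, but the number of grid points needed grows exponentially with the state dimension, which is the curse of dimensionality discussed in this entry.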

Summary and Future Directions 

In practical applications, the most popular nonlinear filter is the extended Kalman filter (EKF), followed by the unscented Kalman filter (UKF). These two filters give good accuracy and robust performance for many practical applications. The computational complexity of both the EKF and UKF grows as the cube of the dimension of the state vector, and hence, they are very practical to run in real time on laptops or PCs for many real-world applications. But there are also many difficult nonlinear or non-Gaussian problems for which the EKF and UKF give suboptimal accuracy, and in some cases, they give surprisingly bad accuracy. The accuracy of optimal nonlinear filters is limited by the curse of dimensionality. We know how to write the equations for the optimal nonlinear filter, but the solution generally takes an exponentially increasing time to compute as the dimension of the state vector grows. There are many different kinds of nonlinear filters, and this is still an active field of research, as shown in Crisan and Rozovskii (2011). Future research is likely to exploit advances in computational complexity theory for approximation of functions in the style of information-based complexity (IBC) rather than P vs. NP theory. This is because we want good fast approximations rather than exact algorithms. A lucid introduction to IBC is Traub and Werschulz (1998), and recent work is surveyed in Dick et al. (2013). Another fruitful direction of research is to exploit the recent advances in transport theory, as explained in Daum (2013); the best introduction to transport theory is the book by Villani (2003), which is very accessible yet thorough. Research in exact finite-dimensional filters is difficult but could yield substantial improvements in accuracy and computational complexity; for example, see Benes (1981), Marcus (1984), and Daum (2005). Progress in nonlinear filter research could be inspired by many diverse fields, including fluid dynamics, quantum chemistry, quantum field theory, gauge theory, string theory, Lie superalgebras, Lie supergroups, and neuroscience. An important open research topic is the stability of nonlinear filters, which is obviously a fundamental limitation to good theoretical upper bounds on estimation error. We still do not have a practical theory of stability for nonlinear filters. Perhaps the closest approximation to such a theory is the lucid paper by van Handel (2010), which makes an interesting attempt at understanding the stability of nonlinear filters. In particular, van Handel's paper aims to generalize Kalman's theory of stability for the Kalman filter by connecting stability with the essence of controllability and observability. A good survey of what is known about stability theory for nonlinear filters is given in various articles in Crisan and Rozovskii (2011).

Cross-References 

► Estimation, Survey on 

► Extended Kalman Filters 

► Kalman Filters 

► Particle Filters 
Abstract

Sampled-data systems are control systems in which the feedback law is digitally implemented via a computer. They are prevalent nowadays due to the numerous advantages they offer compared to analog control. Nonlinear sampled-data systems arise in this context when either the plant model or the controller is nonlinear. While their linear counterpart is now a mature area, nonlinear sampled-data systems are much harder to deal with and, hence, much less understood. Their inherent complexity leads to a variety of methods for their modeling, analysis, and design. A summary of these methods is presented in this entry.


Keywords 

Discrete time; Nonlinear; Sampled data; Sampler; Zero-order hold


Introduction 

Definition: A control system in which a continuous-time plant is controlled by a digital computer is referred to as a sampled-data control system or simply a sampled-data system (Chen and Francis 1994); see Fig. 1. Nonlinear sampled-data systems arise when either the model of the plant or the controller is nonlinear; otherwise the system is referred to as a linear sampled-data system.

Motivation: Sampled-data control is preferable to continuous-time (analog) control for a range of reasons including reduced cost, reduced wiring, more robust hardware, easier and more flexible programming, and so on. Nowadays, a large majority of controllers are implemented on digital computers, and, hence, sampled-data systems are prevalent in practice. On the other hand, nonlinear plant models are necessary in numerous applications when a wide range of operating conditions needs to be considered or when truly nonlinear phenomena, such as friction or state/input constraints, are not negligible. Hence, there are many situations where nonlinear plant models are essential, such as vertical takeoff and landing of an aircraft, robots, automotive engines, and biochemical reactors, to name a few. Note that the nonlinearity may also come from the controller even when the plant is linear, as is the case in adaptive control or model predictive control with constraints, for example.

Structure of sampled-data systems: Figure 1 presents a typical structure of a sampled-data system which consists of a continuous-time plant, an analog-to-digital (A/D) converter (i.e., a sampler), a digital-to-analog (D/A) converter (i.e., a hold device), and a discrete-time controller.

The A/D converter takes measurements $y(t_k)$ of a continuous-time output signal $y(t)$, such as temperature or pressure, at sampling time instants $t_k$, $k = 0, 1, \ldots$ and sends them to the control algorithm. The measurements are obtained with finite precision (i.e., they are quantized); this effect is not considered in this entry. The sampling instants $t_k$ are often equidistant, that is, $t_k = kT$, $k = 0, 1, \ldots$, where the distance $T$ between any two consecutive sampling instants is referred to as the sampling period. The sampling period is an important degree of freedom in the design of sampled-data systems and it needs to be carefully selected.

The control algorithm is discrete in nature. It takes the sequence of measurements $y(t_k)$ and processes them to produce a sequence of control values $u(t_k)$. The D/A converter converts the sequence of control values $u(t_k)$ into a continuous-time signal $u(t)$ that drives the actuators which control the plant. Typically, a zero-order hold is used, i.e., $u(t) = u(t_k)$, $\forall t \in [t_k, t_{k+1})$. However, it is possible to use other types of holds.
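The sampler, control algorithm, and zero-order hold described above can be sketched as a simple simulation loop. The plant, the feedback law, the sampling period, and the use of Euler substeps to approximate the continuous-time evolution are all hypothetical choices made for illustration.

```python
def simulate_sampled_data(f, controller, x0, T, n_samples, substeps=100):
    """Continuous-time plant x' = f(x, u) in closed loop with a sampler and a zero-order hold.

    The plant is integrated with small Euler substeps; the control value is
    recomputed only at the sampling instants t_k = k*T and held in between.
    """
    x, h = x0, T / substeps
    xs, us = [x0], []
    for k in range(n_samples):
        y_k = x                      # sampler (A/D): y(t_k), quantization ignored
        u_k = controller(y_k)        # discrete-time control algorithm
        us.append(u_k)
        for _ in range(substeps):    # hold (D/A): u(t) = u(t_k) on [t_k, t_{k+1})
            x += h * f(x, u_k)
        xs.append(x)
    return xs, us

# Hypothetical plant and controller, chosen only for illustration.
plant = lambda x, u: x ** 3 + u
feedback = lambda y: -y ** 3 - 2.0 * y

xs, us = simulate_sampled_data(plant, feedback, x0=1.0, T=0.05, n_samples=200)
print(f"x(t_200) = {xs[-1]:.2e}")
```

Increasing `T` in this sketch is a quick way to see why the sampling period must be selected carefully: the held control value is based on an increasingly outdated measurement.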
Nonlinear Sampled-Data Systems, Fig. 1 Sampled-data (control) system



Note that the system in Fig. 1 can be generalized in many ways. An important generalization is multi-rate sampling, where the output of the system is sampled at one sampling rate while the control inputs are updated at a different sampling rate. Another generalization is networked control systems, which are discussed in the last section.


Modeling 

The combination of continuous-time and discrete-time components renders the analysis and the design of sampled-data systems challenging. Still, linear systems allow for computationally efficient analysis and design techniques that benefit from the z and δ transforms, as well as convex optimization (Chen and Francis 1994). Nonlinear sampled-data systems, on the other hand, are much harder to deal with since the aforementioned methods do not apply in this case. This inherent difficulty has led to a variety of models for different analysis and design methods:

1. Continuous-time models

2. Discrete-time models

3. Sampled-data models

We discuss below each of these models, their features, and the analysis or design methods that exploit them.

Continuous-time models basically ignore the sampling process and assume that all signals are continuous time. They are the coarsest approximation of the sampled-data system and they are useful only for very small sampling periods. Nevertheless, they are invaluable and are used as the first step in the controller/observer design in the so-called emulation design approach.

Discrete-time models only capture the behavior of the sampled-data system at sampling instants. Indeed, they ignore the inter-sample behavior of the system and this is their main drawback. There are two ways in which nonlinear discrete-time models arise: (i) from the identification of the plant model using the sampled measurements and (ii) from the discretization of a known continuous-time plant model. For instance, black box identification methods often lead to nonlinear discrete-time models in input-output form, such as NARMA (nonlinear auto-regressive moving average) models (Chen et al. 1989; Juditsky et al. 1995). Depending on the approximating functions used, the nonlinearities can be polynomial, neural network type, fuzzy type, and so on. On the other hand, the discretization of the continuous-time plant model requires an exact analytic solution of a set of nonlinear differential equations. When such an analytic solution exists, we can obtain the exact discrete-time model of the system; this is typically assumed for linear plants. Nonlinear sampled-data systems are different from their linear counterparts in that it is typically impossible to obtain the exact discrete-time model and only approximate discrete-time models are available for analysis and design (Nesic et al. 1999; Nesic and Teel 2004).
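The gap between an approximate and the exact discrete-time model can be illustrated numerically. The sketch below uses a forward-Euler step as the approximate model (one common choice, assumed here) and a fine-step Runge-Kutta integration as a numerical stand-in for the exact model, which is generally not available in closed form. The plant and the test point are hypothetical.

```python
def f(x, u):                       # hypothetical plant: x' = -x^3 + u
    return -x ** 3 + u

def euler_model(x, u, T):
    """Approximate discrete-time model: one forward-Euler step of length T."""
    return x + T * f(x, u)

def reference_model(x, u, T, substeps=1000):
    """Numerical stand-in for the exact model: fine-step RK4 with u held constant."""
    h = T / substeps
    for _ in range(substeps):
        k1 = f(x, u)
        k2 = f(x + 0.5 * h * k1, u)
        k3 = f(x + 0.5 * h * k2, u)
        k4 = f(x + h * k3, u)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x

def one_step_error(T, x=1.0, u=0.5):
    return abs(reference_model(x, u, T) - euler_model(x, u, T))

for T in (0.2, 0.1, 0.05):
    print(f"T = {T:<5} one-step mismatch = {one_step_error(T):.2e}")
```

Halving the sampling period reduces the one-step mismatch by roughly a factor of four in this example, which reflects the second-order local error of the Euler model and suggests why results based on approximate models are usually stated for sufficiently small sampling periods.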

Sampled-data models capture the true behavior of the sampled-data system including its inter-sample behavior. There are several ways in which this can be achieved. One way is to model the piecewise constant signals that arise from zero-order hold devices as signals with a time-varying delay; this gives rise to time-delay nonlinear models (Teel et al. 1998). Another recently proposed approach is to model nonlinear sampled-data systems as hybrid dynamical systems (Goebel et al. 2012). An extensive analysis and design toolbox has been developed for hybrid dynamical systems and these results can be used for nonlinear sampled-data systems. Another class of models, based on the so-called lifting, has been applied for linear systems where the system is represented as a discrete-time system with infinite dimensional input and output spaces. While this approach has been very successful in the linear context (Chen and Francis 1994), it appears that it is not as useful for nonlinear systems due to difficulties arising from harder analysis and prohibitive computational requirements.

The Main Issues and Analysis 

Controllability/observability: Issues arising due to sampling in linear systems transfer to the nonlinear context although they are less understood in this case. For instance, it is well known that sampling may "destroy" the controllability and/or observability properties of the system (Chen and Francis 1994). In other words, if the continuous-time plant model is controllable/observable, then the corresponding exact discrete-time model of the plant may not possess these properties for some sampling periods. A simple test is available for linear systems to avoid this phenomenon, but we are not aware of similar results in the nonlinear context.

Finite escape times: A major difference between continuous-time linear and nonlinear systems is that the former have well-defined solutions for constant control inputs and arbitrarily long sampling periods. This is not the case, in general, for nonlinear systems as they may exhibit finite escape times. In other words, for a constant input it may happen for some initial conditions of a nonlinear system that solutions blow up within a time that is shorter than the sampling period. As a consequence, for such an initial condition and input, the exact discrete-time system cannot be defined. This is a fundamental obstacle to achieving global stability results for nonlinear systems if the sampling period is fixed and independent of the size of the initial state. Nevertheless, it is possible to ensure semi-global stability properties for very general nonlinear systems, which means that any compact domain of convergence can be achieved if the sampling period is sufficiently reduced (Nesic and Teel 2004).
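A standard textbook example of a finite escape time, used here purely for illustration, is the scalar system $\dot{x} = x^2$ with the input held at zero, whose solution $x(t) = x_0 / (1 - x_0 t)$ blows up at $t = 1/x_0$ for $x_0 > 0$. The sketch below evaluates the exact discrete-time model when it exists and reports when it does not; the function name and the numerical values are assumptions of the example.

```python
def exact_step(x0, T):
    """Exact discrete-time model of x' = x^2 with zero (held) input over one period T.

    The solution is x(t) = x0 / (1 - x0*t), which blows up at t = 1/x0 when
    x0 > 0. Returns None if the escape time falls within the sampling period.
    """
    if x0 > 0 and x0 * T >= 1.0:
        return None                  # finite escape before the next sample
    return x0 / (1.0 - x0 * T)

T = 0.5
print(exact_step(1.0, T))    # 2.0  (escape time 1.0 exceeds T)
print(exact_step(3.0, T))    # None (escape time 1/3 is shorter than T)
print(exact_step(3.0, 0.1))  # defined again once T is reduced below 1/3
```

For a fixed sampling period the exact model is undefined for all sufficiently large initial states, whereas for any bounded set of initial states a small enough period restores it, which mirrors the semi-global nature of the results mentioned above.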

Model structure is changed: An important issue for nonlinear sampled-data systems is that the sampling modifies the structure of the model. When the continuous-time plant model has a certain structure, such as triangular or affine in the input, the corresponding exact discrete-time model will not inherit it; see Monaco and Normand-Cyrot (2007) and Yuz and Goodwin (2005). This significantly complicates the design of sampled-data systems via the discrete-time approach since many nonlinear design techniques, like backstepping or forwarding, are heavily reliant on the structure of the model.

Zero dynamics: Probably the most significant aspect of the changed structure is the appearance of the so-called sampling zeros. In linear systems, it is well known that if a continuous-time linear system of relative degree $r \geq 2$ is sampled, then generically for fast sampling the discrete-time models of the plant will have relative degree $r = 1$. In other words, sampling introduces extra zeros in the model which are often unstable and thus render the system non-minimum phase. It is well known that the controller design is much harder for non-minimum phase systems, and, moreover, there are certain fundamental performance limitations in this case. Recently, results that extend the notion of sampling zeros to nonlinear sampled-data systems have been reported; see the references in Monaco and Normand-Cyrot (2007).
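The appearance of a sampling zero can be seen in the simplest linear case, the double integrator, which has relative degree two and no zeros in continuous time. The sketch below uses its well-known exact zero-order-hold discretization; the helper names and the sampling period are illustrative assumptions.

```python
def zoh_double_integrator(T):
    """Exact ZOH discretization of x1' = x2, x2' = u, y = x1 (continuous relative degree 2)."""
    Ad = [[1.0, T], [0.0, 1.0]]
    Bd = [T * T / 2.0, T]
    C = [1.0, 0.0]
    return Ad, Bd, C

def numerator_coeffs(Ad, Bd, C):
    """Coefficients (b1, b0) of the numerator b1*z + b0 of C (zI - Ad)^{-1} Bd for a 2-state system."""
    (a11, a12), (a21, a22) = Ad
    # adj(zI - Ad) = [[z - a22, a12], [a21, z - a11]]
    b1 = C[0] * Bd[0] + C[1] * Bd[1]
    b0 = (C[0] * (-a22 * Bd[0] + a12 * Bd[1])
          + C[1] * (a21 * Bd[0] - a11 * Bd[1]))
    return b1, b0

T = 0.1
Ad, Bd, C = zoh_double_integrator(T)
b1, b0 = numerator_coeffs(Ad, Bd, C)
zero = -b0 / b1
print("first Markov parameter C*Bd =", b1)   # nonzero, so the discrete relative degree is 1
print("sampling zero at z =", zero)
```

The discrete-time model has a nonzero first Markov parameter and a zero at $z = -1$, on the unit circle, for every sampling period. For continuous-time relative degree three, the limiting sampling zeros are known to include one strictly outside the unit circle, which is the non-minimum phase situation described above.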

Passivity: Some plant properties like passivity 
are much more restrictive in discrete time than 
in continuous time. Indeed, it is necessary for a 
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continuous-time plant to have relative degree 1 
or 0 to be passive, whereas only relative degree 
0 discrete-time plants may possess this property. 
In other words, an exact discrete-time model of 
a passive continuous-time plant of relative degree 
1 will not be passive; that is, sampling typically 
destroys passivity. 
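A worked illustration of the loss of passivity, under the assumption of a single integrator with zero-order hold and sampling period T (the plant, the storage function V, and the test input are illustrative choices, not taken from the text):

```latex
% Continuous time: \dot{x} = u, \; y = x (relative degree 1), storage V(x) = x^2/2.
\dot{V} = x\,\dot{x} = y\,u
\quad\Longrightarrow\quad \text{the continuous-time plant is passive.}

% Exact discrete-time model: x_{k+1} = x_k + T u_k, \; y_k = x_k (no direct feedthrough).
% Start at x_0 = 0 and apply u_0 = 1, \; u_1 = -M with M > 0:
y_0 u_0 = 0, \qquad y_1 u_1 = (T u_0)\,u_1 = -T M,
\qquad \sum_{k=0}^{1} y_k u_k = -T M \;\to\; -\infty \quad (M \to \infty).
```

Passivity would require the supplied energy, the sum of y_k u_k, to be bounded from below by a constant depending only on the initial state, for all inputs. The sum above can be made arbitrarily negative from x_0 = 0, so the sampled integrator is not passive for any T > 0.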

Controller Design 

Linearization: The simplest way to design 
sampled-data nonlinear systems is to linearize the 
plant at a given operating point. In this case, the 
nonlinear plant dynamics are approximated by a 
linear model around a chosen equilibrium, and 
then any of the linear sampled-data techniques 
can be applied to the linearized model. The 
obtained solution is then implemented on the true 
nonlinear plant. The drawback of this technique 
is that the solution would typically perform well 
only in the vicinity of the selected equilibrium 
point. 

Nonlinear methods: An alternative is to perform
designs that rely on a nonlinear plant model.
These approaches can be divided into feedback
linearization, the emulation design method, the
(approximate and exact) discrete-time design
methods, and the sampled-data design method.

Feedback linearization: Some classical problems,
like feedback linearization, are harder for
sampled-data systems than for continuous-time ones.
It was shown that the class of discrete-time nonlinear
systems for which feedback linearization is
possible is smaller than the corresponding class
of continuous-time systems (Grizzle 1987). This
has led to approximate feedback linearization
techniques, which achieve feedback
linearization approximately, with an error that can
be reduced by reducing the length of the sampling
period (Arapostathis et al. 1989).

Continuous-time design method (Emulation
design): Emulation is a design technique consisting
of two steps. In the first step, a continuous-time
controller or observer is designed for the
continuous-time plant while ignoring sampling
to achieve appropriate stability, performance,
and/or robustness guarantees. In the second step,
the designed controller/observer is discretized
for implementation and the sampling period is
reduced sufficiently for the method to work. This
method is approximate since the continuous-time
plant model approximates the sampled-data
system well only for sufficiently small sampling
periods. The discretization can be done using various
implicit or explicit Runge-Kutta methods, such as
the forward or backward Euler method (Monaco
and Normand-Cyrot 2007; Yuz and Goodwin
2005). The emulation method is probably the
best understood of all design methods. It was
shown that a range of stability properties that
can be cast in terms of dissipation inequalities
are preserved in an appropriate sense under the
emulation approach (Laila et al. 2002). Moreover,
nonconservative estimates of the upper bound for
the required sampling period in emulation have
been reported recently (Nesic et al. 2009).
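The two emulation steps, and the role of the sampling period, can be sketched on a toy problem. The scalar plant x' = x^3 + u, the continuous-time controller u = -x^3 - x (which gives x' = -x without sampling), the initial condition, and the function name below are all assumptions made for illustration; they do not come from the text. The controller is evaluated at the sampling instants and held constant in between, while the plant is integrated with a fine RK4 step.

```python
def simulate_emulation(x0, T, n_samples=200, substeps=100, bound=1e3):
    """Simulate x' = x^3 + u with the emulated controller u = -x^3 - x.

    The control is computed at the sampling instants k*T and held
    constant in between (zero-order hold). The plant is integrated with
    classical RK4 using `substeps` steps per sampling period.
    Returns (final_state, escaped); escaped is True if |x| exceeded
    `bound`, which is treated as divergence.
    """
    def rk4_step(x, u, dt):
        def f(s):
            return s * s * s + u
        k1 = f(x)
        k2 = f(x + 0.5 * dt * k1)
        k3 = f(x + 0.5 * dt * k2)
        k4 = f(x + dt * k3)
        return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

    x = x0
    dt = T / substeps
    for _ in range(n_samples):
        u = -x * x * x - x            # controller sampled at t = k*T
        for _ in range(substeps):
            x = rk4_step(x, u, dt)    # plant evolves with u held constant
            if not abs(x) <= bound:   # written this way to also catch NaN
                return x, True
    return x, False


print(simulate_emulation(2.0, 0.1))   # small sampling period
print(simulate_emulation(2.0, 0.5))   # larger sampling period
```

With the smaller period the state decays toward the origin, whereas with the larger period the held control is outdated for too long and the trajectory escapes. This mirrors the statement that emulation works only once the sampling period is reduced sufficiently.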

Exact discrete-time design method: The exact
discrete-time design method assumes that
an exact discrete-time model of the plant is
available to the designer; see Kotta (1995)
and the references cited therein. This approach
is reasonable when black-box identification
techniques are used for modeling. Moreover,
in some rare cases it is possible to obtain
the exact discrete-time model of the plant by
integrating the continuous-time model with
fixed inputs (assuming a zero-order hold is
used). This is the case when the plant dynamics
are linear while the control law is nonlinear
(e.g., adaptive control) or the plant is linear
with state/input constraints, which is a setup
often used in model predictive control. The
literature on the exact discrete-time design method
is vast, and many of the nonlinear continuous-time
design techniques, like backstepping,
forwarding, and passivity-based designs, have been
extended to discrete-time nonlinear systems; see
Kotta (1995) and Grizzle (1987). A drawback
of these methods is that they assume a special
structure of the discrete-time nonlinear model,
such as upper or lower triangular structure, which
is typically much more restrictive in discrete
time than in continuous time because of the loss of
structure under sampling that was discussed
earlier.
Approximate discrete-time design method:
Due to the nonlinearity, it is impossible in
most cases to obtain an exact discrete-time
plant model by integrating its continuous-time
model equations; instead, a range of approximate
discrete-time plant models, such as Runge-Kutta
models, can be used for controller/observer design. It
was recently shown that this design method
may lead to disastrous consequences where the
controller stabilizes the approximate discrete-time
plant model for all (arbitrarily small)
sampling periods while the same controller
destabilizes the exact discrete-time plant model
for all sampling periods; see Nesic and Teel
(2004) and Nesic et al. (1999). This is true even
for linear systems and some commonly used
discretization techniques and controller designs.
These considerations have led to the development
of a framework for controller design based on
approximate discrete-time models (Nesic et al.
1999; Nesic and Teel 2004). This framework
provides checkable conditions on the continuous-time
plant model, the approximate discrete-time
model, and the controller that guarantee that
controllers designed in this manner
stabilize the exact discrete-time model and,
hence, the nonlinear sampled-data system for
sufficiently small sampling periods. The design
is based on families of approximate discrete-time
models parameterized by the sampling period,
and the design objectives are more demanding
than for continuous-time nonlinear systems.
Ideas from numerical analysis are adapted
to this context. This framework was used to
design controllers and observers for classes of
nonlinear sampled-data systems where typically
Euler approximate discretization is employed to
generate the approximate discrete-time model.
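The failure mode described above can be reproduced with a linear example. The sketch below assumes a triple integrator, its Euler model, and a deadbeat controller designed on the Euler model; the plant, the gains, and the function names are illustrative assumptions rather than material from the text. In the T-scaled coordinates (x1, T x2, T^2 x3) both closed loops are independent of T, and a direct calculation for these gains gives an exact closed-loop eigenvalue near -2.64.

```python
def closed_loop_step(x, T, model):
    """One step of the triple integrator under a deadbeat law.

    The gains place all poles of the *Euler* model at the origin.
    `model` selects which discrete-time plant model is propagated:
    "euler" (approximate) or "exact" (zero-order hold).
    """
    x1, x2, x3 = x
    u = -x1 / T**3 - 3.0 * x2 / T**2 - 3.0 * x3 / T
    if model == "euler":
        return [x1 + T * x2, x2 + T * x3, x3 + T * u]
    if model == "exact":
        return [
            x1 + T * x2 + T**2 / 2.0 * x3 + T**3 / 6.0 * u,
            x2 + T * x3 + T**2 / 2.0 * u,
            x3 + T * u,
        ]
    raise ValueError("model must be 'euler' or 'exact'")


def simulate(x0, T, model, n_steps):
    """Iterate the chosen closed-loop model n_steps times."""
    x = list(x0)
    for _ in range(n_steps):
        x = closed_loop_step(x, T, model)
    return x


def scaled_norm(x, T):
    """Max-norm in the T-scaled coordinates (x1, T*x2, T^2*x3)."""
    return max(abs(x[0]), abs(T * x[1]), abs(T * T * x[2]))


for T in (0.5, 0.1, 0.01):
    euler_end = scaled_norm(simulate([1.0, 0.0, 0.0], T, "euler", 3), T)
    exact_end = scaled_norm(simulate([1.0, 0.0, 0.0], T, "exact", 30), T)
    print(T, euler_end, exact_end)
```

The Euler closed loop reaches the origin in three steps for every T, while the exact closed loop grows without bound for every T. Reducing the sampling period therefore does not rescue this particular design.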

Sampled-data design method: Both emulation
and discrete-time design methods have their
drawbacks. Indeed, the former method ignores
the sampling at the design stage, whereas the
latter method ignores the inter-sample behavior
and may therefore produce unacceptable behavior
between sampling instants. Thus, methods that
use a sampled-data model of the plant for design
are much more attractive. There are two possible
ways in which this can be achieved for nonlinear
sampled-data systems.


The first approach consists of representing 
nonlinear sampled-data systems as systems with 
time-varying delays (Teel et al. 1998). However, 
controller design tools for such systems need to 
be further developed. 

The second approach involves representing
the nonlinear sampled-data system as a hybrid
dynamical system. Recent advances in modeling
and analysis of hybrid dynamical systems
(Goebel et al. 2012) offer great opportunities in
this context, but the full potential of this approach
is still to be exploited. Nonlinear sampled-data
systems are just a small subclass of hybrid
dynamical systems, and developing specific analysis
and design tools tailored to this class of
systems seems promising.

It should be emphasized that there are many 
related techniques, such as discrete-time adaptive 
control and model predictive control, that deal 
with classes of nonlinear sampled-data systems 
but are not a part of the mainstream nonlinear 
sampled-data literature. 

Summary and Future Directions 

Summary: Sampled-data control systems 
are nowadays prevalent and there are many 
situations where nonlinear models need to be 
used to deal with wider ranges of operating 
conditions, more restrictive constraints, and 
enhanced performance specifications. Despite 
their increasing importance, the design of 
nonlinear sampled-data systems remains largely 
unexplored, and it is much less developed than 
its continuous-time counterpart. The variety of
models, analysis, and design techniques makes
the nonlinear sampled-data literature very diverse,
and a comprehensive textbook reference or a
unifying approach is still missing. Many open
questions remain for nonlinear sampled-data 
systems, such as results on multi-rate sampling, 
design techniques based on sampled-data models, 
and other generalizations which are discussed 
below. 

Future Directions: In the 1990s, a new generation
of digitally controlled systems, generally
referred to as networked control systems (NCS),
evolved from the more classical sampled-data
systems; see Heemels et al. (2010)
and the references cited therein. These systems
exploit digital wired or wireless communication
networks within the control loops. Such a setup
is introduced to reduce the cost, weight, and
volume of the engineered systems, but its special
structure imposes new challenges due to
communication constraints, data packet dropouts,
quantization of data, varying sampling periods,
time delays, etc. At the same time, these systems
provide new flexibility due to the distributed
computation within the control system, which can
be used to improve performance and mitigate
some of the undesirable network effects on the
overall system performance. Moreover, embedded
microprocessors allow for event-triggered
and self-triggered sampling (Anta and Tabuada
2010), which are still largely unexplored, especially
for nonlinear systems. Design of NCS was identified
as one of the biggest challenges to the control
research community in the twenty-first century,
and more than a decade of intense research on this
topic still has not provided a comprehensive and
unifying approach for their analysis and design.
Novel results on modeling and Lyapunov stability
theory for (nonlinear) hybrid dynamical systems
appear to offer the right analysis and design tools, but
they are still to be converted into efficient and
easy-to-use design tools in the control engineers’
toolbox.


Cross-References 

► Event-Triggered and Self-Triggered Control 

► Hybrid Dynamical Systems, Feedback Control of

► Optimal Sampled-Data Control 

► Sampled-Data Systems 
Abstract

Particle filters are computational methods that
enable systematic inference in nonlinear/non-Gaussian
state-space models. Particle filters
are the most popular sequential Monte
Carlo (SMC) methods. This is a relatively recent
development, and the aim here is to provide
a brief exposition of these SMC methods and
how they are key enabling algorithms in solving
nonlinear system identification problems. The
particle filters are important for both frequentist
(maximum likelihood) and Bayesian nonlinear
system identification.

Keywords 

Bayesian; Backward simulation; Maximum likelihood;
Markov chain Monte Carlo (MCMC);
Particle filter; Particle MCMC; Particle smoother;

Introduction

The state-space model (SSM) offers a general
tool for modeling and analyzing dynamical phenomena.
The SSM consists of two stochastic processes:
the states $\{\mathbf{x}_t\}_{t \ge 1}$ and the measurements
$\{\mathbf{y}_t\}_{t \ge 1}$, which are related according to

$$\mathbf{x}_{t+1} \mid (\mathbf{x}_t = x_t) \sim f_\theta(x_{t+1} \mid x_t, u_t), \tag{1a}$$

$$\mathbf{y}_t \mid (\mathbf{x}_t = x_t) \sim h_\theta(y_t \mid x_t, u_t), \tag{1b}$$

and the initial state $\mathbf{x}_1 \sim \mu_\theta(x_1)$. We use bold
face for random variables and $\sim$ means “distributed
according to.” The notation $\mathbf{x}_{t+1} \mid (\mathbf{x}_t = x_t)$
stands for the conditional probability of $\mathbf{x}_{t+1}$
given $\mathbf{x}_t = x_t$. The state process $\{\mathbf{x}_t\}_{t \ge 1}$ is a
Markov process, implying that we only need to
condition on the most recent state $x_t$, since that
contains all information about the past. Furthermore,
$\theta$ denotes the parameters, and $f_\theta(\cdot)$ and $h_\theta(\cdot)$
are probability density functions encoding
the dynamic and the measurement models,
respectively. In the interest of a compact notation,
we will suppress the input $u_t$ throughout the text.

The SSM introduced in (1) is general in that
it allows for nonlinear and non-Gaussian
relationships. Furthermore, it includes both black-box
and gray-box models in state-space form.
Nonlinear black-box and gray-box models are
covered by ► Nonlinear System Identification:
An Overview of Common Approaches. The off-line
nonlinear system identification problem can
(slightly simplified) be expressed as recovering
information about the parameters $\theta$ based on the
information in the T measured inputs
$u_{1:T} = \{u_1, \dots, u_T\}$ and outputs $y_{1:T}$.
For a thorough exposition of the system
identification problem, we
refer to ► System Identification: An Overview.
Nonlinear system identification has a long history,
and a common assumption of the past has
been that of linearity and Gaussianity. This
assumption is very restrictive, and we have now
witnessed well over half a century of research
devoted to finding useful approximate algorithms
allowing this assumption to be weakened. This
development has significantly intensified during
the past two decades of research on sequential
Monte Carlo (SMC) methods (including particle
filters and particle smoothers). However, the use
of SMC for nonlinear system identification is
more recent than that. The aim here is to introduce
the key ideas enabling the use of SMC
methods in solving nonlinear system identification
problems, and as we will see, it is not a
matter of straightforward application. The
development of SMC-based identification follows two
clear trends that are indeed more general: (1) The
problems we are working with are analytically
intractable, and hence, the mindset has to shift
from searching for closed-form solutions to the
use of computational methods, and (2) the new
algorithms have basic building blocks that are
themselves algorithms. Both these trends call for
new developments.
Before the SMC methods are introduced
in section “Sequential Monte Carlo”, the
need for them is explained by formulating both
the Bayesian and the maximum likelihood
identification problems in sections “Bayesian
Problem Formulation” and “Maximum Likelihood
Problem Formulation”, respectively.
Solutions to these problems are then provided in
sections “Bayesian Solutions” and “Maximum
Likelihood Solutions”, respectively. Finally,
we give some intuition for online (recursive)
solutions in section “Online Solutions”, and in
section “Summary and Future Directions”, we
conclude with a summary and directions for
future research.

Bayesian Problem Formulation 

In formulating the Bayesian problem, the parameters
$\theta$ are modeled as unknown stochastic variables,
i.e., the model (1) needs to be augmented
with a prior density for the parameters, $\theta \sim p(\theta)$.
The aim in Bayesian system identification
is to compute the posterior density of $\theta$ given
the measurements, $p(\theta \mid y_{1:T})$. More generally,
we typically compute the joint posterior of the
parameters $\theta$ and the states $x_{1:T}$,

$$p(\theta, x_{1:T} \mid y_{1:T}) = p(x_{1:T} \mid \theta, y_{1:T})\, p(\theta \mid y_{1:T}). \tag{2}$$

By explicitly including the state variables in
the problem formulation according to (2), they
take on the role of auxiliary variables. The reason
for including the state variables $x_{1:T}$ as auxiliary
variables is that the alternative of excluding them
would require us to analytically marginalize the
states $x_{1:T}$. This is not possible for the model (1)
under study. However, once we have an approximation
of $p(\theta, x_{1:T} \mid y_{1:T})$ available, the density
$p(\theta \mid y_{1:T})$ is easily obtained by straightforward
marginalization.

Maximum Likelihood Problem Formulation

In formulating the maximum likelihood (ML)
problem, the parameters $\theta$ are modeled as unknown
deterministic variables. The ML formulation
offers a systematic way of computing point
estimates of the unknown parameters $\theta$ in a
model, by making use of the information available
in the obtained measurements $y_{1:T}$. The
ML estimate is obtained by finding the $\theta$ that
maximizes the so-called log-likelihood function,
which is defined as

$$\ell_T(\theta) = \log p_\theta(y_{1:T}) = \sum_{t=1}^{T} \log p_\theta(y_t \mid y_{1:t-1}), \tag{3}$$

with the convention $p_\theta(y_1 \mid y_{1:0}) = p_\theta(y_1)$.
Note that we use $\theta$ as a subindex to denote
that the corresponding probability density function
is parameterized by $\theta$, analogously to what
was done in (1). The one step ahead predictor
$p_\theta(y_t \mid y_{1:t-1})$ is computed by marginalizing
$p_\theta(y_t, x_t \mid y_{1:t-1}) = h_\theta(y_t \mid x_t)\, p_\theta(x_t \mid y_{1:t-1})$
w.r.t. $x_t$, i.e., integrating out $x_t$ from
$p_\theta(y_t, x_t \mid y_{1:t-1})$. To summarize, the ML estimate
$\hat{\theta}^{\mathrm{ML}}$ is obtained by solving the following
optimization problem:

$$\hat{\theta}^{\mathrm{ML}} = \arg\max_{\theta} \sum_{t=1}^{T} \log \int h_\theta(y_t \mid x_t)\, p_\theta(x_t \mid y_{1:t-1})\, \mathrm{d}x_t. \tag{4}$$

This problem formulation clearly reveals the important
fact that the nonlinear state inference
problem (here computing $p_\theta(x_t \mid y_{1:t-1})$) is
inherent in any maximum likelihood formulation
for identification of SSMs. For linear Gaussian
models, the Kalman filter offers closed-form
solutions for the state inference problem, but for
nonlinear models, there are no closed-form
solutions available.
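For the linear Gaussian special case, the decomposition of the log-likelihood into one step ahead predictors can be evaluated exactly with a Kalman filter. The sketch below assumes a scalar model x_{t+1} = a x_t + v_t, y_t = x_t + e_t with known noise variances q and r, and maximizes over a coarse grid of values for a; the model, the grid, and the function names are illustrative assumptions, not part of the text.

```python
import math
import random


def simulate_lgss(a, q, r, T, rng):
    """Simulate x_{t+1} = a x_t + v_t, y_t = x_t + e_t (scalar, Gaussian)."""
    x = rng.gauss(0.0, 1.0)                  # x_1 ~ N(0, 1)
    ys = []
    for _ in range(T):
        ys.append(x + rng.gauss(0.0, math.sqrt(r)))
        x = a * x + rng.gauss(0.0, math.sqrt(q))
    return ys


def log_likelihood(a, q, r, ys):
    """Sum over t of log p(y_t | y_{1:t-1}), computed by a Kalman filter."""
    m, P = 0.0, 1.0                          # predictive mean/variance of x_1
    ll = 0.0
    for y in ys:
        S = P + r                            # variance of y_t given y_{1:t-1}
        ll += -0.5 * (math.log(2.0 * math.pi * S) + (y - m) ** 2 / S)
        K = P / S                            # measurement update
        m_f = m + K * (y - m)
        P_f = (1.0 - K) * P
        m, P = a * m_f, a * a * P_f + q      # time update (predictor for t+1)
    return ll


rng = random.Random(0)
ys = simulate_lgss(a=0.8, q=1.0, r=0.5, T=2000, rng=rng)
grid = [0.50 + 0.05 * k for k in range(10)]  # candidates 0.50, ..., 0.95
a_ml = max(grid, key=lambda a: log_likelihood(a, 1.0, 0.5, ys))
print("grid ML estimate of a:", round(a_ml, 2))
```

In the nonlinear case the predictive density inside the sum has no closed form, which is where the SMC approximations enter.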


Sequential Monte Carlo 

Solving the nonlinear system identification
problem implicitly requires us to solve various
nonlinear state inference problems. We will, for
example, need to approximate the smoothing
density $p(x_{1:T} \mid y_{1:T})$ and the filtering
density $p(x_t \mid y_{1:t})$. The SMC samplers
offer approximate solutions to these and other
nonlinear state inference problems, where
the accuracy is only limited by the available
computational resources. This section only deals
with the state inference problem, allowing us to
drop the $\theta$ in the notation for brevity.

Most SMC samplers hinge upon importance
sampling, motivating section “Importance Sampling”.
In section “Particle Filter”, we make use
of importance sampling in computing an approximation
of the filtering density $p(x_t \mid y_{1:t})$, and in
section “Particle Smoother”, a particle smoothing
strategy is introduced to approximately compute
$p(x_{1:T} \mid y_{1:T})$.


Importance Sampling

Let z be a random variable distributed according
to some complicated density $\pi(z)$ and let $\varphi(\cdot)$ be
some function of interest. Importance sampling
offers a systematic way of evaluating integrals of
the form

$$\mathrm{E}[\varphi(z)] = \int \varphi(z)\, \pi(z)\, \mathrm{d}z, \tag{5}$$

without requiring samples directly generated
from $\pi(z)$. The density $\pi(z)$ is referred to as
the target density, i.e., the density we are trying
to sample from. The importance sampler relies
on a proposal density $q(z)$, from which it is
simple to generate samples; let $z^i \sim q(z)$,
$i = 1, \dots, N$. Since each sample $z^i$ is drawn
from the proposal density rather than from
the target density $\pi(z)$, we must somehow
account for this discrepancy. The so-called
importance weights $\tilde{w}^i = \pi(z^i)/q(z^i)$ encode
the difference. By normalizing the weights,
$w^i = \tilde{w}^i / \sum_{j=1}^{N} \tilde{w}^j$, we obtain a set of
weighted samples $\{z^i, w^i\}_{i=1}^{N}$ that can be used to
approximately evaluate the integral (5), resulting
in $\mathrm{E}[\varphi(z)] \approx \sum_{i=1}^{N} w^i \varphi(z^i)$. Schon and Lindsten
(2014) provide an introduction to importance
sampling within a dynamical systems setting,
whereas Robert and Casella (2004) provide a
general treatment.

Particle Filter

The solution to the nonlinear filtering problem
is provided by the following two recursive
equations:

$$p(x_t \mid y_{1:t}) = \frac{h(y_t \mid x_t)\, p(x_t \mid y_{1:t-1})}{p(y_t \mid y_{1:t-1})}, \tag{6a}$$

$$p(x_t \mid y_{1:t-1}) = \int f(x_t \mid x_{t-1})\, p(x_{t-1} \mid y_{1:t-1})\, \mathrm{d}x_{t-1}. \tag{6b}$$

In the general case (1) there are no analytical
solutions available for the above equations. The
particle filter maintains an empirical approximation
of the solution, which at time $t-1$ amounts to

$$\hat{p}^N(x_{t-1} \mid y_{1:t-1}) = \sum_{i=1}^{N} w_{t-1}^i\, \delta_{x_{t-1}^i}(x_{t-1}), \tag{7}$$

where $\delta_{x_{t-1}^i}(x_{t-1})$ denotes the Dirac delta mass
located at $x_{t-1}^i$. Furthermore, $w_{t-1}^i$ and $x_{t-1}^i$ are
referred to as the weights and the particles,
respectively. We will now derive the particle filter
by designing an importance sampler allowing
us to approximately solve (6). The derivation is
performed in an inductive fashion, starting by
assuming that $p(x_{t-1} \mid y_{1:t-1})$ is approximated
by (7). Inserting (7) into (6b) results in
$\hat{p}^N(x_t \mid y_{1:t-1}) = \sum_{i=1}^{N} w_{t-1}^i f(x_t \mid x_{t-1}^i)$, which is
used in (6a) to compute an approximation of the
filtering density $p(x_t \mid y_{1:t})$ up to proportionality.
Hence, this allows us to target $p(x_t \mid y_{1:t})$
using an importance sampler, where the form of
$\hat{p}^N(x_t \mid y_{1:t-1})$ suggests that new samples can be
proposed according to

$$\tilde{x}_t \sim q(x_t \mid y_{1:t}) = \sum_{i=1}^{N} w_{t-1}^i f(x_t \mid x_{t-1}^i). \tag{8}$$

It is worth noting that we can obtain a more
general algorithm by replacing $f(x_t \mid x_{t-1}^i)$ in
the above mixture with a density $q(x_t \mid x_{t-1}^i, y_t)$.
However, in the interest of a simple, but still
highly useful algorithm, we keep (8). The proposal
density (8) is a weighted mixture consisting
of N components, which means that we can
generate a sample $\tilde{x}_t$ from it via a two-step
procedure: first we select which component to
sample from, and secondly we generate a sample
from that component. More precisely, the first
part amounts to selecting one of the N particles
according to

$$\mathrm{P}\big(\tilde{x}_{t-1} = x_{t-1}^i \mid \{x_{t-1}^j, w_{t-1}^j\}_{j=1}^{N}\big) = w_{t-1}^i,$$

where the selected particle is denoted as $\tilde{x}_{t-1}$.
By repeating this N times, we obtain a set of
equally weighted particles $\{\tilde{x}_{t-1}^i\}_{i=1}^{N}$, constituting
an empirical approximation of $p(x_{t-1} \mid y_{1:t-1})$,
analogously to (7). We can then draw
$x_t^i \sim f(x_t \mid \tilde{x}_{t-1}^i)$ to generate a realization from the
proposal (8). This procedure that turns a weighted
set of samples into an unweighted one is commonly
referred to as resampling.

Algorithm 1 Bootstrap particle filter (for $i = 1, \dots, N$)

1. Initialization ($t = 1$):

(a) Sample $x_1^i \sim p(x_1)$.

(b) Compute the importance weights $\tilde{w}_1^i = h(y_1 \mid x_1^i)$
and normalize $w_1^i = \tilde{w}_1^i / \sum_{j=1}^{N} \tilde{w}_1^j$.

2. For $t = 2$ to $T$ do:

(a) Resample $\{x_{t-1}^i, w_{t-1}^i\}$ resulting in equally
weighted particles $\{\tilde{x}_{t-1}^i, 1/N\}$.

(b) Sample $x_t^i \sim f(x_t \mid \tilde{x}_{t-1}^i)$.

(c) Compute the importance weights $\tilde{w}_t^i = h(y_t \mid x_t^i)$
and normalize $w_t^i = \tilde{w}_t^i / \sum_{j=1}^{N} \tilde{w}_t^j$.

Finally, using the approximation $\hat{p}^N(x_t \mid y_{1:t-1})$
in (6a) and the proposal density according
to (8) allows us to compute the weights as
$\tilde{w}_t^i = h(y_t \mid x_t^i)$: the target is proportional to
$h(y_t \mid x_t)\, \hat{p}^N(x_t \mid y_{1:t-1})$ and the proposal equals
$\hat{p}^N(x_t \mid y_{1:t-1})$, so their ratio reduces to the
measurement density. Once all the N weights are
computed and normalized, we obtain a collection
of weighted particles $\{x_t^i, w_t^i\}_{i=1}^{N}$ targeting the
filtering density at time t. We have now (in a
slightly nonstandard fashion) derived the so-called
bootstrap particle filter, which was the first
particle filter, introduced by Gordon et al. (1993)
two decades ago. Since the introduction of
Algorithm 1, the surrounding theory and practice
have undergone significant developments; see,
e.g., Doucet and Johansen (2011) for an
up-to-date survey. The weights $\{w_{1:T}^i\}_{i=1}^{N}$ and
the particles $\{x_{1:T}^i\}_{i=1}^{N}$ are random variables,
and in executing the algorithm, we generate
one realization from these. This insight is useful
both for understanding the particle filter
and for analyzing it.
There is by now a fairly good
understanding of the convergence properties of
the particle filter; see, e.g., Doucet and Johansen
(2011) for basic results and further pointers into
the literature.
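The resample, propagate, and weight steps of the bootstrap particle filter can be sketched as follows. The interface (callables for the initial density, the dynamics, and the measurement density), the multinomial resampling via `random.choices`, and the scalar linear Gaussian test model are assumptions made for this sketch. The test model is chosen only because its behavior is easy to sanity-check; the filter itself does not rely on linearity.

```python
import math
import random


def bootstrap_particle_filter(ys, sample_x1, sample_f, h_density, N, rng):
    """Bootstrap particle filter: resample, propagate, weight.

    ys        : measurements y_1, ..., y_T
    sample_x1 : rng -> draw from the initial density p(x_1)
    sample_f  : (x_prev, rng) -> draw from f(x_t | x_{t-1})
    h_density : (y, x) -> h(y_t | x_t), up to a constant factor
    Returns the filtered means and the final normalized weights.
    """
    particles = [sample_x1(rng) for _ in range(N)]
    means, weights = [], []
    for t, y in enumerate(ys):
        if t > 0:
            # (a) resample: multinomial draw according to the old weights
            resampled = rng.choices(particles, weights=weights, k=N)
            # (b) propagate each resampled particle through the dynamics
            particles = [sample_f(x, rng) for x in resampled]
        # (c) weight by the measurement density and normalize
        unnorm = [h_density(y, x) for x in particles]
        total = sum(unnorm)
        weights = [w / total for w in unnorm]
        means.append(sum(w * x for w, x in zip(weights, particles)))
    return means, weights


# Illustrative model: x_{t+1} = a x_t + v_t, y_t = x_t + e_t.
rng = random.Random(1)
a, q, r, T = 0.9, 1.0, 1.0, 300
xs, ys = [], []
x = rng.gauss(0.0, 1.0)
for _ in range(T):
    xs.append(x)
    ys.append(x + rng.gauss(0.0, math.sqrt(r)))
    x = a * x + rng.gauss(0.0, math.sqrt(q))

means, weights = bootstrap_particle_filter(
    ys,
    sample_x1=lambda g: g.gauss(0.0, 1.0),
    sample_f=lambda xp, g: a * xp + g.gauss(0.0, math.sqrt(q)),
    h_density=lambda y, xv: math.exp(-0.5 * (y - xv) ** 2 / r),
    N=500,
    rng=rng,
)
rmse_pf = math.sqrt(sum((m - xt) ** 2 for m, xt in zip(means, xs)) / T)
rmse_raw = math.sqrt(sum((y - xt) ** 2 for y, xt in zip(ys, xs)) / T)
print("RMSE of filter mean:", rmse_pf, " RMSE of raw measurements:", rmse_raw)
```

Because the weights are normalized, the measurement density only needs to be known up to a constant factor. A practical implementation would typically work with log-weights to avoid numerical underflow, which this sketch omits for brevity.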

Particle Smoother 

A particle smoother is an SMC method targeting
the joint smoothing density $p(x_{1:T} \mid y_{1:T})$ (or
one of its marginals). There are several different
strategies for deriving particle smoothers. Rather
than mentioning them all, we introduce one powerful
and increasingly popular strategy based on
backward simulation, giving rise to the family
of forward filtering/backward simulation (FFBSi)
samplers.

In an FFBSi sampler the joint smoothing density
$p(x_{1:T} \mid y_{1:T})$ is targeted by complementing
a forward particle filter with a second recursion
evolving in the time-reversed direction. The
following factorization of the joint smoothing
density

$$p(x_{1:T} \mid y_{1:T}) = \left( \prod_{t=1}^{T-1} p(x_t \mid x_{t+1}, y_{1:t}) \right) p(x_T \mid y_{1:T}),$$

immediately suggests a highly useful time-reversed
recursion. Start by generating a sample
$\tilde{x}_T \sim p(x_T \mid y_{1:T})$. We then continue generating
samples backward in time by sampling from the
so-called backward kernel $p(x_t \mid x_{t+1}, y_{1:t})$
according to $\tilde{x}_t \sim p(x_t \mid \tilde{x}_{t+1}, y_{1:t})$, for
$t = T-1, \dots, 1$. The resulting sample
$\tilde{x}_{1:T} = (\tilde{x}_1, \dots, \tilde{x}_T)$ is then by construction a
sample from the joint smoothing density. Hence,
in performing M backward simulations, we
obtain the following approximation of the joint
smoothing density:

$$\hat{p}^M(x_{1:T} \mid y_{1:T}) = \sum_{i=1}^{M} \frac{1}{M}\, \delta_{\tilde{x}_{1:T}^i}(x_{1:T}). \tag{9}$$

For details on how to design algorithms
implementing the backward simulation strategy,
derivations, properties, and references, we refer
to the recent survey on backward simulation
methods by Lindsten and Schon (2013).


Bayesian Solutions 

Strategies 

The posterior density (2) is analytically
intractable, but we can make use of Markov chain
Monte Carlo (MCMC) samplers to address the
inference problem. An MCMC sampler allows
us to approximately generate samples from
an arbitrary target density $\pi(z)$. This is done
by simulating a Markov chain (i.e., a Markov
process) $\{z[r]\}_{r \ge 1}$, which is constructed in such
a way that the stationary distribution of the chain
is given by $\pi(z)$. The sample paths $\{z[r]\}_{r=1}^{R}$ of
the chain can then be used to draw inference
about the target distribution. Two constructive
ways of finding a suitable Markov chain to
simulate are provided by the Metropolis-Hastings
(MH) and the Gibbs samplers, where the latter
can be interpreted as a special case of the
former. See, e.g., Robert and Casella (2004)
for details on MCMC. A Gibbs sampler targeting
$p(\theta, x_{1:T} \mid y_{1:T})$ is given by

(i) Draw $\theta' \sim p(\theta \mid x_{1:T}, y_{1:T})$.

(ii) Draw $x'_{1:T} \sim p(x_{1:T} \mid \theta', y_{1:T})$.

The second step is hard, since it requires us
to generate a sample from the joint smoothing
density. Simply replacing step (ii) with a backward
simulator does not result in a valid method
(Andrieu et al. 2010).

One interesting solution is provided by the
family of particle MCMC (PMCMC) samplers,
first introduced by Andrieu et al. (2010). PMCMC
provides a systematic way of combining
SMC and MCMC, where SMC is used to construct
the proposal density for the MCMC sampler.
The so-called particle Gibbs (PG) sampler
resolves the problems briefly mentioned above by
a nontrivial modification of the SMC algorithm.
Introducing the PG sampler lies outside the scope
of this work; we refer the reader to the groundbreaking
work by Andrieu et al. (2010). During
the past 3 years, the PG samplers have developed
considerably, and improved versions are surveyed
and explained by Lindsten and Schon (2013).

A Nontrivial Example 

To place PMCMC in the context of nonlinear system
identification, we will now solve a nontrivial
identification problem. The PG sampler is used
to compute the posterior density for a general
Wiener model (linear Gaussian system followed
by a static nonlinearity) (Giri and Bai 2010):

$$x_{t+1} = \begin{pmatrix} A & B \end{pmatrix} \begin{pmatrix} x_t \\ u_t \end{pmatrix} + v_t, \qquad v_t \sim \mathcal{N}(0, Q), \tag{10a}$$

$$z_t = C x_t, \tag{10b}$$

$$y_t = g(z_t) + e_t, \qquad e_t \sim \mathcal{N}(0, r). \tag{10c}$$

Based on observed inputs $u_{1:T}$ and outputs $y_{1:T}$,
we wish to identify the model (10). We place
a matrix normal inverse Wishart (MNIW) prior
on $\{(A, B), Q\}$, an inverse gamma prior on $r$,
and a Gaussian process (Rasmussen and Williams
2006) prior on the function $g$, resulting in a
semiparametric model. We can without loss of
generality fix the matrix C according to
$C = (1, 0, \dots, 0)$. For a complete model specification,
we refer to Lindsten et al. (2013).

The posterior distribution $p(\theta, x_{1:T} \mid y_{1:T})$
is computed using a newly developed PG
sampler referred to as particle Gibbs with
ancestor sampling (PGAS); see Lindsten and
Schon (2013). In the present experiment we
make use of T = 1,000 observations. The
dimension of the state-space is 6, the linear
dynamics contains complex poles resulting in
oscillations as seen in Fig. 1, and the nonlinearity
is non-monotonic; see Fig. 2. A subspace
method is used to find an initial guess for the
linear system, and the static nonlinearity is
initialized using a linear function (i.e., a straight
line).
Nonlinear System Identification Using Particle Filters, Fig. 1
Bode diagram of the sixth-order linear system (horizontal axis:
frequency in rad/s). The black curve is the true system. The red
curve is the estimated posterior mean of the Bode diagram,
and the shaded area is the 99 % Bayesian credibility interval

Nonlinear System Identification Using Particle Filters, Fig. 2
The black curve is the true static nonlinearity
(non-monotonic). The red curve is the estimated posterior
mean of the static nonlinearity, and the shaded area is the
99 % Bayesian credibility interval

It is worth pausing for a moment to reflect
upon the posterior distribution $p(\theta, x_{1:T} \mid y_{1:T})$
that we are computing. The unknown “parameters”
$\theta$ live in the space $\Theta = \mathbb{R}^{64} \times \mathcal{F}$, where $\mathcal{F}$ is
an appropriate function space. The states $x_{1:T}$ live
in the space $\mathbb{R}^{6 \times 1{,}000}$. Hence, $p(\theta, x_{1:T} \mid y_{1:T})$
is actually a rather complicated object for this
example.


Using the PGAS sampler (with N =
15 particles), we construct a Markov chain
$\{\theta[r], x_{1:T}[r]\}_{r=1}^{R}$ with $p(\theta, x_{1:T} \mid y_{1:T})$ as
its stationary distribution. We run this Markov
chain for R = 25,000 iterations, where
the first 10,000 are discarded. The result is
visualized in Figs. 1 and 2, where we plot
the Bode diagram for the linear system and
the static nonlinearity, respectively. In both
figures we also provide the 99 % Bayesian
credibility interval. MATLAB code for Bayesian
identification of Wiener models is available
from user.it.uu.se/~thosc112/research/software.html.

The resonance peaks are accurately modeled,
but the result is less accurate at low frequencies
(likely due to a lack of excitation). The
fact that the posterior mean is inaccurate at low
frequencies is encoded in our estimate of the
posterior distribution as shown by the credibility
intervals.

In Figs. 1 and 2, we have visualized not only
the posterior mean but also the uncertainty for the
entire model. We could do this since the model
is a linear dynamical system followed by a static
nonlinearity. It would be most interesting to
find ways of visualizing
the uncertainty inherent in general nonlinear
dynamical systems.

Maximum Likelihood Solutions 

Identifying the parameters $\theta$ in a general nonlinear
SSM using maximum likelihood amounts
to solving the optimization problem (3). This
is a challenging problem for several reasons;
for example, it requires the computation of the
predictor density $p_\theta(y_t \mid y_{1:t-1})$. Furthermore, its
gradient (possibly also its Hessian) is very useful
in setting up an efficient optimization algorithm.
There are no closed-form solutions available for
these objects, forcing us to rely on approximations.
The SMC methods briefly introduced in
section “Sequential Monte Carlo” provide rather
natural tools for this task, since they are capable
of producing approximations where the accuracy
is only limited by the available computational
resources.

To establish a clear interface between the
maximum likelihood problem (3) and the SMC
methods, it has proven natural to make use of
the expectation maximization (EM) algorithm
(Dempster et al. 1977). The EM algorithm
proceeds in an iterative fashion to compute ML
estimates of unknown parameters $\theta$ in probabilistic
models involving latent variables. The strategy
underlying the EM algorithm is to exploit the
structure inherent in the probabilistic model to
separate the original problem into two closely
linked problems. The first problem amounts to
computing the so-called intermediate quantity

$$Q(\theta, \theta') = \int \log p_\theta(x_{1:T}, y_{1:T})\, p_{\theta'}(x_{1:T} \mid y_{1:T})\, \mathrm{d}x_{1:T} = \mathrm{E}_{\theta'}\!\left[ \log p_\theta(x_{1:T}, y_{1:T}) \mid y_{1:T} \right], \tag{11}$$

where we have already made use of the fact that the latent variables in
an SSM are given by the states. Furthermore, $\theta'$ denotes a
particular value of the parameters $\theta$. We can show that by
choosing a new $\theta$ such that
$Q(\theta, \theta') \geq Q(\theta', \theta')$, the likelihood is either
increased or left unchanged, i.e., $\ell_T(\theta) \geq \ell_T(\theta')$.
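
The monotonicity property can be checked in a few lines. The following is a sketch of the standard argument, writing $\ell_T(\theta) = \log p_\theta(y_{1:T})$ for the log-likelihood and introducing an auxiliary quantity $V(\theta, \theta')$ that does not appear in the text above; both notations are assumptions made for this illustration.

```latex
% Decompose the log-likelihood using p(x,y) = p(x|y) p(y) and take the
% expectation with respect to p_{\theta'}(x_{1:T} | y_{1:T}):
\ell_T(\theta) = Q(\theta, \theta') - V(\theta, \theta'),
\qquad
V(\theta, \theta') = \int \log p_\theta(x_{1:T} \mid y_{1:T}) \,
                     p_{\theta'}(x_{1:T} \mid y_{1:T}) \, dx_{1:T}.

% Subtract the same identity evaluated at \theta = \theta':
\ell_T(\theta) - \ell_T(\theta')
  = \bigl[ Q(\theta, \theta') - Q(\theta', \theta') \bigr]
  + \bigl[ V(\theta', \theta') - V(\theta, \theta') \bigr].

% The second bracket is a Kullback-Leibler divergence, hence nonnegative:
V(\theta', \theta') - V(\theta, \theta')
  = \int \log \frac{p_{\theta'}(x_{1:T} \mid y_{1:T})}
                   {p_{\theta}(x_{1:T} \mid y_{1:T})} \,
    p_{\theta'}(x_{1:T} \mid y_{1:T}) \, dx_{1:T} \;\geq\; 0.
```

Hence any $\theta$ that does not decrease $Q(\cdot, \theta')$ cannot decrease the log-likelihood either.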

The EM algorithm now suggests itself in that we can generate a sequence
of iterates $\{\theta_k\}_{k \geq 1}$ that guarantees that the
log-likelihood is not decreased for increasing $k$ by alternating the
following two steps: (1) (expectation) compute the intermediate quantity
$Q(\theta, \theta_k)$ and (2) (maximization) compute the subsequent
iterate $\theta_{k+1}$ by maximizing $Q(\theta, \theta_k)$ w.r.t.
$\theta$. This procedure is then repeated until convergence, which is
guaranteed to be to a stationary point on the likelihood surface.

The FFBSi particle smoother offers an approximation of the joint
smoothing density $p_{\theta'}(x_{1:T} \mid y_{1:T})$ according to (9),
which inserted into (11) provides an approximate solution
$Q^M(\theta, \theta')$ to the expectation step. In solving the
maximization step, we typically want gradients of the intermediate
quantity, $\nabla_\theta Q^M(\theta, \theta')$. These can also be
approximated using (9). The above development is summarized in
Algorithm 2, providing a solution where the basic building blocks are
themselves complex algorithms: an SMC algorithm for the E step and a
nonlinear optimization algorithm for the M step. This means that we have
the option of replacing the FFBSi particle smoother in step 2a with any
other algorithm capable of producing estimates of the joint smoothing
density. The family of PMCMC methods introduced in section “Bayesian
Solutions” contains several highly interesting alternatives. A detailed
account of Algorithm 2 is provided by Schön et al. (2011); see also
Cappé et al. (2005).


Algorithm 2 EM for nonlinear system identification

1. Initialization: Set $k = 0$ and initialize $\theta_k$.

2. Expectation (E) step:

(a) Compute an approximation $p^M_{\theta_k}(x_{1:T} \mid y_{1:T})$, for
example, using an FFBSi sampler.

(b) Calculate $Q^M(\theta, \theta_k)$.

3. Maximization (M) step: Compute

$$\theta_{k+1} = \arg\max_{\theta} Q^M(\theta, \theta_k).$$

4. Check termination condition. If satisfied, terminate; otherwise,
update $k \leftarrow k + 1$ and return to step 2.

Finally, we mention Fisher’s identity, which opens up yet another avenue
for designing ML estimators using SMC approximations. Even if we are not
interested in using EM when solving the nonlinear system identification
problem, the intermediate quantity (11) is useful. The reason is
provided by Fisher’s identity,

$$
\nabla_\theta \ell_T(\theta) \big|_{\theta = \theta'}
= \nabla_\theta Q(\theta, \theta') \big|_{\theta = \theta'}
= \int \nabla_\theta \log p_\theta(x_{1:T}, y_{1:T}) \big|_{\theta = \theta'}
  \, p_{\theta'}(x_{1:T} \mid y_{1:T}) \, dx_{1:T},
\tag{12}
$$

which provides a means to compute approximations of the log-likelihood
gradient. Hessian approximations are also available, but these are more
involved. Hence, Fisher’s identity enables the direct use of any
off-the-shelf gradient-based optimization method in solving (4).

Online Solutions 

Online (also referred to as recursive or adaptive) identification refers
to the problem where the parameter estimate is updated based on the
parameter estimate at the previous time step and the new measurement.
This is used when we are dealing with big data sets and in real-time
situations. SMC offers interesting opportunities when it comes to
deriving online solutions for nonlinear state-space models. The most
direct idea is simply to make use of a gradient method

$$\theta_t = \theta_{t-1} + \gamma_t \nabla_\theta \log p_\theta(y_t \mid y_{1:t-1}),$$

where $\{\gamma_t\}$ is the sequence of step sizes. Fisher’s identity
(12) opens up for the use of SMC in approximating
$\nabla_\theta \log p_\theta(y_t \mid y_{1:t-1})$. However, this leads to
a rapidly increasing variance, something that can be dealt with by the
so-called “marginal” Fisher identity; see Poyiadjis et al. (2011) for
details.
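
The structure of the gradient recursion can be illustrated on a toy problem where no SMC approximation is needed. The sketch below assumes i.i.d. observations $y_t \sim N(\theta, 1)$, so that the predictive density does not depend on past data and its gradient is simply $y_t - \theta$; the function name, the model, and the step-size choice $\gamma_t = 1/t$ are assumptions made for illustration only. For a nonlinear state-space model the `score` line would be replaced by a particle approximation.

```python
import random

def online_gradient_estimate(ys, step=lambda t: 1.0 / t, theta0=0.0):
    """Recursion theta_t = theta_{t-1} + gamma_t * grad log p_theta(y_t | y_{1:t-1}).

    Toy assumption: y_t ~ N(theta, 1) i.i.d., so the gradient of the log
    predictive density with respect to theta is (y_t - theta).
    """
    theta = theta0
    for t, y in enumerate(ys, start=1):
        score = y - theta                 # exact gradient for this toy model
        theta = theta + step(t) * score   # gradient step with step size gamma_t
    return theta

random.seed(0)
data = [random.gauss(2.0, 1.0) for _ in range(5000)]
theta_hat = online_gradient_estimate(data)
```

With $\gamma_t = 1/t$ the recursion reduces to the running sample mean, which is the ML estimate for this toy model.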

An interesting alternative is provided by an online EM algorithm; see,
e.g., Cappé (2011) for a solid introduction. The online EM approaches
rely on the additive properties of the Q-function. The area of online
solutions via SMC is likely to grow in the future, as there is a clear
need motivated by the constantly growing data sets and there are also
clear theoretical opportunities.

Summary and Future Directions 

We have discussed how SMC samplers can be used to solve nonlinear system
identification problems, by sketching both Bayesian and ML solutions. A
common feature of the resulting algorithms is that they are (nontrivial)
combinations of more basic algorithms. We have, for example, seen the
combined use of a particle smoother and a nonlinear optimization solver
in Algorithm 2 to compute ML estimates. As another example, we have the
class of PMCMC methods, where the basic building blocks are provided by
SMC samplers and MCMC samplers. The use of SMC and MCMC methods for
nonlinear system identification has only just started to take off, and
it presents very interesting future prospects. Some directions for
future research are as follows: (1) The family of PMCMC algorithms is
rich and fast growing, with great potential for further developments.
For example, its use in solving the state smoothing problem (i.e.,
computing $p(x_{1:T} \mid y_{1:T})$) is likely to provide better
algorithms in the near future. (2) Related to this is the potential to
design new particle smoothers capable of generating new particles also
in the time-reversed direction. (3) There are open and highly relevant
challenges when it comes to designing backward simulators for Bayesian
nonparametric methods (Hjort et al. 2010). A key question here is how to
represent the backward kernel $p(x_t \mid x_{t+1}, y_{1:t})$ in such
nonparametric settings. (4) The use of Bayesian nonparametric models
will open up interesting possibilities for hybrid system identification,
since they allow us to systematically express and work with
uncertainties over segmentations.

Cross-References 

► Nonlinear System Identification: An Overview 
of Common Approaches 

► System Identification: An Overview 

Recommended Reading

An overview of SMC methods for system identification is provided by
Kantas et al. (2009), and a thorough introduction to SMC is provided by
Doucet and Johansen (2011). The forthcoming monograph by Schön and
Lindsten (2014) provides a textbook introduction to particle
filters/smoothers (SMC), MCMC, PMCMC, and their use in solving problems
in nonlinear system identification and nonlinear state inference. A
self-contained introduction to particle smoothers and the backward
simulation idea is provided by Lindsten and Schön (2013). The work by
Cappé et al. (2005) also contains a lot of very relevant material in
this respect.

Abstract

Nonlinear mathematical models are essential 
tools in various engineering and scientific 
domains, where more and more data are recorded 
by electronic devices. How to build nonlinear 
mathematical models essentially based on 
experimental data is the topic of this entry. Due 
to the large extent of the topic, this entry provides 
only a rough overview of some well-known 
results, from gray-box to black-box system 
identification. 


Keywords 

Black-box models; Block-oriented models; Gray-box models; Nonlinear
system identification

Introduction 

The wide success of linear system identification in various applications
(Ljung 1999; ► System Identification: An Overview) does not necessarily
mean that the underlying dynamic systems are intrinsically linear. Quite
often, linear system identification can be successfully applied to a
nonlinear system if its working range is restricted to a neighborhood of
some working point. Nevertheless, some advanced engineering systems may
exhibit significant nonlinear behaviors under their normal working
conditions, as do most biological or social systems. There is therefore
an increasing demand for nonlinear dynamic system modeling theory.
Nonlinear system identification is studied to partly answer this demand,
when experimental data carry the essential information for modeling
purposes.

Nonlinear system identification, compared to its linear counterpart, is
a much vaster topic, as in principle a nonlinear model can be any
description of a system which is not linear. For this reason, this entry
provides only a rough overview of some well-known results.

An overview of the basic concepts of system identification can be found
in ► System Identification: An Overview, notably the five basic elements
to be taken into account in each application. Among these, this entry
focuses mainly on the (nonlinear) model structures, as they represent
the essential particularities of nonlinear system identification
problems.

The various model structures used in nonlinear system identification are
often classified by the level of available prior knowledge about the
considered system: from white-box models to black-box models, via
gray-box models. In principle, a white-box model is fully built from
prior knowledge. Such a fully white-box approach is rarely feasible for
complex systems because of insufficient prior knowledge or of
intractable system complexity. Therefore, the system identification
methods summarized in this entry concern gray-box and black-box models,
for which experimental data play an essential role.

For ease of presentation, the main content of 
this entry will be restricted to the single-input 
single-output (SISO) case. The multiple-input 
multiple-output (MIMO) case will be discussed 
in the section “Multiple-Input Multiple-Output 
Systems” below. 


Gray-Box Models 

This section covers gray-box models, from the 
most to the least demanding ones in terms of prior 
knowledge. 

Parametrized Physical Models 

The dynamic behaviors of some engineering systems are governed by
well-known physical laws, typically in the form of differential
equations, possibly with unknown parameters. These parametrized physical
equations can be used as gray-box models for system identification. In
most situations, such a model takes the form of a vector first-order
ordinary differential equation (ODE), known as the state equation, which
can generally be written as

$$\frac{dx(t)}{dt} = f(x(t), u(t); \theta) \tag{1}$$

where $t$ represents the time, $x(t)$ is the state vector, $u(t)$ the
input, and $f(\cdot)$ a (nonlinear) function parametrized by the vector
$\theta$.

The observation on the system (typically with electronic sensors),
referred to as the output and denoted by $y(t)$, is related to $x(t)$
and $u(t)$ through another known parametrized equation

$$y(t) = h(x(t), u(t); \theta) + v(t) \tag{2}$$

where $v(t)$ represents the measurement error.

With digital electronic instruments, the input $u(t)$ and the output
$y(t)$ are sampled at some discrete-time instants, say
$t = \tau, 2\tau, 3\tau, \ldots, N\tau$, with some constant sampling
period $\tau > 0$. For notational simplicity, let the sampling period
$\tau = 1$ and assume ideal instantaneous samplers; then the sampled
input-output data set is denoted by

$$Z^N = \{u(1), y(1), u(2), y(2), \ldots, u(N), y(N)\}. \tag{3}$$

In some applications, data samples are taken at irregular time instants.
Some studies focus specifically on system identification in this case
(Garnier and Wang 2008).

The main remaining task of gray-box system identification is to estimate
the parameter vector $\theta$ from the data set $Z^N$. The
identification criterion is typically defined with the aid of an output
predictor derived from the system model. A natural output predictor is
simply based on the numerical solution of the state equation (1): for
some given value of $\theta$, initial state $x(0)$, and some assumed
inter-sample behavior of the input $u(t)$ (e.g., with a zero-order
hold), the trajectory of $x(t)$, denoted by $x(t|\theta)$, is computed
with a numerical ODE solver; then the output prediction is computed as

$$\hat{y}(t|\theta) = h(x(t|\theta), u(t); \theta). \tag{4}$$

The parameter vector $\theta$ is typically estimated by minimizing the
sum of squared prediction errors
$\varepsilon(t|\theta) = y(t) - \hat{y}(t|\theta)$. See ► System
Identification: An Overview and Bohlin (2006) for more details.
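
The simulation-based predictor and the least-squares criterion can be sketched as follows. This is a minimal illustration under stated assumptions, not a general-purpose implementation: the scalar model $dx/dt = -\theta x + u$ with $h(x, u; \theta) = x$, the hand-written fourth-order Runge-Kutta solver, the square-wave input, and the grid search over $\theta$ are all hypothetical choices made to keep the example self-contained.

```python
import random

def f(x, u, theta):
    """Hypothetical physical law: dx/dt = -theta * x + u."""
    return -theta * x + u

def simulate(theta, u_seq, x0=0.0, dt=1.0, substeps=10):
    """Output predictor: RK4 solution of the state equation with a
    zero-order hold on u. Returns yhat(t|theta) = x(t|theta) at the
    sampling instants t = 1, ..., N."""
    x, h, preds = x0, dt / substeps, []
    for u in u_seq:                       # u is held constant over one period
        for _ in range(substeps):
            k1 = f(x, u, theta)
            k2 = f(x + 0.5 * h * k1, u, theta)
            k3 = f(x + 0.5 * h * k2, u, theta)
            k4 = f(x + h * k3, u, theta)
            x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        preds.append(x)
    return preds

def cost(theta, u_seq, y_seq):
    """Sum of squared prediction errors y(t) - yhat(t|theta)."""
    return sum((y - yh) ** 2 for y, yh in zip(y_seq, simulate(theta, u_seq)))

# Synthetic experiment: square-wave input, small measurement noise.
random.seed(1)
theta_true = 0.8
u_seq = [1.0 if (t // 5) % 2 == 0 else -1.0 for t in range(60)]
y_seq = [x + random.gauss(0.0, 0.01) for x in simulate(theta_true, u_seq)]

# Crude grid search over theta in [0.1, 3.0]; a real application would
# typically use a gradient-based or Gauss-Newton optimizer instead.
grid = [0.1 + 0.01 * i for i in range(291)]
theta_hat = min(grid, key=lambda th: cost(th, u_seq, y_seq))
```

The grid search is used here only because it has no tuning and cannot get stuck in a local minimum on this one-parameter problem.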

The predictor based on the numerical solution of the state equation (1)
(known as a simulator) may fail if this equation with the given value of
$\theta$ is unstable. Moreover, the state equation (1) may also be
subject to some modeling error that should be taken into account in the
output predictor. In such cases, the output predictor can be made with
the aid of some nonlinear state observer (Gauthier and Kupka 2001) or
some nonlinear filtering algorithm (Doucet and Johansen 2011).

Alternatively, sequential Monte Carlo (SMC) 
methods can also be applied to the identification 
of (small size) nonlinear state-space systems, 
typically assuming a discrete-time counterpart of 
the model described by Eqs. (1) and (2). See 
► Nonlinear System Identification Using Particle 
Filters. 

The gray-box approach is particularly useful in an engineering field
when some software library of commonly used components is available. In
this case, a system model can be built by connecting available component
models. Nevertheless, the “connection” of the component models may
introduce algebraic constraints through variables shared by connected
components, leading to differential algebraic equations (DAE), which are
a wider class of dynamic system models than the abovementioned
state-space models (► Modeling of Dynamic Systems from First
Principles). For most dynamic systems, it is possible to avoid the DAE
formulation by causality analysis, so that the connections between
different system components are treated as information flow, instead of
algebraic constraints. There also exist some theoretical studies on DAE
system identifiability (Ljung and Glad 1994) and some recent
developments on the identification of such systems (Gerdin et al. 2007).

Combined Physical and Black-Box Models 

It may happen that, in a complex system, some of the components are well
described by physical laws (possibly with available models from a
software library), but some other components are not well studied. In
this case, the latter components can be handled with black-box models
(or possibly empirical models). The entire model can be fitted to a
collected data set $Z^N$, as in the case of the previous subsection.

Block-Oriented Models 

Complex systems, notably those studied in engineering, are often made of
a certain number of components; thus a system model can be built by
connecting component models. In this sense, such component-based models
could be called “block-oriented.” In the system identification
literature, the term block-oriented model is often used in a particular
context (Giri and Bai 2010), where it is typically assumed that each
component is either a linear dynamic subsystem or a nonlinear static
one. Here, the term “static” means that the behavior of the component is
memoryless and can be described by an algebraic equation. The study of
system identification with such models is motivated by the fact that,
when a controlled system is stabilized around a working point, its
dynamic behavior can be well described by a linear model, but its
actuators and sensors may exhibit significant nonlinear behaviors like
saturation or dead zone. The choice of a particular block-oriented model
structure depends on the prior knowledge about the underlying system,
with specific identification methods available for different model
structures.

The most frequently studied block-oriented 
models for system identification concern the 
Hammerstein system and the Wiener system, 
each composed of two blocks, as illustrated, 
respectively, in Figs. 1 and 2. 


Hammerstein System Identification
A SISO Hammerstein system is typically formulated as

$$x(t) = f(u(t)) \tag{5a}$$

$$
y(t) + a_1 y(t-1) + \cdots + a_{n_a} y(t-n_a)
= b_1 x(t-1) + \cdots + b_{n_b} x(t-n_b) + v(t).
\tag{5b}
$$

If the nonlinearity $f(\cdot)$ is expressed in the form of

$$f(u) = \sum_{l=1}^{m} \gamma_l \kappa_l(u) \tag{6}$$

with some chosen basis functions $\kappa_l(\cdot)$, then the
identification problem amounts to fitting the model parameters
$\gamma_l$, $a_i$, $b_j$ to a collected data set $Z^N$. A well-known
method is based on over-parametrization (Bai 1998): replace in (5b) each
$x(t-j)$ with the right-hand side of (6) and treat each parameter
product $b_j \gamma_l$ as an individual parameter; then the newly
parametrized model is equivalent to a linear ARX model (► System
Identification: An Overview), which can be estimated by a
well-established linear system identification method. As the $n_b + m$
parameters $b_j$ and $\gamma_l$ are replaced by $n_b m$ parameters in
the new parametrization, the term “over-parametrization” refers to the
fact that typically $n_b + m < n_b m$. The estimated over-parametrized
model can be reduced to the original parametrization, usually through
the singular value decomposition (SVD) of the matrix filled with the
estimated parameter products $b_j \gamma_l$. See Giri and Bai (2010) for
other identification methods with variant formulations of the
Hammerstein system model.

When the linear subsystem is approximated by a finite impulse response
(FIR) model, it is possible to first estimate the linear model before
estimating a model for the nonlinear block (Greblicki and Pawlak 1989).

Wiener System Identification
A SISO Wiener system is typically formulated as

$$z(t) = \sum_{k=1}^{\infty} h_k u(t-k) \tag{7a}$$

$$y(t) = g(z(t)) + v(t) \tag{7b}$$

where the sequence $h_1, h_2, \ldots$ is the impulse response of the
linear subsystem, $g(\cdot)$ is some nonlinear function, and $v(t)$ is a
noise independent of the input $u(t)$.

Some methods for Wiener system identification assume a finite impulse
response (FIR) of the linear subsystem. In this case, the linear
subsystem model is characterized by the vector collecting the FIR
coefficients $h^T = [h_1, \ldots, h_n]$.

There are two typical kinds of efficient solutions, assuming either the
Gaussian distribution of the input $u(t)$ (Greblicki 1992) or the
monotonicity of the nonlinear function $g(\cdot)$ (Bai and Reyland Jr
2008). In both cases, it is possible to directly estimate the FIR
coefficients $h$ from the input-output data $Z^N$, without explicitly
estimating the unknown nonlinear function $g(\cdot)$. The estimated $h$
can be used to compute the internal variable $z(t)$. It then becomes
relatively easy to estimate the nonlinear function $g(\cdot)$ from the
computed $z(t)$ and the measured $y(t)$.
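
For the Gaussian-input case, one well-known way of recovering the FIR coefficients without knowing $g(\cdot)$ is cross-correlation: for a white Gaussian input, the correlation between $y(t)$ and $u(t-k)$ is proportional to $h_k$. The sketch below illustrates this idea only and may differ in detail from the cited methods; the coefficient values, the choice $g = \tanh$, and the noise level are assumptions made for the example. Only the direction of $h$ is identifiable this way, since a gain can be moved between the linear block and $g(\cdot)$.

```python
import math
import random

random.seed(0)
h_true = [1.0, 0.5, -0.3]        # hypothetical FIR coefficients h_1, ..., h_n
g = math.tanh                    # static nonlinearity, unknown to the estimator
N, n = 20000, len(h_true)

# White Gaussian input and simulated Wiener system output.
u = [random.gauss(0.0, 1.0) for _ in range(N)]
y = [0.0] * N
for t in range(n, N):
    z = sum(h_true[k] * u[t - 1 - k] for k in range(n))   # (7a), finite n
    y[t] = g(z) + random.gauss(0.0, 0.05)                 # (7b)

# Sample cross-correlation between y(t) and u(t-k), k = 1, ..., n.
r = [sum(y[t] * u[t - k] for t in range(n, N)) / (N - n)
     for k in range(1, n + 1)]

# Normalise so that the first coefficient equals one (assumes h_1 != 0).
h_dir = [rk / r[0] for rk in r]
```

With the direction of $h$ in hand, $z(t)$ can be reconstructed up to scale and $g(\cdot)$ fitted by any static regression method.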

Other Block-Oriented Model Structures
Among block-oriented models composed of more blocks, the best known are
the Hammerstein-Wiener system and the Wiener-Hammerstein system. Both
are composed of three blocks connected in series: the former has a
linear dynamic block placed between two nonlinear static blocks, and the
latter has a nonlinear static block in the middle of two linear dynamic
blocks. In general, the prediction error method (PEM) (Ljung 1999) is
applied to the identification of such systems, with heuristic methods
for the initialization of model parameters. Some recent results on
Hammerstein-Wiener system identification have been reported in Wills et
al. (2013). There also exist some other variants, with parallel blocks
or feedback loops. In most cases, each block is either linear dynamic or
nonlinear static, but there is a notable exception: hysteresis blocks.
Hysteresis is a phenomenon typically observed in some magnetic or
mechanical systems. Its mathematical description is both dynamic and
strongly nonlinear and cannot be decomposed into linear dynamic and
nonlinear static blocks. Due to the importance of hysteresis components
in some control systems, system identification involving such blocks is
currently an active research topic (Giri et al. 2008).

LPV Models 

Linear parameter-varying (LPV) models could be 
classified as black-box models, because typically 
they rely more on experimental data than on prior 
knowledge. However, engineers often have good 
insights into such models; they are thus presented 
in the gray-box section. 

From Gain Scheduling to LPV Models
Gain scheduling is a method originally developed for the control of
nonlinear systems. It consists in designing different controllers for
different working points of a nonlinear system and in switching among
the designed controllers according to the actual working point. It is
typically assumed that the working point is determined by some observed
variable (vector) referred to as the scheduling variable and denoted by
$p$. Around each considered working point, the nonlinear system is
linearized so that the corresponding controller can be designed from
linear control theory. A by-product of this controller design procedure
is a collection of linearized models indexed by the scheduling variable
$p$. This collection, seen as a whole model of the globally nonlinear
system, is known as an LPV model (Tóth 2010). This approach has been
particularly successful in the field of flight control.

An LPV model can be formulated either in input-output form or in
state-space form. In the input-output form, a SISO model can be written
as

$$
y(t) + a_1(p) y(t-1) + \cdots + a_{n_a}(p) y(t-n_a)
= b_1(p) u(t-1) + \cdots + b_{n_b}(p) u(t-n_b) + v(t)
\tag{8}
$$

and in the state-space form as

$$x(t+1) = A(p) x(t) + B(p) u(t) + w(t) \tag{9a}$$

$$y(t) = C(p) x(t) + D(p) u(t) + v(t) \tag{9b}$$

As a global model of the whole nonlinear system, the $p$-dependent
parameters (matrices) $a_i(p)$, $b_j(p)$, $A(p)$, etc., are functions
defined for all $p \in \Omega$, where $\Omega$ is the relevant working
range of the considered system (a compact subset of a real vector
space). If originally the LPV model was built through a collection of
linearized models around different working points, then the values of
these functions are first defined for the corresponding discrete values
of $p$. For other values of $p \in \Omega$, these functions can be
defined by interpolation. Alternatively, by choosing some parametric
forms of $a_i(p)$, $b_j(p)$, $A(p)$, etc., the whole LPV model can also
be estimated by fitting it to a data set $Z^N$, through nonlinear
optimization (Tóth 2010).

Local Linear Models 

In an LPV model, the model parameters can in principle depend on the
scheduling variable $p$ in any chosen manner. A particularly useful case
is when they are formulated as expansions over local basis functions.
For example, in (8), the parameter $a_i(p)$ may be expressed as

$$a_i(p) = \sum_{l=1}^{m} a_{il} \kappa_l(p) \tag{10}$$

where $\kappa_l(\cdot)$ are some chosen bell-shaped (local) basis
functions, typically the Gaussian function, centered at different
positions $p = c_l \in \Omega$, and $a_{il}$ are coefficients of the
expansion. Similarly

$$b_j(p) = \sum_{l=1}^{m} b_{jl} \kappa_l(p). \tag{11}$$

Assume that the basis functions are normalized such that

$$\sum_{l=1}^{m} \kappa_l(p) = 1 \tag{12}$$

for all $p \in \Omega$. Then the LPV model (8) can be viewed as an
interpolation of $m$ “local” models

$$
y(t) + a_{1l} y(t-1) + \cdots + a_{n_a l} y(t-n_a)
= b_{1l} u(t-1) + \cdots + b_{n_b l} u(t-n_b) + v(t)
\tag{13}
$$

indexed by $l = 1, 2, \ldots, m$. Each of these linear models is valid
for $p$ close to $c_l$, the center of the corresponding basis function
$\kappa_l(\cdot)$; hence, they are called local linear models.
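
The interpolation in (10) and the normalization (12) can be made concrete with a small sketch for a scalar scheduling variable. The centers, the common width of the Gaussian functions, and the local coefficient values below are hypothetical numbers chosen for illustration.

```python
import math

def normalized_weights(p, centers, width):
    """Gaussian basis functions kappa_l(p), normalised to sum to one (12)."""
    raw = [math.exp(-0.5 * ((p - c) / width) ** 2) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]

def lpv_parameter(p, centers, width, local_values):
    """Evaluate a_i(p) = sum_l a_il * kappa_l(p), as in (10)."""
    w = normalized_weights(p, centers, width)
    return sum(a * k for a, k in zip(local_values, w))

centers = [0.0, 1.0, 2.0]         # c_l: centres of the local models
a1_local = [-0.9, -0.5, -0.2]     # a_{1l}: coefficient a_1 of each local model

weights = normalized_weights(0.5, centers, width=0.3)
a1_mid = lpv_parameter(0.5, centers, 0.3, a1_local)     # between c_1 and c_2
a1_at_c2 = lpv_parameter(1.0, centers, 0.3, a1_local)   # at the centre c_2
```

At a center the interpolated coefficient is close to the local value, and halfway between two centers it is close to their average, which is what makes each local model interpretable on its own.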

If the local basis functions $\kappa_l(p)$ are viewed as membership
functions of fuzzy sets, then the local linear model is strongly related
to the Takagi-Sugeno fuzzy model (Takagi and Sugeno 1985). An advantage
of this point of view is the possibility of incorporating prior
knowledge in the form of linguistic rules and of interpreting some local
linear models resulting from system identification.

There are two approaches to building local linear models. The first one
is the local approach: for each chosen value of $c_l \in \Omega$, a
local model is estimated from data corresponding to the values of $p$
within a neighborhood of $c_l$. This approach has the advantages of
being computationally efficient, easily updatable, and well understood
by engineers. The second one is the global approach: all the model
parameters are estimated simultaneously by solving a single optimization
problem for the whole model. This approach can produce more accurate
models in terms of prediction error, but it is numerically much more
expensive and may lead to models that are difficult for engineers to
interpret.

The practical success of local linear models strongly depends on the
possibility of finding a scheduling variable $p$ of small dimension that
relevantly determines the working point of the considered system. If
there exists a nonlinear state-space model of the system, then in
principle the working point is determined jointly by the state and the
input of the system. As quite often physically meaningful state
variables are not fully observed, they cannot be used in the definition
of $p$. It is possible to define $p$ as delayed output and input
variables, e.g.,

$$p^T = [y(t-1), \ldots, y(t-n_a), u(t-1), \ldots, u(t-n_b)]$$

but it typically leads to a vector of quite large dimension. It is thus
important to use practical insights about a given system to find a
relevant vector $p$ of reduced dimension.

For a single-dimensional $p$, the choice of the local basis function
centers $c_l$ can be made following some practical insight or equally
spaced within $\Omega$. For a large-dimensional $p$, this task is more
difficult. The equally spaced approach would lead to too many local
models, as their number would exponentially increase with the dimension
of $p$. In this case, an empirical approach, called local linear model
tree (LOLIMOT) (Nelles 2001), can be applied. It iteratively partitions
$\Omega$ in order to place the local basis functions where the system is
more likely nonlinear or where the available data are more concentrated.

Black-Box Models 

Ideally speaking, a black-box model should be solely built from
experimental data, without any prior knowledge. In practice, some prior
knowledge is always necessary, though experimental data play a much more
important role. For instance, the choice of the input and output
variables, implying some causality relationship, is an important piece
of prior knowledge.

With the fast development of electronic devices, more and more sensor
signals are available in various fields, notably for engineering,
environmental, and biomedical systems. Meanwhile, the processing power
of modern computers increases every year. Black-box modeling has thus
more and more potential applications. Nevertheless, the importance of
prior knowledge in a modeling procedure should not be forgotten. In
general, prior knowledge leads to more reliable models in terms of
validity range, as the validity of physical equations is often well
understood. In contrast, for a black-box model essentially based on
experimental data, it may be hard to ensure its validity for
interpolation and even harder for extrapolation.

Input-Output Black-Box Models 

As the primary role of a mathematical model is to predict the output of
the system for given input values, it is natural to design black-box
models directly in the form of a predictor. As the output $y(t)$ of a
dynamic system depends on the past inputs, a predicted output
$\hat{y}(t)$ may be formulated in the form of

$$\hat{y}(t) = f(u(t-1), u(t-2), \ldots, u(t-n_b)) \tag{14}$$

where $f(\cdot)$ is some nonlinear function (to be estimated from
experimental data) and $n_b$ is a chosen integer. In principle, $n_b$
can be infinitely large (as $y(t)$ depends on all the past inputs in
general), but in practice, a model of finite complexity has to be
chosen. If the considered system is stable in the sense that
sufficiently old past inputs are (gradually) forgotten, then it is
reasonable to truncate the dependence on the past inputs.

The model structure (14) is similar to the linear finite impulse
response (FIR) model (Ljung 1999). It is known that, for linear system
identification, the use of ARX models, predicting $y(t)$ from both past
inputs and past outputs, is often more efficient than FIR models, in the
sense of requiring fewer model parameters. By analogy, the nonlinear ARX
model takes the form

$$\hat{y}(t) = f(y(t-1), \ldots, y(t-n_a), u(t-1), \ldots, u(t-n_b)). \tag{15}$$

This is likely the most frequently used black-box model structure for
nonlinear dynamic system identification (Sjöberg et al. 1995; Juditsky
et al. 1995).

Nonlinear Function Estimators 

For a nonlinear ARX model in the form of (15), the nonlinear function
$f(\cdot)$ has to be estimated from an available input-output data set
$Z^N$. Typically, an estimator of $f(\cdot)$ with some chosen parametric
structure is used. Let

$$\varphi^T(t) = [y(t-1), \ldots, y(t-n_a), u(t-1), \ldots, u(t-n_b)], \tag{16}$$

then system identification in this case amounts to solving a nonlinear
regression problem

$$y(t) = g(\varphi(t); \theta) + v(t) \tag{17}$$

where $g(\cdot)$ is a chosen nonlinear function parametrized by
$\theta$, capable of approximating a large class of nonlinear functions
by appropriately adjusting $\theta$, and $v(t)$ is the modeling error to
be minimized in some sense.
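
Forming the regression data from (16) is a mechanical step that is easy to get wrong by one sample. The sketch below builds the regressor rows and the matching targets from recorded sequences; the function name and the use of zero-based list positions as time indices are conventions chosen for this example.

```python
def narx_regressors(u, y, na, nb):
    """Build rows phi(t) = [y(t-1), ..., y(t-na), u(t-1), ..., u(t-nb)]
    together with the targets y(t), for every t with a full history."""
    start = max(na, nb)                  # first t with all delays available
    Phi, targets = [], []
    for t in range(start, len(y)):
        past_y = [y[t - i] for i in range(1, na + 1)]
        past_u = [u[t - j] for j in range(1, nb + 1)]
        Phi.append(past_y + past_u)
        targets.append(y[t])
    return Phi, targets

u = [1, 2, 3, 4, 5]
y = [10, 20, 30, 40, 50]
Phi, targets = narx_regressors(u, y, na=2, nb=1)
```

Any estimator of $g(\cdot)$ can then be fitted to the pairs `(Phi[k], targets[k])` as an ordinary static regression problem.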

The best-known nonlinear function estimators implementing $g(\cdot)$ in
practice are polynomials, splines, multiple-layer neural networks,
radial basis networks, wavelets, and fuzzy-neural estimators. Most of
these estimators can be written in the form

$$g(\varphi(t); \theta) = \sum_{l=1}^{m} \gamma_l \kappa(\alpha_l(\varphi(t) - \beta_l)) \tag{18}$$

or in some close variant of this form, where $\kappa(\cdot)$ is some
“mother” basis function dilated and translated by $\alpha_l$ and
$\beta_l$ before being weighted by $\gamma_l$ in the sum forming the
estimator (Sjöberg et al. 1995). For example, $\kappa(\cdot)$ is
typically chosen as a (Gaussian) bell-shaped function in radial basis
networks or a sigmoid (S-shaped) function in multiple-layer neural
networks.

Another approach to nonlinear function estimation is called
nonparametric estimation. Its main idea is to estimate $f(\varphi^*)$
for any given value of $\varphi^*$ by the (weighted) average of the
values of $y(t)$ in the available data set corresponding to values of
$\varphi(t)$ close to $\varphi^*$. This category includes kernel
estimators (Nadaraya 1964) and memory-based estimators (Specht 1991).
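
The weighted-average idea can be sketched with a Gaussian-kernel estimator of the Nadaraya-Watson type. The kernel shape, the bandwidth, and the test function $\sin(\varphi)$ sampled on a regular grid are assumptions made for this illustration.

```python
import math

def kernel_estimate(phi_star, phis, ys, bandwidth):
    """Estimate f(phi*) as a Gaussian-weighted average of the y(t) whose
    regressors phi(t) lie close to phi*."""
    weights = [
        math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(phi, phi_star))
                 / bandwidth ** 2)
        for phi in phis
    ]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

# Noise-free samples of a hypothetical f(phi) = sin(phi) on a regular grid.
phis = [(0.05 * i,) for i in range(127)]      # one-dimensional regressors
ys = [math.sin(phi[0]) for phi in phis]

f_hat = kernel_estimate((1.5,), phis, ys, bandwidth=0.1)
```

The bandwidth controls the trade-off between smoothing bias and noise sensitivity: a larger value averages over more data points but flattens the estimate near peaks such as the one close to $\varphi = 1.5$.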

The nonlinear function estimation problem as formulated in (17) can also
be addressed with the Gaussian process model. Assume that $g$ in (17) is
a Gaussian process whose covariance for any regressor pair $\varphi(t)$
and $\varphi(\tau)$ is a known function of the regressor pair; then the
posterior distribution of $g$ given observations of $y(t)$ can be
computed by applying Bayes’ theorem under certain assumptions (Rasmussen
and Williams 2006). This method is strongly related to least squares
support vector machines (Suykens et al. 2002) and to some extent is
similar to kernel estimators.

The difficulty of estimating a nonlinear function
$f(\varphi)$ strongly depends on the dimension of
$\varphi$. In the single-dimensional case, most existing
methods can produce satisfactory results. When
the dimension of $\varphi$, say $n$, increases, the number
of data points must increase exponentially with
$n$ in order to keep the data “density” unchanged.
This fact implies that, in the high-dimensional
case (say $n > 10$), for most practically available
data sets, the data points are sparse in the space
of $\varphi$. It is thus practically impossible to estimate
$f(\varphi)$ with good accuracy everywhere in the
space of $\varphi$. In order to remedy this problem, prior
knowledge can be used to form a more elaborate
vector $\varphi$ of reduced dimension, instead of the
simple form of past input and output variables.
The resulting model will be more of a gray-box
nature. If this approach is not possible, one has
to rely on the estimation algorithm to automatically
discover some low-dimensional structure of
the nonlinear relationship being estimated. The
success would depend on the suitability of the
chosen particular nonlinear function estimator for
the considered system.
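
The exponential growth of the required data can be made concrete with a small calculation. The sketch assumes a uniform grid with a fixed number of points per regressor dimension, which is only one way of reading "unchanged density"; `points_needed` is a hypothetical helper.

```python
def points_needed(points_per_axis, n):
    """Grid points needed to keep the same density along each of the
    n regressor dimensions (uniform grid assumed for illustration)."""
    return points_per_axis ** n

for n in (1, 2, 5, 10):
    print(n, points_needed(10, n))
```

With ten points per axis, ten regressor dimensions already call for ten billion data points, far more than most identification experiments can provide.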

State-Space Black-Box Models 

For a gray-box model in the form of (1) and
(2), it is assumed that the parametric forms of
the nonlinear functions $f(\cdot)$ and $h(\cdot)$ are known
from prior knowledge. If no such knowledge is
available, it is possible to estimate these nonlinear
functions with some function estimator, like those
introduced in the previous subsection. Such an
approach leads to state-space black-box models.
In practice, it is easier to use the discrete-time
counterpart of the state equation (1). Because
typically the state vector $x(t)$ is not directly
observed, the estimation of $f(\cdot)$ and $h(\cdot)$ cannot
be formulated as nonlinear regression problems,
in contrast to the case of input-output black-box
models. Another difficulty is related to the
nonuniqueness of the state-space representation
of a given system: any (linear or nonlinear) state
transformation would lead to a different state-space
representation of the same system. In some
existing methods, a linear state-space model is
first estimated; then nonlinear function estimators
are used to compensate for the residuals of $f(\cdot)$ and
$h(\cdot)$ after their linear approximations (Paduart
et al. 2010).

Multiple-Input Multiple-Output 
Systems 

For multiple-input multiple-output (MIMO)
systems, state-space models like (1)-(2) remain
in the same form, by considering vector values
of the notations $u(t)$ and $y(t)$ at each time
instant, up to some similar adaptation of the other
involved notations. For input-output models like
(15), the involved notations can also be vector
valued, but the fact that different inputs and/or
outputs can have different delays makes the
notations more complicated. For block-oriented
models, though a MIMO linear block is usually
described by a general linear model in state-space
form or in input-output form, there is no
consensus for the structural choice of MIMO
nonlinear blocks.

Some Practical Issues

The general practical aspects discussed in 
► System Identification: An Overview are 
of course also valid for nonlinear system 
identification, but some particularities in the 
nonlinear case should be highlighted. 

It is important to apply appropriate input signals
so that the collected data convey sufficient
information for system identification. The design
of input signals for this purpose is known
as experiment design. In the framework of linear
system identification, experiment design is
usually formulated through the optimization of
the covariance matrix of model parameter estimates
(► Experiment Design and Identification
for Control), which often leads to non-convex
optimization problems. Experiment design in the
nonlinear case has not been systematically studied.
If possible, the chosen input signal should
be similar to what will be actually applied to
the considered system and cover various working
conditions. Another simple rule is that the input
should excite a nonlinear system at different
amplitudes, whereas binary input signals are often
used for linear systems.
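
The contrast between a binary input and an input that visits many amplitudes can be sketched as follows. Both generators are hypothetical illustrations (piecewise-constant signals with a hold time of `hold` samples and a seeded random generator), not a recommended experiment design.

```python
import random

def binary_signal(n_samples, hold, amplitude=1.0, seed=0):
    """Random binary signal switching between -amplitude and +amplitude,
    each value being held for `hold` samples."""
    rng = random.Random(seed)
    out = []
    while len(out) < n_samples:
        out.extend([rng.choice((-amplitude, amplitude))] * hold)
    return out[:n_samples]

def multilevel_signal(n_samples, hold, max_amplitude=1.0, seed=0):
    """Piecewise-constant signal whose level is drawn uniformly in
    [-max_amplitude, max_amplitude], so that many amplitudes are visited."""
    rng = random.Random(seed)
    out = []
    while len(out) < n_samples:
        out.extend([rng.uniform(-max_amplitude, max_amplitude)] * hold)
    return out[:n_samples]

u_bin = binary_signal(200, 5)
u_ml = multilevel_signal(200, 5)
print(len(set(u_bin)), len(set(u_ml)))  # number of distinct amplitude levels
```

A static nonlinearity observed only at two input levels cannot be distinguished from a straight line through those two points, which is why the binary signal is inadequate in the nonlinear case.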

Model validation is a particularly delicate
task for nonlinear black-box models. As already
mentioned when such models were introduced,
the available data points are usually sparse
when a nonlinear function is estimated in a
high-dimensional space; it is thus practically
impossible to uniformly ensure the estimation
accuracy of the nonlinear function. It is important
to extensively perform cross-validation, by
testing the validity of the model on large data sets
that have not been used for model estimation.

Regularization is also an important issue for
nonlinear black-box models. Because of lack of
prior knowledge, each nonlinear black-box model
has a flexible structure in order to cover a large
class of nonlinear systems, typically with many
model parameters, implying large variances of
parameter estimates (► System Identification: An
Overview). Appropriately applying a regularized
criterion for model parameter estimation can
reduce the variances. For gray-box models, prior
knowledge can be used for regularization through
a Bayesian approach, but this approach is not
applicable to black-box models.
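
One common form of regularized criterion, given here as an illustrative assumption rather than as the specific choice intended by the entry, adds a quadratic penalty on the parameter vector to the sum of squared prediction errors:

```latex
\hat{\theta}_{\lambda} = \arg\min_{\theta} \; \sum_{t=1}^{N} \bigl( y(t) - g(\varphi(t), \theta) \bigr)^2 + \lambda \, \|\theta\|^2 , \qquad \lambda \ge 0 .
```

Increasing $\lambda$ shrinks the estimate toward zero, which lowers the variance of the parameter estimates at the price of some bias; $\lambda = 0$ recovers the unregularized least squares criterion.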


Summary and Future Directions 

Compared to linear system identification, the
nonlinear case is a much vaster topic, of which
this entry provides only a rough overview. The
main points to retain are that both
prior knowledge and experimental data are
required for system identification and that the more
prior knowledge is incorporated in a model, the
better the extent of its validity is understood. The
lack of prior knowledge should be compensated by
the processing of large amounts of data. The amount
of data that can be processed within an acceptable
time depends on the power of computers, which
progresses every year. Meanwhile, the research and
development of efficient algorithms for large-scale
data processing with multiple or massively parallel
processors is an exciting topic in system identification.


Cross-References 

► Experiment Design and Identification for Control

► Modeling of Dynamic Systems from First Principles

► Nonlinear System Identification Using Particle
Filters

► System Identification: An Overview


Recommended Reading 

Nonlinear system identification is covered by a 
vast literature. After the readings about general 
topics on system identification (see ► System 
Identification: An Overview and references 
therein), the reader may further read (Nelles 
2001) for black-box system identification, 
(Bohlin 2006) for gray-box system identification, 
(Giri and Bai 2010) for block-oriented system 
identification, and (Toth 2010) for LPV system 
identification. 

Abstract

The notion of zero dynamics plays a role in 
nonlinear systems that is analogous to the role 
played, in a linear system, by the notion of zeros 
of the transfer function. In this article, we review 
the basic concepts underlying the definition of 
zero dynamics and discuss its relevance in the 
context of nonlinear feedback design. 


Keywords 

Introduction

The concept of zero dynamics of a nonlinear
system was introduced in the early 1980s as the
nonlinear analogue of the concept of transmission
zero of a linear system. This concept played
a fundamental role in the development of systematic
methods for asymptotic stabilization of
relevant classes of nonlinear systems. As a matter
of fact, a nonlinear system in which the zero dynamics
possess a globally asymptotically stable
equilibrium can be robustly stabilized, globally
or at least with guaranteed region of attraction,
by means of output feedback. This is a nonlinear
analogue of a well-known property of linear systems,
namely, the property that an $n$-dimensional
linear system having $n - 1$ zeros with negative
real part can be stabilized by means of
proportional output feedback, if the feedback
gain is sufficiently large. The concept of zero
dynamics also plays a relevant role in a variety of
other problems of feedback design, such as
input-output linearization with internal stability,
noninteracting control with internal stability, output
regulation, and feedback equivalence to passive
systems.


The Zero Dynamics 

One of the cornerstones of the geometric theory
of control systems (for linear as well as
for nonlinear systems) is the analysis of how
the observability property can be influenced by
feedback. This study, originally conceived in the
context of the problem of disturbance decoupling,
had far-reaching consequences in a number of
other domains. One of these consequences is the
possibility of characterizing in “geometric terms”
the notion of zero of the transfer function of
a system. In a (single-input single-output and
minimal) linear system, a complex number $z$ is
a zero of the transfer function if and only if the
input $u(t) = \exp(zt)$ yields, for a suitable
choice of the initial state, a forced response
in which the output is identically zero. This
“open-loop” and “time-domain” characterization
has a “closed-loop” and “geometric” counterpart:
all such $z$'s coincide with the eigenvalues of the
unobservable part of the system, once the latter
has been rendered maximally unobservable by
means of feedback. One of the earlier successes
of the geometric approach to the analysis and
design of nonlinear systems was the possibility
of extending these equivalent characterizations to
the domain of nonlinear systems.
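
The open-loop characterization of a zero can be checked numerically on a small example. The second-order system below is hypothetical and chosen only for illustration: its transfer function is $(s + 0.5)/(s^2 + 3s + 2)$, so it has a zero at $z = -0.5$. Driving it with $u(t) = \exp(zt)$ from the initial state $x_0 = (zI - A)^{-1}B$ should keep the output at zero, while a generic initial state should not.

```python
import math

# Illustrative second-order system (hypothetical numbers):
#   x1' = x2,  x2' = -2*x1 - 3*x2 + u,  y = 0.5*x1 + x2,
# with transfer function (s + 0.5)/(s^2 + 3s + 2), i.e. a zero at z = -0.5.
Z = -0.5

def deriv(t, x):
    u = math.exp(Z * t)
    return (x[1], -2.0 * x[0] - 3.0 * x[1] + u)

def output(x):
    return 0.5 * x[0] + x[1]

def simulate(x0, t_end=5.0, dt=0.001):
    """Classical RK4 integration; returns max |y(t)| along the trajectory."""
    t, x = 0.0, tuple(x0)
    worst = abs(output(x))
    for _ in range(int(round(t_end / dt))):
        k1 = deriv(t, x)
        k2 = deriv(t + dt / 2, tuple(xi + dt / 2 * ki for xi, ki in zip(x, k1)))
        k3 = deriv(t + dt / 2, tuple(xi + dt / 2 * ki for xi, ki in zip(x, k2)))
        k4 = deriv(t + dt, tuple(xi + dt * ki for xi, ki in zip(x, k3)))
        x = tuple(xi + dt / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
                  for xi, a1, a2, a3, a4 in zip(x, k1, k2, k3, k4))
        t += dt
        worst = max(worst, abs(output(x)))
    return worst

# Suitable initial state x0 = (z*I - A)^{-1} B, here (1, z) / (z^2 + 3z + 2).
det = Z * Z + 3.0 * Z + 2.0
x0_zeroing = (1.0 / det, Z / det)
print(simulate(x0_zeroing))   # output stays (numerically) at zero
print(simulate((0.0, 0.0)))   # a generic initial state gives a nonzero output
```

The state itself is not zero along the first trajectory; it evolves as $x_0 \exp(zt)$ while remaining invisible at the output.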


To see how this is possible, consider for simplicity
the case of a system modeled by equations
of the form

$$\begin{aligned}
\dot{x} &= f(x) + g(x) u \\
y &= h(x)
\end{aligned}$$

with state $x \in \mathbb{R}^n$, input $u \in \mathbb{R}$, output
$y \in \mathbb{R}$ and in which $f(x)$, $g(x)$, $h(x)$ are smooth
functions. Systems of this form are called
input-affine systems. The analysis of such systems is
rendered particularly simple if appropriate notations
are used. Given any real-valued smooth
function $\lambda(x)$ and any $n$-vector valued smooth
function $X(x)$, let $L_X \lambda(x)$ denote the (directional)
derivative of $\lambda(x)$ along $X(x)$, that is the
real-valued smooth function

$$L_X \lambda(x) = \sum_{i=1}^{n} \frac{\partial \lambda}{\partial x_i} X_i(x),$$

and, recursively, set $L_X^d \lambda(x) = L_X L_X^{d-1} \lambda(x)$ for
any $d > 1$.

Suppose there exists an integer $r \ge 1$ with the
following properties

$$\begin{aligned}
L_g h(x) = L_g L_f h(x) = \cdots = L_g L_f^{r-2} h(x) &= 0 \quad \forall x \in \mathbb{R}^n \\
L_g L_f^{r-1} h(x) &\neq 0 \quad \forall x \in \mathbb{R}^n .
\end{aligned}$$

If this is the case, it is possible to show that the
set

$$Z^* = \{x \in \mathbb{R}^n : h(x) = L_f h(x) = \cdots = L_f^{r-1} h(x) = 0\}$$

is a smooth sub-manifold of $\mathbb{R}^n$, of codimension
$r$. It is also easy to show that the state-feedback
law

$$u^*(x) = -\frac{L_f^r h(x)}{L_g L_f^{r-1} h(x)}$$

renders the vector

$$f^*(x) = f(x) + g(x) u^*(x)$$

tangent to $Z^*$, at each point $x$ of $Z^*$. In other
words, $Z^*$ is an invariant manifold of the
feedback-modified system

$$\dot{x} = f^*(x).$$

It is seen from this construction that the output
$y(t) = h(x(t))$ of the system is identically zero
if and only if $x(0) \in Z^*$ and $u(t) = u^*(x(t))$,
where $x(t)$ is the solution of $\dot{x} = f^*(x)$ passing
through $x(0)$ at time $t = 0$. As a consequence,
the restriction of $\dot{x} = f^*(x)$ to its invariant
manifold $Z^*$ characterizes all internal dynamics
that occur in the system once initial condition and
input are chosen in such a way that the output is
constrained to be identically zero. The dynamics
in question are called the zero dynamics of the
system. Note that this construction demonstrates,
as anticipated, the equivalence between an
“open-loop” and a “closed-loop” characterization of all
the (internal) dynamics of a given system that
are compatible with the constraint that the output
is identically zero. This construction can be
extended to multi-input multi-output systems, with
the aid of an appropriate recursive algorithm,
known as the zero dynamics algorithm (Isidori
1995).
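
The construction can be followed step by step on a small example. The three-dimensional system below is hypothetical and chosen only so that every Lie derivative can be computed by hand; it is not taken from the cited references.

```latex
% Hypothetical example: f(x) = (x_2,\; x_1 x_3,\; x_1 - x_3)^T,
% g(x) = (0,\; 1,\; 0)^T, h(x) = x_1.
\begin{aligned}
\dot{x}_1 &= x_2, \qquad \dot{x}_2 = x_1 x_3 + u, \qquad \dot{x}_3 = x_1 - x_3, \qquad y = x_1, \\
L_g h(x) &= 0, \qquad L_f h(x) = x_2, \qquad L_g L_f h(x) = 1 \neq 0
  \;\Longrightarrow\; r = 2, \\
Z^* &= \{ x \in \mathbb{R}^3 : x_1 = x_2 = 0 \}, \\
L_f^2 h(x) &= x_1 x_3
  \;\Longrightarrow\; u^*(x) = -\frac{L_f^2 h(x)}{L_g L_f h(x)} = -x_1 x_3, \\
f^*(x) &= f(x) + g(x) u^*(x) = (x_2,\; 0,\; x_1 - x_3)^T, \\
f^*(x)\big|_{Z^*} &= (0,\; 0,\; -x_3)^T
  \;\Longrightarrow\; \dot{x}_3 = -x_3 \quad \text{on } Z^* .
\end{aligned}
```

On $Z^*$ the vector $f^*(x)$ has no component along $x_1$ or $x_2$, so it is tangent to $Z^*$ as claimed, and the zero dynamics of this example reduce to the scalar equation $\dot{x}_3 = -x_3$, whose equilibrium is globally asymptotically stable.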


Normal Forms

The coordinate-free construction presented above
becomes even more transparent if special coordinates
are chosen. To this end, set

$$g^*(x) = \frac{1}{L_g L_f^{r-1} h(x)} \, g(x)$$

and define, recursively,

$$X_0(x) = g^*(x), \qquad X_k(x) = [f^*(x), X_{k-1}(x)],$$

for $1 \le k \le r - 1$, in which $[Y(x), X(x)]$ denotes
the Lie bracket of $Y(x)$ and $X(x)$. It is possible to
show that if the vector fields $X_0(x), \ldots, X_{r-1}(x)$
are complete, there exists a smooth nonlinear,
globally defined, change of variables by means
of which the system can be transformed into a
system of the form

$$\begin{aligned}
\dot{z} &= f_0(z, \xi) \\
\dot{\xi} &= A_r \xi + B_r [q_0(z, \xi) + b(z, \xi) u] \\
y &= C_r \xi
\end{aligned}$$

in which $z \in \mathbb{R}^{n-r}$, $\xi \in \mathbb{R}^r$, the matrices
$A_r, B_r, C_r$ have the form

$$A_r = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{pmatrix}, \qquad
B_r = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix},$$

$$C_r = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \end{pmatrix},$$

and $b(z, \xi) \neq 0$ for all $(z, \xi)$. These equations are
said to be in normal form (Isidori 1995).

It is easy to check that, in these coordinates,
the manifold $Z^*$ is the set of all pairs $(z, \xi)$
having $\xi = 0$, the state feedback law $u^*(x)$ is
the function

$$u^*(z, \xi) = -\frac{q_0(z, \xi)}{b(z, \xi)}$$

and the restriction of $\dot{x} = f^*(x)$ to the manifold
$Z^*$ is nothing else than

$$\dot{z} = f_0(z, 0).$$

The latter provides a simple characterization of
the zero dynamics of the system, once the
system has been brought to its normal form.

It is worth observing that, in the case of a
linear system, functions $f_0(z, \xi)$ and $q_0(z, \xi)$ are
linear functions, and $b(z, \xi)$ is a constant.
Consequently, the normal form can be written as

$$\begin{aligned}
\dot{z} &= F z + G \xi \\
\dot{\xi} &= A_r \xi + B_r [H z + K \xi + b u] \\
y &= C_r \xi .
\end{aligned}$$

It is also easy to check that the transfer function
of the system can be expressed as

$$T(s) = b \, \frac{\det(sI - F)}{\det(sI - A)}$$

in which

$$A = \begin{pmatrix} F & G \\ B_r H & A_r + B_r K \end{pmatrix} .$$

From this it is concluded that in a (controllable
and observable) linear system, the zeros of the
transfer function $T(s)$ coincide with the eigenvalues
of $F$. In other words, in a linear system
the zero dynamics are linear dynamics whose
eigenvalues coincide with the zeros of the transfer
function of the system.

The Inverse System 

Another property associated with the notion of
zero of the transfer function, in a (single-input
single-output) linear system, is the fact that the
zeros characterize the dynamics of the inverse
system (the latter being, loosely speaking, a
system able to reproduce the input $u(t)$ from the
output $y(t)$ that this input has generated). This
property has an immediate analogue for nonlinear
systems. Considering the system in normal form and
setting

$$\mathbf{y}^{r-1}(t) = \mathrm{col}\bigl(y(t), y^{(1)}(t), \ldots, y^{(r-1)}(t)\bigr),$$

it is easily seen that the input $u(t)$ can be determined
as the output of a dynamical system, driven
by $\mathbf{y}^{r-1}(t)$ and $y^{(r)}(t)$, modeled by

$$\begin{aligned}
\dot{z} &= f_0(z, \mathbf{y}^{r-1}) \\
u &= \frac{y^{(r)} - q_0(z, \mathbf{y}^{r-1})}{b(z, \mathbf{y}^{r-1})}
\end{aligned} \qquad (1)$$

Thus, it is concluded that the unforced internal
dynamics of the inverse system coincide with the
zero dynamics as defined above.

It should be stressed, though, that the coincidence
is limited to the case of single-input
single-output systems. For multi-input multi-output
nonlinear systems, the link between zero
dynamics and the dynamics of the inverse system
is more subtle. This is essentially due to the fact
that while the concept of zero dynamics only
seeks to determine the dynamics compatible with
the constraint that the output is identically zero,
the inverse system must describe all dynamics
resulting in any admissible output function. As a
consequence, computation of the zero dynamics
and computation of the inverse system (whenever
this is possible) are not equivalent and the latter
is possible only under substantially stronger
assumptions. The computation of the zero dynamics
is based on an extension (Isidori 1995)
of the classical algorithm of Wonham (1979) for
the computation of the largest controlled invariant
subspace in the kernel of the output map, while
the computation of the inverse system is based
on extensions, due to Hirschorn (1979) and Singh
(1981), of the so-called structure algorithm
introduced by Silverman (1969) for the computation
of inverses and zero structure at infinity.
For a comparison of such assumptions and of
their influence on the outcome of the associated
algorithms, see Isidori and Moog (1988).

Input-Output Linearization 

An appealing feature of the normal form described
above is the straightforward observation
that a (state) feedback law of the form

$$u = \frac{1}{b(z, \xi)} \bigl[ -q_0(z, \xi) + K_r \xi + v \bigr]$$

changes the system into a system

$$\begin{aligned}
\dot{z} &= f_0(z, \xi) \\
\dot{\xi} &= (A_r + B_r K_r) \xi + B_r v \\
y &= C_r \xi
\end{aligned}$$

whose input-output behavior (between input $v$
and output $y$) is fully linear (and stable if $K_r$ is
chosen so that the matrix $A_r + B_r K_r$ is Hurwitz).
In fact, the law in question renders the system
partially unobservable, with all nonlinearities
confined to its unobservable part (Isidori et al.
1981). This control law is clearly non-robust,
as it relies upon exact cancelation of possibly
uncertain terms, but it can be rendered robust
by means of appropriate dynamic compensation
(Freidovich and Khalil 2008).
The system obtained in this way has the structure
of a cascade of two sub-systems, one of
which, modeled as

$$\dot{z} = f_0(z, \xi),$$

is seen as “driven” by the input $\xi$. This motivates
the interest in classifying the asymptotic properties
of such a subsystem, as discussed below.
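
The effect of the linearizing feedback can be observed in a short simulation. The sketch below assumes a hypothetical system already in normal form with $r = 2$ and a scalar $z$; the functions `f0`, `q0`, `b` and the gain `K_R` are illustrative choices, not taken from the cited references. Because the feedback cancels `q0` and `b` exactly, the $\xi$ subsystem should behave like the linear system $\dot{\xi} = (A_r + B_r K_r)\xi$.

```python
import math

K_R = (-1.0, -2.0)  # assumed gain: A_r + B_r K_r has a double eigenvalue at -1

def f0(z, xi):
    return -z + xi[0]           # z-subsystem, driven by xi

def q0(z, xi):
    return xi[0] * z            # nonlinearity to be canceled

def b(z, xi):
    return 1.0 + z * z          # never zero

def control(z, xi, v=0.0):
    """Input-output linearizing law u = [-q0 + K_r xi + v] / b."""
    return (-q0(z, xi) + K_R[0] * xi[0] + K_R[1] * xi[1] + v) / b(z, xi)

def deriv(state):
    z, xi1, xi2 = state
    xi = (xi1, xi2)
    u = control(z, xi)
    return (f0(z, xi), xi2, q0(z, xi) + b(z, xi) * u)

def rk4(state, t_end, dt=0.001):
    """Integrate the closed loop with the classical RK4 scheme."""
    for _ in range(int(round(t_end / dt))):
        k1 = deriv(state)
        k2 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = deriv(tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(s + dt / 6.0 * (a1 + 2 * a2 + 2 * a3 + a4)
                      for s, a1, a2, a3, a4 in zip(state, k1, k2, k3, k4))
    return state

z, xi1, xi2 = rk4((2.0, 1.0, -1.0), 1.0)
print(xi1, math.exp(-1.0))  # xi1 follows the linear closed loop, here exp(-t)
```

The state $z$ does not appear in the $\xi$ equations of the closed loop, which is the partial unobservability mentioned above; whether $z$ itself stays bounded depends entirely on the properties of $\dot{z} = f_0(z, \xi)$.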

Asymptotic Properties of the Zero 
Dynamics 

Linear systems with no zeros in the right-half
complex plane are traditionally called minimum-phase
systems, in view of certain properties of the
Bode gain and phase plots of their transfer functions.
Thus, in view of the interpretation given above,
linear systems whose zero dynamics are asymptotically
stable are minimum-phase systems. This
terminology has been (somewhat abusively, but
with the clear intent of providing a concise and
expressive characterization) borrowed to classify
nonlinear systems whose zero dynamics have
desirable (from the stability viewpoint) properties.
Assuming that $z = 0$ is an equilibrium of
$\dot{z} = f_0(z, 0)$, the following cases are considered:

• A nonlinear system is locally minimum-phase
(respectively, locally exponentially minimum-phase)
if the equilibrium $z = 0$ of $\dot{z} = f_0(z, 0)$
is locally asymptotically (respectively, locally
exponentially) stable (Byrnes and Isidori
1984).

• A nonlinear system is globally minimum-phase
if the equilibrium $z = 0$ of $\dot{z} = f_0(z, 0)$
is globally asymptotically stable (Byrnes and
Isidori 1991).

• A nonlinear system is strongly minimum-phase
if the system $\dot{z} = f_0(z, \xi)$, viewed as
a system with input $\xi$ and state $z$, is
input-to-state stable (Liberzon 2002).

According to the well-known criterion of
Sontag (1995) for input-to-state stability, a
system is strongly minimum phase if and only
if there exist a positive definite and proper
smooth real-valued function $V(z)$, class $\mathcal{K}_\infty$
functions $\underline{\alpha}(\cdot)$, $\bar{\alpha}(\cdot)$, $\alpha(\cdot)$ and a class $\mathcal{K}$ function
$\chi(\cdot)$ satisfying

$$\underline{\alpha}(|z|) \le V(z) \le \bar{\alpha}(|z|) \quad \forall z$$

$$\frac{\partial V}{\partial z} f_0(z, \xi) \le -\alpha(|z|) \quad \forall (z, \xi) \text{ such that } |z| \ge \chi(|\xi|).$$

As a special case, it is seen that a system is
globally minimum phase if and only if there
exists a function $V(z)$, bounded as above, such
that

$$\frac{\partial V}{\partial z} f_0(z, 0) \le -\alpha(|z|) \quad \forall z.$$

If, instead, the weaker inequality

$$\frac{\partial V}{\partial z} f_0(z, 0) \le 0 \quad \forall z$$

holds, the system is said to be globally weakly
minimum-phase.

The criterion summarized above is of
paramount importance in the design of feedback
laws for the purpose of stabilizing nonlinear
systems that are globally (or strongly) minimum
phase, as will be seen below.
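
The Lyapunov-type criterion can be verified by hand on a simple case. The scalar subsystem and the function $V$ below are hypothetical choices made only to show how the comparison functions arise.

```latex
% Hypothetical example: \dot{z} = f_0(z,\xi) = -z + \xi with z, \xi scalar,
% and candidate function V(z) = z^2/2.
\begin{aligned}
\underline{\alpha}(s) &= \bar{\alpha}(s) = \tfrac{1}{2} s^2
  \;\Longrightarrow\; \underline{\alpha}(|z|) \le V(z) \le \bar{\alpha}(|z|), \\
\frac{\partial V}{\partial z} f_0(z,\xi) &= -z^2 + z\xi \;\le\; -z^2 + |z|\,|\xi| , \\
|z| \ge 2|\xi| &\;\Longrightarrow\; |z|\,|\xi| \le \tfrac{1}{2} z^2
  \;\Longrightarrow\; \frac{\partial V}{\partial z} f_0(z,\xi) \le -\tfrac{1}{2} z^2 .
\end{aligned}
```

The inequalities therefore hold with $\alpha(s) = s^2/2$ and $\chi(s) = 2s$, so this example is strongly minimum phase; setting $\xi = 0$ gives $\frac{\partial V}{\partial z} f_0(z, 0) = -z^2$, which confirms that it is also globally minimum phase.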

Zero Dynamics and Stabilization 

The first and most immediate implication of
the properties described above is the fact that the
feedback law

$$u = \frac{1}{b(z, \xi)} \bigl[ -q_0(z, \xi) + K_r \xi \bigr],$$

if $K_r$ is chosen so that the matrix $A_r + B_r K_r$
is Hurwitz, globally asymptotically stabilizes
the equilibrium $(z, \xi) = (0, 0)$ of a strongly
minimum-phase system. In fact, as observed, the
corresponding closed-loop system can be seen as
an asymptotically stable (linear) system driving
an input-to-state stable (nonlinear) system. As
already observed, this control mode is non-robust
(as it relies upon exact cancelations) and
requires the availability of the full state $(z, \xi)$
of the controlled system. However, both these
deficiencies can be to some extent fixed, by
means of appropriate techniques that will be
briefly reviewed below.

If the requirement of global stability is replaced
by the (weaker) requirement of stability
with a guaranteed region of attraction, then the
desired control goal can be achieved by means
of a much simpler law, depending only on the
partial state $\xi$ and not requiring cancelations.
Stability with a guaranteed region of attraction
essentially means that a given equilibrium is
rendered asymptotically stable, with a region of
attraction that contains an a priori fixed compact
set. In this context, the most relevant results can
be summarized as follows.

Assume the system possesses a globally defined
normal form and, without loss of generality,
let $b(z, \xi) > 0$. Let the system be controlled by a
“partial state” feedback of the form

$$u = -k K_r \xi ,$$

in which $k \in \mathbb{R}$. Under this control mode, the
following results are obtained:

• Suppose the system is strongly minimum
phase. Then, there is a matrix $K_r$ and, for
every choice of a compact set $C$ and of a
number $\varepsilon > 0$, there are a number $k^*$ and a
time $T^*$ such that, if $k > k^*$, all trajectories of
the closed-loop system with initial condition
in $C$ are bounded and satisfy $|x(t)| < \varepsilon$ for all
$t > T^*$.

• Suppose the system is strongly minimum
phase and also locally exponentially minimum
phase. Suppose $q_0(0, 0) = 0$. Then, there
is a matrix $K_r$ and, for every choice of a
compact set $C$, there is a number $k^*$ such
that, if $k > k^*$, the equilibrium $x = 0$ of
the system is locally asymptotically stable,
with a domain of attraction that contains the
set $C$.

In these results, the system is stabilized by
means of a static control law that depends only
on the partial state $\xi$ and not on the (possibly
unknown) quantities $q_0(z, \xi)$, $b(z, \xi)$. Bearing
in mind the fact that the $r$ components of
$\xi$ coincide with the output $y$ and its derivatives
$y^{(1)}, \ldots, y^{(r-1)}$, it is possible to replace the
control in question by means of a dynamic control
law that only depends on the output $y$, following
a design paradigm originally proposed by H.
Khalil. In fact, if the system is strongly minimum
phase and also locally exponentially minimum
phase and if $q_0(0, 0) = 0$, asymptotic stability
with a guaranteed region of attraction can be
achieved by means of a dynamical feedback law of
the form (Khalil and Esfandiari 1993)

$$\begin{aligned}
\dot{\hat{\xi}}_1 &= \hat{\xi}_2 + \kappa \, c_{r-1} (y - \hat{\xi}_1) \\
\dot{\hat{\xi}}_2 &= \hat{\xi}_3 + \kappa^2 c_{r-2} (y - \hat{\xi}_1) \\
&\;\;\vdots \\
\dot{\hat{\xi}}_{r-1} &= \hat{\xi}_r + \kappa^{r-1} c_1 (y - \hat{\xi}_1) \\
\dot{\hat{\xi}}_r &= \kappa^r c_0 (y - \hat{\xi}_1) \\
u &= -\sigma_L(k K_r \hat{\xi}),
\end{aligned}$$

in which $\kappa$ and the $c_i$ are design parameters and
$\sigma_L(s)$ is a smooth saturation function, characterized
as follows: $\sigma_L(s) = s$ if $|s| \le L$, $\sigma_L(s)$
is odd and monotonically increasing, with $0 <
\sigma_L'(s) \le 1$, and $\lim_{s \to \infty} \sigma_L(s) = L(1 + c)$ with
$0 < c \ll 1$. The number $L$ is also a design
parameter.

It is also possible to show that a suitable
“extension” of this dynamic feedback law can be
used to asymptotically recover the effects of the
input-output linearizing law considered earlier.
In this way, the lack of robustness intrinsically
present in such a control law is overcome
(Freidovich and Khalil 2008).


Output Regulation 

The concept of zero dynamics plays a fundamental
role in the problem of output regulation. The
problem in question considers a controlled plant
modeled by

$$\begin{aligned}
\dot{x} &= f(w, x, u) \\
e &= h(w, x),
\end{aligned}$$

in which $u$ is the control input, $w$ is a set of
exogenous variables (commands and disturbances),
and $e$ is a set of regulated variables. The exogenous
variables are thought of as generated by an
autonomous system

$$\dot{w} = s(w)$$

known as the exosystem. The problem is to design
a (possibly dynamic) controller

$$\begin{aligned}
\dot{x}_c &= f_c(x_c, e) \\
u &= h_c(x_c, e)
\end{aligned}$$

driven by the regulated variable $e$, such that
in the resulting closed-loop system all trajectories
are ultimately bounded and $\lim_{t \to \infty} e(t) = 0$.
The problem in question has been the object
of intensive research in the past years. In
what follows we limit ourselves to highlighting the
role of the concept of zero dynamics in this
problem.

Assume that the set $W$ where the exosystem
evolves is compact and invariant and suppose a
controller exists that solves the problem of output
regulation. Then, the associated closed-loop system has a
steady-state locus (see Isidori and Byrnes 2008),
the graph of a possibly set-valued map defined on
$W$. Suppose the map in question is single-valued,
which means that for each given exogenous input
function $w(t)$, there exists a unique steady-state
response, expressed as $x(t) = \pi(w(t))$ and
$x_c(t) = \pi_c(w(t))$. If, in addition, $\pi(w)$ and $\pi_c(w)$
are continuously differentiable, it is readily seen
that

$$\begin{aligned}
L_s \pi(w) &= f(w, \pi(w), \psi(w)) \\
0 &= h(w, \pi(w)) \\
L_s \pi_c(w) &= f_c(\pi_c(w), 0) \\
\psi(w) &= h_c(\pi_c(w), 0)
\end{aligned} \qquad \forall w \in W .$$

The first two equations, introduced in Isidori
and Byrnes (1990), are known as the nonlinear
regulator equations. They clearly show that the
graph of the map $\pi(w)$ is a manifold contained
in the zero set of the output map $e$, rendered
invariant by the control $u = \psi(w)$. In particular,
the steady-state trajectories of the closed-loop
system are trajectories of the zero dynamics of
the controlled plant. The last two equations,
on the other hand, express the ability of the
controller to generate the feedforward input necessary
to keep $e(t) = 0$ in steady state. This is
a nonlinear version of the well-known internal
model principle of Francis and Wonham (1975).
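
It may help to see what the regulator equations reduce to in the linear case. The specialization below is a sketch under stated assumptions: the plant, exosystem, and steady-state maps are taken linear, the matrices $A, B, P, C, Q, S, \Pi, \Gamma$ are notation introduced here for illustration, and the steady-state input is written $\psi(w)$.

```latex
% Assumed linear data: f(w,x,u) = Ax + Bu + Pw, h(w,x) = Cx + Qw, s(w) = Sw,
% with linear steady-state maps \pi(w) = \Pi w and \psi(w) = \Gamma w.
\begin{aligned}
L_s \pi(w) &= \frac{\partial \pi}{\partial w} \, s(w) = \Pi S w , \\
\Pi S w &= A \Pi w + B \Gamma w + P w , \qquad 0 = C \Pi w + Q w \qquad \forall w , \\
\Longrightarrow \quad \Pi S &= A \Pi + B \Gamma + P , \qquad 0 = C \Pi + Q .
\end{aligned}
```

These are linear matrix equations in the unknowns $\Pi$ and $\Gamma$: the first states that the subspace $x = \Pi w$ is rendered invariant by $u = \Gamma w$, and the second that the regulated variable vanishes on it.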


Passivity 

Consider a nonlinear input-affine system having
the same number $m$ of inputs and outputs and
recall that this system is said to be passive if
there exists a continuous nonnegative
real-valued function $W(x)$, with $W(0) = 0$, that
satisfies

$$W(x(t)) - W(x(0)) \le \int_0^t y^T(s) u(s) \, ds$$

along trajectories. The function $W(x)$ is the
so-called storage function of the system.

It is well known that the notion of passivity
plays an important role in system analysis
and that the theory of passive systems leads to
powerful methodologies for the design of feedback
laws for nonlinear systems. In this context,
the question of whether a given, non-passive,
nonlinear system could be rendered passive by
means of state feedback is indeed relevant. It
turns out that this possibility can be simply
expressed as a property of the zero dynamics of the
system.

Suppose that $L_g h(x)$ is nonsingular and set
$g^*(x) = g(x)[L_g h(x)]^{-1}$. If the $m$ columns
of $g^*(x)$ are complete and commuting vector
fields, there exists a globally defined change of
coordinates that brings the system into normal form

$$\begin{aligned}
\dot{z} &= f_0(z, y) \\
\dot{y} &= q_0(z, y) + b(z, y) u .
\end{aligned}$$

Then, there exists a feedback law $u = \alpha(z, y)$ that
renders the resulting closed-loop system passive,
with a $C^2$ and positive definite storage function
$W(x)$, if and only if the system is globally weakly
minimum phase (Byrnes et al. 1991).

Limits of Performance

It is well known that linear systems having zeros
in the right-half plane are difficult to control, and
that obstructions exist to the fulfillment of certain
control specifications. One of these is found in
the analysis of the so-called cheap control problem,
namely, the problem of finding a stabilizing
feedback control that minimizes the functional

$$J_\varepsilon = \frac{1}{2} \int_0^\infty \bigl[ y^T(t) y(t) + \varepsilon \, u^T(t) u(t) \bigr] \, dt$$

when $\varepsilon > 0$ is small. As $\varepsilon \to 0$, the optimal
value $J_\varepsilon^*$ tends to $J_0^*$, the ideal performance. It
is well known that, in a linear system, $J_0^* = 0$
if and only if the system is minimum phase and
right invertible and, in case the system has zeros
with positive real part, it is possible to express
$J_0^*$ explicitly in terms of the zeros in question. If
the (linear) system is expressed in normal form as

$$\begin{aligned}
\dot{z} &= F z + G \xi \\
\dot{\xi} &= H z + K \xi + b u \\
y &= \xi
\end{aligned}$$

with $b \neq 0$, and the zero dynamics are antistable
(that is, all the eigenvalues of $F$ have positive real
part), it can be shown that $J_0^*$ coincides with the
minimal value of the energy

$$J = \frac{1}{2} \int_0^\infty \xi^T(t) \xi(t) \, dt$$

required to stabilize the (antistable) system $\dot{z} =
F z + G \xi$. In other words, the limit as $\varepsilon \to 0$ of the
optimal value of $J_\varepsilon$ is equal to the least amount of
energy required to stabilize the dynamics of the
inverse system.

This result has an appealing nonlinear counterpart
(Seron 1999). In fact, for a nonlinear
input-affine system having the same number $m$ of
inputs and outputs in normal form, with $f_0(z, \xi)$
of the form $f_0(z, \xi) = f_0(z) + g_0(z)\xi$ and $\dot{z} =
f_0(z)$ antistable, under appropriate technical
assumptions (mostly related to the existence of the
solution of the associated optimal control problems),
the same result holds: the lowest attainable
value of the $L_2$ norm of the output coincides with
the least amount of energy required to stabilize
the dynamics of $z$.

Summary and Future Directions 

The concept of zero dynamics plays an important
role in a large number of problems arising in
analysis and design of nonlinear control systems,
among which the most relevant ones are the
problems of asymptotic stabilization and those of
asymptotic tracking/rejection of exogenous
command/disturbance inputs. Essentially, all such
applications deal with single-input single-output
systems, require the system to be preliminarily
reduced to a special form by means of an appropriate
change of coordinates, and assume the dynamics
in question to be globally asymptotically stable.
The analysis of systems having many inputs and
many outputs, of systems in which normal forms
cannot be defined, and of systems in which the
zero dynamics are unstable is still a challenging
and unexplored area of research.

Cross-References 

► Differential Geometric Methods in Nonlinear 
Control 

► Input-to-State Stability 

► Regulation and Tracking of Nonlinear Systems 

Abstract

This entry gives an overview of classical
and state-of-the-art nonparametric time- and
frequency-domain techniques. In contrast to
parametric methods, these techniques require
no detailed structural information to get
insight into the dynamic behavior of complex
systems. Therefore, nonparametric methods are
used in system identification to get an initial
idea of the model complexity and for model
validation purposes (e.g., detection of unmodeled
dynamics). Their drawback is the increased
variability compared with the parametric
estimates. Although the main focus of this entry
is on the classical identification framework
(estimation of dynamical systems operating
in open loop from known input, noisy output
observations), the reader will also learn more
about (i) the connection between transient and
leakage errors, (ii) the estimation of dynamical
systems operating in closed loop, (iii) the
estimation in the presence of input noise, and (iv)
the influence of nonlinear distortions on the linear
framework. All results are valid for discrete- and
continuous-time systems. The entry concludes
with some user choices and practical guidelines
for setting up a system identification experiment
and choosing an appropriate estimation method.

Keywords 

Best linear approximation; Correlation method;
Empirical transfer function estimate;
Errors-in-variables; Feedback; Frequency response
function; Gaussian process regression; Impulse
transient response modeling method; Local
polynomial method; Local rational method;
Noise (co)variances; Noise power spectrum;
Spectral analysis

Introduction 

Nonparametric representations such as frequency
response functions (FRFs) and noise power spectra
are very useful in system identification: they
are used (i) to verify the quality of the
identification experiment (high or poor signal-to-noise
ratio?), (ii) to quickly gain insight into the
dynamic behavior of the plant (complex or easy
identification problem?), and (iii) to validate the
parametric plant and noise models (detection of
unmodeled dynamics); see also ► System
Identification: An Overview. In addition, via specially
designed periodic excitation signals, it is possible 
to detect and quantify the nonlinear distortions 
in the FRF estimate. As such, without estimating 
a parametric model, the users can easily decide 
whether or not the linear framework is accurate 
enough for their particular application. 

The estimation of the nonparametric models
typically starts from sampled input-output signals
$u(nT_s)$ and $y(nT_s)$, $n = 0, 1, \ldots, N-1$, that
are transformed to the frequency domain via the
discrete Fourier transform (DFT)

$$X(k) = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} x(nT_s)\, e^{-j2\pi kn/N} \qquad (1)$$

with $T_s$ the sampling period, $x = u$ or $y$, and
$X = U$ or $Y$. One of the main difficulties
in estimating an FRF and noise power spectrum
is the leakage error in the DFT spectrum
$X(k) = \mathrm{DFT}(x(t))$ (1). It is due to the finite
duration $NT_s$ of the experiment, and it increases
the mean square error of the nonparametric
estimates. Therefore, all methods try to suppress the
leakage error as much as possible.
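
As a small numerical illustration of (1) and of the leakage effect, the following sketch (assuming NumPy; the record length and the two sine frequencies are arbitrary choices made here, not values from the entry) computes the scaled DFT of a sine that fits an integer number of periods in the record and of one that does not.

```python
import numpy as np

def scaled_dft(x):
    """DFT (1): X(k) = N^(-1/2) * sum_n x(n T_s) exp(-j 2 pi k n / N)."""
    x = np.asarray(x, dtype=float)
    return np.fft.fft(x) / np.sqrt(len(x))

def rms(x):
    """Root mean square value of a (possibly complex) sequence."""
    return np.sqrt(np.mean(np.abs(x) ** 2))

N = 256
n = np.arange(N)
x_int = np.sin(2 * np.pi * 8 * n / N)      # 8 full periods in the record
x_frac = np.sin(2 * np.pi * 8.5 * n / N)   # 8.5 periods: the record cuts a period

X_int, X_frac = scaled_dft(x_int), scaled_dft(x_frac)

# Number of DFT lines that carry a non-negligible part of the signal
lines_int = int(np.sum(np.abs(X_int) > 1e-9))
lines_frac = int(np.sum(np.abs(X_frac) > 1e-9))
```

With the $1/\sqrt{N}$ scaling the rms value of the spectrum equals that of the time signal. The sine with an integer number of periods occupies two DFT lines only, while the truncated one spreads over the whole frequency axis.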

This entry starts with a detailed analysis of
the leakage problem (section “The Leakage
Problem”), followed by an overview of standard
and advanced nonparametric time-domain (section
“Nonparametric Time-Domain Techniques”) and
frequency-domain (section “Nonparametric
Frequency-Domain Techniques”) techniques.
First, it is assumed that the system operates
in open loop (see Fig. 1) and that known
input, noisy output observations are available
(sections “Nonparametric Time-Domain Techniques”
and “Nonparametric Frequency-Domain
Techniques”). Next, section “Extensions”
extends the results to systems operating in
closed loop (section “Systems Operating in
Feedback”); to noisy input, noisy output
observations (section “Noisy Input, Noisy Output
Observations”); and to nonlinear systems (section
“Nonlinear Systems”). Finally, some user choices
are discussed (section “User Choices”) and
some practical guidelines are given (section
“Guidelines”). Unless otherwise stated, the input
u(t) and the disturbing noise v(t) are assumed to
be statistically uncorrelated.

Fig. 1 Classical identification framework: discrete- or
continuous-time plant operating in open loop; known
input u(t), noisy output y(t) observations; and v(t) filtered
discrete-time or band-limited continuous-time white noise
e(t) that is independent of u(t). $y_0(t)$ denotes the true
output of the plant. In the continuous-time case, it is
assumed that the unobserved driving noise source e(t) has
finite variance and constant (white) power spectrum within
the acquisition bandwidth


The Leakage Problem 

For arbitrary excitations u(t), the relationship
between the true input $U(k)$ and true output $Y_0(k)$
DFT spectra (1) of a linear dynamic system is
given by

$$Y_0(k) = G(\Omega_k)\, U(k) + T_G(\Omega_k) \qquad (2)$$

where $\Omega_k = j\omega_k$ or $\exp(-j\omega_k T_s)$ for,
respectively, continuous- and discrete-time
systems; $\omega_k = 2\pi k/(NT_s)$; $G(\Omega_k)$ the plant
frequency response function; and $T_G(\Omega_k)$
the leakage error due to the plant dynamics
(Pintelon and Schoukens 2012, Section 6.3.2).
The leakage error $T_G(\Omega)$ is a smooth function of
the frequency that decreases to zero as $O(N^{-1/2})$
for N increasing to infinity. It depends on the
difference between the initial and final conditions
of the experiment and has exactly the same poles
as the plant transfer function. Therefore, the
time-domain response of $T_G(\Omega)$ is decaying
exponentially to zero as a transient error.
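
The properties of the leakage term can be checked on a toy example. The sketch below assumes a hypothetical first-order discrete-time plant $y_0(t) = a\,y_0(t-1) + b\,u(t)$ (not taken from the entry). For this particular plant, taking the scaled DFT of the difference equation gives $T_G(\Omega_k) = a\,(y_0(-1) - y_0(N-1)) / (\sqrt{N}\,(1 - a\,\Omega_k))$, which is compared with the term defined by (2).

```python
import numpy as np

rng = np.random.default_rng(0)
N, a, b = 512, 0.9, 1.0           # hypothetical record length and plant parameters
u = rng.standard_normal(N)
y_init = 3.0                      # y0(-1): assumed nonzero initial condition

# Simulate y0(t) = a*y0(t-1) + b*u(t) for t = 0..N-1
y0 = np.empty(N)
prev = y_init
for t in range(N):
    prev = a * prev + b * u[t]
    y0[t] = prev

U = np.fft.fft(u) / np.sqrt(N)
Y0 = np.fft.fft(y0) / np.sqrt(N)
k = np.arange(N)
Omega = np.exp(-2j * np.pi * k / N)          # Omega_k = exp(-j*omega_k*T_s)
G = b / (1 - a * Omega)

T_G = Y0 - G * U                              # leakage term as defined by (2)
# Closed form for this first-order example: same pole as G, scales as N^(-1/2),
# and depends on the difference between initial and final conditions
T_G_closed = a * (y_init - y0[-1]) / np.sqrt(N) / (1 - a * Omega)
```

The two expressions agree to machine precision, which makes the three stated properties visible for this example: the $N^{-1/2}$ decay, the dependence on initial minus final condition, and the common pole with the plant.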

From this short discussion, it can be concluded 
that the leakage error in the frequency domain 
is equivalent to the transient error in the time 
domain. The only difference is that the former
depends on the difference between the initial and
final conditions, while the latter depends solely
on the initial conditions.

Standard spectral analysis methods (see section
“Spectral Analysis Method”) suppress the
leakage term $T_G(\Omega_k)$ in (2) by multiplying the
time-domain signals with a window w(t) before
taking the DFT (1)

$$(X(k))_W = \frac{1}{\sqrt{N}\, w_{\mathrm{rms}}} \sum_{n=0}^{N-1} w(nT_s)\, x(nT_s)\, e^{-j2\pi kn/N} \qquad (3)$$

with $w_{\mathrm{rms}} = \left( \sum_{n=0}^{N-1} |w(nT_s)|^2 / N \right)^{1/2}$ the root
mean square (rms) value of the window w(t).
The scaling in (3) is such that the transformation
preserves the rms value of the signal. The
relationship between the DFT spectra $(U(k))_W$ and
$(Y_0(k))_W$ of the windowed input-output signals
$w(t)u(t)$ and $w(t)y_0(t)$ is given by

$$(Y_0(k))_W = G(\Omega_k)\,(U(k))_W + E_{\mathrm{int}}(k) + E_{\mathrm{leak}}(k) \qquad (4)$$

where $E_{\mathrm{int}}(k)$ and $E_{\mathrm{leak}}(k)$ are, respectively, the
interpolation error and the remaining leakage
error

$$E_{\mathrm{int}}(k) = (G(\Omega_k)\,U(k))_W - G(\Omega_k)\,(U(k))_W \qquad (5)$$

$$E_{\mathrm{leak}}(k) = (T_G(\Omega_k))_W \qquad (6)$$

Note that $E_{\mathrm{int}}(k) = 0$ if $G(\Omega)$ is constant
within the bandwidth of W(k), while the
interpolation error is large if the FRF varies significantly
within the window bandwidth. To keep $E_{\mathrm{int}}(k)$
small, the frequency resolution should be
sufficiently high (i.e., the frequency spacing $1/(NT_s)$
sufficiently small) and the window bandwidth
should be small enough. On the other hand, a
larger window bandwidth is beneficial for
reducing the leakage error $E_{\mathrm{leak}}(k)$. Hence, choosing
an appropriate window for nonparametric FRF
and noise power spectrum estimation amounts to making
a trade-off between the reduction of the leakage
error $E_{\mathrm{leak}}(k)$ and the increase of the
interpolation error $E_{\mathrm{int}}(k)$ (Schoukens et al. 2006).
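
The windowed DFT (3) can be sketched as follows. The Hann window, the test signals, and the DFT line used for the comparison are assumptions made for this illustration only.

```python
import numpy as np

def windowed_dft(x, w):
    """Windowed DFT (3), scaled by 1 / (sqrt(N) * w_rms)."""
    x, w = np.asarray(x, dtype=float), np.asarray(w, dtype=float)
    N = len(x)
    w_rms = np.sqrt(np.sum(np.abs(w) ** 2) / N)
    return np.fft.fft(w * x) / (np.sqrt(N) * w_rms)

N = 256
n = np.arange(N)
rect = np.ones(N)                                # rectangular window
hann = 0.5 - 0.5 * np.cos(2 * np.pi * n / N)     # periodic Hann window

x = np.sin(2 * np.pi * 10.5 * n / N)             # 10.5 periods: strong leakage
X_rect = windowed_dft(x, rect)
X_hann = windowed_dft(x, hann)

far = 60                                          # a DFT line far away from the sine
leak_rect, leak_hann = abs(X_rect[far]), abs(X_hann[far])

# rms preservation, checked on a signal of constant magnitude
x_pm = np.where(n % 3 == 0, 1.0, -1.0)
power_hann = np.mean(np.abs(windowed_dft(x_pm, hann)) ** 2)
```

Far from the sine frequency the Hann-windowed spectrum is orders of magnitude smaller than the rectangular one (leakage suppression). The price is a wider main lobe, which is the source of the interpolation error in (5).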

Note that exactly the same analysis can be
made for the continuous- or discrete-time
dynamics of the disturbing output noise v(t) in Fig. 1

$$V(k) = H(\Omega_k)\, E(k) + T_H(\Omega_k) \qquad (7)$$

with $H(\Omega_k)$ the noise frequency response
function, E(k) the DFT of the unobserved
driving discrete-time or band-limited
continuous-time white noise source e(t) (Pintelon and
Schoukens 2012, Section 6.7.3), and $T_H(\Omega_k)$
the noise leakage (transient) term. The noise
leakage term is often neglected but can be
important for lightly damped systems (e.g., in
modal analysis). Most nonparametric techniques
suppress the sum of the plant and noise leakage
errors $T_G(\Omega_k) + T_H(\Omega_k)$.

If an integer number of periods of the
steady-state response to a periodic excitation
is measured, then the plant leakage error $T_G(\Omega_k)$
in (2) is zero, which significantly simplifies the
estimation problem. Therefore, for the
frequency-domain techniques, a distinction is made between
periodic and nonperiodic excitations. Note,
however, that the noise leakage (transient)
term $T_H(\Omega_k)$ in (7) remains different from
zero.
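
The vanishing of the plant leakage term for periodic excitations can be verified numerically. The sketch below assumes a hypothetical first-order plant and an arbitrary periodic input. The first periods are discarded so that the kept record is, to machine precision, the steady-state response.

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = 0.9, 1.0                      # hypothetical first-order plant
N_per, P_total, P = 64, 20, 2        # period length, simulated periods, kept periods

one_period = rng.standard_normal(N_per)       # arbitrary periodic excitation
u = np.tile(one_period, P_total)

y0 = np.empty(len(u))
prev = 0.0
for t in range(len(u)):
    prev = a * prev + b * u[t]
    y0[t] = prev

# Keep the last P periods only: the start-up transient (0.9**t) has died out
u_ss, y_ss = u[-P * N_per:], y0[-P * N_per:]
N = P * N_per
U = np.fft.fft(u_ss) / np.sqrt(N)
Y0 = np.fft.fft(y_ss) / np.sqrt(N)

excited = np.arange(0, N, P)                  # only every P-th DFT line carries signal
Omega = np.exp(-2j * np.pi * excited / N)
G_true = b / (1 - a * Omega)
G_ratio = Y0[excited] / U[excited]            # output over input DFT spectra
```

At the excited lines the ratio of the output and input spectra reproduces the plant FRF exactly, confirming that $T_G(\Omega_k) = 0$ in (2) for this setting.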

Nonparametric Time-Domain 
Techniques 

The time-domain methods estimate the impulse
response of the plant via the time-domain
relationship that the true output $y_0(t)$ equals the
convolution product between the impulse response
g(t) and the true input u(t). For discrete-time
systems, it takes the form

$$y_0(t) = \sum_{n=0}^{\infty} g(n)\, u(t-n) \qquad (8)$$

In practice only a finite number of impulse
response coefficients g(t) can be estimated from
N input-output samples, and, therefore, (8) is
approximated by a finite sum

$$y_0(t) \approx \sum_{n=0}^{L} g(n)\, u(t-n) \qquad (9)$$

where $L < N - 1$ should also be determined
from the data. From (9), it can be seen that
the response depends on the past input values
u(−1), u(−2), ..., u(−L). Since these values are
unknown, an exponentially decaying transient
error is present in the first L samples of the
predicted output (9). This transient error is the
time-domain equivalent of the leakage error $T_G(\Omega_k)$
in (2). To remove the transient error, the first L
output samples can be discarded in the predicted
output (9). This reduces the amount of data from N
to N − L and, hence, increases the mean square
error of the estimates. If it is known that the
transfer function has no direct term, then g(0) =
0, and the sum (9) starts from n = 1.

Correlation Methods 

Correlation methods have been studied
intensively since the end of the 1950s (see Eykhoff
1974) and are still used today in
telecommunication channel estimation and equalization.
The impulse response coefficients are found by
minimizing the sum of the squared differences
between the observed output samples and the
output samples predicted by (9)

$$\sum_{t=L}^{N-1} \Big( y(t) - \sum_{n=0}^{L} g(n)\, u(t-n) \Big)^2 \qquad (10)$$

w.r.t. g(m), m = 0, 1, ..., L. The solution of
this linear least squares problem is given by the
famous Wiener-Hopf equation

$$\hat{R}_{yu}(m) = \sum_{n=0}^{L} g(n)\, \hat{R}_{uu}(m-n) \qquad (11)$$

for m = 0, 1, ..., L, where $\hat{R}_{yu}$ and $\hat{R}_{uu}$ are
estimates of, respectively, the cross- and
autocorrelation functions $R_{yu}(\tau) = E\{y(t)u(t-\tau)\}$ and
$R_{uu}(\tau) = E\{u(t)u(t-\tau)\}$

$$\hat{R}_{yu}(m) = \frac{1}{N-L} \sum_{t=L}^{N-1} y(t)\, u(t-m) \qquad (12)$$

$$\hat{R}_{uu}(m-n) = \frac{1}{N-L} \sum_{t=L}^{N-1} u(t-n)\, u(t-m) \qquad (13)$$

(Godfrey 1993, Chapter 1; Ljung 1999,
Chapter 6). Since the number of estimated impulse
response coefficients L can grow with the amount
of data N, the correlation method (11) is
classified as being nonparametric. If the input is
white noise, then the expected value of $\hat{R}_{uu}(m)$ is
proportional to the Kronecker delta $\delta(m)$, and the
cross-correlation $\hat{R}_{yu}(m)$ in (11) is, within a
scaling factor, a good approximation of the impulse
response. This property is used in blind channel
estimation.
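
The correlation method can be sketched in a few lines. The function name `correlation_method`, the FIR coefficients, and the data length are hypothetical choices for this illustration. The sample correlations follow (12) and (13), and the Wiener-Hopf equations (11) are solved as a linear system.

```python
import numpy as np

def correlation_method(u, y, L):
    """Solve the Wiener-Hopf equations (11) built from (12) and (13)."""
    u, y = np.asarray(u, dtype=float), np.asarray(y, dtype=float)
    N = len(u)
    t = np.arange(L, N)                       # samples t = L, ..., N-1
    # Column n of Phi holds u(t - n), n = 0..L
    Phi = np.column_stack([u[t - n] for n in range(L + 1)])
    R_uu = Phi.T @ Phi / (N - L)              # entries R_uu(m - n), Eq. (13)
    R_yu = Phi.T @ y[t] / (N - L)             # entries R_yu(m), Eq. (12)
    return np.linalg.solve(R_uu, R_yu)

rng = np.random.default_rng(2)
g_true = np.array([0.0, 1.0, 0.5, 0.25, 0.125])   # hypothetical FIR plant
N, L = 400, len(g_true) - 1
u = rng.standard_normal(N)
y0 = np.convolve(u, g_true)[:N]               # noise-free output, zero past input
g_hat = correlation_method(u, y0, L)
```

Because the sums start at t = L, the unknown past inputs never enter the regressors, so the noise-free FIR coefficients are recovered exactly in this example.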

Gaussian Process Regression 

The linear least squares solution of (10) can be
(very) sensitive to disturbing output noise if L
is not much smaller than N. This problem is
circumvented by the Gaussian process regression
approach. The key idea consists in modeling
the impulse response coefficients g(n) as a
zero-mean Gaussian process with a certain
covariance structure $P_L$ that depends on a few
hyper-parameters (Pillonetto et al. 2011). In
Chen et al. (2012), it has been shown that the
Gaussian process regression is equivalent to
the following regularized (see also ► System
Identification Techniques: Convexification,
Regularization, and Relaxation) linear least squares
problem

$$\sum_{t=L}^{N-1} \Big( y(t) - \sum_{n=0}^{L} g(n)\, u(t-n) \Big)^2 + \sigma^2 g^T P_L^{-1} g \qquad (14)$$

where $g = (g(0), g(1), \ldots, g(L))^T$ and with
$\sigma^2$ the variance of the output disturbance. The
hyper-parameters defining $P_L$ and the noise
variance $\sigma^2$ are estimated via an empirical Bayes
method.
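
A minimal sketch of the regularized problem (14) is given below. It assumes a "tuned/correlated" kernel $P_L(i,j) = c\,\lambda^{\max(i,j)}$ with fixed, hand-picked hyper-parameters and a known noise variance. The empirical Bayes tuning mentioned above is not implemented, and all names and numerical values are hypothetical.

```python
import numpy as np

def tc_kernel(L, c=1.0, lam=0.8):
    """Assumed covariance structure: P[i, j] = c * lam**max(i, j)."""
    idx = np.arange(L + 1)
    return c * lam ** np.maximum.outer(idx, idx)

def regularized_fir(u, y, L, P, sigma2):
    """Minimizer of (14) for a fixed kernel P and noise variance sigma2."""
    N = len(u)
    t = np.arange(L, N)
    Phi = np.column_stack([u[t - n] for n in range(L + 1)])
    # Normal equations (Phi^T Phi + sigma2 * P^-1) g = Phi^T y,
    # premultiplied by P so that P never has to be inverted
    A = P @ Phi.T @ Phi + sigma2 * np.eye(L + 1)
    return np.linalg.solve(A, P @ Phi.T @ y[t]), Phi, y[t]

def cost(g, Phi, y, P, sigma2):
    """Value of the regularized cost (14)."""
    r = y - Phi @ g
    return r @ r + sigma2 * g @ np.linalg.solve(P, g)

rng = np.random.default_rng(3)
N, L, sigma2 = 60, 20, 0.25                   # L is not much smaller than N
g_true = 0.7 ** np.arange(L + 1)
u = rng.standard_normal(N)
y = np.convolve(u, g_true)[:N] + np.sqrt(sigma2) * rng.standard_normal(N)

P = tc_kernel(L)
g_reg, Phi, y_used = regularized_fir(u, y, L, P, sigma2)
g_ls = np.linalg.lstsq(Phi, y_used, rcond=None)[0]    # unregularized solution of (10)
```

The regularization term pulls the tail of the impulse response toward zero, which is what makes the estimate less sensitive to noise when L approaches N.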
Nonparametric Frequency-Domain
Techniques 

The frequency-domain techniques estimate the
frequency response function (FRF) using
relationship (2) or (4) between the input-output DFT
spectra. We start with the simplest approach and
gradually increase the complexity of the
estimation methods. Note that nonparametric FRF
estimation is still a quickly evolving research
area, such that the pros and cons of the advanced
methods are not yet well established.

Empirical Transfer Function Estimation 

If an integer number of periods P of the
steady-state response to a periodic excitation is
observed, then the leakage term $T_G(\Omega_k)$ in (2)
is zero, and the FRF is estimated by dividing the
output by the input DFT spectra at the excited
frequencies (Pintelon and Schoukens 2012,
Section 2.4)

$$\hat{G}(\Omega_k) = \frac{Y(k)}{U(k)} \qquad (15)$$

with Y(k) and U(k) the DFT spectra averaged over
the P periods. The output noise variance $\sigma_Y^2(k)$ is
estimated via the sample variance $\hat{\sigma}_Y^2(k)$ of the
output DFT spectra over the P consecutive signal periods.
The variance of the FRF estimate (15) is then
given by

$$\mathrm{var}(\hat{G}(\Omega_k)) = \frac{\sigma_Y^2(k)}{P\,|U(k)|^2} \qquad (16)$$

where |U(k)| is the magnitude of U(k).

Applying (15) to random excitations gives
the empirical transfer function estimate (Ljung
1999, Section 6.3). Due to the presence of the
plant leakage error $T_G(\Omega_k)/U(k)$, the statistical
properties of (15) for random inputs are quite
different from those for periodic inputs. While
the empirical transfer function estimate (ETFE) is
unbiased and has finite variance (16) for periodic
inputs, it is biased and has infinite variance for
random inputs (Broersen 2004). To improve the
statistical properties of the ETFE for random
inputs, one can either approximate locally the
ETFE by a polynomial (Stenman et al. 2000)
or perform a weighted average of ETFEs over
subrecords of the total response (Ljung 1999,
Section 6.4). In Heath (2007), it is shown that the
optimally (in mean square sense) weighted ETFE
equals the spectral analysis method.

Spectral Analysis Method

The spectral analysis method is available in any
digital spectrum analyzer. It is based on the
relationship between the FRF and the cross-
and autopower spectra of the input-output
signals

$$G(\Omega) = \frac{S_{yu}(\Omega)}{S_{uu}(\Omega)} = \frac{F\{R_{yu}(\tau)\}}{F\{R_{uu}(\tau)\}} \qquad (17)$$

with F{} the Fourier transform (Bendat and
Piersol 1980, Chapter 4; Brillinger 1981, Chapter 8).
Comparing (11) and (17), it can be seen that the
spectral analysis method is the frequency-domain
equivalent of the correlation method (take the
Fourier transform of the expected value of (11)).
There are basically two methods for estimating
the cross- and autopower spectra in (17) from
sampled data: the Blackman and Tukey (1958)
and the Welch (1967) procedures.

The Blackman-Tukey procedure (Blackman
and Tukey 1958; Ljung 1999, Section 6.4)
consists in taking the DFT (3) of the windowed cross-
and autocorrelation functions, viz.,

$$\hat{R}_{yu}(\tau) = \frac{1}{N} \sum_{t=\tau}^{N-1} y(t)\, u(t-\tau) \qquad (18)$$

$$\hat{S}_{yu}(k) = \frac{1}{\sqrt{N}} \sum_{\tau=0}^{N-1} w(\tau)\, \hat{R}_{yu}(\tau)\, e^{-j2\pi k\tau/N} \qquad (19)$$

resulting in an FRF estimate (17) at the full
frequency resolution $1/(NT_s)$ of the
measurement. It can be shown that (19) is a smoothed
version of the periodogram $Y(k)\bar{U}(k)$, where
$\bar{U}$ is the complex conjugate of U (Brillinger 1981,
Chapter 5).

In the Welch approach (Welch 1967; Pintelon
and Schoukens 2012, Section 2.6), the N
input-output samples are split into M subrecords
of N/M samples each, and the DFT spectra
$(U^{[m]}(k))_W$ and $(Y^{[m]}(k))_W$ of the windowed
input and output samples are calculated via (3)
where N is replaced by N/M, giving

$$\hat{S}_{Y_W U_W}(k) = \frac{1}{M} \sum_{m=1}^{M} (Y^{[m]}(k))_W\, \overline{(U^{[m]}(k))_W} \qquad (20)$$

$$\hat{S}_{U_W U_W}(k) = \frac{1}{M} \sum_{m=1}^{M} \left| (U^{[m]}(k))_W \right|^2 \qquad (21)$$

The spectral analysis estimate of the FRF and its
variance are then given by

$$\hat{G}(\Omega_k) = \frac{\hat{S}_{Y_W U_W}(k)}{\hat{S}_{U_W U_W}(k)} \qquad (22)$$

$$\mathrm{var}(\hat{G}(\Omega_k)) \approx \frac{1}{M}\, \frac{\sigma_Y^2(k)}{\hat{S}_{U_W U_W}(k)} \qquad (23)$$

(Brillinger 1981, Chapter 8; Heath 2007). Finally,
the output noise variance $\sigma_Y^2(k)$ in (23) is
estimated as

$$\hat{\sigma}_Y^2(k) = \frac{M}{M-1} \left( \hat{S}_{Y_W Y_W}(k) - \frac{|\hat{S}_{Y_W U_W}(k)|^2}{\hat{S}_{U_W U_W}(k)} \right) \qquad (24)$$

(Brillinger 1981, Chapter 8; Pintelon and
Schoukens 2012, Section 2.5.4). Due to
the spectral width of the window used, the
estimates (22) and (24) are correlated over
the frequency (the correlation length is about
twice the spectral width). Note that (21) is used
for estimating noise power spectra (Brillinger
1981, Chapter 5). Note also that for periodic
excitations combined with a rectangular window
$w(nT_s) = 1$, the spectral analysis estimate (22),
where each subrecord is equal to a signal period,
simplifies to the ETFE (15).

Compared with the Blackman-Tukey
procedure (19), the FRF estimate (22) based on the
Welch approach (20) and (21) has a frequency
resolution that is M times coarser (frequency
spacing $M/(NT_s)$ instead of $1/(NT_s)$) and a
variance (23) that is M times smaller. In
measurement devices, the FRFs are
estimated using the Welch approach (20)-(22)
where each subrecord is an independent
measurement with a fixed number of samples. The reason
for this is that the cross- and autopower spectra
estimates (20) and (21) can easily be updated as
more experiments (input-output data records)
become available. If the number of measured records M
increases to infinity, then (22) converges to the
true value, provided that the leakage error is
perfectly suppressed.

In measurement devices, the quality of the
spectral analysis estimate (22) is often quantified
via the coherence $\gamma^2(\omega)$

$$\gamma^2(\omega) = \frac{|S_{yu}(\Omega)|^2}{S_{yy}(\Omega)\, S_{uu}(\Omega)} \qquad (25)$$

which lies between 0 and 1. It is related
to the variance of the spectral analysis estimate as

$$\mathrm{var}(\hat{G}(\Omega_k)) \approx \frac{1}{M}\, \frac{1 - \gamma^2(\omega_k)}{\gamma^2(\omega_k)}\, |G(\Omega_k)|^2$$

A coherence smaller than 1 indicates the presence
of disturbing noise, residual leakage errors,
nonlinear distortions, or a nonobserved input.

Following the same lines as Welch (1967),
the statistical properties of the spectral analysis
estimate (22) can be improved via overlapping
subrecords in the cross- and autopower spectra
estimates (20) and (21). This has been studied in
detail for noise power spectra in Carter and
Nuttall (1980) and for FRFs in Antoni and Schoukens
(2007).
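
The Welch-type estimates (20)-(24) and a sample version of the coherence (25) can be sketched as follows. The function name `welch_frf`, the non-overlapping subrecords, the Hann window, and the first-order test plant with white output noise are assumptions made for this illustration.

```python
import numpy as np

def welch_frf(u, y, M, window=None):
    """FRF (22), FRF variance (23), noise variance (24), sample coherence."""
    Ns = len(u) // M                                # samples per subrecord
    w = np.ones(Ns) if window is None else np.asarray(window, dtype=float)
    scale = np.sqrt(Ns) * np.sqrt(np.mean(w ** 2))  # sqrt(N/M) * w_rms, see (3)
    Um = np.fft.fft(w * np.reshape(u[:M * Ns], (M, Ns)), axis=1) / scale
    Ym = np.fft.fft(w * np.reshape(y[:M * Ns], (M, Ns)), axis=1) / scale
    S_yu = np.mean(Ym * np.conj(Um), axis=0)        # Eq. (20)
    S_uu = np.mean(np.abs(Um) ** 2, axis=0)         # Eq. (21)
    S_yy = np.mean(np.abs(Ym) ** 2, axis=0)
    G = S_yu / S_uu                                 # Eq. (22)
    var_Y = M / (M - 1) * (S_yy - np.abs(S_yu) ** 2 / S_uu)   # Eq. (24)
    var_G = var_Y / (M * S_uu)                      # Eq. (23)
    coh = np.abs(S_yu) ** 2 / (S_yy * S_uu)         # sample version of (25)
    return G, var_G, var_Y, coh

rng = np.random.default_rng(4)
M, Ns = 64, 128
N = M * Ns
a, b = 0.5, 1.0                                     # hypothetical first-order plant
u = rng.standard_normal(N)
y0 = np.empty(N)
prev = 0.0
for t in range(N):
    prev = a * prev + b * u[t]
    y0[t] = prev
y = y0 + 0.1 * rng.standard_normal(N)               # white output noise

hann = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(Ns) / Ns)
G_hat, var_G, var_Y, coh = welch_frf(u, y, M, hann)
G_true = b / (1 - a * np.exp(-2j * np.pi * np.arange(Ns) / Ns))
rel_err = np.abs(G_hat - G_true) / np.abs(G_true)
```

The estimate is obtained on a frequency grid of Ns = N/M lines only, which is the loss in frequency resolution that is traded for the M-fold variance reduction.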

Advanced Methods 

The goal of the advanced methods is to estimate
the FRF at the full frequency resolution $1/(NT_s)$
of the experiment duration $NT_s$ while suppressing
the influence of the leakage and the noise errors.
Without some extra information, it is
impossible to achieve this goal via (2). The additional
piece of information that allows one to solve the
problem is that the FRF and the leakage error are
locally smooth functions of the frequency.

The local polynomial method (Pintelon and
Schoukens 2012, Chapter 7) approximates the
FRF and the leakage error in (2) locally in the
frequency band [k − n, k + n] by a polynomial.
From the residuals of the local linear least squares
solution, one also gets an estimate of the output
noise variance $\sigma_Y^2$ and, hence, also of the variance
of the FRF. The whole procedure is repeated for
all DFT frequencies k in the frequency band of
interest. The correlation length of the estimates
equals ±2n, which is twice the local bandwidth
of the polynomial approximation.

The local rational method (McKelvey and
Guerin 2012) follows the same lines as the local
polynomial method, except that the FRF and the
leakage error in (2) are locally approximated by
rational forms with the same poles ($G = B/A$
and $T_G = I/A$). Due to the common poles,
the local rational approximation problem can be
transformed into a local linear least squares
problem. The method is biased but suppresses the
plant leakage error of lightly damped systems better.

The transient impulse response modeling
method (Hägg and Hjalmarsson 2012)
approximates the FRF and the leakage error by,
respectively, finite impulse and transient response
models, giving a large sparse global linear least
squares problem. From the residuals of the global
linear least squares solution, one gets an estimate
of the output noise variance $\sigma_Y^2$ and, hence, also
of the variance of the FRF. This approach has the
best smoothing properties and is recommended
in case the noise error is dominant.
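
The idea behind the local polynomial method can be sketched as follows. This is a simplified illustration, not the reference implementation: the function name, the local bandwidth n = 3, the polynomial degree R = 2, the circular indexing of the DFT lines (valid for the discrete-time test plant used here), and the noise-free first-order plant are all assumptions. No variance estimate is computed.

```python
import numpy as np

def local_polynomial_frf(U, Y, n=3, R=2):
    """Local polynomial estimate of G at every DFT line (sketch).

    Around line k the model Y(k+r) = G(k+r) U(k+r) + T(k+r), r = -n..n,
    is used with G and T replaced by polynomials of degree R in r.
    """
    N = len(U)
    r = np.arange(-n, n + 1)
    powers = r[:, None] ** np.arange(R + 1)[None, :]       # shape (2n+1, R+1)
    G = np.empty(N, dtype=complex)
    for k in range(N):
        idx = (k + r) % N                                   # DFT lines are periodic in k
        K = np.hstack([powers * U[idx][:, None], powers])   # regressors for G and T
        theta = np.linalg.lstsq(K, Y[idx], rcond=None)[0]
        G[k] = theta[0]                                     # constant term = G at line k
    return G

rng = np.random.default_rng(5)
N, a, b = 1024, 0.9, 1.0
u = rng.standard_normal(N)                                  # random (nonperiodic) input
y0 = np.empty(N)
prev = 2.0                                                  # assumed nonzero initial state
for t in range(N):
    prev = a * prev + b * u[t]
    y0[t] = prev

U, Y0 = np.fft.fft(u) / np.sqrt(N), np.fft.fft(y0) / np.sqrt(N)
G_true = b / (1 - a * np.exp(-2j * np.pi * np.arange(N) / N))
err_etfe = np.abs(Y0 / U - G_true)                          # error of the plain division
err_lpm = np.abs(local_polynomial_frf(U, Y0) - G_true)
```

In this noise-free example the only error of the plain division Y0(k)/U(k) is the leakage term. Modeling that term locally as a smooth function removes most of it while keeping the full frequency resolution.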


Extensions 

In sections “Nonparametric Time-Domain
Techniques” and “Nonparametric Frequency-Domain
Techniques,” it is assumed that the linear plant
operates in open loop and that the input is known
exactly. If the plant operates in feedback and/or
the input observations are noisy, then the
presented time- and frequency-domain techniques are
biased. In sections “Systems Operating in
Feedback” and “Noisy Input, Noisy Output
Observations,” it is shown that the estimation bias can
be avoided if a known external reference signal
is available (typically the signal stored in the
arbitrary waveform generator).

Since most real-life systems behave to some
extent nonlinearly, it is important to detect and
quantify the nonlinear effects in FRF estimates.
This issue is handled in section “Nonlinear
Systems.”


Systems Operating in Feedback 

The key difficulty of estimating the FRF of a plant
operating in feedback (see Fig. 2) using
nonperiodic excitations is that the true input u(t) is
correlated with the process noise v(t). The direct
approaches of sections “Nonparametric
Time-Domain Techniques” and “Nonparametric
Frequency-Domain Techniques” lead to biased
estimates (Wellstead 1981). This can easily be seen
from the ETFE (15) applied to the feedback setup
in Fig. 2

$$\hat{G}(\Omega_k) = \frac{G(\Omega_k)\, G_{\mathrm{act}}(\Omega_k)\, R(k) + V(k)}{G_{\mathrm{act}}(\Omega_k)\, R(k) - G_{\mathrm{fb}}(\Omega_k)\, V(k)} \qquad (26)$$

where $G_{\mathrm{act}}(\Omega_k)$ and $G_{\mathrm{fb}}(\Omega_k)$ are, respectively,
the actuator and feedback dynamics. From (26),
it follows that in those frequency bands where
the process noise V(k) dominates, one rather
estimates minus the inverse of the feedback
dynamics instead of the plant FRF. On the other
hand, at those frequencies where the reference
signal injects most power, the ETFE (26) will be
close to the plant FRF.


Fig. 2 Plant operating in closed loop (block diagram with
actuator, plant, and feedback blocks): r(t) is the external
reference signal, the known input u(t) depends
on the process noise v(t), and y(t) is the noisy output
observation

Fig. 3 Errors-in-variables framework: r(t) is the external
reference signal; $n_g(t)$ is the generator noise; $m_u(t)$,
$m_y(t)$ are the input and output measurement errors; v(t) is
the process noise; and u(t), y(t) are the noisy input, noisy
output observations

If a known external reference signal is
available, then the bias is avoided via the indirect
method proposed in Wellstead (1981)

$$G(\Omega) = \frac{S_{yr}(\Omega)/S_{rr}(\Omega)}{S_{ur}(\Omega)/S_{rr}(\Omega)} = \frac{S_{yr}(\Omega)}{S_{ur}(\Omega)} \qquad (27)$$

The basic idea consists in modeling the feedback
setup (see Fig. 2) from the known reference to the
input and output simultaneously. This reduces the
single-input, single-output closed loop problem
to a single-input, two-output open loop problem.
Since the process noise v(t) is independent of
the reference signal r(t), the direct estimate of
the single-input, two-output FRF is unbiased.
Calculating the ratio of the two FRFs finally
gives the indirect estimate (27). This procedure
can be applied to any of the direct methods
of sections “Nonparametric Time-Domain
Techniques” and “Nonparametric Frequency-Domain
Techniques.” Proceeding in this way, unstable
plants operating in a stabilizing feedback loop
can also be handled.
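
The bias of the direct approach and its removal by the indirect method (27) can be illustrated with a deliberately simple closed loop. The sketch assumes static (memoryless) plant and feedback gains and a unit actuator, so that no leakage occurs and the expected values are known in closed form. All numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(6)
M, Ns = 200, 64                      # number of subrecords and their length
N = M * Ns
g, c = 2.0, 1.0                      # hypothetical static plant and feedback gains
r = rng.standard_normal(N)           # known external reference
v = rng.standard_normal(N)           # process noise, independent of r

# Loop equations y = g*u + v and u = r - c*y, solved for u and y
u = (r - c * v) / (1 + g * c)
y = (g * r + v) / (1 + g * c)

def cross_spectrum(a_sig, b_sig):
    """Average of A(k) * conj(B(k)) over the M subrecords (rectangular window)."""
    A = np.fft.fft(a_sig.reshape(M, Ns), axis=1) / np.sqrt(Ns)
    B = np.fft.fft(b_sig.reshape(M, Ns), axis=1) / np.sqrt(Ns)
    return np.mean(A * np.conj(B), axis=0)

G_direct = cross_spectrum(y, u) / cross_spectrum(u, u)      # direct estimate, as in (22)
G_indirect = cross_spectrum(y, r) / cross_spectrum(u, r)    # indirect estimate, Eq. (27)
```

With equal reference and noise power, the direct estimate settles around $(g - c)/(1 + c^2) = 0.5$, a compromise between the plant gain g = 2 and minus the inverse feedback gain −1/c = −1, in line with (26). The indirect estimate stays close to the plant gain.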

If the excitation is periodic, then the process
noise v(t) is independent of the periodic part
of the input u(t), and the ETFE (15) converges
to the true value as the number of periods P
tends to infinity (Pintelon and Schoukens 2012,
Section 2.5). Hence, in the periodic case, no
external reference is needed.

Noisy Input, Noisy Output Observations 

The key difficulty of estimating the FRF of a plant
excited by a nonperiodic signal from noisy input,
noisy output observations (see Fig. 3) is that the
input autopower spectrum in (17) is biased.
Indeed, due to the noise on the input, $S_{uu}(\Omega)$ is
too large, resulting in direct FRF estimates that
are too small. This is true for all direct FRF approaches
in sections “Nonparametric Time-Domain
Techniques” and “Nonparametric Frequency-Domain
Techniques.” Applying the indirect method of
section “Systems Operating in Feedback”
removes the bias because the noise on the
input is independent of the reference signal (e.g.,
see (27)). Proceeding in this way, the closed
loop case (see Fig. 2) with noisy input, noisy
output observations is also solved by the indirect
method.

If the excitation is periodic, then the mean
value of the input-output DFT spectra over the
P consecutive periods converges to their
true values as P tends to infinity (Pintelon
and Schoukens 2012, Section 2.5). Hence, the
ETFE (15) is still consistent, and no external
reference is needed. The same conclusion is valid
for systems operating in feedback.

Nonlinear Systems 

The classes of nonlinear systems considered are
those systems whose steady-state response to
a periodic input is periodic with the same
period as the input. This excludes phenomena such
as chaos and subharmonics but allows for hard
nonlinearities such as saturation, dead zones, and
clipping.

The classes of excitations considered are
stationary random signals with a specified
power spectrum and probability density function.
An important special case is the class of
Gaussian excitation signals with a specified
power spectrum. This class includes random
phase multisines (a sum of harmonically
related sinewaves with user-specified amplitudes
and random phases) with the same Riemann
equivalent power spectrum (Pintelon and
Schoukens 2012, Section 4.2).

Fig. 4 Best linear approximation (BLA) of a nonlinear
(NL) period in, same period out (PISPO) system, excited
by a zero-mean random signal u(t) with a given power
spectrum and probability density function. y(t) is the
zero-mean part of the actual output of the nonlinear
system, $u_{\mathrm{DC}}$ and $y_{\mathrm{DC}}$ are the DC levels of the actual input
and output of the nonlinear system. The zero-mean output
residual $y_s(t)$ is uncorrelated with - but not independent
of - the input u(t)

Consider a nonlinear (NL) period in, same
period out (PISPO) system excited by a random
excitation belonging to a particular class (see
Fig. 4). The FRF (17), where the expected value
is taken w.r.t. the random realization of the
excitation, is the best (in mean square sense) linear
approximation (BLA) of the nonlinear PISPO
system, because the difference $y_s(t)$ between the
actual output of the nonlinear system (DC value
excluded) and the output predicted by the linear
approximation is uncorrelated with the input u(t)
(Enqvist and Ljung 2005). Although
uncorrelated with the input, the output residual $y_s(t)$
still depends on u(t). If the NL PISPO system
operates in feedback (see Fig. 2), then the indirect
method (27) is used for calculating the BLA, and
the output residual $y_s(t)$ is uncorrelated with -
but not independent of - the reference signal r(t).

For the class of Gaussian excitation signals, it
can be shown that the DFT spectrum $Y_s(k)$ of
$y_s(t)$ has the following properties (Pintelon and
Schoukens 2012, Section 3.4.4):

1. $Y_s(k)$ has zero-mean value: $E\{Y_s(k)\} = 0$.

2. $Y_s(k)$ is uncorrelated with - but not
independent of - $U(k)$: $E\{Y_s(k)\bar{U}(k)\} = 0$.

3. $Y_s(k)$ is asymptotically ($N \to \infty$) normally
distributed.

4. $Y_s(k)$ is asymptotically ($N \to \infty$)
uncorrelated over the frequency.

These second-order properties are exactly the
same as those of a filtered white noise
disturbance, except that the noise is independent of
the input. This shows that it is impossible to
distinguish the nonlinear distortions $y_s(t)$ from the
disturbing noise v(t) in FRF measurements using
stationary random excitations (only second-order
statistics are involved in (22)-(24)).

Using random phase multisines, it is possible
to detect and quantify the nonlinear distortions
because $y_s(t)$ is then periodically related to the
input u(t) (property of the NL PISPO system).
Indeed, analyzing the FRF over consecutive
signal periods quantifies the variance of the noise v(t)
($y_s(t)$ does not change over the periods), while
analyzing the FRF over different random phase
realizations of the input quantifies the sum of the
noise variance and the variance of the nonlinear
distortions ($y_s(t)$ depends on the random phase
realization of the input). Subtracting both
variances gives an estimate of the variance of the
nonlinear distortions. While this variance quantifies
exactly the variability of the nonparametric FRF
estimate due to the nonlinear distortions, it can
(significantly) underestimate the variability of a
parametric plant model. The basic reason for this
is that the true variance of the parametric plant
model also depends on the nonzero higher (>2)
order moments between the input u(t) and the
nonlinear distortions $y_s(t)$.
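
The variance analysis over periods and over phase realizations can be sketched as follows. The static cubic nonlinearity, the noise level, and the numbers of lines, periods, and realizations are assumptions chosen for illustration. Since the assumed system is static, every period is already in steady state.

```python
import numpy as np

rng = np.random.default_rng(7)
N, F, P, M = 256, 40, 4, 20        # period length, excited lines, periods, realizations
lines = np.arange(1, F + 1)
sigma_v = 0.01                      # hypothetical output noise level
t = np.arange(N)

G_all = np.empty((M, P, F), dtype=complex)
for m in range(M):
    phases = rng.uniform(0, 2 * np.pi, F)
    # Random phase multisine: equal amplitudes, independent uniform phases
    u = np.sum(np.cos(2 * np.pi * np.outer(lines, t) / N + phases[:, None]), axis=0)
    u /= np.std(u)                  # normalize to unit rms value
    U = np.fft.fft(u)[lines] / np.sqrt(N)
    for p in range(P):
        # Static PISPO nonlinearity (assumed) plus a new noise record per period
        y = u + 0.1 * u ** 3 + sigma_v * rng.standard_normal(N)
        G_all[m, p] = np.fft.fft(y)[lines] / np.sqrt(N) / U   # ETFE (15) per period

G_m = G_all.mean(axis=1)                                    # FRF of each realization
var_noise = np.mean(np.var(G_all, axis=1, ddof=1), axis=0) / P   # noise part of var(G_m)
var_total = np.var(G_m, axis=0, ddof=1)                     # noise + nonlinear distortions
var_nl = var_total - var_noise                              # nonlinear distortions only
G_bla = G_m.mean(axis=0)                                    # estimate of the BLA
```

For a unit-variance, approximately Gaussian input, the BLA of $u + 0.1u^3$ has a gain close to $1 + 0.3$. The variance over the realizations is dominated by the nonlinear distortions, which the variance over the periods does not see.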

User Choices 

There is no clear answer to the question of which
of the presented techniques is best. It strongly
depends on the intended use of the nonparametric
estimates and the particular application handled.
For example, the intended use can be:

1. A smooth representation of the FRF

2. Use of the nonparametric estimates as an
intermediate step for parametric modeling of the
plant

In the first case, one should opt for the
minimum mean square error solution, while in the
second case, it is crucial that the nonparametric
estimates are unbiased, possibly at the price of an
increased variance. Indeed, the parametric plant
modeling step cannot eliminate the bias error in
the nonparametric estimates while it suppresses
the variance error.

The application-dependent answers to the
following questions strongly influence the choice
and the settings of the method used:

1. Is a large frequency resolution needed and/or
is leakage the dominant error?

2. Is the noise or the leakage error dominant?

3. Is it necessary to detect and quantify the
nonlinear behavior?

If the answer to the first question is yes, then
one should opt for one of the advanced methods
(section “Advanced Methods”) or use the
spectral analysis estimates (section “Spectral Analysis
Method”) with a small number M of subrecords.
On the other hand, if the noise error is dominant,
then M in (22)-(24) should be chosen as large
as possible. To detect and quantify the nonlinear
effects, one should use periodic signals (random
phase multisines) combined with the ETFE
(section “Empirical Transfer Function Estimation”).

Finally, comparing the different nonparametric
techniques is also not straightforward because
of their different

1. Frequency resolution 

2. Quality of the estimated noise model 

3. Correlation length over the frequency 

The latter is set by the spectral width of the 
window used in the spectral analysis method and 
the local bandwidth in the advanced methods. 


Guidelines 

While the previous sections give well-established
facts about the different nonparametric techniques,
in this section, we provide some advice and
guidelines based on our personal interpretation
of these facts:

• Always store the reference signal together
with the observed input-output signals. The
knowledge of the reference signal allows one
to solve nonparametrically the closed loop and
errors-in-variables problems.

• Whenever possible use periodic excitation
signals (random phase multisines): they allow
one to estimate from one experiment the FRF,
the noise level, and the level of the nonlinear
distortions. As such the deviation of the true
dynamic behavior from the ideal linear
time-invariant framework is quantified.

• Select one of the advanced methods if
frequency resolution is of prime interest.

• If the goal of the identification experiment
is to minimize the prediction error, then the
Gaussian process regression method is a very
promising approach.

• For lightly damped systems and a limited
frequency resolution, the local rational method is
a good candidate solution.

• Use a minimum mean square solution for a
smooth representation of the FRF.

• Choose unbiased nonparametric estimates for
use in parametric plant modeling (estimation,
validation, and model selection).

• When comparing nonparametric techniques,
always take into account all aspects of
the estimates: the bias and variance of
the FRF and noise model, the frequency
resolution, and the correlation length over the
frequency.


Summary and Future Directions 

Nonparametric techniques are very useful 
because they simplify the parametric plant 
modeling in the initial selection of the model 
complexity and in the detection of unmodeled 
dynamics. The classical correlation and spectral analysis methods developed in the 1950s and refined until the 1980s are still widely used. Recently, advanced time- and frequency-domain methods have been developed which all try to minimize the sensitivity (bias and variance) of the nonparametric estimates to disturbing noise, nonlinear distortion, and transient (leakage) errors.

The renewed research interest in nonparametric techniques should be continued to handle the following challenging problems: short data sets, missing data, detection and quantification of time-variant behavior, modeling of time-variant dynamics, and modeling of nonlinear dynamics.


Cross-References 

► Frequency Domain System Identification 

► Frequency-Response and Frequency-Domain 
Models 

► System Identification: An Overview 

► System Identification Techniques: Convexification, Regularization, and Relaxation


Recommended Reading 

The classical correlation (see section “Correlation Methods”) and spectral analysis (see section “Spectral Analysis Method”) methods are well covered by the textbooks listed below. The recommended reading list includes the basic papers on the spectral analysis methods (Blackman and Tukey 1958; Welch 1967; Wellstead 1981) and the most recent developments described in sections “Gaussian Process Regression” and “Advanced Methods.”
Introduction

This expository article provides a brief review of numerical methods for stochastic control in continuous time. Leaving most of the technical details out with a broad general audience in mind, it aims to serve as an introductory reference for researchers, practitioners, and students who wish to know something about numerical methods for stochastic controls.

The study of stochastic control has witnessed tremendous progress in the last few decades; see, for example, Fleming and Rishel (1975), Fleming and Soner (1992), Kushner (1977), and Yong and Zhou (1999) among others, for fundamentals of stochastic controls as well as historical remarks. Much of the development has been driven by the needs of, and progress in, science, engineering, and finance. Typically, the problems are highly nonlinear, so a closed-form solution is very difficult to obtain. As a result, designing feasible numerical algorithms becomes vitally important. Among the many approximation methods, the Markov chain approximation methods have shown the most promising features. Primarily for treating diffusions, the Markov chain approximation method was initiated in the 1970s (Kushner 1977) and substantially developed further in Kushner (1990b) and Kushner and Dupuis (1992). Nowadays, such methods are used for more complex jump diffusions or systems with random switching. There were also efforts to incorporate the methods into an expert system so that the methods can be placed into an easily usable toolbox (Chancelier et al. 1986, 1987). In addition to the existing applications in a wide variety of engineering problems, recent applications include such areas as insurance, quantile hedging for guaranteed minimum death benefits, dividend payment and investment strategies with capital injection, singular control, risk management, portfolio selection with bounded constraints, and production planning and manufacturing problems; see Jin et al. (2011, 2012, 2013), Sethi and Zhang (1994), and Yin et al. (2009) and references therein.

Let us begin with the controlled diffusion problem. We wish to minimize the cost function defined by

$$J(x, u(\cdot)) = E_x\Big[\int_0^{\tau} R(X(t), u(t))\,dt + B(X(\tau))\Big], \qquad (1)$$

with the $\mathbb{R}^r$-valued process $X(t)$ defined by the solution of the stochastic differential equation

$$dX(t) = b(X(t), u(t))\,dt + \sigma(X(t))\,dW, \quad X(0) = x, \qquad (2)$$
where $x \in \mathbb{R}^r$, $u(\cdot)$ is a $U$-valued, measurable process with $U \subset \mathbb{R}^d$ being a compact control set, $W(\cdot)$ is an $r$-dimensional standard Brownian motion, and $\tau$ is the first exit time of the diffusion from a bounded domain $D$, that is, $\tau = \min\{t : X(t) \notin D^0\}$ with $D^0$ denoting the interior of $D$, $b(\cdot,\cdot) : \mathbb{R}^r \times \mathbb{R}^d \to \mathbb{R}^r$, $\sigma(\cdot) : \mathbb{R}^r \to \mathbb{R}^{r \times r}$, $R(\cdot,\cdot) : \mathbb{R}^r \times \mathbb{R}^d \to \mathbb{R}$, and $B(\cdot) : \mathbb{R}^r \to \mathbb{R}$. In the above, $b(\cdot)$ is the control-dependent drift, $\sigma(\cdot)$ is the diffusion matrix, $R(\cdot)$ is the running cost, and $B(\cdot)$ is the terminal or boundary cost. Throughout the entry, we assume that the stopping time $\tau < \infty$ with probability one (w.p.1) for simplicity. Denote the value function by $V(x) = \inf_u J(x, u(\cdot))$, where the inf is taken over all admissible controls. Write the transpose of $Y \in \mathbb{R}^{d_1 \times d_2}$ as $Y'$ with $d_1, d_2 \ge 1$, $a(x) = \sigma(x)\sigma'(x)$, and define the generator of the controlled Markov process by

$$\mathcal{L}^u f(x) = \tfrac{1}{2}\,\mathrm{tr}\big(a(x) f_{xx}(x)\big) + b'(x, u) f_x(x), \qquad (3)$$

for a suitably smooth function $f(\cdot)$, where $f_x(\cdot)$ and $f_{xx}(\cdot)$ denote the gradient and Hessian of $f(\cdot)$, respectively. Note that the operator is control dependent. Using $\partial D$ to denote the boundary of $D$, the associated Hamilton-Jacobi-Bellman (HJB) equation satisfied by the value function is given by

$$\begin{cases} \inf_u \big[\mathcal{L}^u V(x) + R(x, u)\big] = 0, & x \in D^0, \\ V(x) = B(x), & x \in \partial D. \end{cases} \qquad (4)$$

The subject matter of this article is to 
solve the optimal stochastic control problem 
numerically. 

The rest of the entry is arranged as follows. Section “Markov Chain Approximation” focuses on Markov chain approximation. It illustrates how one can construct the controlled Markov chain in discrete time for the approximation of the continuous-time stochastic control problems. Section “Illustration: A One-Dimensional Problem” uses a one-dimensional case as an example for illustration. Section “Numerical Computation” discusses the implementation issues. We conclude the entry with a few further remarks.


Markov Chain Approximation 

The main idea was initiated in Kushner (1977) and streamlined, extended, and further developed in Kushner and Dupuis (1992). An earlier paper describing how to discretize the elliptic HJB equation and then interpret it according to a controlled Markov chain can be found in Kushner and Kleinman (1968). This section illustrates the Markov chain approximation methods in a simple setup. The reader is referred to the references mentioned above for a comprehensive treatment. To begin, let $h > 0$ be a small “step size” in the approximation. Instead of the domain $D$, we need to work with a finite set to ensure computational feasibility. Set $\mathbb{R}^r_h$ to be the $r$-dimensional lattice cube, i.e., $\mathbb{R}^r_h = \{\ldots, -2h, -h, 0, h, 2h, \ldots\}^r$ (an $r$-dimensional product of the indicated set). Denote the interior of $D$ by $D^0$, and define $D^0_h = D^0 \cap \mathbb{R}^r_h$. We shall construct a controlled, discrete-time Markov chain, whose transition probabilities have desired properties in line with the controlled diffusion and whose values are in $D^0_h$. Suppose that $\{\alpha^h_n\}$ is a time-homogeneous, discrete-time, controlled Markov chain with finite state space $D^0_h$ and transition probabilities $P = (p^h(x, y|v))$ with $x, y \in D^0_h$. Here we only consider the case that the Markov chain has a finite state space. This is sufficient for our computational purposes. At any time $n$, the control action is a random variable denoted by $u^h_n$ taking values in a compact set $U$. Set the interpolation interval by $\Delta t^h(x, v) > 0$ and write $\Delta t^h_n = \Delta t^h(\alpha^h_n, u^h_n)$ such that $\sup_{x,v} \Delta t^h(x, v) \to 0$ as $h \to 0$ but $\inf_{x,v} \Delta t^h(x, v) > 0$ for each $h > 0$. The control is admissible if the Markov property

$$P(\alpha^h_{n+1} = y \mid \alpha^h_i, u^h_i, i \le n) = P(\alpha^h_{n+1} = y \mid \alpha^h_n, u^h_n) = p^h(\alpha^h_n, y \mid u^h_n)$$

holds. Use $U^h$ to denote the collection of controls, which are determined by a sequence of measurable functions $F^h_n(\cdot)$ such that $u^h_n = F^h_n(\alpha^h_k, k \le n, u^h_k, k < n)$. Denote the conditional expectation given $\{\alpha^h_j, u^h_j : j < n, \alpha^h_n = x, u^h_n = v\}$ by $E^{h,v}_{x,n}$. We say that a control policy is locally consistent if

$$\begin{aligned} &E^{h,v}_{x,n} \Delta\alpha^h_n = b(x, v)\Delta t^h(x, v) + o(\Delta t^h(x, v)), \\ &E^{h,v}_{x,n}\big[\Delta\alpha^h_n - E^{h,v}_{x,n}\Delta\alpha^h_n\big]\big[\Delta\alpha^h_n - E^{h,v}_{x,n}\Delta\alpha^h_n\big]' = a(x)\Delta t^h(x, v) + o(\Delta t^h(x, v)), \\ &a(x) = \sigma(x)\sigma'(x), \quad |\Delta\alpha^h_n| \to 0 \text{ as } h \to 0 \text{ uniformly in } n, \omega, \end{aligned} \qquad (5)$$


where $\Delta\alpha^h_n = \alpha^h_{n+1} - \alpha^h_n$. The meaning of the local consistency can be seen from the corresponding controlled diffusion (2) (with $X(0) = x$ and $u(t) = v$ for $t \in [0, \delta]$, where $\delta > 0$ is a small parameter) in that $E_x(X(\delta) - x) = b(x, v)\delta + o(\delta)$ and $E_x[X(\delta) - x][X(\delta) - x]' = a(x)\delta + o(\delta)$. Let $\nu_h$ be the first time that $\{\alpha^h_n\}$ leaves the set $D^0_h$. We have an approximation for the cost function of the controlled diffusion (1) given by


$$J^h(x, u^h) = E^{u^h}_x\Big[\sum_{j=0}^{\nu_h - 1} R(\alpha^h_j, u^h_j)\Delta t^h_j + B(\alpha^h_{\nu_h})\Big]. \qquad (6)$$

Define $t^h_n = \sum_{j=0}^{n-1} \Delta t^h_j$ and the continuous-time interpolations $\alpha^h(t) = \alpha^h_n$ for $t \in [t^h_n, t^h_{n+1})$. Define the first exit time of $\alpha^h(\cdot)$ from $D^0_h$ by $\tau_h = t^h_{\nu_h}$. Corresponding to the continuous-time problems, the first term on the right-hand side of (6) represents the running cost and the last term gives the terminal cost. Denote the value function by $V^h(x)$. Then it satisfies the dynamic programming equation

$$V^h(x) = \begin{cases} \inf_{v \in U^h} \Big[R(x, v)\Delta t^h(x, v) + \sum_y p^h(x, y|v) V^h(y)\Big], & x \in D^0_h, \\ B(x), & x \notin D^0_h. \end{cases} \qquad (7)$$

Proving the convergence of the numerical algorithms is an important task. This requires the use of local consistency, interpolation of the approximating sequences in continuous time, as well as martingale representation. The proof is facilitated by the use of the so-called relaxed controls (Kushner and Dupuis 1992, p. 267), which enables us to characterize the limit under the framework of weak convergence. The detailed argument is beyond the scope of this entry. We refer the reader to Kushner and Dupuis (1992, Chapter 10) for further reading on the proof of convergence and the conditions needed.


Illustration: A One-Dimensional 
Problem 


In this section, we use a one-dimensional example to illustrate the Markov chain approximation methods, which enables us to present the results with a better visualization. Consider (2) with $x \in \mathbb{R}$. We proceed to find the transition probabilities and interpolation intervals for the Markov chain $\{\alpha^h_n\}$. To construct a controlled Markov chain that is locally consistent, we first consider a special case, namely, the control space has only one admissible control $u^h \in U^h$. In this case, the inf in (7) can be removed. Discretize the HJB equation using the upwind finite difference method with step size $h > 0$ by


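$$\begin{aligned} V(x) &\to V^h(x), \\ V_x(x) &\to \frac{V^h(x + h) - V^h(x)}{h} \quad \text{for } b(x, v) \ge 0, \\ V_x(x) &\to \frac{V^h(x) - V^h(x - h)}{h} \quad \text{for } b(x, v) < 0, \\ V_{xx}(x) &\to \frac{V^h(x + h) - 2V^h(x) + V^h(x - h)}{h^2}. \end{aligned}$$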


For $x \in D^0_h$, it leads to

$$\frac{V^h(x + h) - V^h(x)}{h}\, b^+(x, v) - \frac{V^h(x) - V^h(x - h)}{h}\, b^-(x, v) + \frac{a(x)}{2}\, \frac{V^h(x + h) - 2V^h(x) + V^h(x - h)}{h^2} + R(x, v) = 0,$$


where $b^+$ and $b^-$ are the positive and negative parts of $b$, respectively. Comparing with the dynamic programming equation, we obtain the transition probabilities

$$p^h(x, x + h \mid v) = \frac{(a(x)/2) + h b^+(x, v)}{\Delta}, \qquad p^h(x, x - h \mid v) = \frac{(a(x)/2) + h b^-(x, v)}{\Delta},$$

$$p^h(\cdot) = 0 \text{ otherwise}, \qquad \Delta t^h(x, v) = \frac{h^2}{\Delta},$$

with $\Delta = a(x) + h|b(x, v)|$ being well defined. With the transition probabilities given above, we can proceed to verify the local consistency by straightforward calculations and prove the desired convergence.
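
The local consistency check mentioned above can be illustrated numerically. The following is a minimal sketch, not taken from the references: the drift $b(x, v) = v - x$, the constant $\sigma = 0.5$, and all function names are assumptions chosen only for illustration. It evaluates the one-step mean and variance of the chain and compares them with $b\,\Delta t^h$ and $a\,\Delta t^h$.

```python
def transition(x, v, h, b, a):
    """Upwind Markov chain approximation in one dimension.

    Returns (p_up, p_down, dt): the probabilities of moving to x + h and
    x - h, and the interpolation interval, for drift b(x, v) and
    a(x) = sigma(x)**2.
    """
    drift = b(x, v)
    b_plus, b_minus = max(drift, 0.0), max(-drift, 0.0)
    denom = a(x) + h * abs(drift)      # a(x) + h|b(x, v)|, assumed positive
    p_up = (a(x) / 2.0 + h * b_plus) / denom
    p_down = (a(x) / 2.0 + h * b_minus) / denom
    dt = h * h / denom
    return p_up, p_down, dt


# Hypothetical model data (assumptions, for illustration only).
b = lambda x, v: v - x
a = lambda x: 0.5 ** 2

h = 0.01
p_up, p_down, dt = transition(0.3, 1.0, h, b, a)
mean = h * p_up - h * p_down                  # E[delta alpha]
var = h * h * (p_up + p_down) - mean ** 2     # Var[delta alpha]
print(mean / dt, b(0.3, 1.0))   # the drift is matched exactly
print(var / dt, a(0.3))         # the two values differ by a term of order h
```

In this construction the mean increment equals $b\,\Delta t^h$ exactly, while the variance equals $a\,\Delta t^h$ only up to a term of order $h\,\Delta t^h$, which is the $o(\Delta t^h)$ remainder allowed in (5).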


Numerical Computation 

To numerically approximate the controlled diffusions, frequently used methods are either value iterations or policy iterations (iteration in policy space). Using Markov chain approximation in conjunction with either value iteration or iteration in policy space, we can further obtain a sequence of value functions $\{V^{h,n}\}$ such that $V^{h,n} \to V^h$ as $n \to \infty$. The procedures can be described as follows.

Value Iteration 

1. Given a tolerance $\varepsilon > 0$, set $n = 0$; for $x \in D^0_h$, set $V^{h,0}(x)$ = constant (for instance, 0).

2. Use $V^{h,n}$ on the right-hand side of (7) to obtain $V^{h,n+1}$.

3. If $|V^{h,n+1} - V^{h,n}| > \varepsilon$, go to Step 2 above with $n \to n + 1$.
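
The three steps above can be sketched as follows for a one-dimensional exit problem. This is only an illustration under assumed data, none of which comes from the entry: $D = (0, 1)$, drift $b(x, v) = v$ with the finite control set $\{-1, 0, 1\}$, constant $\sigma = 0.5$, running cost $R(x, v) = 1 + v^2/2$, and boundary cost $B = 0$.

```python
def value_iteration(N=20, controls=(-1.0, 0.0, 1.0), sigma=0.5,
                    eps=1e-8, max_iter=100_000):
    """Value iteration for (7) on the grid x_i = i / N, i = 0, ..., N."""
    h = 1.0 / N
    a = sigma ** 2
    V = [0.0] * (N + 1)                  # Step 1: V^{h,0} = 0; B = 0 on the boundary
    for n in range(max_iter):
        V_new = V[:]
        for i in range(1, N):            # interior grid points only
            best = float("inf")
            for v in controls:
                denom = a + h * abs(v)   # a + h|b|
                p_up = (a / 2 + h * max(v, 0.0)) / denom
                p_down = (a / 2 + h * max(-v, 0.0)) / denom
                dt = h * h / denom
                cost = ((1.0 + 0.5 * v * v) * dt
                        + p_up * V[i + 1] + p_down * V[i - 1])
                best = min(best, cost)
            V_new[i] = best              # Step 2: right-hand side of (7)
        diff = max(abs(p - q) for p, q in zip(V_new, V))
        V = V_new
        if diff <= eps:                  # Step 3: stopping test
            return V, n + 1
    raise RuntimeError("value iteration did not converge")


V, iterations = value_iteration()
print(iterations, V[len(V) // 2])
```

Note that the stopping test bounds only the change between successive iterates, not the distance to $V^h$; a smaller tolerance may be needed when the chain takes many steps to leave the domain.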

Policy Iteration 

1. Given a tolerance $\varepsilon > 0$, set $n = 0$; for $x \in D^0_h$, take an initial control $u^{h,0}(x)$ = constant. Using $u^{h,0}(x)$ in lieu of $v$, solve (7) to find $V^{h,0}(\cdot)$.

2. Find an improved control by

$$u^{h,n+1}(x) := \operatorname{argmin}_{v \in U^h} \Big[\sum_y p^h(x, y|v) V^{h,n}(y) + R(x, v)\Delta t^h(x, v)\Big].$$

3. Find $V^{h,n+1}(\cdot)$ with $u^{h,n+1}(\cdot)$ by solving (7). If $|V^{h,n+1} - V^{h,n}| > \varepsilon$, go to Step 2 above with $n \to n + 1$.
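
A sketch of the policy iteration steps for a one-dimensional exit problem is given below. The data are assumptions made for illustration ($D = (0, 1)$, $b(x, v) = v$ with $v \in \{-1, 0, 1\}$, $\sigma = 0.5$, $R(x, v) = 1 + v^2/2$, $B = 0$), and the helper names are hypothetical. For a fixed policy, (7) is a tridiagonal linear system, which is solved here by the Thomas algorithm.

```python
CONTROLS = (-1.0, 0.0, 1.0)        # hypothetical finite control set
SIGMA, N = 0.5, 20
H, A = 1.0 / N, SIGMA ** 2


def chain(v):
    """(p_up, p_down, dt) of the upwind chain for drift b = v."""
    denom = A + H * abs(v)
    return ((A / 2 + H * max(v, 0.0)) / denom,
            (A / 2 + H * max(-v, 0.0)) / denom,
            H * H / denom)


def running_cost(v):
    return 1.0 + 0.5 * v * v


def evaluate(policy):
    """Solve (7) for a fixed policy (tridiagonal system, Thomas algorithm)."""
    m = N - 1                                    # number of interior nodes
    lower, upper, rhs = [0.0] * m, [0.0] * m, [0.0] * m
    diag = [1.0] * m
    for k in range(m):                           # row k belongs to node k + 1
        p_up, p_down, dt = chain(policy[k + 1])
        lower[k], upper[k] = -p_down, -p_up
        rhs[k] = running_cost(policy[k + 1]) * dt
    for k in range(1, m):                        # forward elimination
        w = lower[k] / diag[k - 1]
        diag[k] -= w * upper[k - 1]
        rhs[k] -= w * rhs[k - 1]
    V = [0.0] * (N + 1)                          # boundary values B = 0
    V[m] = rhs[m - 1] / diag[m - 1]
    for k in range(m - 2, -1, -1):               # back substitution
        V[k + 1] = (rhs[k] - upper[k] * V[k + 2]) / diag[k]
    return V


def improve(V):
    """Step 2: pointwise minimization over the control set."""
    policy = [0.0] * (N + 1)
    for i in range(1, N):
        def q(v):
            p_up, p_down, dt = chain(v)
            return running_cost(v) * dt + p_up * V[i + 1] + p_down * V[i - 1]
        policy[i] = min(CONTROLS, key=q)
    return policy


def policy_iteration(eps=1e-10, max_iter=100):
    policy = [0.0] * (N + 1)                     # Step 1: initial control 0
    V = evaluate(policy)
    for n in range(max_iter):
        policy = improve(V)
        V_new = evaluate(policy)                 # Step 3
        diff = max(abs(p - q) for p, q in zip(V_new, V))
        V = V_new
        if diff <= eps:
            return V, policy, n + 1
    raise RuntimeError("policy iteration did not converge")


V0 = evaluate([0.0] * (N + 1))
V, policy, sweeps = policy_iteration()
print(sweeps, V0[N // 2], V[N // 2])
```

With a finite control set, the iteration stops after finitely many improvements; in this sketch each improvement step cannot increase the value of the policy, so the final `V` lies below the value `V0` of the initial control.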

Further Remarks 

Variations of the Problems. Variants of the problems can be considered. For example, one may consider nonlinear filtering problems or singularly perturbed control and filtering problems. For problems arising in manufacturing systems, one often needs to treat controlled Markov chains with no diffusion terms. Such a case can also be handled by the Markov chain approximation methods; see Sethi and Zhang (1994) for the problem and Yin and Zhang (2013, Chapter 9) for the numerical methods. In this article, we mainly discussed the probabilistic approach for obtaining the weak convergence of the interpolations of the controlled Markov chain. One can also use the so-called viscosity solution methods to treat the convergence; see Barles and Souganidis (1991) (also Kushner and Dupuis 1992, Chapter 11).

Variance Control. In this entry, only the drift involves a control term. When the diffusion term is also subject to controls, the problem becomes more difficult. In Peng (1990), the idea of using backward stochastic differential equations was initiated, which had significant impact on the development of such stochastic control problems. Detailed discussions can be found in Yong and Zhou (1999). The numerical problems for diffusion terms involving controls can also be treated; see Kushner (2000) for further discussion. In this case, the so-called numerical noise or numerical viscosity can be introduced, so care must be taken.
Complex Models Involving Jump and Switching. Note that only controlled diffusions are considered in this entry. More complex models such as controlled jump diffusions (Kushner and Dupuis 1992), switching diffusions (Yin and Zhu 2010), and switching jump diffusions can be treated (Song et al. 2006). Differential games can also be treated (Kushner 2002; Song et al. 2008).

Differential Delay Systems. Stochastic differential delay systems may come into play. The corresponding numerical algorithms have been studied extensively in Kushner (2008). Due to their inherent infinite dimensionality, a main issue here concerns suitable finite approximation to the memory segments.

Rates of Convergence. This entry mainly discusses the convergence of the approximation methods. There is also much interest in ascertaining rates of convergence. Such effort goes back to the paper Menaldi (1989) (see also Zhang 2006). Subsequently, there has been a resurgent effort in dealing with this issue from a nonlinear partial differential equation point of view; see Krylov (2000). Our recent work Song and Yin (2009) complements the study by providing a probabilistic approach for treating switching diffusions.

Stochastic Approximation. In certain optimal 
control problems, the optimal controls or near-optimal controls turn out to be of threshold
type. An alternative way of solving such 
problems leading to at least suboptimal or 
near-optimal control is to use a stochastic 
approximation approach; see Kushner and 
Yin (2003) for a comprehensive treatment of 
stochastic approximation algorithms. Some 
successful examples include manufacturing 
systems (Yin and Zhang 2013, Section 9.3) and 
liquidation decision making (Yin et al. 2002). 


Cross-References 

► Stochastic Dynamic Programming 

► Stochastic Maximum Principle 
Abstract

In this article we describe the three most 
common approaches for numerically solving 
nonlinear optimal control problems governed by 
ordinary differential equations. For computing approximations to optimal value functions and optimal feedback laws, we present the Hamilton-Jacobi-Bellman approach. For computing
approximately optimal open-loop control 
functions and trajectories for a single initial 
value, we outline the indirect approach based 
on Pontryagin’s maximum principle and the 
approach via direct discretization. 

Keywords 

Direct discretization; Hamilton-Jacobi-Bellman 
equations; Optimal control; Ordinary differential 
equations; Pontryagin’s maximum principle 

Introduction 

This article concerns optimal control problems governed by nonlinear ordinary differential equations of the form

$$\dot{x}(t) = f(x(t), u(t)) \qquad (1)$$

with $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$. We assume that for each initial value $x \in \mathbb{R}^n$ and measurable control function $u(\cdot) \in L^\infty(\mathbb{R}, \mathbb{R}^m)$ there exists a unique solution $x(t) = x(t, x, u(\cdot))$ of (1) satisfying $x(0, x, u(\cdot)) = x$.

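Given a state constraint set $X \subseteq \mathbb{R}^n$ and a control constraint set $U \subseteq \mathbb{R}^m$, a running cost $g : X \times U \to \mathbb{R}$, a terminal cost $F : X \to \mathbb{R}$, and a discount rate $\delta \ge 0$, we consider the optimal control problem

$$\underset{u(\cdot) \in \mathcal{U}^T(x)}{\text{minimize}} \; J^T(x, u(\cdot)) \qquad (2)$$

where

$$J^T(x, u(\cdot)) := \int_0^T e^{-\delta s} g(x(s, x, u(\cdot)), u(s))\,ds + e^{-\delta T} F(x(T, x, u(\cdot))) \qquad (3)$$

and

$$\mathcal{U}^T(x) := \Big\{ u(\cdot) \in L^\infty(\mathbb{R}, U) \;\Big|\; x(s, x, u(\cdot)) \in X \text{ for all } s \in [0, T] \Big\}. \qquad (4)$$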

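In addition to this finite horizon optimal control problem, we also consider the infinite horizon problem in which $T$ is replaced by “$\infty$,” i.e.,

$$\underset{u(\cdot) \in \mathcal{U}^\infty(x)}{\text{minimize}} \; J^\infty(x, u(\cdot)) \qquad (5)$$

where

$$J^\infty(x, u(\cdot)) := \int_0^\infty e^{-\delta s} g(x(s, x, u(\cdot)), u(s))\,ds \qquad (6)$$

and

$$\mathcal{U}^\infty(x) := \Big\{ u(\cdot) \in L^\infty(\mathbb{R}, U) \;\Big|\; x(s, x, u(\cdot)) \in X \text{ for all } s \ge 0 \Big\}, \qquad (7)$$

respectively.

The term “solving” (2)-(4) or (5)-(7) can have various meanings. First, the optimal value functions

$$V^T(x) = \inf_{u(\cdot) \in \mathcal{U}^T(x)} J^T(x, u(\cdot))$$

or

$$V^\infty(x) = \inf_{u(\cdot) \in \mathcal{U}^\infty(x)} J^\infty(x, u(\cdot))$$
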
may be of interest. Second, and often more importantly, one would like to know the optimal control policy. This can be expressed in open-loop form $u^* : \mathbb{R} \to U$, in which the function $u^*$ depends on the initial value $x$ and on the initial time, which we set to 0 here. Alternatively, the optimal control can be computed in state- and time-dependent closed-loop form, in which a feedback law $\mu^* : \mathbb{R} \times X \to U$ is sought. Via $u^*(t) = \mu^*(t, x(t))$, this feedback law can then be used in order to generate the time-dependent optimal control function for all possible initial values. Since the feedback law is evaluated along the trajectory, it is able to react to perturbations and uncertainties which may make $x(t)$ deviate from the predicted path. Finally, knowing $u^*$ or $\mu^*$, one can reconstruct the corresponding optimal trajectory by solving

$$\dot{x}(t) = f(x(t), u^*(t)) \quad \text{or} \quad \dot{x}(t) = f(x(t), \mu^*(t, x(t))).$$


Hamilton-Jacobi-Bellman Approach 

In this section we describe the numerical approach to solving optimal control problems via Hamilton-Jacobi-Bellman equations. We first describe how this approach can be used in order to compute approximations to the optimal value function $V^T$ and $V^\infty$, respectively, and afterwards how the optimal control can be synthesized using these approximations. In order to formulate this approach for finite horizon $T$, we interpret $V^T(x)$ as a function in $T$ and $x$. We denote differentiation w.r.t. $T$ and $x$ with subscript $T$ and $x$, i.e., $V^T_x(x) = \partial V^T(x)/\partial x$, $V^T_T(x) = \partial V^T(x)/\partial T$, etc.

We define the Hamiltonian of the optimal control problem as

$$H(x, p) := \max_{u \in U} \{-g(x, u) - p \cdot f(x, u)\},$$

with $x, p \in \mathbb{R}^n$, $f$ from (1), $g$ from (3) or (6), and “$\cdot$” denoting the inner product in $\mathbb{R}^n$. Then, under appropriate regularity conditions on the problem data, the optimal value functions $V^T$ and $V^\infty$ satisfy the first order partial differential equations (PDEs)
$$V^T_T(x) + \delta V^T(x) + H(x, V^T_x(x)) = 0$$

and

$$\delta V^\infty(x) + H(x, V^\infty_x(x)) = 0$$

in the viscosity solution sense. In the case of $V^T$, the equation holds for all $T \ge 0$ with the boundary condition $V^0(x) = F(x)$.

The framework of viscosity solutions is 
needed because in general the optimal value 
functions will not be smooth; thus, a generalized 
solution concept for PDEs must be employed (see 
Bardi and Capuzzo Dolcetta 1997). Of course, 
appropriate boundary conditions are needed at 
the boundary of the state constraint set X. 

Once the Hamilton-Jacobi-Bellman characterization is established, one can compute numerical approximations to $V^T$ or $V^\infty$ by solving these PDEs numerically. To this end, various numerical schemes have been suggested, including various types of finite element and finite difference schemes. Among those, semi-Lagrangian schemes (Falcone 1997; Falcone and Ferretti 2013) allow for a particularly elegant interpretation in terms of optimal control synthesis, which we explain for the infinite horizon case.

In the semi-Lagrangian approach, one takes advantage of the fact that by the chain rule for $p = V^\infty_x(x)$ and constant control functions $u$, the identity

$$\delta V^\infty(x) - p \cdot f(x, u) = -\frac{d}{dt}\Big|_{t=0} (1 - \delta t) V^\infty(x(t, x, u))$$

holds. Hence, the left-hand side of this equality can be approximated by the difference quotient

$$\frac{V^\infty(x) - (1 - \delta h) V^\infty(x(h, x, u))}{h}$$

for small $h > 0$. Inserting this approximation into the Hamilton-Jacobi-Bellman equation, replacing $x(h, x, u)$ by a numerical approximation $\tilde{x}(h, x, u)$ (in the simplest case, the Euler method $\tilde{x}(h, x, u) = x + h f(x, u)$), multiplying by $h$, and rearranging terms, one arrives at the equation

$$V^\infty_h(x) = \min_{u \in U} \big\{ h g(x, u) + (1 - \delta h) V^\infty_h(\tilde{x}(h, x, u)) \big\}$$

defining an approximation $V^\infty_h \approx V^\infty$. This is now a purely algebraic dynamic programming-type equation which can be solved numerically, e.g., by using a finite element approach. The equation is typically solved iteratively using a suitable minimization routine for computing the “min” in each iteration (in the simplest case, $U$ is discretized with finitely many values and the minimum is determined by direct comparison). We denote the resulting approximation of $V^\infty$ by $\tilde{V}^\infty_h$. Here, approximation is usually understood in the $L^\infty$ sense (see Falcone 1997 or Falcone and Ferretti 2013).

The semi-Lagrangian scheme is appealing for synthesis of an approximately optimal feedback because $V^\infty_h$ is the optimal value function of the auxiliary discrete-time problem defined by $\tilde{x}$. This implies that the expression

$$\mu^\infty_h(x) := \operatorname{argmin}_{u \in U} \big\{ h g(x, u) + (1 - \delta h) V^\infty_h(\tilde{x}(h, x, u)) \big\}$$

is an optimal feedback control value for this discrete-time problem for the next time step, i.e., on the time interval $[t, t + h)$ if $x = x(t)$. This feedback law will be approximately optimal for the continuous-time control system when applied as a discrete-time feedback law, and this approximate optimality remains true if we replace $V^\infty_h$ in the definition of $\mu^\infty_h$ by its numerically computable approximation $\tilde{V}^\infty_h$. A similar construction can be made based on any other numerical approximation $\tilde{V}^\infty \approx V^\infty$, but the explicit correspondence of the semi-Lagrangian scheme to a discrete-time auxiliary system facilitates the interpretation and error analysis of the resulting control law.
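
The fixed-point equation for $V^\infty_h$ and the feedback $\mu^\infty_h$ can be sketched in a few lines for a scalar problem. Everything specific in the following block is an assumption chosen for illustration and is not taken from the cited works: the dynamics $f(x, u) = u$, the cost $g(x, u) = x^2 + u^2/2$, $\delta = 1$, $X = [-1, 1]$, a control set discretized to $\{-1, 0, 1\}$, piecewise linear interpolation on a uniform grid, and a crude clipping of the Euler step to stay in $X$.

```python
def interp(values, x, x_min, dx):
    """Piecewise linear interpolation of nodal values at a point x in X."""
    s = (x - x_min) / dx
    i = min(int(s), len(values) - 2)
    w = s - i
    return (1 - w) * values[i] + w * values[i + 1]


def solve(delta=1.0, h=0.05, M=40, controls=(-1.0, 0.0, 1.0), tol=1e-10):
    x_min, x_max = -1.0, 1.0
    dx = (x_max - x_min) / M
    nodes = [x_min + i * dx for i in range(M + 1)]

    def g(x, u):                        # assumed running cost
        return x * x + 0.5 * u * u

    def step(x, u):                     # Euler step for f(x, u) = u, clipped to X
        return min(max(x + h * u, x_min), x_max)

    def bellman(V, x):                  # returns (minimal value, minimizing u)
        return min((h * g(x, u)
                    + (1 - delta * h) * interp(V, step(x, u), x_min, dx), u)
                   for u in controls)

    V = [0.0] * (M + 1)
    while True:                         # fixed-point iteration, contraction 1 - delta*h
        V_new = [bellman(V, x)[0] for x in nodes]
        done = max(abs(p - q) for p, q in zip(V_new, V)) < tol
        V = V_new
        if done:
            break
    feedback = [bellman(V, x)[1] for x in nodes]
    return nodes, V, feedback


nodes, V, feedback = solve()
print(V[0], V[len(V) // 2], feedback[0], feedback[-1])
```

Because $\delta h > 0$, the iteration is a contraction with factor $1 - \delta h$, which is why the simple loop terminates; the number of grid nodes needed in higher dimensions is the limiting factor discussed next.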

The main advantage of the Hamilton-Jacobi-Bellman approach is that it directly computes an approximately optimal feedback law. Its main disadvantage is that the number of grid nodes needed for maintaining a given accuracy in a finite element approach to compute $\tilde{V}^\infty_h$ in general grows exponentially with the state dimension $n$. This fact, known as the curse of dimensionality, restricts this method to low-dimensional state spaces. Unless special structure is available which can be exploited, as, e.g., in the max-plus approach (see McEneaney 2006), it is currently almost impossible to go beyond state dimensions of about $n = 10$, typically less for strongly nonlinear problems.

Maximum Principle Approach 

In contrast to the Hamilton-Jacobi-Bellman approach, the approach via Pontryagin’s maximum principle does not compute a feedback law. Instead, it yields an approximation to the open-loop optimal control $u^*$ together with an approximation to the optimal trajectory $x^*$ for a fixed initial value. We explain the approach for the finite horizon problem. For simplicity of presentation, we omit state constraints, i.e., we set $X = \mathbb{R}^n$ and refer to, e.g., Vinter (2000), Bryson and Ho (1975), or Grass et al. (2008) for more general formulations as well as for rigorous versions of the following statements.

In order to state the maximum principle (which, since we are considering a minimization problem here, could also be called minimum principle), we define the non-minimized Hamiltonian as

$$\mathcal{H}(x, p, u) = g(x, u) + p \cdot f(x, u).$$

Then, under appropriate regularity assumptions, there exists an absolutely continuous function $p : [0, T] \to \mathbb{R}^n$ such that the optimal trajectory $x^*$ and the corresponding optimal control function $u^*$ for (2)-(4) satisfy

$$\dot{p}(t) = \delta p(t) - \mathcal{H}_x(x^*(t), p(t), u^*(t)) \qquad (8)$$

with terminal or transversality condition

$$p(T) = F_x(x^*(T)) \qquad (9)$$

and

$$u^*(t) = \operatorname{argmin}_{u \in U} \mathcal{H}(x^*(t), p(t), u), \qquad (10)$$

for almost all $t \in [0, T]$ (see Grass et al. 2008, Theorem 3.4). The variable $p$ is referred to as the adjoint or costate variable.

For a given initial value $x_0 \in \mathbb{R}^n$, the numerical approach now consists of finding functions $x : [0, T] \to \mathbb{R}^n$, $u : [0, T] \to U$ and $p : [0, T] \to \mathbb{R}^n$ satisfying

$$\dot{x}(t) = f(x(t), u(t)) \qquad (11)$$

$$\dot{p}(t) = \delta p(t) - \mathcal{H}_x(x(t), p(t), u(t)) \qquad (12)$$

$$u(t) = \operatorname{argmin}_{u \in U} \mathcal{H}(x(t), p(t), u) \qquad (13)$$

$$x(0) = x_0, \quad p(T) = F_x(x(T)) \qquad (14)$$

for $t \in [0, T]$. Depending on the regularity of the underlying data, the conditions (11)-(14) may only be necessary but not sufficient for $x$ and $u$ being an optimal trajectory $x^*$ and control function $u^*$, respectively. However, $x$ and $u$ satisfying these conditions are usually good candidates for the optimal trajectory and control, thus justifying the use of these conditions for the numerical approach. If needed, optimality of the candidates can be checked using suitable sufficient optimality conditions for which we refer to, e.g., Maurer (1981) or Malanowski et al. (2004). Due to the fact that in the maximum principle approach first optimality conditions are derived which are then discretized for numerical simulation, it is also termed first optimize then discretize.

Solving (11)-(14) numerically amounts to solving a boundary value problem, because the condition $x^*(0) = x_0$ is posed at the beginning of the time interval $[0, T]$ while the condition $p(T) = F_x(x^*(T))$ is required at the end. In order to solve such a problem, the simplest approach is the single shooting method which proceeds as follows:

We select a numerical scheme for solving the ordinary differential equations (11) and (12) for $t \in [0, T]$ with initial conditions $x(0) = x_0$, $p(0) = p_0$ and control function $u(t)$. Then, we proceed iteratively as follows:

(0) Find initial guesses $p_0^0 \in \mathbb{R}^n$ and $u^0(t)$ for the initial costate and the control, fix $\varepsilon > 0$, and set $k := 0$.

(1) Solve (11) and (12) numerically with initial values $x_0$ and $p_0^k$ and control function $u^k$. Denote the resulting trajectories by $x^k(t)$ and $p^k(t)$.

(2) Apply one step of an iterative method for solving the zero-finding problem $G(p) = 0$ with

$$G(p_0^k) := p^k(T) - F_x(x^k(T))$$

for computing $p_0^{k+1}$. For instance, in case of the Newton method we get

$$p_0^{k+1} := p_0^k - DG(p_0^k)^{-1} G(p_0^k).$$

If $\|p_0^{k+1} - p_0^k\| < \varepsilon$, stop; else compute

$$u^{k+1}(t) := \operatorname{argmin}_{u \in U} \mathcal{H}(x^k(t), p^k(t), u),$$

set $k := k + 1$, and go to (1).

The procedure described in this algorithm is called single shooting because the iteration is performed on the single initial value $p_0^k$. For an implementable scheme, several details still need to be made precise, e.g., how to parameterize the function $u(t)$ (e.g., piecewise constant, piecewise linear or polynomial), how to compute the derivative $DG$ and its inverse (or an approximation thereof), and the argmin in (2). The last task considerably simplifies if the structure of the optimal control, e.g., the number of switchings in case of a bang-bang control, is known.

However, even if all these points are settled, the set of initial guesses $p_0^0$ and $u^0$ for which the method is going to converge to a solution of (11)-(14) tends to be very small. One reason for this is that the solutions of (11) and (12) typically depend very sensitively on $p_0^0$ and $u^0$. In order to circumvent this problem, multiple shooting can be used. To this end, one selects a time grid $0 = t_0 < t_1 < t_2 < \ldots < t_N = T$ and in addition to $p_0^k$ introduces variables $x_1^k, \ldots, x_{N-1}^k, p_1^k, \ldots, p_{N-1}^k \in \mathbb{R}^n$. Then, starting from initial guesses $p_0^0$, $u^0$, and $x_1^0, \ldots, x_{N-1}^0$, $p_1^0, \ldots, p_{N-1}^0$, in each iteration the Eqs. (11)-(14) are solved numerically on the intervals $[t_j, t_{j+1}]$ with initial values $x_j^k$ and $p_j^k$, respectively. We denote the respective solutions in the $k$-th iteration by $\tilde{x}_j^k$ and $\tilde{p}_j^k$. In order to enforce that the trajectory pieces computed on the individual intervals $[t_j, t_{j+1}]$ fit together continuously, the map $G$ is redefined as

$$G(x_1^k, \ldots, x_{N-1}^k, p_0^k, p_1^k, \ldots, p_{N-1}^k) = \begin{pmatrix} \tilde{x}_0^k(t_1) - x_1^k \\ \vdots \\ \tilde{x}_{N-2}^k(t_{N-1}) - x_{N-1}^k \\ \tilde{p}_0^k(t_1) - p_1^k \\ \vdots \\ \tilde{p}_{N-2}^k(t_{N-1}) - p_{N-1}^k \\ \tilde{p}_{N-1}^k(T) - F_x(\tilde{x}_{N-1}^k(T)) \end{pmatrix}.$$


The benefit of this approach is that the solutions on the shortened time intervals depend much less sensitively on the initial values and the control, thus making the problem numerically much better conditioned. The obvious disadvantage is that the problem becomes larger as the function $G$ is now defined on a much higher dimensional space, but this additional effort usually pays off.
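
To make the shooting idea concrete, here is a minimal single shooting sketch on a scalar linear-quadratic problem. The problem data are assumptions chosen so that the exact answer is known: $f(x, u) = u$, $g(x, u) = (x^2 + u^2)/2$, $F = 0$, $\delta = 0$, $U = \mathbb{R}$, $T = 1$, $x_0 = 1$. In this case (13) gives $u = -p$ in closed form, so the sketch departs from the alternating scheme described above: the control is eliminated before integrating, and the Newton iteration acts on $p_0$ alone with a finite-difference approximation of $DG$. The exact initial costate is $p(0) = \tanh(1)$.

```python
import math


def integrate(p0, x0=1.0, T=1.0, steps=100):
    """RK4 for x' = -p, p' = -x, i.e., (11)-(12) with u = -p inserted."""
    def rhs(x, p):
        return -p, -x

    dt = T / steps
    x, p = x0, p0
    for _ in range(steps):
        k1 = rhs(x, p)
        k2 = rhs(x + 0.5 * dt * k1[0], p + 0.5 * dt * k1[1])
        k3 = rhs(x + 0.5 * dt * k2[0], p + 0.5 * dt * k2[1])
        k4 = rhs(x + dt * k3[0], p + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        p += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, p


def G(p0):
    """Terminal mismatch p(T) - F_x(x(T)); here F = 0, so F_x = 0."""
    _, p_T = integrate(p0)
    return p_T


def single_shooting(p0=0.0, eps=1e-10, fd=1e-6, max_iter=20):
    for _ in range(max_iter):
        dG = (G(p0 + fd) - G(p0)) / fd      # finite-difference stand-in for DG
        p_next = p0 - G(p0) / dG            # Newton step
        if abs(p_next - p0) < eps:
            return p_next
        p0 = p_next
    raise RuntimeError("single shooting did not converge")


p0_star = single_shooting()
print(p0_star, math.tanh(1.0))
```

For this linear problem $G$ is affine in $p_0$, so Newton's method converges essentially in one step from any initial guess; the sensitivity issues discussed above arise for nonlinear dynamics and longer horizons.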

While the convergence behavior for the multiple shooting method is considerably better than for single shooting, it is still a difficult task to select good initial guesses $x_j^0$, $p_j^0$ and $u^0$. In order to accomplish this, homotopy methods can be used (see, e.g., Pesch 1994) or the result of a direct approach as presented in the next section can be used as an initial guess. The latter can be reasonable as the maximum principle-based approach can yield approximations of higher accuracy than the direct method.

In the presence of state constraints or mixed state and control constraints, the conditions (12)-(14) become considerably more technical and thus more difficult to implement numerically (cf. Pesch 1994).
Direct Discretization


Despite being the most straightforward and simple of the approaches described in this article, the direct discretization approach is currently the most widely used approach for computing single finite horizon optimal trajectories. In the direct approach, we first discretize the problem and then solve a finite dimensional nonlinear optimization problem (NLP), i.e., we first discretize, then optimize. The main reasons for the popularity of this approach are the simplicity with which constraints can be handled and the numerical efficiency due to the availability of fast and reliable NLP solvers.

The direct approach again applies to the finite
horizon problem and computes an approximation
to a single optimal trajectory $x^*(t)$ and control
function $u^*(t)$ for a given initial value $x_0 \in X$.
To this end, a time grid $0 = t_0 < t_1 < t_2 < \dots < t_N = T$
and a set $U_d$ of control functions which
are parameterized by finitely many values are
selected. The simplest way to do so is to choose
$u(t) = u_i \in U$ for all $t \in [t_i, t_{i+1}]$. However,
other approaches like piecewise linear or piecewise
polynomial control functions are possible,
too. We use a numerical algorithm for ordinary
differential equations in order to approximately
solve the initial value problems

\dot{x}(t) = f(x(t), u_i), \quad x(t_i) = x_i    (15)

for $i = 0, \dots, N-1$ on $[t_i, t_{i+1}]$. We denote
the exact and numerical solution of (15)
by $x(t, t_i, x_i, u_i)$ and $\tilde{x}(t, t_i, x_i, u_i)$, respectively.
Finally, we choose a numerical integration rule in
order to compute an approximation

I(t_i, t_{i+1}, x_i, u_i) \approx \int_{t_i}^{t_{i+1}} e^{-\delta t} g(x(t, t_i, x_i, u_i), u(t)) \, dt.


In the simplest case, one might choose $\tilde{x}$ as
the Euler scheme and $I$ as the rectangle rule,
leading to

\tilde{x}(t_{i+1}, t_i, x_i, u_i) = x_i + (t_{i+1} - t_i) f(x_i, u_i)

I(t_i, t_{i+1}, x_i, u_i) = (t_{i+1} - t_i) e^{-\delta t_i} g(x_i, u_i).

Introducing the optimization variables
$u_0, \dots, u_{N-1} \in \mathbb{R}^m$ and $x_1, \dots, x_N \in \mathbb{R}^n$, the
discretized version of (2)-(4) reads

\text{minimize} \quad \sum_{i=0}^{N-1} I(t_i, t_{i+1}, x_i, u_i) + e^{-\delta T} F(x_N)

subject to the constraints

u_j \in U, \quad j = 0, \dots, N-1

x_j \in X, \quad j = 1, \dots, N

x_{j+1} = \tilde{x}(t_{j+1}, t_j, x_j, u_j), \quad j = 0, \dots, N-1
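
The discretized problem can be sketched in a few lines. The
code below assumes scalar states and controls, the Euler scheme,
and the rectangle rule; the function names and the test problem
(x' = u with running cost x^2 + u^2 and terminal cost x^2) are
illustrative choices, not taken from the article. It only
evaluates the objective and the continuity constraints for a
candidate point; an actual NLP solver would iterate over such
candidates.

```python
import math


def euler_step(f, t0, t1, x, u):
    """One explicit Euler step for x' = f(x, u) on [t0, t1]."""
    return x + (t1 - t0) * f(x, u)


def rectangle_cost(g, delta, t0, t1, x, u):
    """Rectangle rule for the discounted running cost on [t0, t1]."""
    return (t1 - t0) * math.exp(-delta * t0) * g(x, u)


def rollout(f, ts, x0, us):
    """States x_0, ..., x_N obtained by chaining Euler steps."""
    xs = [x0]
    for j, u in enumerate(us):
        xs.append(euler_step(f, ts[j], ts[j + 1], xs[-1], u))
    return xs


def objective(g, terminal, delta, ts, xs, us):
    """Sum of interval costs plus the discounted terminal cost."""
    running = sum(
        rectangle_cost(g, delta, ts[i], ts[i + 1], xs[i], us[i])
        for i in range(len(us))
    )
    return running + math.exp(-delta * ts[-1]) * terminal(xs[-1])


def continuity_residuals(f, ts, xs, us):
    """Residuals of the equality constraints; all zero for a feasible point."""
    return [
        xs[j + 1] - euler_step(f, ts[j], ts[j + 1], xs[j], us[j])
        for j in range(len(us))
    ]


# Illustrative test problem (an assumption, not from the article).
f = lambda x, u: u
g = lambda x, u: x * x + u * u
terminal = lambda x: x * x

ts = [0.0, 0.5, 1.0]   # time grid with N = 2
us = [-1.0, -1.0]      # candidate controls u_0, u_1
xs = rollout(f, ts, 1.0, us)
cost = objective(g, terminal, 0.0, ts, xs, us)
residuals = continuity_residuals(f, ts, xs, us)
print(xs, cost, residuals)
```

Keeping the states as separate variables tied together by
residuals, rather than eliminating them through the rollout, is
what gives the problem its multiple shooting structure.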

This way, we have converted the optimal control
problem (2)-(4) into a finite dimensional nonlinear
optimization problem (NLP). As such, it can
be solved with any numerical method for solving
such problems. Popular methods are, for instance,
sequential quadratic programming (SQP) or
interior point (IP) algorithms. The convergence of
this approach was proved in Malanowski et al.
(1998); for an up-to-date account on theory and
practice of the method, see Gerdts (2012) and
Betts (2010). These references also explain how
information about the costates $p(t)$ can be
extracted from a direct discretization, thus linking
the approach to the maximum principle.

The direct method sketched here is again a
multiple shooting method, and the benefit of this
approach is the same as for solving boundary
value problems: thanks to the short intervals $[t_i, t_{i+1}]$,
the solutions depend much less sensitively on the
data than the solution on the whole interval $[0, T]$,
thus making the iterative solution of the resulting
discretized NLP much easier. The price to pay is
again the increase of the number of optimization
variables. However, due to the particular structure
of the constraints guaranteeing continuity of the
solution, the resulting matrices in the NLP have
a particular structure which can be exploited
numerically by a method called condensing (see
Bock and Plitt 1984).
An alternative to multiple shooting methods
are collocation methods, in which the internal
variables of the numerical algorithm for solving
(15) are also optimization variables. However,
nowadays, the multiple shooting approach as
described above is usually preferred. For a more
detailed description of various direct approaches,
see also Binder et al. (2001), Sect. 5.

Further Approaches for Infinite 
Horizon Problems 

The last two approaches only apply to finite
horizon problems. While the maximum principle
approach can be generalized to infinite horizon
problems, the necessary conditions become
weaker and the numerical solution becomes
considerably more involved (see Grass et al. 2008).
Both the maximum principle and the direct
approach can, however, be applied in a receding
horizon fashion, in which an infinite horizon
problem is approximated by the iterative solution
of finite horizon problems. The resulting control
technique is known under the name of model
predictive control (MPC; see Grüne and Pannek
2011), and under suitable assumptions, a rigorous
approximation result can be established.
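
The receding horizon idea can be sketched as follows. The
example assumes a scalar sampled system x+ = x + STEP * u, a
small finite control set, and a brute-force search over all
control sequences in place of an NLP solver; every name and
constant is a hypothetical choice made for illustration. At each
step a finite horizon problem is solved, only the first control
is applied, and the procedure is repeated from the new state.

```python
from itertools import product

CONTROLS = (-1.0, -0.5, 0.0, 0.5, 1.0)  # illustrative finite control set
STEP = 0.5                              # sampling period


def step(x, u):
    """Sampled dynamics x+ = x + STEP * u (Euler discretization of x' = u)."""
    return x + STEP * u


def horizon_cost(x, controls):
    """Finite horizon cost with stage cost STEP * (x^2 + u^2), no terminal cost."""
    cost = 0.0
    for u in controls:
        cost += STEP * (x * x + u * u)
        x = step(x, u)
    return cost


def mpc_feedback(x, horizon=3):
    """Solve the finite horizon problem by enumeration; return the first control."""
    best = min(product(CONTROLS, repeat=horizon),
               key=lambda seq: horizon_cost(x, seq))
    return best[0]


def closed_loop(x0, steps=10):
    """Apply the receding horizon feedback repeatedly."""
    xs = [x0]
    for _ in range(steps):
        xs.append(step(xs[-1], mpc_feedback(xs[-1])))
    return xs


trajectory = closed_loop(2.0)
print(trajectory)
```

With such a coarse control set and no terminal cost, the state
is driven toward the origin but need not reach it exactly; this
is one reason why the approximation results mentioned above
require additional assumptions.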

Summary and Future Directions 

The three main numerical approaches to optimal 
control are: 

• The Hamilton-Jacobi-Bellman approach,
which provides a global solution in feedback
form but is computationally expensive for
higher dimensional systems

• The Pontryagin maximum principle approach,
which computes single optimal trajectories
with high accuracy but needs good initial
guesses for the iteration

• The direct approach, which also computes single
optimal trajectories but is less demanding
in terms of the initial guesses at the expense of
a somewhat lower accuracy

Currently, the main trends in numerical optimal
control lie in the areas of Hamilton-Jacobi-Bellman
equations and direct discretization. For
the former, the focus is on the development of
discretization schemes suitable for increasingly
higher dimensional problems. For the
latter, the popularity of these methods in online
applications like MPC triggers continuing effort
to make this approach faster and more reliable.

Beyond ordinary differential equations, the
development of numerical algorithms for the
optimal control of partial differential equations
(PDEs) has attracted considerable attention
in recent years. While many of these
methods are still restricted to linear systems,
in the near future we can expect to see many
extensions to (classes of) nonlinear PDEs. It is
worth noting that for PDEs, maximum principle-like
approaches are more popular than for
ordinary differential equations.

Cross-References 

► Discrete Optimal Control 

► Economic Model Predictive Control 

► Nominal Model-Predictive Control 

► Optimal Control and the Dynamic Programming
Principle

► Optimal Control and Pontryagin’s Maximum 
Principle 

► Optimization Algorithms for Model Predictive 
Control 
Abstract

An observer-based controller is a dynamic 
feedback controller with a two-stage structure. 
First, the controller generates an estimate of the 
state variable of the system to be controlled, 
using the measured output and known input 
of the system. This estimate is generated by a 
state observer for the system. Next, the state 
estimate is treated as if it were equal to the exact 
state of the system, and it is used by a static 
state feedback controller. Dynamic feedback 
controllers with this two-stage structure appear 
in various control synthesis problems for linear 
systems. In this entry, we explain observer-based 
control in the context of internal stabilization by 
dynamic measurement feedback. 

Keywords 

Detectability; Dynamic output feedback control;
Internal stabilization; Separation principle;
Stabilizability; State observers; Static state
feedback


Introduction 

In this entry, we explain the notion of observer- 
based feedback control. Given a to-be-controlled 
system in input-state-output form, together with 
a control objective, the problem is to design a 
feedback controller such that the closed-loop 
system meets the objective. In the case when 
all state variables of the system are available 
for control, the design problem is considered 
to be simpler, and often the controller can be 
chosen to be a static state feedback control law. 
In the more general case where the controller 
has access only to a linear function of the 
state variables, the problem is more involved 
and requires the design of a dynamic feedback 
control law. The key idea of observer-based 
feedback control is the following. As a first 
step, one determines a state observer for the 
system, i.e., a system that estimates the state of 
the system based on the measured outputs and 
inputs of the system. Next, the state estimate 
is treated as if it were exactly equal to the 
actual state of the system and is used by a 
static state feedback controller. In this way, a 
dynamic feedback controller is obtained that is 
composed of a (dynamic) state observer and a 
static feedback part. 
Dynamic Output Feedback Control

Consider the controlled and observed system $\Sigma$:

\dot{x}(t) = A x(t) + B u(t) + E d(t),
y(t) = C x(t),    (1)
z(t) = H x(t),

with $x(t) \in X = \mathbb{R}^n$ the state, $u(t) \in \mathbb{R}^m$
the control input, and $y(t) \in \mathbb{R}^p$ the measured
output. The signal $d(t)$ may represent a
disturbance input or a desired reference signal,
while the signal $z(t)$ is a controlled output signal.
$A$, $B$, $C$, $E$, and $H$ are maps (or matrices).
In general, a linear controller for this system is
a finite-dimensional linear time-invariant system
$\Gamma$ represented by

\dot{w}(t) = K w(t) + L y(t),
u(t) = M w(t) + N y(t).    (2)

The state space of the controller is assumed to be
$W = \mathbb{R}^q$ for some positive integer $q$. $K$, $L$, $M$,
and $N$ are assumed to be linear maps (or matrices).
The controller (2) takes the observations $y$
as its input and generates the control function $u$ as
its output. The closed-loop system resulting from
the interconnection of $\Sigma$ and $\Gamma$ is obtained by
substituting $y = Cx$ and $u = Mw + NCx$ into (1) and (2)
and is described by the equations

\begin{pmatrix} \dot{x}(t) \\ \dot{w}(t) \end{pmatrix} = \begin{pmatrix} A + BNC & BM \\ LC & K \end{pmatrix} \begin{pmatrix} x(t) \\ w(t) \end{pmatrix} + \begin{pmatrix} E \\ 0 \end{pmatrix} d(t),
z(t) = \begin{pmatrix} H & 0 \end{pmatrix} \begin{pmatrix} x(t) \\ w(t) \end{pmatrix}.    (3)

The control action of interconnecting the controller
$\Gamma$ with the system (1) is called dynamic
feedback. The state space of the closed-loop system
(3) is called the extended state space and is
equal to the Cartesian product $X \times W = \mathbb{R}^{n+q}$. In
general, a feedback control problem amounts to
finding linear maps $K$, $L$, $M$, and $N$ such that the
closed-loop system (3) satisfies the control design
specifications.
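
As a rough illustration, the system map of the closed loop can
be assembled from the plant and controller matrices. The sketch
below assumes that the matrices are given as nested Python lists
and uses the block form with blocks A + BNC, BM, LC, and K that
follows from substituting u = Mw + NCx into the plant; the
helper names and the numerical values are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matadd(a, b):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def closed_loop_matrix(A, B, C, K, L, M, N):
    """System map [[A + B N C, B M], [L C, K]] of the closed loop."""
    top_left = matadd(A, matmul(matmul(B, N), C))
    top_right = matmul(B, M)
    bottom_left = matmul(L, C)
    top = [r1 + r2 for r1, r2 in zip(top_left, top_right)]
    bottom = [r1 + r2 for r1, r2 in zip(bottom_left, K)]
    return top + bottom


# Illustrative data: a double integrator with a one-dimensional controller.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0], [1.0]]
C = [[1.0, 0.0]]
K, L, M, N = [[-1.0]], [[2.0]], [[3.0]], [[-4.0]]

A_cl = closed_loop_matrix(A, B, C, K, L, M, N)
for row in A_cl:
    print(row)
```

The resulting matrix acts on the extended state (x, w), so its
size is n + q, here 3.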


Observer-Based Controllers 

Given the system (1) and a control objective,
the problem thus arises of how to determine the
maps $K$, $L$, $M$, and $N$ so that the closed-loop
system meets the objective. As an example, take
the special case when $E$ in (1) is equal to zero
(i.e., the system has no external disturbances or
reference signals) and we wish the closed-loop
system (3) to be internally stable, i.e., we
want to find the maps $K$, $L$, $M$, and $N$ so that
the eigenvalues $\lambda_i$ of the system map of (3) are in
the open left half-plane, i.e., satisfy $\mathrm{Re}(\lambda_i) < 0$
for all $i$. If we had access to the entire state
variable $x$ (instead of only to the linear function
$y = Cx$), then this problem would be simpler:
assuming that the system is stabilizable (the
system $\dot{x} = Ax + Bu$ is called stabilizable if
there exists a map $F$ such that $A + BF$ has all
its eigenvalues in the open left half-plane), find
a map $F$ such that the eigenvalues of $A + BF$
are in the open left half-plane; then take the static
state feedback controller $u = Fx$ as the control
law. That is, we would choose the state space
dimension of the controller $\Gamma$ equal to 0 and the
maps $K$, $L$, and $M$ to be void, and we would take
$N = F$.

In general, however, we only have access to a
given linear function $y = Cx$ of $x$, determined
by the output map $C$. The key idea of observer-based
control is the following:

Use the theory of observer design to find an
observer for the state $x$ of the system (1), i.e.,
an observer that generates an estimate $\xi$ of the
system state $x$ based on the measured output
$y$ and the control input $u$. Next, apply a static
feedback $u = F\xi$ mimicking the (not permissible)
control law $u = Fx$.

This idea leads to a dynamic feedback controller
(2) of a very particular structure: the controller
is the combination of a state observer
(with a certain state space dimension) and a static
control law acting on the state estimate. This two-stage
structure, separating estimation and control,
is often called the separation principle. We will
work out this idea in more detail for the case
when $E = 0$ (no external disturbances or reference
signals) and the aim is to obtain internal
stability of the system. Before doing this, we first
explain the most important material on observers
that is needed in the sequel.

State Observers 

If the state is not available for measurement, one
can try to reconstruct it using a system, called an
observer, that takes the control input and the
measured output of the original system as inputs
and yields an output that is an estimate of the
state of the original system. Again, we consider
the case that $E = 0$ in the system (1), i.e., there are
no disturbance signals. This is illustrated in the
following picture:



The quantity $\xi$ is supposed to be an estimate,
in some sense, of the state, and $w$ is the state
variable of the observer. In general, the observer,
denoted by $\Omega$, has equations of the form

\dot{w}(t) = P w(t) + Q u(t) + R y(t),
\xi(t) = S w(t).    (4)

It turns out that particular choices for $P$, $Q$, $R$,
and $S$, specifically $P = A - GC$ (where the map
$G$ has to be determined), $Q = B$, $R = G$, and
$S = I$, lead to

\dot{\xi}(t) = (A - GC) \xi(t) + B u(t) + G y(t).    (5)

Introducing the estimation error $e := \xi - x$ and
interconnecting the system (1) with (5), we find
that the error $e$ satisfies the differential equation

\dot{e}(t) = (A - GC) e(t).    (6)
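
For completeness, the error equation can be verified directly.
The following short derivation only uses (1) with E = 0, the
observer equation (5), and the output relation y = Cx; the
control input cancels, which is why the error dynamics do not
depend on u.

```latex
\begin{aligned}
\dot{e}(t) &= \dot{\xi}(t) - \dot{x}(t) \\
           &= (A - GC)\,\xi(t) + B u(t) + G C x(t) - A x(t) - B u(t) \\
           &= (A - GC)\,\xi(t) - (A - GC)\,x(t) \\
           &= (A - GC)\,e(t).
\end{aligned}
```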

Hence all possible errors converge to 0 as $t$ tends
to infinity if and only if $A - GC$ is a stability
matrix, i.e., has all its eigenvalues in the open left
half-plane. In that case, we call (5) a stable state
observer. Thus, a stable state observer exists if
and only if $G$ can be found such that $A - GC$ is
a stability matrix. The problem of finding such a
$G$ is dual to the problem of finding a matrix $F$
to a pair $(A, B)$ such that $A + BF$ is a stability
matrix.

Definition 1 The pair $(C, A)$ is called detectable
if there exists a matrix $G$ such that $A - GC$ is a
stability matrix, i.e., has all its eigenvalues in the
open left half-plane.

Theorem 1 Given system $\Sigma$, the following statements
are equivalent:

1. $\Sigma$ has a stable state observer.

2. $(C, A)$ is detectable.
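
A small simulation may help to see a stable state observer at
work. The sketch assumes a double integrator (A = [[0, 1], [0, 0]],
B = (0, 1)^T, C = (1, 0)) and the gain G = (3, 2)^T, for which
A - GC has the eigenvalues -1 and -2; the plant, the gain, the
input signal, and the explicit Euler integration are all
illustrative choices rather than part of the article.

```python
import math


def simulate_observer(g1=3.0, g2=2.0, dt=1e-3, t_end=10.0):
    """Euler simulation of a double integrator and its state observer.

    Returns the Euclidean norm of the estimation error e = xi - x
    at the initial time and after every integration step.
    """
    x = [1.0, -0.5]   # true state, unknown to the observer
    xi = [0.0, 0.0]   # observer state, started from an arbitrary guess
    errors = [math.hypot(xi[0] - x[0], xi[1] - x[1])]
    for k in range(int(round(t_end / dt))):
        u = math.sin(k * dt)   # any known input; it cancels in the error
        y = x[0]               # measured output y = C x
        innovation = y - xi[0]  # output error y - C xi
        dx = (x[1], u)                                            # x' = A x + B u
        dxi = (xi[1] + g1 * innovation, u + g2 * innovation)      # observer copy + correction
        x = [x[0] + dt * dx[0], x[1] + dt * dx[1]]
        xi = [xi[0] + dt * dxi[0], xi[1] + dt * dxi[1]]
        errors.append(math.hypot(xi[0] - x[0], xi[1] - x[1]))
    return errors


errors = simulate_observer()
print(f"initial error: {errors[0]:.3f}, final error: {errors[-1]:.2e}")
```

The observer never sees the second state component, yet the
error in both components decays because the pair is detectable
(here even observable).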

The equation for $\xi$ can be rewritten using an
artificial output $\eta = C\xi$ as $\dot{\xi} = A\xi + Bu +
G(y - \eta)$. The interpretation of this is as follows.
If $\xi$ is the exact state, then $\eta = y$, and hence $\xi$
obeys exactly the same differential equation as $x$.
Otherwise, the equation for $\xi$ has to be corrected
by a term determined by the output error $y - \eta$.
Consequently, the state observer consists of an
exact replica $\Sigma_{dup}$ of the original system with an
extra input channel for incorporating the output
error and an extra output, the state of the observer,
which serves as the desired estimate for the state
of the original system. The following diagram
depicts the situation:
Observer-Based Stabilization

We now work out the ideas put forward in the
previous sections for the special case of stabilization
by dynamic measurement feedback, i.e.,
to find a controller (2) such that the closed-loop
system (3) is internally stable; equivalently,
the system mapping of (3) is a stability matrix.
Again, we restrict ourselves to the case when
$E = 0$.

We assume that we know how to stabilize by 
state feedback and how to build a state observer. 
If we have a plant of which we do not have the 
state available for measurement, we use a state 
observer to obtain an estimate of the state, and 
we apply the state feedback to this estimate rather 
than to the true state. This is illustrated by the 
following picture: 



Again, consider the system $\Sigma$ given by (1) and
let the observer $\Omega$ be given by (5). Combining
this with $u = F\xi$ yields

\dot{x}(t) = A x(t) + B F \xi(t),
\dot{\xi}(t) = (A - GC + BF) \xi(t) + G C x(t).    (7)

Introducing again $e := \xi - x$, we obtain, in
accordance with the previous section,

\dot{x}(t) = (A + BF) x(t) + B F e(t),
\dot{e}(t) = (A - GC) e(t).

That is, the equation $\dot{x}_e = A_e x_e$ with

x_e := \begin{pmatrix} x \\ e \end{pmatrix}, \quad A_e := \begin{pmatrix} A + BF & BF \\ 0 & A - GC \end{pmatrix}.

Assume that $\Sigma$ is stabilizable and detectable.
Then $F$ and $G$ can be found such that $A + BF$
and $A - GC$ are stability matrices. Since the
set of eigenvalues of $A_e$ is the union of those
of $A + BF$ and $A - GC$, it follows that $A_e$
is a stability matrix. Consequently, the system
$\dot{x}_e = A_e x_e$ is asymptotically stable; equivalently,
every solution $x_e = (x, e)$ converges to 0 as $t$
tends to infinity. Of course, if $(x, \xi)$ is a solution
of (7), then $\xi = x + e$, with $x_e = (x, e)$ a solution
of $\dot{x}_e = A_e x_e$. Hence $(x, \xi)$ also converges to 0
as $t$ goes to infinity. Thus we have proved the “if”
part of the following theorem:

Theorem 2 There exists an internally stabilizing
dynamic feedback controller for $\Sigma$ if and only if
$\Sigma$ is stabilizable and detectable. A controller is
given by

\dot{\xi}(t) = (A - GC) \xi(t) + B u(t) + G y(t),
u(t) = F \xi(t),    (8)

where $F$ is any map such that $A + BF$ is a
stability matrix and $G$ is any map such that
$A - GC$ is a stability matrix.

The controller (8) is an observer-based dynamic
feedback controller, since it is composed of a
state observer and a static feedback part.
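
The eigenvalue statement can be checked numerically on a small
example. The sketch below assumes a double integrator with
F = (-2, -3), so that A + BF has eigenvalues -1 and -2, and
G = (7, 12)^T, so that A - GC has eigenvalues -3 and -4; these
values and the helper names are hypothetical. It builds the
system matrix of (7) in the original coordinates (x, xi) and
evaluates its characteristic polynomial at the four expected
eigenvalues.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def closed_loop_matrix(f1, f2, g1, g2):
    """System matrix of (7) in the coordinates (x, xi) for a double integrator.

    A = [[0, 1], [0, 0]], B = (0, 1)^T, C = (1, 0), F = (f1, f2), G = (g1, g2)^T.
    """
    return [
        [0.0, 1.0, 0.0, 0.0],      # x1'  = x2
        [0.0, 0.0, f1, f2],        # x2'  = F xi
        [g1, 0.0, -g1, 1.0],       # xi1' = xi2 + g1 (x1 - xi1)
        [g2, 0.0, f1 - g2, f2],    # xi2' = F xi + g2 (x1 - xi1)
    ]


def char_poly(matrix, lam):
    """Value of det(matrix - lam * I)."""
    n = len(matrix)
    shifted = [[matrix[i][j] - (lam if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    return det(shifted)


M = closed_loop_matrix(f1=-2.0, f2=-3.0, g1=7.0, g2=12.0)
expected = [-1.0, -2.0, -3.0, -4.0]   # eig(A + BF) together with eig(A - GC)
residuals = [char_poly(M, lam) for lam in expected]
print(residuals)
```

Because the change of coordinates from (x, xi) to (x, e) is
invertible, the matrix of (7) and the block triangular matrix
have the same eigenvalues, which is what the vanishing residuals
reflect.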


Summary and Future Directions 

We have given an introduction to observer-based 
feedback controllers and have explained that such 
controllers are dynamic feedback controllers that 
can be represented as the composition of a state 
observer for the system, together with a static 
control law mimicking a (not permitted) static 
state feedback control law. We have given a 
detailed description of this principle for the case 
that the system to be controlled has no external 
disturbances or reference signals and the control 
objective is internal stability of the system. More
intricate versions of the principle of observer-based
feedback control appear in control design
problems for linear systems with external disturbances
and reference signals and with different,
more sophisticated, control objectives. Examples
of these are the regulator problem, the problem
of disturbance decoupling with internal stability,
the $H_2$ optimal control problem, and the $H_\infty$
suboptimal control problem.


Cross-References 

► Linear State Feedback 

► Observers in Linear Systems Theory 
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Abstract 

Observers are objects delivering estimates of
variables which cannot be directly measured.
The access to such hidden variables is made
possible by combining modeling and measurements.
But this brings the real world face to face
with its abstraction, with, as a result, the need
for dealing with uncertainties and approximations,
leading to difficulties in implementation and
convergence.


Keywords 

Introduction

Observers are answers to the question of
estimating, from observed/measured/empirical
variables, denoted $y$, and delivered by sensors
equipping a real-world system, some “theoretical”
variables, called hidden variables in this text,
denoted $z$, which are involved in a mathematical
model related to this system. The measured
variables make what is called the a posteriori
information on the hidden variables, whereas the
model is part of the a priori information. Because
a model cannot fit a system exactly, introduction
of uncertainties is mandatory.

Typically this model describing the link between
hidden and measured variables is made of
three components:

• A dynamic model describes the dynamics/evolution
($\dot{x}$ denotes the time derivative):

\dot{x}(t) = f(x(t), t, \delta^s(t)) \quad \text{resp.} \quad x_{k+1} = f_k(x_k, \delta^s_k),    (1)

where $t$, in the continuous case, or $k$, in the
discrete case, is an evolution parameter, called
time in this text; $x$ is a state, assumed finite
dimensional in this text; and $\delta^s$ represents
the uncertainties in the state dynamics. Any
possible known inputs are represented here by
the time dependence of $f$.

• A sensor model relates state and measured
variables:

y(t) = h(x(t), t, \delta^m(t)) \quad \text{resp.} \quad y_k = h_k(x_k, \delta^m_k)    (2)

with $\delta^m$ representing the uncertainties in the
measurements.

• A model which relates state and hidden variables:

z(t) = H(x(t), t, \delta^h(t)) \quad \text{resp.} \quad z_k = H_k(x_k, \delta^h_k)    (3)

where again $\delta^h$ represents the uncertainties in
the hidden variables.

In a deterministic setting, the a priori information
on the uncertainties $(\delta^s, \delta^m, \delta^h)$ may be that the
values of $\delta^s$, $\delta^m$, and $\delta^h$ are unknown but belong
to known sets $\Delta^s$, $\Delta^m$, and $\Delta^h$. Namely, we have:
\delta^s(t) \in \Delta^s(t), \quad \delta^m(t) \in \Delta^m(t), \quad \delta^h(t) \in \Delta^h(t),
respectively \delta^s_k \in \Delta^s_k, \quad \delta^m_k \in \Delta^m_k, \quad \delta^h_k \in \Delta^h_k.    (4)

In a stochastic setting and more specifically in a
Bayesian approach, it may be that $\delta^s$, $\delta^m$, and $\delta^h$
are unknown realizations of stochastic processes
for which we know the probability distributions.

Similarly we may also know a priori that we
have:

x(t) \in \mathcal{X}(t), \quad z(t) \in \mathcal{Z}(t)
respectively x_k \in \mathcal{X}_k, \quad z_k \in \mathcal{Z}_k    (5)

where the sets $\mathcal{X}$ and $\mathcal{Z}$ are known, or we may
have a priori probability distributions for $x$ and $z$.

In this context, the a priori information is the
data of the functions $f$, $h$, and $H$, of the sets
$\Delta^s$, $\Delta^m$, and $\Delta^h$ or the corresponding probability
distributions, and possibly also of the sets $\mathcal{X}$
and $\mathcal{Z}$ or the corresponding a priori probability
distributions.

In the next section, we state the observation
problem and give the solutions which are direct
consequences of the deterministic and stochastic
setting given above. This will allow us to see
that an observer is actually a dynamical system
with the measurements as inputs and the estimate
as output. But approximations in the implementation
of these solutions, not knowing how to
initialize, may lead to convergence problems even
when the uncertainties disappear. The second part
of this text is devoted to this convergence topic.

To ease the presentation, we deal only with
the discrete time case in section “Set-Valued
and Conditional Probability-Valued Observers”
and the continuous time case in sections “An
Optimization Approach” and “Convergent
Observers.”

Observation Problem and Its 
Solutions 

The Observation Problem 

Let $X^{\delta^s}(x, t, s)$, respectively $X^{\delta^s}_l(x, k)$, denote a
solution of (1) at time $s$, respectively $l$, going
through $x$ at time $t$, respectively $k$, and under the
action of $\delta^s$.

Observation problem At each time $t$, respectively
$k$, given the function $s \in\, ]t - T, t] \mapsto y(s)$,
respectively the sequence $l \in \{k - K, \dots, k\} \mapsto y_l$,
find an estimate $\hat{z}(t)$, respectively $\hat{z}_k$, of $z(t)$,
respectively $z_k$, satisfying

\hat{z}(t) = H(\hat{x}(t), t, \delta^h(t)) \quad \text{resp.} \quad \hat{z}_k = H_k(\hat{x}_k, \delta^h_k),

where $\hat{x}(t)$, respectively $\hat{x}_k$, is to be found as a
solution of

\hat{x}(t) \in \mathcal{X}(t),
y(s) = h(X^{\delta^s}(\hat{x}(t), t, s), s, \delta^m(s)) \quad \forall s \in\, ]t - T, t],

respectively

\hat{x}_k \in \mathcal{X}_k,
y_l = h_l(X^{\delta^s}_l(\hat{x}_k, k), \delta^m_l) \quad \forall l \in \{k - K, \dots, k\},

and where the time functions $\delta^s$, $\delta^m$, and $\delta^h$ must
agree with the a priori (deterministic/stochastic)
information or be minimized in some way.

In this statement $T$, respectively $K$, quantifies
the time window length or memory length during
which we record the measurement. The accumulation
with time of measurements, together
with the model equations (1)-(3) and the assumptions
on $(\delta^s, \delta^m, \delta^h)$, gives a redundancy of
data compared with the number of unknowns,
namely the hidden variables. This is why it may be
possible to solve this observation problem.

To simplify the following presentation, we
restrict our attention to the case where the hidden
variables are actually the full model state, i.e.,

z = H(x) = x.

When $z$ differs from $x$, observers are called
functional observers.

Set-Valued and Conditional 
Probability-Valued Observers 

Conceptually the answer to this problem is easy
at least when the memory increases with time
($\dot{T}(t) = 1$ resp. $K_{k+1} = K_k + 1$), leading to an
infinite non-fading memory. It consists in starting
from everything that the a priori information makes
possible and eliminating what is not consistent
with the a posteriori information. In the set-valued
observer setting, in the discrete time case,
this gives the following observer. To ease its reading,
we underline the data given by the a priori
information. It requires the introduction of two
sets $\xi_k$ and $\xi_{k|k-1}$ which are updated at each time
$k$ when a new measurement $y_k$ is made available.
$\xi_k$ is the set which $x_k$ is guaranteed to belong to at
time $k$, knowing all the measurements up to time
$k$, and $\xi_{k|k-1}$ is the same but with measurements
known up to time $k - 1$.

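Set-valued observer:

Initialization:
    \xi_0 = \underline{\mathcal{X}_0}

At each time k:
    prediction (flowing):
        \xi_{k|k-1} = \underline{f_{k-1}}(\xi_{k-1}, \underline{\Delta^s_{k-1}})
    restriction (consistency):
        \xi_k = \{ x \in (\xi_{k|k-1} \cap \underline{\mathcal{X}_k}) : y_k \in \underline{h_k}(x, \underline{\Delta^m_k}) \}
    estimation:
        \hat{x}_k \in \xi_k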

A key feature here is that this observer has a state
$\xi_k$ (a set) and is a dynamical system in the form:

\xi_{k+1} = \varphi_k(\xi_k, y_k), \quad \hat{x}_k \in \xi_k

with $y$ as input and $\hat{x}$ as output which is not
single valued. Important also, the initial condition
of the state $\xi$ is given by the a priori information.
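
In the simplest case the sets can be represented by intervals,
and the prediction and restriction steps become a few lines of
arithmetic. The sketch below assumes a scalar model
x_{k+1} = a x_k + delta^s_k with |delta^s_k| <= ds and a
measurement y_k = x_k + delta^m_k with |delta^m_k| <= dm; the
model, the bounds, and the function names are illustrative
assumptions, and no state constraint set is imposed.

```python
import random


def predict(interval, a, ds):
    """Image of an interval under x -> a * x + delta with |delta| <= ds (a >= 0)."""
    if a < 0:
        raise ValueError("this sketch assumes a nonnegative coefficient a")
    lo, hi = interval
    return (a * lo - ds, a * hi + ds)


def restrict(interval, y, dm):
    """Keep only the states consistent with the measurement y = x + delta, |delta| <= dm."""
    lo, hi = interval
    new_lo, new_hi = max(lo, y - dm), min(hi, y + dm)
    if new_lo > new_hi:
        raise ValueError("a priori and a posteriori information are inconsistent")
    return (new_lo, new_hi)


def simulate_set_observer(steps=50, a=0.9, ds=0.05, dm=0.1, seed=0):
    """Run the true system and the interval observer side by side."""
    rng = random.Random(seed)
    x = 0.7              # true state, unknown to the observer
    xi = (-1.0, 1.0)     # initial set given by the a priori information
    history = []
    for _ in range(steps):
        x = a * x + rng.uniform(-ds, ds)      # true dynamics with bounded disturbance
        y = x + rng.uniform(-dm, dm)          # measurement with bounded error
        xi = restrict(predict(xi, a, ds), y, dm)
        history.append((x, xi))
    return history


history = simulate_set_observer()
x_last, (lo_last, hi_last) = history[-1]
print(f"true state {x_last:.4f} lies in [{lo_last:.4f}, {hi_last:.4f}]")
```

The observer state is the pair of interval bounds, and any point
of the interval is an admissible estimate; choosing one of them,
for instance the midpoint, is a separate decision.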

In the stochastic setting, following the
Bayesian paradigm, the observer has the same
structure but with the state being a conditional
probability. See Jazwinski (2007, Theorem 6.4)
or Candy (2009, Table 2.1). In that setting too the
observer does not deliver a single state; it delivers
the (a posteriori) conditional probability of the
random variable $x_k$ given the a priori information
and the sequence of measurements
$l \in \{k - K, \dots, k\} \mapsto y_l$.

Comments

Implementation: For the time being, except for
very specific cases (Kalman filter, ...), the set-valued
and the conditional probability-valued
observers remain conceptual since we do not
know how to manipulate numerically sets
and probability laws. Their implementation
requires approximations. For instance, see
Milanese et al. (1996) and Witsenhausen
(1966) for the set case and Arulampalam
et al. (2002), Bucy and Joseph (1987),
Candy (2009), and Jazwinski (2007) for the
conditional probability case.

Need of finite or infinite but fading memory:
In these observers, model states $x$ which are
consistent with the a priori information but
do not agree with the a posteriori information
are eliminated (set intersection or probability
product). But once a point is eliminated, this is
forever. As a consequence, if there is, at some
time, a misfit between a priori and a posteriori
information, it is mistakenly propagated to
future times. A way around this problem
is to keep the information memory finite or
infinite but fading. In particular, with fixed
length memory, consistent points which were
disregarded due to measurements which are
no longer in the memory are reintroduced. This
says also that observers should not be sensitive
to their initial condition.

Not single-valued estimate: The observers introduced
above realize a lossless data compression,
extracting and preserving everything that
concerns the hidden variables in the redundant
data given by a priori and a posteriori
information. But this “lossless compression”
answer is not single valued (set valued or
conditional probability valued) as a result of
taking uncertainties into account. Actually, to
get a single-valued answer, the observation
problem must be complemented by making
precise what the estimate is made for. For
instance, we may want to select the most likely
or the average or more generally some cost-minimizing
estimate $\hat{x}$ among all the possible
ones given by $\xi$. In this way we obtain an
observer giving a single-valued estimate:

\xi_{k+1} = \varphi_k(\xi_k, y_k), \quad \hat{x}_k = \tau_k(\xi_k)
respectively
\dot{\xi}(t) = \varphi(\xi(t), y(t), t), \quad \hat{x}(t) = \tau(\xi(t), t)    (6)

But then, in general, we lose information, and
in particular we have no idea of the confidence
level this estimate has. Also, since the function
$\tau$, at least, encodes what the estimate $\hat{x}$ is
used for, for different uses, different functions $\tau$
may be needed.

An Optimization Approach 

A shortcut to obtain directly an observer giving
a single-valued estimate is to design it by trading
off among a priori and a posteriori information
(see Cox 1964, pages 7-10; Alamir 2007). For
example, in the continuous time case, we can
select the estimate $\hat{x}(t)$ among the minimizers (in
$x$) of

C(\{s \mapsto \delta^s(s)\}, x, t) = \int_{-\infty}^{t} \mathcal{C}(\delta^s(s), y(s), X^{\delta^s}(x, t, s), s) \, ds

where $X^{\delta^s}(x, t, s)$ is still the notation for a
solution to (1) and $\{s \mapsto \delta^s(s)\}$, representing the
unmodelled effect on the dynamics, is among the
arguments for the minimization. The infinitesimal
cost $\mathcal{C}$ is chosen to take nonnegative values
and be such that $\mathcal{C}(0, h(x, s), x, s)$ is zero. For
instance, it can be

\mathcal{C}(\delta^s, y, x, s) = \|\delta^s\|_x^2 + d_y(y, h(x, s))^2

where $\|\cdot\|_x$ is a norm at the point $x$ and $d_y$
is a distance in the measurement space. In the
same spirit, instead of optimization, a minimax
approach can be followed. See, for instance,
Bertsekas and Rhodes (1971), Başar and Bernhard
(1995, Chapter 7), and Willems (2004).

With $x$ fixed, the minimization of $C$ is an
infinite horizon optimal control problem in
reverse time. Solving this problem online is
extremely difficult and again approximations
are needed. We do not go on with this
approach, but we remark that, under extra
assumptions, the observer we obtain following
this approach can also be implemented in
the form of a dynamical system (6) but with
the specificity that the estimate $\hat{x}$ is part
of the observer state $\xi$ and its dynamics
are a copy of the undisturbed model with a
correction term which is zero when the estimated
state reproduces the measurement. Namely,
we get

\dot{\hat{x}}(t) = f(\hat{x}(t), t, 0) + E(\{\sigma \mapsto y(\sigma)\}, \hat{x}(t), y(t), t)

where $E$ is zero when $h(\hat{x}(t), t) = y(t)$. But, as
opposed to what we saw in the previous section,
the initial condition for the part $\hat{x}$ of the observer
state is unknown. Hence, we encounter again
the need for the observer to forget its initial
condition.
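
The structure "copy of the model plus a correction term" and the
forgetting of the initial condition can be illustrated on a toy
example. The sketch assumes the scalar model x' = -x^3 + cos(t)
with measurement y = x and the simple correction
gain * (y - xhat); the model, the gain, and the Euler
integration are hypothetical choices made only for this
illustration. For this particular model the error satisfies
e * e' <= -gain * e^2, so it decays whatever the initial guess.

```python
import math


def f(x, t):
    """Undisturbed model dynamics (illustrative choice)."""
    return -x ** 3 + math.cos(t)


def simulate(xhat0, gain=2.0, x0=1.0, dt=1e-3, t_end=10.0):
    """Euler simulation of the model and of an observer copying it.

    Returns the absolute estimation error at the final time.
    """
    x, xhat = x0, xhat0
    for k in range(int(round(t_end / dt))):
        t = k * dt
        y = x                             # measurement y = h(x) = x
        correction = gain * (y - xhat)    # zero whenever h(xhat) = y
        x, xhat = x + dt * f(x, t), xhat + dt * (f(xhat, t) + correction)
    return abs(xhat - x)


# Two very different (and both wrong) initial guesses for the observer.
final_errors = [simulate(xhat0) for xhat0 in (5.0, -5.0)]
print(final_errors)
```

Both runs end with a negligible error, which is the kind of
insensitivity to initialization that the next section studies
under the name of convergence.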

Convergent Observers 

We have mentioned that often an observer can be
implemented as a dynamical system, but without
knowing necessarily how to initialize it. Also
approximation is involved both in its design and
its implementation. So, at least when it gives a
single-valued estimate, we are facing the problem
of convergence of this estimate to the “true”
value, at least when there are no uncertainties. We
now concentrate our attention on the study of this
convergence, but, to simplify, in the continuous
time case only.

Let the model and observer dynamics be

\dot{x}(t) = f(x(t), t), \quad y(t) = h(x(t), t)    (7)

\dot{\xi}(t) = \varphi(\xi(t), y(t), t), \quad \hat{x}(t) = \tau(\xi(t), y(t), t)    (8)

with the observer state $\xi$ of finite dimension
$m$. We denote by $(X(x, t, s), \Xi((x, \xi), t, s))$ a
solution of (7)-(8).

Since we are dealing with convergence, the
focus is on what is going on when the time
becomes very large and in particular on the set
$\Omega$ of model states which are accumulation points
of some solution. Specifically we are interested in
the stability properties of the set

\mathfrak{Z}(t) = \{ (x, \xi) : x \in \Omega \ \&\ x = \tau(\xi, h(x, t), t) \}

which is contained in the zero estimation error set
associated with the given model-observer pair.

Definition 1 (convergent observer) We say the
observer (8) is convergent if for each $t$, there
exists a set $\mathfrak{Z}_a(t) \subset \mathfrak{Z}(t)$, such that on the
domain of existence of the solution, a distance
between the point $(X(x, t, s), \Xi((x, \xi), t, s))$ and
the set $\mathfrak{Z}_a(s)$ is upper bounded by a real function
$s \mapsto \beta_{x,\xi,t}(s)$, possibly dependent on $(x, \xi, t)$, with
nonnegative values, strictly decreasing and going
to zero as $s$ goes to infinity.

Necessary Conditions for Observer 
Convergence 

No Restriction on $\tau$

It is possible to prove that if the observer is
convergent, then the following properties hold.

Necessity of detectability: When $h$ and $\tau$ are
uniformly continuous in $x$ and $\xi$, respectively,
the estimate $\hat{x}$ does converge to the model state
$x$. In this case, two solutions of the model (7)
which produce the same measurement must
converge to each other. This is an asymptotic
distinguishability property called detectability.
If we are interested not only in the asymptotic
behavior but also in the transient (as for output
feedback), a property stronger than detectability
is needed. In particular instantaneous
distinguishability (see section “Observers Based
on Instantaneous Distinguishability”) is necessary
if we want to be able to impose the decay
rate of the function $\beta_{x,\xi,t}$.

Necessity of $m \geq n - p$: For each $t$, there exists
a subset $X_a(t)$ of $\Omega$, supposed to collect the
model states which can be asymptotically estimated
and such that we can associate, to each
of its points $x$, a set $x^l(x, t)$ allowing us to
redefine the set $\mathfrak{Z}_a(t)$ as

\mathfrak{Z}_a(t) = \{ (x, \xi) : x \in X_a(t) \ \&\ \xi \in x^l(x, t) \}.

This implies that for each $t$ and each $x$ in
$X_a(t)$, there is a point $\xi$ satisfying

x = \tau(\xi, h(x, t), t).    (9)

This is a surjectivity property of the function
$\tau$ but of a special kind since $h(x, t)$ is an
argument of $\tau$. We say that, for each $t$, the
function $\tau$ is surjective to $X_a(t)$ given $h$. In
a “generic” situation this property requires
the dimension $m$ of the observer state $\xi$ to
be larger than or equal to the dimension $n$ of the
model state $x$ minus the dimension $p$ of the
measurement $y$.

$\tau$ Is Injective Given $h$

We consider now the case where the observer has been designed with a function $\tau$ which is injective given $h$, namely, we have the following implication, when $x$ is in $\mathcal{X}_a(t)$,

$$\tau(\xi_1,h(x,t),t) = \tau(\xi_2,h(x,t),t) \ \ \&\ \ \xi_1 \in \tau^i(x,t) \quad\Longrightarrow\quad \xi_1 = \xi_2 .$$

In a “generic” situation, this property, together with the surjectivity given $h$, implies that the dimension $m$ of the observer state $\xi$ should be between $n - p$ and $n$.

If a convergent observer has such a function $\tau$, then $(x,t) \mapsto \tau^i(x,t)$, which is (of course) a (single-valued) function, admits a Lie derivative $L_f\tau^i$, defined by $L_f\tau^i(x,t) = \lim_{dt \to 0} \frac{\tau^i(X(x,t,t+dt),t+dt) - \tau^i(x,t)}{dt}$, satisfying

$$L_f\tau^i(x,t) = \varphi(\tau^i(x,t),h(x,t),t) \qquad \forall x \in \mathcal{X}_a(t) \tag{10}$$

This says (very approximately) that $\varphi$ is nothing but the image of the vector field $f$ under the change of coordinates $(x,t) \mapsto (\tau^i(x,t),t)$, but again all this given $h$. As partly obtained in the optimization approach, the observer dynamics are then a copy of the model dynamics with possibly a correction term which is zero when the estimated state reproduces the measurement.

If moreover the functions $h$ and $\tau$ are uniformly continuous in $x$ and $\xi$, respectively, then, given $\xi_1$ and $\xi_2$, a distance between $\Xi((x,\xi_1),t,s)$ and $\Xi((x,\xi_2),t,s)$ goes to zero as $s$ goes to infinity. This property is related to what was called extreme stability (see Yoshizawa 1966) in the 1950s and 1960s and is called incremental stability today (see Angeli 2002). It holds when, denoting by $\Xi_y(\xi,t,s)$ the solution at time $s$ of the observer dynamics

$$\dot\xi(t) = \varphi(\xi(t),y(t),t)$$

going through $\xi$ at time $t$ and under the action of $y$, the flow $\xi \mapsto \Xi_y(\xi,t,s)$ is a strict contraction (see Jouffroy (2005) for a bibliography on contraction) for each $s > t$ or, at least, when a distance between any two solutions $\Xi_y(\xi_1,t,s)$ and $\Xi_y(\xi_2,t,s)$, with the same input $y$, converges to 0.

Sufficient Conditions 

Knowing now what a convergent observer should look like, we move to a quick description of some such observers.

Observers Based on Contraction

Since the flow generated by the observer should be a contraction, we may start its design by picking the function $\varphi$ as

$$\dot\xi(t) = \varphi(\xi(t),y(t),t) = A\xi(t) + B(y(t),t)$$

where $A$, not related to $f$, is a matrix whose eigenvalues have strictly negative real part. Under weak restrictions, there exists a function $\tau^i$ satisfying (10), namely,

$$L_f\tau^i(x,t) = A\tau^i(x,t) + B(h(x,t),t) . \tag{11}$$

To obtain a convergent observer, it is then sufficient that there exists a (uniformly continuous) function $\tau$ satisfying

$$x = \tau(\tau^i(x,t),h(x,t),t) .$$

For this to be possible, the function $\tau^i$ should be injective given $h$. This injectivity holds when the observer state has dimension $m \ge 2(n+1)$, the model is distinguishable, and provided the eigenvalues of $A$ have a sufficiently negative real part and are not in a set of zero Lebesgue measure.

Unfortunately, we are facing again a possible difficulty in the implementation since an expression for a function $\tau^i$ satisfying (11) is needed and the function $\tau : (\xi,y,t) \mapsto \hat x(t)$ is known only implicitly as

$$\xi = \tau^i(\hat x(t),t) .$$

See Andrieu and Praly (2006), Luenberger (1964), and Shoshitaishvili (1990).

Observers Based on Instantaneous 
Distinguishability 

Instantaneous distinguishability means that we can distinguish two model states as quickly as we want by looking at the paths of the measurements they generate. A sufficient condition for this property can be obtained by looking at the Taylor expansion in $s$ of $h(X(x,t,s),s)$. Indeed, we have:

$$h(X(x,t,s),s) = \sum_{i=0}^{m-1} h_i(x,t)\,\frac{(s-t)^i}{i!} + o\big((s-t)^{m-1}\big)$$

where $h_i$ is a function obtained recursively as

$$h_0(x,t) = h(x,t), \qquad h_{i+1}(x,t) = \dot h_i(x,t) = \frac{\partial h_i}{\partial x}(x,t)\,f(x,t) + \frac{\partial h_i}{\partial t}(x,t) .$$

If there exists an integer $m$ such that, in some uniform way with respect to $t$, the function

$$x \mapsto H_m(x,t) = (h_0(x,t), \ldots, h_{m-1}(x,t))$$

is injective, then we do have instantaneous distinguishability. We say the system is differentially observable of order $m$ when this injectivity property holds. When a system has such a property, the model state space has a very specific structure as discussed in Isidori (1995, Section 1.9). It means that we can reconstruct $x$ from the knowledge of $y$ and its $m-1$ first time derivatives, i.e., there exists a function $\Phi$ such that we have:

$$x = \Phi(H_m(x,t),t) .$$

This way, we are left with estimating the derivatives of $y$. This can be done as follows. With the notation $\eta_i = h_{i-1}(x,t)$, we obtain:






$$\dot\eta(t) = F\eta(t) + G\,h_m(\Phi(\eta(t),t),t)$$

where

$$F\eta = (\eta_2, \ldots, \eta_m, 0), \qquad G = (0, \ldots, 0, 1) .$$

When the last term on the right-hand side is Lipschitz, we can find a convergent observer in the form:

$$\dot\xi(t) = F\xi(t) + G\,h_m(\hat x(t),t) + K(y(t) - \xi_1(t)), \qquad \hat x(t) = \tau(\xi(t),t),$$

with $\xi$ being actually an estimate of $\eta$, and where $K$ is a constant matrix and $\tau$ is a modified version of $\Phi$ keeping the estimated state in its a priori given set $\mathcal{X}(t)$.

This is the high-gain observer paradigm. See Gauthier and Kupka (2001) and Tornambe (1988). The implementation difficulty is in the function $\Phi$, not to mention sensitivity to measurement uncertainty.
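
The high-gain structure above can be illustrated on a small example. The following is a minimal sketch, not taken from the cited references: it assumes a pendulum-like model $\dot x_1 = x_2$, $\dot x_2 = -\sin x_1$, $y = x_1$, for which $h_0 = x_1$, $h_1 = x_2$, $h_2 = -\sin x_1$ and $\Phi$ is the identity, so that $m = 2$. The gain parametrization $K = (2/\epsilon, 1/\epsilon^2)$, the forward Euler integration, the function name, and the omission of any saturation in $\tau$ are all illustrative assumptions.

```python
import math


def simulate_high_gain_observer(eps=0.05, dt=1e-3, t_end=5.0):
    """Euler simulation of a pendulum model and a high-gain observer.

    Assumed model (for illustration only):
        x1' = x2, x2' = -sin(x1), y = x1.
    Observer: xi' = F xi + G h_2(xi) + K (y - xi_1), estimate x_hat = xi.
    Returns the list of estimation error norms at each step.
    """
    k1, k2 = 2.0 / eps, 1.0 / eps**2   # linear error poles both at -1/eps
    x = [1.0, 0.0]                     # true state, unknown to the observer
    xi = [0.0, 0.0]                    # observer state, wrong initial guess
    errors = []
    for _ in range(int(round(t_end / dt))):
        errors.append(math.hypot(x[0] - xi[0], x[1] - xi[1]))
        innovation = x[0] - xi[0]      # y(t) - xi_1(t)
        dx = (x[1], -math.sin(x[0]))
        dxi = (xi[1] + k1 * innovation,
               -math.sin(xi[0]) + k2 * innovation)
        x = [x[0] + dt * dx[0], x[1] + dt * dx[1]]
        xi = [xi[0] + dt * dxi[0], xi[1] + dt * dxi[1]]
    errors.append(math.hypot(x[0] - xi[0], x[1] - xi[1]))
    return errors
```

Decreasing `eps` speeds up convergence but also amplifies any noise added to `y`, which is the sensitivity to measurement uncertainty mentioned above.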

Observers with $\tau$ Bijective Given $h$

Case Where $\tau$ Is the Identity Function A convergent observer whose function $\tau$ is the identity has the following form:

$$\dot\xi(t) = f(\xi(t),t) + E(\xi(t),y(t),t), \qquad \hat x(t) = \xi(t). \tag{12}$$

The only piece remaining to be designed is the correction term $E$. It has to ensure convergence and possibly other properties as well, such as symmetry preservation (see Bonnabel et al. 2008).

For this design, a first step is to exhibit some specific properties of the vector field $f$ by writing it in appropriate coordinates. For example, there may exist coordinates such that the expression of $f$ takes the form $f(x(t),h(x,t),t)$ and the corresponding observer (12) is such that there exists a positive definite matrix $P$ for which the function

$$s \mapsto \big(X(x,t,s) - \hat X((x,\hat x),t,s)\big)^T P\,\big(X(x,t,s) - \hat X((x,\hat x),t,s)\big),$$

where $\hat X$ denotes the corresponding observer solution, is strictly decaying (if not zero). A necessary condition for this to be possible is that $f$ is monotonic tangentially to the level sets of the function $h$, i.e., for all $(x,y,v,t)$ satisfying $y = h(x,t)$ and $\frac{\partial h}{\partial x}(x,t)\,v = 0$, we have:

$$v^T P\,\frac{\partial f}{\partial x}(x,y,t)\,v \le 0. \tag{13}$$

This is another way of expressing a detectability condition. This expression is coordinate dependent, hence the importance of choosing the coordinates properly.

When this condition is strict and uniform in $t$, it is sufficient to get a locally convergent observer and even a nonlocal one when $h$ is linear in $x$, i.e., $h(x,t) = H(t)x$, again a coordinate-dependent condition. In this latter case the observer takes the form

$$\dot\xi(t) = f(\xi(t),y(t),t) + \ell(\xi(t))\,P^{-1}H(t)^T\,[y(t) - H(t)\xi(t)], \qquad \hat x(t) = \xi(t),$$

where $\ell$ is a real function to be chosen with sufficiently large values. If (13) is strict and uniform and holds for all $v$, the correction term is not needed.

There are many other results of this type, exploiting one or another specific feature of the dependence on $x$ of the function $f$ (monotonicity, convexity, and so on). See Fan and Arcak (2003), Krener and Isidori (1983), Respondek et al. (2004), and Sanfelice and Praly (2012), among others.

Case Where $(x,t) \mapsto (\tau^i(x,t),h(x,t),t)$ Is a Diffeomorphism At each time $t$ we know already that the model state $x$ we want to estimate satisfies $y(t) = h(x,t)$. So, as remarked in Luenberger (1964), when $(h(x,t),t)$ can be used as part of coordinates for $(x,t)$, we need to estimate the remaining part only. This can be done if we find a function $\tau^i$, whose values are $n - p$ dimensional, such that $(x,t) \mapsto (y,\eta,t) = (h(x,t),\tau^i(x,t),t)$ is a diffeomorphism and the flow $\eta \mapsto \mathrm{H}_y(\eta,t,s)$ generated by

$$\dot\eta(t) = \frac{\partial \tau^i}{\partial x}(x,t)\,f(x,t) + \frac{\partial \tau^i}{\partial t}(x,t), \quad \text{with } x \text{ expressed in terms of } (y(t),\eta(t),t),$$

is a strict contraction for all $s > t$. Indeed, in this case the observer dynamics can be chosen as

$$\dot\xi(t) = \frac{\partial \tau^i}{\partial x}(\hat x(t),t)\,f(\hat x(t),t) + \frac{\partial \tau^i}{\partial t}(\hat x(t),t)$$

and the estimate $\hat x(t)$ is obtained as solution of

$$\tau^i(\hat x(t),t) = \xi(t), \qquad h(\hat x(t),t) = y(t).$$

This is the reduced-order observer paradigm. See, for instance, Besançon (2000, Proposition 3.2), Carnevale et al. (2008), and Luenberger (1964, Theorem 4).


Cross-References 

► Differential Geometric Methods in Nonlinear 
Control 

► Observers in Linear Systems Theory 

► Regulation and Tracking of Nonlinear Systems

Abstract

Observers are dynamical systems which process the input and output signals of a given dynamical system and deliver an online estimate of the internal state of the given system which asymptotically converges to the exact value of the state. For linear, finite-dimensional, time-invariant systems, observers can be designed provided a weak observability property, known as detectability, holds.


Keywords

Linear systems; Observers; Reduced order observer; State estimation


Introduction

Consider a linear, finite-dimensional, time-invariant system described by equations of the form

$$\sigma x = Ax + Bu, \qquad y = Cx + Du, \tag{1}$$

with $x(t) \in \mathbb{R}^n$, $u(t) \in \mathbb{R}^m$, $y(t) \in \mathbb{R}^p$ and $A$, $B$, $C$, and $D$ matrices of appropriate dimensions and with constant entries, and the problem of estimating its state from measurements of the input and output signals. In Eq. (1) $\sigma x(t)$ stands for $\dot x(t)$, if the system is continuous-time, and for $x(t+1)$, if the system is discrete-time. In addition, if the system is continuous-time, then $t \in \mathbb{R}^+$, i.e., the set of nonnegative real numbers, whereas if the system is discrete-time, then $t \in \mathbb{Z}^+$, i.e., the set of nonnegative integers.

We are interested in determining an online estimate $x_e(t) \in \mathbb{R}^n$, i.e., the estimate at time $t$ has to be a function of the available information (input and output) at the same time instant. This implies that the estimate is generated by means of a device (known as filter) processing the current input and output of the system and generating a state estimate. The filter may be instantaneous, i.e., the estimate is generated instantaneously by processing the available information. In this case we have a static filter. Alternatively, the state estimate can be generated by processing the available information through a dynamical device. In this case we have a dynamic filter.

Assume, for simplicity, that $D = 0$. This assumption is without loss of generality. In fact, if $y = Cx + Du$ and $u$ are measurable, then $y - Du = Cx$ is also measurable. Assume, in addition, that the filter which generates the online estimate is linear, finite-dimensional, and time-invariant. Then we may have the following two configurations:

• Static filter. The state estimate is generated via the relation

$$x_e = My + Nu, \tag{2}$$

with $M$ and $N$ constant matrices of appropriate dimensions. The resulting interconnected system is described by the equations

$$\sigma x = Ax + Bu, \qquad x_e = MCx + Nu. \tag{3}$$

• Dynamic filter. The state estimate is generated by the system

$$\sigma\xi = F\xi + Ly + Hu, \qquad x_e = M\xi + Ny + Pu, \tag{4}$$

with $F$, $L$, $H$, $M$, $N$, and $P$ constant matrices of appropriate dimensions. The resulting interconnected system is described by the equations

$$\sigma x = Ax + Bu, \qquad \sigma\xi = F\xi + LCx + Hu, \qquad x_e = M\xi + NCx + Pu. \tag{5}$$

In what follows we study in detail the dynamic filter configuration. This is mainly due to the fact that this configuration allows us to solve most estimation problems for linear systems. Moreover, while the use of a static filter is very appealing, it provides a useful alternative only in very specific situations.

State Observer

A state observer is a filter that allows one to estimate, asymptotically or in finite time, the state of a system from measurements of the input and output signals.

The simplest possible observer can be constructed considering a copy of the system, the state of which has to be estimated. This means that a candidate observer for system (1) is given by

$$\sigma\xi = A\xi + Bu, \qquad x_e = \xi. \tag{6}$$

To assess the properties of this candidate state observer, let $e = x - x_e$ be the estimation error and note that $\sigma e = Ae$. As a result, if $e(0) = 0$, then $e(t) = 0$ for all $t$ and for any input signal $u$. However, if $e(0) \ne 0$, then, for any input signal $u$, $e(t)$ is bounded only if the system (1) is stable and converges to zero only if the system (1) is asymptotically stable. If these conditions do not hold, the estimation error is not bounded and system (6) does not qualify as a state observer for system (1). The intrinsic limitation of the observer (6) is that it does not use all the available information, i.e., it does not use the knowledge of the output signal $y$. This observer is therefore an open-loop observer.

To exploit the knowledge of $y$, we modify the observer (6) by adding a term which depends upon the available information on the estimation error, which is given by $y_e = Cx_e - y$. This modification yields a candidate state observer described by

$$\sigma\xi = A\xi + Bu + Ly_e, \qquad x_e = \xi. \tag{7}$$

To assess the properties of this candidate state observer, note that $e = x - x_e$ is such that

$$\sigma e = (A + LC)e. \tag{8}$$

The matrix $L$ (known as output injection gain) can be used to shape the dynamics of the estimation error. In particular, we may select $L$ to assign the characteristic polynomial $p(s)$ of $A + LC$. To this end, note that

$$p(s) = \det(sI - (A + LC)) = \det(sI - (A' + C'L')).$$

Hence, there is a matrix $L$ which arbitrarily assigns the characteristic polynomial of $A + LC$ if and only if the system

$$\sigma\xi = A'\xi + C'v$$

is reachable or, equivalently, if and only if the system (1) is observable.
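
For a two-state, single-output system, the assignment of the characteristic polynomial of $A + LC$ can be carried out by hand. The sketch below is illustrative only: it assumes $C = [1\ \ 0]$, the helper names `observer_gain_2x2` and `char_poly_2x2` are hypothetical, and the numerical matrix $A$ is an arbitrary unstable example rather than anything from the text.

```python
def observer_gain_2x2(A, d1, d0):
    """Return L = (l1, l2) such that A + L C, with C = [1, 0], has
    characteristic polynomial s^2 + d1*s + d0.

    Hypothetical helper for a 2-state, single-output system. It requires
    A[0][1] != 0, which is exactly observability of the pair (A, C).
    """
    (a11, a12), (a21, a22) = A
    if a12 == 0:
        raise ValueError("(A, C) is not observable: A[0][1] must be nonzero")
    l1 = -d1 - a11 - a22                      # fixes the trace to -d1
    l2 = ((a11 + l1) * a22 - d0) / a12 - a21  # fixes the determinant to d0
    return l1, l2


def char_poly_2x2(M):
    """Coefficients (c1, c0) of det(sI - M) = s^2 + c1*s + c0."""
    (m11, m12), (m21, m22) = M
    return -(m11 + m22), m11 * m22 - m12 * m21


A = [[0.0, 1.0], [2.0, -1.0]]                    # eigenvalues 1 and -2
l1, l2 = observer_gain_2x2(A, d1=7.0, d0=12.0)   # target (s + 3)(s + 4)
A_LC = [[A[0][0] + l1, A[0][1]], [A[1][0] + l2, A[1][1]]]
```

With these numbers the error dynamics $\sigma e = (A + LC)e$ are asymptotically stable in continuous time even though the system itself is unstable. The two equations for `l1` and `l2` have a single solution, in line with the uniqueness of $L$ for single-output systems.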

We summarize the above discussion with two 
formal statements. 

Proposition 1 Consider system (1) and suppose the system is observable. Let $p(s)$ be a monic polynomial of degree $n$. Then there is a matrix $L$ such that the characteristic polynomial of $A + LC$ is equal to $p(s)$. Note that for single-output systems, the matrix $L$ assigning the characteristic polynomial of $A + LC$ is unique.

Proposition 2 System (1) is observable if and only if it is possible to arbitrarily assign the eigenvalues of $A + LC$.

Detectability

The main goal of a state observer is to provide an online estimate of the state of a system. This goal may be achieved, as discussed in the previous section, if the system is observable. However, observability is not necessary to achieve this goal: in fact the unobservable modes are not modified by the output injection gain. This implies that there exists a matrix $L$ such that system (8) is asymptotically stable if and only if the unobservable modes of system (1) have negative real part, in the case of continuous-time systems, or have modulus smaller than one, in the case of discrete-time systems. To capture this situation, we introduce a new definition.

Definition 1 (Detectability) System (1) is detectable if its unobservable modes have negative real part, in the case of continuous-time systems, or have modulus smaller than one, in the case of discrete-time systems.

Example 1 (Deadbeat observer) Consider a discrete-time system described by equations of the form

$$x(t+1) = Ax(t) + Bu(t), \qquad y(t) = Cx(t),$$

and the problem of designing a state observer, described by the equation (7), such that, for any initial condition $x(0)$ and for any $u$, $e(k) = 0$ for all $k \ge N$ and for some $N > 0$. A state observer achieving this goal is called a deadbeat state observer. To achieve this goal, it is necessary to select $L$ such that $(A + LC)^N = 0$ or, equivalently, such that the matrix $A + LC$ has all eigenvalues equal to 0. Note that $N \le n$.
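
As a concrete illustration of Example 1, the sketch below runs the observer (7) on a discrete-time double integrator. The matrices, the initial conditions, the input sequence, and the function name are assumptions chosen for illustration. The gain $L = (-2, -1)'$ makes $A + LC$ nilpotent for $C = [1\ \ 0]$, so the error vanishes after $N = n = 2$ steps.

```python
def simulate_deadbeat_observer(steps=4):
    """Run observer (7) on a discrete-time double integrator, C = [1, 0].

    Returns the estimation errors e(t) = x(t) - x_e(t) for t = 0..steps-1.
    """
    A = [[1.0, 1.0], [0.0, 1.0]]
    B = [0.5, 1.0]
    L = [-2.0, -1.0]      # A + L C = [[-1, 1], [-1, 1]], whose square is 0
    x = [3.0, -1.0]       # true initial state (unknown to the observer)
    xi = [0.0, 0.0]       # observer initial state
    errors = []
    for t in range(steps):
        errors.append((x[0] - xi[0], x[1] - xi[1]))
        u = 1.0 if t % 2 == 0 else -1.0   # arbitrary input signal
        y_e = xi[0] - x[0]                # y_e = C x_e - y
        x_next = [A[0][0] * x[0] + A[0][1] * x[1] + B[0] * u,
                  A[1][0] * x[0] + A[1][1] * x[1] + B[1] * u]
        xi_next = [A[0][0] * xi[0] + A[0][1] * xi[1] + B[0] * u + L[0] * y_e,
                   A[1][0] * xi[0] + A[1][1] * xi[1] + B[1] * u + L[1] * y_e]
        x, xi = x_next, xi_next
    return errors
```

The error sequence does not depend on the input, since $u$ enters the system and the observer in the same way and cancels in $e$.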


Reduced Order Observer 

We have shown that, under the hypotheses of observability or detectability, it is possible to design an asymptotic observer of order $n$ for the system (1). However, this observer is somewhat oversized, i.e., it gives an estimate for the $n$ components of the state vector, without making use of the fact that some of these components can be directly determined from the output function, e.g., if $y = x_1$ there is no need to reconstruct $x_1$. It makes sense, therefore, to design a reduced order observer, i.e., a device that estimates only the part of the state vector which is not directly attainable from the output. To this end consider the system (1) with $D = 0$ and assume that the matrix $C$ has $p$ independent rows. This is the case if rank $C = p$, whereas if rank $C < p$ it is always possible to eliminate redundant rows. Then there exists a matrix $Q$ such that, possibly after reordering the state variables,

$$QC = [\, I \ \ C_2 \,].$$

Let

$$v = Qy = QCx = x_1 + C_2 x_2,$$

in which $x_1(t) \in \mathbb{R}^p$ and $x_2(t) \in \mathbb{R}^{n-p}$ denote the first $p$ and the last $n - p$ components of $x(t)$. Observe that the vector $v$ is measurable.

From the definition of $v$, we conclude that if $v$ and $x_2$ are known, then $x_1$ can be easily computed, i.e., there is no need to construct an observer for $x_1$.

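Define now the new coordinates

$$\hat x = \begin{bmatrix} \hat x_1 \\ \hat x_2 \end{bmatrix} = Tx = \begin{bmatrix} I & C_2 \\ 0 & I \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

and note that, by construction, $v = Qy = \hat x_1$ and $\hat x_2 = x_2$. In the new coordinates, the system, with output $v$, is described by equations of the form

$$\sigma\hat x_1 = A_{11}\hat x_1 + A_{12}x_2 + B_1 u, \qquad \sigma x_2 = A_{21}\hat x_1 + A_{22}x_2 + B_2 u, \qquad v = \hat x_1,$$

where $A_{ij}$ and $B_i$ denote the blocks of $TAT^{-1}$ and $TB$, respectively.

To construct an observer for $x_2$, consider the system

$$\sigma\xi = F\xi + Hv + Gu,$$

with state $\xi$, driven by $u$ and $v$, and with output

$$w = \xi + Lv.$$

The idea is to select the matrices $F$, $H$, $G$, and $L$ in such a way that $w$ be an estimate for $x_2$. Let $w - x_2$ be the observation error. Then

$$\begin{aligned} \sigma w - \sigma x_2 &= F\xi + Hv + Gu + L\,[A_{11}\hat x_1 + A_{12}x_2 + B_1 u] - [A_{21}\hat x_1 + A_{22}x_2 + B_2 u] \\ &= F\xi + (H + LA_{11} - A_{21})\,\hat x_1 + (LA_{12} - A_{22})\,x_2 + (G + LB_1 - B_2)\,u. \end{aligned} \tag{9}$$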


To have convergence of the estimation error to zero, regardless of the initial conditions and of the input signal, we must have

$$\sigma(w - x_2) = F(w - x_2) \tag{10}$$

and $F$ must have all eigenvalues with negative real part, in the case of continuous-time systems, or with modulus smaller than one, in the case of discrete-time systems. Comparing Eqs. (9) and (10), we obtain that the matrices $F$, $H$, $G$, and $L$ must be such that

$$LA_{12} - A_{22} = -F, \qquad H + LA_{11} - A_{21} = FL, \qquad G + LB_1 - B_2 = 0.$$

We now show how the previous equations can be solved and how the stability condition of $F$ can be enforced. Detectability of the system implies that the reduced system $\sigma\zeta = A_{22}\zeta$ with output $A_{12}\zeta$ is detectable. As a result, there exists a matrix $L$ such that the matrix

$$F = A_{22} - LA_{12}$$

has all eigenvalues with negative real part, in the case of continuous-time systems, or with modulus smaller than one, in the case of discrete-time systems. Then the remaining equations are solved by

$$H = FL - LA_{11} + A_{21}, \qquad G = -LB_1 + B_2.$$

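Finally, from $\hat x_1 = v$ and the estimate $w$ of $x_2$, we build an estimate of the state $x$ by inverting the transformation $T$, i.e.,

$$\begin{bmatrix} x_{1e} \\ x_{2e} \end{bmatrix} = \begin{bmatrix} I & -C_2 \\ 0 & I \end{bmatrix} \begin{bmatrix} v \\ w \end{bmatrix}.$$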
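
The design steps above can be sketched numerically for the smallest case, $n = 2$ and $p = 1$ with $y = x_1$, so that $Q = 1$, $C_2 = 0$, and $T = I$. Everything specific in this sketch is an assumption for illustration: the numerical values of the blocks, the gain $L = 4$, the input $u(t) = \sin t$, the forward Euler discretization of a continuous-time system, and the function name.

```python
import math


def simulate_reduced_order_observer(L=4.0, dt=1e-3, t_end=2.0):
    """Euler simulation of a first-order observer for x2 when y = x1.

    Returns F and the list of observation errors w - x2 at each step.
    """
    # Blocks of the (assumed) system; the open-loop system is unstable.
    A11, A12, A21, A22 = 0.0, 1.0, 2.0, -1.0
    B1, B2 = 0.0, 1.0
    # Observer matrices obtained from the matching conditions.
    F = A22 - L * A12
    H = F * L - L * A11 + A21
    G = -L * B1 + B2
    x1, x2 = 1.0, -1.0          # true state
    xi = 0.0                    # observer state
    errors = []
    for k in range(int(round(t_end / dt))):
        v = x1                  # measured output
        w = xi + L * v          # estimate of x2
        errors.append(w - x2)
        u = math.sin(k * dt)
        dx1 = A11 * x1 + A12 * x2 + B1 * u
        dx2 = A21 * x1 + A22 * x2 + B2 * u
        dxi = F * xi + H * v + G * u
        x1, x2, xi = x1 + dt * dx1, x2 + dt * dx2, xi + dt * dxi
    errors.append(xi + L * x1 - x2)
    return F, errors
```

Because the system and the observer are discretized in the same way, the simulated error obeys $e_{k+1} = (1 + dt\,F)\,e_k$ exactly, the discrete counterpart of Eq. (10), and it decays regardless of the input and of the instability of the system.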


Summary and Future Directions 

The problem of estimating the state of a linear system from input and output measurements can be solved provided a weak observability condition holds. The problem addressed in this entry is the simplest possible estimation problem: the underlying system is linear and all variables are exactly measured. Observers for nonlinear systems and in the presence of signals corrupted by noise can also be designed exploiting some of the basic ingredients, such as the notions of error system and of output injection, discussed in this entry.


Cross-References 

► Controllability and Observability 

► Estimation, Survey on 

► Hybrid Observers 

► Kalman Filters 

► Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions

► Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions

► Observer-Based Control

► Observers for Nonlinear Systems 

► State Estimation for Batch Processes 


Recommended Reading 

Classical references on observers for linear systems are given below.

Abstract

There are very natural close connections between mechanics and optimal control, as both involve variational problems. This is a huge subject and we just touch on some interesting connections here. A survey and history may be found in Sussmann and Willems (1997). Other aspects may be found in Bloch (2003).


Keywords 

Nonholonomic integrator; Sub-Riemannian optimal control; Variational problems


Variational Nonholonomic Systems 
and Optimal Control 

Variational nonholonomic problems (i.e., constrained variational problems) are equivalent to optimal control problems under certain regularity conditions. This issue was investigated in Bloch and Crouch (1994), employing the classical results of Rund (1966) and Bliss (1930), which relate classical constrained variational problems to Hamiltonian flows, although not optimal control problems. We outline the simplest relationship and refer to Bloch (2003) for more details.

Let $Q$ be a smooth manifold and $TQ$ its tangent bundle with coordinates $(q^i, \dot q^i)$. Let $L : TQ \to \mathbb{R}$ be a given smooth Lagrangian and let $\Phi : TQ \to \mathbb{R}^{n-m}$ be a given smooth function. We consider the classical Lagrange problem:

$$\min_{q(\cdot)} \int_0^T L(q,\dot q)\,dt \tag{1}$$

subject to the fixed endpoint conditions $q(0) = 0$, $q(T) = q_T$ and subject to the constraints

$$\Phi(q,\dot q) = 0.$$

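Consider a modified Lagrangian $\Lambda(q,\dot q,\lambda) = L(q,\dot q) + \lambda \cdot \Phi(q,\dot q)$ with Euler-Lagrange equations

$$\frac{d}{dt}\frac{\partial \Lambda}{\partial \dot q}(q,\dot q,\lambda) - \frac{\partial \Lambda}{\partial q}(q,\dot q,\lambda) = 0, \qquad \Phi(q,\dot q) = 0. \tag{2}$$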

We can rewrite this equation in Hamiltonian form and show that the resulting equations are equivalent to the equations of motion given by the maximum principle for a suitable optimal control problem. Set $p = \frac{\partial \Lambda}{\partial \dot q}(q,\dot q,\lambda)$ and consider this equation together with the constraints $\Phi(q,\dot q) = 0$. We can solve these two equations for $(\dot q,\lambda)$ under suitable conditions, as discussed in Bloch (2003), writing the solution for the velocity as $\dot q = \phi(q,p)$. We obtain the standard Hamiltonian equations with $H(q,p) = p \cdot \phi(q,p) - L(q,\phi(q,p))$.

We now compare this to the optimal control problem

$$\min_{u(\cdot)} \int_0^T g(q,u)\,dt \tag{3}$$

subject to $q(0) = 0$, $q(T) = q_T$, $\dot q = f(q,u)$, where $u \in \mathbb{R}^m$ and $f$, $g$ are smooth functions. Then we have the following:


Theorem 1 The Lagrange problem and optimal control problem generate the same (regular) extremal trajectories, provided that:

(i) $\Phi(q,\dot q) = 0$ if and only if there exists a $u$ such that $\dot q = f(q,u)$.

(ii) $L(q,f(q,u)) = g(q,u)$.


For the proof and more details, see Bloch (2003). 
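
As a quick sanity check of conditions (i) and (ii), consider the following instance with $n = 3$ and $m = 2$. It is a sketch chosen for illustration: the coordinates $q = (x,y,z)$, the control names $(u,v)$, and the particular $\Phi$, $f$, $L$, and $g$ are assumptions, picked to match the Heisenberg/nonholonomic integrator.

```latex
\begin{aligned}
&q = (x,y,z), \qquad
\Phi(q,\dot q) = \dot z - y\,\dot x + x\,\dot y, \qquad
L(q,\dot q) = \dot x^2 + \dot y^2, \\
&f(q,(u,v)) = (u,\; v,\; y\,u - x\,v), \qquad
g(q,(u,v)) = u^2 + v^2. \\[4pt]
&\text{(i)}\quad \Phi(q,\dot q) = 0
\iff \dot z = y\,\dot x - x\,\dot y
\iff \dot q = f(q,(u,v)) \text{ with } (u,v) = (\dot x,\dot y). \\
&\text{(ii)}\quad L(q,f(q,(u,v))) = u^2 + v^2 = g(q,(u,v)).
\end{aligned}
```

Here $\Phi$ takes values in $\mathbb{R} = \mathbb{R}^{n-m}$, so a single constraint removes one of the three velocity directions and leaves the two controlled ones.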


The n-Dimensional Rigid Body

An interesting mechanical example is the n-dimensional rigid body. See Manakov (1976) and Ratiu (1980).

One can introduce a related system which we 
will call the symmetric representation of the rigid 
body ; see Bloch et al. (2002). 

By definition, the left invariant representation of the symmetric rigid body system is given by the first-order equations

$$\dot Q = Q\Omega; \qquad \dot P = P\Omega \tag{4}$$

where $Q, P \in SO(n)$ and where $\Omega$ is regarded as a function of $Q$ and $P$ via the equations

$$\Omega := J^{-1}(M) \in \mathfrak{so}(n) \quad \text{and} \quad M := Q^T P - P^T Q.$$

One can check that differentiating $M$ yields the classical form of the n-dimensional rigid body equations. For more on the precise relationship, see Bloch et al. (2002).

Now we can link the symmetric representation of the rigid body equations with the theory of optimal control. This work, developed in Bloch and Crouch (1996) and more generally in Bloch et al. (2002), has been further extended to optimal control problems for the infinitesimal generators of group actions (so-called Clebsch optimal control problems) in Gay-Balmaz and Ratiu (2011) and Bloch et al. (2011, 2013) and even further to a class of embedded control problems in Bloch et al. (2011, 2013).

Let $T > 0$ and $Q_0, Q_T \in SO(n)$ be given and fixed. Let the rigid body optimal control problem be given by

$$\min_{U \in \mathfrak{so}(n)} \frac{1}{4} \int_0^T \langle U, J(U) \rangle\,dt \tag{5}$$

subject to the constraint on $U$ that there be a curve $Q(t) \in SO(n)$ such that

$$\dot Q = QU, \qquad Q(0) = Q_0, \qquad Q(T) = Q_T. \tag{6}$$

Proposition 1 The rigid body optimal control problem has optimal evolution equations (4) where $P$ is the costate vector given by the maximum principle.

The optimal controls in this case are given by

$$U = J^{-1}(Q^T P - P^T Q). \tag{7}$$


Kinematic Sub-Riemannian Optimal 
Control Problems 

Optimal control of underactuated kinematic systems gives rise to very interesting mechanical systems.

The problem is referred to as sub-Riemannian in that it gives rise to a geodesic flow with respect to a singular metric (see the work of Strichartz (1983, 1987) and Montgomery (2002) and references therein). This problem has an interesting history in control theory (see Brockett 1973, 1981; Baillieul 1975). See also Bloch et al. (1994) and Sussmann (1996) and further references below.

We consider control systems of the form

$$\dot x = \sum_{i=1}^{m} X_i u_i, \qquad x \in M, \quad u \in \Omega \subset \mathbb{R}^m, \tag{8}$$

where $\Omega$ contains an open subset that contains the origin, $M$ is a smooth manifold of dimension $n$, and each of the vector fields in the collection $F := \{X_1, \ldots, X_m\}$ is complete.

We assume that the system satisfies the accessibility rank condition and is thus controllable, since there is no drift term. Then we can pose the optimal control problem

$$\min_{u(\cdot)} \int_0^T \frac{1}{2} \sum_{i=1}^{m} u_i^2(t)\,dt \tag{9}$$

subject to the dynamics (8) and the endpoint conditions $x(0) = x_0$ and $x(T) = x_T$. These problems were studied by Griffiths (1983) from the constrained variational viewpoint and from the optimal control viewpoint by Brockett (1981, 1983). In the sub-Riemannian geodesic problem, abnormal extremals play an important role. See work by Strichartz (1983), Montgomery (1994, 1995), Sussmann (1996), and Agrachev and Sarychev (1996).

Example: Optimal Control and a Particle 
in a Magnetic Field The control analysis 
of the Heisenberg model or nonholonomic 
integrator goes back to Brockett (1981) and 
Baillieul (1975), while a modern treatment of 
the relationship with a particle in a magnetic 
field may be found in Montgomery (1993), for 
example. A nice treatment of the pure mechanical 
aspects of a particle in a magnetic field may be 
found in Marsden and Ratiu (1999). 

The Heisenberg optimal control equations are a particular case of planar charged particle motion in a magnetic field. This may be seen by considering the slightly more general problem below.

We now consider the optimal control problem

$$\min \int (u^2 + v^2)\,dt \tag{10}$$

subject to the equations

$$\dot x = u, \qquad \dot y = v, \qquad \dot z = A_1 u + A_2 v, \tag{11}$$

where $A_1(x,y)$ and $A_2(x,y)$ are smooth functions of $x$ and $y$. The choice $A_1 = y$ and $A_2 = -x$ recovers the Heisenberg/nonholonomic integrator equations. More generally, we get the flow of a particle in a magnetic field; it is not hard to carry out the optimal control analysis to see this. Details are in Bloch (2003).
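
The optimal control analysis can be sketched as follows for normal extremals. This is an illustrative derivation rather than the treatment in Bloch (2003): the costate names $p_x, p_y, p_z$, the symbol $B$, and the sign convention (Hamiltonian equal to costate times dynamics minus running cost) are assumptions introduced here.

```latex
\begin{aligned}
&\text{Hamiltonian:} &
H &= p_x u + p_y v + p_z (A_1 u + A_2 v) - (u^2 + v^2), \\
&\text{Stationarity in } (u,v): &
2u &= p_x + p_z A_1, \qquad 2v = p_y + p_z A_2, \\
&\text{Costate equations:} &
\dot p_x &= -p_z \left( \tfrac{\partial A_1}{\partial x} u + \tfrac{\partial A_2}{\partial x} v \right), \quad
\dot p_y = -p_z \left( \tfrac{\partial A_1}{\partial y} u + \tfrac{\partial A_2}{\partial y} v \right), \quad
\dot p_z = 0, \\
&\text{Differentiating } 2u,\ 2v: &
2\ddot x &= p_z B\,\dot y, \qquad 2\ddot y = -p_z B\,\dot x, \qquad
B := \tfrac{\partial A_1}{\partial y} - \tfrac{\partial A_2}{\partial x}.
\end{aligned}
```

These are the planar equations of a charged particle under a Lorentz force, with $B$ in the role of the magnetic field and the constant $p_z$ proportional to the charge. For $A_1 = y$ and $A_2 = -x$ one gets the constant value $B = 2$, so the projections of these extremals on the $(x,y)$ plane are arcs of circles, or straight lines when $p_z = 0$.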


Cross-References 

► Discrete Optimal Control 

► Optimal Control and Pontryagin’s Maximum 
Principle 

► Optimal Control with State Space 
Constraints 

► Singular Trajectories in Optimal Control

Optimal Control and Pontryagin's Maximum Principle

Richard B. Vinter 
Imperial College, London, UK 


Abstract 

Pontryagin’s Maximum Principle is a collection of conditions that must be satisfied by solutions of a class of optimization problems involving dynamic constraints called optimal control problems. It unifies many classical necessary conditions from the calculus of variations. This article provides an overview of the Maximum Principle, including free-time and nonsmooth versions. A time-optimal control problem is solved as an example to illustrate its application.


Keywords

Dynamic constraints; Hamiltonian system; Maximum principle; Nonlinear systems; Optimization


Optimal Control 

A widely used framework for studying minimization problems, encountered in the optimal selection of flight trajectories and other areas of advanced engineering design and operation involving dynamic constraints, is to view them as special cases of the problem:

$$(P) \quad \begin{cases} \text{Minimize } J(x(\cdot),u(\cdot)) := \int_0^T L(t,x(t),u(t))\,dt + g(x(0),x(T)) \\ \text{over measurable functions } u(\cdot) : [0,T] \to \mathbb{R}^m \text{ and} \\ \text{absolutely continuous functions } x(\cdot) : [0,T] \to \mathbb{R}^n \text{ satisfying} \\ \dot x(t) = f(t,x(t),u(t)) \ \text{a.e.,} \\ u(t) \in \Omega \ \text{a.e.,} \\ (x(0),x(T)) \in C, \end{cases}$$

the data for which comprise a number $T > 0$, functions $f : [0,T] \times \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$, $L : [0,T] \times \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$, and $g : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$, and sets $C \subset \mathbb{R}^n \times \mathbb{R}^n$ and $\Omega \subset \mathbb{R}^m$.

It is assumed that the set $C$ has the functional inequality and equality constraint set representation

$$C = \{(x_0,x_1) \in \mathbb{R}^n \times \mathbb{R}^n : \phi^i(x_0,x_1) \le 0 \text{ for } i = 1,2,\ldots,k_1 \text{ and } \psi^i(x_0,x_1) = 0 \text{ for } i = 1,2,\ldots,k_2\}, \tag{1}$$

in which $\phi^i : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$, $i = 1,\ldots,k_1$ and $\psi^i : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$, $i = 1,\ldots,k_2$ are given functions.

A control function is a measurable function $u(\cdot) : [0,T] \to \mathbb{R}^m$ satisfying $u(t) \in \Omega$ a.e. $t \in [0,T]$. A state trajectory $x(\cdot)$ associated with a control function $u(\cdot)$ is a solution to the differential equation $\dot x(t) = f(t,x(t),u(t))$. A pair of functions $(x(\cdot),u(\cdot))$ comprising a control function $u(\cdot)$ and an associated state trajectory $x(\cdot)$ satisfying the condition $(x(0),x(T)) \in C$ is a feasible process. A feasible process $(\bar x(\cdot),\bar u(\cdot))$ which achieves the minimum of $J(x(\cdot),u(\cdot))$ over all feasible processes is called a minimizer.

Frequently, the initial state is fixed, i.e., $C$ takes the form

$$C = \{x_0\} \times C_1 \quad \text{for some } x_0 \in \mathbb{R}^n \text{ and some } C_1 \subset \mathbb{R}^n.$$

In this case, (P) is a minimization problem over control functions. Allowing freedom in the choice of initial state, however, introduces a flexibility into the formulation which is useful in some applications.

Optimization problems involving dynamic constraints (such as, but not exclusively, those expressed as controlled differential equations) are known as optimal control problems. Various frameworks are available for studying such problems. (P) is of special importance, since it embraces a wide range of significant dynamic optimization problems which are beyond the reach of traditional variational techniques and, at the same time, it is well suited to the derivation of general necessary conditions of optimality.


The Maximum Principle 

The centerpiece of optimal control theory is a set of conditions that a minimizer $(\bar{x}(\cdot), \bar{u}(\cdot))$ must satisfy, known as Pontryagin’s Maximum Principle or, simply, the Maximum Principle. It came to prominence through a 1961 book, which appeared in English translation as Pontryagin et al. (1962). It bears the name of L. S. Pontryagin because of his role as leader of the research group at the Steklov Institute, Moscow, which achieved this advance, although the first proof is attributed to Boltyanskii. For given $\lambda \ge 0$, define the Hamiltonian function $H_\lambda : [0, T] \times \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$:

$$H_\lambda(t, x, p, u) := p^T f(t, x, u) - \lambda L(t, x, u).$$

Theorem 1 (The Maximum Principle) Let $(\bar{x}(\cdot), \bar{u}(\cdot))$ be a minimizer for (P). Assume that the following hypotheses are satisfied:

(i) $g$ is continuously differentiable.

(ii) $\phi^i$, $i = 1, \ldots, k_1$, and $\psi^i$, $i = 1, \ldots, k_2$, are continuously differentiable.

(iii) With $\tilde{f}(t, x, u) = (L(t, x, u), f(t, x, u))$, $\tilde{f}(\cdot, \cdot, \cdot)$ is continuous, $\tilde{f}(t, \cdot, u)$ is continuously differentiable for each $(t, u)$, and there exist $\epsilon > 0$ and $k(\cdot) \in L^1$ such that

$$|\tilde{f}(t, x, u) - \tilde{f}(t, x', u)| \le k(t)\,|x - x'|$$

for all $x, x' \in \mathbb{R}^n$ such that $|x - \bar{x}(t)| \le \epsilon$ and $|x' - \bar{x}(t)| \le \epsilon$, and $u \in \Omega$, a.e. $t \in [0, T]$.

(iv) $\Omega$ is a Borel set.

Then, there exist a number $\lambda$ ($\lambda = 0$ or $1$), an absolutely continuous arc $p : [0, T] \to \mathbb{R}^n$, numbers $\alpha^i \ge 0$ for $i = 1, \ldots, k_1$ and numbers $\beta^i$ for $i = 1, \ldots, k_2$ satisfying

$$(p(\cdot), \lambda, \{\alpha^i\}, \{\beta^i\}) \ne (0, 0, \{0, \ldots, 0\}, \{0, \ldots, 0\})$$

and such that the following conditions are satisfied:






The Adjoint Equation:

$$-\dot{p}(t) = \frac{\partial f}{\partial x}^T(t, \bar{x}(t), \bar{u}(t))\,p(t) - \lambda\,\frac{\partial L}{\partial x}^T(t, \bar{x}(t), \bar{u}(t)) \quad \text{a.e.},$$

The Maximization of the Hamiltonian Condition:

$$H_\lambda(t, \bar{x}(t), p(t), \bar{u}(t)) = \max_{u \in \Omega} H_\lambda(t, \bar{x}(t), p(t), u) \quad \text{a.e.},$$

The Transversality Condition:

$$(p^T(0), -p^T(T)) = \lambda \nabla g(\bar{x}(0), \bar{x}(T)) + \sum_{i=1}^{k_1} \alpha^i \nabla \phi^i(\bar{x}(0), \bar{x}(T)) + \sum_{i=1}^{k_2} \beta^i \nabla \psi^i(\bar{x}(0), \bar{x}(T))$$

and $\alpha^i = 0$ for all $i \in \{1, \ldots, k_1\}$ such that $\phi^i(\bar{x}(0), \bar{x}(T)) < 0$, in which

$$\nabla h(x_0, x_1) := \left[ \frac{\partial}{\partial x_0} h(x_0, x_1), \ \frac{\partial}{\partial x_1} h(x_0, x_1) \right]. \qquad (2)$$

If the functions $L(t, x, u)$ and $f(t, x, u)$ are independent of $t$, then also

Constancy of the Hamiltonian for Autonomous Problems:

$$H_\lambda(\bar{x}(t), p(t), \bar{u}(t)) = c \quad \text{a.e.}$$

for some constant $c$.

We allow the cases $k_1 = 0$ (no inequality constraints) and $k_2 = 0$ (no equality endpoint constraints). In the first case, the non-degeneracy condition becomes $(p(\cdot), \lambda, \{\beta^i\}) \ne (0, 0, \{0, \ldots, 0\})$ and the summation involving the $\alpha^i$’s is dropped from the transversality condition. The second case, or any combination of the two cases, is treated similarly.

Derivation of the costate equation and boundary conditions. A simple way to derive the differential equations for the $p_i(\cdot)$’s is, first, to construct the Hamiltonian $H_\lambda(t, x, p, u) = p^T f(t, x, u) - \lambda L(t, x, u)$ and, second, to use the fact that the $i$th component $p_i(\cdot)$ of the costate $p(t) = [p_1(t), \ldots, p_n(t)]^T$ satisfies the equation:

$$-\dot{p}_i(t) = \frac{\partial}{\partial x_i} H_\lambda(t, \bar{x}(t), p(t), \bar{u}(t)) \quad \text{for } i = 1, \ldots, n.$$

The preceding equations are of course merely a component-wise statement of the costate (adjoint) equation above. In many applications the endpoint constraints take the form

$$x_i(0) = \xi_0^i \ \text{for } i \in J_0 \ \text{and} \ x_i(0) \in \mathbb{R} \ \text{for } i \notin J_0,$$
$$x_i(T) = \xi_1^i \ \text{for } i \in J_1 \ \text{and} \ x_i(T) \in \mathbb{R} \ \text{for } i \notin J_1,$$

for given index sets $J_0, J_1 \subset \{1, \ldots, n\}$ and scalars $\xi_0^i$ for $i \in J_0$ and $\xi_1^i$ for $i \in J_1$, i.e., the endpoints of each state trajectory component are either “fixed” or “free.” In such cases the rules for setting up the boundary conditions on the $p_i(\cdot)$’s are

$$p_i(0) \in \mathbb{R} \ \text{for } i \in J_0 \ \text{and} \ p_i(0) = \lambda \frac{\partial}{\partial x_{0i}} g(\bar{x}(0), \bar{x}(T)) \ \text{for } i \notin J_0,$$
$$p_i(T) \in \mathbb{R} \ \text{for } i \in J_1 \ \text{and} \ -p_i(T) = \lambda \frac{\partial}{\partial x_{1i}} g(\bar{x}(0), \bar{x}(T)) \ \text{for } i \notin J_1,$$

i.e., if $x_i(0)$ (respectively $x_i(T)$) is fixed, then $p_i(0)$ (respectively $p_i(T)$) is free, and if $x_i(0)$ (respectively $x_i(T)$) is free, then $p_i(0)$ (respectively $p_i(T)$) is fixed.
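As a worked illustration of these rules, consider a hypothetical problem (chosen here only for the calculation, not taken from the article) with $n = 2$, $m = 1$, dynamics $\dot{x}_1 = x_2$, $\dot{x}_2 = -x_1 + u$, running cost $L = \tfrac{1}{2}(x_1^2 + u^2)$ and $g \equiv 0$. Both components of $x(0)$ are fixed, $x_1(T)$ is fixed and $x_2(T)$ is free, so $J_0 = \{1, 2\}$ and $J_1 = \{1\}$. Constructing the Hamiltonian and differentiating component by component gives:

```latex
\begin{align*}
H_\lambda(t,x,p,u) &= p_1 x_2 + p_2(-x_1 + u) - \tfrac{\lambda}{2}\,(x_1^2 + u^2),\\
-\dot p_1(t) &= \frac{\partial H_\lambda}{\partial x_1} = -p_2(t) - \lambda\,\bar x_1(t),\\
-\dot p_2(t) &= \frac{\partial H_\lambda}{\partial x_2} = p_1(t),\\
p_1(0),\ p_2(0),\ p_1(T) &\ \text{free}, \qquad
-p_2(T) = \lambda\,\frac{\partial g}{\partial x_{12}}(\bar x(0),\bar x(T)) = 0 .
\end{align*}
```

The only costate boundary value that gets pinned down is $p_2(T) = 0$, matching the single free state endpoint $x_2(T)$.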

The optimal control problem (P) is a generalization of the following problem in the calculus of variations:

Minimize $\int_0^T L(t, x(t), \dot{x}(t))\,dt$ over absolutely continuous arcs $x(\cdot) : [0, T] \to \mathbb{R}^n$ satisfying

$$(x(0), x(T)) = (a, b),$$

for given $L : [0, T] \times \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ and $(a, b) \in \mathbb{R}^n \times \mathbb{R}^n$. This problem is a special case of (P) in which $f(t, x, u) = u$, $\Omega = \mathbb{R}^n$, $k_1 = 0$, $k_2 = 2n$ and

$$\big( (\psi^1(x_0, x_1), \ldots, \psi^n(x_0, x_1)), \ (\psi^{n+1}(x_0, x_1), \ldots, \psi^{2n}(x_0, x_1)) \big) = (x_0^T - a^T, \ x_1^T - b^T).$$

It is a straightforward exercise to deduce from the Maximum Principle, in this special case, that a minimizer satisfies the classical Euler-Lagrange and Weierstrass conditions and also that the minimizer and associated costate arc satisfy Hamilton’s system of equations, under an additional uniform convexity hypothesis on $L(t, x, \cdot)$. Thus, the Maximum Principle unifies many of the classical necessary conditions from the calculus of variations and, furthermore, validates them under reduced hypotheses. It also has far-reaching implications beyond these conditions, because it allows the presence of pathwise constraints on the velocities, expressed in terms of a controlled differential equation and a control constraint set, which are encountered in engineering design, econometrics, and other areas.
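The exercise mentioned above can be sketched as follows, under the extra assumption (made only for this sketch) that $L$ is continuously differentiable in all its arguments. With $f(t, x, u) = u$ and $\Omega = \mathbb{R}^n$, the case $\lambda = 0$ is excluded: maximizing $p^T u$ over all of $\mathbb{R}^n$ forces $p \equiv 0$, and the transversality condition then makes every multiplier vanish. Taking $\lambda = 1$ and writing $\nabla_v L$ for the gradient of $L$ in its velocity argument:

```latex
\begin{align*}
H_1(t,x,p,u) &= p^T u - L(t,x,u),\\
-\dot p(t) &= \nabla_x H_1 = -\nabla_x L(t,\bar x(t),\dot{\bar x}(t))
  && \text{(adjoint equation)},\\
0 &= \nabla_u H_1\big|_{u=\dot{\bar x}(t)} = p(t) - \nabla_v L(t,\bar x(t),\dot{\bar x}(t))
  && \text{(unconstrained maximum over } \Omega=\mathbb{R}^n),\\
\frac{d}{dt}\,\nabla_v L(t,\bar x(t),\dot{\bar x}(t)) &= \nabla_x L(t,\bar x(t),\dot{\bar x}(t))
  && \text{(Euler-Lagrange)},\\
L(t,\bar x(t),v) - L(t,\bar x(t),\dot{\bar x}(t)) &\ge p^T(t)\,\big(v - \dot{\bar x}(t)\big)
  \quad \text{for all } v\in\mathbb{R}^n
  && \text{(Weierstrass)}.
\end{align*}
```

The Euler-Lagrange equation follows by combining the first two lines, and the Weierstrass condition is the maximization of the Hamiltonian condition rewritten with $p(t) = \nabla_v L(t, \bar{x}(t), \dot{\bar{x}}(t))$.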

The Hamiltonian System 

In favorable circumstances, we are justified in setting the cost multiplier $\lambda = 1$ and, furthermore, the maximization of the Hamiltonian condition permits us, for each $t$, to express $u$ as a function of $x$ and $p$:

$$u = u^*(t, x, p).$$

The Maximum Principle now asserts that a minimizing arc $\bar{x}(\cdot)$ is the first component of a pair of absolutely continuous functions $(\bar{x}(\cdot), p(\cdot))$ satisfying Hamilton’s system of equations:

$$(-\dot{p}^T(t), \dot{\bar{x}}^T(t)) = \nabla_{x,p} H_1(t, \bar{x}(t), p(t), u^*(t, \bar{x}(t), p(t))) \quad \text{a.e.}, \qquad (4)$$

in which $\nabla_{x,p} H_1$ denotes the gradient of $H_1(t, x, p, u)$ w.r.t. the vector $[x^T, p^T]^T$ variable for fixed $(t, u)$, together with the endpoint conditions

$$(\bar{x}(0), \bar{x}(T)) \in C \quad \text{and} \quad (p^T(0), -p^T(T)) = \lambda \nabla g(\bar{x}(0), \bar{x}(T)) + \sum_{i=1}^{k_1} \alpha^i \nabla \phi^i(\bar{x}(0), \bar{x}(T)) + \sum_{i=1}^{k_2} \beta^i \nabla \psi^i(\bar{x}(0), \bar{x}(T)),$$

for some nonnegative numbers $\{\alpha^i\}$ and numbers $\{\beta^i\}$ satisfying

$$\alpha^i = 0 \ \text{for all } i \in \{1, \ldots, k_1\} \ \text{such that} \ \phi^i(\bar{x}(0), \bar{x}(T)) < 0,$$

where $\nabla g$, $\nabla \phi^i$, $\nabla \psi^i$, etc., are as defined in (2). The minimizing control satisfies the relation

$$\bar{u}(t) = u^*(t, \bar{x}(t), p(t)).$$
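Hamilton's system (4) with its split endpoint conditions can be attacked numerically by shooting: guess the missing initial costate, integrate forward, and adjust the guess until the terminal condition holds. The sketch below is one possible illustration, not the article's method. The problem data are hypothetical: minimize $\int_0^1 \tfrac{1}{2}(x^2 + u^2)\,dt$ subject to $\dot{x} = u$, $x(0) = 1$, $x(1)$ free, $\Omega = \mathbb{R}$, $g = 0$, so that $u^* = p$ and the transversality condition is $p(1) = 0$. The helper names are likewise assumptions.

```python
import math


def rk4_step(rhs, t, z, h):
    """One classical Runge-Kutta step for z' = rhs(t, z), with z a tuple."""
    def shifted(scale, direction):
        return tuple(zi + scale * di for zi, di in zip(z, direction))
    k1 = rhs(t, z)
    k2 = rhs(t + h / 2, shifted(h / 2, k1))
    k3 = rhs(t + h / 2, shifted(h / 2, k2))
    k4 = rhs(t + h, shifted(h, k3))
    return tuple(zi + h / 6 * (a + 2 * b + 2 * c + d)
                 for zi, a, b, c, d in zip(z, k1, k2, k3, k4))


def hamilton_rhs(t, z):
    """Hamilton's system for the illustrative problem.

    H_1 = p*u - (x^2 + u^2)/2 is maximized at u* = p, so
    x' = dH/dp = u* = p   and   -p' = dH/dx = -x, i.e. p' = x.
    """
    x, p = z
    return (p, x)


def terminal_costate(p0, x0=1.0, T=1.0, steps=200):
    """Integrate (x, p) forward from (x0, p0) and return p(T)."""
    z, h = (x0, p0), T / steps
    for k in range(steps):
        z = rk4_step(hamilton_rhs, k * h, z, h)
    return z[1]


def shoot(lo=-2.0, hi=0.0, tol=1e-12):
    """Bisection on the unknown p(0) so that the transversality condition p(T) = 0 holds."""
    f_lo = terminal_costate(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = terminal_costate(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


p0 = shoot()
print(p0, -math.tanh(1.0))   # numerical p(0) next to the closed-form value
```

For this linear example the exact solution is $x(t) = \cosh(1-t)/\cosh(1)$, $p(t) = -\sinh(1-t)/\cosh(1)$, so the computed $p(0)$ can be compared with $-\tanh(1)$. Bisection is used only because it is simple; the bracket $[-2, 0]$ is specific to this example.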

Notice that the first-order vector differential equation (4) is a system of $2n$ scalar, first-order differential equations. Let us suppose that $k_1$ inequality endpoint constraints are active at $(\bar{x}(0), \bar{x}(T))$. Then, satisfaction of the active constraints and the transversality condition impose $2n + k_1 + k_2$ conditions on the boundary values of $(\bar{x}(\cdot), p(\cdot))$. Taking account of the fact, however, that there are $k_1 + k_2$ unknown endpoint multipliers, we see that the effective number of endpoint constraints accompanying the differential equation (4) is

$$2n + k_1 + k_2 - (k_1 + k_2) = 2n.$$

Thus, the set of $2n$ scalar first-order differential equations (4) defining the “two-point boundary value problem” to determine $(\bar{x}, p)$ has the “right” number of endpoint conditions.


Refinements

Free-Time Problems: Consider a variant on the “autonomous” case of problem (P) ($L$ and $f$ do not depend on $t$), call it (FT), in which the terminal time $T$ is no longer fixed, but is a choice variable along with the control function and the initial state, and the cost function is

$$J(T, x(\cdot), u(\cdot)) := \int_0^T L(x(t), u(t))\,dt + g(T, x(0), x(T))$$

for some function $g : \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$. Take a minimizer $(\bar{T}, \bar{x}(\cdot), \bar{u}(\cdot))$ for (FT). Assume, in addition to hypotheses (i)-(iii), that $\Omega$ is bounded and the function $k(\cdot)$ in (iii) is bounded. Then the Maximum Principle conditions (for data in which the end time is frozen at $T = \bar{T}$) continue to be satisfied for some $p(\cdot) : [0, \bar{T}] \to \mathbb{R}^n$ and $\lambda$, including the constancy of the Hamiltonian condition

$$H_\lambda(\bar{x}(t), p(t), \bar{u}(t)) = c \quad \text{a.e. } t \in [0, \bar{T}]$$

for some constant $c$. But a new condition is required to reflect the extra degree of freedom in the new problem specification, namely, the free end time. This is an additional transversality condition involving the constant value $c$ of the Hamiltonian:

Free Time Transversality Condition: $c = \lambda\,\frac{\partial}{\partial T} g(\bar{T}, \bar{x}(0), \bar{x}(\bar{T}))$.

Other Refinements: Versions of the Maximum Principle are available to take account of pathwise functional inequality constraints on state variables (“pure” state constraints) and on both state and control variables (“mixed” constraints). Maximum Principle-like conditions have also been derived for optimal control problems in which the dynamic constraint takes the form of a retarded differential equation with control terms and in which the class of control functions is enlarged to include Dirac delta functions (“impulse” optimal control problems).


The Nonsmooth Maximum Principle 

In early derivations of the Maximum Principle, it was assumed that the functions $f(t, x, u)$ and $L(t, x, u)$ were continuously differentiable with respect to the $x$ variable. A major research endeavor since the early 1970s has been to find versions of the Maximum Principle that remain valid when the functions $f(t, x, u)$ and $L(t, x, u)$ satisfy merely a “bounded slope” or, synonymously, a Lipschitz continuity condition with respect to $x$. Such functions are “nonsmooth” in the sense that they can fail to be differentiable, in the conventional sense, at some points in their domains. An overview of the Maximum Principle would be incomplete without reference to such advances.

The search for nonsmooth optimality conditions is motivated by a desire to solve optimal control problems where, in particular, the function $f(t, x, u)$ is a piecewise linear function of $x$ (for fixed $(t, u)$). Such functions arise, for example, when $f(t, x, u)$ is constructed empirically via a lookup table and linear interpolation. Nonsmooth cost integrands are encountered when they are constructed using “pointwise” supremum and/or “absolute value” operations. The function

$$J(x(\cdot)) = \int_0^1 |x(t)|\,dt + \max\{x(1), 0\},$$

which penalizes the $L^1$ norm of the state trajectory and the terminal value of the scalar state, but only if this is nonnegative, is a case in point.

When attempting to generalize the Maximum Principle to allow for nonsmooth data, we encounter the challenge of interpreting the adjoint equation, which can be written as

$$-\dot{p}(t) = \frac{\partial}{\partial x} H_\lambda(t, \bar{x}(t), p(t), \bar{u}(t)),$$

in circumstances when the $x$-gradients of $f$ and $L$ are not defined, at least not in a conventional sense. One approach to dealing with this problem is via the Clarke generalized gradient $\partial m(x)$ of a function $m : \mathbb{R}^n \to \mathbb{R}$ at a point $x$:

$$\partial m(x) := \mathrm{co}\,\{\xi : \text{there exist sequences } x^i \to x,\ \xi^i \to \xi \text{ such that, for each } i,\ m(\cdot) \text{ is Fréchet differentiable at } x^i \text{ and } \xi^i = \nabla m(x^i)\}.$$

Here, “co” means closed convex hull. In a landmark paper (Clarke 1976), Clarke proved a necessary condition commonly referred to as the nonsmooth Maximum Principle, in which the adjoint equation is replaced by a differential inclusion involving the (partial) generalized gradient $\partial_x H_\lambda(t, \bar{x}(t), p(t), \bar{u}(t))$ of $H_\lambda(t, \cdot, p(t), \bar{u}(t))$ w.r.t. $x$, evaluated at $\bar{x}(t)$, namely,

$$-\dot{p}^T(t) \in \partial_x H_\lambda(t, \bar{x}(t), p(t), \bar{u}(t)) \quad \text{a.e. } t \in [0, T].$$

This formulation of the “adjoint inclusion” for the nonsmooth Maximum Principle and the unrestrictive hypotheses under which it is derived in that paper remain state of the art.
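For a scalar function the generalized gradient is an interval, which makes the definition easy to explore numerically. The sketch below is a sampling heuristic written for illustration, not a rigorous computation: it collects finite-difference slopes at points near $x$ (skipping $x$ itself) and reports their convex hull. The function name `clarke_gradient_1d` and its parameters are hypothetical.

```python
def clarke_gradient_1d(m, x, radius=1e-3, samples=200):
    """Approximate the Clarke generalized gradient of a scalar function at x.

    In one dimension the convex hull of limiting derivatives is an interval,
    so the function returns (min, max) of central-difference slopes taken at
    sample points near x. The point x itself is skipped because m may fail
    to be differentiable there.
    """
    step = radius / samples
    h = step / 4.0          # the difference stencil never straddles the point x
    slopes = []
    for k in range(1, samples + 1):
        for point in (x - k * step, x + k * step):
            slopes.append((m(point + h) - m(point - h)) / (2.0 * h))
    return min(slopes), max(slopes)


lo, hi = clarke_gradient_1d(abs, 0.0)
print(lo, hi)
```

For the absolute value at the origin the slopes on either side are $-1$ and $+1$, so the returned interval approximates $\partial |\cdot|(0) = [-1, 1]$. At a point of differentiability the interval collapses to the ordinary derivative. The heuristic only sees kinks located at $x$ itself and can be fooled by functions that oscillate near $x$.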


Example 

We illustrate the application of the Maximum Principle with a simple example. It has the following interpretation. A 1 kg mass is located 1 m along the line and has zero velocity. We seek a time $\bar{T} > 0$ (in seconds) which is the minimum over all times $T > 0$ having the property: there exists a time-varying force $u(t)$, $0 \le t \le T$, satisfying

$$-1 \le u(t) \le +1$$

such that, under the action of the force, the mass is located at the origin with zero velocity at time $T$. Note that, in consequence of Newton’s second law, the vector $x(t) = (x_1(t), x_2(t))$ comprising the displacement and velocity of the mass satisfies

$$\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t). \qquad (5)$$

This is a special case of the free-time problem:

Minimize $T$ over times $T > 0$, measurable functions $u(\cdot) : [0, T] \to \mathbb{R}$ and absolutely continuous functions $x(\cdot) : [0, T] \to \mathbb{R}^2$ such that

$$\dot{x}(t) = A x(t) + b u(t) \ \text{a.e.}, \qquad u(t) \in \Omega \ \text{a.e.}, \qquad (x_1(0), x_2(0)) = (1, 0) \ \text{and} \ (x_1(T), x_2(T)) = (0, 0),$$

in which

$$A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad b = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \qquad \text{and} \qquad \Omega = [-1, +1].$$

The (free-time) Maximum Principle provides the following information about a minimizing end time $\bar{T}$, control $\bar{u}(\cdot)$, and corresponding state $\bar{x}(\cdot) = (\bar{x}_1(\cdot), \bar{x}_2(\cdot))$. There exists an arc $p(\cdot) = [p_1(\cdot), p_2(\cdot)]^T$ such that

$$\dot{\bar{x}}_1(t) = \bar{x}_2(t) \ \text{and} \ \dot{\bar{x}}_2(t) = \bar{u}(t), \qquad (6)$$
$$\dot{p}_1(t) = 0 \ \text{and} \ -\dot{p}_2(t) = p_1(t), \qquad (7)$$
$$\bar{u}(t) = \arg\max\{p_2(t)\,u : u \in [-1, +1]\}, \qquad (8)$$
$$(\bar{x}_1, \bar{x}_2)(0) = (1, 0) \ \text{and} \ (\bar{x}_1, \bar{x}_2)(\bar{T}) = (0, 0), \qquad (9)$$
$$p_1(t)\,\bar{x}_2(t) + |p_2(t)| = \lambda \ \text{for all } t. \qquad (10)$$

Condition (8) permits us to express $\bar{u}(\cdot)$ in terms of $p_2(\cdot)$, thus

$$\bar{u}(t) = \mathrm{sign}\{p_2(t)\},$$

and thereby eliminate $\bar{u}(\cdot)$. It can be shown that relations (6)-(10) have a unique solution for $\bar{T}$, $\bar{u}(t)$, $\bar{x}(t)$, $p(t)$ and $\lambda = 0$ or $1$. Furthermore, these relations cannot be satisfied with $\lambda = 0$. The unique solution (with $\lambda = 1$) is

$$(\bar{x}_1(t), \bar{x}_2(t)) = \begin{cases} \left(1 - \tfrac{1}{2} t^2, \ -t\right) & \text{if } t \in [0, 1) \\ \left(\tfrac{1}{2} - (t - 1) + \tfrac{1}{2}(t - 1)^2, \ -1 + (t - 1)\right) & \text{if } t \in [1, 2], \end{cases}$$

$$\bar{u}(t) = \begin{cases} -1 & \text{if } t \in [0, 1) \\ +1 & \text{if } t \in [1, 2], \end{cases}$$

$$p_1(t) = -1 \ \text{and} \ p_2(t) = -1 + t \ \text{for } t \in [0, 2].$$

The Maximum Principle is a necessary condition of optimality. Since a minimizer exists and since $(\bar{T}, \bar{x}(\cdot), \bar{u}(\cdot), p(\cdot))$ is a unique solution to the Maximum Principle relations, it follows that $(\bar{T}, \bar{x}(\cdot), \bar{u}(\cdot))$, with $\bar{T} = 2$, is the solution to the problem.
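The stated solution can be checked numerically. The following sketch (an illustration only, with a hypothetical function name and step count) applies the bang-bang control $u = \mathrm{sign}(p_2)$ with $p_1 = -1$, $p_2(t) = t - 1$, steps the double integrator (5) exactly over each interval of constant control, and records the left-hand side of (10) along the way.

```python
def simulate_bang_bang(steps_per_unit=100):
    """Simulate the double integrator under u = sign(p2), p2(t) = t - 1, on [0, 2]."""
    h = 1.0 / steps_per_unit
    x1, x2 = 1.0, 0.0            # initial displacement and velocity
    p1 = -1.0                    # costate p1 is constant, from (7)
    hamiltonian = []
    for k in range(2 * steps_per_unit):
        t = k * h
        p2 = -1.0 + t
        u = 1.0 if k >= steps_per_unit else -1.0   # switch exactly at t = 1
        hamiltonian.append(p1 * x2 + abs(p2))      # left-hand side of (10)
        # u is constant on each step, so this update is the exact solution of (5)
        x1 += h * x2 + 0.5 * h * h * u
        x2 += h * u
    return (x1, x2), hamiltonian


(x1_T, x2_T), H = simulate_bang_bang()
print(x1_T, x2_T, min(H), max(H))
```

The final state should sit at the origin up to rounding error, and the recorded Hamiltonian values should all equal $\lambda = 1$, in agreement with (9) and (10).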

This problem is amenable to simpler, more elementary, solution techniques. But the above solution is enlightening, because it highlights important generic features of the Maximum Principle. We see how the “maximization of the Hamiltonian condition” can be used to eliminate the control function and thereby to set up a two-point boundary value problem for $x(\cdot)$ and $p(\cdot)$ (a very nonclassical construction).


Cross-References

► Numerical Methods for Nonlinear Optimal Control Problems

► Optimal Control and the Dynamic Programming Principle

► Optimal Control and Mechanics


Optimal Control and the Dynamic Programming Principle

Abstract

This entry illustrates the application of Bellman’s 
dynamic programming principle within the con¬ 
text of optimal control problems for continuous¬ 
time dynamical systems. The approach leads to a 
characterization of the optimal value of the cost 
functional, over all possible trajectories given 
the initial conditions, in terms of a partial dif¬ 
ferential equation called the Hamilton-Jacobi- 
Bellman equation. Importantly, this can be used 
to synthesize the corresponding optimal control 
input as a state-feedback law. 

Keywords 

Continuous-time dynamics; Hamilton-Jacobi- 
Bellman equation; Optimization; Nonlinear 
systems; State feedback 

Introduction 

The dynamic programming principle (DPP) is
a fundamental tool in optimal control theory.
It was largely developed by Richard Bellman
in the 1950s (Bellman 1957) and has since been
applied to various problems in deterministic and 
stochastic optimal control. The goal of optimal 
control is to determine the control function and 
the corresponding trajectory of a dynamical sys¬ 
tem which together optimize a given criterion 
usually expressed in terms of an integral along 
the trajectory (the cost functional) (Fleming and 
Rishel 1975; Macki and Strauss 1982). The func¬ 
tion which associates with the initial condition 
of the dynamical system the optimal value of the 
cost functional among all the possible trajectories 
is called the value function. The most interest¬ 
ing point is that via the dynamic programming 
principle, one can derive a characterization of the 
value function in terms of a nonlinear partial dif¬ 
ferential equation (the Hamilton-Jacobi-Bellman 
equation) and then use it to synthesize a feedback 
control law. This is the major advantage over 
the approach based on the Pontryagin Maximum 
Principle (PMP) (Boltyanskii et al. 1956; Pon¬ 
tryagin et al. 1962). In fact, the PMP merely gives 
necessary conditions for the characterization of 
the open-loop optimal control and of the corre¬ 
sponding optimal trajectory. The DPP has also 
been applied to construct approximation schemes 
for the value function although this approach suf¬ 
fers from the “curse of dimensionality” since one 
has to solve a nonlinear partial differential equa¬ 
tion in a high dimension. Despite the elegance of 
the DPP approach, its practical application is lim¬ 
ited by this bottleneck, and the solution of many 
optimal control problems has been accomplished 
instead via the two-point boundary value problem 
associated with the PMP. 


The Infinite Horizon Problem 

Let us present the main ideas for the classical infinite horizon problem. Let a controlled dynamical system be given by

$$\begin{cases} \dot{y}(s) = f(y(s), \alpha(s)) \\ y(t_0) = x_0 \end{cases} \qquad (1)$$

where $x_0, y(s) \in \mathbb{R}^d$, and

$$\alpha : [t_0, T] \to A \subset \mathbb{R}^m,$$

with $T$ finite or $+\infty$. Under the assumption that the control is measurable, existence and uniqueness properties for the solution of (1) are ensured by the Carathéodory theorem:

Theorem 1 (Carathéodory) Assume that:

1. $f(\cdot, \cdot)$ is continuous.

2. There exists a positive constant $L_f > 0$ such that

$$|f(x, a) - f(y, a)| \le L_f\,|x - y|,$$

for all $x, y \in \mathbb{R}^d$, $t \in \mathbb{R}_+$ and $a \in A$.

3. $f(x, \alpha(t))$ is measurable with respect to $t$.

Then, there is a unique absolutely continuous function $y : [t_0, T] \to \mathbb{R}^d$ that satisfies

$$y(s) = x_0 + \int_{t_0}^{s} f(y(\tau), \alpha(\tau))\,d\tau, \qquad (2)$$

which is interpreted as the solution of (1).

Note that the solution is continuous, but only a.e. differentiable, so it must be regarded as a weak solution of (1). By the theorem above, fixing a control in the set of admissible controls

$$\alpha \in \mathcal{A} := \{\alpha : [t_0, T] \to A, \ \text{measurable}\}$$

yields a unique trajectory of (1) which is denoted by $y_{x_0, t_0}(s; \alpha)$. Changing the control policy generates a family of solutions of the controlled system (1) with index $\alpha$. Since the dynamics (1) are “autonomous,” the initial time $t_0$ can be shifted to 0 by a change of variable. So to simplify the notation for autonomous dynamics, we can set $t_0 = 0$ and we denote this family by $y_{x_0}(s; \alpha)$ (or even write it as $y(s)$ if no ambiguity over the initial state or control arises). It is customary in dynamic programming, moreover, to use the notations $x$ and $t$ instead of $x_0$ and $t_0$ (since $x$ and $t$ appear as variables in the Hamilton-Jacobi-Bellman equation).

Optimal control problems require the introduction of a cost functional $J : \mathcal{A} \to \mathbb{R}$ which is used to select the “optimal trajectory” for (1). In the case of the infinite horizon problem, we set $t_0 = 0$, $x_0 = x$, and this functional is defined as

$$J_x(\alpha) = \int_0^\infty g(y_x(s, \alpha), \alpha(s))\,e^{-\lambda s}\,ds \qquad (3)$$

for a given $\lambda > 0$. The function $g$ represents the running cost and $\lambda$ is the discount factor, which can be used to take into account the reduced value, at the initial time, of future costs. From a technical point of view, the presence of the discount factor ensures that the integral is finite whenever $g$ is bounded. Note that one can also consider the undiscounted problem ($\lambda = 0$) provided the integral is still finite. The goal of optimal control is to find an optimal pair $(y^*, \alpha^*)$ that minimizes the cost functional. If we seek optimal controls in open-loop form, i.e., as functions of $t$, then the Pontryagin Maximum Principle furnishes necessary conditions for a pair $(y^*, \alpha^*)$ to be optimal.

A major drawback of an open-loop control is 
that being constructed as a function of time, it 
cannot take into account errors in the true state 
of the system, due, for example, to model errors 
or external disturbances, which may take the evo¬ 
lution far from the optimal forecasted trajectory. 
Another limitation of this approach is that a new 
computation of the control is required whenever 
the initial state is changed. 

For these reasons, we are interested in the 
so-called feedback controls , that is, controls ex¬ 
pressed as functions of the state of the system. 
Under feedback control, if the system trajectory 
is perturbed, the system reacts by changing its 
control strategy according to the change in the 
state. One of the main motivations for using the 
DPP is that it yields solutions to optimal control 
problems in the form of feedback controls. 

DPP for the Infinite Horizon Problem 

The starting point of dynamic programming is to introduce an auxiliary function, the value function, which for our problem is

$$v(x) = \inf_{\alpha \in \mathcal{A}} J_x(\alpha), \qquad (4)$$

where, as above, $x$ is the initial position of the system. The value function has a clear meaning: it is the optimal cost associated with the initial position $x$. This is a reference value which can be useful to evaluate the efficiency of a control: if $J_x(\bar{\alpha})$ is close to $v(x)$, this means that $\bar{\alpha}$ is “efficient.”
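The comparison between $J_x(\alpha)$ and $v(x)$ can be tried on a toy problem. The sketch below assumes hypothetical scalar data, $f(y, a) = a$ with $A = [-1, 1]$, $g(y, a) = y^2$ and $\lambda = 1$, for which a hand calculation (an assumption of this sketch, not a result from the article) gives $v(x) = x^2 - 2|x| + 2 - 2e^{-|x|}$. Two open-loop controls are evaluated from $x = 1$: one that never moves and one that travels to the origin at full speed and then stops.

```python
import math


def discounted_cost(control, x, lam=1.0, horizon=40.0, dt=1e-3):
    """Approximate J_x(alpha) of (3) for the hypothetical data f(y, a) = a, g(y, a) = y^2.

    The integral is truncated at `horizon` (the neglected tail is of order
    exp(-lam * horizon)) and evaluated with the trapezoidal rule.
    """
    steps = int(round(horizon / dt))
    y, total = x, 0.0
    prev = y * y                       # integrand at s = 0
    for k in range(steps):
        s = k * dt
        y += dt * control(s, y)        # explicit Euler step for y' = a
        cur = y * y * math.exp(-lam * (s + dt))
        total += 0.5 * dt * (prev + cur)
        prev = cur
    return total


def value_function(x):
    """Closed form of v for this example (derived by hand, stated as an assumption)."""
    return x * x - 2.0 * abs(x) + 2.0 - 2.0 * math.exp(-abs(x))


stay = lambda s, y: 0.0                              # never move
go_home = lambda s, y: -1.0 if s < 1.0 else 0.0      # reach the origin at s = 1, then stop

for name, alpha in (("stay", stay), ("go_home", go_home)):
    print(name, discounted_cost(alpha, 1.0), value_function(1.0))
```

The "stay" control has cost close to 1, far from $v(1) = 1 - 2/e \approx 0.264$, whereas "go_home" attains the value function up to discretization error and is therefore efficient in the sense described above.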

Bellman’s dynamic programming principle 
provides a first characterization of the value 
function. 

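Proposition 1 (DPP for the infinite horizon problem) Under the assumptions of Theorem 1, for all $x \in \mathbb{R}^d$ and $\tau > 0$,

$$v(x) = \inf_{\alpha \in \mathcal{A}} \left\{ \int_0^\tau g(y_x(s; \alpha), \alpha(s))\,e^{-\lambda s}\,ds + e^{-\lambda \tau}\,v(y_x(\tau; \alpha)) \right\}. \qquad (5)$$

Proof Denote by $\bar{v}(x)$ the right-hand side of (5). First, we remark that for any $x \in \mathbb{R}^d$ and $\bar{\alpha} \in \mathcal{A}$,

$$\begin{aligned}
J_x(\bar{\alpha}) &= \int_0^\infty g(y(s), \bar{\alpha}(s))\,e^{-\lambda s}\,ds \\
&= \int_0^\tau g(y(s), \bar{\alpha}(s))\,e^{-\lambda s}\,ds + \int_\tau^\infty g(y(s), \bar{\alpha}(s))\,e^{-\lambda s}\,ds \\
&= \int_0^\tau g(y(s), \bar{\alpha}(s))\,e^{-\lambda s}\,ds + e^{-\lambda \tau} \int_0^\infty g(y(s + \tau), \bar{\alpha}(s + \tau))\,e^{-\lambda s}\,ds \\
&\ge \int_0^\tau g(y(s), \bar{\alpha}(s))\,e^{-\lambda s}\,ds + e^{-\lambda \tau}\,v(y(\tau))
\end{aligned}$$

(here, $y_x(s, \bar{\alpha})$ is abbreviated as $y(s)$). Taking the infimum over all trajectories, first over the right-hand side and then the left of this inequality, yields

$$v(x) \ge \bar{v}(x). \qquad (6)$$

To prove the opposite inequality, we recall that $\bar{v}$ is defined as an infimum, and so, for any $x \in \mathbb{R}^d$ and $\varepsilon > 0$, there exists a control $\alpha^\varepsilon$ (and the corresponding evolution $y^\varepsilon$) such that

$$\bar{v}(x) + \varepsilon \ge \int_0^\tau g(y^\varepsilon(s), \alpha^\varepsilon(s))\,e^{-\lambda s}\,ds + e^{-\lambda \tau}\,v(y^\varepsilon(\tau)). \qquad (7)$$

On the other hand, the value function $v$ being also defined as an infimum, for any $x \in \mathbb{R}^d$ and $\varepsilon > 0$, there exists a control whose cost is within $\varepsilon$ of $v(x)$. Applying this at the point $y^\varepsilon(\tau)$ gives a control $\tilde{\alpha}^\varepsilon$ such that

$$v(y^\varepsilon(\tau)) + \varepsilon \ge J_{y^\varepsilon(\tau)}(\tilde{\alpha}^\varepsilon). \qquad (8)$$

Inserting (8) in (7), we get

$$\begin{aligned}
\bar{v}(x) &\ge \int_0^\tau g(y^\varepsilon(s), \alpha^\varepsilon(s))\,e^{-\lambda s}\,ds + e^{-\lambda \tau} J_{y^\varepsilon(\tau)}(\tilde{\alpha}^\varepsilon) - (1 + e^{-\lambda \tau})\,\varepsilon \\
&\ge J_x(\bar{\alpha}) - (1 + e^{-\lambda \tau})\,\varepsilon \\
&\ge v(x) - (1 + e^{-\lambda \tau})\,\varepsilon,
\end{aligned} \qquad (9)$$

where $\bar{\alpha}$ is a control defined by

$$\bar{\alpha}(s) = \begin{cases} \alpha^\varepsilon(s) & 0 \le s < \tau \\ \tilde{\alpha}^\varepsilon(s - \tau) & s \ge \tau. \end{cases} \qquad (10)$$

(Note that $\bar{\alpha}(\cdot)$ is still measurable). Since $\varepsilon$ is arbitrary, (9) finally yields $\bar{v}(x) \ge v(x)$.

We observe that this proof crucially relies on the fact that the control defined by (10) still belongs to $\mathcal{A}$, being a measurable control. The possibility of obtaining an admissible control by joining together two different measurable controls is known as the concatenation property.


The Hamilton-Jacobi-Bellman Equation

The DPP can be used to characterize the value function in terms of a nonlinear partial differential equation. In fact, let $\alpha^* \in \mathcal{A}$ be the optimal control, and $y^*$ the associated evolution (to simplify, we are assuming that the infimum is a minimum). Then,

$$v(x) = \int_0^\tau g(y^*(s), \alpha^*(s))\,e^{-\lambda s}\,ds + e^{-\lambda \tau}\,v(y^*(\tau)),$$

that is,

$$v(x) - e^{-\lambda \tau}\,v(y^*(\tau)) = \int_0^\tau g(y^*(s), \alpha^*(s))\,e^{-\lambda s}\,ds$$

so that adding and subtracting $e^{-\lambda \tau} v(x)$ and dividing by $\tau$, we get

$$e^{-\lambda \tau}\,\frac{v(x) - v(y^*(\tau))}{\tau} + v(x)\,\frac{1 - e^{-\lambda \tau}}{\tau} = \frac{1}{\tau} \int_0^\tau g(y^*(s), \alpha^*(s))\,e^{-\lambda s}\,ds.$$

Assume now that $v$ is regular. By passing to the limit as $\tau \to 0^+$, we have

$$\lim_{\tau \to 0^+} -\frac{v(y^*(\tau)) - v(x)}{\tau} = -Dv(x) \cdot \dot{y}^*(0) = -Dv(x) \cdot f(x, \alpha^*(0)),$$

$$\lim_{\tau \to 0^+} v(x)\,\frac{1 - e^{-\lambda \tau}}{\tau} = \lambda v(x),$$

$$\lim_{\tau \to 0^+} \frac{1}{\tau} \int_0^\tau g(y^*(s), \alpha^*(s))\,e^{-\lambda s}\,ds = g(x, \alpha^*(0)),$$

where we have assumed that $\alpha^*(\cdot)$ is continuous at 0. Then, we can conclude

$$\lambda v(x) - Dv(x) \cdot f(x, a^*) - g(x, a^*) = 0 \qquad (11)$$

where $a^* = \alpha^*(0)$. Similarly, using the equivalent form

$$v(x) + \sup_{\alpha \in \mathcal{A}} \left\{ -\int_0^\tau g(y(s), \alpha(s))\,e^{-\lambda s}\,ds - e^{-\lambda \tau}\,v(y(\tau)) \right\} = 0$$

of the DPP, in which every fixed control yields an inequality, we obtain for any control $\alpha \in \mathcal{A}$ continuous at 0, with $a = \alpha(0)$,

$$\lambda v(x) - Dv(x) \cdot f(x, a) - g(x, a) \le 0, \quad \text{for every } a \in A. \qquad (12)$$

Combining (11) and (12), we obtain the Hamilton-Jacobi-Bellman equation (or dynamic programming equation):

$$\lambda v(x) + \sup_{a \in A} \{-f(x, a) \cdot Dv(x) - g(x, a)\} = 0, \qquad (13)$$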

which characterizes the value function for the infinite horizon problem associated with minimizing (3). Note that given $x$, the value of $a$ achieving the max (assuming it exists) corresponds to the control $a^* = \alpha^*(0)$, and this makes it natural to interpret the argmax in (13) as the optimal feedback at $x$ (see Bardi and Capuzzo Dolcetta (1997) for more details).

In short, (13) can be written as

$$H(x, u, Du) = 0$$

with $x \in \mathbb{R}^d$, and

$$H(x, u, p) = \lambda u + \sup_{a \in A} \{-f(x, a) \cdot p - g(x, a)\}. \qquad (14)$$

Note that $H(x, u, \cdot)$ is convex (being the sup of a family of linear functions) and that $H(x, \cdot, p)$ is monotone (since $\lambda > 0$). It is also easy to see that the solution $u$ need not be differentiable even when $f$ and $g$ are smooth functions (i.e., $f, g \in C^\infty(\mathbb{R}^d \times A)$), so we need to deal with weak solutions of the Bellman equation. This can be done in the framework of viscosity solutions, a theory initiated by Crandall and Lions in the 1980s which has been successfully applied in many areas such as optimal control, fluid dynamics, and image processing (see the books Barles (1994) and Bardi and Capuzzo Dolcetta (1997) for an extended introduction and numerous applications to optimal control). Typically viscosity solutions are Lipschitz continuous solutions so they are differentiable almost everywhere.


An Extension to the Minimum Time Problem

In the minimum time problem, we want to minimize the time of arrival of the state on a given target set $\mathcal{T}$. We will assume that $\mathcal{T} \subset \mathbb{R}^n$ is a closed set. Then our cost functional will be given by

$$J(x, \alpha) = t_x(\alpha)$$

where

$$t_x(\alpha) := \begin{cases} \min\{t : y_x(t, \alpha) \in \mathcal{T}\} & \text{if } y_x(t, \alpha) \in \mathcal{T} \text{ for some } t \ge 0 \\ +\infty & \text{if } y_x(t, \alpha) \notin \mathcal{T} \text{ for any } t \ge 0. \end{cases}$$

The corresponding value function is called the minimum time function

$$T(x) := \inf_{\alpha(\cdot) \in \mathcal{A}} t_x(\alpha(\cdot)). \qquad (15)$$

The main difference with respect to the previous problem is that now the value function $T$ will be finite valued only on a subset $\mathcal{R}$ which depends on the target, on the dynamics, and on the set of admissible controls.

Definition 1 The reachable set $\mathcal{R}$ is defined by

$$\mathcal{R} := \bigcup_{t > 0} \mathcal{R}(t) = \{x \in \mathbb{R}^n : T(x) < +\infty\}$$

where, for $t > 0$, $\mathcal{R}(t) := \{x \in \mathbb{R}^n : T(x) < t\}$.

The meaning is clear: $\mathcal{R}$ is the set of initial points which can be driven to the target in finite time. The system is said to be controllable on $\mathcal{T}$ if for all $t > 0$, $\mathcal{T} \subset \mathrm{int}(\mathcal{R}(t))$ (here, $\mathrm{int}(D)$ denotes the interior of the set $D$). Assuming controllability in a neighborhood of the target one gets the continuity of the minimum time function and, under the assumptions made on $f$, $A$, and $\mathcal{T}$, one can prove some interesting properties:

(i) $\mathcal{R}$ is open.

(ii) $T$ is continuous on $\mathcal{R}$.

(iii) $\lim_{x \to x_0} T(x) = +\infty$, for any $x_0 \in \partial\mathcal{R}$.

Now let us denote by $\chi_D$ the characteristic function of the set $D$. Using in $\mathcal{R}$ arguments similar to the proof of DPP in the previous section one can obtain the following DPP:

Proposition 2 (DPP for the minimum time problem) For any $x \in \mathcal{R}$, the value function satisfies

$$T(x) = \inf_{\alpha \in \mathcal{A}} \left\{ t \wedge t_x(\alpha) + \chi_{\{t \le t_x(\alpha)\}}\,T(y_x(t, \alpha)) \right\} \quad \text{for any } t \ge 0 \qquad (16)$$

and

$$T(x) = \inf_{\alpha \in \mathcal{A}} \left\{ t + T(y_x(t, \alpha)) \right\} \quad \text{for any } t \in [0, T(x)]. \qquad (17)$$

From the previous DPP, one can also obtain the following characterization of the minimum time function.

Proposition 3 Let $\mathcal{R} \setminus \mathcal{T}$ be open and $T \in C(\mathcal{R} \setminus \mathcal{T})$, then $T$ is a viscosity solution of

$$\max_{a \in A} \{-f(x, a) \cdot \nabla T(x)\} = 1, \quad x \in \mathcal{R} \setminus \mathcal{T} \qquad (18)$$

coupled with the natural boundary condition

$$T(x) = 0, \ x \in \partial\mathcal{T}, \qquad \lim_{x \to \partial\mathcal{R}} T(x) = +\infty.$$

By the change of variable $v(x) = 1 - e^{-T(x)}$, one can obtain a simpler problem getting rid of the boundary condition on $\partial\mathcal{R}$ (which is unknown). The new function $v$ will be the unique viscosity solution of an external Dirichlet problem (see Bardi and Capuzzo Dolcetta (1997) for more details), and the reachable set can be recovered a posteriori via the relation $\mathcal{R} = \{x \in \mathbb{R}^n : v(x) < 1\}$.
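A formal calculation shows which equation the transformed function satisfies. It is only a sketch: it assumes $T$ is differentiable at the point considered, whereas the rigorous statement holds in the viscosity sense. Since $1 - v = e^{-T} > 0$ on $\mathcal{R}$, multiplying (18) by $1 - v$ gives:

```latex
\begin{align*}
\nabla v(x) &= e^{-T(x)}\,\nabla T(x) = (1 - v(x))\,\nabla T(x),\\
\max_{a\in A}\{-f(x,a)\cdot\nabla v(x)\}
  &= (1 - v(x))\,\max_{a\in A}\{-f(x,a)\cdot\nabla T(x)\} = 1 - v(x),\\
v(x) + \max_{a\in A}\{-f(x,a)\cdot\nabla v(x)\} &= 1
  \quad \text{in } \mathbb{R}^n\setminus\mathcal{T}, \qquad v = 0 \ \text{on } \partial\mathcal{T}.
\end{align*}
```

This has the form of the infinite horizon equation (13) with $\lambda = 1$ and $g \equiv 1$. As a hypothetical scalar check, take $f(x, a) = a$, $A = [-1, 1]$ and $\mathcal{T} = \{0\}$: then $T(x) = |x|$, $v(x) = 1 - e^{-|x|}$, and for $x \ne 0$ one finds $v + |v'| = 1 - e^{-|x|} + e^{-|x|} = 1$.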


Further Extensions and Related 
Topics 

The DPP has been extended from deterministic control problems to many other problems. In the framework of stochastic control problems where the dynamics are given by a diffusion, the characterization of the value function obtained via the DPP leads to a second-order Hamilton-Jacobi-Bellman equation (Fleming and Soner 1993; Kushner and Dupuis 2001). Another interesting extension has been made in differential games where the DPP is based on the delicate notion of nonanticipative strategies for the players and leads to a nonconvex nonlinear partial differential equation (the Isaacs equation) (Bardi and Capuzzo Dolcetta 1997). For a short introduction to numerical methods based on DP and exploiting the so-called “value iteration,” we refer the interested reader to Appendix A in Bardi and Capuzzo Dolcetta (1997) and to Kushner and Dupuis (2001) (see also the book Howard (1960) for the “policy iteration”).
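To give a flavor of value iteration, here is a minimal sketch for the infinite horizon problem. It is an illustration under stated assumptions rather than the scheme of any of the cited references: the data are hypothetical ($f(x, a) = a$ with $a \in \{-1, 0, +1\}$, $g(x, a) = x^2$, $\lambda = 1$), the fixed-point map is a discrete-time version of the DPP (5) with time step $h$, and $h$ is chosen equal to the grid spacing so that no interpolation is needed.

```python
import math


def value_iteration(n=40, x_max=2.0, lam=1.0, tol=1e-10, max_iter=10000):
    """Iterate v(x) <- min_a { h*g(x) + exp(-lam*h) * v(x + h*a) } on a grid.

    Hypothetical data: f(x, a) = a with a in {-1, 0, +1}, g(x, a) = x^2.
    The time step h equals the grid spacing, so x + h*a is always a grid node.
    """
    h = x_max / n
    xs = [(i - n) * h for i in range(2 * n + 1)]       # grid on [-x_max, x_max]
    v = [0.0] * len(xs)
    policy = [0] * len(xs)
    discount = math.exp(-lam * h)
    for _ in range(max_iter):
        new_v = []
        for i, x in enumerate(xs):
            best, best_a = None, 0
            for a in (-1, 0, 1):
                j = i + a
                if j < 0 or j >= len(xs):              # controls leaving the grid are discarded
                    continue
                candidate = h * x * x + discount * v[j]
                if best is None or candidate < best:
                    best, best_a = candidate, a
            new_v.append(best)
            policy[i] = best_a
        change = max(abs(p - q) for p, q in zip(new_v, v))
        v = new_v
        if change < tol:
            break
    return xs, v, policy


xs, v, policy = value_iteration()
i_one = min(range(len(xs)), key=lambda i: abs(xs[i] - 1.0))   # node closest to x = 1
exact = 1.0 - 2.0 + 2.0 - 2.0 * math.exp(-1.0)                # hand-derived v(1) for this example
print(v[i_one], exact, policy[i_one])
```

The minimizing control stored in `policy` is a state feedback: it points toward the origin on both sides and is zero at the origin. The computed value at $x = 1$ agrees with the hand-derived value $1 - 2/e$ only up to a first-order discretization error in $h$, and the number of grid nodes grows exponentially with the state dimension, which is the "curse of dimensionality" mentioned in the Introduction.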


Cross-References 

► Numerical Methods for Nonlinear Optimal 
Control Problems 

► Optimal Control and Pontryagin’s Maximum
Principle


Abstract

One approach to linear control system design in¬ 
volves the matching of certain input-output mod¬ 
els with respect to a quantification of closed- 
loop performance. The approach is based on a 
parametrization of all stabilizing feedback con¬ 
trollers, which relies on the existence of co¬ 
prime factorizations of the plant model. This 
parametrization and spectral factorization meth¬ 
ods for solving model-matching problems are 
described within the context of impulse-response 
energy and worst-case energy-gain measures of 
controller performance. 

Keywords 

Coprime factorization; $\mathcal{H}_2$ control; $\mathcal{H}_\infty$ control;
Spectral factorization; Youla-Kučera controller
parametrization

Introduction 

Various linear control problems can be formulated in terms of the interconnection shown in Fig. 1; e.g., see Francis and Doyle (1987), Boyd and Barratt (1991), and Zhou et al. (1996). The linear system $K$ is a controller (with input $y$ and output $u - v_1$) to be designed for the generalized plant model $G$. The latter is constructed so that controller performance (i.e., the quality of $K$ relative to specifications) can be quantified as a nonnegative functional of

$$H(G, K) = G_{11} + G_{12} K (I - G_{22} K)^{-1} G_{21}, \qquad (1)$$

which relates the input $w$ and the output $z$ when $v_1 = 0$ and $v_2 = 0$. The objective is to select $K$ to minimize this measure of performance. Alternatively, controllers that achieve a specified upper bound are sought. It is also usual to require internal stability, which pertains to the fictitious signals $v_1$ and $v_2$, as discussed subsequently. The best known examples are $\mathcal{H}_2$ and $\mathcal{H}_\infty$ control problems. In the former, performance is quantified as the energy (resp. power) of $z$ when $w$ is impulsive (resp. unit white noise), and in the latter, as the worst-case energy gain from $w$ to $z$, which can be used to reflect robustness to model uncertainty; see Zhou et al. (1996).

The special case of $G_{22} = 0$ gives rise to a (weighted)
model-matching problem, in that the corresponding performance map
$H(G,K) = G_{11} + G_{12}KG_{21}$ exhibits affine dependence on the
design variable $K$, which is chosen to match $G_{12}KG_{21}$ to
$-G_{11}$ with respect to the scalar quantification of performance. Any
internally stabilizable problem with $G_{22} \neq 0$ can be converted
into a model-matching problem. The key ingredients in this
transformation are coprime factorizations of the plant model. The role
of these and other factorizations in a model-matching approach to
$\mathcal{H}_2$ and $\mathcal{H}_\infty$ control problems is the focus
of this article.

For the sake of argument, finite-dimensional 
linear time-invariant systems are considered via 
real-rational transfer functions in the frequency 
domain, as the existence of all factorizations
employed is well understood in this setting.
Indeed, constructions via state-space realizations 
and Riccati equations are well known. The merits 
of the model-matching approach pursued here 
are at least twofold: (i) the underlying algebraic 
input-output perspective extends to more 
abstract settings, including classes of distributed- 
parameter and time-varying systems (Desoer 
et al. 1980; Vidyasagar 1985; Curtain and Zwart 
1995; Feintuch 1998; Quadrat 2006); and (ii) 
model matching is a convex problem for various 
measures of performance (including mixed 
indexes) and controller constraints. The latter 
can be exploited to devise numerical algorithms 
for controller optimization (Boyd and Barratt 
1991; Dahleh and Diaz-Bobillo 1995; Qi et al. 
2004). 

First, some notation regarding transfer 
functions and two measures of performance 
for control system design is defined. Coprime 
factorizations are then described within the 
context of a well-known parametrization of 
stabilizing controllers, originally discovered 
by Youla et al. (1976) and Kucera (1975). This 
yields an affine parametrization of performance 
maps for problems in standard form, and thus, a 
transformation to a model-matching problem. 
Finally, the role of spectral factorizations in
solving model-matching problems with respect
to impulse-response energy ($\mathcal{H}_2$) and worst-case
energy-gain ($\mathcal{H}_\infty$) measures of performance is
discussed.


Notation and Nomenclature 

$\mathcal{R}$ generically denotes a linear space of matrices having
fixed row and column dimensions, which are not reflected in the notation
for convenience, and entries that are proper real-rational functions of
the complex variable $s$; i.e.,
$\left(\sum_{k=0}^{m} b_k s^k\right) / \left(\sum_{k=0}^{n} a_k s^k\right)$
for sets of real coefficients $\{a_k\}_{k=0}^{n}$ and
$\{b_k\}_{k=0}^{m}$ with $m \le n < \infty$. The compatibility of matrix
dimensions is implicitly assumed henceforth. All matrices in
$\mathcal{R}$ have (nonunique) “state-space” realizations of the form
$C(sI - A)^{-1}B + D$, where $A$, $B$, $C$ and $D$ are real-valued
matrices. This form naturally arises in frequency-domain analysis of the
input-output map associated with the time-domain model
$\dot{x}(t) = Ax(t) + Bu(t)$, with initial condition $x(0) = 0$ and
output equation $y(t) = Cx(t) + Du(t)$, where $\dot{x}$ denotes the time
derivative of $x$ and $u$ is the input. The study of such linear
time-invariant differential equation models via the Laplace transform
and multiplication by real-rational transfer function matrices is
fundamental in linear systems theory (Kailath 1980; Francis 1987; Zhou
et al. 1996). $P \in \mathcal{R}$ has an inverse
$P^{-1} \in \mathcal{R}$ if and only if $\lim_{|s|\to\infty} P(s)$ is a
nonsingular matrix. The superscripts $T$ and $*$ denote the transpose
and complex conjugate transpose. For a matrix $Z = Z^*$ with complex
entries, $Z > 0$ means $z^*Zz \ge \epsilon z^*z$ for some
$\epsilon > 0$ and all complex vectors $z$ of compatible dimension.
$P^{\sim}(s) := P(-s)^T$, whereby $(P(j\omega))^* = P^{\sim}(j\omega)$
for all real $\omega$ with $j := \sqrt{-1}$. Zeros of transfer function
denominators are called poles.

In subsequent sections, several subspaces of $\mathcal{R}$ are used to
define and solve two standard linear control problems. The subspace
$\mathcal{B} \subset \mathcal{R}$ comprises transfer functions that have
no poles on the imaginary axis in the complex plane. For
$P \in \mathcal{B}$, the scalar performance index

$$\|P\|_\infty := \max_{-\infty \le \omega \le \infty} \bar{\sigma}(P(j\omega)) \ge 0$$

is finite; the real number $\bar{\sigma}(Z)$ is the maximum singular
value of the matrix argument $Z$. This index measures the worst-case
energy gain from an input signal $u$ to the output signal $y = Pu$.
Note that $\|P\|_\infty < \gamma$ if and only if
$\gamma^2 I - P^{\sim}(j\omega)P(j\omega) > 0$ for all
$-\infty \le \omega \le \infty$.
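
The worst-case energy gain defined above can be estimated numerically from a state-space realization. The following is a minimal sketch, assuming a frequency grid is fine enough to capture the peak; the function names and the example $P(s) = 1/(s+1)$ are illustrative assumptions, and a grid search only yields a lower estimate of the maximum, not a certified value.

```python
import numpy as np

def freq_response(A, B, C, D, w):
    """Evaluate P(jw) = C (jw I - A)^{-1} B + D at a real frequency w."""
    n = A.shape[0]
    return C @ np.linalg.solve(1j * w * np.eye(n) - A, B) + D

def hinf_norm_grid(A, B, C, D, w_grid):
    """Grid-based lower estimate of the maximum over w of sigma_max(P(jw))."""
    gains = [np.linalg.svd(freq_response(A, B, C, D, w), compute_uv=False)[0]
             for w in w_grid]
    # As |w| -> infinity, P(jw) -> D because P is proper.
    gains.append(np.linalg.svd(D, compute_uv=False)[0])
    return max(gains)

# Hypothetical example: P(s) = 1/(s + 1), whose gain peaks at w = 0.
A = np.array([[-1.0]])
B = np.array([[1.0]])
C = np.array([[1.0]])
D = np.array([[0.0]])
# For real-rational P the singular values are even in w, so w >= 0 suffices.
w_grid = np.concatenate(([0.0], np.logspace(-3, 3, 400)))
gamma_est = hinf_norm_grid(A, B, C, D, w_grid)
```

For sharply resonant systems a grid can miss the peak, so the estimate should be read as a lower bound on $\|P\|_\infty$.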

The subspace $\mathcal{S} \subset \mathcal{B} \subset \mathcal{R}$
consists of transfer functions that have no poles with positive real
part. A transfer function in $\mathcal{S}$ is called stable because the
corresponding input-output map is causal in the time domain, as well as
bounded-in-bounded-out (in various senses). If $P \in \mathcal{S}$ is
such that $P^{\sim}P = I$, then it is called inner. If
$P, P^{-1} \in \mathcal{S}$, then both are called outer.

Let $\mathcal{L}$ denote the subspace of strictly-proper transfer
functions in $\mathcal{B}$; i.e., for all entries of the matrix, the
degree $n$ of the denominator exceeds the degree $m$ of the numerator.
Observe that $P \in \mathcal{L}$ if and only if
$P^{\sim} \in \mathcal{L}$. Moreover, $P_1P_2 \in \mathcal{L}$ and
$P_3P_1 \in \mathcal{L}$ for all $P_1 \in \mathcal{L}$ and
$P_i \in \mathcal{B}$, $i = 2,3$. Now, for
$P_1, P_2 \in \mathcal{L}$, define the inner-product

$$\langle P_1, P_2 \rangle := \frac{1}{2\pi}\int_{-\infty}^{\infty} \mathrm{trace}\big(P_1^{\sim}(j\omega)P_2(j\omega)\big)\,d\omega < \infty$$

and the scalar performance index
$\|P\|_2 := \sqrt{\langle P, P \rangle} \ge 0$ for $P \in \mathcal{L}$.
This index equates to the root-mean-square (energy) measure of the
impulse response and the covariance (power) of the output signal
$y = Pu$, when the input signal $u$ is unit white noise. By the
properties $\mathrm{trace}(Z_1 + Z_2) = \mathrm{trace}(Z_1) + \mathrm{trace}(Z_2)$
and $\mathrm{trace}(Z_1Z_2) = \mathrm{trace}(Z_2Z_1)$ of the matrix
trace, it follows that
$\langle P_1 + P_2, P_3 \rangle = \langle P_1, P_3 \rangle + \langle P_2, P_3 \rangle$
and

$$\langle P_1, P_2P_3 \rangle = \langle P_2^{\sim}P_1, P_3 \rangle = \langle P_1P_3^{\sim}, P_2 \rangle = \langle P_3^{\sim}, P_1^{\sim}P_2 \rangle \quad \text{for } P_i \in \mathcal{L},\ i = 1,2,3. \qquad (2)$$
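
The frequency-domain definition of $\|P\|_2$ can be checked numerically. The sketch below is one possible way to approximate the integral, assuming the substitution $\omega = \tan\theta$ (which maps the real line to a finite interval) and a midpoint rule; the function name and the example $P(s) = 1/(s+1)$, for which $\|P\|_2^2 = 1/2$, are illustrative assumptions rather than part of the original development.

```python
import numpy as np

def h2_norm_quadrature(P, num=20000):
    """Approximate ||P||_2 from (1/2pi) * integral of trace(P(jw)^* P(jw)) dw.

    P maps a complex number s to a 2-D array. The substitution w = tan(theta)
    maps the real line onto (-pi/2, pi/2); a midpoint rule avoids the endpoints.
    """
    d_theta = np.pi / num
    theta = -np.pi / 2 + (np.arange(num) + 0.5) * d_theta
    total = 0.0
    for th in theta:
        Pjw = P(1j * np.tan(th))
        # trace(P^* P) is the squared Frobenius norm; dw = dtheta / cos(theta)^2.
        total += np.sum(np.abs(Pjw) ** 2) / np.cos(th) ** 2
    return np.sqrt(total * d_theta / (2 * np.pi))

# Hypothetical example: P(s) = 1/(s + 1), a strictly proper stable function.
P_example = lambda s: np.array([[1.0 / (s + 1.0)]])
norm_est = h2_norm_quadrature(P_example)
```

The quadrature is only meaningful for strictly proper arguments, in line with the restriction to $\mathcal{L}$ above; otherwise the integrand does not decay and the sum grows with the grid.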


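The (not closed) subspace
$\mathcal{L} \subset \mathcal{B} \subset \mathcal{R}$ can be expressed
as the direct sum $\mathcal{L} = \mathcal{H} \oplus \mathcal{H}_\perp$,
where $\mathcal{H} = \mathcal{L} \cap \mathcal{S}$ and
$\mathcal{H}_\perp$ is the subspace of transfer functions in
$\mathcal{L}$ that have no poles with negative real part. That is, given
$P \in \mathcal{L}$, there is a unique decomposition
$P = \Pi_+(P) + \Pi_-(P)$, with $\Pi_+(P) \in \mathcal{H}$ and
$\Pi_-(P) \in \mathcal{H}_\perp$. Observe that $P \in \mathcal{H}$ if
and only if $P^{\sim} \in \mathcal{H}_\perp$. It can be shown via
Plancherel’s theorem that $\langle P_1, P_2 \rangle = 0$ for
$P_1 \in \mathcal{H}_\perp$ and $P_2 \in \mathcal{H}$. Finally, note
that $P_1P_2 \in \mathcal{H}$ and $P_3P_1 \in \mathcal{H}$ for
$P_1 \in \mathcal{H}$ and $P_i \in \mathcal{S}$, $i = 2,3$.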


Coprime and Spectral Factorizations 

Given $P \in \mathcal{R}$, the factorizations
$P = NM^{-1} = \tilde{M}^{-1}\tilde{N}$ are said to be (doubly) coprime
over $\mathcal{S}$ if $N$, $M$, $\tilde{N}$, $\tilde{M}$ are all
elements of $\mathcal{S}$ and there exist $U_0$, $V_0$, $\tilde{U}_0$,
$\tilde{V}_0$ all in $\mathcal{S}$ such that

$$\begin{bmatrix} \tilde{V}_0 & -\tilde{U}_0 \end{bmatrix} \begin{bmatrix} M \\ N \end{bmatrix} = I \quad \text{and} \quad \begin{bmatrix} -\tilde{N} & \tilde{M} \end{bmatrix} \begin{bmatrix} U_0 \\ V_0 \end{bmatrix} = I \qquad (3)$$

hold; i.e., $[M^T \ N^T]$ and $[-\tilde{N} \ \tilde{M}]$ are right
invertible in $\mathcal{S}$. Importantly, if the factorizations are
coprime and $P \in \mathcal{S}$, then
$M^{-1} = \tilde{V}_0 - \tilde{U}_0P$ and
$\tilde{M}^{-1} = V_0 - PU_0$ are in $\mathcal{S}$, as sums of products
of transfer functions in $\mathcal{S}$; i.e., $M$ and $\tilde{M}$ are
outer. Doubly coprime factorizations over $\mathcal{S}$ always exist,
but these are not unique. Constructions from state-space realizations
can be found in Zhou et al. (1996, Chapter 6) and Francis (1987), for
example. As mentioned above, coprime factorizations play a role in
transforming a standard problem into the special case of a
model-matching problem, via the Youla-Kucera parametrization of
internally stabilizing controllers presented in the next section.
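
As a concrete illustration of the Bezout identities in (3), the following sketch checks a hand-computed coprime factorization at a few sample points of the complex plane. The plant $P(s) = 1/(s-1)$ and the factors below are hypothetical choices made for this example; since the example is scalar, the left and right factorizations coincide.

```python
import numpy as np

# Hypothetical scalar example: P(s) = 1/(s - 1), which has an unstable pole.
P = lambda s: 1.0 / (s - 1.0)
N = lambda s: 1.0 / (s + 1.0)          # stable "numerator" factor
M = lambda s: (s - 1.0) / (s + 1.0)    # stable "denominator" factor
V0 = lambda s: 1.0 + 0.0 * s           # constant Bezout factor (stable)
U0 = lambda s: -2.0 + 0.0 * s          # constant Bezout factor (stable)

# Sample points away from the poles at s = 1 and s = -1.
samples = np.array([0.3j, 2.0j, 1.5 + 0.7j, 4.0])

# P = N M^{-1} should hold wherever both sides are defined.
factor_err = np.max(np.abs(P(samples) - N(samples) / M(samples)))
# Bezout identity V0 M - U0 N = 1, i.e. (s - 1)/(s + 1) + 2/(s + 1) = 1.
bezout_err = np.max(np.abs(V0(samples) * M(samples)
                           - U0(samples) * N(samples) - 1.0))
```

A pointwise check of this kind does not prove that the factors are stable; stability here is read off from the pole locations of $N$ and $M$.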

Subsequently, a special coprime factorization proves to be useful. If
$P^{\sim}(s)P(s) = M^{-\sim}(s)N^{\sim}(s)N(s)M^{-1}(s) > 0$ for $s$ on
the extended imaginary axis (i.e., for $s = j\omega$ with
$-\infty \le \omega \le \infty$), where $M^{-\sim} := (M^{-1})^{\sim}$,
then it is possible to choose the factor $N$ to be inner. In this case,
if $P$ is also an element of $\mathcal{S}$, then $P = NM^{-1}$ is called
an inner-outer factorization, and $P^{\sim}P = (M^{-1})^{\sim}M^{-1}$ is
called a spectral factorization, since $M, M^{-1} \in \mathcal{S}$. More
generally, if $\Xi = \Xi^{\sim} \in \mathcal{B}$ satisfies
$\Xi(s) > 0$ for $s$ on the extended imaginary axis, then there exists a
(non-unique) spectral factor $\Sigma, \Sigma^{-1} \in \mathcal{S}$ such
that $\Xi = \Sigma^{\sim}\Sigma$. Similarly, there exists a co-spectral
factor $\tilde{\Sigma}, \tilde{\Sigma}^{-1} \in \mathcal{S}$ such that
$\Xi = \tilde{\Sigma}\tilde{\Sigma}^{\sim}$. State-space constructions
via Riccati equations can be found in Zhou et al. (1996, Chapter 13),
for example.


Affine Controller/Performance-Map 
Parametrization 

With reference to Fig. 1, a generalized plant model
$G = \begin{bmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{bmatrix} \in \mathcal{R}$
is said to be internally stabilizable if there exists a
$K \in \mathcal{R}$ such that the nine transfer functions associated
with the map from the vector of signals $(w, v_1, v_2)$ to the vector of
signals $(z, u, y)$, which includes the performance map
$H(G,K) = G_{11} + G_{12}K(I - G_{22}K)^{-1}G_{21}$, are all elements of
$\mathcal{S}$. Accounting in this way for the influence of the
fictitious signals $v_1$ and $v_2$, and the behavior of the internal
signals $u$ and $y$, amounts to the following requirement: given minimal
state-space realizations, any nonzero initial condition response decays
exponentially in the time domain when $G$ and $K$ are interconnected
according to Fig. 1 with $w = 0$, $v_1 = 0$ and $v_2 = 0$. Not every
$G \in \mathcal{R}$ is internally stabilizable in the sense just
defined; for example, take $G_{11}$ to have a pole with positive real
part and $G_{21} = G_{12} = G_{22} = 0$. A necessary condition for
stabilizability is $(I - G_{22}K)^{-1} \in \mathcal{R}$; i.e., the
inverse must be proper. The latter always holds if $G_{22}$ is strictly
proper, as assumed henceforth to simplify the presentation. It is also
assumed that $G$ is internally stabilizable.

It can be shown that $G$ is internally stabilized by $K$ if and only if
the standard feedback interconnection of $G_{22}$ and $K$, corresponding
to $w = 0$ in Fig. 1, is internally stable. That is, if and only if the
transfer function

$$\begin{bmatrix} I & -K \\ -G_{22} & I \end{bmatrix} \in \mathcal{R}, \qquad (4)$$

which relates $u$ and $y$ to $v_1$ and $v_2$ by virtue of the summing
junctions at the interconnection points, has an inverse in
$\mathcal{S}$; see Francis (1987, Theorem 4.2). Substituting the coprime
factorizations $K = UV^{-1} = \tilde{V}^{-1}\tilde{U}$ and
$G_{22} = NM^{-1} = \tilde{M}^{-1}\tilde{N}$, it follows that the
inverse of (4) is an element of $\mathcal{S}$ if and only if

$$\begin{bmatrix} M & U \\ N & V \end{bmatrix}^{-1} \in \mathcal{S} \quad \Leftrightarrow \quad \begin{bmatrix} \tilde{V} & -\tilde{U} \\ -\tilde{N} & \tilde{M} \end{bmatrix}^{-1} \in \mathcal{S}. \qquad (5)$$

The equivalent characterizations of internal stability in (5) lead
directly to affine parametrizations of controllers and performance maps.
Specifically, following the approach of Desoer et al. (1980), Vidyasagar
(1985), and Francis (1987), suppose that the factorizations
$G_{22} = NM^{-1} = \tilde{M}^{-1}\tilde{N}$ are doubly coprime in the
sense that (3) holds for some
$U_0, V_0, \tilde{U}_0, \tilde{V}_0 \in \mathcal{S}$. Indeed, since
$0 = G_{22} - G_{22} = \tilde{M}^{-1}(\tilde{M}N - \tilde{N}M)M^{-1}$,
it follows that

$$\begin{bmatrix} \tilde{V}_0 & -\tilde{U}_0 \\ -\tilde{N} & \tilde{M} \end{bmatrix} \begin{bmatrix} M & U_0 \\ N & V_0 \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix} = \begin{bmatrix} M & U_0 \\ N & V_0 \end{bmatrix} \begin{bmatrix} \tilde{V}_0 & -\tilde{U}_0 \\ -\tilde{N} & \tilde{M} \end{bmatrix}. \qquad (6)$$

Exploiting this and the condition (5), it holds that $K = UV^{-1}$
stabilizes $G_{22}$ if and only if

$$U = (U_0 - MQ) \ \text{and} \ V = (V_0 - NQ) \ \text{with} \ Q \in \mathcal{S}.$$

Similarly, $K$ stabilizes $G_{22}$ if and only if
$K = (\tilde{V}_0 - Q\tilde{N})^{-1}(\tilde{U}_0 - Q\tilde{M})$ with
$Q \in \mathcal{S}$. Together, these constitute the Youla-Kucera
parametrizations of internally stabilizing controllers. Importantly, the
coprime factors that appear in these are affine functions of the stable
parameter $Q$. Moreover, using (6), an affine parametrization of the
standard performance map (1) holds by direct substitution of either
controller parametrization. Specifically,

$$H(G,K) = G_{11} + G_{12}K(I - G_{22}K)^{-1}G_{21} = T_1 + T_2QT_3 \ \text{with} \ Q \in \mathcal{S}, \qquad (7)$$

where $T_1 = G_{11} + G_{12}U_0\tilde{M}G_{21}$, $T_2 = -G_{12}M$ and
$T_3 = \tilde{M}G_{21}$. Clearly, $T_1 \in \mathcal{S}$ since this is
the performance map when $Q = 0 \in \mathcal{S}$. By the assumption that
$G$ is stabilizable, it follows that $T_2$ and $T_3$ are also elements
of $\mathcal{S}$; see Francis (1987, Chapter 4). The so-called
$Q$-parametrization in (7) motivates the subsequent consideration of
model-matching problems with respect to the standard measures of control
system performance $\|\cdot\|_2$ and $\|\cdot\|_\infty$.
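
The affine dependence on $Q$ that underlies (7) can be checked numerically in a scalar setting, where the identity $K(I - G_{22}K)^{-1} = (U_0 - MQ)\tilde{M}$ follows from the Bezout identity. The sketch below assumes the hypothetical plant $G_{22}(s) = 1/(s-1)$ with factors $N = 1/(s+1)$, $M = (s-1)/(s+1)$, $U_0 = -2$, $V_0 = 1$ and an arbitrarily chosen stable parameter $Q(s) = 1/(s+2)$; in the scalar case $\tilde{M} = M$.

```python
import numpy as np

# Hypothetical scalar data (assumptions for illustration only).
G22 = lambda s: 1.0 / (s - 1.0)
N = lambda s: 1.0 / (s + 1.0)
M = lambda s: (s - 1.0) / (s + 1.0)
U0, V0 = -2.0, 1.0                      # satisfy V0*M - U0*N = 1

def youla_controller(Q):
    """Return K = (U0 - M Q)(V0 - N Q)^{-1} for a stable parameter Q."""
    return lambda s: (U0 - M(s) * Q(s)) / (V0 - N(s) * Q(s))

Q = lambda s: 1.0 / (s + 2.0)           # any stable Q yields a stabilizing K
K = youla_controller(Q)

samples = np.array([0.3j, 2.0j, 1.5 + 0.7j, 4.0])
# Nonlinear-in-K expression that appears in the performance map (1).
lhs = K(samples) / (1.0 - G22(samples) * K(samples))
# Affine-in-Q expression obtained after substituting the parametrization.
rhs = (U0 - M(samples) * Q(samples)) * M(samples)
affine_err = np.max(np.abs(lhs - rhs))
```

The design point illustrated is that the map from $K$ to closed-loop transfer functions is nonlinear, whereas the map from $Q$ is affine, which is what makes the model-matching formulation convex.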


Model-Matching via Spectral 
Factorization 

Bearing in mind the $Q$-parametrization (7), consider the following
$\mathcal{H}_2$ model-matching problem, where inf denotes greatest lower
bound (infimum) and $T_i \in \mathcal{S}$, $i = 1,2,3$:

$$\inf_{Q \in \mathcal{S}} \|T_1 + T_2QT_3\|_2.$$

Assume that $T_2(s)$ and $T_3(s)$ have full column and row rank,
respectively, for $s$ on the extended imaginary axis. Also assume that
$T_1$ is strictly proper, whereby $Q$ must be strictly proper, and thus
an element of $\mathcal{H} \subset \mathcal{S}$, for the performance
index to be finite. Under this standard collection of assumptions, the
infimum is achieved as shown below.

A minimizer of the convex functional
$f := Q \in \mathcal{H} \mapsto \langle (T_1 + T_2QT_3), (T_1 + T_2QT_3) \rangle$
is a solution of the model-matching problem. Given spectral
factorizations $\Phi^{\sim}\Phi = T_2^{\sim}T_2$ and
$\Lambda\Lambda^{\sim} = T_3T_3^{\sim}$ (i.e.,
$\Phi, \Phi^{-1}, \Lambda, \Lambda^{-1} \in \mathcal{S}$), which exist
by the assumptions on the problem data, let $R := \Phi Q \Lambda$ and
$W := \Phi^{-\sim}T_2^{\sim}T_1T_3^{\sim}\Lambda^{-\sim}$. Then for
$Q \in \mathcal{H}$, which is equivalent to $R \in \mathcal{H}$ by the
properties of spectral factors, it follows that

$$\begin{aligned}
f(Q) &= \langle T_1, T_1 \rangle + \langle \Phi^{-\sim}T_2^{\sim}T_1T_3^{\sim}\Lambda^{-\sim}, R \rangle + \langle R, \Phi^{-\sim}T_2^{\sim}T_1T_3^{\sim}\Lambda^{-\sim} \rangle + \langle R, R \rangle \\
&= \langle T_1, T_1 \rangle + \langle (\Pi_-(W) + \Pi_+(W) + R), (\Pi_-(W) + \Pi_+(W) + R) \rangle - \langle W, W \rangle \\
&= \langle T_1, T_1 \rangle - \langle \Pi_+(W), \Pi_+(W) \rangle + \langle (\Pi_+(W) + R), (\Pi_+(W) + R) \rangle, \qquad (9)
\end{aligned}$$

where the second last equality holds by “completion-of-squares” and the
last equality holds since
$\langle \Pi_+(W), \Pi_-(W) \rangle = 0 = \langle R, \Pi_-(W) \rangle$.
From (9) it is apparent that

$$Q = -\Phi^{-1}\,\Pi_+\big(\Phi^{-\sim}T_2^{\sim}T_1T_3^{\sim}\Lambda^{-\sim}\big)\,\Lambda^{-1}$$

is a minimizer of $f$. As above, spectral factorization is a key
component of the so-called Wiener-Hopf approach of Youla et al. (1976)
and DeSantis et al. (1978).
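
The minimizer formula can be exercised on a small scalar example. The data below are hypothetical choices: $T_1 = 1/(s+2)$, $T_2 = (s-1)/(s+1)$ (inner, so $\Phi = 1$) and $T_3 = 1$ (so $\Lambda = 1$). Then $W = T_2^{\sim}T_1 = (s+1)/((s-1)(s+2)) = \tfrac{2}{3}/(s-1) + \tfrac{1}{3}/(s+2)$, so $\Pi_+(W) = \tfrac{1}{3}/(s+2)$ and $Q = -\tfrac{1}{3}/(s+2)$, with optimal cost $\|T_1 + T_2Q\|_2^2 = 2/9$. The sketch evaluates the cost by quadrature (with the substitution $\omega = \tan\theta$, an implementation choice) and compares against a perturbed parameter.

```python
import numpy as np

def h2_norm_sq(F, num=20000):
    """Midpoint-rule estimate of (1/2pi) * integral of |F(jw)|^2 dw, w = tan(theta)."""
    d_theta = np.pi / num
    theta = -np.pi / 2 + (np.arange(num) + 0.5) * d_theta
    w = np.tan(theta)
    integrand = np.abs(F(1j * w)) ** 2 / np.cos(theta) ** 2
    return np.sum(integrand) * d_theta / (2 * np.pi)

# Hypothetical scalar problem data (assumptions for illustration only).
T1 = lambda s: 1.0 / (s + 2.0)
T2 = lambda s: (s - 1.0) / (s + 1.0)          # inner: |T2(jw)| = 1
Q_opt = lambda s: -(1.0 / 3.0) / (s + 2.0)    # -Pi_+(W) from the partial fractions
Q_alt = lambda s: Q_opt(s) + 0.1 / (s + 3.0)  # a perturbed stable parameter

cost_opt = h2_norm_sq(lambda s: T1(s) + T2(s) * Q_opt(s))
cost_alt = h2_norm_sq(lambda s: T1(s) + T2(s) * Q_alt(s))
```

In this example the optimal error is $T_1 + T_2Q = \tfrac{2}{3}/(s+1)$, and the cost of any other stable strictly proper $Q$ exceeds $2/9$ by the squared norm of the perturbation, consistent with (9).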

Now consider the $\mathcal{H}_\infty$ model-matching problem

$$\inf_{Q \in \mathcal{S}} \|T_1 + T_2QT_3\|_\infty$$

given $T_i \in \mathcal{S}$, $i = 1,2,3$. This is more challenging than
the problem discussed above, where $\|\cdot\|_2$ is the performance
index. While sufficient conditions are again available for the infimum
to be achieved, computing a minimizer is generally difficult; see
Francis and Doyle (1987) and Glover et al. (1991). As such, nearly
optimal solutions are often sought by considering the relaxed problem of
finding the set of $Q \in \mathcal{S}$ that satisfy
$\|T_1 + T_2QT_3\|_\infty < \gamma$ for a value of $\gamma > 0$ greater
than, but close to, the infimum.

With a view to highlighting the role of factorization methods and
simplifying the presentation, suppose that $T_2$ is inner, which is
possible without loss of generality via inner-outer factorization if
$T_2(s)$ has full column rank for $s$ on the extended imaginary axis.
Furthermore, assume that $T_3 = I$. Following the approach of Francis
(1987) and Green et al. (1990), let
$X^{\sim} = [X_1^{\sim} \ X_2^{\sim}] := [T_2 \ \ I - T_2T_2^{\sim}] \in \mathcal{B}$,
so that $X^{\sim}X = I$ and
$XT_2 = \begin{bmatrix} I \\ 0 \end{bmatrix}$. Observe that

$$\|T_1 + T_2Q\|_\infty = \|X(T_1 + T_2Q)\|_\infty = \left\| \begin{bmatrix} T_2^{\sim}T_1 + Q \\ (I - T_2T_2^{\sim})T_1 \end{bmatrix} \right\|_\infty < \gamma \qquad (10)$$


if and only if

$$0 < \gamma^2 I - T_1^{\sim}(I - T_2T_2^{\sim})T_1 - (T_2^{\sim}T_1 + Q)^{\sim}(T_2^{\sim}T_1 + Q) \qquad (11)$$

on the extended imaginary axis. Note that (11) implies
$0 < \gamma^2 I - T_1^{\sim}(I - T_2T_2^{\sim})T_1$. Thus, it follows
that there exists a $Q \in \mathcal{S}$ for which (10) holds if and only
if the following are both satisfied: (a) there exists a spectral
factorization











Optimal Control via Factorization and Model Matching 




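$\gamma^2\Psi^{\sim}\Psi = \gamma^2 I - T_1^{\sim}(I - T_2T_2^{\sim})T_1$,
and (b) there exists an $R \,(= Q\Psi^{-1}) \in \mathcal{S}$ such that
$\|W + R\|_\infty < \gamma$, where
$W := T_2^{\sim}T_1\Psi^{-1} \in \mathcal{B}$. The condition (b) is a
well-known extension problem and a solution exists if and only if the
induced norm of the Hankel operator with symbol $W$ is less than
$\gamma$, which is part of a result known as Nehari’s theorem. In fact,
(b) is equivalent to the existence of a spectral factor
$\Upsilon, \Upsilon^{-1} \in \mathcal{S}$ with
$\Upsilon_{11}^{-1} \in \mathcal{S}$ such that

$$\begin{bmatrix} I & W \\ 0 & I \end{bmatrix}^{\sim} \begin{bmatrix} I & 0 \\ 0 & -\gamma^2 I \end{bmatrix} \begin{bmatrix} I & W \\ 0 & I \end{bmatrix} = \Upsilon^{\sim} \begin{bmatrix} I & 0 \\ 0 & -\gamma^2 I \end{bmatrix} \Upsilon, \qquad (12)$$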

in which case $\|W + R\|_\infty < \gamma$ if and only if
$R = R_1R_2^{-1}$ with
$[R_1^T \ R_2^T] := [S^T \ I]\,\Upsilon^{-T}$, $S \in \mathcal{S}$ and
$\|S\|_\infty < \gamma$; see Ball and Ran (1987), Francis (1987), and
Green et al. (1990) for details, including state-space constructions of
the factors via Riccati equations. Noting that

$$\begin{bmatrix} T_2 & T_1 \\ 0 & I \end{bmatrix}^{\sim} \begin{bmatrix} I & 0 \\ 0 & -\gamma^2 I \end{bmatrix} \begin{bmatrix} T_2 & T_1 \\ 0 & I \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & \Psi \end{bmatrix}^{\sim} \begin{bmatrix} I & W \\ 0 & I \end{bmatrix}^{\sim} \begin{bmatrix} I & 0 \\ 0 & -\gamma^2 I \end{bmatrix} \begin{bmatrix} I & W \\ 0 & I \end{bmatrix} \begin{bmatrix} I & 0 \\ 0 & \Psi \end{bmatrix},$$

it follows using (12) that there exists a $Q \in \mathcal{S}$ such that
(10) holds if and only if there exists a spectral factor
$\Omega, \Omega^{-1} \in \mathcal{S}$ with
$\Omega_{11}^{-1} \in \mathcal{S}$
($\Omega = \Upsilon \begin{bmatrix} I & 0 \\ 0 & \Psi \end{bmatrix}$)
that satisfies

$$\begin{bmatrix} T_2 & T_1 \\ 0 & I \end{bmatrix}^{\sim} \begin{bmatrix} I & 0 \\ 0 & -\gamma^2 I \end{bmatrix} \begin{bmatrix} T_2 & T_1 \\ 0 & I \end{bmatrix} = \Omega^{\sim} \begin{bmatrix} I & 0 \\ 0 & -\gamma^2 I \end{bmatrix} \Omega, \qquad (13)$$

in which case $\|T_1 + T_2Q\|_\infty < \gamma$ if and only if
$Q = Q_1Q_2^{-1}$, where
$[Q_1^T \ Q_2^T] := [S^T \ I]\,\Omega^{-T}$, $S \in \mathcal{S}$ and
$\|S\|_\infty < \gamma$; see Green et al. (1990). So-called $J$-spectral
factorizations of the kind in (12) and (13) also appear in the
chain-scattering/conjugation approach of Kimura (1989, 1997) and the
factorization approach of Ball et al. (1991), for example.


Summary

The preceding sections highlight the role of coprime and spectral
factorizations in formulating and solving model-matching problems that
arise from standard $\mathcal{H}_2$ and $\mathcal{H}_\infty$ control
problems. The transformation of standard control problems to
model-matching problems hinges on an affine parametrization of
internally stabilized performance maps. Beyond the problems considered
here, this parametrization can be exploited to devise numerical
algorithms for various other control problems in terms of convex
mathematical programs.


Cross-References

► H-Infinity Control

► H 2 Optimal Control

► Polynomial/Algebraic Design Methods

► Spectral Factorization

Optimal Control with State Space
Constraints 

Heinz Schattler 

Washington University, St. Louis, MO, USA 

Abstract 

Necessary and sufficient conditions for optimality in optimal control
problems with state space constraints are reviewed with emphasis on
geometric aspects.


Keywords 

Admissible control; Bolza form; Mayer problem 

Problem Formulation and 
Terminology 

Many practical problems in engineering or 
of scientific interest can be formulated in the 
framework of optimal control problems with state 
space constraints. Examples range from the space 
shuttle reentry problem in aeronautics (Bonnard 
et al. 2003) to the problem of minimizing the base 
transit time in bipolar transistors in electronics 
(Rinaldi and Schattler 2003). 

An optimal control problem with state space constraints in Bolza form
takes the following form: minimize a functional

$$J(u) = \int_{t_0}^{T} L(t, x(t), u(t))\,dt + \Phi(T, x(T))$$

over all Lebesgue measurable functions $u : [t_0, T] \to U$ that take
values in a control set $U \subset \mathbb{R}^m$, subject to the
dynamics

$$\dot{x}(t) = F(t, x(t), u(t)), \quad x(t_0) = x_0,$$

terminal constraints

$$\Psi(T, x(T)) = 0,$$

and state space constraints

$$h_\alpha(t, x(t)) \le 0 \quad \text{for } \alpha = 1, \dots, r.$$

The focus of this contribution is on state space constraints, and, for
simplicity, in this formulation, we have omitted mixed control state
space constraints of the form $g_\beta(t, x, u) \le 0$. States $x$ lie
in $\mathbb{R}^n$ and controls in $\mathbb{R}^m$; typically, the control
set $U \subset \mathbb{R}^m$ is compact and convex, often a polyhedron.
The time-varying vector field
$F : \mathbb{R} \times \mathbb{R}^n \times U \to \mathbb{R}^n$ is
continuously differentiable in $(t, x)$, and the terminal constraint
$N = \{(t, x) : \Psi(t, x) = 0\}$ is defined by a continuously
differentiable mapping
$\Psi = (\psi_1, \dots, \psi_k)^T : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^k$
with the property that the gradients $\nabla\psi_i$ (which we write as
row vectors) are linearly independent on $N$. The terminal time $T$ can
be free or fixed; a fixed terminal time simply would be prescribed by
one of the functions $\psi_i$. The state space constraints

$$M_\alpha = \{(t, x) : h_\alpha(t, x) = 0\}, \quad \alpha = 1, \dots, r,$$

are defined by continuously differentiable functions
$h_\alpha : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}$,
$(t, x) \mapsto h_\alpha(t, x)$, and we assume that the gradients
$\nabla h_\alpha$ do not vanish on $M_\alpha$. In particular, each set
$M_\alpha$ thus is an embedded submanifold of codimension 1 of
$\mathbb{R}^{n+1}$. We denote by $h = (h_1, \dots, h_r)^T$ the
time-varying vector field defining the state space constraints.

Terminology: Admissible controls are locally bounded Lebesgue measurable
functions that take values in the control set, $u : [t_0, T] \to U$.
Given any admissible control, the initial value problem
$\dot{x}(t) = F(t, x(t), u(t))$, $x(t_0) = x_0$, has a unique solution
defined on some maximal open interval of definition $I$. This solution
is called the trajectory corresponding to the control $u$ and the pair
$(x, u)$ is a controlled trajectory. An arc $\Gamma$ of the graph of a
trajectory defined over an open interval $I$ for which none of the state
space constraints is active is called an interior arc, and $\Gamma$ is a
boundary arc if at least one constraint is active on all of $I$. We call
$\Gamma$ an $M_\alpha$-boundary arc over $I$ if only the constraint
$h_\alpha \le 0$ is active on $I$. The times $\tau$ when interior arcs
and boundary arcs meet are called junction times and the corresponding
pairs $(\tau, x(\tau))$ junction points.

Despite the abundance and importance of practical problems that can be
described as optimal control problems with state space constraints, for
such problems the theory still lacks the coherence that the theory for
problems without state space constraints has reached, and there still
exist significant gaps between the theories of necessary and sufficient
conditions for optimality for optimal control problems with state space
constraints. The theory of existence of optimal solutions differs little
between optimal control problems with and without state space
constraints, is well established, and will not be addressed here (e.g.,
see Cesari 1983 or the Filippov-Cesari theorem in Hartl et al. 1995).


Necessary Conditions for Optimality 

First-order necessary conditions for optimality are given by the
Pontryagin maximum principle (Pontryagin et al. 1962). The zero set of
even a smooth ($C^\infty$) function can be an arbitrary closed subset of
the state space. As a result, in necessary conditions for optimality,
the multipliers associated with the state space constraints a priori are
only known to be nonnegative Radon measures (Ioffe and Tikhomirov 1979;
Vinter 2000). Let $u_* : [t_0, T] \to U$ be an optimal control with
corresponding trajectory $x_*$ and, for simplicity of presentation, also
assume that no state constraints are active at the terminal time so that
the standard transversality conditions apply. Then it follows that there
exist a constant $\lambda_0 \ge 0$, an absolutely continuous function
$\eta$, which we write as a row vector,
$\eta : [t_0, T] \to (\mathbb{R}^n)^*$, and nonnegative Radon measures
$\mu_\alpha \in C^*([t_0, T]; \mathbb{R})$, $\alpha = 1, \dots, r$, with
support in the sets
$R_\alpha = \{t \in [t_0, T] : h_\alpha(t, x_*(t)) = 0\}$, which do not
all vanish simultaneously, i.e.,

$$\lambda_0 + \|\eta\|_\infty + \sum_{\alpha=1}^{r} \mu_\alpha([t_0, T]) > 0,$$

such that with

$$\lambda(t) = \eta(t) - \sum_{\alpha=1}^{r} \int_{[t_0, t)} \frac{\partial h_\alpha}{\partial x}(s, x_*(s))\,d\mu_\alpha(s)$$

and

$$H = H(t, \lambda_0, \lambda, x, u) = \lambda_0 L(t, x, u) + \lambda F(t, x, u)$$

the following conditions hold:

(a) The adjoint equation holds in the form

$$\begin{aligned}
\dot{\eta}(t) &= -\frac{\partial H}{\partial x}(t, \lambda_0, \lambda(t), x_*(t), u_*(t)) \\
&= -\lambda_0 \frac{\partial L}{\partial x}(t, x_*(t), u_*(t)) - \lambda(t)\frac{\partial F}{\partial x}(t, x_*(t), u_*(t)),
\end{aligned}$$

and there exists a row vector $\mu \in (\mathbb{R}^k)^*$ such that

$$\lambda(T) = \lambda_0 \frac{\partial \Phi}{\partial x}(T, x_*(T)) + \mu \frac{\partial \Psi}{\partial x}(T, x_*(T))$$

and

$$0 = H(T, \lambda_0, \lambda(T), x_*(T), u_*(T)) + \lambda_0 \frac{\partial \Phi}{\partial t}(T, x_*(T)) + \mu \frac{\partial \Psi}{\partial t}(T, x_*(T)).$$

(b) The optimal control minimizes the Hamiltonian over the control set
$U$ along $(\lambda(t), x_*(t))$:

$$H(t, \lambda_0, \lambda(t), x_*(t), u_*(t)) = \min_{v \in U} H(t, \lambda_0, \lambda(t), x_*(t), v).$$

Furthermore,

$$\begin{aligned}
H(t, \lambda_0, \lambda(t), x_*(t), u_*(t)) ={}& H(T, \lambda_0, \lambda(T), x_*(T), u_*(T)) \\
&- \int_{t}^{T} \frac{\partial H}{\partial t}(s, \lambda_0, \lambda(s), x_*(s), u_*(s))\,ds \\
&+ \sum_{\alpha=1}^{r} \int_{[t, T]} \frac{\partial h_\alpha}{\partial t}(s, x_*(s))\,d\mu_\alpha(s).
\end{aligned}$$

Controlled trajectories $(x, u)$ for which there exist multipliers such
that these conditions are satisfied are called extremals. In general, it
cannot be excluded that $\lambda_0$ vanishes; extremals with
$\lambda_0 = 0$ are called abnormal, while those with $\lambda_0 > 0$
are called normal. In the normal case, the multiplier can be normalized,
$\lambda_0 = 1$.

Special Case: A Mayer Problem for
Single-Input Control Linear Systems

Under the general assumptions formulated above, the sets
$R_\alpha \subset [t_0, T]$ when a particular constraint is active can
be arbitrarily complicated. But in many practical applications, state
constraints have strong geometric properties - often they are embedded
submanifolds - and it is possible to strengthen these necessary
conditions for optimality in the sense of specifying the measures
further. We formulate the conditions for a particular case of common
interest.

We consider an optimal control problem in Mayer form (i.e.,
$L \equiv 0$) for a single-input control linear system with dynamics

$$\dot{x} = F(t, x, u) = f(t, x) + u\,g(t, x)$$

and the control set $U$ a compact interval, $U = [a, b]$. Adjoining time
as extra state variable, $\dot{t} \equiv 1$, and defining
$F_0(t, x) = (1, f(t, x))^T$ and $G(t, x) = (0, g(t, x))^T$, for a
continuously differentiable function
$k : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}$, the expressions

$$\mathcal{L}_{F_0}k : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}, \quad (t, x) \mapsto (\mathcal{L}_{F_0}k)(t, x) = \frac{\partial k}{\partial t}(t, x) + \frac{\partial k}{\partial x}(t, x)f(t, x)$$

and

$$\mathcal{L}_{G}k : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}, \quad (t, x) \mapsto (\mathcal{L}_{G}k)(t, x) = \frac{\partial k}{\partial x}(t, x)g(t, x)$$

represent the Lie (or directional) derivatives of the function $k$ along
the vector fields $F_0$ and $G$, respectively. In terms of this
notation, the derivative of the function $h_\alpha$ (defining the
manifold $M_\alpha$) along trajectories of the system is given by

$$\dot{h}_\alpha(t, x(t)) = \frac{d}{dt}h_\alpha(t, x(t)) = \mathcal{L}_{F_0}h_\alpha(t, x(t)) + u(t)\,\mathcal{L}_{G}h_\alpha(t, x(t)).$$

If the function $\mathcal{L}_G h_\alpha$ does not vanish at a point
$(t, x) \in M_\alpha$, then there exists a neighborhood $V$ of $(t, x)$
such that there exists a unique control $u_\alpha = u_\alpha(t, x)$
which solves the equation $\dot{h}_\alpha(t, x) = 0$ on $V$ and
$u_\alpha$ is given in feedback form as

$$u_\alpha(t, x) = -\frac{\mathcal{L}_{F_0}h_\alpha(t, x)}{\mathcal{L}_{G}h_\alpha(t, x)}.$$

The manifold $M_\alpha$ is said to be control invariant of relative
degree 1 if the Lie derivative of $h_\alpha$ with respect to $G$,
$\mathcal{L}_G h_\alpha$, does not vanish anywhere on $M_\alpha$ and if
the function $u_\alpha(t, x)$ is admissible, i.e., takes values in the
control set $[a, b]$.
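
The feedback formula for the boundary control can be illustrated on a small example. The sketch below assumes hypothetical data not taken from the text: the vector fields $f(t,x) = (x_2, -x_1)^T$ and $g(t,x) = (0, 1)^T$ and the constraint function $h(t,x) = x_2 - 1$, for which $\mathcal{L}_{F_0}h = -x_1$ and $\mathcal{L}_G h = 1$, so the constraint has relative degree 1 everywhere.

```python
import numpy as np

# Hypothetical problem data (assumptions for illustration only).
def f(t, x):
    return np.array([x[1], -x[0]])

def g(t, x):
    return np.array([0.0, 1.0])

def h_t(t, x):
    return 0.0                       # partial derivative of h with respect to t

def h_x(t, x):
    return np.array([0.0, 1.0])      # gradient of h in x, used as a row vector

def lie_F0(t, x):
    """Lie derivative of h along F0 = (1, f): h_t + h_x f."""
    return h_t(t, x) + h_x(t, x) @ f(t, x)

def lie_G(t, x):
    """Lie derivative of h along G = (0, g): h_x g."""
    return h_x(t, x) @ g(t, x)

def boundary_control(t, x):
    """Feedback control that keeps h constant along the trajectory."""
    LG = lie_G(t, x)
    if LG == 0.0:
        raise ValueError("L_G h vanishes: relative degree 1 fails at this point")
    return -lie_F0(t, x) / LG

x_on_boundary = np.array([0.5, 1.0])           # satisfies h = x2 - 1 = 0
u_b = boundary_control(0.0, x_on_boundary)
h_dot = lie_F0(0.0, x_on_boundary) + u_b * lie_G(0.0, x_on_boundary)
```

For the manifold to be control invariant in the sense above, the computed value must additionally lie in the control set; with a hypothetical control set $[-1, 1]$ this holds at the chosen point but fails wherever $|x_1| > 1$.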

Thus, for a control-invariant submanifold of relative degree 1, the
control that keeps the manifold invariant is unique, and the
corresponding dynamics induce a unique flow on the constraint. This
assumption corresponds to the least degenerate, i.e., in some sense most
generic or common, scenario and is satisfied for many practical
problems.

Suppose the reference extremal is normal and let $\Gamma_\alpha$ be an
$M_\alpha$-boundary arc defined over an open interval $I$ with
corresponding boundary control $u_\alpha$ that takes values in the
interior of the control set along $\Gamma_\alpha$. Then the Radon
measure $\mu_\alpha$ is absolutely continuous with respect to Lebesgue
measure on $I$ with continuous and nonnegative Radon-Nikodym derivative
$\nu_\alpha(t)$ given by

$$\nu_\alpha(t) = \frac{\lambda(t)\left(\dfrac{\partial g}{\partial t}(t, x_*(t)) + [f, g](t, x_*(t))\right)}{\mathcal{L}_{G}h_\alpha(t, x_*(t))}$$

where $[f, g]$ denotes the Lie bracket of the time-varying vector fields
$f$ and $g$ in the variable $x$,

$$[f, g](t, x) = \frac{\partial g}{\partial x}(t, x)f(t, x) - \frac{\partial f}{\partial x}(t, x)g(t, x).$$
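
The Lie bracket convention stated above can be evaluated numerically when closed-form Jacobians are inconvenient. The following sketch assumes central finite differences for the Jacobians and uses hypothetical vector fields $f(t,x) = (x_2, -x_1)^T$ and $g(t,x) = (0, 1)^T$, for which $[f, g] = (-1, 0)^T$; the function names are illustrative.

```python
import numpy as np

def jacobian_x(field, t, x, eps=1e-6):
    """Central-difference Jacobian of a vector field with respect to x."""
    n = x.size
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        J[:, j] = (field(t, x + e) - field(t, x - e)) / (2 * eps)
    return J

def lie_bracket(f, g, t, x):
    """[f, g](t, x) = (dg/dx) f - (df/dx) g, matching the convention in the text."""
    return jacobian_x(g, t, x) @ f(t, x) - jacobian_x(f, t, x) @ g(t, x)

# Hypothetical vector fields (assumptions for illustration only).
f = lambda t, x: np.array([x[1], -x[0]])
g = lambda t, x: np.array([0.0, 1.0])

bracket = lie_bracket(f, g, 0.0, np.array([0.5, 1.0]))
```

Note that the sign convention matters: some references define the bracket with the opposite sign, which would flip the sign of the bracket term in the expression for the Radon-Nikodym derivative.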
ox ox 


In particular, in this case, the adjoint equation can be expressed in
the more common form

$$\dot{\lambda}(t) = -\lambda(t)\frac{\partial F}{\partial x}(t, x_*, u_*) - \nu_\alpha(t)\frac{\partial h_\alpha}{\partial x}(t, x_*),$$

with all partial derivatives evaluated along the reference trajectory.
Furthermore, the multiplier $\lambda$ remains continuous at entry or
exit if the controlled trajectory $(x_*, u_*)$ meets the constraint
$M_\alpha$ transversally (e.g., see Schattler 2006). This follows from
the following characterization of transversal connections between
interior and boundary arcs due to Maurer (1977): if $\tau$ is an entry
or exit junction time between an interior arc and an
$M_\alpha$-boundary arc for which the reference control $u_*$ has a
limit at $\tau$ along the interior arc, then the interior arc is
transversal to $M_\alpha$ at entry or exit if and only if the control
$u_*$ is discontinuous at $\tau$.

Informal Formulation of Necessary 
Conditions 

In order to ensure the practicality of necessary conditions for
optimality, it is essential that, besides atomistic structures at
junctions that lead to computable jumps in the multipliers, the Radon
measures $\mu_\alpha$ have no singular parts with respect to Lebesgue
measure. If it is assumed a priori that optimal controlled trajectories
are finite concatenations of interior and boundary arcs, and if the
constraint sets have a reasonably regular structure (embedded
submanifolds and transversal intersections thereof) and satisfy a rather
technical constraint qualification (see Hartl et al. 1995) that
guarantees that the restrictions of the system to active constraints
have solutions, then it is possible to specify the above necessary
conditions further and formulate more user-friendly versions for the
determination of the multipliers. Such formulations have become the
standard for numerical computations, but they still have not always been
established rigorously and somewhat carry the stigma of a heuristic
nature. Nevertheless, it is often this more concrete set of conditions
that allows problems to be solved numerically and analytically. If then,
in conjunction with sufficient conditions for optimality, it is possible
to verify the optimality of the computed extremal solutions, this
generates a satisfactory theoretical procedure. Such conditions,
following Hartl et al. (1995), generally are referred to as the
“informal theorem”.

Suppose $(x_*, u_*)$ is a normal extremal controlled trajectory defined
over the interval $[t_0, T]$ with the property that the graph of $x_*$
is a finite concatenation of interior and boundary arcs with junction
times $\tau_i$, $i = 1, \dots, k$,
$t_0 = \tau_0 < \tau_1 < \dots < \tau_k < \tau_{k+1} = T$. Under an
appropriate constraint qualification, there exist a multiplier
$\lambda$, $\lambda : [t_0, T] \to (\mathbb{R}^n)^*$, which is
absolutely continuous on each subinterval $[\tau_i, \tau_{i+1}]$;
multipliers $\nu_\alpha$, $\alpha = 1, \dots, r$, defined on
$[t_0, T]$, which are continuous on each interval
$[\tau_i, \tau_{i+1}]$; a vector $\mu \in (\mathbb{R}^k)^*$; and vectors
$\eta(\tau_i) \in (\mathbb{R}^r)^*$, $i = 1, \dots, k$, with nonnegative
entries such that:

(a) (adjoint equation) On each interval $(\tau_i, \tau_{i+1})$,
$i = 0, \dots, k$, $\lambda$ satisfies the adjoint equation in the form

$$\dot{\lambda}(t) = -\frac{\partial L}{\partial x}(t, x_*(t), u_*(t)) - \lambda(t)\frac{\partial F}{\partial x}(t, x_*(t), u_*(t)) - \sum_{\alpha=1}^{r} \nu_\alpha(t)\frac{\partial h_\alpha}{\partial x}(t, x_*(t))$$

with $\nu_\alpha(t) = 0$ if the constraint $M_\alpha$ is not active at
time $t$. Assuming that no state space constraint is active at the
terminal time, the value of the multiplier $\lambda$ at the terminal
time is given by the transversality condition

$$\lambda(T) = \frac{\partial \Phi}{\partial x}(T, x_*(T)) + \mu \frac{\partial \Psi}{\partial x}(T, x_*(T)).$$

At any junction time $\tau_i$ between an interior arc and a boundary
arc, the multiplier $\lambda$ may be discontinuous satisfying a jump
condition of the form

$$\lambda(\tau_i-) = \lambda(\tau_i+) + \eta(\tau_i)\frac{\partial h}{\partial x}(\tau_i, x_*(\tau_i))$$

and the complementary slackness condition

$$\eta(\tau_i)\,h(\tau_i, x_*(\tau_i)) = 0$$

holds.

(b) The optimal control minimizes the Hamiltonian over the control set
$U$ along $(\lambda(t), x_*(t))$:

$$H(t, \lambda(t), x_*(t), u_*(t)) = \min_{v \in U} H(t, \lambda(t), x_*(t), v)$$

and at the junction times $\tau_i$ we have that

$$H(\tau_i, \lambda(\tau_i-), x_*(\tau_i), u_*(\tau_i-)) = H(\tau_i, \lambda(\tau_i+), x_*(\tau_i), u_*(\tau_i+)) - \eta(\tau_i)\frac{\partial h}{\partial t}(\tau_i, x_*(\tau_i)).$$

Sufficient Conditions for Optimality

The literature on sufficient conditions for optimality for optimal
control problems with state space constraints is limited. The value
function for an optimal control problem at a point $(t, x)$ in the
extended state space, $V = V(t, x)$, is defined as the infimum of the
objective over all admissible controls $u$ for which the corresponding
trajectory starts at the point $x$ at time $t$ and satisfies all the
constraints of the problem,

$$V(t, x) = \inf_{u} J(u).$$

Any sufficiency theory for optimal control problems, one way or another,
deals with the solution of the corresponding Hamilton-Jacobi-Bellman
(HJB) equation:

$$\frac{\partial V}{\partial t}(t, x) + \min_{u \in U}\left\{\frac{\partial V}{\partial x}(t, x)F(t, x, u) + L(t, x, u)\right\} = 0,$$

$$V(T, x) = \Phi(T, x) \quad \text{whenever } \Psi(T, x) = 0.$$



Value functions for optimal control problems rarely are differentiable
everywhere, but generally have singularities along lower-dimensional
submanifolds. Nevertheless, under some technical assumptions and with
proper interpretations of the derivatives, this equation describes the
evolution of the value function of an optimal control problem and, if an
appropriate solution can be constructed, indeed solves the optimal
control problem.

There exists a broad theory of viscosity solutions to the HJB equation (e.g., Fleming and Soner 2005; Bardi and Capuzzo-Dolcetta 2008) that is also applicable to problems with state space constraints (Soner 1986) and, under varying technical assumptions, characterizes the value function V as the unique viscosity solution to the HJB equation. This has led to the development of algorithms that can be used to compute numerical solutions.

A more classical and more geometric approach to solving the HJB equation is based on the method of characteristics and goes back to the work of Boltyanskii on a regular synthesis for optimal control problems without state space constraints (Boltyanskii 1966). This work follows classical ideas of fields of extremals from the calculus of variations and imposes technical conditions that make it possible to handle the singularities that arise in the value functions (see, e.g., Schättler and Ledzewicz 2012). The results of Stalford (1971) follow this approach for problems with state space constraints, but a broadly applicable theory of regular synthesis, as it was developed by Piccoli and Sussmann (2000) for problems without state space constraints, does not yet exist for problems with state space constraints. Results that embed a controlled reference extremal into a local field of extremals have been given by Bonnard et al. (2003) and Schättler (2006), and these constructions show the applicability of the concepts of a regular synthesis to problems with state space constraints as well.

Examples of Local Embeddings 
of Boundary Arcs 

We illustrate the typical, i.e., in some sense most common, generic structures of local embeddings of boundary arcs in Figs. 1 and 2. The state constraint $M_\alpha$ is a control-invariant submanifold of relative degree 1 and is represented by a horizontal line, as arises when limits on the size of a particular state are imposed. Figure 1 shows the typical entry-boundary-exit concatenations of an interior arc followed by a boundary arc and another interior arc. The local embedding of the boundary arc differs substantially from classical local embeddings for unconstrained problems in the sense that this field necessarily contains small pieces of trajectories which, when propagated backward, are not close to the reference trajectory. This, however, does not affect the memoryless properties required for a synthesis forward in time, and strong local optimality of the reference trajectory can be proven by combining synthesis-type arguments with homotopy-type approximations of the synthesis (Schättler 2006). The one trajectory marked as a black line in Fig. 1 corresponds to an optimal trajectory that meets the constraint only at the junction point and immediately bounces back into the interior. Such a trajectory arises as the limit when the concatenation structure of optimal controlled trajectories changes from interior-boundary-interior arcs to trajectories that do not meet the constraint. These structures are one of the extra sources for singularities in the value function that come up in optimal control problems with state space constraints. Switching surfaces for the interior arcs, such as the one also shown in this figure, do not cause such a loss of differentiability if they are crossed transversally by the extremal trajectories of the field.

Optimal Control with State Space Constraints, Fig. 1 A typical local synthesis around a boundary arc when no terminal constraints are present

Optimal Control with State Space Constraints, Fig. 2 A typical local synthesis around a boundary arc when terminal constraints are present

Figure 2 depicts the structure of an optimal 
synthesis for a problem from electronics, the 
problem of minimizing the base transit time of 
bipolar homogeneous transistors. The electrical 
field that determines the transit time is controlled 
by tailoring a distribution of dopants in the base 
region, and this dopant profile becomes an impor¬ 
tant design parameter determining the speed of 
the device. But due to physical and engineering 
limitations, the variables describing the dopants 
need to be limited, and thus this becomes an 
optimal control problem with state space con¬ 
straints represented by hard limits on the vari¬ 
ables. The constraints here are control invariant 
of relative degree 1. Optimal solutions, in the 
presence of initial and terminal constraints, have 
both portions along the upper and lower control 
limits of the constrained variable and typically 
proceed from the upper to the lower values along 
an optimal singular control (which takes values in 


the interior of the control set) in the interior of the 
admissible domain, possibly with saturation if the 
control limits are reached. 


Cross-References 

► Numerical Methods for Nonlinear Optimal 
Control Problems 

► Optimal Control and Pontryagin’s Maximum Principle

Abstract

Optimal deployment refers to the problem of how 
to allocate a finite number of resources over a 
spatial domain to maximize a performance metric 
that encodes certain quality of service. Depend¬ 
ing on the deployment environment, the type of 
resource, and the metric used, the solutions to this 
problem can greatly vary. 


Keywords 

Coverage control algorithms; Facility location 
problems 


Introduction 

The problem of choosing optimal geographic locations for a set of facilities has a long history and is a central subject in operations research and management science; see Drezner (1995). A facility can be broadly understood as
a service such as a school; a hospital; an airport; 
an emergency service, such as a fire station; or, 
more generally, routes of a vehicle, from buses 
to aircraft, an autonomous vehicle, or a mobile 
sensor. 


The specific formulation of facility location problems depends very much on the particular underlying application. A distinguishing feature is that all involve strategic planning, accounting for the long-term impact on the facilities' operating cost and their fast response to demand. Thus, these problems lead to constrained optimization formulations that are typically very hard to solve optimally. Even in their most basic formulations, such problems are typically NP-hard, and this computational complexity made their solution largely intractable until the advent of high-speed computing.

Locational optimization techniques have also 
been employed to solve optimal estimation 
problems by static sensor networks, mesh and 
grid optimization design, clustering analysis, data 
compression, and statistical pattern recognition; 
see Du et al. (1999). However, these solutions 
typically require centralized computations and 
availability of information at all times. 

When the facilities are multiple vehicles or 
mobile sensors, the underlying dynamics may 
require additional changes and further analysis 
that guarantee the overall system stability. In 
what follows, we review a particular coverage 
control problem formulation in terms of the 
so-called expected-value multicenter functions 
that makes the analysis tractable leading to 
robust, distributed algorithm implementations 
employing computational geometric objects such 
as Voronoi partitions. 

Basic Ingredients from 
Computational Geometry 

In order to formulate a basic optimal deployment problem and algorithm, we require several notions from computational geometry; see Bullo et al. (2009) for more information.

Let $S$ be a measurable subset of $\mathbb{R}^m$, for $m \in \mathbb{N}$, consider a distance function $d$ on $\mathbb{R}^m$, and let $P = \{p_1, \dots, p_n\}$ be $n$ distinct points of $S$, corresponding to locations of certain facilities. The Voronoi partition of $S$ generated by $P$ and associated with $d$ is given by $\mathcal{V}(P) = \{V_1(P), \dots, V_n(P)\}$, where

$$V_i(P) = \{q \in S \mid d(p_i, q) \le d(p_j, q) \text{ for all } p_j \in P \setminus \{p_i\}\}, \quad i \in \{1, \dots, n\}.$$

Given $r \in \mathbb{R}_{>0}$, denote by $B(p_i, r)$ the closed ball of center $p_i$ and radius $r$. The $r$-limited Voronoi partition of $S$ generated by $P$ and associated with $d$ is the Voronoi partition of the set $S \cap \bigcup_{i=1}^{n} B(p_i, r)$, denoted as $\mathcal{V}_r(P) = \{V_{1,r}(P), \dots, V_{n,r}(P)\}$.

Let $\phi : S \to \mathbb{R}_{\ge 0}$ be a measurable density function on $S$. The area and the centroid (or center of mass) of $W \subset S$ with respect to $\phi$ are the values

$$A_\phi(W) = \int_W \phi(q)\,dq, \qquad \mathrm{CM}_\phi(W) = \frac{1}{A_\phi(W)} \int_W q\,\phi(q)\,dq.$$

We say that the set of distinct points $P$ in $S$ is a centroidal Voronoi configuration (resp., an $r$-limited centroidal Voronoi configuration) if each $p_i$ is at the centroid of its own Voronoi cell (resp., $r$-limited Voronoi cell). That is, $p_i = \mathrm{CM}_\phi(V_i(P))$, $i \in \{1, \dots, n\}$ (resp., $p_i = \mathrm{CM}_\phi(V_{i,r}(P))$, $i \in \{1, \dots, n\}$). Voronoi partitions and centroidal Voronoi configurations help assess the distribution of locations in a spatial domain, as we establish below.
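The definitions of Voronoi cell, area, and centroid can be approximated on a sampled domain. The following is a minimal sketch, assuming $S$ is the unit square, $d$ is the Euclidean distance, and integrals are replaced by sums over grid midpoints; the helper names `grid_points`, `voronoi_cells`, and `area_and_centroid` are hypothetical and not taken from the cited references.

```python
import math

def grid_points(n_side):
    """Midpoints of an n_side x n_side grid on the unit square [0, 1]^2."""
    step = 1.0 / n_side
    return [((i + 0.5) * step, (j + 0.5) * step)
            for i in range(n_side) for j in range(n_side)]

def voronoi_cells(points, samples):
    """Assign each sample q to the index of its nearest generator p_i."""
    cells = [[] for _ in points]
    for q in samples:
        i = min(range(len(points)), key=lambda k: math.dist(points[k], q))
        cells[i].append(q)
    return cells

def area_and_centroid(cell, phi, dq):
    """Discrete versions of A_phi(W) and CM_phi(W) for a sampled region W."""
    area = sum(phi(q) for q in cell) * dq
    cx = sum(q[0] * phi(q) for q in cell) * dq / area
    cy = sum(q[1] * phi(q) for q in cell) * dq / area
    return area, (cx, cy)

n_side = 40
samples = grid_points(n_side)
dq = 1.0 / n_side ** 2            # area represented by each sample
P = [(0.25, 0.5), (0.75, 0.5)]    # two generators, assumed for illustration
cells = voronoi_cells(P, samples)
results = [area_and_centroid(c, lambda q: 1.0, dq) for c in cells]
```

With a uniform density, the two cells are the left and right halves of the square and each generator coincides with the centroid of its cell, so this particular `P` is (up to discretization) a centroidal Voronoi configuration.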

A Voronoi partition induces a natural proximity graph, called the Delaunay graph, over the set of points $P$. We recall that a graph $G$ is a pair $G = (V, E)$, where $V$ is a set of $n$ vertices and $E$ is a set of ordered pairs of vertices, $E \subseteq V \times V$, called the edge set. A proximity graph $\mathcal{G}$ is a graph function defined on the set $S$, which assigns to a set of distinct points $P \subset S$ a graph $\mathcal{G}(P) = (P, E(P))$, where $E(P)$ is a function of the relative locations of the point set. Example graphs include the following:

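1. The $r$-disk graph, $\mathcal{G}_{\text{disk},r}$, for $r \in \mathbb{R}_{>0}$. Here, $(p_i, p_j) \in E_{\text{disk},r}(P)$ if $d(p_i, p_j) \le r$.

2. The Delaunay graph, $\mathcal{G}_D$. We have $(p_i, p_j) \in E_D(P)$ if $V_i(P) \cap V_j(P) \neq \emptyset$.

3. The $r$-limited Delaunay graph, $\mathcal{G}_{LD,r}$, for $r \in \mathbb{R}_{>0}$. Here, $(p_i, p_j) \in E_{LD,r}(P)$ if $V_{i,r}(P) \cap V_{j,r}(P) \neq \emptyset$.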


Expected-Value Multicenter 
Functions 

Facility location problems consist of spatially 
allocating a number of sites to provide certain 
quality of service. Problems of this class are for¬ 
mulated in terms of multicenter functions and, in 
particular, expected-value multicenter functions. 

To define these, consider $\phi : S \to \mathbb{R}_{\ge 0}$, a density function over a bounded measurable set $S \subset \mathbb{R}^m$. One can regard $\phi$ as a function measuring the probability that some event takes place over the environment. The larger the value of $\phi(q)$, the more important the location $q$. We refer to a nonincreasing and piecewise continuously differentiable function $f : \mathbb{R}_{\ge 0} \to \mathbb{R}$, possibly with finite jump discontinuities, as a performance function.

Performance functions describe the utility of placing a node at a certain distance from a location in the environment. The smaller the distance, the larger the value of $f$, that is, the better the performance. For instance, in sensing problems, performance functions can encode the signal-to-noise ratio between a source with an unknown location and a sensor attempting to locate it. Without loss of generality, it can be assumed that $f(0) = 0$.

An expected-value multicenter function models the expected value of the coverage over any point in $S$ provided by a set of points $p_1, \dots, p_n$. Formally,

$$\mathcal{H}(p_1, \dots, p_n) = \int_S \max_{i \in \{1, \dots, n\}} f(\|q - p_i\|_2)\,\phi(q)\,dq, \tag{1}$$

where $\|\cdot\|_2$ denotes the 2-norm of $\mathbb{R}^m$. This definition can be understood as follows: consider the best coverage of $q \in S$ among those provided by each of the nodes $p_1, \dots, p_n$, which corresponds to the value $\max_{i \in \{1, \dots, n\}} f(\|q - p_i\|_2)$. Then, modulate the performance by the importance $\phi(q)$ of the location $q$. Finally, the infinitesimal sum of this quantity over the environment $S$ gives rise to $\mathcal{H}(p_1, \dots, p_n)$ as a measure of the overall coverage provided by $p_1, \dots, p_n$.
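Definition (1) can be evaluated numerically by replacing the integral with a midpoint-rule sum. The sketch below assumes $S$ is the unit square and uses a hypothetical helper `multicenter_H`; it is only an illustration of the formula, not an implementation from the cited works.

```python
import math

def multicenter_H(points, f, phi, n_side=50):
    """Midpoint-rule approximation of H(p_1,...,p_n) on the unit square."""
    step = 1.0 / n_side
    total = 0.0
    for i in range(n_side):
        for j in range(n_side):
            q = ((i + 0.5) * step, (j + 0.5) * step)
            # Best coverage of q among all nodes, weighted by the density.
            best = max(f(math.dist(q, p)) for p in points)
            total += best * phi(q) * step * step
    return total

f_distor = lambda x: -x * x       # a nonincreasing performance function with f(0) = 0
uniform = lambda q: 1.0           # uniform density, assumed for illustration

H_one = multicenter_H([(0.5, 0.5)], f_distor, uniform)
H_two = multicenter_H([(0.25, 0.5), (0.75, 0.5)], f_distor, uniform)
```

For this choice of $f$ and a uniform density, a single node at the center gives a value close to $-1/6$, and adding a second node improves (increases) the coverage value.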

From here, we can formulate the following geometric optimization problem, known as the continuous p-median problem; see Drezner (1995):

$$\max_{\{p_1, \dots, p_n\} \subset S} \mathcal{H}(p_1, \dots, p_n). \tag{2}$$

The expected-value multicenter function can be alternatively described in terms of the Voronoi partition of $S$ generated by $P = \{p_1, \dots, p_n\}$. Let us define the set

$$\mathcal{C} = \{(p_1, \dots, p_n) \in (\mathbb{R}^m)^n \mid p_i = p_j \text{ for some } i \neq j\},$$

consisting of tuples of $n$ points, where some of them are repeated. Then, for $(p_1, \dots, p_n) \in S^n \setminus \mathcal{C}$, one has

$$\mathcal{H}(p_1, \dots, p_n) = \sum_{i=1}^{n} \int_{V_i(P)} f(\|q - p_i\|_2)\,\phi(q)\,dq. \tag{3}$$

This expression of $\mathcal{H}$ is appealing because it clearly shows the overall coverage of the environment as the aggregate contribution of all individual nodes. If $(p_1, \dots, p_n) \in \mathcal{C}$, then a similar decomposition of $\mathcal{H}$ can be written in terms of the distinct points $P = \{p_1, \dots, p_n\}$.

Inspired by (3), a more general version of the expected-value multicenter function is given next. Given $(p_1, \dots, p_n) \in S^n$ and a partition $\{W_1, \dots, W_n\}$ of $S$, let

$$\mathcal{H}(p_1, \dots, p_n, W_1, \dots, W_n) = \sum_{i=1}^{n} \int_{W_i} f(\|q - p_i\|_2)\,\phi(q)\,dq. \tag{4}$$

For all $(p_1, \dots, p_n) \in S^n \setminus \mathcal{C}$, we have that $\mathcal{H}(p_1, \dots, p_n) = \mathcal{H}(p_1, \dots, p_n, V_1(P), \dots, V_n(P))$. With respect to, e.g., sensor networks, this function evaluates the performance associated with an assignment of the sensors' locations at $(p_1, \dots, p_n)$ and a region assignment $(W_1, \dots, W_n)$.

Moreover, one can establish that the Voronoi partition $\mathcal{V}(P)$ is optimal for $\mathcal{H}$ among all partitions of $S$ (Du et al. 1999). That is, let $P = \{p_1, \dots, p_n\} \subset S$. For any performance function $f$ and for any partition $\{W_1, \dots, W_n\}$ of $S$,

$$\mathcal{H}(p_1, \dots, p_n, V_1(P), \dots, V_n(P)) \ge \mathcal{H}(p_1, \dots, p_n, W_1, \dots, W_n),$$

with a strict inequality if any set in $\{W_1, \dots, W_n\}$ differs from the corresponding set in $\{V_1(P), \dots, V_n(P)\}$ by a set of positive measure.

Next, we characterize the smoothness of the 
expected-value multicenter function (Cortes et al. 
2005). Before stating the precise properties, let 
us introduce some useful notation. For a perfor¬ 
mance function /, let discont(/) denote the (fi¬ 
nite) set of points where / is discontinuous. For 
each a e discont(/), define the limiting values 
from the left and from the right, respectively, as 

$$f_-(a) = \lim_{x \to a^-} f(x), \qquad f_+(a) = \lim_{x \to a^+} f(x).$$

Recall that the line integral of a function $g : \mathbb{R}^2 \to \mathbb{R}$ over a curve $C$ parameterized by a continuous and piecewise continuously differentiable map $\gamma : [0, 1] \to \mathbb{R}^2$ is defined as follows:

$$\int_C g = \int_C g(\gamma)\,d\gamma := \int_0^1 g(\gamma(t))\,\|\dot{\gamma}(t)\|_2\,dt,$$

and is independent of the selected parameterization.

Now, given a set $S \subset \mathbb{R}^m$ that is bounded and measurable, a density $\phi : S \to \mathbb{R}_{\ge 0}$, and a performance function $f : \mathbb{R}_{\ge 0} \to \mathbb{R}$, the expected-value multicenter function $\mathcal{H} : S^n \to \mathbb{R}$ is globally Lipschitz on $S^n$ (given $S \subset \mathbb{R}^h$, a function $f : S \to \mathbb{R}^k$ is globally Lipschitz if there exists $K \in \mathbb{R}_{>0}$ such that $\|f(x) - f(y)\|_2 \le K\|x - y\|_2$ for all $x, y \in S$) and continuously differentiable on $S^n \setminus \mathcal{C}$, where for $i \in \{1, \dots, n\}$

$$\frac{\partial \mathcal{H}}{\partial p_i}(P) = \int_{V_i(P)} \frac{\partial}{\partial p_i} f(\|q - p_i\|_2)\,\phi(q)\,dq + \sum_{a \in \text{discont}(f)} \big(f_-(a) - f_+(a)\big) \int_{V_i(P) \cap \partial B(p_i, a)} n_{\text{out}}(q)\,\phi(q)\,dq. \tag{5}$$

Here, $n_{\text{out}}$ is the outward normal vector to $B(p_i, a)$.

Different performance functions lead to differ¬ 
ent expected-value multicenter functions. Let us 
examine some important cases. 

Distortion Problem 

Consider the performance function $f(x) = -x^2$. Then, on $S^n \setminus \mathcal{C}$, the expected-value multicenter function takes the form

$$\mathcal{H}_{\text{distor}}(p_1, \dots, p_n) = -\sum_{i=1}^{n} \int_{V_i(P)} \|q - p_i\|_2^2\,\phi(q)\,dq.$$

In signal compression, $-\mathcal{H}_{\text{distor}}$ is referred to as the distortion function and is relevant in many disciplines, including vector quantization, signal compression, and numerical integration; see Gray and Neuhoff (1998) and Du et al. (1999). Here, distortion refers to the average deformation (weighted by the density $\phi$) caused by reproducing $q \in S$ with the location $p_i$ in $P = \{p_1, \dots, p_n\}$ such that $q \in V_i(P)$. By means of the Parallel Axis Theorem (see Hibbeler 2006), it is possible to express $\mathcal{H}_{\text{distor}}$ as a sum

$$\mathcal{H}_{\text{distor}}(p_1, \dots, p_n, W_1, \dots, W_n) = -\sum_{i=1}^{n} J_\phi(W_i, \mathrm{CM}_\phi(W_i)) - \sum_{i=1}^{n} A_\phi(W_i)\,\|p_i - \mathrm{CM}_\phi(W_i)\|_2^2, \tag{6}$$

where $J_\phi(W, p) = \int_W \|q - p\|_2^2\,\phi(q)\,dq$ is the so-called moment of inertia of the region $W$ about $p$ with respect to $\phi$. In this way, the terms $J_\phi(W_i, \mathrm{CM}_\phi(W_i))$ only depend on the partition of $S$, whereas the second terms, multiplied by $A_\phi(W_i)$, include the particular location of the points. As a consequence of this observation, the optimality of the centroid locations for $\mathcal{H}_{\text{distor}}$ follows; see Bullo et al. (2009). More precisely, let $\{W_1, \dots, W_n\}$ be a partition of $S$. Then, for any set of points $P = \{p_1, \dots, p_n\}$ in $S$,

$$\mathcal{H}_{\text{distor}}(\mathrm{CM}_\phi(W_1), \dots, \mathrm{CM}_\phi(W_n), W_1, \dots, W_n) \ge \mathcal{H}_{\text{distor}}(p_1, \dots, p_n, W_1, \dots, W_n),$$

and the inequality is strict if there exists $i \in \{1, \dots, n\}$ for which $W_i$ has nonvanishing area and $p_i \neq \mathrm{CM}_\phi(W_i)$. In other words, the centroid locations $\mathrm{CM}_\phi(W_1), \dots, \mathrm{CM}_\phi(W_n)$ are optimal for $\mathcal{H}_{\text{distor}}$ among all configurations in $S$.

Note that when $n = 1$, the node location that optimizes $p \mapsto \mathcal{H}_{\text{distor}}(p)$ is the centroid of the set $S$, denoted by $\mathrm{CM}_\phi(S)$.

Recall that the gradient of $\mathcal{H}_{\text{distor}}$ on $S^n \setminus \mathcal{C}$ takes the form

$$\frac{\partial \mathcal{H}_{\text{distor}}}{\partial p_i}(P) = 2 A_\phi(V_i(P))\,\big(\mathrm{CM}_\phi(V_i(P)) - p_i\big), \quad i \in \{1, \dots, n\},$$

that is, the $i$th component of the gradient points in the direction of the vector going from $p_i$ to the centroid of its Voronoi cell. The critical points of $\mathcal{H}_{\text{distor}}$ are therefore the set of centroidal Voronoi configurations in $S$. This is a natural generalization of the result for the case $n = 1$, where the optimal node location is the centroid $\mathrm{CM}_\phi(S)$.


Area Problem 

For $r \in \mathbb{R}_{>0}$, consider the performance function $f(x) = 1_{[0,r]}(x)$, that is, the indicator function of the closed interval $[0, r]$. Then, the expected-value multicenter function becomes

$$\mathcal{H}_{\text{area},r}(p_1, \dots, p_n) = \sum_{i=1}^{n} A_\phi\big(V_i(P) \cap B(p_i, r)\big) = A_\phi\Big(\bigcup_{i=1}^{n} B(p_i, r)\Big),$$

which corresponds to the area, measured according to $\phi$, covered by the union of the $n$ balls $B(p_1, r), \dots, B(p_n, r)$.
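The area covered by the union of balls can be estimated by counting weighted grid samples that fall within distance $r$ of at least one node. This is a rough sketch, assuming $S$ is the unit square and a midpoint grid; `covered_area` is a hypothetical name introduced only for illustration.

```python
import math

def covered_area(points, r, phi, n_side=100):
    """Approximate A_phi of the union of closed balls B(p_i, r) inside [0, 1]^2."""
    step = 1.0 / n_side
    total = 0.0
    for i in range(n_side):
        for j in range(n_side):
            q = ((i + 0.5) * step, (j + 0.5) * step)
            # q contributes once if it lies in at least one ball.
            if any(math.dist(q, p) <= r for p in points):
                total += phi(q) * step * step
    return total

uniform = lambda q: 1.0
one_ball = covered_area([(0.5, 0.5)], 0.25, uniform)
same_twice = covered_area([(0.5, 0.5), (0.5, 0.5)], 0.25, uniform)
two_disjoint = covered_area([(0.25, 0.25), (0.75, 0.75)], 0.2, uniform)
```

Overlapping balls are not double counted, which is why repeating a node leaves the value unchanged while two disjoint balls contribute the sum of their areas.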

Let us see how the computation of the partial derivatives of $\mathcal{H}_{\text{area},r}$ specializes in this case. Here, the performance function is differentiable everywhere except at a single discontinuity, and its derivative is identically zero. Therefore, the first term in (5) vanishes. The gradient of $\mathcal{H}_{\text{area},r}$ on $S^n \setminus \mathcal{C}$ then takes the form, for each $i \in \{1, \dots, n\}$,

$$\frac{\partial \mathcal{H}_{\text{area},r}}{\partial p_i}(P) = \int_{V_i(P) \cap \partial B(p_i, r)} n_{\text{out}}(q)\,\phi(q)\,dq,$$

where $n_{\text{out}}$ is the outward normal vector to $B(p_i, r)$. The critical points of $\mathcal{H}_{\text{area},r}$ correspond to configurations with the property that each $p_i$ is a local maximum for the area of $V_{i,r}(P) = V_i(P) \cap B(p_i, r)$ at fixed $V_i(P)$. We refer to these configurations as $r$-limited area-centered Voronoi configurations.


Optimal Deployment Algorithms 

Once a set of optimal deployment configurations has been characterized, the next step is to devise a distributed algorithm that allows a group of mobile robots to converge to such configurations. Gradient algorithms are a natural first option to explore.

For the expected-value multicenter functions above, and for robots whose dynamics can be described by first-order integrators and which can communicate at predetermined communication rounds of a fixed time schedule, these laws share a similar structure, loosely described as follows:

[Informal description] In each communication round, each robot performs the following tasks: (i) it transmits its position and receives its neighbors’ positions; (ii) it computes a notion of the geometric center of its own cell, determined according to some notion of partition of the environment; (iii) between communication rounds, it moves toward this center.
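The informal description can be sketched as a centralized, discrete-time simulation of a Lloyd-type iteration for the distortion problem. This is an illustration under stated assumptions, not the distributed algorithm of the cited works: the domain is the unit square sampled on a grid, every robot is assumed to see all samples, and the names `lloyd_round`, `distortion_H`, and the parameter `gain` are hypothetical.

```python
import math

def lloyd_round(points, samples, phi, gain=1.0):
    """One round: build Voronoi cells on the samples, then move toward centroids."""
    sums = [[0.0, 0.0, 0.0] for _ in points]  # mass, x-moment, y-moment per cell
    for q in samples:
        i = min(range(len(points)), key=lambda k: math.dist(points[k], q))
        w = phi(q)
        sums[i][0] += w
        sums[i][1] += w * q[0]
        sums[i][2] += w * q[1]
    new_points = []
    for p, (mass, mx, my) in zip(points, sums):
        if mass == 0.0:            # empty cell: the robot stays where it is
            new_points.append(p)
            continue
        cx, cy = mx / mass, my / mass
        new_points.append((p[0] + gain * (cx - p[0]), p[1] + gain * (cy - p[1])))
    return new_points

def distortion_H(points, samples, phi):
    """Sampled version of H_distor on a domain of unit area."""
    return -sum(min(math.dist(p, q) ** 2 for p in points) * phi(q)
                for q in samples) / len(samples)

n_side = 20
samples = [((i + 0.5) / n_side, (j + 0.5) / n_side)
           for i in range(n_side) for j in range(n_side)]
uniform = lambda q: 1.0
P = [(0.1, 0.1), (0.2, 0.8), (0.9, 0.3)]   # initial positions, chosen arbitrarily
history = [distortion_H(P, samples, uniform)]
for _ in range(30):
    P = lloyd_round(P, samples, uniform, gain=0.5)
    history.append(distortion_H(P, samples, uniform))
```

By the parallel-axis decomposition, moving each point toward the centroid of its current cell cannot decrease the sampled value of the distortion function, and re-partitioning by nearest point cannot decrease it either, so `history` is nondecreasing.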

The notions of geometric center and of partition of the environment differ depending on the type of expected-value multicenter function used. In the Voronoi-center deployment algorithm, the geometric center just reduces to the centroid $\mathrm{CM}_\phi(V_i(P))$ of the agent's Voronoi cell. In the limited-Voronoi-normal deployment algorithm, in (ii) each agent computes the direction of $v = \frac{\partial \mathcal{H}_{\text{area},r}}{\partial p_i}$ for some $r$ and in (iii) moves for a maximum step size in this direction that ensures the area function will be increased.


The Voronoi-center deployment algorithm achieves convergence of a set of nodes to a centroidal Voronoi configuration, thus locally maximizing the expected-value multicenter function $\mathcal{H}_{\text{distor}}$. The algorithm is distributed over the Delaunay proximity graph $\mathcal{G}_D$, as the computation of the centroids requires information only from the neighbors $\mathcal{N}_{\mathcal{G}_D}(p_i)$, for each $i \in \{1, \dots, n\}$. Additional properties of this algorithm are that it is adaptive to agent departures or arrivals and amenable to asynchronous implementations.

On the other hand, the limited-Voronoi-normal deployment algorithm achieves convergence to a set that locally maximizes the area covered by the set of sensing balls. The algorithm is distributed in the sense that agents only need to know information from neighbors in the proximity graph $\mathcal{G}_{\text{disk},2r}$ or, more precisely, $\mathcal{G}_{LD,r}$. Thus, it can be implemented by agents that employ range-limited interactions. It enjoys similar robustness properties as the Voronoi-center deployment algorithm.

Simulation Results 

We show evolutions of the Voronoi-centroid de¬ 
ployment algorithm in Fig. 1. One can verify that 
the final network configuration is a centroidal 
Voronoi configuration. For each evolution we 
depict the initial positions, the trajectories, and 
the final positions of all robots. 

Finally, we show an evolution of the limited-Voronoi-normal deployment algorithm in Fig. 2. One can verify that the final network configuration is an $\frac{r}{2}$-limited area-centered Voronoi configuration. In other words, the deployment task is achieved.

Future Directions for Research 

The algorithms described above achieve locally optimal deployment configurations with respect to expected-value multicenter functions. However, this simplified setting does not account for many important constraints, such as obstacles and deployment in non-convex environments (Pimenta et al. 2008; Caicedo-Nunez and Zefran 2008), deployment with visibility sensors, range-limited and wedge-shaped footprints (Ganguli et al. 2006; Laventall and Cortes 2009), and energy and vehicle dynamical restrictions (Kwok and Martinez 2010a,b). Deployment strategies find application in exploration and data gathering tasks, and so these algorithms have been expanded to account for uncertainty and learning of unknown density functions (Schwager et al. 2009; Graham and Cortes 2012; Zhong and Cassandras 2011; Martinez 2010). Gossip and self-triggered communications (Bullo et al. 2012; Nowzari and Cortes 2012), self-triggered computations for region approximation (Ru and Martinez 2013), and area equitable partitions (Cortes 2010) have also been investigated. Much work is currently devoted to overcoming the limitations of these nontrivial extensions, which make the problem settings significantly harder to solve.

Optimal Deployment and Spatial Coverage, Fig. 1 The evolution of the Voronoi-centroid deployment algorithm with n = 20 robots. The left-hand (resp., right-hand) figure illustrates the initial (resp., final) locations and Voronoi partition. The central figure illustrates the evolution of the robots. After 13 s, the value of $\mathcal{H}_{\text{distor}}$ has monotonically increased to approximately −0.515

Optimal Deployment and Spatial Coverage, Fig. 2 The evolution of the limited-Voronoi-normal deployment algorithm with n = 20 robots and r = 0.4. The left-hand (resp., right-hand) figure illustrates the initial (resp., final) locations and Voronoi partition. The central figure illustrates the evolution of the robots. The $\frac{r}{2}$-limited Voronoi cell of each robot is plotted in light gray. After 36 s, the value of $\mathcal{H}_{\text{area},a}$, with $a = \frac{r}{2}$, has monotonically increased to approximately 14.141


Cross-References 

► Graphs for Modeling Networked Interactions 

► Multi-vehicle Routing 

► Networked Systems

Optimal Sampled-Data Control

Yutaka Yamamoto

Abstract

This article gives a brief overview of the modern development of sampled-data control. Sampled-data systems intrinsically involve a mixture of two different time sets, one continuous and the other discrete. Due to this, sampled-data systems cannot be characterized in terms of the standard notions of transfer functions, steady-state response, or frequency response. The technique of lifting resolves this difficulty and enables the recovery of such concepts and simplified solutions to sampled-data $H^\infty$ and $H^2$ optimization problems. We review the lifting point of view and its application to such optimization problems, and finally present an instructive numerical example.


Keywords 

Computer control; Frequency response; $H^\infty$ and $H^2$ optimization; Lifting; Transfer operator


Introduction 

A sampled-data control system consists of 
a continuous-time plant and a discrete-time 
controller, with sample and hold devices 
that serve as an interface between these two 
components. As can be seen from this fact, 
sampled-data systems are not time invariant, 
and various problems arise from this property. 

To be more specific, consider the unity-feedback control system shown in Fig. 1; $r$ is the reference signal, $y$ the system output, and $e$ the error signal. These are continuous-time signals. The error $e(t)$ goes through the sampler (or an A/D converter) $\mathcal{S}$. This sampler reads out the values of $e(t)$ at every time step $h$, called the sampling period, and produces a discrete-time signal $e_d[k]$, $k = 0, 1, 2, \dots$ (Fig. 2). In particular, the sampling operator $\mathcal{S}$ acts on a continuous-time signal $w(t)$, $t \ge 0$, as

$$\mathcal{S}(w)[k] := w(kh), \quad k = 0, 1, 2, \dots$$

Optimal Sampled-Data Control, Fig. 1 A unity-feedback system

Optimal Sampled-Data Control, Fig. 2 Sampling with 0-order hold

The discretized signal is then processed by the discrete-time controller $C(z)$ and becomes a control input $u_d$. There can also be a quantization effect, although for the sake of simplicity this is neglected here. The obtained signal $u_d$ then goes through another interface $\mathcal{H}$, called a hold device or a D/A converter, to become a continuous-time signal. A typical example is the 0-order hold, where $\mathcal{H}$ simply maintains the value of a discrete-time signal $w[k]$ constant as its output until the next sampling time:

$$(\mathcal{H}(w[k]))(t) := w[k], \quad \text{for } kh \le t < (k+1)h.$$

A typical sample-hold action is shown in Fig. 2.
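The sampler and the 0-order hold can be sketched directly from their definitions. The code below is a minimal illustration; the function names `sample` and `zero_order_hold` are hypothetical, and the hold is clamped to the available samples, which is an assumption made here only so that the returned function is defined for every $t \ge 0$.

```python
import math

def sample(w, h, n):
    """Sampling operator: S(w)[k] = w(k h) for k = 0, ..., n - 1."""
    return [w(k * h) for k in range(n)]

def zero_order_hold(wd, h):
    """0-order hold: output w[k] on the interval k h <= t < (k + 1) h."""
    def held(t):
        k = int(math.floor(t / h))
        k = min(max(k, 0), len(wd) - 1)   # clamp outside the sampled range (assumption)
        return wd[k]
    return held

h = 0.5
wd = sample(lambda t: t * t, h, 4)        # samples of w(t) = t^2
w_held = zero_order_hold(wd, h)
```

The held signal agrees with the original at the sampling instants and is piecewise constant in between, which is exactly where the intersample behavior of $w$ is lost.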

While one can consider a nonlinear plant $P$ or controller $C$, or infinite-dimensional $P$ and $C$, we confine ourselves to linear and finite-dimensional $P$ and $C$ and also suppose that $P$ and $C$ are time invariant in continuous time and in discrete time, respectively.


The Main Difficulty 

As stated above, the unity-feedback system of Fig. 1 is not time invariant either in continuous time or in discrete time, even when the plant and controller are both time invariant in their respective domains of operation. The mixture of the two time sets prohibits the total closed-loop system from being time invariant.

The lack of time-invariance implies that we cannot naturally associate to sampled-data systems such classical concepts as transfer functions, steady-state response, and frequency response.

One can regard Fig. 1 as a time-invariant 
discrete-time system by ignoring the intersample 
behavior and focusing attention on the sample- 
point behavior only. But the obtained model does 
not then reflect what happens between sampling 
times. This approach can lead to the neglect 
of undesirable inter-sample oscillations, called 
ripples. To monitor the intersample behavior, 
the notion of the modified z-transform was 
introduced, see, e.g., Jury (1958) and Ragazzini 
and Franklin (1958); however, this transform is 
usable only after the controller has been designed 
and hence not for the design problems considered 
in this article. 


Lifting: A Modern Approach

A new approach was introduced around 1990- 
1991 (Bamieh et al. 1991; Tadmor 1991; Toivo- 
nen 1992; Yamamoto 1990,1994). The new idea, 
now called lifting , makes it possible to describe 
sampled-data systems via a time-invariant model 
while maintaining the intersample behavior. 

Let $f(t)$ be a continuous-time signal. Instead of sampling $f(t)$, we will represent it as a sequence of functions. Namely, we set up the correspondence:

$$\mathcal{L} : f \mapsto \{f[k](\cdot)\}_{k=0}^{\infty}, \qquad f[k](\theta) = f(kh + \theta), \quad 0 \le \theta < h. \tag{1}$$

See Fig. 3.
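The lifting correspondence (1) can be illustrated by turning a function of time into a list of functions, each defined on $[0, h)$. This is a sketch with a hypothetical helper `lift`; only finitely many blocks are produced, which is an assumption made for the sake of a runnable example.

```python
def lift(f, h, n_blocks):
    """Return [f[0], ..., f[n_blocks - 1]] with f[k](theta) = f(k h + theta)."""
    def make_block(k):
        def block(theta):
            if not 0.0 <= theta < h:
                raise ValueError("theta must satisfy 0 <= theta < h")
            return f(k * h + theta)
        return block
    return [make_block(k) for k in range(n_blocks)]

h = 0.5
blocks = lift(lambda t: t, h, 4)   # lift the ramp f(t) = t
value = blocks[2](0.25)            # f[2](0.25) = f(2 * 0.5 + 0.25)
```

Unlike sampling, no information about $f$ between the instants $kh$ is discarded: the $k$th element is the whole segment of $f$ on $[kh, (k+1)h)$.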



This idea makes it possible to view a (time-invariant or even periodically time-varying) continuous-time system as a linear, time-invariant discrete-time system.

Let

$$\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t) \tag{2}$$

be a given continuous-time plant, and lift the input $u(t)$ to obtain $u[k](\cdot)$. We apply this lifted input with the timing $t = kh$ ($h$ is the prespecified sampling period as above) and observe how it affects the system. Let $x[k]$ be the state at time $t = kh$. The state $x[k+1]$ at time $(k+1)h$ is given by


$$x[k+1] = e^{Ah}x[k] + \int_0^h e^{A(h-\tau)}Bu[k](\tau)\,d\tau. \tag{3}$$

The right-hand side integral defines an operator

$$L^2[0,h) \to \mathbb{R}^n : u(\cdot) \mapsto \int_0^h e^{A(h-\tau)}Bu(\tau)\,d\tau.$$
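For a scalar plant, the state update (3) can be checked numerically. The sketch below assumes scalar coefficients `a` and `b` and approximates the integral with the midpoint rule; the function `lifted_update` and the parameter `n_quad` are hypothetical names used only for this illustration.

```python
import math

def lifted_update(a, b, h, x_k, u_k, n_quad=2000):
    """Scalar version of (3): x[k+1] = e^{a h} x[k] + int_0^h e^{a (h - tau)} b u_k(tau) dtau."""
    d = h / n_quad
    integral = sum(math.exp(a * (h - (i + 0.5) * d)) * b * u_k((i + 0.5) * d)
                   for i in range(n_quad)) * d
    return math.exp(a * h) * x_k + integral

a, b, h = -1.0, 1.0, 0.1
# Constant input over the sampling interval (as produced by a 0-order hold).
x_next = lifted_update(a, b, h, x_k=1.0, u_k=lambda tau: 1.0)
# Closed form for a constant input u: e^{a h} x + (e^{a h} - 1) / a * b * u.
x_closed = math.exp(a * h) * 1.0 + (math.exp(a * h) - 1.0) / a * b * 1.0
```

With $a = -1$, $b = 1$, $x[k] = 1$, and a unit constant input, the state is at the equilibrium of the plant, so both expressions return a value equal to 1 up to quadrature error.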


While the state transition (3) only describes a discrete-time update, the system keeps producing an output during the intersample period. If we consider the lifting of $x(t)$, it is easily seen to be described by

$$x[k](\theta) = e^{A\theta}x[k] + \int_0^\theta e^{A(\theta-\tau)}Bu[k](\tau)\,d\tau.$$

As such, the lifted output $y[k](\cdot)$ is given by

$$y[k](\theta) = Ce^{A\theta}x[k] + \int_0^\theta Ce^{A(\theta-\tau)}Bu[k](\tau)\,d\tau. \tag{4}$$

Observe that formulas (3) and (4) take the form

$$x[k+1] = \mathcal{A}x[k] + \mathcal{B}u[k], \qquad y[k] = \mathcal{C}x[k] + \mathcal{D}u[k],$$

and the operators $\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D}$ do not depend on the time variable $k$. In other words, it is possible to describe this continuous-time system with discrete timing, once we adopt the lifting point of view. To be more precise, the operators $\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D}$ are defined as follows:

$$\begin{aligned}
\mathcal{A} &: \mathbb{R}^n \to \mathbb{R}^n : x \mapsto e^{Ah}x \\
\mathcal{B} &: L^2[0,h) \to \mathbb{R}^n : u \mapsto \int_0^h e^{A(h-\tau)}Bu(\tau)\,d\tau \\
\mathcal{C} &: \mathbb{R}^n \to L^2[0,h) : x \mapsto Ce^{A\theta}x \\
\mathcal{D} &: L^2[0,h) \to L^2[0,h) : u \mapsto \int_0^\theta Ce^{A(\theta-\tau)}Bu(\tau)\,d\tau
\end{aligned} \tag{5}$$

Optimal Sampled-Data Control, Fig. 3 Lifting

Thus the continuous-time plant (2) can be described by a time-invariant discrete-time model. Once this is done, it is straightforward to connect this expression with a discrete-time controller, and hence, sampled-data systems (for example, Fig. 1) can be fully described by time-invariant discrete-time equations, without discarding the intersampling information. We will also denote the overall equation (with discrete-time controller included) abstractly in the form

$$x[k+1] = \mathcal{A}x[k] + \mathcal{B}u[k], \qquad y[k] = \mathcal{C}x[k] + \mathcal{D}u[k]. \tag{6}$$

While the obtained discrete-time model is time invariant, the input and output spaces are now infinite dimensional. Its transfer function (operator) is defined as

$$G(z) := \mathcal{D} + \mathcal{C}(zI - \mathcal{A})^{-1}\mathcal{B}. \tag{7}$$

Note that $\mathcal{A}$ in (6) is a matrix because it is so for $\mathcal{A}$ in (5). Hence, (6) is stable if $G(z)$ is analytic for $\{z : |z| \ge 1\}$, provided that there is no unstable pole-zero cancellation.

Definition 1 Let $G(z)$ be the transfer operator of the lifted system given by (7), which is stable in the sense above. The frequency response operator is the operator

$$G(e^{j\omega h}) : L^2[0,h) \to L^2[0,h) \tag{8}$$

regarded as a function of $\omega \in [0, \omega_s)$ ($\omega_s := 2\pi/h$). Its gain at $\omega$ is defined to be

$$\|G(e^{j\omega h})\| = \sup_{v \in L^2[0,h),\ v \neq 0} \frac{\|G(e^{j\omega h})v\|}{\|v\|}. \tag{9}$$

The maximum of $\|G(e^{j\omega h})\|$ over $[0, \omega_s)$ is the $H^\infty$ norm of $G(z)$. The $H^2$-norm of $G$ is defined by

$$\|G\|_2 := \left(\frac{h}{2\pi}\int_0^{2\pi/h} \operatorname{trace}\{G^*(e^{j\omega h})G(e^{j\omega h})\}\,d\omega\right)^{1/2}, \tag{10}$$

where the trace here is taken in the sense of Hilbert-Schmidt norm; see Chen and Francis (1995) for details.

$H^\infty$ and $H^2$ Control Problems

A significant consequence of the lifting approach described above is that various robust control problems such as $H^\infty$ and $H^2$ control problems for sampled-data control systems can be converted to corresponding discrete-time (finite-dimensional) problems. The approach was initiated by Chen and Francis (1990), and the problems were later solved in more complete forms by Bamieh and Pearson (1992), Kabamba and Hara (1993), Sivashankar and Khargonekar (1994), Tadmor (1991), and Toivonen (1992); see Chen and Francis (1995) for the pertinent historical accounts.

Let us introduce the notion of generalized plants. Suppose that a continuous-time plant is given in the following model:

$$\begin{aligned}
\dot{x}_c(t) &= Ax_c(t) + B_1 w(t) + B_2 u(t) \\
z(t) &= C_1 x_c(t) + D_{11} w(t) + D_{12} u(t) \\
y(t) &= C_2 x_c(t)
\end{aligned} \tag{11}$$

Here $w$ is the exogenous input, $u(t)$ the control input, $y(t)$ the measured output, and $z(t)$ the controlled output. The objective is to design a controller that takes the sampled measurements of $y$ and returns a control variable $u$ according to the following formula:

$$\begin{aligned}
x_d[k+1] &= A_d x_d[k] + B_d \mathcal{S}y[k] \\
v[k] &= C_d x_d[k] + D_d \mathcal{S}y[k] \\
u[k](\theta) &= H(\theta)v[k]
\end{aligned} \tag{12}$$

Optimal Sampled-Data Control, Fig. 4 Sampled feedback system


where H(9) is a suitable hold function. This 
is depicted in Fig. 4. The objective here is to 
design or characterize a controller that achieves 
a prescribed performance level y > 0 in such a 
way that 

\\T zw \\oo<y (13) 

where T zw denotes the closed-loop transfer oper¬ 
ator from w to Z- This is the H°° control problem 
for sampled-data systems. If we take the H 2 - 
norm (10) instead, then the problem becomes that 
of the H 2 (sub)optimal control problem. 

The difficulty here is that both w and z are 
continuous-time variables, and hence their lifted 
variables are infinite dimensional. A remarkable 
fact here is that the H°° problem (and the 
H 2 problem as well) (13) can be equivalently 
transformed to an H°° problem for a finite¬ 
dimensional discrete-time system. We will 
indicate in the next section how this can be done. 


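$H^\infty$ Norm Computation and Reduction to Finite Dimension

Let us write the system (11) and (12) in the form

$$x[k+1] = \mathcal{A}x[k] + \mathcal{B}w[k], \qquad y[k] = \mathcal{C}x[k] + \mathcal{D}w[k] \tag{14}$$

as in (6). For simplicity of treatment, assume $D_{11}$ in (11) to be zero; for the general case, see Yamamoto and Khargonekar (1996).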

Let $G(z)$ be the transfer operator $G(z) := \mathcal{D} + \mathcal{C}(zI - \mathcal{A})^{-1}\mathcal{B}$. The $H^\infty$ norm of $G$ is given as the maximum of the singular values of the gain $G(e^{j\omega h})$ for $\omega \in [0, 2\pi/h)$.

Now consider the singular value equation

$$(\gamma^2 I - G^*G(e^{j\omega h}))w = 0 \tag{15}$$


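and suppose that $\gamma > \|\mathcal{D}\|$. A crux here is that $\mathcal{A}, \mathcal{B}, \mathcal{C}$ are finite-rank operators, and we can reduce this to a finite-dimensional rank condition. Taking the adjoint of (14), we obtain

$$
\begin{aligned}
p[k] &= \mathcal{A}^* p[k+1] + \mathcal{C}^* v[k] \\
e[k] &= \mathcal{B}^* p[k+1] + \mathcal{D}^* v[k].
\end{aligned}
$$

Taking the z-transforms of both sides, setting $z = e^{j\omega h}$, and substituting $v = y$ and $e = \gamma^2 w$, we obtain

$$
\begin{aligned}
e^{j\omega h} x &= \mathcal{A}x + \mathcal{B}w \\
p &= e^{j\omega h} \mathcal{A}^* p + \mathcal{C}^*(\mathcal{C}x + \mathcal{D}w) \\
(\gamma^2 - \mathcal{D}^*\mathcal{D})w &= e^{j\omega h} \mathcal{B}^* p + \mathcal{D}^*\mathcal{C}x.
\end{aligned}
$$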


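Eliminating the variable $w$ then yields

$$
\left( e^{j\omega h}
\begin{bmatrix} I & -\mathcal{B}R_\gamma^{-1}\mathcal{B}^* \\ 0 & \mathcal{A}^* + \mathcal{C}^*\mathcal{D}R_\gamma^{-1}\mathcal{B}^* \end{bmatrix}
-
\begin{bmatrix} \mathcal{A} + \mathcal{B}R_\gamma^{-1}\mathcal{D}^*\mathcal{C} & 0 \\ -\mathcal{C}^*(I + \mathcal{D}R_\gamma^{-1}\mathcal{D}^*)\mathcal{C} & I \end{bmatrix}
\right)
\begin{bmatrix} x \\ p \end{bmatrix} = 0
\tag{16}
$$

where $R_\gamma = (\gamma^2 I - \mathcal{D}^*\mathcal{D})$. The important point to be noted here is that all the operators appearing here are actually matrices. For example, $\mathcal{B}$ is an operator from $L^2[0,h)$ to $\mathbb{R}^n$, and its adjoint $\mathcal{B}^*$ is an operator from $\mathbb{R}^n$ to $L^2[0,h)$. Hence, the composition $\mathcal{B}R_\gamma^{-1}\mathcal{B}^*$ is a linear operator from $\mathbb{R}^n$ into itself, i.e., a matrix. Thus, for a given $\gamma$, the singular value equation (15) admits a nontrivial solution $w$ if and only if the finite-dimensional equation (16) admits a nontrivial solution $[x^\top\ p^\top]^\top$ (Yamamoto 1993; Yamamoto and Khargonekar 1996). (Note that $R_\gamma$ is invertible since $\gamma > \|\mathcal{D}\|$.)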

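It is possible to find matrices $\bar{A}, \bar{B}, \bar{C}$ such that $\bar{A} = \mathcal{A} + \mathcal{B}R_\gamma^{-1}\mathcal{D}^*\mathcal{C}$, $\bar{B}\bar{B}^*/\gamma^2 = \mathcal{B}R_\gamma^{-1}\mathcal{B}^*$, and $\bar{C}^*\bar{C} = \mathcal{C}^*(I + \mathcal{D}R_\gamma^{-1}\mathcal{D}^*)\mathcal{C}$, and hence (16) is equivalent to

$$
\left( \lambda
\begin{bmatrix} I & -\bar{B}\bar{B}^*/\gamma^2 \\ 0 & \bar{A}^* \end{bmatrix}
-
\begin{bmatrix} \bar{A} & 0 \\ -\bar{C}^*\bar{C} & I \end{bmatrix}
\right)
\begin{bmatrix} x \\ p \end{bmatrix} = 0
\tag{17}
$$

for $\lambda = e^{j\omega h}$. In other words, we have that $\|G\|_\infty < \gamma$ if and only if there exists no $\lambda$ of modulus 1 such that (17) holds.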

It can be proven that by substituting the expressions of (11) and (12) for $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D})$, one obtains a finite-dimensional discrete-time generalized plant $G_d$ with digital controller (12) such that $\|G\|_\infty < \gamma$ if and only if $\|G_d\|_\infty < \gamma$. The precise formulas for the discrete-time plant can be found in, e.g., Bamieh and Pearson (1992), Chen and Francis (1995), Kabamba and Hara (1993), Yamamoto and Khargonekar (1996), and Cantoni and Glover (1997).


An $H^\infty$ Design Example

For sampled-data control systems, there used to be, and still is, a rather common myth that if one takes a sufficiently fast sampling rate, it will not cause a major problem. This can be true for continuous-time design, but we here show that if we employ a sample-point discretization without a performance consideration for intersampling behavior, fast sampling rates can cause a serious problem.

Take a simple second-order plant $P(s) = 1/(s^2 + 0.1s + 1)$, and consider the disturbance rejection problem minimizing the $H^\infty$-norm from $w$ to $z$ as given in Fig. 5. Set the sampling time $h = 0.5$. We execute the following:


• Sampled-data $H^\infty$ design with the generalized plant

$$G(s) = \begin{bmatrix} P(s) & P(s) \\ P(s) & P(s) \end{bmatrix}$$

• Discrete-time $H^\infty$ design with the discrete-time generalized plant $G_d(z)$ given by the step-invariant transformation (see, e.g., Chen and Francis 1995) of $G(s)$.

Figures 6 and 7 show the frequency and time responses of the two resulting closed-loop systems, respectively. In Fig. 6, the solid curve shows the response of the sampled-data design, while the dash-dotted curve shows the discrete-time frequency response, which reflects only its sample-point behavior. At first glance, it may appear that the discrete-time design performs better. But when we actually compute the lifted sampled-data frequency response in the sense of Definition 1, it becomes obvious that the sampled-data design is far superior. The dashed curve shows the frequency response of the closed loop, i.e., that of $G(s)$ connected with the discrete-time designed $K_d$. The response is similar to the discrete-time frequency response at low frequencies but exhibits a very sharp peak around the Nyquist frequency (i.e., half the sampling frequency; in the present case, $\pi/h \approx 6.28$ rad/s, i.e., $1/(2h) = 1$ Hz).

This can also be verified from the initial-state responses in Fig. 7 with $x(0) = (1, 1)$. The solid curve shows the sampled-data design and the dashed curve the discrete-time one. Both responses decay to zero rapidly at the sampling instants, as shown by the circles for the discrete-time design. But the discrete-time design exhibits very large ripples, with period approximately 1 s. This corresponds to 1 Hz, which is the same as $2\pi = \pi/h$ [rad/s], i.e., the Nyquist frequency. This is precisely captured in the lifted frequency response in Fig. 6.

It is worth noting that when we take the sampling period $h$ smaller, the response for the discrete-time design becomes even more oscillatory and shows a very high peak in the frequency response.
Optimal Sampled-Data Control, Fig. 5 Disturbance rejection

Optimal Sampled-Data Control, Fig. 6 Frequency responses h = 0.5


Summary, Bibliographical Notes, and Future Directions

We have given a short summary of the main achievements of modern sampled-data control theory. Particularly, we have reviewed how the technique of lifting resolved the intrinsic difficulty arising from the mixture of two distinct time sets: continuous and discrete. This idea further led to the new notions of transfer operators and frequency response. These notions together enabled us to treat optimal sampled-data control problems in a unified and transparent way. We have outlined how the sampled-data $H^\infty$ control problem can equivalently be reduced to a corresponding discrete-time $H^\infty$ problem, without sacrificing the performance in the intersample behavior. This has been exemplified by a numerical example.

There are other performance indices for optimality, typically those arising from $H^2$ and $L^1$ norms. These problems have also been studied extensively, and fairly complete solutions are available. For lack of space, we cannot list all references, and the reader is referred to Chen and Francis (1995) and Yamamoto (1999) for a more concrete survey and references therein.

For classical treatments of sampled-data control, it is instructive to consult Jury (1958) and Ragazzini and Franklin (1958). The textbook Åström and Wittenmark (1996) covers both classical and modern aspects of digital control. For a mathematical background of the computation of adjoints treated in section "$H^\infty$ Norm Computation and Reduction to Finite Dimension," consult Yamamoto (2012) as well as Yamamoto (1993).

Since control devices are now mostly digital, the importance of sampled-data control will definitely increase. While the linear, time-invariant case as treated here is now fairly complete, sampled-data control for a nonlinear or an infinite-dimensional plant seems to be still quite an open issue, and it is unclear whether the methodology treated here is effective for such classes of plants.

Optimal Sampled-Data Control, Fig. 7 Initial-state responses h = 0.5

Sampled-data control has much to do with 
signal processing. Indeed, since it can optimize 
continuous-time performance, it can shed a new 
light on digital signal processing. Traditionally, 
Shannon’s paradigm based on the perfect band- 
limiting hypothesis and the sampling theorem 
has been prevalent in the signal processing 
community. Since the sampling theorem opts 
for perfect reconstruction, the resulting theory 
reduces mostly to discrete-time problems. In 
other words, the intersample information is 
buried in the sampling theorem. It should, 
however, be noted that the very stringent band- 
limiting hypothesis is almost never satisfied 
in reality, and various approximations are 
necessitated. In contrast, sampled-data control 
can provide an optimal platform for dealing with 
and optimizing the response between sampling 


points when the band-limiting hypothesis does 
not hold. See, for example, Yamamoto et al. 
(2012) and Nagahara and Yamamoto (2012) for 
the idea and some efforts in this direction. 


Cross-References 

► Control Applications in Audio Reproduction 

► H2 Optimal Control

► H-Infinity Control 

► Optimal Control via Factorization and Model 
Matching 
Abstract

This entry reviews optimization algorithms for both linear and nonlinear model predictive control (MPC). Linear MPC typically leads to specially structured convex quadratic programs (QP) that can be solved by structure-exploiting active set, interior point, or gradient methods. Nonlinear MPC leads to specially structured nonlinear programs (NLP) that can be solved by sequential quadratic programming (SQP) or nonlinear interior point methods.
Introduction

Model predictive control (MPC) needs to solve at each sampling instant an optimal control problem with the current system state $\bar{x}_0$ as initial value. MPC optimization is almost exclusively based on the so-called direct approach, which first discretizes the continuous-time system to obtain a discrete-time optimal control problem (OCP). This OCP has as optimization variables a state trajectory $X = [x_0^\top, \ldots, x_N^\top]^\top$ with $x_i \in \mathbb{R}^{n_x}$ for $i = 0, \ldots, N$ and a control trajectory $U = [u_0^\top, \ldots, u_{N-1}^\top]^\top$ with $u_i \in \mathbb{R}^{n_u}$ for $i = 0, \ldots, N-1$. For simplicity of presentation, we restrict ourselves to the time-independent case, and the OCP we treat in this article is stated as follows:

$$
\begin{aligned}
\underset{X,\,U}{\text{minimize}} \quad & \sum_{i=0}^{N-1} L(x_i, u_i) + E(x_N) && \text{(1a)} \\
\text{subject to} \quad & x_0 - \bar{x}_0 = 0, && \text{(1b)} \\
& x_{i+1} - f(x_i, u_i) = 0, \quad i = 0, \ldots, N-1, && \text{(1c)} \\
& h(x_i, u_i) \leq 0, \quad i = 0, \ldots, N-1, && \text{(1d)} \\
& r(x_N) \leq 0. && \text{(1e)}
\end{aligned}
$$

The MPC objective is stated in Eq. (1a), the system dynamics enter via Eq. (1c), while path and terminal constraints enter via Eqs. (1d) and (1e). All functions are assumed to be differentiable and to have appropriate dimensions ($h(x,u) \in \mathbb{R}^{n_h}$ and $r(x) \in \mathbb{R}^{n_r}$). Note that $\bar{x}_0 \in \mathbb{R}^{n_x}$ is not an optimization variable, but a parameter upon which the OCP depends via the initial value constraint in Eq. (1b). The optimal solution trajectories depend only on this value and can thus be denoted by $X^*(\bar{x}_0)$ and $U^*(\bar{x}_0)$. Obtaining them, in particular the first control value $u_0^*(\bar{x}_0)$, as fast and reliably as possible for each new value of $\bar{x}_0$ is the aim of all MPC optimization algorithms. The most important dividing line is between convex and non-convex optimal control problems (OCP). If the OCP is convex, algorithms exist that find a global solution reliably and in computable time. If the OCP is not convex, one usually needs to be satisfied with approximations of locally optimal solutions. The OCP (1) is convex if the objective (1a) and all components of the inequality constraint functions (1d) and (1e) are convex and if the equality constraints (1c) are linear.

We typically speak of linear MPC when the OCP to be solved is convex, and otherwise of nonlinear MPC.

General Algorithmic Features for MPC 
Optimization 

In MPC, we would ideally like to have the solution to a new optimal control problem instantly, which is impossible due to computational delays. Several ideas can help us to deal with this issue.

Off-line Precomputations and Code 
Generation 

As consecutive MPC problems are similar and differ only in the value $\bar{x}_0$, many computations can be done once and for all before the MPC controller execution starts. Careful preprocessing and code optimization for the model routines is essential, and many tools automatically generate custom solvers in low-level languages. The generated code has fixed matrix and vector dimensions, has no online memory allocations, and contains a minimal number of if-then-else statements to ensure a smooth computational flow.

Delay Compensation by Prediction 
When we know how long our computations for solving an MPC problem will take, it is a good idea not to address a problem starting at the current state but to simulate at which state the system will be when we will have solved the problem. This can be done using the MPC system model and the open-loop control inputs that we will apply in the meantime. This feature is used in many practical MPC schemes with non-negligible computation time.
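As a minimal sketch of this idea, assume the solver needs a known number of sampling intervals and that the inputs applied during that time are already fixed. The model `model`, the function `predict_state`, and all numerical values below are hypothetical and serve only as an illustration.

```python
def predict_state(f, x_now, planned_controls):
    """Simulate the model forward over the inputs applied while the solver runs."""
    x = x_now
    for u in planned_controls:
        x = f(x, u)
    return x

def model(x, u):
    """Hypothetical scalar discrete-time model x_next = f(x, u)."""
    return 0.9 * x + 0.5 * u

x_measured = 2.0
# Open-loop inputs that will be applied during the (assumed) two-interval computation time
controls_during_computation = [-0.4, -0.2]

# The MPC problem is then posed with this predicted state as initial value
x_predicted = predict_state(model, x_measured, controls_during_computation)
```

The optimization problem is started from `x_predicted` rather than from `x_measured`, so that its solution refers to the state at which it will actually be applied.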

Division into Preparation and Feedback Phase 
A third ingredient of several MPC algorithms is to divide the computations in each sampling time into a preparation phase and a feedback phase. The more CPU-intensive preparation phase is performed with a predicted state $\bar{x}_0$, before the most current state estimate, say $\bar{x}_0'$, is available. Once $\bar{x}_0'$ is available, the feedback phase quickly delivers an approximate solution to the optimization problem for $\bar{x}_0'$.

Warmstarting and Shift 

An obvious way to transfer solution information from one solved MPC problem to the next one uses the existing optimal solution as an initial guess to start the iterative solution procedure of the next problem. We can either directly use the existing solution without modification for warmstarting or we can first shift it in order to account for the advancement of time, which is particularly advantageous for systems with time-varying dynamics or objectives.
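The shift can be sketched as follows. This is only an illustration under stated assumptions: trajectories are stored as Python lists, the new last control is obtained by repeating the previous last control (one of several possible choices), and the new terminal state is obtained by one model step. The function `shift_initialization` and the model `f` are hypothetical names.

```python
def shift_initialization(x_traj, u_traj, f, u_terminal=None):
    """Shift a previous solution by one sampling interval to warmstart the next problem.

    x_traj has N+1 entries, u_traj has N entries. The freed last slot is filled by
    repeating the last control (an assumption) and simulating one model step.
    """
    u_last = u_traj[-1] if u_terminal is None else u_terminal
    u_shift = u_traj[1:] + [u_last]
    x_shift = x_traj[1:] + [f(x_traj[-1], u_last)]
    return x_shift, u_shift

def f(x, u):
    """Hypothetical scalar model."""
    return 0.5 * x + u

# Previous solution for N = 2, consistent with the model above
x_prev = [4.0, 2.0, 1.0]
u_prev = [0.0, 0.0]

x_guess, u_guess = shift_initialization(x_prev, u_prev, f)
```

Using the unshifted pair `(x_prev, u_prev)` directly corresponds to the first warmstarting variant mentioned above.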

Iterating While the Problem Changes 
A last important ingredient of some MPC algorithms is the idea to work on the optimization problem while it changes, i.e., never to iterate the optimization procedure to convergence for an MPC problem that gets older and older during the iterations but rather to work with the most current information in each new iteration.


Convex Optimization for Linear MPC 

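Linear MPC is based on a linear system model of the form $x_{i+1} = Ax_i + Bu_i$ and convex objective and constraint functions in (1a), (1d), and (1e). The most widespread linear MPC setting uses a convex quadratic objective function and affine constraints and solves the following quadratic program (QP):

$$
\begin{aligned}
\underset{X,\,U}{\text{minimize}} \quad & \frac{1}{2} \sum_{i=0}^{N-1} \begin{bmatrix} x_i \\ u_i \end{bmatrix}^\top \begin{bmatrix} Q & S \\ S^\top & R \end{bmatrix} \begin{bmatrix} x_i \\ u_i \end{bmatrix} + \frac{1}{2} x_N^\top P x_N && \text{(2a)} \\
\text{subject to} \quad & x_0 - \bar{x}_0 = 0, && \text{(2b)} \\
& x_{i+1} - Ax_i - Bu_i = 0, \quad i = 0, \ldots, N-1, && \text{(2c)} \\
& b + Cx_i + Du_i \leq 0, \quad i = 0, \ldots, N-1, && \text{(2d)} \\
& c + Fx_N \leq 0. && \text{(2e)}
\end{aligned}
$$

Here, $b, c$ are vectors and $Q, S, R, P, C, D, F$ matrices, and the matrices $\begin{bmatrix} Q & S \\ S^\top & R \end{bmatrix}$ and $P$ are symmetric and positive semi-definite to ensure the QP is convex.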

Sparsity Exploitation 

The QP (2) has a specific sparsity structure that can be exploited in different ways. One way is to reduce the variable space by a procedure called condensing and then to solve a smaller-scale QP instead of (2). Another way is to use a banded matrix factorization.


Condensing 

The constraints (2b) and (2c) can be used to eliminate the state trajectory $X$. This yields an equivalent but smaller-scale QP of the following form:

$$
\begin{aligned}
\underset{U \in \mathbb{R}^{Nn_u}}{\text{minimize}} \quad & \frac{1}{2} \begin{bmatrix} U \\ \bar{x}_0 \end{bmatrix}^\top \begin{bmatrix} H & G \\ G^\top & J \end{bmatrix} \begin{bmatrix} U \\ \bar{x}_0 \end{bmatrix} && \text{(3a)} \\
\text{subject to} \quad & d + K\bar{x}_0 + MU \leq 0. && \text{(3b)}
\end{aligned}
$$

The number of inequality constraints is the same as in the original QP (2) and given by $m = Nn_h + n_r$. Note that in the simplest case without inequalities ($m = 0$), the solution $U^*(\bar{x}_0)$ of the condensed QP can be obtained by setting the gradient of the objective to zero, i.e., by solving $HU^*(\bar{x}_0) + G\bar{x}_0 = 0$. The factorization of a dense matrix $H$ with dimension $Nn_u \times Nn_u$ needs $O(N^3 n_u^3)$ arithmetic operations, i.e., the computational cost of condensing-based algorithms typically grows cubically with the horizon length $N$.
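The condensing step can be sketched for a small case. The following is a minimal illustration, assuming a scalar system $x_{i+1} = a x_i + b u_i$, no mixed term ($S = 0$), and no inequalities; the helper names `condense`, `solve_dense`, and `rollout_cost` and all numbers are hypothetical. The dense solve uses plain Gaussian elimination, whose cubic cost mirrors the complexity statement above.

```python
def condense(a, b, q, r, p, N):
    """Eliminate the states via x_i = a^i x0 + sum_{j<i} a^(i-1-j) b u_j; return H and G of (3a)."""
    phi = [a ** i for i in range(N + 1)]
    gamma = [[a ** (i - 1 - j) * b if j < i else 0.0 for j in range(N)]
             for i in range(N + 1)]
    w = [q] * N + [p]  # stage weights for x_0..x_{N-1}, terminal weight for x_N
    H = [[sum(w[i] * gamma[i][j] * gamma[i][l] for i in range(N + 1))
          + (r if j == l else 0.0) for l in range(N)] for j in range(N)]
    G = [sum(w[i] * gamma[i][j] * phi[i] for i in range(N + 1)) for j in range(N)]
    return H, G

def solve_dense(M, rhs):
    """Gaussian elimination with partial pivoting (cubic cost in the dimension)."""
    n = len(rhs)
    aug = [row[:] + [rhs[k]] for k, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda rr: abs(aug[rr][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for rr in range(col + 1, n):
            factor = aug[rr][col] / aug[col][col]
            for cc in range(col, n + 1):
                aug[rr][cc] -= factor * aug[col][cc]
    x = [0.0] * n
    for rr in reversed(range(n)):
        tail = sum(aug[rr][cc] * x[cc] for cc in range(rr + 1, n))
        x[rr] = (aug[rr][n] - tail) / aug[rr][rr]
    return x

def rollout_cost(a, b, q, r, p, x0, U):
    """Evaluate objective (2a) by forward simulation, for checking only."""
    x, cost = x0, 0.0
    for u in U:
        cost += 0.5 * (q * x * x + r * u * u)
        x = a * x + b * u
    return cost + 0.5 * p * x * x

a, b, q, r, p, N, x0 = 1.2, 1.0, 1.0, 0.1, 1.0, 5, 1.0
H, G = condense(a, b, q, r, p, N)
U_opt = solve_dense(H, [-g * x0 for g in G])  # solves H U + G x0 = 0
```

Because `H` is dense, nothing of the stage-wise structure of (2) survives in this formulation; only the number of unknowns is reduced.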


Banded Matrix Factorization 
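An alternative way to deal with the sparsity is best sketched using the sparse convex QP (2) without inequality constraints (2d) and (2e). We define the vector of Lagrange multipliers $Y = [y_0^\top, \ldots, y_N^\top]^\top$ and the Lagrangian function by

$$
\mathcal{L}(X, U, Y) = y_0^\top (x_0 - \bar{x}_0) + \frac{1}{2} x_N^\top P x_N + \sum_{i=0}^{N-1} \left( \frac{1}{2} \begin{bmatrix} x_i \\ u_i \end{bmatrix}^\top \begin{bmatrix} Q & S \\ S^\top & R \end{bmatrix} \begin{bmatrix} x_i \\ u_i \end{bmatrix} + y_{i+1}^\top (x_{i+1} - Ax_i - Bu_i) \right). \tag{4}
$$

If we reorder all unknowns that enter the Lagrangian and summarize them in the vector $W = [y_0^\top, x_0^\top, u_0^\top, y_1^\top, x_1^\top, u_1^\top, \ldots, y_N^\top, x_N^\top]^\top$, the optimal solution $W^*(\bar{x}_0)$ is uniquely characterized by the first-order optimality condition

$$\nabla_W \mathcal{L}(W^*) = 0.$$

Due to the linear quadratic dependence of $\mathcal{L}$ on $W$, this is a block-banded linear equation in the unknown $W^*$:

$$
\begin{bmatrix}
0 & I & & & & & & \\
I & Q & S & -A^\top & & & & \\
 & S^\top & R & -B^\top & & & & \\
 & -A & -B & 0 & I & & & \\
 & & & I & Q & \cdots & & \\
 & & & & \vdots & \ddots & & \\
 & & & & & & 0 & I \\
 & & & & & & I & P
\end{bmatrix}
W^* =
\begin{bmatrix} \bar{x}_0 \\ 0 \\ 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix}
$$

Because the above matrix has nonzero elements only on a band of width $2n_x + n_u$ around the diagonal, it can be factorized with a computational cost of order $O(N(2n_x + n_u)^3)$.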
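One well-known way to solve this block-banded system stage by stage, with cost linear in $N$, is a backward Riccati recursion followed by a forward simulation. It is used below as a stand-in illustration and is not claimed to be the specific factorization meant in the text. The sketch assumes a scalar system, $S = 0$, and the sign convention of the Lagrangian (4), under which the multipliers satisfy $y_i = -P_i x_i$; all function names and numbers are hypothetical.

```python
def riccati_solve(a, b, q, r, p, N, x0_bar):
    """Solve the equality-constrained QP (2a)-(2c) for a scalar system in O(N) operations."""
    P = [0.0] * (N + 1)
    K = [0.0] * N
    P[N] = p
    for i in reversed(range(N)):           # backward sweep
        K[i] = a * b * P[i + 1] / (r + b * b * P[i + 1])
        P[i] = q + a * a * P[i + 1] - a * b * P[i + 1] * K[i]
    x, u = [x0_bar], []
    for i in range(N):                     # forward sweep
        u.append(-K[i] * x[i])
        x.append(a * x[i] + b * u[i])
    y = [-P[i] * x[i] for i in range(N + 1)]  # multipliers in the convention of (4)
    return x, u, y

def kkt_residual(a, b, q, r, p, N, x0_bar, x, u, y):
    """Largest absolute residual over all block rows of the banded KKT system."""
    res = [x[0] - x0_bar]
    for i in range(N):
        res.append(y[i] + q * x[i] - a * y[i + 1])   # stationarity w.r.t. x_i
        res.append(r * u[i] - b * y[i + 1])          # stationarity w.r.t. u_i
        res.append(x[i + 1] - a * x[i] - b * u[i])   # dynamics (2c)
    res.append(y[N] + p * x[N])                      # stationarity w.r.t. x_N
    return max(abs(v) for v in res)

a, b, q, r, p, N, x0_bar = 1.2, 1.0, 1.0, 0.1, 1.0, 5, 1.0
x_opt, u_opt, y_opt = riccati_solve(a, b, q, r, p, N, x0_bar)
worst = kkt_residual(a, b, q, r, p, N, x0_bar, x_opt, u_opt, y_opt)
```

The residual check evaluates exactly the block rows of the banded matrix equation, so a value of `worst` near machine precision indicates that the recursion has produced a solution $W^*$ of that system.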

Treatment of Inequalities 

An important observation is that both the uncondensed QP (2) and the equivalent condensed QP (3) typically fall into the class of strictly convex parametric quadratic programs: the solution $U^*(\bar{x}_0)$, $X^*(\bar{x}_0)$ is unique and depends piecewise affinely and continuously on the parameter $\bar{x}_0$. Each affine piece of the solution map corresponds to one active set and is valid on one polyhedral critical region in parameter space. This observation forms the basis of explicit MPC algorithms which precompute the map $u_0^*(\bar{x}_0)$, but it can also be exploited in online algorithms for quadratic programming, which are the focus of this section.

We sketch the different ways of how to treat inequalities only for the condensed QP (3), but they can equally be applied in sparse QP algorithms that directly address (2). The optimal solution $U^*(\bar{x}_0)$ for a strictly convex QP (3) is, together with the corresponding vector of Lagrange multipliers, or dual variables $\lambda^*(\bar{x}_0) \in \mathbb{R}^m$, uniquely characterized by the so-called Karush-Kuhn-Tucker (KKT) conditions, which we omit for brevity. There are three big families of solution algorithms for inequality-constrained QPs that differ in the way they treat inequalities: active set methods, interior point methods, and gradient projection methods.


Active Set Methods 

The optimal solution of the QP is characterized by its active set, i.e., the set of inequality constraints (3b) that are satisfied with equality at this point. If one knew the active set for a given problem instance $\bar{x}_0$, it would be easy to find the solution. Active set methods work with guesses of the active set which they iteratively refine. In each iteration, they solve one linear system corresponding to a given guess of the active set. If the KKT conditions are satisfied, the optimal solution is found; if they are not, another division into active and inactive constraints needs to be tried. A crucial observation is that an existing factorization of the linear system can be reused to a large extent when only one constraint is added to or removed from the guess of the active set. Many different active set strategies exist, three of which we mention: Primal active set methods first find a feasible point and then add or remove active constraints, always keeping the primal variables $U$ feasible. Due to the difficulty of finding a feasible point first, they are difficult to warmstart in the context of MPC optimization. Dual active set strategies always keep the dual variables $\lambda$ nonnegative. They can easily be warmstarted in the context of MPC. Parametric or online active set strategies ensure that all iterates stay primal and dual feasible and go on a straight line in parameter space from a solved QP problem to the current one, only updating the active set when crossing the boundary between critical regions, as implemented in the online QP solver qpOASES.

Active set methods are very competitive in practice, but their worst-case complexity is not polynomial. They are often used together with the condensed QP formulation, for which each active set change is relatively cheap. This is particularly advantageous if an existing factorization can be kept between subsequent MPC optimization problems.

Interior Point Methods 

Another approach is to replace the KKT conditions by a smooth approximation that uses a small positive parameter $\tau > 0$:

$$
\begin{aligned}
HU^* + G\bar{x}_0 + M^\top \lambda^* &= 0, && \text{(5a)} \\
(d + K\bar{x}_0 + MU^*)_i\, \lambda_i^* + \tau &= 0, \quad i = 1, \ldots, m. && \text{(5b)}
\end{aligned}
$$

These conditions form a smooth nonlinear system of equations that uniquely determines a primal-dual solution $U^*(\bar{x}_0, \tau)$ and $\lambda^*(\bar{x}_0, \tau)$ in the interior of the feasible set. They are not equivalent to the KKT conditions, but for $\tau \to 0$, their solution tends to the exact QP solution. An interior point algorithm solves the system (5a) and (5b) by Newton's method. Simultaneously, the path parameter $\tau$, which was initially set to a large value, is iteratively reduced, making the nonlinear set of equations a closer approximation of the original KKT system. In each Newton iteration, a linear system needs to be factored and solved, which constitutes the major computational cost of an interior point algorithm. For the condensed QP (3) with dense matrices $H, M$, the cost per Newton iteration is of order $O(N^3)$. But the interior point algorithm can also be applied to the uncondensed sparse QP (2), in which case each iteration has a runtime of order $O(N)$. In practice, for both cases, 10-30 Newton iterations usually suffice to obtain very accurate solutions. As an interior point method always needs to start with a high value of $\tau$ and then reduces it during the iterations, warmstarting is of minor benefit. There exist efficient code generation tools that export convex interior point solvers as plain C code, such as CVXGEN and FORCES.
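The interplay of Newton steps on (5a) and (5b) with a shrinking path parameter can be sketched for a tiny problem. The following illustration assumes a single decision variable $U$, bound constraints written as $d_i + M_i U \leq 0$ (the term $K\bar{x}_0$ is absorbed into $d$), a strictly feasible start at $U = 0$, a fixed reduction factor for $\tau$, and a simple fraction-to-boundary step rule. These choices and all names are hypothetical simplifications, not the scheme of any particular solver.

```python
def interior_point_scalar_qp(H, g, M, d, tau=1.0, tau_min=1e-10, iters=40):
    """Primal-dual interior point sketch for min 0.5*H*U^2 + g*U  s.t.  d_i + M_i*U <= 0."""
    U = 0.0                     # assumed strictly feasible: d_i + M_i * U < 0 for all i
    lam = [1.0] * len(d)        # positive initial multipliers
    for _ in range(iters):
        c = [d_i + M_i * U for d_i, M_i in zip(d, M)]                  # constraint values (< 0)
        r1 = H * U + g + sum(M_i * l_i for M_i, l_i in zip(M, lam))    # residual of (5a)
        r2 = [c_i * l_i + tau for c_i, l_i in zip(c, lam)]             # residual of (5b)
        # Newton step with the multiplier updates eliminated
        H_red = H - sum(M_i * M_i * l_i / c_i for M_i, l_i, c_i in zip(M, lam, c))
        rhs = -r1 + sum(M_i * r_i / c_i for M_i, r_i, c_i in zip(M, r2, c))
        dU = rhs / H_red
        dlam = [(-r_i - M_i * l_i * dU) / c_i
                for r_i, M_i, l_i, c_i in zip(r2, M, lam, c)]
        # Step length keeping the iterate strictly interior
        alpha = 1.0
        for c_i, M_i, l_i, dl_i in zip(c, M, lam, dlam):
            if M_i * dU > 0.0:
                alpha = min(alpha, 0.95 * (-c_i) / (M_i * dU))
            if dl_i < 0.0:
                alpha = min(alpha, 0.95 * l_i / (-dl_i))
        U += alpha * dU
        lam = [l_i + alpha * dl_i for l_i, dl_i in zip(lam, dlam)]
        tau = max(0.2 * tau, tau_min)   # reduce the path parameter
    return U, lam

# Hypothetical data: unconstrained minimizer is U = 3, bounds are -1 <= U <= 1
U_ip, lam_ip = interior_point_scalar_qp(H=1.0, g=-3.0, M=[1.0, -1.0], d=[-1.0, -1.0])
```

For this data the upper bound is active at the solution, so the iterates approach $U = 1$ from the interior while the multiplier of the inactive lower bound tends to zero with $\tau$.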

Gradient Projection Methods 
Gradient projection methods do not need to factorize any matrix but only evaluate the gradient of the objective function $HU^{[k]} + G\bar{x}_0$ in each iteration. They can only be implemented efficiently if the feasible set is a simple set in the sense that a projection $\mathcal{P}(U)$ on this set is very cheap to compute, as, e.g., for upper and lower bounds on the variables $U$, and if we know an upper bound $L_H > 0$ on the eigenvalues of the Hessian $H$. The simple gradient projection algorithm starts with an initialization $U^{[0]}$ and proceeds as follows:

$$U^{[k+1]} = \mathcal{P}\left( U^{[k]} - \frac{1}{L_H} \left( HU^{[k]} + G\bar{x}_0 \right) \right).$$

An improved version of the gradient projection algorithm is called the optimal or fast gradient method and has provably the best possible iteration complexity of all gradient-type methods. All variants of gradient projection algorithms are easy to warmstart. Though they are not as versatile as active set or interior point methods, they have short code sizes and can offer advantages on embedded computational hardware, such as the code generated by the tool FIORDOS.
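The simple gradient projection iteration can be sketched in a few lines. The example below assumes box constraints, so that the projection is a componentwise clipping, and uses the largest absolute row sum of $H$ as the bound $L_H$ (a Gershgorin-type bound). The vector `g` stands for $G\bar{x}_0$; the problem data and function names are hypothetical.

```python
def project_box(U, lower, upper):
    """Projection onto a box: clip each component to its bounds."""
    return [min(max(u, lo), hi) for u, lo, hi in zip(U, lower, upper)]

def gradient_projection(H, g, lower, upper, L_H, U0, iters=50):
    """Iterate U <- P(U - (1/L_H) * (H U + g)) for a fixed number of iterations."""
    U = list(U0)
    for _ in range(iters):
        grad = [sum(h_jl * u_l for h_jl, u_l in zip(row, U)) + g_j
                for row, g_j in zip(H, g)]
        U = project_box([u - gr / L_H for u, gr in zip(U, grad)], lower, upper)
    return U

# Hypothetical strictly convex QP with bounds -1 <= U_j <= 1
H = [[2.0, 0.5], [0.5, 1.0]]
g = [-4.0, 1.0]
L_H = max(sum(abs(v) for v in row) for row in H)   # upper bound on the eigenvalues of H

U_gp = gradient_projection(H, g, [-1.0, -1.0], [1.0, 1.0], L_H, [0.0, 0.0])
```

No matrix is factorized at any point; each iteration costs one matrix-vector product and one clipping operation, which is what makes the method attractive for small embedded targets.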


Optimization Algorithms for 
Nonlinear MPC 

When the dynamic system $x_{i+1} = f(x_i, u_i)$ is not affine, the optimal control problem (1) is non-convex, and we speak of a nonlinear MPC (NMPC) problem. NMPC optimization algorithms only aim at finding a locally optimal solution of this problem, and they usually do it in a Newton-type framework. For ease of notation, we summarize problem (1) in the form of a general nonlinear programming problem (NLP):

$$
\begin{aligned}
\underset{X,\,U}{\text{minimize}} \quad & \Phi(X, U) && \text{(6a)} \\
\text{subject to} \quad & G_{\mathrm{eq}}(X, U, \bar{x}_0) = 0, && \text{(6b)} \\
& G_{\mathrm{ineq}}(X, U) \leq 0. && \text{(6c)}
\end{aligned}
$$

Let us first discuss a fundamental choice that 
regards the problem formulation and number of 
optimization variables. 

Simultaneous vs. Sequential Formulation 

When an optimization algorithm addresses problem (6) iteratively, it works intermediately with nonphysical, infeasible trajectories that violate the system constraints (6b). Only at the optimal solution is the constraint residual brought to zero and a physical simulation achieved. We speak of a simultaneous approach to optimal control because the algorithm solves the simulation and the optimization problems simultaneously. Variants of this approach are direct discretization or direct multiple shooting.

On the other hand, the equality constraints (6b) could easily be eliminated by a nonlinear forward simulation for given initial value $\bar{x}_0$ and control trajectory $U$, similar to condensing in linear MPC. Such a forward simulation generates a state trajectory $X_{\mathrm{sim}}(\bar{x}_0, U)$ such that for the given value of $\bar{x}_0, U$ the equality constraints (6b) are automatically satisfied: $G_{\mathrm{eq}}(X_{\mathrm{sim}}(\bar{x}_0, U), U, \bar{x}_0) = 0$. Inserting this map into the NLP (6) allows us to formulate an equivalent optimization problem with a reduced variable space:

$$
\begin{aligned}
\underset{U}{\text{minimize}} \quad & \Phi(X_{\mathrm{sim}}(\bar{x}_0, U), U) && \text{(7a)} \\
\text{subject to} \quad & G_{\mathrm{ineq}}(X_{\mathrm{sim}}(\bar{x}_0, U), U) \leq 0. && \text{(7b)}
\end{aligned}
$$

When solving this reduced problem with an iterative optimization algorithm, we sequentially simulate and optimize the system, and we speak of the sequential approach to optimal control.
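The forward simulation map $X_{\mathrm{sim}}$ and the reduced objective (7a) can be sketched as follows. The nonlinear model `f`, the quadratic stage and terminal costs, and all function names are hypothetical and serve only to illustrate the elimination of the equality constraints.

```python
def simulate(f, x0_bar, U):
    """Forward simulation X_sim(x0_bar, U): returns [x_0, ..., x_N]."""
    X = [x0_bar]
    for u in U:
        X.append(f(X[-1], u))
    return X

def equality_residual(f, X, U, x0_bar):
    """Residual of the equality constraints (6b) for a given pair (X, U)."""
    res = [X[0] - x0_bar]
    res += [X[i + 1] - f(X[i], U[i]) for i in range(len(U))]
    return res

def reduced_objective(f, x0_bar, U):
    """Objective (7a): the cost as a function of U only (hypothetical quadratic costs)."""
    X = simulate(f, x0_bar, U)
    return sum(x * x + u * u for x, u in zip(X, U)) + X[-1] ** 2

def f(x, u):
    """Hypothetical nonlinear scalar model."""
    return x + 0.1 * (u - x ** 3)

x0_bar = 1.0
U_guess = [0.0, -0.5, 0.2]
X_sim = simulate(f, x0_bar, U_guess)
residual = equality_residual(f, X_sim, U_guess, x0_bar)
J_reduced = reduced_objective(f, x0_bar, U_guess)
```

In the simultaneous approach, by contrast, `X` would be an independent optimization variable and `equality_residual` would in general be nonzero at intermediate iterates.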

The sequential approach has a lower dimensional variable space and is thus easier to use with a black-box NLP solver. On the other hand, the simultaneous approach leads to a sparse NLP and is better able to deal with unstable nonlinear systems. In the remainder of this section, we thus only discuss the specific structure of the simultaneous approach.

Newton-Type Optimization 

In order to simplify notation further, we summarize and reorder all optimization variables $U$ and $X$ in a vector $V = [x_0^\top, u_0^\top, \ldots, x_{N-1}^\top, u_{N-1}^\top, x_N^\top]^\top$ and use the same problem function names as in (6) also with the new argument $V$.

As in the section on linear MPC, we can introduce multipliers $Y$ for the equalities and $\lambda$ for the inequalities and define the Lagrangian

$$\mathcal{L}(V, Y, \lambda) = \Phi(V) + Y^\top G_{\mathrm{eq}}(V, \bar{x}_0) + \lambda^\top G_{\mathrm{ineq}}(V). \tag{8}$$

All Newton-type optimization methods try to find a point satisfying the KKT conditions by using successive linearizations of the problem functions and Lagrangian. For this aim, starting with an initial guess $(V^{[0]}, Y^{[0]}, \lambda^{[0]})$, they generate sequences of primal-dual iterates $(V^{[k]}, Y^{[k]}, \lambda^{[k]})$.

An observation that is crucial for the efficiency of all NMPC algorithms is that the Hessian of the Lagrangian is at a current iterate given by a matrix of the form

$$
\nabla_V^2 \mathcal{L}(\cdot) =
\begin{bmatrix}
Q_0^{[k]} & S_0^{[k]} & & & & \\
S_0^{[k],\top} & R_0^{[k]} & & & & \\
 & & \ddots & & & \\
 & & & Q_{N-1}^{[k]} & S_{N-1}^{[k]} & \\
 & & & S_{N-1}^{[k],\top} & R_{N-1}^{[k]} & \\
 & & & & & P^{[k]}
\end{bmatrix}.
$$

This block sparse matrix structure makes it possible to use in each Newton-type iteration the same sparsity-exploiting linear algebra techniques as outlined in section "Convex Optimization for Linear MPC" for the linear MPC problem.

Major differences exist on how to treat the inequality constraints, and the two big families of Newton-type optimization methods are sequential quadratic programming (SQP) methods and nonlinear interior point (NIP) methods.

Sequential Quadratic Programming (SQP)

A first variant to iteratively solve the KKT system is to linearize all nonlinear functions at the current iterate $(V^{[k]}, Y^{[k]}, \lambda^{[k]})$ and to find a new solution guess from the solution of a quadratic program (QP):

$$
\begin{aligned}
\underset{V}{\text{minimize}} \quad & \Phi_{\mathrm{quad}}(V; V^{[k]}, Y^{[k]}, \lambda^{[k]}) && \text{(9a)} \\
\text{subject to} \quad & G_{\mathrm{eq,lin}}(V, \bar{x}_0; V^{[k]}) = 0, && \text{(9b)} \\
& G_{\mathrm{ineq,lin}}(V; V^{[k]}) \leq 0. && \text{(9c)}
\end{aligned}
$$

Here, the subindex "lin" in the constraints $G_{\cdot,\mathrm{lin}}$ expresses that a first-order Taylor expansion at $V^{[k]}$ is used, while the QP objective is given by $\Phi_{\mathrm{quad}}(V; V^{[k]}, Y^{[k]}, \lambda^{[k]}) = \Phi_{\mathrm{lin}}(V; V^{[k]}) + \frac{1}{2}(V - V^{[k]})^\top \nabla_V^2 \mathcal{L}(\cdot)(V - V^{[k]})$. Note that the QP has the same sparsity structure as the QP (2) resulting from linear MPC, with the only difference that all matrices are now time varying over the MPC horizon. In the case that the Hessian matrix is positive semi-definite, this QP is convex so that global solutions can be found reliably with any of the methods from section "Convex Optimization for Linear MPC." The solution of the QP along with the corresponding constraint multipliers gives the next SQP iterate $(V^{[k+1]}, Y^{[k+1]}, \lambda^{[k+1]})$. Apart from the presented "exact Hessian" SQP variant, which has quadratic convergence speed, several other SQP variants exist, which make use of other Hessian approximations. A particularly useful Hessian approximation for NMPC is possible if the original objective function $\Phi(V)$ is convex quadratic, and the resulting SQP variant is called the generalized Gauss-Newton method. In this case, one can just use the original objective as cost function in the QP (9a), resulting in convex QP subproblems and (often fast) linear convergence speed.

Nonlinear Interior Point (NIP) Method 
In contrast to SQP methods, an alternative way 
to address the solution of the KKT system is to 
replace the last nonsmooth KKT conditions by a 
smooth nonlinear approximation, with $\tau > 0$:

$$\nabla_V \mathcal{L}(V^*, Y^*, \lambda^*) = 0 \tag{10a}$$

$$G_{\mathrm{eq}}(V^*, x_0) = 0 \tag{10b}$$

$$G_{\mathrm{ineq},i}(V^*)\,\lambda_i^* + \tau = 0, \quad i = 1, \ldots, m. \tag{10c}$$

We summarize all variables in a vector $W =
[V^T, Y^T, \lambda^T]^T$ and summarize the above set of
equations as

$$G_{\mathrm{NIP}}(W, x_0, \tau) = 0. \tag{11}$$

The resulting root finding problem is then
solved with Newton’s method, for a descending
sequence of path parameters $\tau$. The NIP


method proceeds thus exactly as in an interior 
point method for convex problems, with the only 
difference that it has to re-linearize all problem 
functions in each iteration. An excellent software 
implementation of the NIP method is given in the 
form of the code IPOPT. 
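
The path-following idea can be sketched on a one-variable toy problem. The following Python fragment is only an illustration under stated assumptions: the problem (minimize $\frac{1}{2}(v - a)^2$ subject to $-v \le 0$), the function name `solve_nip`, the chosen sequence of $\tau$ values, and the step-halving safeguard are all hypothetical and are not taken from IPOPT or from this article. It applies Newton's method to the smoothed conditions of type (10a) and (10c) while $\tau$ is decreased.

```python
def solve_nip(a, taus=(1.0, 1e-1, 1e-2, 1e-4, 1e-8), max_newton=50, tol=1e-12):
    """Path-following Newton method for the toy problem
    min 0.5*(v - a)**2  subject to  G_ineq(v) = -v <= 0."""
    v, lam = 1.0, 1.0  # strictly interior starting point (v > 0, lam > 0)
    for tau in taus:   # descending sequence of path parameters
        for _ in range(max_newton):
            r1 = (v - a) - lam   # stationarity of L = 0.5*(v-a)^2 + lam*(-v)
            r2 = -v * lam + tau  # smoothed complementarity G_ineq(v)*lam + tau
            if abs(r1) + abs(r2) < tol:
                break
            # Newton step: solve [[1, -1], [-lam, -v]] (dv, dlam) = -(r1, r2)
            det = -(v + lam)
            dv = (v * r1 - r2) / det
            dlam = (-r2 - lam * r1) / det
            # Halve the step until the iterate stays strictly interior.
            alpha = 1.0
            while v + alpha * dv <= 0.0 or lam + alpha * dlam <= 0.0:
                alpha *= 0.5
            v += alpha * dv
            lam += alpha * dlam
    return v, lam


# Constraint active at the solution (a < 0): v tends to 0, multiplier to -a.
print(solve_nip(-1.0))
# Constraint inactive at the solution (a > 0): v tends to a, multiplier to 0.
print(solve_nip(2.0))
```

In this sketch the problem functions are re-linearized in every Newton iteration, which mirrors the remark above about nonlinear problems; a production solver adds globalization and linear algebra that are omitted here.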

Continuation Methods and Tangential 
Predictors 

In nonlinear MPC, a sequence of OCPs with 
different initial values $x_0^{[0]}, x_0^{[1]}, x_0^{[2]}, \ldots$ is solved.
For the transition from one problem to the next,
it is beneficial to take into account the fact that
the optimal solution $W^*(x_0)$ depends almost
everywhere differentiably on $x_0$. The concept of a
continuation method is most easily explained in
the context of an NIP method with fixed path
parameter $\bar{\tau} > 0$. In this case, the solution
$W^*(x_0, \bar{\tau})$ of the smooth root finding problem
$G_{\mathrm{NIP}}(W^*(x_0, \bar{\tau}), x_0, \bar{\tau}) = 0$ from Eq. (11) is
smooth with respect to $x_0$. This smoothness can
be exploited by making use of a tangential
predictor in the transition from one value of $x_0$ to
another. Unfortunately, the interior point solution 
manifold is strongly nonlinear at points where the 
active set changes, and the tangential predictor is 
not a good approximation when we linearize at 
such points. 
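
The loss of accuracy near active set changes can be made concrete on a scalar toy problem. The sketch below is an assumption-laden illustration, not part of the original article: it uses the hypothetical problem of minimizing $\frac{1}{2}(v - a)^2$ subject to $-v \le 0$, where the parameter $a$ plays the role of $x_0$. For fixed $\bar{\tau}$ the smoothed conditions give $v(v - a) = \bar{\tau}$, so the interior point solution is available in closed form and its tangent can be compared with the true solution.

```python
import math


def v_star(a, tau):
    """Positive root of v*(v - a) = tau: interior point solution for parameter a."""
    return 0.5 * (a + math.sqrt(a * a + 4.0 * tau))


def dv_star_da(a, tau):
    """Derivative of v_star with respect to the parameter a."""
    return 0.5 * (1.0 + a / math.sqrt(a * a + 4.0 * tau))


def tangential_predictor(a0, a1, tau):
    """First-order prediction of v_star(a1) from the solution at a0."""
    return v_star(a0, tau) + dv_star_da(a0, tau) * (a1 - a0)


tau_bar = 1e-4
# Same parameter step of 0.2, once far from and once across the active set change at a = 0.
for a0, a1 in ((1.0, 1.2), (-0.1, 0.1)):
    error = abs(tangential_predictor(a0, a1, tau_bar) - v_star(a1, tau_bar))
    print(f"a: {a0:+.1f} -> {a1:+.1f}   predictor error = {error:.2e}")
```

Far from $a = 0$ the predictor is accurate, while a step of the same length across $a = 0$, where the constraint switches between active and inactive, produces an error of the order of the step itself.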

Generalized Tangential Predictor and 
Real-Time Iterations 

In fact, the true NLP solution is not determined 
by a smooth root finding problem (10a)-(10c)
but by the (nonsmooth) KKT conditions. The 
solution manifold has smooth parts when the 
active set does not change, but non-differentiable 
points occur whenever the active set changes. 
We can deal with this fact naturally in an SQP 
framework by solving one QP of form (9) in 
order to generate a tangential predictor that 
is also valid in the presence of active set 
changes. In the extreme case that only one 
such QP is solved per sampling time, we speak 
of a real-time iteration (RTI) algorithm. The 
computations in each iteration can be subdivided 
into two phases, the preparation phase , in 
which the derivatives are computed and the QP 
is condensed, and the feedback phase , which 
only starts once the current initial value $x_0$ becomes available and
in which only a condensed QP of form (3) is
solved, minimizing the feedback delay. This 
NMPC algorithm can be generated as plain 
C-code, e.g., by the tool ACADO. Another 
class of real-time NMPC algorithms based on 
a continuation method can be generated by the 
tool AutoGenU. 


Cross-References 

► Explicit Model Predictive Control 

► Model-Predictive Control in Practice 

► Numerical Methods for Nonlinear Optimal 
Control Problems 

Recommended Reading 

Many of the algorithmic ideas presented in this 
article can be used in different combinations than 
those presented, and several other ideas had to be 
omitted for the sake of brevity. Some more details 
can be found in the following two overview arti¬ 
cles on MPC optimization: Binder et al. (2001) 
and Diehl et al. (2009). The general field of nu¬ 
merical optimal control is treated in Bryson and 
Ho (1975), Betts (2010), and the even broader 
field of numerical optimization is covered in the 
excellent textbooks (Fletcher 1987; Wright 1997; 
Nesterov 2004; Gill et al. 1999; Nocedal and 
Wright 2006; Biegler 2010). General purpose 
open-source software for MPC and NMPC is
described in the following papers: FORCES
(Domahidi et al. 2012), CVXGEN (Mattingley and
Boyd 2009), qpOASES (Ferreau et al. 2008),
FiOrdOs (Richter et al. 2011), AutoGenU
(Ohtsuka and Kodama 2002), ACADO (Houska et al.
2011), and IPOPT (Wächter and Biegler 2006).


Bibliography 

Betts JT (2010) Practical methods for optimal control 
and estimation using nonlinear programming, 2nd edn. 
SIAM, Philadelphia 

Biegler LT (2010) Nonlinear programming. SIAM, 
Philadelphia 


Binder T, Blank L, Bock HG, Bulirsch R, Dahmen
W, Diehl M, Kronseder T, Marquardt W,
Schlöder JP, Stryk OV (2001) Introduction to
model based optimization of chemical processes
on moving horizons. In: Grötschel M, Krumke
SO, Rambau J (eds) Online optimization of large
scale systems: state of the art. Springer, Berlin,
pp 295-340

Bryson AE, Ho Y-C (1975) Applied optimal control. 
Wiley, New York 

Diehl M, Ferreau HJ, Haverbeke N (2009) Efficient nu¬ 
merical methods for nonlinear MPC and moving hori¬ 
zon estimation. In: Nonlinear model predictive control. 
Lecture notes in control and information sciences, 
vol 384. Springer, Berlin, pp 391-417 

Domahidi A, Zgraggen A, Zeilinger MN, Morari M, Jones 
CN (2012) Efficient interior point methods for mul¬ 
tistage problems arising in receding horizon control. 
In: IEEE conference on decision and control (CDC), 
Maui, Dec 2012, pp 668-674 

Ferreau HJ, Bock HG, Diehl M (2008) An online ac¬ 
tive set strategy to overcome the limitations of ex¬ 
plicit MPC. Int J Robust Nonlinear Control 18(8): 
816-830 

Fletcher R (1987) Practical methods of optimization, 2nd 
edn. Wiley, Chichester 

Gill PE, Murray W, Wright MH (1999) Practical optimiza¬ 
tion. Academic, London 

Houska B, Ferreau HJ, Diehl M (2011) An auto¬ 
generated real-time iteration algorithm for nonlinear 
MPC in the microsecond range. Automatica 47(10): 
2279-2285 

Mattingley J, Boyd S (2009) Automatic code generation 
for real-time convex optimization. In: Convex op¬ 
timization in signal processing and communica¬ 
tions. Cambridge University Press, New York, 
pp 1-43 

Nesterov Y (2004) Introductory lectures on convex opti¬ 
mization: a basic course. Applied optimization, vol 87. 
Kluwer, Boston 

Nocedal J, Wright SJ (2006) Numerical optimization. 
Springer series in operations research and financial 
engineering, 2nd edn. Springer, New York 

Ohtsuka T, Kodama A (2002) Automatic code 
generation system for nonlinear receding horizon 
control. Trans Soc Instrum Control Eng 38(7): 
617-623 

Richter S, Morari M, Jones CN (2011) Towards 
computational complexity certification for constrained 
MPC based on Lagrange relaxation and the fast 
gradient method. In: 50th IEEE conference 
on decision and control and European control 
conference (CDC-ECC), Orlando, Dec 2011, 
pp 5223-5229 

Wächter A, Biegler LT (2006) On the implementation of
a primal-dual interior point filter line search algorithm
for large-scale nonlinear programming. Math Program
106(1):25-57

Wright SJ (1997) Primal-dual interior-point methods. 
SIAM, Philadelphia 



Optimization Based Robust Control 



Didier Henrion 

LAAS-CNRS, University of Toulouse, Toulouse, 
France 

Faculty of Electrical Engineering, Czech 
Technical University in Prague, Prague, 

Czech Republic 

Abstract 

This entry describes the basic setup of linear ro¬ 
bust control and the difficulties typically encoun¬ 
tered when designing optimization algorithms to 
cope with robust stability and performance spec¬ 
ifications. 
Linear Robust Control

Robust control allows dealing with uncertainty 
affecting a dynamical system and its environ¬ 
ment. In this section, we assume that we have a 
mathematical model of the dynamical system 
without uncertainty (the so-called nominal 
system) jointly with a mathematical model of 
the uncertainty. We restrict ourselves to linear 
systems: if the dynamical system we want to 
control has some nonlinear components (e.g., 
input saturation), they must be embedded in 
the uncertainty model. Similarly, we assume 
that the control system is relatively small 
scale (low number of states): higher-order 
dynamics (e.g., highly oscillatory but low energy 
components) are embedded in the uncertainty 
model. Finally, for conciseness, we focus 
exclusively on continuous-time systems, even 
though most of the techniques described in this 
section can be transposed readily to discrete-time 
systems. 


Our control system is described by the first-order
ordinary differential equation

$$\dot{x} = A(\delta)\,x + B(\delta)\,u$$
$$y = C(\delta)\,x$$

where as usual $x \in \mathbb{R}^n$ denotes the states, $u \in
\mathbb{R}^m$ denotes the controlled inputs, and $y \in \mathbb{R}^p$
denotes the measured outputs, all depending on
time $t$, with $\dot{x}$ denoting the time derivative of $x$.
The system is subject to uncertainty and this is
reflected by the dependence of matrices $A$, $B$,
and $C$ on uncertain parameter $\delta$ which is typically
time varying and restricted to some bounded set

$$\delta \in \Delta \subset \mathbb{R}^q.$$

A linear control law

$$u = Ky$$

modeled by a matrix $K \in \mathbb{R}^{m \times p}$ must be
designed to overcome the effect of the uncertainty
while optimizing some performance criterion
(e.g., pole placement, disturbance rejection,
$H_2$ or $H_\infty$ norm). Sometimes, a relevant
performance criterion is that the control should
be stabilizing for the largest possible uncertainty
(measured, e.g., by some norm on $\Delta$). In this
section, for conciseness, we restrict our attention 
to static output feedback control laws, but most 
of the results can be extended to dynamical output 
feedback control laws, where the control signal u 
is the output of a controller (a linear system to be 
designed) whose input is y . 

Uncertainty Models 

Amongst the simplest possible uncertainty mod¬ 
els, we can find the following: 

• Unstructured uncertainty, also called
norm-bounded uncertainty, where

$$\Delta = \{\delta \in \mathbb{R}^q : \|\delta\| \le 1\}$$

and the given norm can be a standard vector
norm or a more complicated matrix norm if $\delta$ is
interpreted as a vector obtained by stacking the
columns of a matrix

• Structured uncertainty, also called polytopic
uncertainty, where

$$\Delta = \mathrm{conv}\,\{\delta_i, \; i = 1, \ldots, N\}$$

is a polytope modeled as the convex combination
of a finite number of given vertices $\delta_i \in \mathbb{R}^q$, $i =
1, \ldots, N$

We can find more complicated uncertainty 
models (e.g., combinations of the two above: see 
Zhou et al. 1996), but to keep the developments 
elementary, they are not discussed here. 


Nonconvex Nonsmooth Robust 
Optimization 

The main difficulties faced when seeking a feed¬ 
back matrix K are as follows: 

• Nonconvexity: The stability conditions are 
typically nonconvex in K. 

• Nondifferentiability: The performance cri¬ 
terion to be optimized is typically a non- 
differentiable function of K. 

• Robustness: Stability and performance 
should be ensured for every possible instance 
of the uncertainty. 

So if we are to formulate the robust control 
problem as an optimization problem, we should 
be ready to develop and use techniques from non¬ 
convex, nondifferentiable, robust optimization. 

Let us first elaborate on the first difficulty 
faced by optimization-based robust control, 
namely, the nonconvexity of the stability 
conditions. In continuous time, stability of a 
linear system $\dot{x} = Ax$ is equivalent to negativity
of the spectral abscissa, which is defined as the
maximum real part of the eigenvalues of $A$:

$$\alpha(A) = \max\{\mathrm{Re}\,\lambda : \det(\lambda I_n - A) = 0, \; \lambda \in \mathbb{C}\}.$$

It turns out that the open cone of matrices
$A \in \mathbb{R}^{n \times n}$ such that $\alpha(A) < 0$ is nonconvex
(Ackermann 1993). This is illustrated in Fig. 1
where we represent the set of vectors $K =
(k_1, k_2, k_3) \in \mathbb{R}^3$ such that $k_1^2 + k_2^2 + k_3^2 < 1$
and $\alpha(A(K)) < 0$ for a given parametrized matrix $A(K)$.



There exist various approaches to handling non¬ 
convexity. One possibility consists of building 
convex inner approximations of the stability re¬ 
gion in the parameter space. The approxima¬ 
tions can be polytopes, balls, ellipsoids, or more 
complicated convex objects described by linear 
matrix inequalities (LMI). The resulting stability 
conditions are convex, but surely conservative, in 
the sense that the conditions are only sufficient 
for stability and not necessary. Another approach 
to handling nonconvexity consists of formulating 
the stability conditions algebraically (e.g., via the 
Routh-Hurwitz stability criterion or its symmetric 
version by Hermite) and using converging hier¬ 
archies of LMI relaxations to solve the result¬ 
ing nonconvex polynomial optimization problem: 
see, e.g., Henrion and Lasserre (2004) and Chesi
(2010).

The second difficulty characteristic of 
optimization-based robust control is the potential 
nondifferentiability of the objective function. 
Consider for illustration one of the simplest 
optimization problems which consists of 
minimizing the spectral abscissa $\alpha(A(K))$ of
a matrix $A(K)$ depending linearly on a matrix
$K$. Such a minimization makes sense since
negativity of the spectral abscissa is equivalent
to system stability. Then typically, $\alpha(A(K))$ is
a continuous but non-Lipschitz function of $K$,
which means that its gradient can be unbounded
locally. In Fig. 2, we plot the spectral abscissa
$\alpha(A(K))$ for a given matrix $A(K)$
and $K \in \mathbb{R}$. The function is non-Lipschitz at $K =
0$, at which the global minimum $\alpha(A(0)) = 0$
is achieved. Nonconvexity of the function is also
apparent in this example. The lack of convexity
and smoothness of the spectral abscissa and other
similar performance criteria renders optimization
of such functions particularly difficult (Burke
et al. 2001, 2006b). In Fig. 3, we represent graphs
of the spectral abscissa (with flipped vertical
axis for better visualization) of some small-size
matrices depending on two real parameters, with
randomly generated parametrization. We observe
the typical nonconvexity and lack of smoothness
around local and global optima.

Optimization Based Robust Control, Fig. 1 A nonconvex ball of stable matrices

Optimization Based Robust Control, Fig. 2 The spectral abscissa is typically nonconvex and nonsmooth
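
The non-Lipschitz behavior of the spectral abscissa is easy to reproduce numerically. The following Python sketch is an illustration only: the parametrization $A(K) = \begin{bmatrix} 0 & 1 \\ K & 0 \end{bmatrix}$ is a hypothetical choice (it is not claimed to be the matrix plotted in Fig. 2), and the helper names are assumptions. Its eigenvalues are $\pm\sqrt{K}$, so $\alpha(A(K)) = \sqrt{K}$ for $K \ge 0$ and $\alpha(A(K)) = 0$ for $K < 0$.

```python
import cmath


def spectral_abscissa_2x2(a11, a12, a21, a22):
    """Maximum real part of the eigenvalues of [[a11, a12], [a21, a22]]."""
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return max(((trace + root) / 2.0).real, ((trace - root) / 2.0).real)


def alpha(K):
    """Spectral abscissa of the hypothetical matrix A(K) = [[0, 1], [K, 0]]."""
    return spectral_abscissa_2x2(0.0, 1.0, K, 0.0)


# The difference quotient at K = 0 grows like 1/sqrt(K): no Lipschitz constant exists.
for K in (1e-2, 1e-4, 1e-6):
    quotient = (alpha(K) - alpha(0.0)) / K
    print(f"K = {K:g}   alpha = {alpha(K):.4g}   difference quotient = {quotient:.4g}")
```

The function is continuous at $K = 0$, but the difference quotient is unbounded there, which is the kind of behavior that defeats standard smooth optimization methods.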


The third difficulty for optimization-based
robust control is the uncertainty. As explained
above, optimization of a performance criterion
with respect to controller parameters is already
a potentially difficult problem for a nominal
system (i.e., when the uncertainty parameter is
equal to zero). This becomes even more difficult
when this optimization must be carried out for
all possible instances of the uncertainty $\delta$ in
$\Delta$. This is where the above assumption that the
uncertainty set $\Delta$ has a simple description proves
useful. If the uncertainty $\delta$ is unstructured and
not time varying, then it can be handled with the
complex stability radius (Ackermann 1993), the
pseudospectral abscissa (Trefethen and Embree
2005), or via an $H_\infty$ norm constraint (Zhou et al.
1996). If the uncertainty $\delta$ is structured, then
we can try to optimize a performance criterion
at every vertex in the polytopic description
(which is a relaxation of the problem of
stabilizing the whole polytope). An example
is the problem of simultaneous stabilization,
where a controller $K$ must be found such that the
maximum spectral abscissa of several matrices
$A_i(K)$, $i = 1, \ldots, N$ is negative (Blondel 1994).
Finally, if the uncertainty $\delta$ is time varying, then
performance and stability guarantees can still be
achieved with the help of Lyapunov certificates or
potentially conservative convex LMI conditions:
see, e.g., Boyd et al. (1994) and Scherer et al.
(1997).

Optimization Based Robust Control, Fig. 3 The graph of the negative spectral abscissa for some randomly generated
matrix parametrizations



A unified approach to addressing conflicting 
performance criteria and uncertainty consists of 
searching for locally optimal solutions of a nons¬ 
mooth optimization problem that is built to incor¬ 
porate minimization objectives and constraints 
for multiple plants. This is called (linear robust) 
multiobjective control, and formally, it can be 
expressed as the following optimization prob¬ 
lem 

$$\min_K \; \max_{i=1,\ldots,N} \{ g_i(K) : \beta_i = \infty \}$$

$$\text{s.t.} \quad g_i(K) \le \beta_i, \quad i = 1, \ldots, N$$

where each $g_i(K)$ is a function of the closed-loop
matrix $A_i(K)$ (e.g., a spectral abscissa
or an $H_\infty$ norm) and the scalars $\beta_i$ are given
and such that if $\beta_i = \infty$ for some $i$, then $g_i$
appears in the objective function and not in a
constraint: see Gumussoy et al. (2009) for details.
In the above problem, the objective function,
a maximum of nonsmooth and nonconvex
functions, is typically also nonsmooth and
nonconvex. Moreover, without loss of generality,
we can easily impose a sparsity pattern on
controller matrix K to account for structural 
constraints (e.g., a low-order decentralized 
controller). 
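
The role of the scalars $\beta_i$ can be sketched in a few lines of Python. This is a hedged illustration and not the HIFOO interface: the function `multiobjective`, the helper `make_abscissa`, and the scalar closed loops $\dot{x} = (a_i + b_i K)x$ (whose spectral abscissa is simply $a_i + b_i K$) are all assumptions made for the example.

```python
import math


def make_abscissa(a, b):
    """Spectral abscissa of the hypothetical scalar closed loop x' = (a + b*K) x."""
    return lambda K: a + b * K


def multiobjective(K, criteria, betas):
    """Evaluate the problem above at K.

    Criteria with beta_i = inf enter the objective (their maximum is returned);
    the others are treated as constraints g_i(K) <= beta_i.
    """
    values = [g(K) for g in criteria]
    objective = max(
        (v for v, beta in zip(values, betas) if math.isinf(beta)),
        default=-math.inf,
    )
    feasible = all(v <= beta for v, beta in zip(values, betas) if not math.isinf(beta))
    return objective, feasible


criteria = [make_abscissa(1.0, 1.0), make_abscissa(2.0, 1.0), make_abscissa(-1.0, 0.5)]
betas = [math.inf, math.inf, -0.5]  # two objectives, one constraint
print(multiobjective(-3.0, criteria, betas))
```

A nonsmooth solver would then minimize the returned objective over $K$ while keeping the feasibility flag true; that search is omitted here.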

Software Packages 

Algorithms for nonconvex nonsmooth opti¬ 
mization have been developed and interfaced 
for linear robust multiobjective control in the 
public domain Matlab package HIFOO released 
in Burke et al. (2006a) and based on the theory 
described in Burke et al. (2006b). In 2011, 
The MathWorks released HINFSTRUCT, a 
commercial implementation of these techniques 
based on the theory described in Apkarian and 
Noll (2006). 

Cross-References 

► H-Infinity Control 

► LMI Approach to Robust Control 
Abstract

Structured output feedback controller synthesis 
is an exciting new concept in modern control 
design, which bridges between theory and 
practice insofar as it allows for the first time 
to apply sophisticated mathematical design 
paradigms like $H_\infty$ or $H_2$ control within
control architectures preferred by practitioners.
The new approach to structured $H_\infty$ control,
developed during the past decade, is rooted 
in a change of paradigm in the synthesis 
algorithms. Structured design may no longer 
be based on solving algebraic Riccati equations 
or matrix inequalities. Instead, optimization- 
based design techniques are required. In 
this essay we indicate why structured con¬ 
troller synthesis is central in modern control 
engineering. We explain why non-smooth 
optimization techniques are needed to compute 
structured control laws, and we point to 
software tools which enable practitioners 
to use these new tools in high-technology 
applications. 
Introduction

In the modern high-technology field of control, 
engineers usually face a large variety of con¬ 
curring design specifications such as noise or 
gain attenuation in prescribed frequency bands, 
damping, decoupling, constraints on settling or 
rise time, and much else. In addition, as plant 
models are generally only approximations of the 
true system dynamics, control laws have to be 
robust with respect to uncertainty in physical 
parameters or with regard to un-modeled high- 
frequency phenomena. Not surprisingly, such a 
plethora of constraints present a major challenge 
for controller tuning, not only due to the ever¬ 
growing number of such constraints but also 
because of their very different provenience. 

The dramatic increase in plant complexity is 
exacerbated by the desire that regulators should 
be as simple as possible, easy to understand and 
to tune by practitioners, convenient to hardware 
implement, and generally available at low cost. 
Such practical constraints explain the limited use 
of black-box controllers, and they are the driving 
force for the implementation of structured control 
architectures, as well as for the tendency to re¬ 
place hand-tuning methods by rigorous algorith¬ 
mic optimization tools. 


Structured Controllers 

Before addressing specific optimization tech¬ 
niques, we introduce some basic terminology 
for control design problems with structured 
controllers. A state-space description of the given
plant $P$ used for design is given as

$$P : \begin{cases} \dot{x}_P = A x_P + B_1 w + B_2 u \\ z = C_1 x_P + D_{11} w + D_{12} u \\ y = C_2 x_P + D_{21} w + D_{22} u \end{cases} \tag{1}$$

where $A, B_1, \ldots$ are real matrices of appropriate
dimensions, $x_P \in \mathbb{R}^{n_P}$ is the state, $u \in \mathbb{R}^{n_u}$ the
control, $y \in \mathbb{R}^{n_y}$ the measured output, $w \in \mathbb{R}^{n_w}$
the exogenous input, and $z \in \mathbb{R}^{n_z}$ the regulated
output. Similarly, the sought output feedback
controller $K$ is described as

$$K : \begin{cases} \dot{x}_K = A_K x_K + B_K y \\ u = C_K x_K + D_K y \end{cases} \tag{2}$$

with $x_K \in \mathbb{R}^{n_K}$ and is called structured if
the (real) matrices $A_K, B_K, C_K, D_K$ depend
smoothly on a design parameter $x \in \mathbb{R}^n$, referred
to as the vector of tunable parameters. Formally,
we have differentiable mappings

$$A_K = A_K(x), \quad B_K = B_K(x), \quad C_K = C_K(x), \quad D_K = D_K(x),$$

and we abbreviate these by the notation K(x) for 
short to emphasize that the controller is structured 
with x as tunable elements. 

A structured controller synthesis problem is 
then an optimization problem of the form 

$$\begin{array}{ll} \text{minimize} & \|T_{wz}(P, K(x))\| \\ \text{subject to} & K(x) \text{ closed-loop stabilizing} \\ & K(x) \text{ structured}, \; x \in \mathbb{R}^n \end{array} \tag{3}$$

where $T_{wz}(P, K) = \mathcal{F}_\ell(P, K)$ is the lower
feedback connection of (1) with (2) as in Fig. 1 (left),
also called the linear fractional transformation
(Varga and Looye 1999). The norm $\|\cdot\|$ stands for
the $H_\infty$ norm, the $H_2$ norm, or any other system
norm, while the optimization variable $x \in \mathbb{R}^n$
regroups the tunable parameters in the design.

Standard examples of structured controllers
$K(x)$ include realizable PIDs and observer-based,
reduced-order, or decentralized controllers,
which are expressed in state space. For the
realizable PID, this reads

$$\left[\begin{array}{c|c} A_K & B_K \\ \hline C_K & D_K \end{array}\right] = \left[\begin{array}{cc|c} 0 & 0 & 1 \\ 0 & -1/\tau & -k_D/\tau \\ \hline k_I & 1/\tau & k_P + k_D/\tau \end{array}\right]$$
Optimization-Based Control Design Techniques and Tools, Fig. 1 Black-box full-order controller $K$ on the left,
structured 2-DOF control architecture with $K = \text{block-diag}(K_1, K_2)$ on the right

Optimization-Based Control Design Techniques and Tools, Fig. 2 Synthesis of $K = \text{block-diag}(K_1, \ldots, K_N)$
against multiple requirements or models. Each $K_i(x)$ can be structured



In the case of a PID, the tunable parameters
are $x = (\tau, k_P, k_I, k_D)$, for observer-based
controllers $x$ regroups the estimator and
state-feedback gains $(K_f, K_c)$, for reduced order
controllers $n_K < n_P$ the tunable parameters $x$
are the $n_K^2 + n_K n_y + n_K n_u + n_y n_u$ unknown
entries in $(A_K, B_K, C_K, D_K)$, and in the
decentralized form $x$ regroups the unknown
entries in $A_{K_1}, \ldots, D_{K_q}$. In contrast,
full-order controllers have the maximum number
$N = n_P^2 + n_P n_y + n_P n_u + n_y n_u$ of degrees of
freedom and are referred to as unstructured or as 
black-box controllers. 
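
As a sanity check on the PID parametrization, the state-space data with entries $0$, $-1/\tau$, $-k_D/\tau$, $k_I$, $1/\tau$, and $k_P + k_D/\tau$ can be compared numerically with the transfer function $k_P + k_I/s + k_D s/(1 + \tau s)$ of a realizable PID. The Python sketch below is an illustration under assumptions: the function names and the numerical parameter values are hypothetical, and the transfer-function form of the realizable PID is the standard one rather than a formula quoted from this article.

```python
def pid_state_space(tau, kP, kI, kD):
    """State-space data (A, B, C, D) of the realizable PID with tunable x = (tau, kP, kI, kD)."""
    A = ((0.0, 0.0), (0.0, -1.0 / tau))
    B = (1.0, -kD / tau)
    C = (kI, 1.0 / tau)
    D = kP + kD / tau
    return A, B, C, D


def transfer_from_state_space(A, B, C, D, s):
    """D + C (sI - A)^{-1} B, valid here because A is diagonal."""
    return D + sum(C[i] * B[i] / (s - A[i][i]) for i in range(len(B)))


def pid_transfer(tau, kP, kI, kD, s):
    """Realizable PID: proportional + integral + filtered derivative action."""
    return kP + kI / s + kD * s / (1.0 + tau * s)


x = (0.05, 2.0, 1.5, 0.3)  # hypothetical values of (tau, kP, kI, kD)
for s in (0.1j, 1j, 10j, 2.0 + 3.0j):
    gap = abs(transfer_from_state_space(*pid_state_space(*x), s) - pid_transfer(*x, s))
    print(f"s = {s}:  |difference| = {gap:.2e}")
```

All four matrices depend smoothly on $x$ as long as $\tau > 0$, which is what the definition of a structured controller requires.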

More sophisticated controller structures K(x) 
arise from architectures like, for instance, a 
2-DOF control arrangement with feedback 
block $K_2$ and a set-point filter $K_1$ as in Fig. 1
(right). Suppose $K_1$ is the 1st-order filter
$K_1(s) = a/(s + a)$ and $K_2$ the PI feedback
$K_2(s) = k_P + k_I/s$. Then the transfer $T_{ry}$
from $r$ to $y$ can be represented as the feedback
connection of $P$ and $K(x)$ with

$$P := \left[\begin{array}{c|ccc} A & 0 & 0 & B \\ \hline C & 0 & 0 & D \\ 0 & I & 0 & 0 \\ -C & 0 & I & -D \end{array}\right], \qquad K(x) := \begin{bmatrix} K_1(s) & 0 \\ 0 & K_2(s) \end{bmatrix}$$

where $K(x)$ takes a typical block-diagonal
structure featuring the tunable elements $x =
(a, k_P, k_I)$.
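
The resulting closed-loop transfer can be written out explicitly as a short worked derivation. The notation $G(s) := C(sI - A)^{-1}B + D$ for the transfer function of the system with data $(A, B, C, D)$ and $r_f$ for the filtered set-point are introduced here for illustration only, and a single-input single-output loop is assumed.

```latex
% Signals read off from the rows of P:
%   K_1 receives r and produces the filtered set-point r_f,
%   K_2 receives r_f - y and produces the control u.
\begin{aligned}
r_f &= K_1(s)\, r, \qquad u = K_2(s)\,(r_f - y), \qquad y = G(s)\, u \\
\Rightarrow \quad y &= G K_2 \,(K_1 r - y)
\quad \Rightarrow \quad
T_{ry}(s) = \frac{G(s) K_2(s) K_1(s)}{1 + G(s) K_2(s)} \\
&= \frac{a \, G(s)\,(k_P s + k_I)}{(s + a)\,\bigl(s + G(s)\,(k_P s + k_I)\bigr)}
\quad \text{for } K_1 = \frac{a}{s + a}, \; K_2 = k_P + \frac{k_I}{s}.
\end{aligned}
```

The three tunable elements $a$, $k_P$, and $k_I$ enter $T_{ry}$ nonlinearly, which is typical of structured designs.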

In much the same way, arbitrary multi-loop in¬ 
terconnections of fixed-model elements with tun¬ 
able controller blocks Ki (x) can be rearranged as 
in Fig. 2 so that K(x) captures all tunable blocks 
in a decentralized structure general enough to 
cover most engineering applications. 

The structure concept is equally useful to 
deal with the second central challenge in control 
design: system uncertainty. The latter may be 
handled with $\mu$-synthesis techniques (Stein and
Doyle 1991) if a parametric uncertain model
is available. A less ambitious but often more
practical alternative consists in optimizing the
structured controller $K(x)$ against a finite set
of plants $P^{(1)}, \ldots, P^{(M)}$ representing model

variations due to uncertainty, aging, sensor and 
actuator breakdown, and un-modeled dynamics, 
in tandem with the robustness and performance 
specifications. This is again formally covered by 
Fig. 2 and leads to a multi-objective constrained 
optimization problem of the form 


$$\begin{array}{ll} \text{minimize} & f(x) = \max\limits_{k \in \mathrm{SOFT},\, i \in I_k} \|T^{(k)}_{w_i z_i}(K(x))\| \\ \text{subject to} & g(x) = \max\limits_{k \in \mathrm{HARD},\, j \in J_k} \|T^{(k)}_{w_j z_j}(K(x))\| \le 1 \\ & K(x) \text{ structured and stabilizing} \\ & x \in \mathbb{R}^n \end{array} \tag{4}$$

where $T^{(k)}_{w_i z_i}$ denotes the $i$th closed-loop
robustness or performance channel $w_i \to z_i$ for the
$k$th plant model $P^{(k)}(s)$. The rationale of (4)
is to minimize the worst-case cost of the soft
constraints $\|T^{(k)}_{w_i z_i}\|$, $k \in \mathrm{SOFT}$, while enforcing
the hard constraints $\|T^{(k)}_{w_j z_j}\| \le 1$, $k \in \mathrm{HARD}$.
Note that in the mathematical programming ter¬ 
minology, soft and hard constraints are classi¬ 
cally referred to as objectives and constraints. 
The terms soft and hard point to the fact that 
hard constraints prevail over soft ones and that 
meeting hard constraints for solution candidates 
is mandatory. 
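
The worst-case functions $f$ and $g$ in (4) can be evaluated for a small example as follows. This Python sketch rests on several assumptions that are not part of the article: a hypothetical family of plants $1/(s + a)$ in unity feedback with a PI controller $k_P + k_I/s$, the peak of the complementary sensitivity as soft requirement, a bound on the peak sensitivity as hard requirement, and a frequency grid that only gives a lower bound on each $H_\infty$ norm. Closed-loop stability is not checked here.

```python
def peak_gain(transfer, w_min=1e-3, w_max=1e3, n=2000):
    """Largest |transfer(jw)| on a log-spaced grid (a lower bound on the H-infinity norm)."""
    ratio = w_max / w_min
    return max(abs(transfer(1j * w_min * ratio ** (k / (n - 1)))) for k in range(n))


def channels(a, kP, kI):
    """Sensitivity S and complementary sensitivity T for the hypothetical plant
    1/(s + a) in unity feedback with the PI controller kP + kI/s."""
    def loop(s):
        return (kP + kI / s) / (s + a)

    def S(s):
        return 1.0 / (1.0 + loop(s))

    def T(s):
        return loop(s) / (1.0 + loop(s))

    return S, T


def evaluate(x, plants=(0.5, 1.0, 2.0), s_max=2.0):
    """Return (f, g): worst-case soft cost and worst-case normalized hard constraint."""
    kP, kI = x
    f = g = 0.0
    for a in plants:  # one plant model per value of a
        S, T = channels(a, kP, kI)
        f = max(f, peak_gain(T))          # soft requirement: peak of T
        g = max(g, peak_gain(S) / s_max)  # hard requirement: peak of S <= s_max, i.e. g <= 1
    return f, g


f_val, g_val = evaluate((1.0, 1.0))
print(f"f(x) = {f_val:.3f}, g(x) = {g_val:.3f}, hard constraints met: {g_val <= 1.0}")
```

Both $f$ and $g$ are maxima over plants and channels, so they are nonsmooth in $x$ even when each individual norm is smooth, which motivates the techniques discussed next.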


Optimization Techniques Over the 
Years 

During the late 1990s, the necessity to develop 
design techniques for structured regulators 
K(x) was recognized (Fares et al. 2001), and 
the limitations of synthesis methods based on 
algebraic Riccati equations (AREs) or linear 
matrix inequalities (LMIs) became evident, as 
these techniques can only provide black-box 
controllers. The lack of appropriate synthesis 
techniques for structured K(x) led to the 
unfortunate situation, where sophisticated 
approaches like the $H_\infty$ paradigm developed by
academia since the 1980s could not be brought to 
work for the design of those controller structures 
K(x) preferred by practitioners. Design engineers 
had to continue to rely on heuristic and ad hoc 
tuning techniques, with only limited scope and 
reliability. As an example, post-processing to 
reduce a black-box controller to a practical size 
is prone to failure. It may at best be considered a 
fill-in for a rigorous design method which directly 
computes a reduced-order controller. Similarly, 
hand-tuning of the parameters x remains a 
puzzling task because of the loop interactions 
and fails as soon as complexity increases. 


In the late 1990s and early 2000s, a change 
of methods was observed. Structured $H_2$- and
$H_\infty$-synthesis problems (3) were addressed by
bilinear matrix inequality (BMI) optimization, 
which used local optimization techniques based 
on the augmented Lagrangian method (Fares 
et al. 2001; Noll et al. 2002; Kocvara and Stingl 
2003), sequential semidefinite programming 
methods (Fares et al. 2002; Apkarian et al. 2003), 
and non-smooth methods for BMIs (Noll et al. 
2009; Lemarechal and Oustry 2000). However, 
these techniques were based on the bounded 
real lemma or similar matrix inequalities and 
were therefore of limited success due to the 
presence of Lyapunov variables, i.e., matrix¬ 
valued unknowns, whose dimension grows 
quadratically in $n_P + n_K$ and represents the
bottleneck of that approach. 

The epoch-making change occurred with the
introduction of non-smooth optimization tech¬ 
niques (Noll and Apkarian 2005; Apkarian and 
Noll 2006b,c, 2007) to programs (3) and (4). 
Today non-smooth methods have superseded ma¬ 
trix inequality-based techniques and may be con¬ 
sidered the state of the art as far as realistic 
applications are concerned. The transition took 
almost a decade. 

Alternative control-related local optimization
techniques and heuristics include the gradient
sampling technique of Burke et al. (2005),
derivative-free optimization discussed in Kolda
et al. (2003) and Apkarian and Noll (2006a),
particle swarm optimization (see Oi et al. (2008)
and references therein), and evolutionary
computation techniques (Lieslehto 2001).
The last three classes do not exploit
derivative information and rely on function
evaluations only. They are therefore applicable
to a broad variety of problems including those
where function values arise from complex
numerical simulations. The combinatorial nature
of these techniques, however, limits their
use to small problems with a few tens of
variables. More significantly, these methods often
lack a solid convergence theory. In contrast, 
as we have demonstrated over recent years 
(Apkarian and Noll 2006b; Noll et al. 2008), 
specialized non-smooth techniques are highly
efficient in practice, are based on a sophis¬ 
ticated convergence theory, are capable of 
solving medium-size problems in a matter 
of seconds, and are still operational for 
large-size problems with several hundreds of 
states. 


Non-smooth Optimization 
Techniques 

The benefit of the non-smooth casts (3) and 
(4) lies in the possibility to avoid searching for 
Lyapunov variables, a major advantage as their 
number $(n_P + n_K)^2/2$ usually largely dominates
$n$, the number of true decision parameters $x$.
Lyapunov variables do still occur implicitly in the
function evaluation procedures, but this has no
harmful effect for systems up to several hundred
states. In abstract terms, a non-smooth
optimization program has the form

$$\begin{array}{ll} \text{minimize} & f(x) \\ \text{subject to} & g(x) \le 0 \\ & x \in \mathbb{R}^n \end{array} \tag{5}$$

where $f, g : \mathbb{R}^n \to \mathbb{R}$ are locally Lipschitz
functions and are easily identified from the cast
in (4).

In the realm of convex optimization, non¬ 
smooth programs are conveniently addressed by 
so-called bundle methods, introduced in the
mid-1970s by Lemarechal (1975). Bundle methods
are used to solve difficult problems in integer pro¬ 
gramming or in stochastic optimization via La- 
grangian relaxation. Extensions of the bundling 
technique to non-convex problems like (3) or 
(4) were first developed in Apkarian and Noll 
(2006b,c, 2007), Apkarian et al. (2008), Noll 
et al. (2009), and, in more abstract form, Noll 
et al. (2008). 

Figure 3 shows a schematic view of a
non-convex bundle method consisting of a
descent-step generating inner loop (yellow
block) comparable to a line search in smooth
optimization, embedded into the outer loop
(blue box), where serious iterates are processed,
stopping criteria are applied, and the model
tradition is assured. Serious steps or iterates refer
to steps accepted in a line search, while null steps
are unsuccessful steps visited during the search.
By model tradition, we mean continuity of the
model between (serious) iterates $x^j$ and $x^{j+1}$
by recycling some of the older planes used at
counter $j$ into the new working model at $j + 1$.
This avoids starting the first inner loop $k = 1$ at
$j + 1$ from scratch and therefore saves time.

Optimization-Based Control Design Techniques and Tools, Fig. 3 Flowchart of proximity control bundle
algorithm

At the core of the interaction between in¬ 
ner and outer loop is the management of the 
proximity control parameter $\tau$, which governs
the stepsize $\|x - y^k\|$ between the current serious
iterate $x$ and the trial steps $y^k$. Similar to the
management of a trust region radius or of the
stepsize in a line search, proximity control
allows forcing shorter trial steps if agreement of
the local model with the true objective function 
is poor and allows larger steps if agreement is 
satisfactory. 
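
The interplay of serious steps, null steps, and the proximity control parameter can be sketched for a simple case. The Python code below is a minimal illustration under strong assumptions: it treats a one-dimensional convex piecewise-linear test function, whereas the methods discussed in this entry handle non-convex problems with more elaborate working models. The test function, the thresholds `gamma` and `Gamma`, the doubling and halving of `tau`, and the ternary search used for the inner subproblem are all hypothetical choices.

```python
def f(x):
    """Nonsmooth convex test function with minimizer x = 1 and minimum value 1."""
    return abs(x - 1.0) + 0.5 * abs(x + 1.0)


def subgradient(x):
    """One element of the subdifferential of f at x."""
    sign = lambda t: (t > 0) - (t < 0)
    return sign(x - 1.0) + 0.5 * sign(x + 1.0)


def argmin_1d(phi, lo, hi, iters=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if phi(m1) <= phi(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def proximal_bundle(x, tau=1.0, gamma=0.1, Gamma=0.9, tol=1e-9, max_steps=100):
    planes = [(x, f(x), subgradient(x))]  # cutting planes (point, value, slope)
    for _ in range(max_steps):
        model = lambda y: max(fj + gj * (y - yj) for yj, fj, gj in planes)
        # The trial step satisfies |y - x| <= max|slope| / tau, so this interval suffices.
        radius = max(abs(gj) for _, _, gj in planes) / tau + 1.0
        y = argmin_1d(lambda t: model(t) + 0.5 * tau * (t - x) ** 2, x - radius, x + radius)
        predicted = f(x) - model(y)        # decrease promised by the working model
        if predicted < tol:
            break                          # model predicts no further progress
        rho = (f(x) - f(y)) / predicted    # agreement between model and true function
        planes.append((y, f(y), subgradient(y)))
        if rho >= gamma:
            x = y                          # serious step: accept the trial point
            if rho >= Gamma:
                tau *= 0.5                 # good agreement: allow larger steps
        else:
            tau *= 2.0                     # null step: force shorter trial steps
    return x


print(proximal_bundle(3.0))
```

In this simplified version all planes are kept between serious iterates, which is a crude form of the model tradition described above.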

Oracle-based bundle methods traditionally as¬ 
sure global convergence in the sense of subse¬ 
quences under the sole hypothesis that for every 
trial point $x$, the function value $f(x)$ and a Clarke
subgradient $\phi \in \partial f(x)$ are provided. In automatic
control applications, it is as a rule possible to 
provide more specific information, which may be 
exploited to speed up convergence. 

Computing function value and gradients of the
$H_2$ norm $f(x) = \|T_{wz}(P, K(x))\|_2$ requires
essentially the solution of two Lyapunov equations
of size $n_P + n_K$ (see Apkarian et al. 2007; Rautert
and Sachs 1997). For the $H_\infty$ norm, $f(x) =
\|T_{wz}(P, K(x))\|_\infty$, function evaluation is based
on the Hamiltonian algorithm of Benner et al.
(2012) and Boyd et al. (1989). The Hamiltonian
matrix is of size $n_P + n_K$ so that function
evaluations may be costly for very large plant state
dimension ($n_P > 500$), even though the number
of outer loop iterations of the bundle algorithm is
not affected by a large $n_P$ and generally relates
to $n$, the dimension of $x$. The additional cost for
subgradient computation for large $n_P$ is relatively
cheap as it relies on linear algebra (Apkarian and 
Noll 2006b). 
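
For orientation, the standard Gramian characterization of the $H_2$ norm is recalled below. It assumes a stable closed-loop state-space realization $(A, B, C)$ of $T_{wz}$ with zero feedthrough; the symbols $A$, $B$, $C$, $X$, and $Y$ are introduced here for illustration only and do not appear in the text above.

```latex
\|T_{wz}\|_2^2 \;=\; \operatorname{trace}\!\left(C X C^{T}\right)
              \;=\; \operatorname{trace}\!\left(B^{T} Y B\right),
\qquad
A X + X A^{T} + B B^{T} = 0,
\qquad
A^{T} Y + Y A + C^{T} C = 0 .
```

The two Gramians $X$ and $Y$ are the solutions of the two Lyapunov equations; the function value needs only one of them, while gradient formulas typically involve both.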


Computational Tools 

These novel non-smooth optimization methods have been available to the engineering community since 2010 via the MATLAB Robust Control Toolbox (Robust Control Toolbox 4.2 2012; Gahinet and Apkarian 2011). The routines HINFSTRUCT, LOOPTUNE, and SYSTUNE are versatile enough to define and combine tunable blocks $K_i(x)$, to build and aggregate design requirements $T_{wz}$ of different nature, and to provide suitable validation tools. Their implementation was carried out in cooperation with P. Gahinet (MathWorks). These routines further exploit the structure of problem (4) to enhance efficiency (see Apkarian and Noll 2006b, 2007).

It should be mentioned that design problems with multiple hard constraints are inherently complex. It is well known that even simultaneous stabilization of more than two plants $P_j$ with a structured control law $K(x)$ is NP-complete, so that exhaustive methods are expected to fail even for small to medium problems. The principled decision made in Apkarian and Noll (2006b) and reflected in the MATLAB routines is to rely on local optimization techniques instead. This leads to weaker convergence certificates but has the advantage of working successfully in practice. In the same vein, in (4) it is preferable to rely on a mixture of soft and hard requirements, for instance, by the use of exact penalty functions (Noll and Apkarian 2005). Key features implemented in the mentioned MATLAB routines are discussed in Apkarian (2013), Gahinet and Apkarian (2011), and Apkarian and Noll (2007).

Design Example 

Design of a feedback regulator is an interactive process, in which tools like SYSTUNE, LOOPTUNE, or HINFSTRUCT support the designer in various ways. In this section we illustrate their potential by solving a multi-model, fixed-structure reliable flight control design problem.



Optimization-Based Control Design Techniques and Tools, Fig. 4 Synthesis interconnection for reliable control 


Optimization-Based Control Design Techniques and Tools, Table 1 Outage scenarios where 0 stands for failure 


Outage cases                               Diagonal of outage gain
Nominal mode                               1  1  1  1  1
Right elevator outage                      0  1  1  1  1
Left elevator outage                       1  0  1  1  1
Right aileron outage                       1  1  0  1  1
Left aileron outage                        1  1  1  0  1
Left elevator and right aileron outage     1  0  0  1  1
Right elevator and right aileron outage    0  1  0  1  1
Right elevator and left aileron outage     0  1  1  0  1
Left elevator and left aileron outage      1  0  1  0  1


In reliable flight control, one has to maintain 
stability and adequate performance not only in 
nominal operation but also in various scenarios 
where the aircraft undergoes outages in elevator 
and aileron actuators. In particular, wind gusts 
must be alleviated in all outage scenarios to main¬ 
tain safety. Variants of this problem are addressed 
in Liao et al. (2002). 

The open-loop F16 aircraft in the scheme of Fig. 4 has six states, the body velocities $u, v, w$ and the pitch, roll, and yaw rates $q, p, r$. The state is available for control, as are the flight-path bank angle rate $\mu$ (deg/s), the angle of attack $\alpha$ (deg), and the sideslip angle $\beta$ (deg). Control inputs are the left and right elevator, left and right aileron, and rudder deflections (deg). The elevators are grouped symmetrically to generate the angle of attack. The ailerons are grouped antisymmetrically to generate roll motion. This leads to three control actions as shown in Fig. 4. The controller consists of two blocks, a 3 × 6 state-feedback gain matrix $K_x$ in the inner loop and a 3 × 3 integral gain matrix $K_i$ in the outer loop, which leads to a total of 27 = dim $x$ parameters to tune.

In addition to nominal operation, we consider the eight outage scenarios shown in Table 1.

The different models associated with the outage scenarios are readily obtained by premultiplying the aircraft control input vector by a diagonal matrix built from the corresponding row of Table 1.
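
As a minimal sketch of this construction, the rows of Table 1 can be stored as diagonals and applied to an input matrix. The input matrix `B` and the helper name `outage_input_matrix` are hypothetical; premultiplying the control input by a diagonal matrix $D$ is equivalent to replacing $B$ by $BD$, that is, to scaling the columns of $B$.

```python
# Rows of Table 1: diagonal of the outage gain (0 stands for a failed actuator).
OUTAGE_DIAGONALS = {
    "Nominal mode": (1, 1, 1, 1, 1),
    "Right elevator outage": (0, 1, 1, 1, 1),
    "Left elevator outage": (1, 0, 1, 1, 1),
    "Right aileron outage": (1, 1, 0, 1, 1),
    "Left aileron outage": (1, 1, 1, 0, 1),
    "Left elevator and right aileron outage": (1, 0, 0, 1, 1),
    "Right elevator and right aileron outage": (0, 1, 0, 1, 1),
    "Right elevator and left aileron outage": (0, 1, 1, 0, 1),
    "Left elevator and left aileron outage": (1, 0, 1, 0, 1),
}


def outage_input_matrix(B, diagonal):
    """Return B @ diag(diagonal): column j of B is scaled by diagonal[j]."""
    return [[b * d for b, d in zip(row, diagonal)] for row in B]


# Hypothetical 2 x 5 input matrix, used only to exercise the helper.
B = [[1.0, 2.0, 3.0, 4.0, 5.0],
     [6.0, 7.0, 8.0, 9.0, 10.0]]
models = {name: outage_input_matrix(B, d) for name, d in OUTAGE_DIAGONALS.items()}
```

Each entry of `models` then plays the role of the input matrix of one of the nine plant models handed to the tuning routine.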

Optimization-Based Control Design Techniques and Tools, Fig. 5 Responses to step changes in $\mu$, $\alpha$, and $\beta$ for nominal design


The design requirements are as follows: 

• Good tracking performance in $\mu$, $\alpha$, and $\beta$ with adequate decoupling of the three axes.

• Adequate rejection of wind gusts of 5 m/s. 

• Maintain stability and acceptable performance 
in the face of actuator outage. 

Tracking is addressed by an LQG cost (Maciejowski 1989), which penalizes integrated tracking error $e$ and control effort $u$ via

$$J = \lim_{T \to \infty} E\left\{ \frac{1}{T} \int_0^T \|W_e e\|^2 + \|W_u u\|^2 \, dt \right\}. \qquad (6)$$

Diagonal weights $W_e$ and $W_u$ provide tuning knobs for the trade-off between responsiveness, control effort, and balancing of the three channels. We use $W_e = \mathrm{diag}(20, 30, 20)$, $W_u = I_3$ for normal operation and $W_e = \mathrm{diag}(8, 12, 8)$, $W_u = I_3$ for outage conditions. Model-dependent weights make it possible to express the fact that nominal operation prevails over failure cases. Weights for failure cases are used to achieve limited deterioration of performance or of gust alleviation under deflection surface breakdown.

Optimization-Based Control Design Techniques and Tools, Fig. 6 Responses to step changes in $\mu$, $\alpha$, and $\beta$ for fault-tolerant design

The second requirement, wind gust alleviation, is treated as a hard constraint limiting the variance of the error signal $e$ in response to white noise $w_g$ driving the Dryden wind gust model. In particular, the variance of $e$ is limited to 0.01 for normal operation and to 0.03 for the outage scenarios.

With the notation of section “Non-smooth Optimization Techniques,” the functions $f(x)$ and $g(x)$ in (5) are $f(x) := \max_{k=1,\dots,9} \|T_{rz}^{(k)}(x)\|_2$ and $g(x) := \max_{k=1,\dots,9} \|T_{w_g e}^{(k)}(x)\|_2$, where $r$ denotes the set-point inputs in $\mu$, $\alpha$, and $\beta$. The regulated output $z$ is

$$z := \left[ (W_e^{1/2} e)^T \;\; (W_u^{1/2} u)^T \right]^T,$$

with $x = (\mathrm{vec}(K_i), \mathrm{vec}(K_x)) \in \mathbb{R}^{27}$. Soft constraints are the square roots of $J$ in (6) with appropriate weightings $W_e$ and $W_u$; hard constraints are the RMS values of $e$, suitably weighted to reflect the variance bounds of 0.01 and 0.03. These requirements are covered by the Variance and WeightedVariance options in Robust Control Toolbox 4.2 (2012).

With this setup, we tuned the controller gains $K_i$ and $K_x$ for the nominal scenario only (nominal design) and for all nine scenarios (fault-tolerant design). The responses to set-point changes in $\mu$, $\alpha$, and $\beta$ with a gust speed of 5 m/s are shown in Fig. 5 for the nominal design and in Fig. 6 for the fault-tolerant design. As expected, nominal responses are good but deteriorate notably when faced with outages. In contrast, the fault-tolerant controller maintains acceptable performance in outage situations. Optimal performance (square root of the LQG cost $J$ in (6)) for the fault-tolerant design is only slightly worse than for the nominal design (26 vs. 23). The non-smooth program (5) was solved with SYSTUNE, and the fault-tolerant design (9 models, 11 states, 27 parameters) took 30 s on Mac OS X with a 2.66 GHz Intel Core i7 and 8 GB RAM. The reader is referred to Robust Control Toolbox 4.2 (2012), or higher versions, for further examples and additional details.

Future Directions 

From an application viewpoint, non-smooth 
optimization techniques for control system 
design and tuning will become one of the 
standard techniques in the engineer’s toolkit. 
They are currently studied in major European 
aerospace industries. 

Future directions may include: 

• Extension of these techniques to gain scheduling in order to handle larger operating domains.

• Application of the available tools to integrated system/control design, in which both the physical characteristics of the system and the controller elements are optimized to achieve higher performance. Application to fault detection and isolation may also prove to be a fruitful direction.


Cross-References 

► H-Infinity Control 

► Optimization Based Robust Control 

► Robust Synthesis and Robustness Analysis 
Techniques and Tools 
Abstract

Managers can stake a claim by committing to 
capital investments today that can influence their 
rivals’ behavior or take a “wait-and-see” or step- 
by-step approach to avoid possible adverse mar¬ 
ket consequences tomorrow. At the core of this 
corporate dilemma lies the classic trade-off be¬ 
tween commitment and flexibility. This trade¬ 
off calls for a careful balancing of the merits 
of flexibility against those of commitment. This 
balancing is captured by option games. 


Keywords 

Game theory; Option games; Optimal stopping; 
Real options 


Introduction 

The global competitive environment has become increasingly challenging as modern economies undergo unprecedented changes in the midst of global economic turmoil. The real-world dilemmas corporate managers face today are driven by the interplay between strategic and market uncertainty. The tech industry has evolved most rapidly, putting companies unable to respond to market developments and technological breakthroughs at a severe disadvantage. Corporate management’s plans, and how management implements its strategy, will likely determine whether the firm will survive and be successful in the marketplace or become extinct.

Formulating the right strategy in the right competitive environment at the right time is a nontrivial task. Whether to invest in a new technology or a new product, or to enter a new market, is a strategic decision of immense importance. Corporate management must assess strategic options with proper analytical tools that can help determine whether to commit to a particular strategic path, given scarce or costly resources, or whether to stay flexible. Oftentimes, firms need to position themselves flexibly to capitalize on future opportunities as they emerge, while limiting potential losses arising from adverse future circumstances. In many cases, corporate managers need to revise their decision plans in view of actual market developments when facing an uncertain future; they can then decide to undertake only those projects with sufficiently high prospects to justify commitment at that time. This needs to be balanced with the need to make irreversible strategic commitments to seize first-mover advantage, presenting rivals with a fait accompli to which they have no choice but to adapt.

Capital Budgeting Ignoring Strategic 
Interactions 

Net Present Value 

Prevailing management approaches simplify matters and often lead to investment decisions that are detrimental to the firm’s long-term well-being. Suppose a firm’s future cash flow at time $t$ is given by a random variable $X_t$. Cash flows then evolve as a geometric Brownian motion

$$dX_t = g X_t \, dt + \sigma X_t \, dB_t \quad \text{and} \quad X_0 = x$$

with drift parameter $g$ and volatility $\sigma$. The Brownian motion $(B_t;\ t \ge 0)$ captures exogenous market uncertainty. The standard criterion used in corporate finance is based on discounted cash flows (DCF) or net present value (NPV). This consists in assessing the current value of a project by discounting the expected future cash flows $E[X_t]$ at a constant discount rate, $r$. Management supposedly creates shareholder value by undertaking projects with positive NPV, i.e., projects for which the present value of cash flows, $v(x) = \int_0^\infty e^{-rt} E[X_t] \, dt$, exceeds the necessary investment cost, $I$. In the present case, since $E[X_t] = x e^{gt}$, the present value equals $x/(r-g)$ for $r > g$, and the firm will invest under the zero-NPV criterion if

$$\frac{x}{r-g} > I. \qquad (1)$$

This traditional criterion views investment opportunities as now-or-never decisions under passive management. However, this precludes the possibility of adjusting future decisions in case the market develops off the expected path. While market uncertainty is factored in through the discount rate, the flexibility management has is typically not properly accounted for.
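
The NPV rule can be illustrated with a short sketch. It assumes that expected cash flows grow at the drift rate, so that the present value of the perpetual cash-flow stream is the current cash flow divided by the discount rate minus the drift; the function names and the numbers are hypothetical.

```python
def present_value(x, r, g):
    """Present value of expected cash flows x * exp(g t) discounted at rate r.

    The integral of exp(-r t) * x * exp(g t) over [0, inf) is x / (r - g),
    which is finite only if r > g.
    """
    if r <= g:
        raise ValueError("the discount rate r must exceed the drift g")
    return x / (r - g)


def invest_now(x, r, g, investment_cost):
    """Static NPV rule: invest if the present value exceeds the cost."""
    return present_value(x, r, g) > investment_cost


# Hypothetical numbers: cash flow 10, discount rate 10%, drift 5%.
value = present_value(10.0, 0.10, 0.05)
```

With these numbers the project is worth about 200, so the static rule accepts any investment cost below that level, regardless of volatility.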

Real Options Analysis 

It has become standard practice in finance and strategy to interpret real investment opportunities as being analogous to financial options. This view is well accepted among academics and practitioners alike and is at the core of real options analysis (ROA). ROA is an extension of option-pricing theory to real investment situations (Myers 1977; Trigeorgis 1996). This approach captures the dynamic nature of decision-making since it factors in management’s flexibility to revise and adapt its decisions in the face of market uncertainty. ROA allows managers to adapt to actual market developments as uncertainty gets resolved. Managers may, for example, delay the start (or closure) of a project depending on its prospects. This approach draws on optimal stopping theory (e.g., see Bensoussan and Lions 1982; Dixit and Pindyck 1994) and is considered to be more reflective of real decision-making than traditional methods.
If the firm can delay the decision to invest, for example, the problem is one of optimal stopping:

$$V(x) = \max_T E\left[ e^{-rT} \left( v(X_T) - I \right) \right].$$

In ROA, the discount rate $r$ is the risk-free interest rate (Dixit and Pindyck 1994; Trigeorgis 1996). The time of managerial action, $T$, is now a strategic decision variable, random by nature as the decision maker faces an uncertain environment. This problem has an analytical solution characterized by a threshold policy, say a trigger $X^*$, given by

$$\frac{X^*}{r-g} = \frac{b}{b-1} \, I, \qquad (2)$$

where $b$ is the positive root of a quadratic function (e.g., see Dixit and Pindyck 1994) and $b/(b-1) > 1$. When decisions are costly or difficult to reverse, corporate managers should be more cautious in making them. A firm should not always commit immediately, even if the NPV criterion (1) indicates so, but should wait until the gross project value is sufficiently large to cover the investment cost $I$ by a factor larger than one, as expressed in (2). Investing prematurely may destroy shareholder value. Real options may sometimes justify undertaking projects with negative (static) net present value, if doing so creates a platform for growth options, or delaying projects with positive NPV.
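
A minimal numerical sketch of the threshold policy follows. The text does not state the quadratic; the sketch assumes the standard form $\tfrac{1}{2}\sigma^2 b(b-1) + g b - r = 0$ from the real options literature, and it assumes that the trigger makes the gross project value equal to $b/(b-1)$ times the investment cost. Function names and numbers are hypothetical.

```python
import math


def positive_root(r, g, sigma):
    """Positive root b of 0.5*sigma^2*b*(b-1) + g*b - r = 0 (assumed form).

    For r > g and sigma > 0 this root is larger than one.
    """
    a = 0.5 * sigma ** 2          # coefficient of b^2
    lin = g - a                   # coefficient of b
    return (-lin + math.sqrt(lin ** 2 + 4.0 * a * r)) / (2.0 * a)


def investment_trigger(r, g, sigma, investment_cost):
    """Cash-flow level at which waiting stops being valuable.

    The gross project value at the trigger, trigger / (r - g), equals
    b / (b - 1) times the investment cost.
    """
    b = positive_root(r, g, sigma)
    return b / (b - 1.0) * (r - g) * investment_cost


# Hypothetical numbers: r = 5%, no drift, 20% volatility, cost 100.
b = positive_root(0.05, 0.0, 0.2)
trigger = investment_trigger(0.05, 0.0, 0.2, 100.0)
npv_break_even = (0.05 - 0.0) * 100.0
```

Under these assumptions the trigger lies above the break-even cash flow of the static NPV rule, which is the wedge created by the option to wait.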

Accounting for Strategic Interactions 
in Capital Budgeting 

Strategic Uncertainty 

As natural monopolies have lost their long-held, well-protected positions owing to market liberalization in the European Union and elsewhere across the globe, strategic interdependencies have become a new key challenge for managers. At the same time, sectors traditionally populated by multiple firms have undergone significant consolidation, often resulting in oligopolistic situations with a reduced number of players. The ongoing economic crisis has amplified these consolidation pressures. These two ongoing phenomena, liberalization and consolidation, have put the assessment of strategic options under competition high on the corporate agenda. Standard real options analysis often examines investment decisions as if the option holder had a proprietary right to exercise. This perspective may not be realistic in the new oligopolistic environment, as several firms may share the right to a related investment opportunity in the industry.

Game Theory 

In oligopolistic industries, firms often have dif¬ 
ficulty predicting how rivals will behave and 
make decisions based on beliefs about their likely 
behavior. A theory that helps characterize beliefs 
and form predictions about which strategies op¬ 
ponents will follow is helpful in analyzing such 
oligopolistic situations. Game theory has tradi¬ 
tionally been used to frame strategic interactions 
arising in conflict situations involving parties 
with different objectives or interests. It attempts 
to model behavior in strategic situations or games 
in which one party’s success in making choices 
depends on the choices of other players through 
influencing one another’s welfare. Game theory adopts a different perspective on optimization, as the focus is on the formation of beliefs about rivals’ optimal strategies. Finance theory has been primarily concerned with “moves by nature,” while game theory focuses on “optimization problems” involving multiple players. To solve a game, one needs to reduce a complex multiplayer problem into a simpler structure that captures the essence of the conflict situation. One can then derive useful predictions about how rivals are likely to react in a given situation. Game theory helped reshape microeconomics by providing analytical foundations for the study of market behavior and has been at the foundation of the Nobel Prize-winning research field of industrial organization.

Dynamic game theory (see, e.g., Başar and Olsder 1999) addresses problems in which several parties are in repeated interaction. Strategic management approaches based on dynamic economic theory can provide a richer foundation for understanding developments and competitive reactions within an industry. As firm competitiveness involves interactions among several players (rivals, suppliers, or clients), game-theoretic analysis brings important insights into strategic management in addressing such issues as first- and second-mover advantages, firm entry and exit decisions, strategic commitment, reputation, signaling, and other informational effects. A key lesson is that, when firms react to one another, it may sometimes be appropriate for one firm to take an aggressive stance in the expectation that rivals will back off. Dynamic industrial organization includes the analysis of “games of timing” such as preemption games or wars of attrition, whereby firms decide on appropriate investment timing under rivalry.


Option Games 

The earlier optimal stopping problem falls in the 
category of “games of timing” when a firm’s 
entry decision influences another firm’s market 
strategy. Option games are most suitable to help 
model situations where a firm that has a real op¬ 
tion to (dis)invest faces rivalry. Here, the problem 
consists in finding a Nash equilibrium solution 
for the two-player equivalent of the above optimal 
stopping problem. This solution must also satisfy 
certain dynamic consistency criteria. For sequen¬ 
tial investments, the follower is faced with a 
single-agent optimal investment timing problem; 
it will thus enter if the gross project value exceeds 
the investment cost by a sufficient factor. A firm 
entering the market early on, i.e., a leader, earns 
temporary monopoly rents as long as demand 
remains below the follower’s entry threshold. 
Following the follower’s entry, the firms act as a duopoly. As long as the leader’s value exceeds the follower’s, there is an incentive for one firm to invest, but not necessarily for both of them, leading to a “coordination problem.” Competitive pressure will dissipate the leader’s first-mover advantage, leading to a market entry point that is not socially optimal and to rent dissipation. Unfortunately, the multiplayer problem does not admit a simple analytical solution, since at each point a duopolist firm might end up in any of four distinct situations (a two-by-two matrix) depending on the rival’s entry decision. Option games indicate in each situation which driving force (commitment vs. flexibility) prevails and whether to go ahead with the investment or wait and see. The main drivers of the prevailing market equilibrium include the riskiness of the venture, $\sigma$, the magnitude of the first-mover advantage, and the exclusive or shared ability to reap the benefits of the investment vis-à-vis rivals. When firms can grasp a large first-mover advantage from investing early but cannot differentiate themselves sufficiently from each other, they may be tempted to wage a preemptive war, investing prematurely at an early market stage, which actually kills option value. If firms are more on an equal footing but do not see much benefit from investing early, they may prefer to wait and invest (jointly) at a later stage when the future market is sufficiently mature. If, however, one firm has a dominant comparative cost advantage over its rival (e.g., a radical or drastic technological superiority), industry participants may prefer a consensual leader-follower investment arrangement involving less option value destruction.


Conclusions 

Corporate management’s strategic tool kit should 
provide clearer guidance on whether to pursue 
a wait-and-see stance in the face of uncertain 
market developments or jump on the first-mover 
bandwagon to build competitive advantage. We 
discussed two different modeling approaches that 
provide complementary perspectives and insights 
to help management deal with issues of flexibility 
versus commitment: real options and dynamic 
game theory. While each approach separately might turn a blind eye to flexibility or commitment, an integrative perspective through “option games” might provide the right balance and serve as a tool kit for adaptive competitive strategy.
Both perspectives ultimately aim to derive bet¬ 
ter insights into industry dynamics under indus¬ 
try conditions characterized by both market and 
strategic uncertainty. 

Option games pave the way for a consistent approach in addressing managerial decision-making, elevating the art of strategy to scientific analysis. Option games integrate in a common, consistent framework recent advances made in these diverse disciplines. This emerging field represents a promising strategic management tool that can help guide managerial decisions through the complexity of the modern competitive marketplace.


Cross-References 

► Auctions 

► Learning in Games 


Recommended Reading 

Smit and Trigeorgis (2004) discuss related trade¬ 
offs with discrete-time real option techniques. 
Grenadier (2000) and Huisman (2001) examine 
a number of continuous-time models. Chevalier- 
Roignant and Trigeorgis (2011) synthesize both 
types of “option games.” An overview of the 
literature is provided in Chevalier-Roignant et al. (2011).
Abstract

The nonlinear Kuramoto equations for n coupled 
oscillators are derived and studied. The oscilla¬ 
tors are defined to be synchronized when they os¬ 
cillate at the same frequency and their phases are 
all equal. A control-theoretic viewpoint reveals 
that synchronized states of Kuramoto oscillators 
are locally asymptotically stable if every oscilla¬ 
tor is coupled to all others. The problem of syn¬ 
chronization in Kuramoto oscillators is closely 
related to rendezvous, consensus, and flocking 
problems in distributed control. These problems, 
with their elegant solution by graph theory, are 
discussed briefly. 


Keywords 

Graph theory; Kuramoto model; Laplacian; 


Introduction

An oscillator is an electronic circuit or other kind 
of dynamical system that produces a periodic 
signal. If several oscillators are coupled together 
in some fashion and the periodic signals that they 
each produce are of the same frequency and are in 
phase, the oscillators are said to be synchronized. 
The book Sync: The Emerging Science of Spontaneous Order, by Strogatz, introduces a wide variety of phenomena where oscillators synchronize. Some examples from biology: networks of pacemaker cells in the heart, circadian pacemaker cells in the suprachiasmatic nucleus of the brain, metabolic synchrony in yeast cell suspensions, groups of synchronously flashing fireflies, and crickets that chirp in unison. Engineering examples include clock synchronization in distributed communication networks and electric power networks with synchronous generators.

Oscillator Synchronization, Fig. 1 Two metronomes on a board that is on two pop cans. After the metronomes are let go at the same frequency but at different times, they soon become synchronized and tick in unison.

A very simple example of oscillator synchro¬ 
nization was discovered by Christiaan Huygens, 
the prominent Dutch scientist and mathematician 
who lived in the 1600s. One of his contributions 
was the invention of the pendulum clock, where a 
pendulum swings back and forth with a constant 
frequency. Huygens observed that two pendulum 
clocks in his house synchronized after some time. 
The explanation for this phenomenon is that the 
pendula were coupled mechanically through the 
wooden frame of the house. The same principle 
can be observed by a fun, simple experiment. As 
in Fig. 1, put two pop cans on a table, on their 
sides and parallel to each other. Place a board on 
top of them, and place two (or more) metronomes 
on the board. Set the metronomes to tick at the 
same frequency. Start them off ticking but not in 
unison. Within a few minutes they will be ticking 
in unison. 

In this essay we derive what are known as 
the Kuramoto equations, a mathematical model 
of n oscillators, and then we study when they will 
synchronize. 


The Kuramoto Model 

In 1975 the Japanese researcher Yoshiki Kuramoto gave one of the first serious mathematical studies of coupled oscillators. To derive Kuramoto’s equations, we begin with a simple hypothetical setup. Imagine $n$ runners going around a circular track. Suppose they’re all going at roughly the same speed, but each adjusts his/her speed based on the speeds of his/her nearest neighbors. If some runner passes another, that one tends to speed up to close the gap. The synchronization question is: do the runners eventually end up running together in a tight pack?

Idealize the runners to be merely points, numbered $k = 1, \dots, n$. They move on the unit circle in the complex plane. A point on the unit circle can be written as $e^{j\theta}$, where $j$ denotes the unit imaginary number and $\theta$ denotes the angle measured counterclockwise from the positive real axis. The position of point $k$ at time $t$ is $z_k(t) = e^{j(\omega t + \theta_k(t))}$, where $\omega$ is the nominal rotational speed in rad/s, and $\theta_k(t)$ is the difference between the actual angle at time $t$ and the nominal angle $\omega t$. Notice that $\omega$ is a constant positive real number and it is the same for all $n$ points. As in circuit theory, it simplifies the mathematics to refer all the positions to the sinusoid $e^{j\omega t}$, and therefore we define the local position of point $k$ to be $p_k(t) = z_k(t)/e^{j\omega t}$, i.e., $p_k(t) = e^{j\theta_k(t)}$. Differentiate the local position with respect to time and let “dot” denote $d/dt$: $\dot{p}_k = e^{j\theta_k} j \dot{\theta}_k$. Define the local rotational velocity $v_k = \dot{\theta}_k$ and substitute into the preceding equation:

$$\dot{p}_k = v_k \, j p_k. \qquad (1)$$

The local velocity $v_k$ could be positive or negative. Notice that if we view $p_k$ as a vector from the origin and view multiplication by $j$ as rotation by $\pi/2$, then $j p_k$ can be viewed as tangent to the circle at the point $p_k$; see the picture on the left in Fig. 2.

Now we propose a feedback law for $v_k$ in Eq. (1); see the picture on the right in Fig. 2. Take $v_k$ proportional to the projection of $p_i$ onto the tangent at $p_k$, that is, $v_k = \langle p_i, j p_k \rangle$. Here the inner product between two complex numbers $v$, $w$ is $\langle v, w \rangle = \mathrm{Re}\, \bar{v} w$. (You may check that this is equivalent to the usual dot product of vectors in $\mathbb{R}^2$.) Thus from (1) the model to get $k$ to close the gap is $\dot{p}_k = \langle p_i, j p_k \rangle \, j p_k$.

Oscillator Synchronization, Fig. 2 Left: The vectors $p_k$ and $j p_k$. Right: The local velocity $v_k$

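More generally, suppose that point $k$ pays attention to not just point $i$ but a fixed set of points called its neighbors. Let $\mathcal{N}_k$ denote the index set of neighbors of point $k$ and for simplicity assume $\mathcal{N}_k$ does not depend on time. We consider the control law $v_k = \sum_{i \in \mathcal{N}_k} \langle p_i, j p_k \rangle$ and thereby arrive at the model of the evolution of the positions $p_k$:

$$\dot{p}_k = \sum_{i \in \mathcal{N}_k} \langle p_i, j p_k \rangle \, j p_k.$$

However, the Kuramoto model gives the evolution of the angles $\theta_k$ rather than the points $p_k$. To find the equation for $\theta_k$, we observe that

$$\langle p_i, j p_k \rangle = \mathrm{Re}\left( \bar{p}_i \, j p_k \right) = \mathrm{Re}\left( e^{-j\theta_i} \, j \, e^{j\theta_k} \right) = \sin(\theta_i - \theta_k).$$

In this way, the controlled points move according to

$$\dot{p}_k = \sum_{i \in \mathcal{N}_k} \sin(\theta_i - \theta_k) \, j p_k.$$

Substitute in $p_k = e^{j\theta_k}$ and then cancel $j p_k$:

$$\dot{\theta}_k = \sum_{i \in \mathcal{N}_k} \sin(\theta_i - \theta_k), \quad k = 1, \dots, n. \qquad (2)$$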

This is the Kuramoto model of coupled oscil¬ 
lators in terms of the phases of the oscillators. 
There are n coupled nonlinear ordinary differen¬ 
tial equations. 
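
The derivation can be checked numerically with a small sketch. The helper names `inner` and `kuramoto_rhs` are hypothetical, indices are 0-based, and the neighbor sets are passed as a list of index lists; the inner product is taken as the real part of the conjugate of the first argument times the second.

```python
import cmath
import math


def inner(v, w):
    """Inner product of two complex numbers viewed as vectors in R^2."""
    return (v.conjugate() * w).real


def kuramoto_rhs(theta, neighbors):
    """Right-hand side of the Kuramoto model for given neighbor sets.

    theta     : list of phases theta_k (radians)
    neighbors : neighbors[k] lists the indices i that point k pays attention to
    """
    return [
        sum(math.sin(theta[i] - theta[k]) for i in neighbors[k])
        for k in range(len(theta))
    ]


# Check the identity <p_i, j p_k> = sin(theta_i - theta_k) at arbitrary angles.
theta_i, theta_k = 0.3, 1.1
p_i = cmath.exp(1j * theta_i)
p_k = cmath.exp(1j * theta_k)
projection = inner(p_i, 1j * p_k)

# Three oscillators on a directed ring: 0 follows 1, 1 follows 2, 2 follows 0.
rates = kuramoto_rhs([0.0, 0.5, 1.0], [[1], [2], [0]])
```

In the ring example, oscillators 0 and 1 speed up toward the one ahead of them, while oscillator 2 slows down toward oscillator 0.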

Equation (2) has the vector form $\dot{\theta} = g(\theta)$. There are some variations in the literature about the state space associated with this equation. It is important to get the state space right because otherwise the concepts of stability and synchronization become shaky. The phase angles $\theta_k$ are real numbers with units of radians, so at first glance the state space is $\mathbb{R}^n$. But the angles are defined modulo $2\pi$ and so their values are restricted to lie in the interval $[0, 2\pi)$. In this way the state space becomes $[0, 2\pi)^n$. For example, if $n = 2$ the state space is the square $[0, 2\pi) \times [0, 2\pi)$ viewed as a subset of the plane $\mathbb{R}^2$. The mapping $\theta \mapsto e^{j\theta}$ is a one-to-one correspondence from the interval $[0, 2\pi)$ to the unit circle in the complex plane $\mathbb{C}$. This unit circle is usually denoted $S^1$, the superscript signifying the circle’s dimension as a manifold. By this correspondence the state space of (2) is the $n$-fold product $S^1 \times \cdots \times S^1$, and this is sometimes called the $n$-torus, denoted $T^n$.

To recap, in what follows, the state space is $[0, 2\pi)^n$. This is an $n$-dimensional manifold rather than a vector space.


Synchronization 

Control-theoretic methods, for example, that of Sepulchre et al. (2007), have been insightful. We address now the question of whether or not the oscillators in (2) synchronize, that is, the phases asymptotically converge to a common value. In the state space, $[0, 2\pi)^n$, the set of synchronized states is the set of vectors $\theta$ of the form $c\mathbf{1}$, where $c \in [0, 2\pi)$ and $\mathbf{1}$ is the vector of 1’s. The simplest case is when every point is a neighbor of every other point, i.e., $\mathcal{N}_k$ contains every integer in the set $1, \dots, n$ except $k$. Then (2) becomes

$$\dot{\theta}_k = \sum_{i=1}^{n} \sin(\theta_i - \theta_k), \quad k = 1, \dots, n. \qquad (3)$$

Let us show that if the initial phases $\theta_k(0)$ are all close enough together, then $\theta(t)$ converges asymptotically to a synchronized state. This will show that the synchronized states are locally asymptotically stable in a certain sense.

As stated before, Eq. (3) has the form $\dot{\theta} = g(\theta)$. The function $g(\theta)$ is the gradient of a positive definite function. Indeed, let $r e^{j\psi}$ denote the average of the points $e^{j\theta_1}, \dots, e^{j\theta_n}$. Of course, $r$ and $\psi$ are functions of $\theta$, and so we have

$$r(\theta) e^{j\psi(\theta)} = \frac{1}{n} \left( e^{j\theta_1} + \cdots + e^{j\theta_n} \right)$$

and therefore

$$r(\theta) = \frac{1}{n} \left| e^{j\theta_1} + \cdots + e^{j\theta_n} \right|.$$

The average of $n$ points on the unit circle lives inside the unit disc, and therefore $r(\theta)$ is a real number between 0 and 1. It equals 1 if and only if the $n$ points are equal, that is, the $n$ phases are equal, and this is the state where the phases are synchronized.

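Define the function

$$V(\theta) = \frac{n^2}{2}\, r(\theta)^2
= \frac{1}{2}\left|e^{j\theta_1} + \cdots + e^{j\theta_n}\right|^2
= \frac{1}{2}\left(e^{j\theta_1} + \cdots + e^{j\theta_n}\right)\left(e^{-j\theta_1} + \cdots + e^{-j\theta_n}\right).$$

Thus

$$\frac{\partial V(\theta)}{\partial \theta_k} = \sin(\theta_1 - \theta_k) + \cdots + \sin(\theta_n - \theta_k)$$

and therefore (3) can be written as $\dot\theta = \partial V(\theta)/\partial\theta$.
This is a gradient equation. If $\theta(0)$
is chosen so that all the phases are close enough
together, then $r(\theta(0))$ will be close to 1, and
therefore $\theta$ will move in a direction to increase
$V(\theta)$, that is, increase $r(\theta)$, until in the limit
$r(\theta) = 1$ and the phases are synchronized.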
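
The gradient argument can be illustrated numerically. The following is a minimal sketch, not part of the original entry: it integrates (3) with a forward-Euler scheme and records $r(\theta)$ along the way. The step size, the number of steps, the initial phases, and the function names are all illustrative assumptions, and phases are not wrapped to $[0,2\pi)$ since only phase differences matter here.

```python
import cmath
import math


def order_parameter(theta):
    """Return r(theta): the modulus of the average of the points e^{j theta_i}."""
    n = len(theta)
    return abs(sum(cmath.exp(1j * t) for t in theta)) / n


def kuramoto_step(theta, dt):
    """One forward-Euler step of (3): d(theta_k)/dt = sum_i sin(theta_i - theta_k)."""
    rates = [sum(math.sin(ti - tk) for ti in theta) for tk in theta]
    return [tk + dt * rk for tk, rk in zip(theta, rates)]


def simulate(theta0, dt=0.01, steps=2000):
    """Integrate (3) from theta0 and return the final phases and the history of r."""
    theta = list(theta0)
    history = [order_parameter(theta)]
    for _ in range(steps):
        theta = kuramoto_step(theta, dt)
        history.append(order_parameter(theta))
    return theta, history


# Assumed initial phases, all within less than half a circle of each other.
theta0 = [0.0, 0.4, 0.9, 1.3, 1.7]
theta_final, r_history = simulate(theta0)
print("initial r:", r_history[0])
print("final r:  ", r_history[-1])
print("final phase spread:", max(theta_final) - min(theta_final))
```

With initial phases this close together, the recorded values of $r$ should increase toward 1 while the spread of the phases shrinks, in line with the argument above.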

There are results, e.g., Sepulchre et al. (2008),
when the coupling is not all-to-all. Also, the term
“synchronization” is used more generally than
just for oscillators; see Wieland et al. (2011).


Rendezvous, Consensus, Flocking, 
and Infinitely Many Oscillators 

Synchronization of coupled oscillators is closely
related to other problems known as rendezvous,
consensus, or flocking problems. Phase synchronization
is replaced by the requirement of mobile
robots gathering at some location, by the requirement
of temperature sensors in a sensor network
converging to the same temperature estimate, or
by the requirement that mobile robots should
head in the same direction. The simplest form of
these problems has the equations

$$\dot\theta_k = \sum_{i \in \mathcal{N}_k} (\theta_i - \theta_k), \quad k = 1,\dots,n. \tag{4}$$


Notice that this can be obtained from the Kuramoto
model (2) merely by replacing $\sin(\theta_i - \theta_k)$
by $\theta_i - \theta_k$ in (2), that is, by linearizing the
latter at a synchronized state. We shall continue
to call $\theta_k$ a phase of an oscillator. When do the
phases evolving according to (4) synchronize?
The answer to the question involves a lovely collaboration
between graph theory and dynamics.

Introduce a directed graph that is in one-to-one
correspondence with the neighbor structure.
The graph is made up of $n$ nodes, one for each
oscillator. From each node there is an arrow to
every neighbor of that node; that is, from node
$k$ is an arrow to every node in $\mathcal{N}_k$. Denote the
adjacency matrix and the degree matrix of the
graph by, respectively, $A$ and $D$. That is, $a_{ij} = 1$
if $j$ is a neighbor of $i$ and $d_{ii}$ equals the sum of
the elements on row $i$ of $A$. The Laplacian of the
graph is defined to be $L = D - A$. Then (4) is
equivalent to simply

$$\dot\theta = -L\theta, \tag{5}$$

where $\theta$ is still the vector with elements
$\theta_1,\dots,\theta_n$. Whether or not synchronization
occurs depends on the connectivity of the graph.
We stop here and refer the reader to the articles

► Averaging Algorithms and Consensus and

► Flocking in Networked Systems
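
The passage from the neighbor sets to the Laplacian form (5) can be sketched in a few lines. The code below is an illustrative assumption rather than anything from the original entry: the four-node neighbor structure, the Euler step size, and the function names are all made up for the example. It builds $L = D - A$, checks that $-L\theta$ agrees with the sum in (4), and integrates (5) for a graph that happens to be strongly connected.

```python
def laplacian(neighbors):
    """Build L = D - A, where neighbors[k] is the set of neighbors of node k."""
    n = len(neighbors)
    A = [[1.0 if i in neighbors[k] else 0.0 for i in range(n)] for k in range(n)]
    L = [[-A[k][i] for i in range(n)] for k in range(n)]
    for k in range(n):
        L[k][k] += sum(A[k])  # degree d_kk = sum of row k of A
    return L


def rhs_sum_form(theta, neighbors):
    """Right-hand side of (4): sum over neighbors i of (theta_i - theta_k)."""
    return [sum(theta[i] - theta[k] for i in neighbors[k]) for k in range(len(theta))]


def rhs_laplacian_form(theta, L):
    """Right-hand side of (5): -L theta."""
    n = len(theta)
    return [-sum(L[k][i] * theta[i] for i in range(n)) for k in range(n)]


def simulate(theta0, L, dt=0.01, steps=2000):
    """Forward-Euler integration of (5)."""
    theta = list(theta0)
    for _ in range(steps):
        rate = rhs_laplacian_form(theta, L)
        theta = [t + dt * r for t, r in zip(theta, rate)]
    return theta


# Hypothetical neighbor sets (0-based node labels), chosen to be strongly connected.
neighbors = [{1}, {2}, {0, 3}, {0}]
L = laplacian(neighbors)
theta0 = [0.3, -1.0, 2.0, 0.5]
theta_final = simulate(theta0, L)
print("final phases:", theta_final)
```

For this particular graph the phases settle at a common value; other neighbor structures may behave differently, which is exactly the connectivity question deferred to the cross-referenced articles.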

Suppose there are an infinite but countable 
number of oscillators in the model (5). When will 
they synchronize? To answer this, we have to be 
more specific. 

Let us allow an infinite number of oscillators
numbered by the integers, positive, zero, and
negative. Denote the phases by $\theta_k$ and let $\theta$
denote the phase vector, whose $k$th component is
$\theta_k$. Assume each oscillator has only finitely many
neighbors, let $\mathcal{N}_k$ denote the set of neighbors of
oscillator $k$, and let $L$ be the Laplacian of the associated
graph. Finally, let $\theta(t)$ evolve according
to Eq. (5). This equation isn’t automatically
well posed in the sense that there may not be a
solution defined for all $t > 0$. We have to impose
a framework so that solutions do indeed exist.
One natural space in which to place $\theta(0)$ is $\ell^2$,
the Hilbert space of square-summable sequences.
If $L$ is a bounded operator on $\ell^2$, then so is $e^{-Lt}$
for every $t > 0$, and hence the phase vector
exists and belongs to $\ell^2$ for every $t > 0$. Another
natural space in which to place $\theta(0)$ is $\ell^\infty$, the
Banach space of bounded sequences. Again, a
phase vector exists for all $t > 0$ if $L$ is a bounded
operator on $\ell^\infty$.

The following example is from Feintuch and
Francis (2012). Take the neighbor sets to be
$\mathcal{N}_k = \{k - 1\}$. The graph is a chain: There is
an arrow from node $k$ to node $k - 1$, for every
$k$, and the Laplacian is the infinite matrix with
1 on the diagonal, $-1$ on the first subdiagonal,
and zero elsewhere. This Laplacian is a bounded
operator on both $\ell^2$ and $\ell^\infty$. Now the vector $c\mathbf{1}$,
where $\mathbf{1}$ is the vector of all 1’s, belongs to $\ell^\infty$
for every real number $c$, but it belongs to $\ell^2$
only for $c = 0$. So the phases can potentially
synchronize at any value in $\ell^\infty$, but only at 0 in
$\ell^2$. For the example under discussion, if the initial
phase vector is in $\ell^2$, then the phases synchronize
at 0. By contrast, there exist initial phase
vectors in $\ell^\infty$ such that synchronization does
not occur. Even worse, $\lim_{t\to\infty} \theta(t)$ does not
exist. The conclusion is that whether or not the
oscillators will synchronize is a difficult question
in general.

Summary and Future Directions 

The Kuramoto model is a widely used paradigm
for coupled oscillators. The model has the form
$\dot\theta = f(E\theta)$, where $\theta$ is the vector of phases,
the matrix $E$ maps $\theta$ into the vector of possible
differences $\theta_i - \theta_k$, and $f$ is a function. The
Kuramoto model considered in this essay is not
the most general. A more general model allows
different frequencies $\omega_k$ instead of just one, and
also a coupling gain $K$, leading to the model

$$\dot\theta_k = \omega_k + \frac{K}{n} \sum_{i \in \mathcal{N}_k} \sin(\theta_i - \theta_k), \quad k = 1,\dots,n. \tag{6}$$

An important problem associated with the
Kuramoto model is to determine which synchronized
states are stable. The linearized equation is
interesting in its own right and relates to problems
of rendezvous, consensus, and flocking.

Reference Dorfler and Bullo (2014) offers 
some questions for future study. In particular, 
it would be interesting to extend the Kuramoto 
model beyond the first-order oscillators of (2). 
Also, the case of general neighbor sets has much 
room for exploration. 

Asymptotic stability is a robust property. For
example, if the origin is asymptotically stable
for the system $\dot x = Ax$, it remains so if $A$ is
perturbed by a sufficiently small amount. This
is because the spectrum of a matrix is a continuous
function of the matrix. The sketch in
Fig. 1 vividly depicts the concept of synchronized
oscillators. A topic for future study is that of robustness.
Mathematically, if the two metronomes
are identical, they will synchronize perfectly;
this can be proved. Of course, physically two
metronomes cannot be identical, and yet they will
synchronize if they are close enough physically.
A mathematical study of this phenomenon might
be interesting.

Cross-References 

► Averaging Algorithms and Consensus 

► Flocking in Networked Systems 

► Graphs for Modeling Networked Interactions 

► Networked Systems 

► Vehicular Chains 

Recommended Reading 

The literature on the Kuramoto model is
huge; there are now many hundreds of journal
papers continuing the study of oscillators using
Kuramoto’s model. There is space here only to
highlight a few sources.

You can find a mathematical study of
coupled metronomes in Pantaleone (2002). Also,
Pantaleone’s webpage (Pantaleone) describes
some experimental observations. Kuramoto’s
original paper is Kuramoto (1975). Dorfler and
Bullo have recently written a comprehensive
survey (Dorfler and Bullo 2014). Strogatz has
written extensively on oscillator synchronization.
His book Sync is fascinating and is highly
recommended (Strogatz 2004). See also Strogatz
(2000) and Strogatz and Stewart (1993). The
papers Scardovi et al. (2007) and Dorfler
and Bullo (2011) are recommended for more
recent results, the latter treating the general
model (6).

Getting phases in oscillators to synchronize is 
a special case of getting the states or outputs of 
coupled systems asymptotically to converge to 
a common value. There is a very large number 
of references on these subjects, a seminal one 
being Jadbabaie et al. (2003); others are Lin et al. 
(2007) and Moreau (2005). Regarding infinitely 
many oscillators, the physics literature treats only 
a continuum of oscillators, whereas countably 
many oscillators are the subject of Feintuch and 
Francis (2012). 

Acknowledgments I greatly appreciate the help from 
Luca Scardovi, Florian Dorfler, and Francesco Bullo. 
Abstract

This entry discusses some of the salient features
of the output regulation problem for hybrid
systems, especially in connection with the steady-state
characterization. In order to better highlight
such peculiarities, the discussion is mostly
focused on the simplest class of linear time-invariant
systems exhibiting such behaviors. In
comparison with the usual regulation theory, the
role played by the zero dynamics and by the
presence of more inputs than outputs is particularly
striking.
Introduction

Output regulation is one of the most classical
problems in control theory, and its celebrated
solution in the linear time-invariant case (Davison
1976; Francis and Wonham 1976) is characterized
by remarkable elegance and ideas (like the
internal model principle). While the extension to
nonlinear systems is still an active field of investigation,
the study of output regulation for hybrid
systems is also being actively pursued, and several
surprising results have already appeared for
the linear case, suggesting that a richer structure
arises in hybrid output regulation problems due to
the interplay between flow and jump dynamics.

The problem can be stated as follows. A
known exosystem $\mathcal{E}$ with initial state belonging
to a suitably defined set $\mathcal{W}_0$ produces a signal
$w$ possibly affecting both the plant $\mathcal{P}$ and the
compensator $\mathcal{C}$; the compensator has to guarantee
that for any initial state of $\mathcal{E}$ in the set $\mathcal{W}_0$:

• All closed-loop responses are bounded.

• The output $e$ of $\mathcal{P}$ asymptotically converges to zero.

In order to avoid trivialities, the exosystem $\mathcal{E}$
is assumed to be such that its state evolution
from nonzero initial states in $\mathcal{W}_0$ is bounded and
not asymptotically converging to zero, both in
forward and in backward time.

Two typical embodiments of the output regulation
problem are the disturbance rejection and
the reference tracking problems. In disturbance
rejection, $w$ acts as a disturbance on $\mathcal{P}$ and cannot
be measured by $\mathcal{C}$, and the output $e$ from which
the effect of $w$ has to be canceled is the actual
plant output. In reference tracking, $w$ contains the
references $r$ to be tracked by an output $y_r$ of $\mathcal{P}$,
so that $w$ can be assumed to be known by $\mathcal{C}$; by
defining the regulated output $e$ as $e = y_r - r$, the
reference tracking problem is cast as an output
regulation problem.


The solution of an output regulation problem
entails the solution of two subproblems: the
definition of a set of zero output steady-state
solutions and the asymptotic stabilization of such
solutions (or at least making them attractive; in
many cases of interest, the achievement of this
last objective actually yields asymptotic stabilization).
As a matter of fact, the stabilization
subproblem is already widely studied and described
per se; for this reason, after some short
remarks in section “Stabilization Obstructions in
Hybrid Regulation”, the remainder of this presentation
will focus only on steady-state-related
issues, for the simplest class of systems which
exhibit the most peculiar and interesting phenomena
of hybrid steady-state behavior (see in
particular section “Key Features in Hybrid vs
Classical Output Regulation”). For concreteness,
only hybrid systems $\mathcal{E}$, $\mathcal{P}$ characterized by linear
time-invariant (flow and jump) dynamics will
be considered; following Goebel et al. (2012,
Chap. 2), a two-dimensional parameterization of
hybrid time $(t,k) \in \mathbb{R} \times \mathbb{N}$ will be used, with $t$
measuring the flow of (usual) time and $k$ counting
the number of jumps experienced by the solution
(see Fig. 1 for a specific example). So, the exosystem
$\mathcal{E}$ will be described at time $(t,k)$ by

$$\dot w = Sw, \quad (w,t,k) \in \mathcal{C}_{\mathcal{E}}, \tag{1a}$$

$$w^+ = Jw, \quad (w,t,k) \in \mathcal{D}_{\mathcal{E}}, \tag{1b}$$



Output Regulation Problems in Hybrid Systems,
Fig. 1 Hybrid time domain $\mathcal{T}$ for a “sampled data” hybrid
system. Dots indicate $(t,k) \in \mathcal{T}$ when jumps occur
(see section “Hybrid Steady-State Generation” for the $t_k$
notation)
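and the plant $\mathcal{P}$ will be described at time $(t,k)$ by

$$\dot x = Ax + Bu + Pw, \quad (x,u,t,k) \in \mathcal{C}_{\mathcal{P}}, \tag{2a}$$

$$x^+ = Ex + Rw, \quad (x,u,t,k) \in \mathcal{D}_{\mathcal{P}}, \tag{2b}$$

$$e = Cx + Qw, \tag{2c}$$

with $x(t,k) \in \mathbb{R}^n$, $u(t,k) \in \mathbb{R}^m$, $e(t,k) \in \mathbb{R}^p$,
$w(t,k) \in \mathbb{R}^q$, and suitably defined flow sets $\mathcal{C}_{\mathcal{E}}$,
$\mathcal{C}_{\mathcal{P}}$ and jump sets $\mathcal{D}_{\mathcal{E}}$, $\mathcal{D}_{\mathcal{P}}$.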


Stabilization Obstructions in Hybrid 
Regulation 

The achievement of asymptotic stabilization of
the desired (zero output) steady-state responses
for the considered class of linear hybrid systems
crucially depends on whether the plant $\mathcal{P}$ and
the exosystem $\mathcal{E}$ have synchronous jump times or
not.

Asynchronous Jumps 

Typically, jumps in $\mathcal{P}$ and $\mathcal{E}$ will be asynchronous,
and this will cause the undesirable
phenomenon that genuinely close trajectories
will look “distant” around each jump when the
distance is measured according to the usual
Euclidean norm. The simplest illustration of
such a phenomenon consists in considering two
trajectories of the same system starting from
$\varepsilon$-close initial conditions. Consider the system

$$\dot v = 1, \quad v \in [0,1], \qquad v^+ = 0, \quad v \notin (0,1),$$

with the initial states $v_0 = 0$ and $v_1 = \varepsilon$, $0 < \varepsilon < 1$.
The two ensuing solutions at time $(t,k)$
are immediately computed as

$$v(t,k;v_0) = t - k, \quad t \in [k,k+1],$$

$$v(t,k;v_1) = \begin{cases} t - k + \varepsilon, & t \in [k,k+1-\varepsilon], \\ t - k + \varepsilon - 1, & t \in [k+1-\varepsilon,k+1]. \end{cases}$$

Hence, the (Euclidean) distance between the two
solutions at time $(t,k)$ is given by

$$d(t,k) = \begin{cases} \varepsilon, & t \in [k,k+1-\varepsilon], \\ 1 - \varepsilon, & t \in [k+1-\varepsilon,k+1]; \end{cases}$$

in other words, choosing $\varepsilon > 0$ as small as desired,
arbitrarily close initial conditions generate
trajectories which are apart by a finite amount (as
close as desired to 1) during the arbitrarily small
time intervals where $t \in [k+1-\varepsilon,k+1]$.
Since stability deals with trajectories remaining
close forever and attractivity deals with trajectories
getting closer and closer, examples such as
the one above pose serious issues when defining
(let alone establishing) stability and attractivity in
the hybrid case. Similar problems arise not only
in output regulation problems but also in other
areas like state tracking, observers, and general
interconnections of hybrid systems.
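
The "peaking" of the Euclidean distance in the timer example can be reproduced with a few lines of code. This is only an illustrative sketch under stated assumptions: the helper names are hypothetical, the value of $\varepsilon$ is arbitrary, and the timer value is evaluated as a function of ordinary time $t$ alone (taking the post-jump value at jump instants), which sidesteps the hybrid time bookkeeping.

```python
def timer_value(t, v_init):
    """Value at ordinary time t of the timer (dv/dt = 1, reset to 0 at v = 1),
    started from v(0) = v_init in [0, 1). At a jump instant this returns the
    post-jump value."""
    return (t + v_init) % 1.0


def euclidean_distance(t, eps):
    """Distance between the solutions started from v0 = 0 and v1 = eps."""
    return abs(timer_value(t, 0.0) - timer_value(t, eps))


eps = 0.01  # assumed small offset between the two initial conditions
for t in (0.25, 0.5, 0.985, 0.995, 1.5, 1.995):
    print(f"t = {t:5.3f}   distance = {euclidean_distance(t, eps):.3f}")
```

For most sample times the distance equals $\varepsilon$, but for sample times in the short window just before an integer it jumps to $1 - \varepsilon$, however small $\varepsilon$ is chosen.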

However, intuition suggests (and mathematics
confirms, by using a suitable notion of “distance”)
that such trajectories are close indeed.
Several approaches have been proposed in order
to overcome such difficulty. Considering as an
example a bouncing ball tracking another bouncing
ball, the problematic time intervals are those
between the bounce of the first ball hitting the
ground and the bounce of the other ball; in such
a case, the modified distances are defined by
either

• Allowing to exclude sufficiently short
“problematic” intervals (possibly requiring
that their length asymptotically tends to zero);
see, e.g., Galeani et al. (2008, 2012)

• Considering alternative “mirrored” trajectories
computed as if the last jump did not
happen; see, e.g., Forni et al. (2013a,b)

• Using a “stretched” distance function $\delta$ such
that when point $a$ is in the jump set and
its image via the jump map is $g(a)$, then
$\delta(a,b) = \delta(g(a),b)$; see, e.g., Biemond et al.
(2013).

While the first approach has been proposed first,
the other two (which are strongly related) have
the advantage of providing (under mild additional
hypotheses) global control Lyapunov functions.
Finally, it is worth noting that the most adequate
tools to address similar issues for general hybrid
systems are the “graphical distance” among hybrid
arcs and related concepts (see Goebel et al.
2012, Chap. 5).

Synchronous Jumps 

When synchronous jumps are considered, the
above issue disappears, and asymptotic stabilization
becomes a much simpler matter. Although
synchronous jumps look more like an exception
than a rule in hybrid systems, they are very
reasonable for specific classes of problems.

In order to have synchronous jumps, some
authors have considered the use of “jump inputs”
which impose a jump at a certain time, which
can be physically reasonable in some systems,
e.g., two tanks separated by a movable wall,
assuming that when the wall is removed the fluid
reaches the equilibrium configuration almost instantaneously.

Another relevant class consists of “sampled
data” systems, whose jumps are essentially due
to digital components which operate at a fixed
sampling rate, which will be considered in the
rest of this entry. In such a case, letting $\tau_M$ be the
sampling period, the time domain of the hybrid
system is fixed as (see Fig. 1)

$$\mathcal{T} := \{(t,k) : t \in [k\tau_M, (k+1)\tau_M],\ k \in \mathbb{Z}\}, \tag{3}$$

all jumps happen exactly for $(t,k)$ with $t = (k+1)\tau_M$,
and then (1) can be simplified as

$$\dot w = Sw, \tag{4a}$$

$$w^+ = Jw, \tag{4b}$$

and (2) can be simplified as

$$\dot x = Ax + Bu + Pw, \tag{5a}$$

$$x^+ = Ex + Rw, \tag{5b}$$

$$e = Cx + Qw, \tag{5c}$$

since flow and jump times are clear from the
context.

For the latter class of systems, by using linear
time-invariant hybrid control laws and observers
(and an easily provable separation principle), it is
easily shown that:

• Under a hybrid stabilizability hypothesis,
state feedback stabilization of (5) is easily
achieved.

• Output feedback stabilization of (5) from $e$
is also trivial under an additional hybrid detectability
hypothesis.

• Under hybrid detectability of the cascade
of (4) and (5), $w$ can be asymptotically
estimated from $e$.

Due to the above facts, it can be assumed without
loss of generality that (5) is asymptotically stable
(equivalently, that all eigenvalues of $Ee^{A\tau_M}$
have modulus strictly less than one). Asymptotic
stability then yields incremental stability, since
letting $\hat x$ and $\bar x$ denote two motions under the
same inputs $u$, $w$ and only differing in their initial
states, it is immediate to see that their difference
$\tilde x := \hat x - \bar x$ evolves as

$$\dot{\tilde x} = \dot{\hat x} - \dot{\bar x} = A\hat x + Bu + Pw - (A\bar x + Bu + Pw),$$

$$\tilde x^+ = \hat x^+ - \bar x^+ = E\hat x + Rw - (E\bar x + Rw),$$

that is, $\dot{\tilde x} = A\tilde x$, $\tilde x^+ = E\tilde x$, and so it is
just a free motion of the plant, asymptotically
converging to zero. Incremental stability implies
that regulation is achieved as soon as it is shown
that for any exogenous input $w$ it is possible
to find an input $u$ and an initial state of (5)
such that $e$ is identically zero, since then any
other motion arising from a different initial state
will asymptotically converge to the motion with
identically zero $e$. Moreover, it is easy to see that
asymptotic stability of the origin actually implies
uniform, global, and exponential stability of any
trajectory for such systems.
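
The eigenvalue test on $Ee^{A\tau_M}$ is easy to carry out numerically in small cases. The sketch below is an illustration under explicit assumptions: it is restricted to a two-state plant with diagonal $A$ (so that $e^{A\tau_M}$ is available in closed form), the function names are hypothetical, and the numerical values of $A$, $E$, and $\tau_M$ are this edition's reading of the example (18) discussed later in the entry.

```python
import cmath
import math


def jump_after_flow(E, a_diag, tau):
    """Return E * exp(A * tau) for a 2x2 E and a diagonal A = diag(a_diag)."""
    exp_a = [math.exp(a * tau) for a in a_diag]
    # Right-multiplying by a diagonal matrix scales the columns of E.
    return [[E[r][c] * exp_a[c] for c in range(2)] for r in range(2)]


def eigenvalues_2x2(M):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0


# Assumed data: A = diag(-1, -2), E = [[0, 1], [2e, 1]], tau_M = 1.
E = [[0.0, 1.0], [2.0 * math.e, 1.0]]
M = jump_after_flow(E, (-1.0, -2.0), 1.0)
eigs = eigenvalues_2x2(M)
print("eigenvalue moduli:", [abs(z) for z in eigs])
print("asymptotically stable:", all(abs(z) < 1.0 for z in eigs))
```

For non-diagonal $A$ a general matrix exponential routine would be needed in place of the closed-form expression used here.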


Hybrid Steady-State Generation 

From this point on, the rest of the presentation
will be focused only on the case where the problem
data are of the form (3) to (5), since this
allows an uncluttered view of some
peculiar features of hybrid steady-state motions,
without the burden of having to take care of
delicate stability issues arising in more general
contexts.

Based on the preceding discussion, there is no
loss of generality at this point in assuming that:

• Plant (5) is asymptotically stable, which is
equivalent to all eigenvalues of $Ee^{A\tau_M}$ having
a magnitude strictly less than one.

• Exosystem (4) is Poisson stable, which is
equivalent to all eigenvalues of $Je^{S\tau_M}$ having
a magnitude equal to one.

It is also customary to distinguish between full
information and error feedback regulation, where
in the first case controller $\mathcal{C}$ has access to the
complete state $(w,x)$ of the cascade of $\mathcal{E}$ and $\mathcal{P}$,
whereas in the second case $\mathcal{C}$ can only measure
the output $e$ of $\mathcal{P}$.

Having assumed asymptotic stability of plant
$\mathcal{P}$, the only role of compensator $\mathcal{C}$ consists in
generating the correct steady-state input, since
then, by incremental stability of $\mathcal{P}$, asymptotic
regulation is ensured from any initial state. Recalling
the expression of $\mathcal{T}$ in (3), for the following
developments it is useful to define the jump
times $t_k$ and the elapsed time of flow since last
jump $\sigma$ as

$$t_k := k\tau_M, \qquad \sigma(t,k) := t - k\tau_M;$$

the arguments of $\sigma(t,k)$ will usually be omitted
since they are clear from the context. Note that $\sigma$ satisfies
$\dot\sigma = 1$, $\sigma^+ = 0$, and it is often explicitly
introduced as an additional timer variable.

The Full Information Case 

Consider the candidate steady-state motion and
input:

$$\begin{bmatrix} x_{ss}(t,k) \\ u_{ss}(t,k) \end{bmatrix} = \begin{bmatrix} \Pi(\sigma) \\ \Gamma(\sigma) \end{bmatrix} w(t,k). \tag{6}$$

Requiring that such expressions actually characterize
a response of the considered plant, and
that the associated output is zero, amounts to asking
that:

• During flows, $x_{ss}(t,k)$ has to satisfy the two
equations:

$$\dot x_{ss}(t,k) = \dot\Pi(\sigma)w(t,k) + \Pi(\sigma)\dot w(t,k),$$

$$\dot x_{ss}(t,k) = Ax_{ss}(t,k) + Bu_{ss}(t,k) + Pw(t,k).$$

• At jumps, $x_{ss}^+(t,k)$ has to satisfy the two
equations:

$$x_{ss}^+(t_{k+1},k) = \Pi(0)w^+(t_{k+1},k),$$

$$x_{ss}^+(t_{k+1},k) = Ex_{ss}(t_{k+1},k) + Rw(t_{k+1},k).$$

• For the output $e_{ss}$ to be identically zero:

$$0 = Cx_{ss}(t,k) + Qw(t,k).$$

Substituting (6) in the above conditions and considering
that such relations should hold for all
values of $w$, the following hybrid regulator equations
are obtained:

$$\dot\Pi(\sigma) + \Pi(\sigma)S = A\Pi(\sigma) + B\Gamma(\sigma) + P, \tag{7a}$$

$$\Pi(0)J = E\Pi(\tau_M) + R, \tag{7b}$$

$$0 = C\Pi(\sigma) + Q. \tag{7c}$$

Equations (7) can be shown to be both necessary
and sufficient for (6) to solve the output regulation
problem under the considered assumptions.
Once a solution of (7) is available, the full information
regulator simply reduces to the time-varying
static feedforward controller

$$u(t,k) = \Gamma(\sigma)w(t,k) \tag{8}$$

which just provides as input the steady-state input
$u_{ss}$ characterized as in (6); in fact, since (5) is
incrementally stable (as follows from its asymptotic
stability, which was assumed without loss of
generality), its output response under the control
law (8) must converge to the output response
associated to (6).

For later use, note that in the non-hybrid case
where $\mathcal{P}$ and $\mathcal{E}$ only flow

$$\dot w = Sw, \tag{9a}$$

$$\dot x = Ax + Bu + Pw, \tag{9b}$$

$$e = Cx + Qw, \tag{9c}$$

the candidate steady state (6) is replaced by

$$\begin{bmatrix} x_{ss}(t) \\ u_{ss}(t) \end{bmatrix} = \begin{bmatrix} \Pi \\ \Gamma \end{bmatrix} w(t), \tag{10}$$

and (7) reduces to the celebrated regulator equations
(or Francis equations)

$$\Pi S = A\Pi + B\Gamma + P, \tag{11a}$$

$$0 = C\Pi + Q, \tag{11b}$$

and, as above, assuming without loss of generality
that the plant is asymptotically stable, the
full information regulator reduces to the time-invariant
static feedforward controller:

$$u(t,k) = \Gamma w(t,k). \tag{12}$$


The Error Feedback Case

When the exosystem state is not measured, a
dynamic compensator of the form

$$\dot\xi = F\xi + Ge, \tag{13a}$$

$$\xi^+ = L\xi, \tag{13b}$$

$$u = H\xi, \tag{13c}$$

which is also supposed to flow and jump according
to the a priori fixed time domain $\mathcal{T}$ considered
for the plant, is introduced, and the corresponding
candidate steady-state motion including $\xi$ is

$$\begin{bmatrix} x_{ss}(t,k) \\ \xi_{ss}(t,k) \\ u_{ss}(t,k) \end{bmatrix} = \begin{bmatrix} \Pi(\sigma) \\ \Sigma(\sigma) \\ \Gamma(\sigma) \end{bmatrix} w(t,k). \tag{14}$$

By following similar steps as above, requiring
invariance of such a manifold in the space of
$(x,\xi,u,w)$, as well as zero output on it, leads
to the conclusion that in addition to (7), the
following relations must be satisfied as well:

$$\dot\Sigma(\sigma) + \Sigma(\sigma)S = F\Sigma(\sigma), \tag{15a}$$

$$\Sigma(0)J = L\Sigma(\tau_M), \tag{15b}$$

$$\Gamma(\sigma) = H\Sigma(\sigma). \tag{15c}$$

Equations (7) and (15) can be shown to be both
necessary and sufficient for (13) to solve the
output regulation problem under the considered
assumptions and generalize the corresponding
conditions for the non-hybrid case where $\mathcal{P}$ and
$\mathcal{E}$ only flow (see (9)) and (13) and (14) are
replaced by

$$\dot\xi = F\xi + Ge, \tag{16a}$$

$$u = H\xi, \tag{16b}$$

$$\begin{bmatrix} x_{ss}(t) \\ \xi_{ss}(t) \\ u_{ss}(t) \end{bmatrix} = \begin{bmatrix} \Pi \\ \Sigma \\ \Gamma \end{bmatrix} w(t), \tag{16c}$$

and (15) reduces to

$$\Sigma S = F\Sigma, \tag{17a}$$

$$\Gamma = H\Sigma. \tag{17b}$$

Relations (17) are an expression of the internal
model principle, stating that in order to
achieve error feedback regulation, the compensator
$\mathcal{C}$ must include a suitable “copy” of the
exosystem, namely, (17a) imposes a constraint on
the $\xi$ dynamics of $\mathcal{C}$ which, coupled with (17b),
ensures that the signal $u_{ss} = \Gamma w$ used in the full
information case can be equivalently produced
(without measuring $w$!) as $u_{ss} = H\xi_{ss}$. A similar
interpretation can be given to (15), which must
be required in addition to (7) in order for (13) to
solve the hybrid error feedback output regulation
problem.
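
As a small numerical companion to the classical regulator equations (11), the sketch below solves them in the scalar case and checks the resulting feedforward law (12) by simulation. Everything specific here is an assumption made for illustration: the scalar plant data, the constant exosystem ($S = 0$), the Euler step size, and the function names are hypothetical and do not come from the entry.

```python
def solve_scalar_francis(a, b, p, c, q, s):
    """Solve Pi*s = a*Pi + b*Gamma + p and 0 = c*Pi + q for scalars Pi, Gamma.
    Requires c != 0 and b != 0."""
    pi = -q / c
    gamma = (pi * s - a * pi - p) / b
    return pi, gamma


def simulate_feedforward(a, b, p, c, q, gamma, w0, x0=0.0, dt=0.01, steps=2000):
    """Euler simulation of dx/dt = a*x + b*u + p*w with u = gamma*w and constant w.
    Returns the final regulated output e = c*x + q*w."""
    x, w = x0, w0
    for _ in range(steps):
        u = gamma * w
        x += dt * (a * x + b * u + p * w)
    return c * x + q * w


# Assumed example: stable plant dx/dt = -x + u, regulated output e = x - w,
# constant reference w (s = 0).
a, b, p, c, q, s = -1.0, 1.0, 0.0, 1.0, -1.0, 0.0
pi, gamma = solve_scalar_francis(a, b, p, c, q, s)
e_final = simulate_feedforward(a, b, p, c, q, gamma, w0=2.0)
print("Pi =", pi, " Gamma =", gamma, " final error =", e_final)
```

In this scalar setting (11b) fixes $\Pi$ and (11a) then fixes $\Gamma$; the simulated error decays because the plant is asymptotically stable, which mirrors the incremental stability argument used in the text.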


Key Features in Hybrid vs Classical 
Output Regulation 

While the previous section mainly aimed at showing
how the classical theory generalizes in the
hybrid case (at least for a special class of hybrid
systems), the aim of this section is to point out
some of the striking differences between the two
cases. Before proceeding further, and in order to
keep focus on the characterization of the steady-state
response, it is worth mentioning here that although
time-varying systems will be considered,
no issue regarding nonuniform stability (like in
general nonautonomous systems) arises since the
timer $\sigma$ just ranges in the compact set $[0,\tau_M]$ due
to the assumed periodic structure of $\mathcal{T}$ (see also
the end of section “Synchronous Jumps”).

Comparing the classical and the hybrid output
regulator and considering that $\mathcal{P}$ and $\mathcal{E}$ are time
invariant, it seems somewhat strange that in the
output feedback case the linear time-invariant
regulator (16a) and (16b) generalizes to a hybrid
linear time-invariant regulator (13), whereas in
the full information case the linear time-invariant
regulator (12) generalizes to a hybrid linear time-varying
regulator (8).

One argument in favor of the time-varying regulator
(8) is based on the following consideration.
It is well known that (11) has a unique solution
in the case of a square plant ($m = p$) under the
nonresonance condition between the zeros of $\mathcal{P}$
and the eigenvalues of $\mathcal{E}$, requiring that

$$\operatorname{rank} \begin{bmatrix} A - sI & B \\ C & 0 \end{bmatrix} = n + p, \quad \forall s \in \Lambda(S),$$

where $\Lambda(S)$ denotes the spectrum of $S$. In such a
case, (11) amounts to a system of $nq + pq$ linear
equations in $nq + mq$ unknowns (the elements
of $\Pi$, $\Gamma$), which might be expected to be satisfied
since $m \ge p$. If one were trying to use the unique
constant solution $(\Pi,\Gamma)$ of (11) as a solution
of (7), clearly (7a) and (7c) would be satisfied,
but then (7b) would impose other $nq$ equations
on $\Pi$ which would unlikely be satisfied. For
this reason, apparently the additional degree of
freedom offered by choosing time dependent $\Pi$
and $\Gamma$ might be of help. In fact, it can be shown
that if $m = p$ and under a hybrid nonresonance
condition (involving $Ee^{A\tau_M}$ and $Je^{S\tau_M}$)
between $\mathcal{P}$ and $\mathcal{E}$, (7a) and (7b) have a unique
solution for any choice of $\Gamma(\sigma)$, so that the design
boils down to satisfying (7c) by choosing $\Gamma(\sigma)$;
but is this always possible? In order to answer
this nontrivial question, a different path must be
followed. While a complete formal analysis can
be performed, the following discussion will be
mainly based on showing the simplest examples
exhibiting the pathologies of interest.

Consider the system with $\tau_M = 1$ (so that
$t_k = k$, for all $k \in \mathbb{Z}$) and

$$\dot w = 0, \tag{18a}$$

$$w^+ = -w, \tag{18b}$$

$$\begin{bmatrix} \dot x_1 \\ \dot x_2 \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & -2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u + \begin{bmatrix} 0 \\ 1 \end{bmatrix} w, \tag{18c}$$

$$\begin{bmatrix} x_1^+ \\ x_2^+ \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 2e & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \tag{18d}$$

$$e = \begin{bmatrix} 0 & 1 \end{bmatrix} x - w. \tag{18e}$$

The unique steady-state solution achieving output
regulation can be simply computed. In fact,
by (18a) and (18b),

$$w(t,k) = (-1)^k w(0,0);$$

then, by (18e) it appears that $e_{ss} = 0$, $\forall (t,k) \in \mathcal{T}$
implies

$$x_{2,ss}(t,k) = w(t,k) = (-1)^k w(0,0), \quad \forall (t,k) \in \mathcal{T},$$

which in turn implies that $\dot x_{2,ss} = 0$ for all $t \in (k,k+1)$,
$k \in \mathbb{Z}$ and the unique steady-state
input

$$u_{ss} = 2x_{2,ss} - w.$$

Since (18d) implies that $x_{1,ss}(t_{k+1},k+1) = x_{2,ss}(t_{k+1},k) = w(t_{k+1},k)$
and (18c) implies
that $x_{1,ss}(t,k) = e^{-(t-t_k)}x_{1,ss}(t_k,k)$, for $t \in (t_k,t_{k+1})$,
it follows that

$$x_{1,ss}(t,k) = -e^{-(t-t_k)}w(t_k,k), \quad t \in (t_k,t_{k+1}), \tag{19}$$

which finally is coherent with the jump equation
for $x_{2,ss}$ in (18d) since

$$x_{2,ss}(t_{k+1},k+1) = 2e\,x_{1,ss}(t_{k+1},k) + x_{2,ss}(t_{k+1},k) = 2e(-e^{-1})w(t_k,k) + w(t_k,k) \tag{20a}$$

$$= -w(t_k,k) \tag{20b}$$

$$= (-1)^{k+1}w(0,0). \tag{20c}$$
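
The cancellation in (20) can be checked numerically by following the steady state over one flow interval and one jump. The sketch below is an illustration only: the function names are hypothetical, the value of $w$ is arbitrary, and the optional perturbation of the coefficient $2e$ is an added experiment, not something taken from the entry. It relies only on the $x_1$ flow $\dot x_1 = -x_1$ and on the jump map as read from (18d).

```python
import math


def jump_map(x1, x2, coeff=2.0 * math.e):
    """Jump map as read from (18d): x1+ = x2, x2+ = coeff * x1 + x2."""
    return x2, coeff * x1 + x2


def steady_state_mismatch(w, coeff=2.0 * math.e, tau=1.0):
    """Follow the candidate steady state over one period, starting right after a jump
    with exosystem value w, and return x2 after the next jump minus w after that jump.
    Zero mismatch means zero output can be maintained across the jump."""
    x1 = -w                       # (19) at t = t_k: x1 equals minus the current w
    x2 = w                        # zero output during flow forces x2 = w
    x1_end = math.exp(-tau) * x1  # x1 flows according to dx1/dt = -x1
    _, x2_plus = jump_map(x1_end, x2, coeff)
    w_plus = -w                   # (18b)
    return x2_plus - w_plus


w = 1.0
nominal = steady_state_mismatch(w)
perturbed = steady_state_mismatch(w, coeff=2.0 * math.e * 1.01)
print("nominal mismatch:  ", nominal)
print("perturbed mismatch:", perturbed)
```

With the nominal coefficient the mismatch vanishes up to rounding, whereas a one percent change in that coefficient leaves a nonzero mismatch, so the zero-output steady state no longer closes up across the jump.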


Before commenting on the meaning of the above
derived steady-state evolution, it is worth noting
that (18) might actually derive from an original
system with (18c) replaced by

$$\begin{bmatrix} \dot x_1 \\ \dot x_2 \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u + \begin{bmatrix} 0 \\ 1 \end{bmatrix} w \tag{21}$$

under the preliminary state feedback

$$u = -x_1 + v. \tag{22}$$

Such a feedback renders the subspace
$\{x : x_2 = 0\}$ unobservable (when the system
only flows) and reveals that the dynamics of $x_1$
in (18c) is the flow zero dynamics of $\mathcal{P}$, that
is, the zero dynamics of $\mathcal{P}$ when jumps are
inhibited. Having set the stage, several interesting
observations can be made now.

The flow zero dynamics samples the exogenous
signal $w$ at jumps and then evolves
according to its own modes (see (19)). In fact,
while in the classical case (10) the state and input
at steady state can be expressed as a constant
matrix times the current value of $w$, the real
nature of the time dependence of $\Gamma$ and $\Pi$ in (6)
is linked to this phenomenon of sampling $w(t_k,k)$
and propagating along the zero dynamics. A
suitable analysis shows that $\Pi(\sigma)$, $\Gamma(\sigma)$ contain
products of matrices with rightmost factor $e^{-S\sigma}$
(which recovers $w(t_k,k) = e^{-S\sigma}w(t,k)$ from the
current value $w(t,k)$ of $w$) and leftmost factor
containing the fundamental matrix of the flow
zero dynamics. It is worth mentioning that the
“motion along the zeros” in the present context is
strongly related to the same kind of motions used
for perfect tracking in non-hybrid systems. The
above insight about the nature of the dependence
on $\sigma$ in (6) also reveals why in the output feedback
case (13) such dependence is not needed:
the required modes of the flow zero dynamics in
that case are provided by copying them in the
compensator dynamics!

An even stronger consequence of the analysis
above is a flow zero dynamics internal model
principle, which essentially states that any output
feedback compensator solving the output regulation
problem must be able to produce as free responses
(during flow) a suitable subset of the natural
modes of the flow zero dynamics (and a suitably
modified version applies to the feedforward
static compensator (8)). It is worth noting that
while the classical internal model principle requires
exact knowledge of the exosystem modes
(which is kind of a mild requirement, especially
when the exosystem models references, or constant
offsets), the flow zero dynamics internal
model principle requires the exact knowledge of
the modes of the zero dynamics, which typically
depends on not precisely known plant parameters;
clearly, this fact poses serious questions in
view of the achievement of robust regulation.

A final point, also raising serious issues about what can be robustly
achieved (and how) in the setting of hybrid output regulation, is the
fact that generically, existence of solutions is not robust to
arbitrarily small parameter variations. In particular, looking again at
the computations in (20), it should be clear that the involved functions
are all fixed by previous reasonings, whereas satisfaction of (20)
crucially depends on exact cancellations of certain coefficients. Any
small variation of such coefficients in (18d) implies that the problem
admits no steady state yielding zero output. This fact is in sharp
contrast with classical regulation, where the nonresonance condition
ensures existence of (different) solutions for small parameter
variations. It has to be noted, though, that under additional
conditions, robust existence of solutions is guaranteed if the plant is
fat, that is, m > p. Using again the previous example, this is the case
if an additional input is introduced


~X\ 


f-i o 1 

V 


"1 0" 

U\ 


"0" 

_* 2 _ 


- 1 

<N 

1 

O 

_ i 

_* 2 _ 

+ 

_° l . 

U2 

+ 

_1_ 


since then even a constant (suitably chosen) value of $u_1$ can be used
to ensure that when the time to jump arrives, the value of $x_1$ is such
as to ensure a correct jump for $x_2$ (remember that since $x_1$ is
unobservable during flows, its motion can be changed as wished if this
helps with ensuring that the observable $x_2$ achieves zero output).

Summary and Future Directions 

The investigation of the output regulation problem for hybrid systems is
still at a very early stage. While the stabilization of the manifold
where regulation is achieved seems to be a relatively better understood
topic (possibly drawing from a richer literature on stabilization of
hybrid systems), the geometry and design of such a manifold appear to
involve several much more intricate issues, whose understanding will be
crucial in order to achieve more complete solutions.

Already in the very simplified case of linear dynamics and synchronous
jumps, the important role played by the whole flow zero dynamics for
feasibility (existence of solutions in the nominal parameter values) and
by the availability of more inputs than outputs for well posedness
(existence of solutions for slightly perturbed parameter values) marks a
strong difference with the linear non-hybrid case, where both properties
are granted by satisfaction of the nonresonance condition, which only
involves the spectrum of the zero dynamics, even for square plants.

While this investigation should ultimately lead to the design of robust
output regulators based on a suitable internal model principle, a deeper
understanding of the structure of the steady-state motion achieving
regulation, as well as of the effect of additional inputs in shaping it,
seems to be an important preliminary step towards such a goal.

Cross-References 

► Hybrid Dynamical Systems, Feedback Control of

► Nonlinear Zero Dynamics 

► Regulation and Tracking of Nonlinear Systems 


Recommended Reading 

Foundational contributions on classical output regulation are Francis
and Wonham (1976), Davison (1976), and Wonham (1985); more recent
monographs include Huang (2004), Trentelman et al. (2001), Pavlov et al.
(2005), Saberi et al. (2000), and Byrnes et al. (1997). Goebel et al.
(2012) provides a solid introduction to a powerful and elegant framework
for hybrid systems, including a thorough discussion of stability issues
related to those mentioned here. Regulation problems (mainly reference
tracking) for classes of hybrid systems with asynchronous jumps are
presented in Biemond et al. (2013), Forni et al. (2013a,b), Morarescu
and Brogliato (2010), and Galeani et al. (2008, 2012); synchronous jumps
(and the ensuing advantages) are considered in, e.g., Sanfelice et al.
(2013). The class of linear systems with synchronous jumps considered in
sections “Hybrid Steady State Generation” and “Key Features in Hybrid vs
Classical Output Regulation” has been proposed in Marconi and Teel
(2010, 2013) and studied in Cox et al. (2011, 2012); the issues related
to flow zero dynamics, fat plants, and robustness have been discussed in
Carnevale et al. (2012a,b, 2013), partly developing remarks contained in
Galeani et al. (2008, 2012).
Abstract

Parallel robots are closed chains consisting of a fixed and moving
platform that are connected by a set of serial chain legs. Parallel
robots typically possess both actuated and passive joints and may even
be redundantly actuated. Although more structurally complex and
possessing a smaller workspace, parallel robots are usually designed to
exploit one or more of the natural advantages they possess over their
serial counterparts, e.g., higher stiffness, increased positioning
accuracy, and higher speeds and accelerations. In this chapter we
provide an overview of the kinematic and dynamic modeling of parallel
robots, a description of their singularity behavior, and basic methods
developed for their control.


Keywords 

Closed kinematic chain; Closed loop mechanism; 
Parallel manipulator 


Introduction 

A parallel robot refers to a kinematic chain in which a fixed platform
and moving platform are connected to each other by several serial
chains, or legs. The legs, which typically have the same kinematic
structure, are connected to the fixed and moving platforms at points
that are distributed in a geometrically symmetric fashion. The
Stewart-Gough platform (Fig. 1) is a well-known example of a parallel
robot: each of the six legs is a UPS structure (i.e., consisting of
rigid links serially connected by a universal, prismatic, and spherical
joint), with the prismatic joint actuated. Other examples of parallel
robots include the 6 × RUS platform of Fig. 2, the haptic interface
device of Fig. 3, and the eclipse mechanism of Fig. 4.

Parallel robots can be regarded as a special class of closed chain
mechanisms (i.e., chains that contain one or more closed loops) and are
purposely designed to exploit the specific advantages afforded by the
closed chain structure, e.g., improved stiffness, greater positioning
accuracy, or higher speed. Parallel robots should be distinguished from
two or more cooperating serial robots that may form closed loops during
execution of a task (e.g., a robotic hand grasping an object). Some of
the highest velocities and accelerations recorded by industrial robots
have been achieved by parallel robots, primarily by placing the
actuators on the fixed platform and thereby minimizing the mass of the
moving parts.

Many of the model-based techniques developed for the control of
traditional serial chain robots are also applicable to a large class of
parallel robots. On the other hand, kinematic and dynamic models for
parallel robots are inherently more complex. Parallel robots also
possess features not found in serial robots, e.g., passive joints, the
possibility of redundant actuation, and a diverse range of singularity
behavior, that need to be considered when designing a control law. We
therefore begin with a brief overview of the kinematic and dynamic
modeling of parallel robots before discussing their control.

Modeling 

Kinematics 

Whereas the kinematic degrees of freedom, or mobility, of a serial chain
robot can be obtained as the sum of the degrees of freedom of each of
the joints, the situation is somewhat more complex for parallel robots
and closed chains in general, since only a subset of the joints can be
independently actuated. The mobility of a parallel robot corresponds to
the total degrees of freedom of the joints that can be independently
actuated. In some cases the number of actuated joint degrees of freedom
may exceed the kinematic degrees of freedom, in which case we say that
the robot is redundantly actuated.
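
The mobility count described above can be illustrated with the
Grübler-Kutzbach criterion. This criterion is not stated in this article
and is brought in here as an assumption; it is known to miscount for
overconstrained mechanisms. The sketch below applies it to the
Stewart-Gough platform, with the function name and the link bookkeeping
chosen purely for illustration.

```python
def spatial_mobility(num_links, joint_dofs):
    """Gruebler-Kutzbach count for a spatial mechanism (an assumed formula).

    num_links counts every rigid body, including the fixed base.
    joint_dofs lists the degrees of freedom of each joint.
    """
    num_joints = len(joint_dofs)
    return 6 * (num_links - 1 - num_joints) + sum(joint_dofs)


# Stewart-Gough platform: base, moving platform, and two links per UPS leg.
NUM_LEGS = 6
UPS_LEG = [2, 1, 3]  # universal, prismatic, spherical joint freedoms
stewart_links = 2 + 2 * NUM_LEGS
stewart_joints = UPS_LEG * NUM_LEGS

stewart_mobility = spatial_mobility(stewart_links, stewart_joints)
print(stewart_mobility)
```

The same function reproduces the serial-chain rule: a chain of seven
links joined by six one-degree-of-freedom joints has mobility equal to
the sum of its joint freedoms.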

A parallel robot with a designated end-effector frame also has a notion
of forward and inverse kinematics. While for serial chains the forward
kinematics is a well-defined mapping and the inverse kinematics can
typically have multiple solutions, for parallel robots the situation is
less straightforward. For the Stewart-Gough platform of Fig. 1, in which
the leg lengths can be adjusted by actuating the prismatic joints, the
inverse kinematics is unique and straightforward to obtain, whereas the
forward kinematics will have multiple solutions. For other types of
parallel robots in which the legs themselves contain one or more closed
loops, both the forward and inverse kinematics can have multiple
solutions.
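
The uniqueness of the Stewart-Gough inverse kinematics can be seen from
a short sketch: once the platform pose is given, each leg length is just
the distance between its two attachment points. The anchor layout, the
symbols `p`, `R`, `a`, and `b`, and the helper names below are
assumptions made for illustration and do not come from the article.

```python
import math


def rot_z(angle):
    """Rotation matrix about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def mat_vec(R, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(R[r][c] * v[c] for c in range(3)) for r in range(3)]


def leg_lengths(p, R, base_pts, plat_pts):
    """Leg i has length ||p + R b_i - a_i|| for base anchor a_i and
    platform anchor b_i (expressed in the platform frame)."""
    lengths = []
    for a, b in zip(base_pts, plat_pts):
        Rb = mat_vec(R, b)
        d = [p[j] + Rb[j] - a[j] for j in range(3)]
        lengths.append(math.sqrt(sum(x * x for x in d)))
    return lengths


# Hypothetical geometry: anchors on unit circles, spaced 60 degrees apart.
anchors = [(math.cos(i * math.pi / 3), math.sin(i * math.pi / 3), 0.0)
           for i in range(6)]
height = (0.0, 0.0, 2.0)

lengths_level = leg_lengths(height, rot_z(0.0), anchors, anchors)
lengths_twisted = leg_lengths(height, rot_z(0.3), anchors, anchors)
```

Each pose yields exactly one set of leg lengths. The reverse problem,
recovering the pose from six lengths, requires solving coupled nonlinear
equations and is where the multiple solutions arise.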

The notion of kinematic singularities for parallel robots is also much
more involved than the case for serial robots. Whereas kinematic
singularities for serial chain robots are characterized by
configurations at which the forward kinematics Jacobian (i.e., the
linear mapping relating joint velocities to end-effector frame
velocities) becomes singular, for parallel robots and closed chains in
general, there exist other notions of singularities not found in serial
chains. For example, given a parallel robot with kinematic mobility m
(if the parallel robot consists only of one degree-of-freedom joints,
this implies that exactly m joints can be actuated), there may exist
configurations in which these m joints cannot be independently actuated.
Conversely, even if the m actuated joints are each fixed to some value,
the parallel robot may fail to be a structure, i.e., some of the links
may be able to move.

In the above scenario, choosing a different set of m actuated joints may
remedy this situation, in which case such singularities are referred to
as actuator singularities. Configurations at which singularity behavior
occurs regardless of which joints are actuated are denoted configuration
singularities. The final class of singularities are end-effector
singularities, which correspond to the usual serial chain notion of
kinematic singularity, in which the end-effector loses one or more
degrees of freedom of available motion.
Parallel Robots, Fig. 2 6 × RUS platform



Parallel Robots, Fig. 3 A 3 x PUZJ haptic interface 

Dynamics 

In the case of a parallel robot whose actuated degrees of freedom
coincide with its kinematic mobility m, it is possible to choose an
independent set of generalized coordinates of dimension m, denoted
$q \in \mathbb{R}^m$ and typically identified with the actuated joints,
and to express the dynamics in the standard form

$$M(q)\ddot{q} + C(q,\dot{q})\dot{q} + G(q) = \tau, \tag{1}$$

where $\tau \in \mathbb{R}^m$ denotes the vector of input joint torques,
$M(q)$ denotes the $m \times m$ mass matrix, the matrix-vector product
$C(q,\dot{q})\dot{q}$ denotes the vector of Coriolis terms, and
$G(q) \in \mathbb{R}^m$ denotes the vector of gravitational forces. The
structure of the dynamic equations is identical to that for serial chain
robots. Also like the case for serial chain robots, the Coriolis matrix
term $C(q,\dot{q}) \in \mathbb{R}^{m \times m}$ is not unique, so that
one should ensure that the correct $C(q,\dot{q})$ is used in, e.g., any
control law whose stability depends on the matrix
$\dot{M}(q) - 2C(q,\dot{q})$ being skew-symmetric.

Parallel Robots, Fig. 4 The 3 × PPRS eclipse parallel mechanism
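
The remark on the non-uniqueness of the Coriolis matrix can be checked
numerically. The sketch below uses a hypothetical two-coordinate mass
matrix (the constants `A`, `B`, `D` are invented for illustration) and
compares the Christoffel-symbol choice of $C$ with a second matrix that
produces the same Coriolis vector but breaks the skew-symmetry of
$\dot{M} - 2C$.

```python
import math

A, B, D = 3.0, 1.0, 1.5  # hypothetical inertia constants


def mass_matrix_dot(q, qd):
    """Time derivative of M(q) = [[A + 2B cos q2, D + B cos q2],
                                  [D + B cos q2, D]]."""
    s2 = math.sin(q[1])
    return [[-2 * B * s2 * qd[1], -B * s2 * qd[1]],
            [-B * s2 * qd[1], 0.0]]


def coriolis_christoffel(q, qd):
    """Coriolis matrix built from the Christoffel symbols of M(q)."""
    s2 = math.sin(q[1])
    return [[-B * s2 * qd[1], -B * s2 * (qd[0] + qd[1])],
            [B * s2 * qd[0], 0.0]]


def coriolis_alternative(q, qd, k=0.7):
    """Adds a matrix N with N qd = 0, so C qd is unchanged."""
    C = coriolis_christoffel(q, qd)
    return [[C[0][0] + k * qd[1], C[0][1] - k * qd[0]], C[1]]


def coriolis_vector(C, qd):
    """The product C(q, qd) qd that enters the dynamics."""
    return [C[r][0] * qd[0] + C[r][1] * qd[1] for r in range(2)]


def skew_defect(q, qd, coriolis):
    """Largest entry of S + S^T for S = Mdot - 2C (zero iff S is skew)."""
    Md, C = mass_matrix_dot(q, qd), coriolis(q, qd)
    S = [[Md[r][c] - 2 * C[r][c] for c in range(2)] for r in range(2)]
    return max(abs(S[r][c] + S[c][r]) for r in range(2) for c in range(2))


q, qd = (0.3, 1.2), (0.4, -1.1)
defect_christoffel = skew_defect(q, qd, coriolis_christoffel)
defect_alternative = skew_defect(q, qd, coriolis_alternative)
```

Both matrices give the same equations of motion, yet only the first one
supports a stability argument that relies on skew-symmetry.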
It is also important to keep in mind that the $q$ must satisfy the
kinematic constraint equations imposed by the loop closure constraints.
That is, if $\theta \in \mathbb{R}^n$ denotes the vector of all joints
(both actuated and passive), then $q \in \mathbb{R}^m$, $m < n$, will be
a subset of $\theta$ whose values can only be obtained by solution of
the kinematic constraint equations; depending on the nature of the
kinematic constraints, one may have to resort to iterative numerical
methods.
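
As a minimal sketch of such an iterative method, the code below applies
Newton's method to the loop closure equations of a planar four-bar
linkage, the simplest closed chain. The linkage, its link lengths, and
all function names are assumptions chosen for illustration; a spatial
parallel robot would lead to a larger system of the same kind.

```python
import math


def loop_residual(theta1, theta2, theta3, a, b, c, d):
    """Planar four-bar loop closure: crank a, coupler b, rocker c, ground d."""
    rx = a * math.cos(theta1) + b * math.cos(theta2) - c * math.cos(theta3) - d
    ry = a * math.sin(theta1) + b * math.sin(theta2) - c * math.sin(theta3)
    return rx, ry


def solve_passive_joints(theta1, guess, a=1.0, b=3.0, c=2.5, d=3.0,
                         tol=1e-10, max_iter=50):
    """Newton iteration for the passive angles (theta2, theta3)."""
    theta2, theta3 = guess
    for _ in range(max_iter):
        rx, ry = loop_residual(theta1, theta2, theta3, a, b, c, d)
        if max(abs(rx), abs(ry)) < tol:
            return theta2, theta3
        # Jacobian of the residual with respect to (theta2, theta3).
        j11, j12 = -b * math.sin(theta2), c * math.sin(theta3)
        j21, j22 = b * math.cos(theta2), -c * math.cos(theta3)
        det = j11 * j22 - j12 * j21
        if abs(det) < 1e-12:
            raise ValueError("constraint Jacobian is singular here")
        # Newton step: solve J * delta = -residual for the 2x2 system.
        theta2 += (-j22 * rx + j12 * ry) / det
        theta3 += (j21 * rx - j11 * ry) / det
    raise RuntimeError("Newton iteration did not converge")


theta2, theta3 = solve_passive_joints(theta1=1.0, guess=(0.5, 1.5))
```

The result depends on the initial guess: a different guess can converge
to the other assembly mode of the linkage, and the iteration fails where
the constraint Jacobian loses rank.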

If the parallel robot is redundantly actuated, then the dynamics are
subject to a further set of constraints on the input torques. Letting
$q_e$ denote the set of independent generalized coordinates and $q_a$ be
the vector of all actuated joints, the vector of actuated joint torques
$\tau_a$ must then further satisfy $S^T \tau_a = W^T \tau$, where $\tau$
denotes the vector of joint torques for an equivalent tree structure
system that moves identically to the redundantly actuated parallel robot
and W and S are defined, respectively, by

$$W = \frac{\partial q}{\partial q_e}, \qquad
  S = \frac{\partial q_a}{\partial q_e}. \tag{2}$$


Compared to the dynamics for serial chain robots, the dynamics for
parallel robots is, in general, considerably more complex and
computationally involved. The recursive algorithms that are available
for computing the inverse and forward dynamics of serial chain robots
can also be used to develop similar recursive algorithms for parallel
robot dynamics; however, the computations will be considerably more
involved and require multiple iterations.


Motion Control 

Exactly Actuated Parallel Robots 

For parallel robots whose actuated degrees of freedom match the
kinematic mobility (this excludes the set of all redundantly actuated
parallel robots), most control laws developed for serial chain robots
are also applicable. This is not altogether surprising in light of the
similarity in the structure of the kinematic and dynamic equations
between serial and parallel robots. Control laws for serial robots are
also covered in this handbook, and we refer the reader to ► Linear
Matrix Inequality Techniques in Optimal Control for the essential
details. Here we summarize the most basic control laws and point out any
additional computational or other requirements that are needed when
applying these laws to parallel robots. Note that other control laws and
techniques developed for serial chain robots, e.g., robust and sliding
mode control, can also be applied with the same additional
considerations and requirements outlined below:

1. Computed torque control: Computed torque control for parallel robots
has the same control law structure as for serial robots, i.e.,

$$\tau = M(q)\left(-K_p e - K_v \dot{e}\right) + \tau_{ff}, \tag{3}$$

where $e$ denotes the tracking error, $K_p$ and $K_v$ are the
proportional and derivative feedback gain matrices, and $\tau_{ff}$
denotes the feedforward term required to cancel the nonlinear dynamics.
Robust versions of computed torque control are also applicable to
parallel robots under the same set of conditions, e.g., establishing
appropriate bounds on the mass matrix eigenvalues and on the norm of the
Coriolis matrix.

2. Augmented PD control: The augmented PD control law for serial robots
is also applicable to parallel robots, i.e.,

$$\tau = -K_p e - K_v \dot{e} + M(q)\ddot{q}_d + C(q,\dot{q})\dot{q}
  + G(q), \tag{4}$$

where $q_d$ is the reference trajectory to be tracked and $K_p$ and
$K_v$ are the proportional and derivative feedback gains. Asymptotic
stability is also established under the same conditions.

3. Adaptive control: Because the dynamic equations for parallel robots
are also linear in the link mass and inertial parameters, i.e.,

$$M(q)\ddot{q} + C(q,\dot{q})\dot{q} + G(q)
  = \Phi(q,\dot{q},\ddot{q})\,p, \tag{5}$$

where $p$ denotes the vector of link mass and inertial parameters,
adaptive control laws developed for serial robots can also be used.

4. Other control methods: There exist 
numerous control methods developed for 
serial robots, e.g., task space or operational 
space control, sliding mode control, and 
various nonlinear control techniques; with 
few exceptions most of these algorithms can 
also be applied to exactly actuated parallel 
robots with minimal modification. 
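
The structure of the augmented PD law (4) can be tried out on the
smallest possible case. The sketch below simulates a single actuated
coordinate with constant inertia, so the Coriolis term vanishes; the
inertia, gravity coefficient, gains, reference trajectory, and
integration step are all hypothetical values chosen for illustration,
and a perfect model is assumed.

```python
import math

INERTIA = 0.5        # hypothetical constant mass "matrix" M
GRAVITY_COEFF = 2.0  # hypothetical coefficient in G(q) = c * sin(q)
KP, KV = 50.0, 10.0  # hypothetical feedback gains


def gravity(q):
    """Gravitational torque G(q) of the assumed single-joint model."""
    return GRAVITY_COEFF * math.sin(q)


def augmented_pd(q, qd, q_ref, qd_ref, qdd_ref):
    """Law (4) with C = 0: tau = -Kp e - Kv e_dot + M qdd_ref + G(q)."""
    e, e_dot = q - q_ref, qd - qd_ref
    return -KP * e - KV * e_dot + INERTIA * qdd_ref + gravity(q)


def simulate(t_end=5.0, dt=1e-3, q0=0.5):
    """Track q_ref(t) = sin(t) and return the tracking error history."""
    q, qd = q0, 0.0
    errors = []
    for k in range(int(round(t_end / dt))):
        t = k * dt
        q_ref, qd_ref, qdd_ref = math.sin(t), math.cos(t), -math.sin(t)
        errors.append(q - q_ref)
        tau = augmented_pd(q, qd, q_ref, qd_ref, qdd_ref)
        qdd = (tau - gravity(q)) / INERTIA  # plant: M qdd + G(q) = tau
        qd += dt * qdd                      # semi-implicit Euler step
        q += dt * qd
    return errors


errors = simulate()
```

With an exact model the error obeys $M\ddot{e} + K_v\dot{e} + K_p e = 0$,
so it decays at a rate set by the gains. For a parallel robot the same
law additionally requires the loop closure equations to be solved at
every step to evaluate $M$, $C$, and $G$.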


Redundantly Actuated Parallel Robots 

As described earlier, parallel robots exhibit a much more diverse range
of singularity behavior than their serial counterparts, much of which
depends on the choice of actuated joints (actuator singularities). One
way to eliminate actuator singularities is via redundant actuation,
i.e., the total degrees of freedom of the actuated joints exceed the
kinematic mobility of the mechanism. Redundant actuation offers some
protection in the event of failed actuators and, when combined with an
appropriate control law, offers an effective means of reducing joint
backlash, increasing speed, payload, and stiffness, controlling
compliance through the generation of internal forces, and even improving
power efficiency (as an analogy, the human musculoskeletal system is
redundantly actuated by antagonistic muscles). Of course, redundant
actuation introduces a new set of control challenges, since the control
inputs must be designed so as not to conflict with the kinematic
constraints inherent in the parallel robot; loosely speaking, the
actuated joints can no longer be independently controlled, since the
consequences of unintended antagonistic actuation may be catastrophic.

The control of cooperating manipulators (see ► Optimal Control and
Mechanics) has a long history in robotics, and many of the control
techniques developed for such multi-arm systems can also be applied to
redundantly actuated parallel robots. One can also apply the control
strategies developed for exactly actuated parallel robots to the
redundantly actuated case, but modifications are necessary to account
for the different structure of the dynamic equations.

Like all model-based control algorithms, the above control laws are
subject to model uncertainties. Whereas in serial chains the effects of
model uncertainty simply lead to errors in tracking, for redundantly
actuated parallel robots, model uncertainty can produce internal forces
in addition to end-effector tracking errors. Perhaps the most
significant effect of any modeling errors is that, unlike the serial
chain case, the kinematic errors can potentially alter the shape of the
configuration space (recall that the configuration space will in general
be a curved space for closed chains) and also interfere with any PD
feedback introduced into the control. The development of control laws
that are robust to such modeling errors and disturbances remains an open
and ongoing area of research in parallel robot control.


Force Control 

Both hybrid force-position control and impedance control are well-known
and widely applied concepts in serial robots and can be extended in a
straightforward manner to exactly actuated parallel robots. Recall that
the basic feature of hybrid force-position control is that the task
space is decomposed into force- and position-controlled directions,
whereas in impedance control, the goal is to have the robot maintain a
certain desired spatial stiffness in the task space. Controllers that
combine aspects of force-position and impedance control have also been
proposed and developed for both serial robots and exactly actuated
parallel robots. Modeling errors will cause deviations in both the
force- and position-controlled directions (leading to motions in
force-controlled directions and forces in position-controlled
directions), which can be addressed by, e.g., a switching control
strategy.

The problem of force control for redundantly actuated parallel robots,
which encompasses both force-position and impedance control, has also
received some attention in the literature. The main difference with the
exactly actuated case is that internal forces can now be generated,
which requires a more detailed and coordinate-invariant examination of
stiffness. Control methods that combine elements of force-position and
impedance control for redundantly actuated parallel robots have received
only limited attention in the literature.


Cross-References 

► Linear Matrix Inequality Techniques in Optimal Control

► Optimal Control and Mechanics 

► Optimal Control via Factorization and Model 
Matching 

► Optimal Sampled-Data Control 


Recommended Reading 

The monograph (Merlet 2006) offers a detailed and comprehensive
treatment of all aspects of parallel robots, with a particularly
thorough treatment of the kinematics and singularity analysis. Mueller
(2008) provides an excellent survey of the dynamics and control of
redundantly actuated parallel robots and is based on the preceding work
(Mueller 2005). Cheng et al. (2003) examines in detail the dynamic model
for redundantly actuated parallel robots and the basic control
strategies; Nakamura and Ghodoussi (1989) also examines dynamic models
for redundantly actuated parallel robots. Stiffness analysis and control
of redundantly actuated parallel robots are addressed in Yi and Freeman
(1993), Chakarov (2004), and Fasse and Gosselin (1998). Analysis of
specific parallel robots engaged in various control tasks includes
Caccavale et al. (2003), Honegger et al. (1997), Kim et al. (2001), and
Satya et al. (1995). The basic references on robot control are Murray et
al. (1994), Spong et al. (2006), and Anderson and Spong (1988); Ghorbel
(1995) focuses on PD control for closed chains.
Abstract

The particle filter computes a numeric approximation of the posterior
distribution of the state trajectory in nonlinear filtering problems.
This is done by generating random state trajectories and assigning a
weight to them according to how well they predict the observations. The
weights are instrumental in a resampling step, where trajectories are
either kept or thrown away. This exposition will focus on explaining the
main principles and the main theory in an intuitive way, illustrated
with figures from a simple scalar example. A real-time application is
used to graphically show how the particle filter solves a nontrivial
nonlinear filtering problem.


Keywords 

Estimation; Kalman filter; Nonlinear filtering; 
Sequential Monte Carlo 


Introduction 

The particle filter computes an arbitrarily good solution to nonlinear
filtering problems. The goal in nonlinear filtering is to compute the
posterior distribution of the state vector in a dynamic model, given
measurements that are related to the state. Bayes rule provides a
recursive but computationally intractable solution. Monte Carlo (MC)
methods can essentially solve all Bayesian inference problems. However,
for nonlinear filtering, the complexity increases exponentially in time.
The MC approach would be to generate a large number of state
trajectories (called particles) and their corresponding sequences of
predicted measurements and then weigh together the trajectories
according to how well the predicted and actual measurement sequences
match each other.

With increasing time, the fit is doomed to be poor, since the state
space increases exponentially in time. This is usually referred to as
the depletion (or degeneracy) problem. The approach in the particle
filter is to simulate only one step at a time and then resample the
trajectories if needed. For this reason, the particle filter is
sometimes referred to as a sequential Monte Carlo method. The resampling
step keeps the trajectories that give a good fit, while the bad ones are
discarded. The novel idea in the particle filter when it was first
published in 1993 was the introduction of this resampling step.

Depletion is still a problem, despite the resampling step. Mitigating
depletion has been the most pressing issue in applied particle filtering
from the very beginning. This tutorial will present the basic particle
filter algorithm and discuss ways to avoid depletion problems both in
general terms and in a simple example.

The particle filter computes an approximation to the Bayes optimal
filter, conditioned on a sequence of observations and a nonlinear
non-Gaussian system. It is important to note that the PF approximates
the posterior distribution of the state trajectory, from which the mean
and covariance are easily extracted. In contrast, the extended Kalman
filter (EKF) computes the mean and covariance for an approximate
dynamical system (linearized with Gaussian noise). The unscented Kalman
filter (UKF) likewise approximates the mean and covariance. Both EKF and
UKF can only approximate unimodal (one peak) posterior distributions.
There are filter bank approximations, like the interacting multiple
model (IMM) algorithm, that can keep track of a given number of modes in
the posterior. However, the PF does this in a more natural way.

The Basic Particle Filter 

Nonlinear filtering aims at estimating the distribution of a state
sequence $x_{1:N} = (x_1, x_2, \dots, x_N)$ from a sequence of
observations $y_{1:N} = (y_1, y_2, \dots, y_N)$, given a state space
model of the form

$$x_{k+1} = f(x_k, v_k) \quad \text{or} \quad p(x_{k+1} \mid x_k), \tag{1a}$$

$$y_k = h(x_k, e_k) \quad \text{or} \quad p(y_k \mid x_k). \tag{1b}$$

Here, $v_k$ denotes process noise, and $e_k$ is the measurement noise.
The stochastic variables $v_k$, $e_k$ for all $k$ and $x_0$ are assumed
mutually independent, with known distributions $p_v$, $p_e$, and
$p_{x_0}$, which are all part of the model specification.

The particle filter (PF) works with a set of random trajectories. Each
trajectory is formed recursively by iteratively simulating the model
with some randomness and then updating the likelihood of each trajectory
based on the observation. In words, we first evaluate the set of
particles at hand by comparing how well they predict the current
observation. In this way, the particles are assigned a weight. We keep
the particles with large weight and throw away the particles with small
weight, using a stochastic resampling procedure. After this step, we get
a set with fewer distinct particles, in which many particles have
several replicas. We then simulate each particle to the next observation
time using the dynamical model. After this prediction step, all
particles will be unique (because they are based on different
realizations of the process noise). Below, the basic algorithm
(sometimes called bootstrap PF or sequential importance resampling (SIR)
PF) is summarized.

• Define a set of random states (particles) by sampling
  $x_0^{(i)} \sim p_{x_0}(x_0)$, $i = 1, \dots, N$.

• Iterate in k = 0, 1, ...:

  1 Measurement update: Compute the weight
    $w_k^{(i)} = p_e\bigl(y_k - h(x_k^{(i)})\bigr)$ and normalize so
    $\sum_{i=1}^{N} w_k^{(i)} = 1$.

  2 Resampling: Resample each particle with probability $w_k^{(i)}$.

  3 Time update: Simulate one time step by taking
    $v_k^{(i)} \sim p_v(v_k)$ and then set
    $x_{k+1}^{(i)} = f(x_k^{(i)}, v_k^{(i)})$.

The main design parameter here is the number 
N of particles. A common trick to make the 
filter more robust is to increase the variance of 
p v and p e above. This is called dithering (or 
jittering) and is a practical way to get more robust 
nonlinear filters. 
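
The resampling step of the basic algorithm can be sketched in a few
lines. The version below is multinomial resampling, which draws N
particles with replacement with probabilities given by the weights; this
is one reading of step 2 and an assumption, since other schemes (e.g.,
systematic or stratified resampling) are also common. The function name
`resample` is illustrative.

```python
import random


def resample(particles, weights, rng=random):
    """Multinomial resampling: draw len(particles) particles with
    replacement, each chosen with probability proportional to its weight."""
    return rng.choices(particles, weights=weights, k=len(particles))


# Usage with a seeded generator so the draw is repeatable.
rng = random.Random(0)
particles = [1.0, 2.0, 3.0, 4.0]
weights = [0.0, 0.0, 1.0, 0.0]
survivors = resample(particles, weights, rng)
```

After resampling, every surviving particle carries the same weight 1/N,
and the following time update makes the replicas distinct again.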

To illustrate some of the aspects and for 
later reference, a simple example will be 
introduced. 


Example: First-Order Linear Gaussian 
Model 

The Kalman filter (KF) provides the posterior distribution in an
analytical form for linear Gaussian models and is thus suitable for
evaluations and comparisons. A linear Gaussian model looks like

$$x_{k+1} = F x_k + v_k, \quad v_k \sim \mathcal{N}(0, Q), \tag{2a}$$

$$y_k = H x_k + e_k, \quad e_k \sim \mathcal{N}(0, R), \tag{2b}$$

$$x_0 \sim \mathcal{N}(\mu_0, P_0). \tag{2c}$$

We will use the scalar case for the illustrations, and the figures that
follow are based on $F = 0.9$, $H = 1$, $Q = 1$, $R = 0.01$, $P_0 = 1$.
The particle filter in the scalar case simplifies to the Matlab
algorithm in Table 1. Figure 1 compares the sample-based representation
of the PF with the Gaussian distribution provided by the KF for the
first two time steps. This shows how well the marginal distribution
$p(x_k \mid y_{1:k})$ is approximated by the samples $x_k^{(i)}$ from
the PF. A rule of thumb is that 30 samples are needed to approximate a
univariate Gaussian distribution. As will be discussed later, the number
of samples is effectively only 10 here, which explains the small
deviation between the Gaussian functions.

Particle Filters, Table 1 Matlab code for scalar linear Gaussian model

% Simulation of K measurements from model (2)
y=filter([0 H],[1 -F],[sqrt(P0)*randn(1,1);...
         sqrt(Q)*randn(K-1,1)])+sqrt(R)*randn(K,1);
% Particle filter with N particles
x=mu0+sqrt(P0)*randn(N,1);
for k=1:K
    w=exp(-0.5*(y(k)-H*x).^2/R);  % Measurement update (Gaussian likelihood)
    w=w/sum(w);                   % Normalization
    xhat(k)=w'*x;                 % Estimate
    P(k)=w'*(x-xhat(k)).^2;       % Variance
    x=resample(x,w);              % Resampling (user-supplied routine)
    x=F*x+sqrt(Q)*randn(N,1);     % Time update
end

Particle Filters, Fig. 1 Set of samples $\{x_k^{(i)}\}_{i=1:N}$
compared first to a Gaussian approximation of the particles (blue) and
second to the true posterior distribution provided by the Kalman filter
(green)

To illustrate the fundamental depletion problem in the PF, the set of
trajectories $x_{1:k}^{(i)}$ that approximates the posterior (smoothing)
distribution $p(x_{1:k} \mid y_{1:k})$ is illustrated in Fig. 2. The
upper plot shows a case where the trajectories are all the same
initially. The behavior in the upper plot is typical for the basic
particle filter in cases where the measurements are more informative
than the state transition model (small measurement noise, $R < Q$). The
lower plot shows a particle filter that is working better, and the
modification is explained in the following section.


Proposal Distributions 

The time update in the basic PF predicts particles in step 3 according
to the dynamic model. The most general derivation of the particle filter
allows for sampling from a more general proposal (also called
importance) distribution. This proposal distribution can be any function
that can be sampled from, and it can depend on both the previous state
and the current measurement. From a filtering perspective, it may appear
as “cheating” to look at the next measurement when doing the time
update, but one has to look at a full cycle of the iteration scheme.

Particle Filters, Fig. 2 Set of trajectories
$\{x_{1:10}^{(i)} - x_{1:10}\}_{i=1:N}$ for two different proposal
distributions, one bad one (prior) leading to particle depletion and one
good one (likelihood). Note that the smoothing distribution
$p(x_k \mid y_{1:10})$ can be approximated with the set of particles
$\{x_k^{(i)}\}_{i=1:N}$ using the marginalization principle

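If a proposal distribution of the functional form
$q(x_k \mid x_{k-1}, y_k)$ is used, then steps 1 and 3 have to be
modified as follows:

1 Weight update: Time and measurement updates:

$$w_{k|k-1}^{(i)} \propto w_{k-1|k-1}^{(i)}\,
  \frac{p\bigl(x_k^{(i)} \mid x_{k-1}^{(i)}\bigr)}
       {q\bigl(x_k^{(i)} \mid x_{k-1}^{(i)}, y_k\bigr)}, \tag{3a}$$

$$w_{k|k}^{(i)} \propto w_{k|k-1}^{(i)}\,
  p\bigl(y_k \mid x_k^{(i)}\bigr). \tag{3b}$$

3 Prediction: Generate samples from the proposal

$$x_{k+1}^{(i)} \sim q\bigl(x_{k+1} \mid x_k^{(i)}, y_{k+1}\bigr). \tag{3c}$$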

The most natural proposal distributions are the 
following: 


• The prior $q(x_{k+1} \mid x_k^{(i)}, y_{k+1}) = p(x_{k+1} \mid
  x_k^{(i)})$, as used in the basic PF.

• The likelihood $q(x_{k+1} \mid x_k^{(i)}, y_{k+1}) \propto
  p(y_{k+1} \mid x_{k+1})$. For the model (2), the proposal becomes
  $\mathcal{N}(y_{k+1}/H, R/H^2)$.

• The optimal (minimizing weight variance) choice
  $q(x_{k+1} \mid x_k^{(i)}, y_{k+1}) \propto p(y_{k+1} \mid x_{k+1})\,
  p(x_{k+1} \mid x_k^{(i)})$. For the model (2), the optimal proposal is
  provided by one cycle of the Kalman filter, initialized with the
  particle $x_k^{(i)}$.

The optimal proposal keeps the weights constant, and this would in
theory avoid depletion, where depletion is interpreted as excessive
weight variance. Figure 2 compares the set of trajectories for the prior
and likelihood proposals, respectively. Evidently, the likelihood
proposal is preferable here, since it suffers less from depletion in the
particle history. The practical limitation with the last two
alternatives is that one has to be able to sample from the likelihood,
so in practice there need to be at least as many measurements as states
in the model.
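
For the scalar model (2), the optimal proposal can be written out
explicitly. The sketch below assumes that "one cycle of the Kalman
filter, initialized with the particle" means starting from the particle
with zero covariance, so that the time update gives
$\mathcal{N}(F x_k^{(i)}, Q)$ and the measurement update then conditions
on $y_{k+1}$; the function names are illustrative.

```python
import math
import random


def optimal_proposal(x_prev, y_next, F, H, Q, R):
    """Mean and variance of p(x_{k+1} | x_k, y_{k+1}) for the scalar
    linear Gaussian model, via one Kalman filter cycle."""
    pred = F * x_prev                  # time update from a point mass
    innov_var = H * H * Q + R          # innovation variance
    gain = Q * H / innov_var           # Kalman gain
    mean = pred + gain * (y_next - H * pred)
    var = (1.0 - gain * H) * Q
    return mean, var


def sample_optimal(x_prev, y_next, F, H, Q, R, rng=random):
    """Draw one particle from the optimal proposal."""
    mean, var = optimal_proposal(x_prev, y_next, F, H, Q, R)
    return rng.gauss(mean, math.sqrt(var))


# Parameters of the scalar example: F = 0.9, H = 1, Q = 1, R = 0.01.
mean, var = optimal_proposal(x_prev=1.0, y_next=2.0,
                             F=0.9, H=1.0, Q=1.0, R=0.01)
draw = sample_optimal(1.0, 2.0, 0.9, 1.0, 1.0, 0.01, random.Random(0))
```

With $R \ll Q$ the proposal mean sits close to $y_{k+1}/H$ and its
variance is close to $R/H^2$, which is consistent with the likelihood
proposal working well in this example.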
Particle Filters, Fig. 3 The effective number of particles
$N_{\text{eff}}(k)/N$ for prior proposal (blue) and likelihood proposal
(green) (N = 1,000)


Adaptive Resampling 

Resampling is crucial to avoid depletion. Without resampling, all
trajectories except for one will get zero weight quite quickly. However,
there is no need to resample at every iteration. Actually, resampling
increases the weight variance, which is undesired. The question is how
to decide if resampling is needed. The effective number of particles,
estimated as

$$N_{\text{eff}}(k) = \frac{1}{\sum_{i=1}^{N} \bigl(w_k^{(i)}\bigr)^2},
  \tag{4}$$

is one suitable indicator. If all particles have the same weight
$w_k^{(i)} = 1/N$, then $N_{\text{eff}}(k) = N$. Conversely, if one
weight is one and all others are zero, then $N_{\text{eff}}(k) = 1$.
Thus, $N_{\text{eff}}$ can be interpreted as a measure of how many
particles actually contribute to the solution.

Figure 3 shows the evolution of $N_{\text{eff}}(k)$ for prior and likelihood proposals, respectively. With resampling in every iteration, the likelihood proposal performs very well with $N_{\text{eff}}(k) \approx N$, while the prior proposal effectively uses only 10% of the particles. Thus, $N_{\text{eff}}(k)$ is a good indicator of the quality of the proposal distribution.

In Fig. 1, $N = 100$, so effectively 10 samples are contributing to the Gaussian approximation, which as mentioned before is too small a number to get a good result.

As a comparison, Fig. 3 also shows $N_{\text{eff}}(k)$ when resampling is never used, in which case it normally decreases over time. With the likelihood proposal it does not decrease as fast as with the prior proposal, and for this very short data sequence resampling is not really needed at all.

In summary, resampling increases weight variance and decreases the performance of the filter. On the other hand, without resampling the effective number of particles converges monotonically to only one. So, the idea of adaptive resampling is natural. The key idea is to resample only if the effective number of particles is small. The usual rule of thumb is that resampling is needed if $N_{\text{eff}}(k) < 2N/3$.
Resampling was the main contribution of Gordon et al. (1993) that led to a working algorithm.
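
The adaptive rule can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the use of plain multinomial resampling via `random.choices` are choices made for the example, not something prescribed by the text; only the $N_{\text{eff}}$ estimate and the $2N/3$ threshold come from the discussion above.

```python
import random


def effective_sample_size(weights):
    """Estimate N_eff = 1 / sum_i w_i^2 for (possibly unnormalized) weights."""
    total = sum(weights)
    normalized = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalized)


def adaptive_resample(particles, weights, threshold=2.0 / 3.0, rng=None):
    """Resample only when N_eff drops below threshold * N.

    Returns (particles, normalized weights). Multinomial resampling is an
    assumption of this sketch; other resampling schemes could be swapped in.
    """
    rng = rng or random
    n = len(particles)
    total = sum(weights)
    weights = [w / total for w in weights]
    if effective_sample_size(weights) >= threshold * n:
        return list(particles), weights          # weights are still healthy
    resampled = rng.choices(particles, weights=weights, k=n)
    return resampled, [1.0 / n] * n              # equal weights after resampling
```

For uniform weights the function returns its input unchanged, whereas a fully degenerate weight vector triggers resampling and resets all weights to $1/N$.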

Marginalization 

The posterior distribution approximation provided by the particle filter converges as the number of particles increases. In theory, the convergence rate is $g_k/N$, where $g_k$ is a polynomial function in time $k$. In practice, it appears that the required number of particles increases very quickly with state dimension. Unless a very good proposal distribution is found, the practical limit for the state dimension is around 3-4 as a rough rule of thumb.

For applications with a large number of states, one can in many cases still use the particle filter. The idea is to find a linear Gaussian substructure in the model and then divide the state vector $x_k$ into two parts: $x_k^l$ for the states that appear linearly and $x_k^n$ for the remaining states. Bayes' rule provides the factorization

$$p(x_k^l, x_{1:k}^n \mid y_{1:k}) = p(x_k^l \mid x_{1:k}^n, y_{1:k}) \, p(x_{1:k}^n \mid y_{1:k}). \quad (5)$$

With a linear Gaussian substructure for $x_k^l$, given the whole trajectory $x_{1:k}^n$, the Kalman filter applies and provides a Gaussian distribution for the first factor in (5). The second factor is resolved using a marginalization procedure so that the particle filter can be applied.

The bottom line is that each particle is associated with one Kalman filter. The method is called the Rao-Blackwellized particle filter, or marginalized particle filter, in the literature.

Illustrative Application: Navigation 
by Map Matching 

A nontrivial application where the PF solves a nonlinear filtering problem, where Kalman filter-based approaches would fail, is described in Forssell et al. (2002). The problem is to compute a robust estimate of the position of a car, without using infrastructure such as cellular networks or satellites. The approach is based on measuring wheel speeds on one axle and from that dead reckoning a nominal trajectory using standard odometric formulas. A road map is then used as a measurement, to rule out impossible or unlikely maneuvers. There is no numeric measurement $y_k$ in this approach, but the likelihood $p(y_k \mid x_k)$ in the model (1b) is large when $x_k$ corresponds to a road position and decays quickly to zero outside the road network. Figure 4 illustrates how the particle cloud gradually focuses around the true position with time. In particular, the number of modes in the posterior distributions is rather high initially, but decreases over time, in particular after each turn.

The particle filter is sometimes believed to be too computer intensive for real-time applications. As described in Forssell et al. (2002), this demonstrator implemented a particle filter on a pocket computer in 2001 with $N = 15{,}000$ particles running at 10 Hz. Thus, the computational complexity of the particle filter should not be overemphasized in practice.

Summary and Future Directions 

The particle filter can be seen as a black-box solution to the nonlinear filtering problem, where any nonlinear dynamical model with arbitrary noise distributions can be plugged in. The main tuning parameter is the number of particles, and the PF will work in theory if this number is large enough. In practice, there are many tricks the user has to be aware of to mitigate the curse of dimensionality (depletion) that occurs for large state spaces (more than three) or long time sequences (more than a couple of samples). One engineering trick is dithering, to increase the variance in the involved noise distributions from their nominal values. More theoretical ways to mitigate depletion include clever choices of proposal distributions to sample from and marginalization (to solve a subset of the estimation problem with a Kalman filter).

Current and future research directions include the issues above. A further trend concerns the related smoothing problem, which is interesting in itself, but which has turned out to be instrumental in joint state and parameter estimation problems. There are also many attempts to make the particle filter more robust, including ideas of filter banks and invoking a second layer of sampling algorithms to implement the proposal distribution. There is also a trend to use the particle filter as a computational engine for more complex problems, such as the simultaneous localization and mapping (SLAM) problem, and to approximate the probability hypothesis density (PHD) for multi-sensor multi-target tracking. Finally, there are a large number of papers reporting on applications in traditional as well as new disciplines.

Particle Filters, Fig. 4 Car navigation using wheel speed and a street map. The figures illustrate how the particle representation of the posterior distribution of position (course state is not shown) improves over time. After four turns, the posterior is essentially unimodal, and a position marker can be shown. The circle denotes GPS position, which is only used for comparison. (a) After first turn, (b) After second turn, (c) After third turn, (d) After fourth turn






















Perturbation Analysis of Discrete Event Systems 


Cross-References 
Filters


Recommended Reading 

Particle filtering (PF) as a research area started with the seminal paper (Gordon et al. 1993) and the independent developments in Kitagawa (1996) and Isard and Blake (1998). The state of the art is summarized in the article collection Doucet et al. (2001), the surveys Liu and Chen (1998), Arulampalam et al. (2002), Djuric et al. (2003), Cappé et al. (2007), Gustafsson (2010), and the monograph Ristic et al. (2004).
Abstract

Perturbation analysis (PA) is a systematic 
methodology for estimating the sensitivities 
(gradient) of performance measures in discrete 
event systems (DES) with respect to various 
model or control parameters of interest. PA takes 
advantage of the special structure of DES sample 
realizations and is based entirely on observable 
system data. In particular, it does not require 
knowledge of the stochastic characterizations of 
the random processes involved and is simple 
to implement in a nonintrusive manner. PA 
estimators, therefore, enable implementations 
for real-time control in addition to off-line 
optimization. The article presents the main 
ideas and statistical properties of PA techniques 
for both DES and recent generalizations to 
stochastic hybrid systems (SHS), especially for 
the simplest class of sensitivity estimators known 
as infinitesimal perturbation analysis (IPA). 
Introduction

Sensitivity analysis is an essential component of the system design process in a wide variety of application areas. In essence, it provides quantitative variations of performance metrics resulting from possible perturbations in design set points and, hence, can be used in optimization and control as well as provide measures of performance robustness. Perturbation analysis (PA) is a systematic technique for computing sample-based sensitivity estimators of performance metrics in discrete event systems (DES) by using the special properties of their sample realizations. The effectiveness of such estimators (e.g., unbiasedness) depends on the characteristics of the DES to which PA is applied and on the specific performance metric of interest. The purpose of this article is to present and explain some of the main ideas and fundamental techniques of PA.

Figure 1 depicts an abstract schematic where the operation of a stochastic system depends on a parameter $\theta$ that is chosen from a given set $\Theta$. Let $J(\theta)$ be an expected value performance function of the system, and suppose that $J(\theta) = E[L(\theta)]$, where $E[\cdot]$ denotes expectation and $L(\theta)$ is a sample realization computable from a sample path of the system. In many situations $J(\theta)$ lacks a closed-form expression, and its sample realization, $L(\theta)$, provides the most practical way to estimate it. Applications of sensitivity analysis often concern the effects of perturbations in the parameter $\theta$ on the sample realization $L(\theta)$. Denoting such perturbations by $\Delta\theta$, their effects can be characterized by the difference term $L(\theta + \Delta\theta) - L(\theta)$. PA provides such difference terms from the same sample path that was used for computing $L(\theta)$. Furthermore, if $\theta \in R^n$ and the function $L(\cdot)$ is differentiable, then PA can compute the gradient term $\nabla L(\theta)$ from the same sample path. These sample path-based sensitivities can be used, under suitable conditions, to estimate the quantities $J(\theta + \Delta\theta) - J(\theta)$ and $\nabla J(\theta)$, respectively.

Perturbation Analysis of Discrete Event Systems, Fig. 1 Framework for perturbation analysis (PA)

The PA theory was pioneered by Yu-Chi Ho, who led its eventual development by his own group and other researchers over two decades. The early works were motivated by optimal resource management problems in manufacturing and concerned the effects of buffer allocation on throughput in transfer lines (Ho and Cassandras 1983; Ho et al. 1979). Subsequently PA was developed in the setting of queueing networks by virtue of their wide use as models in applications. In this setting, typically $\theta$ is a set point parameter affecting service times, inter-arrival times, routing fractions, buffer sizes, and various flow control laws; $J(\theta)$ is an expected value performance metric like average delay, throughput, and loss; and $L(\theta)$ is a sample realization of $J(\theta)$. The special structure of sample paths of queueing networks often yields simple PA algorithms for the difference terms $L(\theta + \Delta\theta) - L(\theta)$, as well as the gradient term $\nabla L(\theta)$, from the common sample path. The PA techniques for computing $L(\theta + \Delta\theta) - L(\theta)$ are collectively referred to as finite perturbation analysis (FPA), while those for computing $\nabla L(\theta)$ are called infinitesimal perturbation analysis (IPA) (Ho et al. 1983). Much of the development of PA has focused on IPA, rather than FPA, due to its greater simplicity and natural use in optimization, and, hence, it will be the focal point of this article. For comprehensive expositions of PA and its various techniques, please see Ho and Cao (1991), Glasserman (1991), and Cassandras and Lafortune (2008).

The purpose of the IPA gradient, $\nabla L(\theta)$, is to estimate $\nabla J(\theta)$. This, however, is only useful as long as $\nabla L(\theta)$ is an unbiased realization of $\nabla J(\theta)$, namely,

$$E[\nabla L(\theta)] = \nabla E[L(\theta)] = \nabla J(\theta),$$

and in this case it is said that IPA is unbiased (Cao 1985). Since $J(\theta) = E[L(\theta)]$, unbiasedness means the commutativity of the operators of differentiation with respect to $\theta$ and integration (expectation) in the probability space, and this is closely related to the condition that, w.p.1, the random function $L(\theta)$ is continuous throughout $\Theta$. As a matter of fact, the two conditions are practically synonymous. However, shortly after the emergence of IPA, it became apparent that in many queueing models of interest, $L(\theta)$ is not continuous and, hence, IPA is not unbiased (Heidelberger et al. 1988). Subsequently various techniques to overcome this problem were explored, including the so-called cut and paste of the sample paths and re-parametrization of the underlying probability space via statistical conditioning. For a more comprehensive coverage of such techniques, please see Cassandras and Lafortune (2008) and references therein. These techniques can yield unbiased gradient estimators in principle but often at the expense of prohibitive computing costs. Recently an alternative approach has emerged, based on stochastic flow models (SFM) that are comprised of fluid queues (Cassandras et al. 2002). It significantly extends the class of models and problems where IPA is unbiased and has the added advantage of yielding very simple gradient estimators.

The following sections of this article present 
IPA in the general setting of DES, explain the 
limits of its scope in queueing models, describe 
alternative PA techniques for extending those 
limits, present the SFM approach, and conclude 
with some thoughts on future research directions. 


DES Setting for IPA 

IPA can be applied to any DES modeled as a stochastic timed automaton, defined in Cassandras and Lafortune (2008) and discussed in ►Models for Discrete Event Systems: An Overview. Briefly, a stochastic timed automaton is a sextuple $(\mathcal{E}, \mathcal{X}, \Gamma, p, p_0, G)$, where $\mathcal{E}$ is an event set, $\mathcal{X}$ is a state space, and $\Gamma(x) \subseteq \mathcal{E}$ is the set of feasible events when the state is $x$, defined for all $x \in \mathcal{X}$. The initial state is drawn from $p_0(x) = P[X_0 = x]$. Subsequently, given that the current state is $x$, with each feasible event $i \in \Gamma(x)$, we associate a clock value $Y_i$, which represents the time until event $i$ is to occur. Thus, comparing all such clock values, we identify the triggering event $E' = \arg\min_{i \in \Gamma(x)} \{Y_i\}$, where $Y^* = \min_{i \in \Gamma(x)} \{Y_i\}$ is the inter-event time (the time elapsed since the last event occurrence). To simplify the notation we define $e' := E'$. Thus, with $e'$ determined, the state transition probabilities $p(x'; x, e')$ are used to specify the next state $x'$. Finally, the clock values are updated: $Y_i$ is decremented by $Y^*$ for all $i$ (other than the triggering event) which remain feasible in $x'$, while the triggering event (and all other events which are activated upon entering $x'$) is assigned a new lifetime sampled from a distribution $G_i$. The set $G = \{G_i : i \in \mathcal{E}\}$ defines the stochastic clock structure of the automaton.

Let $T_{\alpha,n}$ denote the $n$th occurrence time of event $\alpha \in \mathcal{E}$, and let $V_{\alpha,n}$ denote a realization of the event lifetime distribution $G_\alpha$ such that $V_{\alpha,n} = T_{\alpha,n} - T_{\beta,m}$ for some (any) event $\beta \in \mathcal{E}$ and $m \in \{1, 2, \ldots\}$. One can then always write $T_{\alpha,n} = V_{\beta_1,k_1} + \ldots + V_{\beta_s,k_s}$ for some $s$. Let us now consider a parameter $\theta \in R$ which can only affect one or more of the event lifetime distributions $G_\alpha(x; \theta)$; in particular, $\theta$ does not affect the state transition mechanism. The case where $\theta \in R^n$ can be handled in similar ways, but the one-dimensional case permits us to use the derivative notation rather than the gradient symbol, which simplifies the presentation. Viewing lifetimes as functions of $\theta$, $V_{\alpha,k}(\theta)$, it can be shown (under mild technical conditions, see Glasserman (1991)) that

$$\frac{dV_{\alpha,k}}{d\theta} = -\frac{[\partial G_\alpha(x;\theta)/\partial\theta]_{(V_{\alpha,k},\theta)}}{[\partial G_\alpha(x;\theta)/\partial x]_{(V_{\alpha,k},\theta)}}, \quad (1)$$

where the subscript $(V_{\alpha,k}, \theta)$ indicates that the corresponding derivative of $G_\alpha$ is evaluated at the point $(V_{\alpha,k}, \theta)$. This describes how a perturbation in $\theta$ generates a perturbation in the associated event lifetime $V_{\alpha,k}$. Such a perturbation can now propagate through the DES to affect various event occurrence times according to the dynamics prescribed by the stochastic timed automaton. Event time derivatives $dT_{\alpha,n}(\theta)/d\theta$ are given by

$$\frac{dT_{\alpha,n}}{d\theta} = \sum_{\beta,m} \frac{dV_{\beta,m}}{d\theta} \, \eta(\alpha,n;\beta,m), \quad (2)$$

where $\eta(\alpha,n;\beta,m)$ is a triggering indicator taking values in $\{0,1\}$ so that $\eta(\alpha,n;\beta,m) = 1$ if the $n$th occurrence of event $\alpha$ is triggered by the $m$th occurrence of $\beta$, $\eta(\alpha,n;\alpha,n) = 1$ for all $\alpha \in \mathcal{E}$, $\eta(\alpha,n;\beta',m') = 1$ if $\eta(\alpha,n;\beta,m) = 1$ and $\eta(\beta,m;\beta',m') = 1$, and $\eta(\alpha,n;\beta,m) = 0$ otherwise.
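
As a small numerical illustration of the lifetime derivative, consider an exponentially distributed lifetime with mean $\theta$, so that $G(x;\theta) = 1 - e^{-x/\theta}$. The sketch below evaluates the ratio of partial derivatives at an observed lifetime; the function names are hypothetical and the exponential distribution is an assumption chosen only because its partial derivatives are available in closed form. In this case the ratio simplifies to $V/\theta$.

```python
import math


def lifetime_derivative(v, theta, dG_dtheta, dG_dx):
    """dV/dtheta = -[dG/dtheta] / [dG/dx], both evaluated at (v, theta)."""
    return -dG_dtheta(v, theta) / dG_dx(v, theta)


# Exponential lifetime with mean theta: G(x; theta) = 1 - exp(-x / theta).
def exp_dG_dtheta(x, theta):
    """Partial derivative of G with respect to theta."""
    return -(x / theta ** 2) * math.exp(-x / theta)


def exp_dG_dx(x, theta):
    """Partial derivative of G with respect to x (the density)."""
    return math.exp(-x / theta) / theta


# An observed lifetime of 2.0 under theta = 4.0 has sensitivity V / theta.
sensitivity = lifetime_derivative(2.0, 4.0, exp_dG_dtheta, exp_dG_dx)
```

The value depends only on the observed lifetime and on $\theta$, which is why it can be computed along a sample path without knowing the underlying random variate.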

This leads to a general-purpose algorithm for evaluating event time derivatives along an observed sample path (see Algorithm 1) of a DES modeled as a stochastic timed automaton. In particular, we define a perturbation accumulator, $\Delta_\alpha$, for every event $\alpha \in \mathcal{E}$. The accumulator $\Delta_\alpha$ is updated at event occurrences in two ways: (i) It is incremented by $dV_\alpha/d\theta$ whenever an event $\alpha$ occurs, and (ii) it is coupled to an accumulator $\Delta_\beta$ whenever an event $\beta$ (possibly $\beta = \alpha$) occurs that activates an event $\alpha$. No particular stopping condition is specified, since this may vary depending on the problem of interest.

Algorithm 1 General-purpose IPA algorithm for stochastic timed automata

1. Initialization
   If event $\alpha$ is feasible at $x_0$: $\Delta_\alpha := dV_{\alpha,1}/d\theta$
   Else, for all other $\alpha \in \mathcal{E}$: $\Delta_\alpha := 0$

2. Whenever event $\beta$ is observed
   If event $\alpha$ is activated with new lifetime $V_\alpha$:
   2.1. Compute $dV_\alpha/d\theta$ through (1)
   2.2. $\Delta_\alpha := \Delta_\beta + dV_\alpha/d\theta$

Sample Function Derivatives. Since many sample performance functions $L(\theta)$ of interest can be expressed in terms of event times $T_{\alpha,n}$, we can use (2) and Algorithm 1 in order to obtain derivatives of the form $dL(\theta)/d\theta$. As an example, a large class of such functions is of the form

$$L_T(\theta) = \int_0^T C(x(t,\theta)) \, dt,$$

where $C(x(t,\theta))$ is a bounded cost associated with operating the system at state $x(t,\theta)$. Then,

$$\frac{dL_T}{d\theta} = \sum_{k=1}^{N(T)} \frac{dT_k}{d\theta} \left[ C(x_{k-1}) - C(x_k) \right],$$

where $N(T)$ counts the total number of events observed in $[0, T]$ and $x_k$ is the state remaining fixed in any interval $(T_k, T_{k+1})$ with $T_k = T_{\alpha,n}$ for some $\alpha \in \mathcal{E}$, $n = 1, 2, \ldots$.


Estimation of Performance Measure Derivatives. Using $dL_T(\theta)/d\theta$ from above, we can obtain unbiased estimates of $dJ/d\theta$ if the following condition holds:

$$\frac{dJ(\theta)}{d\theta} = E\left[\frac{dL_T(\theta)}{d\theta}\right].$$

As mentioned earlier, this key condition is closely related to the continuity of the sample performance functions. A discontinuity often is caused by a swap in the order of two events that results from small variations in $\theta$ and yields different future state trajectories. However, it is possible that the future state trajectory following the occurrence of the two events is invariant under their order, and in this case the two events are said to commute. This commuting condition, defined by Glasserman, was shown to be identical, under broad assumptions, to the continuity of the sample functions $L(\theta)$ and, hence, to the unbiasedness of IPA (Glasserman 1991).

The main ideas discussed in this section will 
next be illustrated on a simple queue. 

Queueing Example 

Consider the IPA gradient (derivative) of the expected sojourn time (delay) in a GI/G/1 queue with respect to a real-valued parameter of its arrival process. Assume that the queue is empty at time $t = 0$, and it serves its customers according to the order of their arrivals. Let us denote by $a_k$, $k = 1, 2, \ldots$, the arrival times, and by $s_k$, $k = 1, 2, \ldots$, the service times of consecutive customers. Furthermore, we denote by $v_k$, $k = 1, 2, \ldots$, the $k$th inter-arrival time, namely, $v_k = a_k - a_{k-1}$, where $a_0 := 0$. Observe that the queue is a stochastic timed automaton as defined earlier, with event space {arrival, departure}, state space $\{0, 1, \ldots\}$ representing the queue's occupancy, and feasible event set {arrival} when $x = 0$ and {arrival, departure} when $x > 0$.

Let $\theta \in R$ be a parameter of the distribution of the inter-arrival times, so that the realizations of $v$ depend on $\theta$ in a functional manner. For example, if the arrival process is Poisson and $\theta$ is its mean inter-arrival time (the reciprocal of its rate), then a realization of the inter-arrival times has the form $v = -\theta \ln(1 - \omega)$, where $\omega \in [0, 1]$ is a uniform variate. To emphasize the dependence of $v$ on $\theta$, we denote it by $v(\theta)$, while its dependence on $\omega$ is implicit. Similarly, the arrival times depend on $\theta$ and, hence, are denoted by $a_k(\theta)$, but the service times $s_k$ do not depend on $\theta$. The departure time of the $k$th customer and its delay depend on $\theta$ and are denoted by $d_k(\theta)$ and $D_k(\theta)$, respectively. The forthcoming paragraphs discuss the derivatives of these functions with respect to $\theta$; we use the prime symbol to indicate such derivatives in order to simplify the notation.

Define the sample performance function

$$L_N(\theta) := N^{-1} \sum_{k=1}^{N} D_k(\theta)$$

for a given $N > 0$. Under stability conditions, with probability 1 (w.p.1), $\lim_{N \to \infty} L_N(\theta) = J(\theta)$, where $J(\theta)$ denotes the mean of the delay's stationary distribution. The role of IPA is to estimate $J'(\theta)$ via the sample derivative $L'_N(\theta)$, and this is justified as long as $\lim_{N \to \infty} L'_N(\theta) = J'(\theta)$ w.p.1. In this case, IPA is said to be strongly consistent. In contrast, unbiasedness pertains to the performance function $J_N(\theta) := E[L_N(\theta)]$ and means that $E[L'_N(\theta)] = J'_N(\theta)$. The concepts of strong consistency and unbiasedness are closely related except that the latter concerns finite-horizon processes while the former pertains to stationary distributions in steady state. Both concepts have been extensively investigated in recent years: strong consistency in the setting of Markov chains and Markov decision processes (Cao 2007) and unbiasedness in the context of stochastic hybrid systems, as will be described in the sequel. We will focus the rest of the discussion on the issue of unbiasedness.


Since $L_N(\theta) = N^{-1} \sum_{k=1}^{N} D_k(\theta)$, its IPA derivative is $L'_N(\theta) = N^{-1} \sum_{k=1}^{N} D'_k(\theta)$, and since the delay is $D_k(\theta) = d_k(\theta) - a_k(\theta)$, it follows that $D'_k(\theta) = d'_k(\theta) - a'_k(\theta)$. The last two derivative terms are computable via the following recursive procedures. First, $a_k(\theta) = a_{k-1}(\theta) + v_k(\theta)$, and, hence,

$$a'_k(\theta) = a'_{k-1}(\theta) + v'_k(\theta).$$

The term $v'_k(\theta)$ has to be obtained directly from the realization of $v$, and this often is possible since such realizations depend functionally on $\theta$; for instance, in the previous example, $v = -\theta \ln(1 - \omega)$ and, hence, $v'(\theta) = -\ln(1 - \omega)$. Next, $d_k(\theta)$ is given by the Lindley equation

$$d_k(\theta) = \max\{a_k(\theta), d_{k-1}(\theta)\} + s_k,$$

and, therefore, denoting by $l_k$ the index of the customer that started the busy period containing customer $k$, we have that

$$d'_k(\theta) = a'_{l_k}(\theta).$$

From these recursive relations it follows that $D'_k(\theta) = 0$ if customer $k$ starts a busy period and $D'_k(\theta) = -\sum_{j=l_k+1}^{k} v'_j(\theta)$ if customer $k$ does not start a busy period.
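
The recursions for this queue translate almost line by line into code. The following Python sketch is an illustration under stated assumptions: the function name `simulate_delays` is hypothetical, inter-arrival times are generated as $v = -\theta \ln(1 - \omega)$ as in the example above, and the uniform variates and service times are supplied by the caller so that the same sample path can be reused for different values of $\theta$. Delay is computed as departure time minus arrival time.

```python
import math
import random


def simulate_delays(theta, omegas, services):
    """Return (L_N, L'_N): mean delay and its IPA derivative w.r.t. theta."""
    a = d = 0.0            # arrival time a_k and departure time d_k
    da = dd = 0.0          # their derivatives a'_k and d'_k
    total = dtotal = 0.0
    for w, s in zip(omegas, services):
        v = -theta * math.log(1.0 - w)   # inter-arrival time v_k(theta)
        dv = -math.log(1.0 - w)          # perturbation generation: v'_k(theta)
        a += v
        da += dv
        if a >= d:
            # Customer k starts a busy period: d_k = a_k + s_k, so d'_k = a'_k.
            d = a + s
            dd = da
        else:
            # Customer k waits: d_k = d_{k-1} + s_k, so d'_k is unchanged.
            d = d + s
        total += d - a       # delay D_k
        dtotal += dd - da    # its derivative D'_k
    n = len(omegas)
    return total / n, dtotal / n


# Usage: one sample path with fixed random variates (seed chosen arbitrarily).
rng = random.Random(0)
omegas = [rng.random() for _ in range(2000)]
services = [rng.expovariate(1.25) for _ in range(2000)]
mean_delay, ipa_derivative = simulate_delays(1.0, omegas, services)
```

Because the variates are held fixed, the IPA derivative can be compared with a finite difference of the mean delay computed from the same sample path at two nearby values of $\theta$.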

These equations shed light on the structure of IPA in a general class of queueing networks. First there is the perturbation generation, namely, the sampling of derivative (gradient) terms directly from the sample sequence of variates ($\omega$) which defines the sample path; that was $v'(\theta)$ in the above example. These terms drive the recursive equations that yield the IPA derivatives. The recursive equations often are based on the tracking of certain events such as the start of busy periods or idle periods, and the process of tracking the derivatives through them is referred to as perturbation propagation. In the above example it is obvious how the perturbation propagation tracks the busy periods at the queue. Furthermore, in a network setting, the perturbations can propagate from one queue to the next in a natural fashion. For instance, suppose that customers departing from the queue analyzed above enter a second queue. Then the derivative terms $d'_k(\theta)$ of the upstream queue act as the derivative terms $a'_k(\theta)$ of the downstream queue. This structure of perturbations' generation and propagation through the tracking of busy periods and other events often yields simple recursive algorithms for computing the IPA derivatives.

Perturbation Analysis of Discrete Event Systems, Fig. 2 Queue with two customer classes

Concerning the issue of unbiasedness, it is clear that in the above example, the sample function $L_N(\theta)$ is continuous in $\theta$ and, hence, IPA is unbiased. However, in many systems of interest the IPA derivative is biased. For example, consider the two-input, single-server queue shown in Fig. 2, where customers are served according to their arrival order regardless of source. Suppose that $\theta$ is a parameter of the upper arrival process, but not of the lower arrival process, and denote the respective inter-arrival times of the input streams by $v_{1,k}$, $k = 1, 2, \ldots$, and $v_{2,m}$, $m = 1, 2, \ldots$, as indicated in the figure. Furthermore, let $d_{1,k}(\theta)$ denote the departure time from the queue of the $k$th customer that came from the upper source, and let $d_{2,m}(\theta)$ denote the departure time from the queue of the $m$th customer that came from the lower source. Similarly, let $D_{1,k}(\theta)$ and $D_{2,m}(\theta)$ be the delays of the $k$th customer from the upper source and the $m$th customer from the lower source, respectively. Lastly, in analogy with the previous example, consider the sample performance functions $L_{1,N}(\theta) := N^{-1} \sum_{k=1}^{N} D_{1,k}(\theta)$ and $L_{2,N}(\theta) := N^{-1} \sum_{m=1}^{N} D_{2,m}(\theta)$.

The IPA derivatives $L'_{1,N}(\theta)$ and $L'_{2,N}(\theta)$ have quite similar expressions to those derived earlier, but they are not unbiased. To see this point, suppose that $v_{1,k}(\theta)$ is a monotone increasing function of $\theta$, and consider the functions $a_{1,k}(\theta)$, $d_{1,k}(\theta)$, and $d_{2,m}(\theta)$ for a common sample path. Suppose that at some point $\bar{\theta}$ the order of arrivals of the $k$th customer from the upper source and the $m$th customer from the lower source is swapped. Then the service order of these customers will be swapped as well, inducing discontinuities in $d_{1,k}(\theta)$ and $d_{2,m}(\theta)$ at the point $\theta = \bar{\theta}$. Consequently, the sample performance functions $L_{1,N}(\theta)$ and $L_{2,N}(\theta)$ also will be discontinuous at $\theta = \bar{\theta}$, and hence, their IPA derivatives are biased. Furthermore, if the queue is a part of a network and its output process directs customers to other queues, then the discontinuities in the various traffic processes will propagate downstream.

The causes of biasedness in queueing networks include multiple customer classes, non-Markovian routing, and loss (spillover) due to finite buffers. This leaves out a limited class of networks where IPA can be unbiased and, hence, useful in applications. The following sections describe various approaches to overcome this problem.

IPA Extensions 

When IPA fails (because the commuting condition is violated or a sample function exhibits discontinuities in $\theta$), one can still use the PA approach to derive unbiased performance sensitivity estimates. There are two ways to accomplish this: (i) by modifying the stochastic timed automaton model so that IPA is “made to work” and (ii) by paying the price of more information collected from the observed sample path, in which case, the same essential PA philosophy can lead to unbiased and strongly consistent estimators, but these are no longer as simple as IPA ones. Regarding (i), the main idea here is that there may be more than one way to construct (statistically equivalent) sample paths of a stochastic DES, and while one way leads to discontinuous sample functions $L(\theta)$, another does not; a variety of such ways is provided in Cassandras and Lafortune (2008). Regarding (ii), the methodology of smoothed perturbation analysis (SPA) (Gong and Ho 1987) provides a generalization of IPA in which more information is extracted from a DES sample path in order to gain some knowledge about the magnitude of jumps in $L(\theta)$.
The main idea of SPA lies in the “smoothing property” of conditional expectation. If we are willing to extract information from a sample path and denote it by $Z$, called the characterization of the sample path, then we can evaluate, not just the sample function $L(\theta)$, but also the conditional expectation $E[L(\theta) \mid Z]$ (provided we have some distributional knowledge based on which this expectation can be evaluated). This can result in a much smoother function of $\theta$ than $L(\theta)$. Thus, starting with the condition for an IPA estimator to be unbiased,

$$\nabla J(\theta) = E[\nabla L(\theta)],$$

we rewrite the left-hand side above as shown below, replacing $J(\theta) = E[L(\theta)]$ by the expectation of a conditional expectation:

$$\nabla J(\theta) = \nabla E[L(\theta)] = \nabla E[E[L(\theta) \mid Z]], \quad (3)$$

where the inner expectation is a conditional one and the conditioning is on the characterization $Z$. Treating $E[L(\theta) \mid Z]$ as the new sample function, we expect it to be “smoother” than $L(\theta)$, and, in particular, continuous in $\theta$. Then, under some additional conditions (comparable to those made in the development of IPA) the interchange of differentiation and expectation in (3) can be justified:

$$\nabla J(\theta) = E[\nabla E[L(\theta) \mid Z]].$$

Letting $L_Z(\theta) := E[L(\theta) \mid Z]$, the SPA estimator of $\nabla J(\theta)$ is

$$[\nabla J(\theta)]_{SPA} = \nabla L_Z(\theta). \quad (4)$$

Naturally, the idea is to minimize the amount of added information represented by $Z$, since this incurs added costs we would like to avoid. The choice of the characterization $Z$ generally depends on the sample function considered and the system under study.

A number of other extensions to IPA have also been developed (see Cassandras and Lafortune 2008). It is also worth mentioning that the PA approach can be applied to a parameter $\theta$ taking values from a finite set $\Theta = \{\theta_0, \theta_1, \ldots, \theta_M\}$. The theory of concurrent estimation and sample path constructability provides techniques to estimate performance measures of the DES through the process of constructing sample paths under each of $\theta_0, \theta_1, \ldots, \theta_M$; for details see Cassandras and Lafortune (2008).

SFM Framework for IPA 

The stochastic flow model (SFM) framework 
essentially consists of fluid queues which forego 
the notion of the individual customer and focus 
instead on the aggregate flow. In such a fluid 
queue, traffic and service processes are character¬ 
ized by instantaneous flow rates as opposed to the 
arrival, departure, and service times of discrete 
customers. The SFM qualifies as a stochastic hy¬ 
brid system with bilayer dynamics: discrete event 
dynamics at the upper layer and time-driven dy¬ 
namics at the lower layer. The discrete events are 
associated with abrupt (discontinuous) changes in 
traffic-flow processes, such as the boundaries of 
busy periods at the queues. In contrast, the time- 
driven dynamics describe the continuous evolu¬ 
tion of flow rates between successive discrete 
events, usually by differential equations or ex¬ 
plicit functional terms. Performance metrics that 
are natural to SFMs typically reflect quantitative 
measures of flow rates, like average throughput, 
buffer workload, and loss. 

Due to the smoothing effects of SFMs, they
appear to provide a far more natural setting for
IPA than their analogous discrete queueing counterparts.
Furthermore, their IPA gradients often
are computable via extremely simple algorithms
that are based entirely on the observed sample
path. Consequently, these algorithms could, in principle, be
implemented on the sample paths generated by an
actual system rather than simulations thereof and
thus be used in real-time optimization.

All of this will be explained next via a concrete
example of a queue which, though simple,
captures the salient features of the SFM setting
for IPA. For a more comprehensive discussion,
please see Cassandras et al. (2010), Wardi et al.
(2010), and Yao and Cassandras (2013).

Perturbation Analysis of Discrete Event Systems, Fig. 3 Basic SFM


Consider the fluid queue depicted in Fig. 3
whose input and output flow rate processes
are denoted, respectively, by $\alpha(t)$ and $\delta(t)$.
The output flow process depends on the input
flow process via the action of the server as well as
the buffer size. The server is characterized by an
instantaneous processing rate, denoted by $\beta(t)$,
and the buffer size, namely, the maximum amount
of fluid the buffer can hold, is denoted by $c$. Fluid
overflow occurs when the inflow (arrival) rate
exceeds the service rate while the buffer is full,
and the overflow (loss) rate is denoted by $\gamma(t)$.

Suppose that $\alpha(t)$ and $\beta(t)$ are random functions
defined on a suitable probability space, and
assume that they are piecewise continuous and of
bounded variation w.p. 1. In order to describe their
functional relations to $\delta(t)$ and $\gamma(t)$, we define
the state variable to be the amount of fluid in
the buffer (workload) and denote it by $x(t)$. The
dynamics of the system evolve according to the
following one-sided differential equation,

$$\frac{dx}{dt^{+}} = \begin{cases} 0, & \text{if } x(t) = 0 \text{ and } \alpha(t) \le \beta(t) \\ 0, & \text{if } x(t) = c \text{ and } \alpha(t) \ge \beta(t) \\ \alpha(t) - \beta(t), & \text{otherwise,} \end{cases}$$

and $\delta(t)$ and $\gamma(t)$ are related to them via

$$\delta(t) = \begin{cases} \beta(t), & \text{if } x(t) > 0 \\ \alpha(t), & \text{if } x(t) = 0, \end{cases}$$

and

$$\gamma(t) = \begin{cases} \alpha(t) - \beta(t), & \text{if } x(t) = c \\ 0, & \text{if } x(t) < c. \end{cases}$$

Network arrangements of such fluid queues, with
specified routing and control schemes, provide a
rich class of SFMs.

Let $\theta \in \mathbb{R}$ be a parameter of the inflow rate,
the service rate, or a network control law; for
instance, $\theta$ can be the on time of the flow from an
off/on source, a uniform rate of the server, or the
threshold level in a threshold-based flow control.
Then the aforementioned traffic processes are
functions of $\theta$ and $t$ and, hence, are denoted by
$\alpha(\theta;t)$, $\beta(\theta;t)$, $x(\theta;t)$, etc. Fix the parameter $\theta$,
and consider the evolution of the system over a
given time horizon $[0, T]$. Performance measures
of interest in applications include the average
loss rate over the horizon $[0, T]$ and the average
workload there which is related to the delay by
Little's Law. Related to them are the sample performance
functions $L_{\gamma,T}(\theta) := \int_0^T \gamma(\theta,t)\,dt$ and
$L_{x,T}(\theta) := \int_0^T x(\theta,t)\,dt$; the former is called
the loss volume and the latter, the cumulative
workload.

To illustrate the forms of their IPA derivatives,
consider the basic SFM shown in Fig. 3, and let $\theta$
be its buffer size, namely, $\theta = c$. We say that
a busy period of the queue is lossy if it incurs
some loss at any amount. Let us denote by $N_T$
the number of lossy busy periods in the horizon
$[0, T]$. Then (see Cassandras et al. 2002), the IPA
derivative of the loss volume has the following
form:

$$L'_{\gamma,T}(\theta) = -N_T,$$

where again we use the prime symbol to denote
derivative with respect to $\theta$. This formula
amounts to a counting process and indeed it is
very simple. As an example, Fig. 4 depicts a
typical state trajectory derived from a sample
path. It is readily seen that the first busy period is
lossy while the second one is not, and therefore,
$L'_{\gamma,T}(\theta) = -1$.

Concerning the cumulative workload, suppose
that the queue has $M$ lossy busy periods in the
time interval $[0, T]$, and let us enumerate them by
the counter $m = 1, \ldots, M$. Moreover, denote by
$u_m$ the first time the buffer becomes full in its $m$th
lossy busy period and by $v_m$, the end-time of that
busy period. Then (see Cassandras et al. 2002),

$$L'_{x,T}(\theta) = \sum_{m=1}^{M} (v_m - u_m).$$

In the example provided by Fig. 4, $L'_{x,T}(\theta) = v_1 - u_1$.

Fig. 4 State trajectory of the SFM


These equations for the IPA derivatives not
only are very simple, but require no knowledge
of the specific form of the processes $\{\alpha(t)\}$ or
$\{\beta(t)\}$, their realizations, or underlying probability
law. They depend only on limited information
which is directly observed from the sample
path and, hence, are said to be nonparametric or
model free. Furthermore, they have been shown 
to possess considerable robustness to modeling 
variations that do not cause significant alterations 
of the busy periods of the queue. A case of 
interest is when the SFM formalism is used as 
an abstraction of a queue. Then the above IPA 
formulas that were derived from an analysis of 
the SFM can be successfully applied to sample 
paths that are generated from the discrete queue. 
This is in contrast to the IPA formulas that are 
derived from the discrete queue, which generally 
are highly biased. 
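
The two IPA formulas above can be tried out on a small synthetic sample path. The following Python sketch is only an illustration under stated assumptions: the rates are taken to be piecewise constant, the buffer starts empty, and the function name `sfm_ipa` and the `(duration, alpha, beta)` input format are hypothetical choices, not part of the cited work. It integrates the buffer dynamics exactly over each piece, counts the lossy busy periods, and accumulates the times between the first overflow instant and the end of each lossy busy period.

```python
def sfm_ipa(segments, c):
    """Return (loss volume, cumulative workload, dLoss/dc, dWork/dc).

    segments: list of (duration, alpha, beta) with constant rates per piece
    (an illustrative input format). The buffer of size c starts empty.
    """
    t, x = 0.0, 0.0            # current time and buffer content
    loss, work = 0.0, 0.0      # loss volume and cumulative workload
    n_lossy, d_work = 0, 0.0   # N_T and the sum of (v_m - u_m)
    u = None                   # first time the buffer is full in this busy period

    for duration, alpha, beta in segments:
        rate = alpha - beta
        left = duration
        while left > 1e-12:
            if rate > 0 and x >= c:          # full buffer: fluid is lost
                if u is None:
                    u = t
                step = left
                loss += rate * step
                work += c * step
            elif rate > 0:                   # buffer is filling
                to_full = (c - x) / rate
                step = min(left, to_full)
                work += x * step + 0.5 * rate * step * step
                x = c if to_full <= left else x + rate * step
            elif rate < 0 and x > 0:         # buffer is draining
                to_empty = x / -rate
                step = min(left, to_empty)
                work += x * step + 0.5 * rate * step * step
                x = 0.0 if to_empty <= left else x + rate * step
                if x == 0.0 and u is not None:
                    n_lossy += 1             # a lossy busy period ends here
                    d_work += (t + step) - u
                    u = None
            else:                            # content stays constant
                step = left
                work += x * step
            t += step
            left -= step

    if u is not None:                        # lossy busy period still open at T
        n_lossy += 1
        d_work += t - u
    return loss, work, -n_lossy, d_work


# Example path: one lossy busy period followed by a lossless one.
path = [(2.0, 3.0, 1.0), (3.0, 0.5, 2.0), (1.0, 2.0, 1.0), (2.0, 0.0, 1.0)]
print(sfm_ipa(path, c=3.0))
```

Only the event times and a counter are used for the derivatives, which is the sense in which the estimators are model free; a finite-difference comparison with a slightly larger buffer is one simple way to sanity-check the output.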

All of these properties of IPA in the SFM
setting, including its unbiasedness, simplicity, the
nonparametric nature of its algorithms, and its robustness
to model variations, have had extensions
to SFM networks and systems beyond the basic
model (see Cassandras et al. 2010; Wardi et al.
2010; Yao and Cassandras 2013). As mentioned
above, this suggests the potential application of
IPA not only in system optimization via off-line
simulation but also in real-time control where
the sample paths are generated from the actual
system.

Summary and Future Directions 

In the past 10 years, the focus of research on
IPA has shifted from the setting of queueing
systems to the framework of SFMs. The main
reason for this shift is that IPA yields unbiased
gradient estimators for a considerably richer class
of networks and performance functions in the
SFM setting than for their analogous queueing
models. Furthermore, the algorithms for computing
the IPA gradients often are nonparametric,
robust to modeling variations, and very simple to
compute, and, hence, they hold out promise of
implementations in real-time control in addition
to off-line optimization.

In analogy with the extension of the scope of
IPA from queueing systems to stochastic timed
automata, the SFM framework has been extended
to stochastic hybrid systems (SHS), defined in
Cassandras et al. (2010). In such systems, including
SFMs, the functional description of the
time-driven dynamics is changed according to the
occurrence of specific events. However, whereas
in SFMs these dynamics are described by explicit
functional relations, in SHS they are expressed
by differential equations. The aforementioned,
appealing properties of IPA gradients in the SFM
setting appear to have extensions to the wider
context of SHS.

Future directions in the use of IPA are expected
to focus on the control of high-speed
large-volume networks and, more generally, the
control of stochastic hybrid systems.


Cross-References 

► Models for Discrete Event Systems: An 
Overview 

► Perturbation Analysis of Steady-State Performance
and Sensitivity-Based Optimization

Abstract

We introduce the theories and methodologies
that utilize the special features of discrete event
dynamic systems (DEDSs) for perturbation analysis
(PA) and optimization of steady-state performance.
Such theories and methodologies usually
take different perspectives from the traditional
optimization approaches and therefore may
lead to new insights and efficient algorithms.
The topics discussed include the gradient-based
optimization for systems with continuous parameters
and the direct-comparison-based optimization
for systems with discrete policies, which is
an alternative to dynamic programming and may
apply when the latter fails. Furthermore, these
new insights can also be applied to continuous-time
and continuous-state dynamic systems, leading
to a new paradigm of optimal control.


Keywords 

Gradient estimation; Sample-path techniques; 
Sensitivity analysis; Queueing networks 


Introduction 

In this chapter, we introduce the theories and 
methodologies that utilize the special features 
of discrete event dynamic systems (DEDSs) 
for perturbation analysis (PA) and optimization 
of steady-state performance. Such theories and 
methodologies usually take different perspectives 
from the traditional optimization approaches 
and therefore may lead to new insights and 
efficient algorithms. Furthermore, these new 
insights can also be applied to continuous-time
and continuous-state dynamic systems, leading to
a new paradigm of optimal control.

As discussed in ► Perturbation Analysis of
Discrete Event Systems, perturbation analysis
(PA) can be applied to both finite-period
performance and steady-state performance. This
chapter will mainly focus on the latter and a
related topic, the sensitivity-based optimization
of steady-state performance of stochastic discrete
event dynamic systems.


Gradient-Based Approaches 

Basic Ideas 

The gradient-based performance optimization of
discrete event dynamic systems (DEDSs) consists
of three steps:

1. Developing efficient algorithms to estimate
the performance gradients using the special
features of a DEDS

2. Studying the properties of the gradient estimates,
including investigating whether they
are unbiased and/or strongly consistent

3. With the gradient estimates, developing efficient
optimization algorithms

Steps 1 and 2 are referred to as PA, and Step 3 is
usually done together with standard gradient-based
optimization approaches, such as hill-climbing
approaches and stochastic
approximation approaches such as the Robbins-Monro
algorithm (Robbins and Monro 1951).

Our focus here is on PA for steady-state performance.
The main principle for estimating the
gradients of steady-state performance is decomposition.
In a DEDS, a small change in the
value of a system parameter induces a series of
changes on the system's sample path; each
such change is called a perturbation. A single
perturbation alone will affect the sample path and
therefore affect the system performance. Such an
effect is typically small, and therefore, the linear
superposition usually holds. Thus, the effect of a
small parameter change on the steady-state performance
can be determined by summing up the
effects of all the perturbations induced (or generated)
by the parameter change. This principle is
illustrated by Fig. 1, and it applies to many different
systems with different performance criteria.
Following the principle, efficient algorithms can
be developed, and their strong consistency can be
proved (Cao 2007). Its application to queueing
systems and Markov systems will be discussed in
the following two subsections.

Queueing Systems 

The gradient-based approach for DEDSs started
with queueing systems, where it is known as infinitesimal
perturbation analysis (IPA), or simply PA; see
► Perturbation Analysis of Discrete Event
Systems. Queueing systems are widely used
as a model for many DEDSs in the literature
and have unique structural features, and
PA utilizes such special features to develop
fast algorithms to estimate the performance
gradients.

We first give a brief explanation for the simple
rules of PA of queueing systems (Ho and Cao
1983, 1991). Consider a closed Jackson network
with $M$ servers and $N$ customers. The service
times of customers at server $i$ are exponentially
distributed with service rate $\mu_i$, $i = 1, 2, \ldots, M$.
After a customer completes its service at server
$i$, it goes to server $j$ with a routing probability
$q_{i,j}$, $i, j = 1, 2, \ldots, M$. The service discipline is
first come first served. The number of customers
at server $i$ is denoted as $n_i$, and the system
state is denoted as $n = (n_1, n_2, \ldots, n_M)$. The
state process is $n(t) = (n_1(t), n_2(t), \ldots, n_M(t))$
with $n_i(t)$ being the number of customers at
server $i$ at time $t$, $i = 1, 2, \ldots, M$, $t \ge 0$.
Define $T_l$ as the $l$th state transition time of the
process $n(t)$.


Perturbation Analysis of Steady-State Performance and Sensitivity-Based Optimization, Fig. 1 The decomposition of performance changes

Perturbation Analysis of Steady-State Performance and Sensitivity-Based Optimization, Fig. 2 Perturbation generation and propagation in a queueing network

Figure 2 illustrates an example of a sample
path where each stair-style line represents a trajectory
of the number of customers at one server,
and the customer transitions among servers are
indicated by dotted arrows.

An exponentially distributed service time with
rate $\mu$ can be generated according to $s = -\frac{1}{\mu} \ln \xi$,
where $\xi$ is a uniformly distributed random number
in $[0, 1]$. Now let the service rate of a server,
say server $k$, change from $\mu_k$ to $\mu_k + \Delta\mu_k$,
$\Delta\mu_k \ll \mu_k$. Then the service time $s$ will change
to $s' = -\frac{1}{\mu_k + \Delta\mu_k} \ln \xi$ (the same $\xi$ is used for both
$s$ and $s'$). Thus, the perturbation of the service
time induced by the change in $\mu_k$ is

$$\Delta s = s' - s \approx -\frac{\Delta\mu_k}{\mu_k}\, s. \tag{1}$$

In summary, because of a small (infinitesimal)
change of service rate $\Delta\mu_k$, every customer's
service completion time at server $k$ will obtain
(be delayed by) a perturbation that is proportional
to its original service time with a multiplier
$-\frac{\Delta\mu_k}{\mu_k}$. This is the rule of perturbation generation.
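
The generation rule (1) is easy to check numerically. The short Python sketch below is illustrative only; the helper names `service_time` and `perturbation` are hypothetical. It draws one service time by the inverse-transform formula, recomputes it with the same random number under a slightly larger rate, and compares the exact change with the first-order expression in (1).

```python
import math


def service_time(mu, xi):
    """Inverse-transform sample s = -(1/mu) * ln(xi), for xi in (0, 1]."""
    return -math.log(xi) / mu


def perturbation(mu, d_mu, xi):
    """Exact and first-order change of the service time when mu -> mu + d_mu."""
    s = service_time(mu, xi)
    exact = service_time(mu + d_mu, xi) - s   # same xi for both samples
    first_order = -(d_mu / mu) * s            # rule (1)
    return exact, first_order


exact, first_order = perturbation(mu=2.0, d_mu=1e-4, xi=0.3)
print(exact, first_order)
```

The two values agree up to a relative error of order $\Delta\mu_k / \mu_k$, which is why the rule is stated for infinitesimal changes.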

Next, we observe that a perturbation will affect
the service starting and completion times
of other customers in the same server and in
other servers in the network. We say a perturbation
will be propagated through the network.
First, the perturbations generated at a server will
accumulate until this server is idle. When the
server enters an idle period, all its perturbations
will be lost. (After the idle period, the service
starting time is determined by the customer that
terminates the idle period, which carries the perturbation
of another server.) Second, when a
customer finishes its service at server $i$ and enters
server $j$ after an idle period of server $j$, server $j$
will obtain the same amount of perturbations as
server $i$. If server $j$ is not idle at this time, the
perturbation at server $i$ will only affect the arrival
time of this customer and will not affect any other
customers/servers.

Figure 2 illustrates the perturbation generation
and propagation process of a network in which
the service rate of server 1 is decreased by
an infinitesimal amount. Given a system sample
path obtained by simulation or observation,
we can record the perturbations of all servers
as they are generated and propagated along the
sample path. From the perturbations we can get
a perturbed sample path and finally get the performance
changes caused by the change in the
service rate. Specifically, we have the following
algorithm for the perturbations of the service
completion times (if server $k$'s service rate is
perturbed).

Algorithm 1 Perturbation Analysis

1. Initialization: Set variables $\Delta_i = 0$, $i = 1, 2, \ldots, M$.

2. Perturbation generation: At the $l$th service completion
time of server $k$, set $\Delta_k := \Delta_k + s_{k,l}$, $l = 1, 2, \ldots$,
where $s_{k,l}$ is the service time of the $l$th customer at
server $k$.

3. Perturbation propagation: If a customer from server $i$
terminates an idle period of server $j$, set $\Delta_j := \Delta_i$,
$i, j = 1, 2, \ldots, M$.

In Step 2, we add $s_{k,l}$ as the perturbation
generated, instead of $-\frac{\Delta\mu_k}{\mu_k} s_{k,l}$ as indicated by (1).
The factor $-\frac{\Delta\mu_k}{\mu_k}$ will be canceled when estimating
performance derivatives with respect to
$\mu_k$. The algorithm yields a perturbed sample
path. With the original path and the perturbed
path, we can estimate the original and perturbed
(steady-state) throughput, then the estimates of
its derivative with respect to $\mu_k$ can be obtained,
and it is proved that the estimate is strongly
consistent. Only a few lines need to be added in
the simulation code to obtain the derivatives. Experiments
show that the results are very accurate
(error is around 5 %, compared with analytical
results, Ho and Cao 1983).
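
To make the "few lines added to the simulation code" concrete, the following Python sketch simulates a closed Jackson network and runs Algorithm 1 alongside it. It is a minimal illustration under assumptions that are not in the text: all customers start at server 0, the function name `pa_throughput_elasticity` is hypothetical, and the quantity returned is $\Delta / T_L$, where $T_L$ is the time of the last simulated service completion and $\Delta$ is the accumulated perturbation of the server at which it occurs. Under the first-order reading that $T_L$ shifts by $-\frac{\Delta\mu_k}{\mu_k}\Delta$, this ratio estimates the elasticity $\frac{\mu_k}{\eta}\frac{d\eta}{d\mu_k}$ of the throughput $\eta = L / T_L$.

```python
import math
import random


def pa_throughput_elasticity(mu, q, n_customers, k, n_events, seed=0):
    """Simulate a closed Jackson network and apply Algorithm 1 for server k.

    mu[i]: service rate of server i; q[i][j]: routing probabilities.
    Returns Delta / T_L, an estimate of (mu_k / eta) * d(eta) / d(mu_k).
    """
    rng = random.Random(seed)
    m = len(mu)
    n = [0] * m
    n[0] = n_customers          # assumption: all customers start at server 0
    delta = [0.0] * m           # Step 1: accumulated perturbations
    service = [0.0] * m         # service time of the customer in service
    finish = [math.inf] * m     # next completion time (inf if idle)

    def start(i, now):
        service[i] = -math.log(1.0 - rng.random()) / mu[i]
        finish[i] = now + service[i]

    start(0, 0.0)
    t, i = 0.0, 0
    for _ in range(n_events):
        i = min(range(m), key=finish.__getitem__)   # next completion
        t = finish[i]
        finish[i] = math.inf
        if i == k:
            delta[i] += service[i]                   # Step 2: generation
        j = rng.choices(range(m), weights=q[i])[0]   # routing decision
        n[i] -= 1
        if n[j] == 0:
            delta[j] = delta[i]                      # Step 3: propagation
        n[j] += 1
        for s in (i, j):                             # start pending services
            if n[s] > 0 and finish[s] == math.inf:
                start(s, t)
    return delta[i] / t


mu = [1.0, 2.0, 1.5]
q = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
elasticities = [pa_throughput_elasticity(mu, q, 4, k, 5000, seed=1)
                for k in range(3)]
print(elasticities, sum(elasticities))
```

Because the same seed reproduces the same sample path for every $k$, the three estimates can be compared directly. They should add up to one, since scaling all service rates by a common factor simply rescales time; this gives a quick consistency check on an implementation.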

Markov Systems 

The decomposition principle shown in Fig. 1 
has been applied to Markov systems (Cao 
2007). 

Consider an irreducible and aperiodic Markov
chain $X = \{X_n : n \ge 0\}$ on a finite state
space $\mathcal{S} = \{1, 2, \ldots, M\}$ with transition probability
matrix $P = [p(j|i)] \in [0, 1]^{M \times M}$.
Let $\pi = (\pi_1, \ldots, \pi_M)$ be the vector representing
its steady-state probabilities and $f = (f_1, f_2, \ldots, f_M)^T$
be the reward (or cost) vector,
where "T" represents transpose. We have $Pe = e$,
where $e = (1, 1, \ldots, 1)^T$ is an $M$-dimensional
vector whose components all equal 1, and we
have $\pi = \pi P$. We consider the long-term average
(steady-state) performance defined as

$$\eta = E_\pi(f) = \sum_{i=1}^{M} \pi_i f_i = \pi f = \lim_{L \to \infty} \frac{F_L}{L}, \quad \text{w.p. 1}, \tag{2}$$

where

$$F_L = \sum_{l=0}^{L-1} f(X_l).$$

Let $P'$ be another ergodic transition probability
matrix on the same state space. Suppose $P$
changes to $P(\delta) = P + \delta Q = \delta P' + (1 - \delta) P$,
with $\delta > 0$, $Q = P' - P = [q(j|i)]$, and
the reward function $f$ remains the same. We have
$Qe = 0$. The performance measure will change
to $\eta(\delta) = \eta + \Delta\eta(\delta)$. The derivative of $\eta$ in the
direction of $Q$ is defined as $\frac{d\eta}{d\delta} = \lim_{\delta \to 0} \frac{\Delta\eta(\delta)}{\delta}$.
In this discrete-state Markov system, a perturbation
means that the system is perturbed from
one state $i$ to another state $j$. For example,
consider the case where $p(i|k) = p(j|k) = \frac{1}{2}$
and $p(l|k) = 0$ for all $l \ne i, j$. Suppose that
these probabilities change to $p'(i|k) = \frac{1}{2} + \delta$,
$p'(j|k) = \frac{1}{2} - \delta$, and $p'(l|k) = 0$ for all $l \ne i, j$.
Then it may happen that at some time in the original
sample path the system transits from state $k$
to state $i$, but in the perturbed path it transits from
state $k$ to state $j$ instead. In this case, we say
that the change in transition probabilities induces
a perturbation from $i$ to $j$ at this time. To study
the effect of such a perturbation, we consider two
independent sample paths $X = \{X_n; n \ge 0\}$ and
$X' = \{X'_n; n \ge 0\}$ with $X_0 = i$ and $X'_0 = j$;
both of them have the same transition matrix $P$.
The average effect of a perturbation from $i$ to $j$,
$i, j = 1, \ldots, M$, on $F_L$ can be measured by the
perturbation realization factor defined as

$$d(i,j) = \lim_{L \to \infty} E\left[ \sum_{l=0}^{L-1} \big( f(X'_l) - f(X_l) \big) \,\Big|\, X_0 = i,\ X'_0 = j \right]. \tag{3}$$


The matrix $D \in \mathbb{R}^{M \times M}$, with $d(i,j)$ as its
$(i,j)$th element, is called a realization matrix.
We can prove that $D$ satisfies the equation (Cao
2007)

$$D - P D P^T = F, \tag{4}$$

where $F = e f^T - f e^T$. Because $d(i,j) = -d(j,i)$
for any $i, j$, we may define a vector
$g = (g(1), \ldots, g(M))^T$ such that

$$d(i,j) = g(j) - g(i), \tag{5}$$

and $D = e g^T - g e^T$. $g$ is called a performance
potential, which can be estimated with many
sample path-based algorithms, and it satisfies the
Poisson equation (Cao 2007)

$$(I - P + e\pi)\, g = f. \tag{6}$$

Intuitively, (3) and (5) indicate that every visit
to state $i$ contributes to $F_L$ on the average by the
amount of $g(i)$, so the effect of a perturbation
from $i$ to $j$ is $d(i,j) = g(j) - g(i)$. Now,
we consider a sample path consisting of $L$ transitions.
Among these transitions, on the average
there are $L\pi_i$ transitions at which the system
is at state $i$. After being at state $i$, the system
jumps to state $j$ on the average $L\pi_i\, p(j|i)$ times.
If the transition probability matrix $P$ changes
to $P(\delta) = P + \delta Q$, then the change in the
number of visits to state $j$ after being at state
$i$ is $L\pi_i\, q(j|i)\, \delta = L\pi_i [p'(j|i) - p(j|i)]\, \delta$.
This contributes a change of $\{L\pi_i [p'(j|i) - p(j|i)]\, \delta\}\, g(j)$
to $F_L$. Thus, the total change in $F_L$
due to the change of $P$ to $P(\delta)$ is

$$\Delta F_L = \sum_{i,j=1}^{M} L\pi_i [p'(j|i) - p(j|i)]\, \delta\, g(j) = L (\pi Q g)\, \delta.$$

Finally, we have

$$\frac{d\eta}{d\delta} = \lim_{\delta \to 0} \frac{1}{\delta} \frac{\Delta F_L}{L} = \pi Q g = \pi (P' - P) g. \tag{7}$$

If the reward function also changes from $f$ to
$f(\delta) = f + \delta (f' - f)$, then (7) becomes

$$\frac{d\eta}{d\delta} = \pi \left[ (P'g + f') - (Pg + f) \right]. \tag{8}$$
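
The derivative formula (7) can be checked on a small chain. The Python sketch below is illustrative only; the helper names (`solve`, `stationary`, `potential`, `directional_derivative`) and the 3-state example are assumptions made for this illustration. It computes $\pi$ from $\pi = \pi P$ with $\pi e = 1$, obtains $g$ from the Poisson equation (6), and evaluates $\pi (P' - P) g$, which can then be compared with a finite-difference quotient of $\eta$.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def stationary(p):
    """Steady-state probabilities: pi (I - P) = 0 with sum(pi) = 1."""
    n = len(p)
    a = [[(1.0 if i == j else 0.0) - p[j][i] for j in range(n)]
         for i in range(n)]
    a[-1] = [1.0] * n                      # normalization replaces one equation
    return solve(a, [0.0] * (n - 1) + [1.0])


def potential(p, f):
    """Performance potential g from the Poisson equation (I - P + e pi) g = f."""
    n = len(p)
    pi = stationary(p)
    a = [[(1.0 if i == j else 0.0) - p[i][j] + pi[j] for j in range(n)]
         for i in range(n)]
    return solve(a, f)


def average_reward(p, f):
    return sum(x * y for x, y in zip(stationary(p), f))


def directional_derivative(p, p_new, f):
    """Formula (7): pi (P' - P) g."""
    pi, g, n = stationary(p), potential(p, f), len(f)
    return sum(pi[i] * (p_new[i][j] - p[i][j]) * g[j]
               for i in range(n) for j in range(n))


P = [[0.5, 0.5, 0.0], [0.2, 0.3, 0.5], [0.4, 0.0, 0.6]]
P_new = [[0.1, 0.6, 0.3], [0.3, 0.3, 0.4], [0.2, 0.5, 0.3]]
f = [1.0, 3.0, 2.0]
print(directional_derivative(P, P_new, f))
```

In practice $g$ would typically be estimated from a sample path rather than solved for; the linear solve is used here only to keep the check exact and short.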

Further Works 

The ideas described in the previous subsections 
may stimulate new research topics in theoretical 
analysis, estimation algorithms, and applications. 
Here we can only give a very brief review for 
some of them. 


1. There is a large literature on whether the
PA-based derivative estimates are unbiased
(finite period) and/or strongly consistent
(steady state). This was first formulated in
Cao (1985) and was further discussed in
Heidelberger et al. (1988). By now, there
have been extensive studies in this direction:
proving the unbiasedness or consistency for
various systems and modifying the approach
for systems in which the IPA estimates are not
unbiased (e.g., Cassandras and Lafortune
1999; Fu and Hu 1997; Glasserman 1991).
This also includes the recently proposed
fluid model; see ► Perturbation Analysis of
Discrete Event Systems.

2. Another research topic is how to develop fast 
and efficient algorithms for estimating the 
performance gradients, especially in the case 
of Markov systems; see e.g., Cao and Wan 
(1998), Baxter and Bartlett (2001), and Cao 
(2005). This is called policy gradients in the 
reinforcement learning literature. 

3. There are also research works on how the
gradient estimates and policy iteration (see
section "Direct Comparison and Policy Iteration")
can be combined with stochastic approximation
approaches to develop fast convergent
optimization algorithms; see Marbach
and Tsitsiklis (2001) for a gradient-based approach
and Fang and Cao (2004) for a policy
iteration-based approach.


Direct Comparison and Policy 
Iteration 

The sensitivity-based view has been extended
to optimization in discrete spaces of policies.
With this view, we can develop a new approach
to performance optimization based on a direct
comparison of the performance of any two policies.
This provides an alternative to standard
dynamic programming for solving Markov
decision process (MDP) types of problems; it
also has been applied to solve some problems
for which the standard MDP approach fails (Cao 2007; Cao and
Wan 2013).

In an MDP, there is an action space denoted as
$\mathcal{A}$. For simplicity, we only consider the discrete
case. When the system is at any state $i \in \mathcal{S}$, an action
$a = d(i)$ is taken, which controls the system
transition probability, denoted as $p^a(j|i)$, $i, j \in \mathcal{S}$,
and the reward function, denoted as $f(i,a)$.
The mapping $d : \mathcal{S} \to \mathcal{A}$ is called a policy. Since
a policy corresponds to a transition probability
matrix $P^d = [p^{d(i)}(j|i)]_{i,j=1}^{M}$ and the reward
vector $f^d = (f(1,d(1)), \ldots, f(M,d(M)))^T$,
we also call the pair $(P, f)$ a policy.

Consider two policies $(P, f)$ and $(P', f')$,
and assume that the Markov chain under both
policies is ergodic. We use the prime " $'$ " to denote
the quantities associated with $(P', f')$. First,
multiplying both sides of the Poisson equation (6)
with $\pi'$ on the left and after some calculations, we
get

$$\eta' - \eta = \pi' \{ (P'g + f') - (Pg + f) \}. \tag{9}$$

This is called a performance difference formula, 
and many optimization results can be derived 
from it in an intuitive way. 
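
The difference formula can be verified numerically for any pair of ergodic policies. The Python sketch below is a hedged illustration: the helper names and the two 3-state policies are assumptions introduced here, and $\pi$ and $g$ are obtained by direct linear solves. It returns both sides of the formula so that they can be compared.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def stationary(p):
    """Steady-state probabilities: pi (I - P) = 0 with sum(pi) = 1."""
    n = len(p)
    a = [[(1.0 if i == j else 0.0) - p[j][i] for j in range(n)]
         for i in range(n)]
    a[-1] = [1.0] * n
    return solve(a, [0.0] * (n - 1) + [1.0])


def potential(p, f):
    """Performance potential g from (I - P + e pi) g = f."""
    n = len(p)
    pi = stationary(p)
    a = [[(1.0 if i == j else 0.0) - p[i][j] + pi[j] for j in range(n)]
         for i in range(n)]
    return solve(a, f)


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def difference_formula(p, f, p_new, f_new):
    """Return (eta' - eta, pi' [(P'g + f') - (Pg + f)]) for two policies."""
    g = potential(p, f)                    # potentials of the first policy only
    pi_new = stationary(p_new)
    lhs = dot(pi_new, f_new) - dot(stationary(p), f)
    rhs = sum(pi_new[i] * ((dot(p_new[i], g) + f_new[i]) - (dot(p[i], g) + f[i]))
              for i in range(len(f)))
    return lhs, rhs


P = [[0.5, 0.5, 0.0], [0.2, 0.3, 0.5], [0.4, 0.0, 0.6]]
f = [1.0, 3.0, 2.0]
P_new = [[0.1, 0.6, 0.3], [0.3, 0.3, 0.4], [0.2, 0.5, 0.3]]
f_new = [0.5, 2.0, 4.0]
print(difference_formula(P, f, P_new, f_new))
```

Note that the right-hand side uses the potential $g$ of the first policy only, together with $\pi'$ of the second; no potential of the second policy is needed.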

Policy Iteration and the Optimality Equation 
The difference formula (9) has a nice decomposition
structure: it contains two factors, the first
one $\pi'$, which does not depend on $P$, and the
second one $(P'g + f') - (Pg + f)$, in which all
the parameters are known except the performance
potential $g$, which can be obtained by only analyzing
the system with $P$. This nice feature makes
the difference formula the basis of performance
optimization.

For two $M$-dimensional vectors $a$ and $b$, we
define $a = b$ if $a(i) = b(i)$ for all $i = 1, 2, \ldots, M$;
$a \le b$ if $a(i) \le b(i)$ for all $i = 1, 2, \ldots, M$;
$a < b$ if $a(i) < b(i)$ for all $i = 1, 2, \ldots, M$;
and $a \preceq b$ if $a(i) < b(i)$ for at least
one $i$ and $a(j) = b(j)$ for other components.
The relation $\le$ includes $=$, $\preceq$, and $<$. Similar
definitions are used for the relations $\ge$, $\succeq$, and
$>$.

Next, we note that $\pi'(i) > 0$ for all $i = 1, 2, \ldots, M$.
Thus, from (9), we know that if
$(P' - P)g + (f' - f) \succeq 0$, then $\eta' - \eta > 0$.

From (9) and the fact $\pi' > 0$, the proof of the
following lemma is straightforward.

Lemma 1 If $Pg + f \preceq (\succeq)\ P'g + f'$, then
$\eta < (>)\ \eta'$.

It is interesting to note that in the lemma, we
use only the potentials of one Markov chain,
i.e., $g$. Thus, because of the special structure of
the performance difference formula (9), if the
condition in Lemma 1 holds, to compare the
performance measures under two policies, we
need only the potentials of one policy.

Policy iteration and the optimality equation
can be easily derived from (9) and Lemma 1.


Algorithm 2 Policy Iteration

1. Guess an initial policy $d_0$, and set $k = 0$.

2. (Policy evaluation) Obtain the potential $g^{d_k}$ by solving
the Poisson equation $(I - P^{d_k}) g^{d_k} + \eta^{d_k} e = f^{d_k}$ or
estimating it on a sample path. (The superscript "$d_k$"
is added to quantities associated with policy $d_k$.)

3. (Policy improvement) Choose

$$d_{k+1} \in \arg\max_{d} \left[ f^{d} + P^{d} g^{d_k} \right], \tag{10}$$

component-wise (i.e., to determine an action for each
state). If in state $i$, action $d_k(i)$ attains the maximum,
set $d_{k+1}(i) = d_k(i)$.

4. If $d_{k+1} = d_k$, stop; otherwise, set $k := k + 1$ and go
to Step 2.
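
The steps of Algorithm 2 can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the data layout (`p[a][i][j]` for transition probabilities and `f[i][a]` for rewards), the function names, and the small example are assumptions, and the potentials are obtained by a direct linear solve rather than by sample-path estimation. All transition probabilities in the example are positive so that every policy is ergodic.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def stationary(p):
    n = len(p)
    a = [[(1.0 if i == j else 0.0) - p[j][i] for j in range(n)]
         for i in range(n)]
    a[-1] = [1.0] * n
    return solve(a, [0.0] * (n - 1) + [1.0])


def potential(p, f):
    n = len(p)
    pi = stationary(p)
    a = [[(1.0 if i == j else 0.0) - p[i][j] + pi[j] for j in range(n)]
         for i in range(n)]
    return solve(a, f)


def average_reward(p, f, d):
    """Long-run average reward of policy d (one action index per state)."""
    p_d = [p[d[i]][i] for i in range(len(d))]
    f_d = [f[i][d[i]] for i in range(len(d))]
    return sum(x * y for x, y in zip(stationary(p_d), f_d))


def policy_iteration(p, f):
    """p[a][i][j]: transition probabilities; f[i][a]: rewards."""
    n_states, n_actions = len(f), len(p)
    d = [0] * n_states                                   # Step 1
    while True:
        p_d = [p[d[i]][i] for i in range(n_states)]
        f_d = [f[i][d[i]] for i in range(n_states)]
        g = potential(p_d, f_d)                          # Step 2
        new_d = []
        for i in range(n_states):                        # Step 3, per state
            q = [f[i][a] + sum(p[a][i][j] * g[j] for j in range(n_states))
                 for a in range(n_actions)]
            best = max(q)
            # keep the current action if it attains the maximum
            new_d.append(d[i] if q[d[i]] >= best - 1e-12 else q.index(best))
        if new_d == d:                                   # Step 4
            return d, average_reward(p, f, d)
        d = new_d


p = [
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.3, 0.3, 0.4]],   # action 0
    [[0.2, 0.2, 0.6], [0.5, 0.1, 0.4], [0.1, 0.7, 0.2]],   # action 1
]
f = [[1.0, 0.5], [2.0, 2.5], [0.0, 0.8]]
print(policy_iteration(p, f))
```

For a problem this small, the result can be confirmed by enumerating all eight policies and comparing their average rewards.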


It follows directly from Lemma 1 that when
the iteration does not stop, the performance improves
at each iteration. It can be proved easily by
construction that the iteration stops at an optimal
policy. Again, from Lemma 1, when the iteration
stops at a policy $\hat{d}$, it holds that

$$P^{\hat{d}} g^{\hat{d}} + f^{\hat{d}} = \max_{d} \left\{ f^{d} + P^{d} g^{\hat{d}} \right\}. \tag{11}$$

This is the Hamilton-Jacobi-Bellman (HJB) optimality
equation. If we further define the Q-factor,

$$Q^{d}(i,a) = f(i,a) + \sum_{j} p^{a}(j|i)\, g^{d}(j), \tag{12}$$

then the policy iteration equation (10) and the
HJB equation (11) become

$$d_{k+1}(i) := \arg\max_{a \in \mathcal{A}} \left\{ Q^{d_k}(i,a) \right\} \tag{13}$$

and

$$Q^{\hat{d}}(i, \hat{d}(i)) = \max_{\beta \in \mathcal{A}} \left\{ Q^{\hat{d}}(i,\beta) \right\}. \tag{14}$$

The only difference between (8) and (9) is
that $\pi$ in (8) is replaced by $\pi'$ in (9). This leads
to an interesting observation: policy iteration in
MDPs in fact chooses the policy with the steepest
directional derivative as the policy in the next
iteration. Therefore, policy iteration in fact can
be viewed as the "gradient-based" optimization
in a discrete space.

In our approach, the HJB equation and policy
iteration are obtained from the performance
difference equation (9), which compares the performance
of any two policies. It is hence called
a direct-comparison-based approach. This approach
also applies to more general problems,
such as multichain Markov systems, systems with
absorbing states, and problems with other performance
criteria such as the discounted performance
and the bias.

The direct-comparison approach has been successfully
applied to the $n$th-bias optimality problem
(Cao 2007). Essentially, starting with the
performance difference formulas (those similar
to (9) for different performance criteria), we can develop
a simple and direct approach to derive the
results that are equivalent to the sensitive discount
optimality for multichain Markov systems
with long-run average criteria (Puterman 1994),
and no discounting is needed and no dynamic
programming is used. The approach, motivated
by the development for discrete event dynamic
systems, provides a clear overall picture for the
area of MDP.

The direct-comparison approach can also be
applied to some problems where dynamic programming
fails, including the event-based optimization
problems, where the sequence of events
may not be Markovian; see the next section.

Event-Based Optimization 

It is well known that for most systems modeled
by Markov processes, the state spaces are too
large, and it is not computationally feasible to
implement policy iteration or to solve the HJB
equations. On the other hand, in many practical
problems in engineering, finance, and social sciences,
control actions are only taken when certain
events occur. For example, in the traffic control
of a network of subnetworks, often one
cannot control the traffic in the same subnetwork,
and control actions are only applied when there
are packets transferring among different subnetworks.
In a portfolio management problem, the
investor sells or buys stocks when the price history
exhibits some predetermined patterns
(e.g., reaches some level). In a sensor network,
actions are taken only when one of the sensors
detects some abnormal situations. In a material
handling problem, actions are taken when the
inventory level falls below a certain threshold.

Conceptually, anything that happened in the past
can be chosen as an event. However, the number
of such events is too big (much bigger than the
number of states), and studying all these events
makes analysis infeasible and defeats our original
purpose. Therefore, to properly define events, we
need to strike a balance between generality on
one hand and applicability on the other.

In the event-based setting, an event $e$ is defined
as a set of state transitions with certain
common properties. That is,
$e := \{(i,j) : i, j \in \mathcal{S} \text{ and } (i,j) \text{ has common properties}\}$,
where $(i,j)$ denotes a state transition from $i$ to
$j$. This definition can also be easily generalized
to represent a finite sequence of state transitions.
We shall see that in many real problems, the
number of events requiring control actions is
usually much smaller than that of the states.

An event-based policy $d$ is defined as a mapping
from $\mathcal{E}$ to $\mathcal{A}$, with $\mathcal{E}$ being the space of all
events. That is, $d : \mathcal{E} \to \mathcal{A}$. Let $\mathcal{D}_e$ denote the
set of all the stationary and deterministic policies.
The reward function under policy $d$ is denoted
as $f^d = f(i, d(e))$, and the associated long-run
average performance is denoted as $\eta^d$. When an
event $e$ happens, we choose an action $a = d(e)$
according to a policy $d$, where $e \in \mathcal{E}$ and $d \in \mathcal{D}_e$.
Our goal is to find an optimal policy $\hat{d}$ which
maximizes the long-run average performance as
follows.

$$\hat{d} = \arg\max_{d \in \mathcal{D}_e} \left\{ \eta^{d} \right\}. \tag{15}$$

The main difficulty in developing the event- 
based optimization theories and algorithms lies in 
the fact that the sequence of events is usually not 
Markovian. The standard optimization approach 
such as dynamic programming does not apply 
to such problems. However, we can apply the 
direct-comparison approach to this event-based 
optimization problem. Here we give a very brief 
discussion. 

Consider an event-based policy d, d e 
V e . When event e occurs, the conditional 
transition probability is denoted as p d( ^ e \j \i,e). 
Let 7t d (i\e) be the conditional steady-state 
probability of state i when event e occurs under 
policy d . Define the aggregated Q-factor 

Q d (e,a ) = X!,- n d (i\e) 
x [/(/,«) + X; P a (j\i,e)g d (j)] • (16) 

We may use these aggregated Q-factors to de¬ 
velop an event-based policy iteration algorithms 
as Algorithm 2 and obtain the policy iteration 
equation (cf. (13)) 

dk+\(e) := argmax{Q^(e, a)}, e e £, 
aeA 

(17) 

and the event-based HJB optimality equation (cf. 
(14)) is 

Q d (e,a) = max{Q S (e,p)}. (18) 
fieA 

It can be proved that if the conditional probability $\pi^h(i|e)$ does not
depend on the policy, i.e.,

$$\pi^h(i|e) = \pi^d(i|e), \quad \forall\, i, e, \text{ and } h, d, \qquad (19)$$

then policy iteration (17) indeed leads to a sequence of increasing
performance and (18) specifies an event-based optimal policy.
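
As an illustration of (16) and (17), the following minimal Python sketch computes the aggregated Q-factors and performs one greedy improvement step. It assumes that the conditional probabilities $\pi^d(i|e)$ and the potentials $g^d(j)$ are already available (for example, estimated on a sample path); the function names and the nested-list layout (`pi_cond[e][i]`, `f[i][a]`, `p[a][e][i][j]`, `g[j]`) are hypothetical choices made for this example only.

```python
def aggregated_q(pi_cond, f, p, g):
    """Aggregated Q-factors Q[e][a] of (16).

    pi_cond[e][i] : conditional steady-state probability of state i given event e
    f[i][a]       : reward in state i under action a
    p[a][e][i][j] : transition probability to state j given state i, event e, action a
    g[j]          : potential of state j under the current policy
    """
    n_events, n_states, n_actions = len(pi_cond), len(g), len(f[0])
    q = [[0.0] * n_actions for _ in range(n_events)]
    for e in range(n_events):
        for a in range(n_actions):
            for i in range(n_states):
                expected_g = sum(p[a][e][i][j] * g[j] for j in range(n_states))
                q[e][a] += pi_cond[e][i] * (f[i][a] + expected_g)
    return q


def improve_policy(q):
    """Greedy step of (17): for each event pick the action with the largest Q."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]
```

The improvement step is only guaranteed to increase the performance when condition (19) holds; otherwise it should be regarded as an approximation.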

The event-based aggregated Q-factor (16) can 
be estimated on a sample path in the same way as 
for potentials (Cao 2007). The number of events
requiring actions is usually much smaller than the
number of states, and in some cases, such as the
network admission control problem, it is linear in
the system size.

The crucial condition for the above event-based optimization is (19). There
are many problems, such as the control of networks of networks and the
portfolio management problem, for which the condition holds; there are also
many problems for which the condition does not hold. The error in applying
(18) and policy iteration (17) comes from the difference between
$\pi^h(i|e)$ and $\pi^d(i|e)$ at each iteration. Further research is needed
on this issue.

We use a computer network as an example, in which each computer or router
can be modeled as a queueing subnetwork and the computer network is then a
network of such subnetworks. Assume that there are $M$ subnetworks, and
subnetwork $m$, $m = 1, 2, \dots, M$, consists of $k_m$ servers. The number
of customers at the $j$th server of subnetwork $m$ is denoted as $n_{m,j}$,
$j = 1, 2, \dots, k_m$, and the number of customers in all the servers in
subnetwork $m$ is denoted as $N_m = \sum_{j=1}^{k_m} n_{m,j}$. Suppose the
service times are exponentially distributed; then the system state is
$n := (n_{1,1}, \dots, n_{1,k_1}, \dots, n_{M,1}, \dots, n_{M,k_M})$ and the
aggregated state is $N := (N_1, \dots, N_M)$. Suppose that the transition
probabilities among the servers in the same subnetwork are fixed and we can
only control the transition probabilities among the subnetworks, and
furthermore, we can only observe $N$. Then the problem can be modeled as an
event-based optimization with an event being a customer transition among
subnetworks that occurs while the aggregated state is $N$. It can be proved
that the condition (19) holds, and therefore, we may apply policy iteration
(17) and HJB equation (18).

Many existing problems fit the event-based framework. For example, in a
partially observable Markov decision process (POMDP), we may define an
observation, or a sequence of observations, as an event. Other examples
include state and time aggregations, hierarchical control (hybrid systems),
and options. Different events can be defined to capture the special features
in these different problems. In this sense, the event-based approach may
provide a unified view of these different problems (Cao 2007).


The Sensitivity-Based Approach and 
Optimal Control 

We refer to the approaches discussed in the previous sections as the
sensitivity-based approach. When the parameters are continuous, it is the
gradient-based approach, and the focus is on developing efficient algorithms
that utilize the particular system structure to estimate the gradients (to
justify the algorithms, the unbiasedness or the consistency of the estimates
should be proved). When the parameters are discrete, it is the
direct-comparison approach that leads to policy iteration and the HJB
equations, and policy iteration can be viewed as the gradient-based method
in discrete spaces. This approach is different from conventional dynamic
programming, and it has been successfully applied to MDPs with different
criteria and to the $n$th-bias optimization problems, as well as to
event-based optimization and some other problems for which dynamic
programming fails.

This sensitivity-based approach was motivated 
by the study of discrete event dynamic systems 
(DEDS). It has been realized that the principles 
and methodologies developed for DEDSs also 
apply to the optimization of continuous-time and 
continuous-state (CTCS) systems. Here are some 
examples: 

1. CTCS systems: For CTCS systems, the dynamics are driven by Brownian
motions or Lévy processes. The transition probability matrix in DEDS
should be replaced by the infinitesimal generator in CTCS, which is an
operator on the space of continuous functions. With the performance
difference formulas, we can redevelop the stochastic optimal control
theory for many performance measures, including the long-run average,
discounted performance, and finite horizon problems, with no dynamic
programming; see Cao et al. (2011).

2. Time-inconsistent optimization problems: In behavioral finance, people's
preference is modeled with a distorted probability. For example, a
risk-taking person buys lottery tickets because, in her/his mind, s/he
enlarges the probability of winning a large sum, and a risk-averse person
buys insurance because s/he is afraid of a big loss and therefore enlarges
its probability.

The optimization problem with a distorted probability suffers from the
time-inconsistency issue; i.e., an optimal policy for the problem in period
$[t, T)$, $0 < t < T$, is not optimal in the same period for the problem in
$[0, T]$. Thus, standard dynamic programming fails.

The gradient-based approach has been applied to the portfolio management
problem with probability distortion. With this approach, we discovered that
the performance with distorted probability maintains some sort of linearity
called mono-linearity. This property sheds new light on the portfolio
management problem and the nonlinear expected utility theory (Cao and Wan
2013).

Conclusion 

A sensitivity-based approach has been developed for the performance
optimization of discrete event dynamic systems (DEDS). The approach utilizes
the dynamic structure of DEDS. For systems with continuous parameters, it is
the gradient-based optimization, in which the special feature of a DEDS
helps in developing efficient algorithms to estimate the performance
derivatives; for systems with discrete policies, it is the
direct-comparison-based approach, with which policy iteration and HJB
equations can be derived intuitively by using the performance difference
formulas. Policy iteration can be viewed as the gradient method in a
discrete space. The estimation of gradients and the implementation of policy
iteration can be carried out on a given sample path, and efficient online
learning algorithms can be developed (Cao 2007).

The sensitivity-based approach was developed 
for DEDSs, but its principle also applies to 
systems with continuous-state spaces. The 
approach provides an alternative to the traditional 
dynamic programming and therefore can be 
applied to some problems where dynamic 
programming does not work.

► Perturbation Analysis of Discrete Event Sys¬

Synonyms

Proportional-Integral-Derivative Control 

Abstract 

Since their introduction in industry a century ago,
proportional-integral-derivative (PID) controllers have become the de facto
standard for the process industry. In this entry, fundamentals of PID
control are outlined, starting from the basic control law. Additional
functionalities and the tuning and automatic tuning of the parameters are
then considered.


Keywords 

Anti-windup; Autotuning; Controller tuning; 
Derivative action; PI control; Proportional 
control; Proportional-integral-derivative control; 
Ziegler-Nichols 

Introduction 

A proportional-integral-derivative (PID) controller is a three-term
controller that has a long history in the automatic control field, starting
from the beginning of the last century. Owing to its intuitiveness and
relative simplicity, in addition to the satisfactory performance that it is
able to provide with a wide range of processes, it has become the de facto
standard controller in industry. It has been evolving along with the
progress of technology, and nowadays it is very often implemented in digital
form rather than with pneumatic or electrical components. It can be found in
virtually all kinds of control equipment, either as a stand-alone
(single-station) controller or as a functional block in Programmable Logic
Controllers (PLCs) and Distributed Control Systems (DCSs). The new
capabilities offered by digital technology and software packages have led to
a significant growth of research in the PID control field: new effective
tools have been devised to improve the analysis and design methods of the
basic algorithm as well as the additional functionalities that are
implemented with the basic algorithm in order to increase its performance
and its ease of use.

The success of the PID controllers is also enhanced by the fact that they
often represent the fundamental component for more sophisticated control
schemes that can be implemented when the basic control law is not sufficient
to achieve the required performance or when a more complicated control task
is of concern.

Basics 

Using a PID controller means applying a feedback controller that consists of
the sum of three types of control actions: a proportional action, an
integral action, and a derivative action.

The proportional control action is proportional to the current control
error, according to the expression

$$u(t) = K_p e(t) = K_p (r(t) - y(t)), \qquad (1)$$

where $u$ is the controller output, $K_p$ is the proportional gain, $r$ is
the reference signal, and $y$ is the process output. Its meaning is
straightforward, since it implements the typical operation of increasing the
control variable when the control error is large (with appropriate sign).
The transfer function of a proportional controller can be trivially derived
as

$$C(s) = K_p. \qquad (2)$$

The main drawback of using a pure proportional controller is that, in
general, it cannot set the steady-state error to zero. This motivates the
addition of a bias (or reset) term $u_b$, namely,

$$u(t) = K_p e(t) + u_b. \qquad (3)$$

The value of $u_b$ can then be adjusted manually until the steady-state
error is reduced to zero.

In commercial products, the proportional gain is often replaced by the
proportional band $PB$, which is the range of error that causes a full-range
change of the control variable, i.e.,

$$PB = \frac{100}{K_p}. \qquad (4)$$

The integral action is proportional to the integral of the control error,
i.e.,

$$u(t) = K_i \int_0^t e(\tau)\, d\tau, \qquad (5)$$

where $K_i$ is the integral gain. It appears that the integral action is
related to the past values of the control error. The corresponding transfer
function is

$$C(s) = \frac{K_i}{s}. \qquad (6)$$

The presence of an integral action makes it possible to reduce the
steady-state error to zero when a step reference signal is applied or a step
load disturbance occurs. In other words, the integral action is able to set
automatically the correct value of $u_b$ in (3) so that the steady-state
error is zero. For this reason, the integral action is also often called
automatic reset.
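
The role of the integral action as an automatic reset can be checked numerically. The sketch below is a minimal example, assuming a first-order process $\tau \dot{y} = -y + \mu u$ integrated with the forward-Euler method; the process model, its parameters `mu` and `tau`, and the function name are assumptions made for illustration and are not part of this entry.

```python
def simulate(kp, ki, r=1.0, mu=2.0, tau=5.0, h=0.01, t_end=200.0):
    """Return (final control error, final integral term) of a P or PI loop."""
    y = 0.0
    integral = 0.0
    for _ in range(int(t_end / h)):
        e = r - y
        integral += h * ki * e        # integral action, cf. (5)
        u = kp * e + integral         # proportional plus integral action
        y += h * (-y + mu * u) / tau  # forward-Euler step of the assumed process
    return r - y, integral


error_p, _ = simulate(kp=2.0, ki=0.0)       # proportional action only
error_pi, reset = simulate(kp=2.0, ki=1.0)  # proportional and integral actions
```

With these assumed values, the proportional-only loop settles with the error $r/(1 + \mu K_p) = 0.2$, whereas with the integral action the error vanishes and the integral term converges to $r/\mu = 0.5$, which is the value that the bias term in (3) would otherwise have to be given manually.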

While the proportional action is based on the current value of the control
error and the integral action is based on the past values of the control
error, the derivative action is based on the predicted future values of the
control error. An ideal derivative control law can be expressed as

$$u(t) = K_d \frac{de(t)}{dt}, \qquad (7)$$

where $K_d$ is the derivative gain. The corresponding controller transfer
function is

$$C(s) = K_d s. \qquad (8)$$


The meaning of the derivative action can be better understood by considering
the first two terms of the Taylor series expansion of the control error at
time $T_d$ ahead:

$$e(t + T_d) \approx e(t) + T_d \frac{de(t)}{dt}. \qquad (9)$$

If a control law proportional to this expression is considered, i.e.,

$$u(t) = K_p \left( e(t) + T_d \frac{de(t)}{dt} \right), \qquad (10)$$

this naturally results in a PD controller. The control variable at time $t$
is therefore based on the predicted value of the control error at time
$t + T_d$. For this reason, the derivative action is also called
anticipatory control, or rate action, or pre-act.

The combination of the proportional, integral, and derivative actions can be
done in different ways. In the so-called ideal or non-interacting form, the
PID controller is described by the following transfer function:

$$C_i(s) = K_p \left( 1 + \frac{1}{T_i s} + T_d s \right), \qquad (11)$$

where $K_p$ is the proportional gain, $T_i$ is the integral time constant,
and $T_d$ is the derivative time constant. An alternative representation is
the series or interacting form:

$$C_s(s) = K'_p \left( 1 + \frac{1}{T'_i s} \right) (T'_d s + 1) = K'_p \frac{(T'_i s + 1)(T'_d s + 1)}{T'_i s}, \qquad (12)$$

where the fact that a modification of the value of the derivative time
constant $T'_d$ affects also the proportional action justifies the
nomenclature adopted. Suitable conversion formulae can be applied to obtain
an ideal PID controller equivalent to a series one. Obtaining an equivalent
PID controller in series form starting from an ideal one is possible only if
the zeros of the ideal PID controller are real.
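
The conversion between the two forms can be sketched in a few lines. The formulae below follow from expanding (12) and matching coefficients with (11); the function names are hypothetical, and the series parameters returned by the second function use the convention $T'_i \geq T'_d$, which is an assumption of this sketch.

```python
import math


def series_to_ideal(kp_s, ti_s, td_s):
    """Ideal-form (kp, ti, td) equivalent to the series form (12)."""
    ti = ti_s + td_s
    return kp_s * ti / ti_s, ti, ti_s * td_s / ti


def ideal_to_series(kp, ti, td):
    """Series-form parameters equivalent to the ideal form (11).

    The zeros of the ideal controller are real only if ti >= 4 * td;
    otherwise no equivalent series controller exists.
    """
    disc = 1.0 - 4.0 * td / ti
    if disc < 0.0:
        raise ValueError("complex zeros: no equivalent series form")
    root = math.sqrt(disc)
    return 0.5 * kp * (1.0 + root), 0.5 * ti * (1.0 + root), 0.5 * ti * (1.0 - root)
```

For instance, the series parameters $(2, 4, 1)$ map to the ideal parameters $(2.5, 5, 0.8)$, and the inverse conversion recovers the original values.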


Additional Functionalities 

Expressions (11) and (12) are not employed as such in practical cases
because of a few problems that can be solved with suitable modifications of
the basic control law.

Modifications of the Derivative Action 

From Expressions (11) and (12), it appears that the controller transfer
function is not proper, because of the derivative action, and therefore, it
cannot be implemented in practice. Indeed, the high-frequency gain of the
pure derivative action is responsible for the amplification of the
measurement noise in the manipulated variable. This problem can be solved by
filtering the derivative action with (at least) a first-order low-pass
filter. The filter time constant should be selected in order to suitably
filter the noise while avoiding a significant influence on the dominant
dynamics of the PID controller. Thus, it can be selected as $T_d/N$, where
$N$ generally assumes a value between 1 and 33, although in the majority of
practical cases its setting falls between 8 and 16. Alternatively, the
overall control variable can be filtered.
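
For reference, when only the derivative term is filtered with a first-order filter of time constant $T_d/N$, the ideal form (11) is commonly written as follows (a sketch of one common choice, not necessarily the only one used in commercial products):

```latex
C(s) = K_p \left( 1 + \frac{1}{T_i s} + \frac{T_d s}{1 + (T_d/N)\, s} \right)
```

In this form the high-frequency gain tends to $K_p (1 + N)$ instead of growing without bound, which makes the role of $N$ as a limit on noise amplification explicit.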

Another issue related to the derivative action that has to be considered is
the so-called derivative kick. In fact, when an abrupt (stepwise) change of
the set-point signal occurs, the derivative action is very large, and this
results in a spike in the control variable signal, which is undesirable.
This problem can be simply avoided by applying the derivative term to the
process output only instead of the control error. In this case, the ideal
(not filtered) derivative action becomes

$$u(t) = -K_p T_d \frac{dy(t)}{dt}. \qquad (13)$$

Obviously, when the set-point signal is constant, applying the derivative
term to the control error or to the process variable is equivalent. Thus,
the load disturbance rejection performance is the same in both cases.

Set-Point Weighting for Proportional 
Action 

A typical problem with the design of a feedback controller is to achieve a
high performance both in the set-point following task and in the load
disturbance rejection task at the same time. For example, for stable
processes, a fast load disturbance rejection is achieved with a high-gain
(aggressive) controller, which, on the other hand, gives an oscillatory
set-point step response. This problem can be approached by using a
two-degree-of-freedom control architecture, where a feedback controller is
designed to achieve a high bandwidth and therefore a satisfactory load
disturbance rejection performance, and then the set-point signal is filtered
before applying it to the closed-loop system.

In the context of PID control, this can be achieved by weighting the
set-point signal for the proportional action, that is, by defining the
proportional action as follows:


$$u(t) = K_p (\beta r(t) - y(t)), \qquad (14)$$

where the value of $\beta$ is between 0 and 1.

In this way, the control scheme has a feedback controller (11), and the
set-point signal is filtered by the system

$$\frac{1 + \beta T_i s + T_i T_d s^2}{1 + T_i s + T_i T_d s^2}. \qquad (15)$$

The load disturbance rejection task is decoupled from the set-point
following task, and obviously it does not depend on the weight $\beta$,
which can be employed to smooth the (step) set-point signal in order to damp
the response to a set-point change. The smaller the value of $\beta$, the
smaller the overshoot and the higher the rise time.

Anti-windup

One of the most well-known possible sources of performance degradation is
the so-called integrator windup phenomenon, which occurs when the controller
output saturates (typically when a large set-point change occurs). In this
case, the system operates as in the open-loop case, since the actuator is at
its maximum (or minimum) limit, regardless of the process output value. The
control error decreases more slowly than in the ideal case (where there are
no saturation limits), and therefore, the integral term becomes large (it
winds up). Thus, even when the value of the process variable attains that of
the reference signal, the controller still saturates due to the integral
term, and this generally yields large overshoots and settling times.

In order to cope with this problem, an additional functionality designed for
this purpose can be conveniently used. This can be done in different ways.
For example, in the conditional integration approach, the integration is
stopped when the control variable saturates and the control error and the
control variable have the same sign. Alternatively, in the back-calculation
approach, the integral term is recomputed when the controller saturates by
feeding back the difference of the saturated and unsaturated control signal.

Tuning


The selection of the PID parameters, i.e., the tuning of the PID controller,
is obviously the crucial issue in the overall controller design. This
operation should be performed in accordance with the control specifications
(which should take into account the set-point following, the load
disturbance rejection, the control effort, and the robustness of the
system). A major advantage of the PID controller is that its parameters have
a clear physical meaning, and therefore, manual tuning is relatively simple.
For example, for stable processes, increasing the proportional gain leads,
in general, to a faster but more oscillatory response. In fact, by
increasing $K_p$ for the same value of the control error, the proportional
control action increases, and so does the aggressiveness of the controller.
Then, increasing the integral time constant (i.e., decreasing the effect of
the integral action) results, in general, in a slower response but in a more
damped system. This is because a larger value of $T_i$ implies a smaller
value of the control action at a given time instant of the transient
response (assuming the same values of the past control errors). Finally,
increasing the derivative time constant gives a damping effect. Indeed, if
the set point is constant, the derivative action is proportional to the
derivative of the process variable with a negative sign, and therefore, the
derivative action increases (with a negative sign) when the slope of the
transient response increases (so that a big overshoot is avoided). However,
much care should be taken to avoid increasing the derivative time constant
too much, as an opposite effect might occur in this case and an unstable
system could eventually result. This is because the prediction over too long
a time interval might be wrong.
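
The qualitative effects described above can be explored numerically. The following is a minimal discrete-time sketch, not a reference implementation: it combines the ideal form (11) with set-point weighting as in (14), a derivative term applied to the process output and filtered with time constant $T_d/N$ (backward-Euler discretization), and back-calculation anti-windup. The class and function names, the tracking time constant `tt`, the saturation limits, and the first-order test process in `simulate` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class PID:
    kp: float
    ti: float
    td: float
    beta: float = 1.0    # set-point weight, cf. (14)
    n: float = 10.0      # derivative filter ratio (filter time constant td / n)
    tt: float = 1.0      # tracking time constant of the back-calculation (assumed)
    u_min: float = 0.0   # actuator limits (assumed)
    u_max: float = 1.0
    h: float = 0.01      # sampling period
    integral: float = 0.0
    deriv: float = 0.0
    y_prev: float = 0.0

    def update(self, r, y):
        p = self.kp * (self.beta * r - y)
        # Filtered derivative acting on the process output only (no derivative kick).
        tf = self.td / self.n
        self.deriv = (tf * self.deriv - self.kp * self.td * (y - self.y_prev)) / (tf + self.h)
        v = p + self.integral + self.deriv       # unsaturated control variable
        u = min(max(v, self.u_min), self.u_max)  # saturated control variable
        # Integral update with back-calculation: (u - v) is nonzero only in saturation.
        self.integral += self.h * (self.kp / self.ti * (r - y) + (u - v) / self.tt)
        self.y_prev = y
        return u


def simulate(pid, r=0.9, tau=10.0, t_end=150.0):
    """Set-point step on the assumed process tau * dy/dt = -y + u (forward Euler)."""
    y, ys, us = 0.0, [], []
    for _ in range(int(t_end / pid.h)):
        u = pid.update(r, y)
        y += pid.h * (-y + u) / tau
        ys.append(y)
        us.append(u)
    return ys, us
```

Running `simulate` with different values of `kp`, `ti`, and `td` reproduces the trends discussed above on this simple process; setting `tt=float("inf")` disables the back-calculation, which makes the effect of integrator windup on the overshoot visible.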

The above considerations can be understood by considering a transient
response in the time domain but also by considering the frequency response
of the system and how it changes by modifying the PID parameters. For
example, the effect of increasing the three controller actions can be seen
as translating any point of the Nyquist plot in each of the directions shown
in Fig. 1.

From another point of view, analogous considerations can be made by
considering the Bode plot. For example, considering a PID controller in
ideal form with a filter on the overall control action, the effect of
modifying the three parameters in the controller Bode plot is shown in
Fig. 2. The effects of the parameter modification in the achieved
performance can be better ascertained by plotting the frequency response of
the loop transfer function for different cases. As an example, consider the
process $P(s) = \frac{1}{10s + 1} e^{-4s}$ and the PID controller
$C(s) = K_p \left( 1 + \frac{1}{T_i s} + T_d s \right) \frac{1}{T_f s + 1}$,
where the parameters are in the following ranges: $K_p \in [2, 15/4]$,
$T_i \in [4, 16]$, $T_d \in [1.5, 4]$, with $T_f = 0.1$ in all cases. From
Fig. 3, it appears that an increment of $K_p$ yields an increment of the
bandwidth and a decrement of the phase margin. Conversely, an increment of
$T_i$ has an opposite effect. Finally, it can be seen that the increment of
$T_d$ initially yields an increment of the bandwidth and of the phase margin
at the same time, but then a sudden increment of the bandwidth occurs, and
this corresponds to a sudden decrement of the phase margin (because of the
dead time of the process) with a possible loss of stability.
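
The trend described for $K_p$ can be reproduced with a short frequency-domain computation. The sketch below evaluates the loop transfer function of the example on a frequency grid, locates the first gain crossover by bisection, and returns the phase margin. The function names and the grid are assumptions of this example; since the values at which the other two parameters are held fixed in Fig. 3 are not stated in the text, any values used with it (for instance $T_i = 8$ and $T_d = 2$, which lie inside the stated ranges) are also assumptions.

```python
import cmath
import math


def loop(w, kp, ti, td, tf=0.1):
    """Loop transfer function C(jw) P(jw) of the example."""
    s = 1j * w
    c = kp * (1 + 1 / (ti * s) + td * s) / (tf * s + 1)
    p = cmath.exp(-4 * s) / (10 * s + 1)
    return c * p


def crossover_and_phase_margin(kp, ti, td, tf=0.1):
    """First gain crossover frequency (rad/s) and phase margin (degrees)."""
    grid = [10 ** (-3 + 4 * k / 4000) for k in range(4001)]
    for lo, hi in zip(grid, grid[1:]):
        if abs(loop(lo, kp, ti, td, tf)) >= 1 > abs(loop(hi, kp, ti, td, tf)):
            for _ in range(60):  # bisection on a logarithmic scale
                mid = math.sqrt(lo * hi)
                if abs(loop(mid, kp, ti, td, tf)) >= 1:
                    lo = mid
                else:
                    hi = mid
            s = 1j * lo
            c = kp * (1 + 1 / (ti * s) + td * s) / (tf * s + 1)
            # Sum the phase contributions separately to avoid wrap-around.
            phase = cmath.phase(c) - math.atan(10 * lo) - 4 * lo
            return lo, 180 + math.degrees(phase)
    raise ValueError("no gain crossover found on the grid")
```

Calling the function for $K_p = 2$ and for $K_p = 15/4$ with the same $T_i$ and $T_d$ shows a higher crossover frequency and a smaller phase margin for the larger gain, in agreement with the discussion above.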

In any case, in order to ease the procedure, a large number of tuning rules
have been proposed in the last century, starting from the well-known
Ziegler-Nichols ones (Nichols and Ziegler 1942). Different approaches in
this context are analyzed hereafter.

PID Control, Fig. 1 Effect of an increment of the three PID control actions
on a point of the Nyquist plot

Empirical Tuning 

Empirical tuning methods (like the Ziegler- 
Nichols and its refinements, Cohen-Coon, or 
Chien-Hrones-Reswick ones (O’Dwyer 2006)) 
consist in selecting the parameters of the PID 
controllers by using some empirical formulae 
which give the PID gains based on parameters of 
the process. Usually, the process parameters are 
those of a first-order-plus-dead-time (FOPDT) 
model or the ultimate gain and frequency of 
the process itself (the ultimate gain is the largest 
value of a proportional-only control that produces 
a sustained oscillation of the process variable, 
that is, that results in a marginally stable closed- 
loop system, while the ultimate frequency is 
the frequency of the corresponding sustained 
oscillation). These parameters can be obtained by 
means of a simple open-loop (step response) or 
closed-loop (relay feedback) experiment. 
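
As an example of such empirical formulae, the sketch below encodes the Ziegler-Nichols PID settings as they are commonly tabulated, for the ideal form (11). The function names are hypothetical, and the ultimate period (equal to $2\pi$ divided by the ultimate frequency) is used as an input; the coefficients should be checked against a tuning-rule collection before use.

```python
def ziegler_nichols_ultimate(ku, pu):
    """(kp, ti, td) from the ultimate gain ku and the ultimate period pu."""
    return 0.6 * ku, 0.5 * pu, 0.125 * pu


def ziegler_nichols_step(gain, time_constant, dead_time):
    """(kp, ti, td) from the parameters of a FOPDT model (step-response method)."""
    kp = 1.2 * time_constant / (gain * dead_time)
    return kp, 2.0 * dead_time, 0.5 * dead_time
```

For a FOPDT model with unit gain, time constant 10, and dead time 4, the step-response rule gives $K_p = 3$, $T_i = 8$, and $T_d = 2$.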

Model-Based Tuning 

In model-based tuning (like Dahlin's and Haalman's methods and the Internal
Model Control one (O'Dwyer 2006)), the PID control law is determined
analytically starting from a process model and by selecting an appropriate
(closed-loop) target transfer function. The user generally selects the
desired closed-loop time constant as a tuning parameter which allows the
handling of the trade-off between aggressiveness and robustness (and control
effort). In this context, according to the well-known SIMC (Simplified
Internal Model Control) tuning rules (Skogestad 2003), the closed-loop time
constant should be selected equal to the dead time of the process.
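
A minimal sketch of the SIMC settings for a PI controller and a FOPDT model is given below, as the rule is usually stated; the function name and argument names are hypothetical, and the default choice of the closed-loop time constant equal to the dead time follows the recommendation above.

```python
def simc_pi(gain, tau, theta, tau_c=None):
    """SIMC PI settings (kc, ti) for the FOPDT model gain * exp(-theta s) / (tau s + 1).

    tau_c is the desired closed-loop time constant; by default it is set
    equal to the dead time theta.
    """
    if tau_c is None:
        tau_c = theta
    kc = tau / (gain * (tau_c + theta))
    ti = min(tau, 4.0 * (tau_c + theta))
    return kc, ti
```

A larger `tau_c` gives a smaller gain and hence a less aggressive, more robust controller, which is the trade-off handled by this single tuning parameter.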

Optimal Tuning 

Optimal tuning rules aim at minimizing a given objective function. Usually,
an integral function of the control error is selected for this purpose, for
example,

$$J = \int_0^\infty t^n e^2(t)\, dt, \qquad (16)$$

where $n = 0, 1, 2$, or the Integrated Absolute Error

$$IAE = \int_0^\infty |e(t)|\, dt. \qquad (17)$$

By solving the optimization problem for different kinds of (normalized)
processes and by interpolating the results, it has been possible to obtain
tuning formulae that give the PID gains based on the process parameters. It
should be noted, however, that since no (robustness) constraints are
considered in the optimization procedure, poor robustness may eventually
result in the control system.
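
In practice, the criteria (16) and (17) are evaluated numerically on a sampled error signal over a finite horizon. A minimal sketch using the trapezoidal rule follows; the function name is hypothetical, and the horizon must be long enough for the error to have settled.

```python
def integral_criteria(ts, es, n=0):
    """Return (J, IAE) of (16) and (17) for samples es taken at times ts."""
    j = 0.0
    iae = 0.0
    for k in range(1, len(ts)):
        h = ts[k] - ts[k - 1]
        j += 0.5 * h * (ts[k] ** n * es[k] ** 2 + ts[k - 1] ** n * es[k - 1] ** 2)
        iae += 0.5 * h * (abs(es[k]) + abs(es[k - 1]))
    return j, iae
```

An optimal tuning procedure would wrap such an evaluation in a simulation of the closed loop and minimize the result over the PID parameters.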


Robust Tuning 

Recently, tuning rules which explicitly consider the robustness issue have
been devised. In particular, the maximum of the sensitivity function, $M_s$,
is often considered as a robustness index. A selected value of $M_s$ is then
employed as a constraint in finding the optimal PID parameters which
minimize a given performance index. An additional constraint on the maximum
complementary sensitivity function can also be considered. For example, the
AMIGO tuning rules (Åström and Hägglund 2004) have been devised by applying
this approach, where the integral gain is maximized in order to obtain the
best reduction of load disturbances.
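
The robustness index $M_s$ can be estimated by evaluating the sensitivity function $1/(1 + C(j\omega)P(j\omega))$ on a frequency grid. The sketch below does so for the process and filtered PID controller of the example given earlier in this entry; the function name and the grid are assumptions, and the result is meaningful only when the closed-loop system is stable.

```python
import cmath


def max_sensitivity(kp, ti, td, tf=0.1):
    """Grid estimate of M_s for P(s) = exp(-4s)/(10s+1) and a filtered ideal PID."""
    ms = 0.0
    for k in range(4001):
        w = 10 ** (-2 + 4 * k / 4000)  # 0.01 to 100 rad/s, logarithmic spacing
        s = 1j * w
        c = kp * (1 + 1 / (ti * s) + td * s) / (tf * s + 1)
        p = cmath.exp(-4 * s) / (10 * s + 1)
        ms = max(ms, abs(1 / (1 + c * p)))
    return ms
```

A robust tuning procedure would use such an estimate as a constraint, rejecting parameter sets whose $M_s$ exceeds the selected value.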


Automatic Tuning 

The functionality of automatically identifying the process model and tuning
the controller based on that model is called automatic tuning (or, simply,
auto-tuning) (► Autotuning). In particular, an identification experiment is
performed after an explicit request of the operator, and the values of the
PID parameters are updated at the end of it (for this reason, the overall
procedure is also called one-shot automatic tuning or tuning-on-demand). The
design of an automatic tuning procedure involves many critical issues, such
as the choice of the identification procedure (usually based on an open-loop
step response or on a relay feedback experiment), of the a priori selected
(parametric or nonparametric) process model, and of the tuning rule.

PID Control, Fig. 2 Effect of an increment of three PID parameters on the
controller Bode plot. The modifications are considered separately, namely,
two parameters are fixed, while the other is modified

PID Control, Fig. 3 Example of the effect of an increment of three PID
parameters on the loop transfer function Bode plot. The modifications are
considered separately, namely, two parameters are fixed, while the other is
modified
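
For the relay feedback experiment, a common describing-function approximation estimates the ultimate gain as $K_u \approx 4d/(\pi a)$, where $d$ is the relay amplitude and $a$ is the amplitude of the resulting oscillation of the process output, while the ultimate frequency follows from the measured oscillation period. A minimal sketch, with hypothetical function and argument names:

```python
import math


def relay_ultimate(relay_amplitude, oscillation_amplitude, oscillation_period):
    """Approximate ultimate gain and ultimate frequency from a relay experiment."""
    ku = 4.0 * relay_amplitude / (math.pi * oscillation_amplitude)
    wu = 2.0 * math.pi / oscillation_period
    return ku, wu
```

The two values can then be passed to a tuning rule based on the ultimate gain and frequency.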

The one-shot automatic tuning functionality is available in practically all
the single-station controllers on the market. More advanced control units
might provide a self-tuning functionality, where the identification
procedure is continuously performed during routine process operation in
order to track possible changes of the system dynamics and the PID parameter
values are adaptively modified. In this case, all the issues related to
adaptive control have to be taken into account. In particular, performance
assessment methodologies, which can evaluate whether the PID design can be
improved, are of significant relevance in this context (► Controller
Performance Monitoring).

Design Tools 

Although one of the major advantages of 
PID controllers is their relative simplicity, 
Computer-Aided Control System Design tools 
(►Computer-Aided Control Systems Design: 
Introduction and Historical Overview) have been 
developed in order to help the user in their 
design (starting from the identification of the 
process) by taking into account the different 
control requirements in a given application 
(Guzman et al. 2008). In this context, all the 
additional functionalities can be considered, 
as well as more complex control architectures 
where, in any case, the PID control is still the 
basic element (►Control Structure Selection, 
► Control Hierarchy of Large Processing Plants: 
An Overview). 

Summary and Future Directions 

PID controllers are the most widely employed controllers in industry, and
the knowledge about their use is well established, with the presence of many
effective tuning and automatic tuning techniques. Despite this, PID
controllers are still being developed from many points of view. For example,
design methodologies for more complex control schemes (like cascade control
or control of multivariable systems with or without the use of a decoupling
strategy) can be improved. Further, the advancement of the technologies
poses new problems that need to be addressed. For example, the use of
wireless sensors and actuators calls for event-based PID controllers whose
design should take into account the asynchronous sampling. The availability
of faster and faster microprocessors has also stimulated an increasing
interest in fractional-order PID controllers, which allow a more flexible
design at the expense of an increment of the complexity.


Recommended Reading 

Basic concepts of PID controllers can be found in almost every book on
process control. For a detailed treatment, see Åström and Hägglund (2006),
where all the methodological as well as technological aspects are covered.
An excellent collection of tuning rules can be found in O'Dwyer (2006). More
advanced topics can be found in Tan et al. (1999), Yu (2006), Johnson and
Moradi (2005), Knospe (2006), Visioli (2006), Wang et al. (2008), Visioli
and Zhong (2010), and Vilanova and Visioli (2012).


Cross-References 

► Autotuning 

► Computer-Aided Control Systems Design: Introduction and Historical Overview

► Control Hierarchy of Large Processing Plants: An Overview

► Controller Performance Monitoring

► Control Structure Selection

Abstract

The main types and variables of pilot-aircraft systems and pilot control
response characteristics are considered. The basic regularities of pilot
behavior exposed in closed-loop systems are briefly discussed. Different
types of models of pilot behavior are reviewed including classical models
(McRuer's and structural) and an optimal control model.


Keywords 

Crossover pilot model; Describing function;
Manual control; Pilot behavior; Pilot optimal
control model; Remnant spectral density;
Structural model


Introduction 

Modern flight control and navigation systems are characterized by two
features: (1) they employ fly-by-wire controls and (2) they introduce
extensive automation support into the cockpit, ranging from complex
augmented flight control systems in manual control modes to powerful flight
management computers and autopilots that assume responsibility for most
flight control tasks (and which may operate the aircraft in ways that are
difficult for pilots to monitor and understand). These modern systems leave
the pilot in a supervisory control mode most of the time. Consequently, crew
members monitor, supervise, plan, and, in essence, serve as information
managers. The level of supervisory control tasks can vary from conventional
command control, in which the operator issues autopilot commands (“set
altitude,” “set airspeed,” etc.), to task-level control, in which the
operator issues commands such as “line formation,” “trail formation,” etc.
Although civilian pilots have experience flying their aircraft manually,
they are seldom in active, direct control of the aircraft. However, if a
failure or unexpected upset occurs, they are required to assume control
immediately. As for military pilots, they (especially fighter pilots) use
manual control in the majority of piloting tasks.

The effective use of manned flight vehicles has always required a
satisfactory match of vehicle characteristics (which include vehicle
dynamics, control manipulators, displays) with the human pilot's
characteristics as a flight controller. The provision of proper vehicle
handling qualities by the flight control system and display and manipulator
design has often posed serious problems which the vehicle system engineer
must solve.

Their solutions require the knowledge of mutual interactions between the
pilot and the vehicle. The understanding of such interactions requires a
mathematical theory which can be used to explain known findings and to
predict new ones. For handling qualities, such theory is based on the
methods of control engineering and treats the pilot-vehicle system as a
closed-loop (in general, a multiloop) entity. The sine qua non of the theory
is a model of pilot dynamic characteristics in a form suitable for
application using relatively conventional control engineering techniques. An
adequate description of a pilot's dynamic response characteristics is not
easily obtained because of the pilot's inherent adaptability and capacity
for learning.

Pilot-Vehicle System Modeling, Fig. 1 Pilot-aircraft system


Main Variables of the Pilot-Aircraft 
System 

The pilot-aircraft manual control system, shown 
in Fig. 1, is characterized by a number of 
variables. The main group of these variables 
is the so-called task variables which comprise 
all the system inputs (command inputs i(t), 
disturbances d(t)) and control system elements 
(display, manipulators, and controlled element 
dynamics, which is defined by the aircraft frame 
and flight control system dynamics). 

A specific feature of pilot-aircraft systems
is the dependence of the piloting task on the
task variables. For different piloting tasks, these
variables or their parameters differ too. Stability
of the closed-loop system is always a necessary,
though not sufficient, criterion for the control
strategy. Consequently, the pilot’s dynamics are
profoundly affected by the display and controlled
element dynamics, because his response must be
adapted to provide the necessary loop stability
and accuracy. The characteristics of the other task
variables (i(t), d(t)), related to the mission and
control strategy, also exert direct influence on the
pilot dynamics, although their effects are more
in the nature of adjustment and emphasis than of
changes in fundamental form.

These variables constitute an enormous range
of possible conditions and piloting tasks. In addition
to the task variables, the other groups
of variables, namely procedural (instructions, training
schedule, order of presentation of trials, etc.),
environmental ($\varepsilon$: illumination, vibration,
temperature, and so forth), and pilot-centered
($\sigma$: physical condition, motivation, etc.),
have less influence on pilot-aircraft system features.


Types of Pilot-Aircraft Systems 

The structure of the pilot-aircraft system depends
on the piloting task. Some tasks (for example,
the pitch tracking task) can be interpreted with
the help of the single-loop compensatory block
diagram. In that case the pilot perceives only the
error signal, y(t) = e(t) = i(t) - x(t), and the
control c(t) in Fig. 1 is the pilot’s pitch control
command. Other tasks require more complicated
descriptions. For example, the landing
task is a multiloop compensatory task, where
the inner loop closed by the pilot is the pitch
control loop. Some piloting tasks are multichannel
control tasks, in which the pilot perceives
several visual stimuli (for example, pitch angle
and bank angle) and generates commands in
several channels too. Pilots also perceive stimuli
of different sensing modalities (visual, vestibular,
kinesthetic). In cases where these influence his
actions, the multimodality of the pilot-aircraft
system has to be analyzed.

A great many past experiments in which
human dynamic measurements were taken
have been conducted to investigate
compensatory tracking tasks. Some practical
piloting tasks (e.g., air-to-air tracking when
the target flies against a background of
clouds) correspond to pursuit conditions. In that
case, the pilot perceives information about both
the error signal e(t) and the input signal i(t).

In many piloting tasks the single-loop compensatory
system defines the main features of more
complicated types of pilot-aircraft systems and their
flying qualities. Therefore, this type of system
has been investigated in more depth.


Pilot Control Response 
Characteristics 

The most obvious aspect of human dynamic behavior
in a manual control task is the pilot’s
control actions within that task. When the key
variables are fixed and the signals in the control
loop are approximately time stationary over an
interval of interest, the pilot-vehicle system can
be represented as a quasi-linear system. In that
case, the pilot response can be represented by
two components: the pilot describing function,
$W_p(j\omega)$, which accounts for the linear portion of
the pilot response to the stimulus e(t), and the remnant
$n_e(t)$, which accounts for all nonlinear and
nonstationary effects of pilot behavior (Fig. 2).


In the majority of piloting tasks $n_e(t)$ is a
stationary process characterized by the remnant
spectral density $S_{n_e n_e}(\omega)$ (McRuer and Krendel
1974). The pilot control response characteristics
$W_p(j\omega)$ and $S_{n_e n_e}(\omega)$ depend explicitly on
the task variables (McRuer and Jex 1967;
McRuer et al. 1968). In much experimental
research, the technique for identification of
these characteristics was based on the use
of an input signal consisting of a sum of
non-harmonically related sine waves with
cutoff frequency $\omega_i$ of 1.5, 2.5, or 4 rad/s
and different controlled element dynamics
(Allen and Jex 1972; Magdaleno 1972; Shirley
1969).
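A forcing function of this kind can be sketched in a few lines. The example
below is only an illustration under stated assumptions: the helper name
`sum_of_sines`, the use of prime multiples of the base frequency, the run
length, the phases, and the reduced "shelf" amplitude above $\omega_i$ are all
hypothetical choices, not the settings of the cited experiments.

```python
import math

# Hypothetical helper, not taken from the cited reports: builds a forcing
# function i(t) as a sum of sine waves whose frequencies are prime multiples
# of the base frequency 2*pi/duration, so no component is a harmonic of another.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]


def sum_of_sines(omega_i, duration=60.0, dt=0.05, shelf=0.1):
    base = 2.0 * math.pi / duration              # rad/s, one cycle per run
    freqs = [p * base for p in PRIMES]
    # Rectangular spectrum (assumed): unit amplitude up to omega_i,
    # a small "shelf" amplitude above it.
    amps = [1.0 if w <= omega_i else shelf for w in freqs]
    phases = [0.7 * k for k in range(len(freqs))]  # arbitrary fixed phases
    n = int(round(duration / dt))
    t = [k * dt for k in range(n)]
    i_t = [sum(a * math.sin(w * tk + ph)
               for a, w, ph in zip(amps, freqs, phases))
           for tk in t]
    return t, i_t, freqs, amps


t, i_t, freqs, amps = sum_of_sines(omega_i=2.5)
n_main = sum(1 for a in amps if a == 1.0)
mean_value = sum(i_t) / len(i_t)
print(n_main, "full-amplitude components at or below 2.5 rad/s")
```

Because every component completes a whole number of cycles within the run,
the sampled signal has zero mean, which is convenient when the describing
function is later estimated at the input frequencies.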

In addition to control response, other types of
pilot responses also characterize his behavior:
physiological ($F$) and psychophysiological ($\psi$)
responses (Fig. 1). One of the psychophysiological
response characteristics, the pilot opinion
rating (PR), is widely used in experimental
investigations alongside measurements of the
pilot control response. Pilot opinion ratings are
defined by specialized scales (e.g., the Cooper-Harper
scale (Cooper and Harper 1969)).


Modeling Pilot Behavior in Manual 
Control 

Pilot-Vehicle System Modeling, Fig. 2 Quasi-linear paradigm for the human pilot

Experimental investigations have demonstrated
a specific regularity: for a variety of forcing
functions and controlled elements, the slope of the
open-loop describing function magnitude $|W_{OL}(j\omega)|$
versus frequency was unity, i.e., -20 dB/dec, in the region
of the crossover frequency $\omega_c$ (McRuer and Jex
1967). This observation has led to the conclusion
that near $\omega_c$, $W_{OL}(j\omega)$ can be represented by the
“crossover model” (McRuer and Jex 1967)

$$W_{OL}(j\omega) = W_p(j\omega) \cdot W_c = \frac{\omega_c}{j\omega}\, e^{-j\omega\tau_e}$$

This model has two parameters:

$$\omega_c = \omega_{c0}(W_c) + \Delta\omega(\omega_i)$$
$$\tau_e = \tau_0(W_c) + \Delta\tau(\omega_i)$$

For the controlled element dynamics
$W_c = K/[j\omega(Tj\omega + 1)]$, an increase of the constant T
leads to an increase of $\tau_0$ and a decrease of $\omega_{c0}$.
The empirical dependences of $\Delta\omega$ and $\Delta\tau$ on $\omega_i$
obtained for the rectangular form of input spectrum are the
following: $\Delta\omega = 0.18\,\omega_i$, $\Delta\tau = -0.07\,\omega_i$.

McRuer proposed several modifications of the
open-loop system crossover and pilot describing
function models (McRuer and Krendel 1974).
One of the simplest ones (used widely in many
studies), which might be recommended for
description of pilot-aircraft system characteristics
in the crossover frequency range, is the following

$$W_p(j\omega) = K_p\, \frac{T_L j\omega + 1}{T_I j\omega + 1}\, e^{-j\omega\tau}$$

The selection of the parameters $K_p$, $T_L$, and $T_I$
is carried out by using “adjustment rules” so
that the closed-loop system conforms to experimental
frequency response characteristics. These
adjustment rules reflect the main features of pilot
behavior: adaptation and optimization.

A more complicated model of the pilot describing
function (the “structural model”) was proposed by
R. Hess (1979, 1984). It takes into account the
additional inner loop generated by the pilot as
a result of his response to the kinesthetic cue
(Fig. 3). A modification of this model (Efremov
and Tjaglik 2011) demonstrated good agreement
with the pilot describing function as measured in
experiments. One of the features of this modified
model is the criterion used for the parameter
optimization, which minimizes either a single
variance or a weighted sum of two variances.
This procedure requires knowledge of
the pilot remnant spectral density. For the single-loop
system, such a model was developed by
Levison et al. (1969):

$$S_{n_e n_e}(\omega) = 0.01\pi\, \frac{\sigma_e^2 + T_L^2 \sigma_{\dot e}^2}{1 + T_L^2 \omega^2}$$

Pilot-Vehicle System Modeling, Fig. 3 Pilot structural model

In a limited number of studies, the classical
approach to pilot modeling considered above was
used for more complicated types of pilot-aircraft
system, in which the pilot’s perception of motion
cues was taken into account (multimodality
system (Hess 1990)), or for the multiloop
pilot-aircraft system (Stapleford et al. 1967).

A different approach to pilot behavior modeling
was developed by Kleinman et al. (1970). It is
based on modern optimal control theory and
assumes that the pilot’s goal is to minimize the
cost function:

$$I = \lim_{T \to \infty} \frac{1}{T} \int_0^T \left( x Q x^{T} + u Q_c u^{T} + \dot{u} G_c \dot{u}^{T} \right) dt$$
The model takes into account the main pilot
limitation parameters: the time delay in perception,
the observation and motor noises, and the
neuromuscular dynamics.

The predictive part of the model consists of
the optimal controller ($-L$), a Kalman filter, and
a predictor (Fig. 4). The software for definition of
these elements allows the use of this model for
different types of pilot-aircraft systems.

Pilot-Vehicle System Modeling, Fig. 4 Optimal pilot model (human operator model)
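As a numerical illustration of the crossover model described earlier in this
section, the sketch below evaluates the open-loop describing function at the
crossover frequency and the resulting phase margin. The baseline values
`omega_c0` and `tau_0` are illustrative assumptions rather than measured data;
the bandwidth corrections 0.18 and -0.07 are the empirical values quoted in
the text.

```python
import cmath
import math


def crossover_open_loop(omega, omega_c, tau_e):
    """W_OL(j*omega) = (omega_c / (j*omega)) * exp(-j*omega*tau_e)."""
    return omega_c / (1j * omega) * cmath.exp(-1j * omega * tau_e)


def crossover_parameters(omega_c0, tau_0, omega_i):
    """Apply the empirical input-bandwidth adjustments quoted in the text."""
    return omega_c0 + 0.18 * omega_i, tau_0 - 0.07 * omega_i


# omega_c0 and tau_0 below are illustrative assumptions, not measured data.
omega_c, tau_e = crossover_parameters(omega_c0=4.0, tau_0=0.30, omega_i=2.5)
w_at_crossover = crossover_open_loop(omega_c, omega_c, tau_e)
gain_db = 20.0 * math.log10(abs(w_at_crossover))
phase_margin_deg = 180.0 + math.degrees(cmath.phase(w_at_crossover))
print(f"omega_c = {omega_c:.2f} rad/s, tau_e = {tau_e:.3f} s")
print(f"gain = {gain_db:.2f} dB, phase margin = {phase_margin_deg:.1f} deg")
```

For this model the phase margin is simply 90 degrees minus the phase lag
$\omega_c \tau_e$ of the effective delay, so a wider input bandwidth, which
raises $\omega_c$ and shortens $\tau_e$, changes the margin through both terms.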

The classical and optimal pilot behavior models
have been applied widely to different manual
control tasks: the development of alternative
criteria for flying qualities (Efremov et al. 1998;
Neal and Smith 1971), flight control system
(Schmidt 1979) and display design (Klein and
Clement 1973), the analysis of the causes of
pilot-induced oscillation (McRuer 1997) and the
development of means for its suppression (Efremov
1995), and many others.

In some studies, attempts have been
made to find the relationship between the
parameters of the closed-loop system, pilot
control response characteristics, and pilot opinion
ratings. The technique developed in these
studies is called the “paper pilot technique”
(Anderson 1970). The following modification of
this technique has enabled a close match between 
the results of mathematical modeling (PR, T^, 
accuracy, etc.) of the different types of the pilot- 
aircraft system and the results of experimental 
investigations (Efremov and Ogloblin 2006). 


Summary and Future Directions 

Pilot behavior has been studied extensively
for single-loop stationary manual control
tasks. Two approaches to the mathematical
modeling of pilot behavior have been
developed: classical and optimal control. Both
of them have produced good agreement with
experimental results. The models discussed
describe one of the main features of
pilot adaptation, “parameter adaptation,”
in which a change of any task variable causes a
change of the human operator control response
characteristics. Only a limited number of
experimental investigations have been carried
out for more complicated cases: multiloop and
multimodality pilot-aircraft closed-loop systems.
Broader investigations are necessary in the future
to obtain accurate pilot mathematical models
for these cases. Future investigation in the pilot
behavior modeling area is also necessary for
better formulations of other aspects of pilot
adaptation:

- “Structural adaptation”, when the pilot selects
the loops and the best type of behavior (compensatory,
pursuit, etc.) appropriate for the
different task variables and, in the case of the
flight control system, changes in dynamics.

- “Goal adaptation”, when a change of the piloting
task or a failure in the controlled element
dynamics is accompanied by a change of the
goals.

Other future directions in pilot modeling are the
development of models to predict the results
in the case of sharp changes of controlled
element dynamics, to optimize the controlled
element dynamics, to define the relationship
between the pilot control response characteristics
and his opinion rating in different piloting
tasks, to obtain new criteria for handling
qualities, to predict pilot-induced oscillations,
and to solve many other manual control
problems.

Abstract

Polynomial techniques have made important contributions
to systems and control theory. Algebraic
formalism offers several useful tools for
control system design. In most cases, control
systems are designed to be stable and to meet
additional performance specifications, such as
optimality or robustness. The basic tool is a
parameterization of all controllers that stabilize
a given plant. Optimal or robust controllers are
then obtained by an appropriate selection of the
parameter. An alternative tool is a reduction of
controller synthesis to a solution of a polynomial
equation of a specific type. These two
polynomial/algebraic approaches will be presented as
closely related rather than isolated alternatives.

Keywords 

Controller synthesis; Linear systems; Polynomial 
equation approach to control system design; 
Youla-Kucera parameterization of stabilizing 
controllers 

Stabilizing Controllers 

The majority of control problems can be formulated
using the diagram shown in Fig. 1. Given a
plant S, determine a controller R such that the
feedback control system is stable and satisfies
some additional performance specifications, such
as reference tracking, disturbance attenuation,
optimality, or robustness.

Suppose that the plant and the controller are
linear time-invariant single-input single-output
continuous-time systems with real rational
transfer functions S and R, respectively. Stability
is understood as input-output stability, i.e.,
whenever the exogenous inputs $\delta$ and $\rho$ are
essentially bounded in amplitude, so too are the
output signals $\mu$ and $\eta$ (hence also $\varepsilon$ and $\upsilon$).

It is natural to separate the design task
into two consecutive steps: (1) stabilization
and (2) achievement of additional performance
specifications. To do this, all solutions of the first
step, i.e., all controllers that stabilize the given
plant, must be found.

How can one characterize such controllers?
Denote by $H_s$ the reference-to-error transfer
function (sometimes called the sensitivity
function) and by $H_c$ the disturbance-to-control
transfer function (the so-called complementary
sensitivity function) in the closed-loop control
system, namely,

$$H_s = \frac{1}{1 + SR}, \qquad H_c = \frac{SR}{1 + SR}.$$

Now suppose that S can be expressed as the ratio
of two coprime polynomials, S = b/a, and that
the controller has a like form, R = q/p. Then the
two closed-loop transfer functions can be written
as

$$H_s = a\,\frac{p}{ap + bq} := aX, \qquad H_c = b\,\frac{q}{ap + bq} := bY.$$

Consequently, if R stabilizes S, then the rational
functions X and Y are bound to be stable.
These functions cannot be arbitrary, however,
since $H_s + H_c = 1$. The stability equation
follows as

$$aX + bY = 1.$$

Any stabilizing controller for S can be expressed
as R = Y/X (= q/p), where X and Y are a
stable rational solution pair of the stability equation.
This solution can be expressed in parametric
form:

$$X = x + bW, \qquad Y = y - aW,$$

furnishing in turn an explicit parameterization of
the set of all stabilizing controllers for S:

$$R = \frac{y - aW}{x + bW},$$

known as the Youla-Kucera parameterization.
Here x and y are any polynomials satisfying the
Bezout equation ax + by = 1, while W is
a free parameter ranging over the set of stable
real rational functions such that x + bW is not
identically zero.

Example 1 Consider an integrator plant S(s) = 1/s.
The Bezout equation admits a solution x = 0,
y = 1 so that the set of all stabilizing
controllers for S is given by

$$R(s) = \frac{1 - sW}{W}$$

for any stable real rational $W \neq 0$.
The parameter

$$W(s) = \frac{1}{s + 1}$$

yields R = 1, a proportional gain controller. The
parameter

$$W(s) = \frac{s}{s^2 + s + 1}$$

results in a proportional-integral controller

$$R(s) = 1 + \frac{1}{s}.$$

Taking W = 1 leads to the stabilizing controller
R(s) = 1 - s. The feedback system is stable, but
it has a pole at $s = \infty$.
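The parameter choices of Example 1 are easy to check numerically. The sketch
below is only an illustration: the helper `youla_controller` and the sample
points are hypothetical, and the polynomials are represented as plain Python
callables of a complex argument rather than by any control library.

```python
def youla_controller(a, b, x, y, W):
    """Return R(s) = (y - a*W) / (x + b*W) as a callable of complex s."""
    def R(s):
        return (y(s) - a(s) * W(s)) / (x(s) + b(s) * W(s))
    return R


# Integrator plant S = b/a = 1/s with the Bezout pair x = 0, y = 1.
a = lambda s: s
b = lambda s: 1.0
x = lambda s: 0.0
y = lambda s: 1.0

R_p = youla_controller(a, b, x, y, lambda s: 1.0 / (s + 1.0))
R_pi = youla_controller(a, b, x, y, lambda s: s / (s * s + s + 1.0))

samples = [0.5 + 1.0j, 2.0 - 3.0j, -0.3 + 0.2j]   # arbitrary test points
err_p = max(abs(R_p(s) - 1.0) for s in samples)
err_pi = max(abs(R_pi(s) - (1.0 + 1.0 / s)) for s in samples)


def sensitivity_sum(R, s):
    """H_s + H_c for the loop formed by S = b/a and the controller R."""
    S = b(s) / a(s)
    return 1.0 / (1.0 + S * R(s)) + S * R(s) / (1.0 + S * R(s))


err_sum = max(abs(sensitivity_sum(R_pi, s) - 1.0) for s in samples)
print(err_p, err_pi, err_sum)
```

All three deviations are at the level of rounding error, confirming that the
two parameters reproduce the proportional and proportional-integral
controllers and that the sensitivity functions sum to one.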


Additional Performance 
Specifications 

There is a simple formula that generates all the
stabilizing controllers for a given plant. Using
this formula, we can obtain a parameterization
of all stable closed-loop transfer functions that
can be obtained by stabilizing a given plant. The
bonus is that the parameterization is affine in the
free parameter W. In contrast, the controller R
appears in a nonlinear fashion:

$$\frac{1}{1 + SR} \begin{bmatrix} 1 & R \\ S & SR \end{bmatrix} = \begin{bmatrix} a(x + bW) & a(y - aW) \\ b(x + bW) & b(y - aW) \end{bmatrix}.$$

As R and W are in a one-to-one correspondence,
it is convenient to use W in lieu of R in the
design process and calculate R subsequently.
Thus, the parameterization of all stabilizing controllers
makes it possible to separate the design
process into two steps: the determination of all
stabilizing controllers and the selection of the
parameter that achieves the remaining design
specifications. The extra benefit is that both tasks
are linear.


Asymptotic Properties

Asymptotic properties of control systems can
easily be accommodated in the sequential design
procedure. These include the elimination of an
offset due to step references, the ability of system
output to follow a class of reference signals,
or the asymptotic elimination of specific disturbances.

In Fig. 1, asymptotic reference tracking means
that the output $\eta$ follows the reference $\rho$ as time
approaches infinity, which is to say that the error
$\varepsilon$ approaches zero for large times. On the other
hand, we speak of asymptotic disturbance elimination
if the effect of the disturbance $\delta$ decreases
at the output $\eta$ for increasing time. In terms of
Laplace transforms, $\varepsilon = H_s \rho$ and $\eta = S H_s \delta$ are
to be stable rational functions.


Example 8.1 Consider the plant S(s) = 1/(s + 1).
The Bezout equation admits a solution x = 0,
y = 1. The set of all stabilizing controllers for S
is

$$R(s) = \frac{1 - (s + 1)W}{W}$$

for any stable real rational $W \neq 0$. The
achievable sensitivity transfer functions are
$H_s = (s + 1)W$.

To track a step reference, $\rho = 1/s$, we
must take $W = sW_1$ for any stable rational
$W_1 \neq 0$. To eliminate a sinusoidal disturbance,
$\delta = s/(s^2 + \omega^2)$, we constrain the parameter as
$W = (s^2 + \omega^2)W_2$ for any stable rational $W_2 \neq 0$.
To meet both requirements, we simply take
$W = s(s^2 + \omega^2)W_3$ for any stable rational $W_3 \neq 0$,
say $W = s(s^2 + \omega^2)/(s + 1)^4$.

The resulting controller is

$$R(s) = \frac{3s^3 + (6 - \omega^2)s^2 + (4 - \omega^2)s + 1}{s(s^2 + \omega^2)}.$$

The controller obtained in Example 8.1 demonstrates
the internal model principle: the unstable
modes to be followed or eliminated must be generated
by the controller unless they are present in
the plant.
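The algebra of Example 8.1 can be verified with a short numerical sketch. It
assumes the sample value $\omega = 2$ rad/s and arbitrary test points; the
function names are illustrative only.

```python
def W(s, omega):
    """Parameter chosen in Example 8.1: s (s^2 + omega^2) / (s + 1)^4."""
    return s * (s * s + omega ** 2) / (s + 1.0) ** 4


def R_param(s, omega):
    """Youla-Kucera form for S = 1/(s + 1): R = (1 - (s + 1) W) / W."""
    return (1.0 - (s + 1.0) * W(s, omega)) / W(s, omega)


def R_closed_form(s, omega):
    """Controller as stated at the end of Example 8.1."""
    num = 3 * s ** 3 + (6 - omega ** 2) * s ** 2 + (4 - omega ** 2) * s + 1
    return num / (s * (s * s + omega ** 2))


def H_s(s, omega):
    """Achievable sensitivity H_s = (s + 1) W."""
    return (s + 1.0) * W(s, omega)


omega = 2.0                                   # assumed disturbance frequency
samples = [0.4 + 0.9j, 1.5 - 0.5j, -0.2 + 3.0j]
mismatch = max(abs(R_param(s, omega) - R_closed_form(s, omega))
               for s in samples)
# The sensitivity must vanish at the modes of the step and of the sinusoid.
blocking = max(abs(H_s(z, omega)) for z in (0.0, 1j * omega, -1j * omega))
print(mismatch, blocking)
```

The zeros of $H_s$ at $s = 0$ and $s = \pm j\omega$ are exactly the poles of
the controller, which is the internal model principle in numerical form.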


H2 Optimal Control 

The sequential design procedure will be further
illustrated on the design of linear-quadratic optimal
controllers. Given a plant with transfer
function S = b/a, the task is to find a controller
that stabilizes the control system of Fig. 1
while minimizing the $H_2$ norm of some closed-loop
transfer function, say of the complementary
sensitivity function $H_c$.

The $H_2$ norm is defined for any strictly proper
rational function G analytic on the imaginary axis
as

$$\|G\|_2^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} |G(j\omega)|^2 \, d\omega.$$

The set of complementary sensitivity functions
that can be achieved in the stabilized control
system is

$$H_c = b(y - aW),$$

where W is a free stable rational parameter. The
parameter will be selected so as to minimize the
$H_2$ norm of $H_c$.

Let $\alpha\beta$ be a polynomial defined by keeping
the stable (in Re s < 0) zeros of ab while
replacing the unstable (in Re s > 0) ones with
their negative values. Then $ab/\alpha\beta$ is inner (or all
pass) and

$$\|H_c\|_2 = \left\| \frac{\alpha y \beta}{a} - \alpha W \beta \right\|_2.$$

Consider the decomposition

$$\frac{\alpha y \beta}{a} = r + \frac{q}{a},$$

where r is a polynomial and q/a is strictly proper.
With this decomposition,

$$\|H_c\|_2^2 = \left\| \frac{q}{a} \right\|_2^2 + \left\| r - \alpha W \beta \right\|_2^2$$

because q/a and $r - \alpha W \beta$ are orthogonal and
thus the cross-terms contribute nothing to the
norm. The last expression is a complete square
whose first part is independent of W. Hence the
minimizing parameter is $W = r/\alpha\beta$, and if it
is indeed stable and admissible, it defines the
unique optimal controller. Otherwise, no optimal
controller exists.

The consequent minimum norm equals

$$\min_W \|H_c\|_2 = \left\| \frac{q}{a} \right\|_2.$$


Example 8.2 To illustrate, consider the plant
S(s) = 1/(s - 1). The class of all stabilizing
controllers for S is found to be

$$R(s) = \frac{1 - (s - 1)W}{W}$$

for a free stable rational parameter $W \neq 0$. The
complementary sensitivity transfer function is

$$H_c(s) = 1 - (s - 1)W.$$

Now $\alpha = s + 1$, $\beta = 1$ and the polynomial part
of

$$\frac{\alpha y \beta}{a} = \frac{s + 1}{s - 1} = 1 + \frac{2}{s - 1}$$

is r = 1. Thus $H_c$ attains minimum $H_2$ norm for

$$W(s) = \frac{1}{s + 1}$$

and the corresponding optimal controller is
R(s) = 2.

The optimal complementary sensitivity function
is

$$H_c(s) = \frac{2}{s + 1}$$

and $\|H_c\|_2 = \sqrt{2}$.
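The value $\sqrt{2}$ can be confirmed by numerical integration. The sketch
below is an illustration under stated assumptions: the quadrature rule
(midpoint rule after the substitution $\omega = \tan\theta$) and the competing
parameter `W_alt` are hypothetical choices made only to show that a different
admissible W yields a larger norm.

```python
import math


def h2_norm(G, n=20000):
    """Approximate ||G||_2 by the midpoint rule with omega = tan(theta)."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -math.pi / 2 + (k + 0.5) * h
        omega = math.tan(theta)
        # d(omega) = d(theta) / cos(theta)^2
        total += abs(G(1j * omega)) ** 2 / math.cos(theta) ** 2 * h
    return math.sqrt(total / (2.0 * math.pi))


def H_c(W):
    """Complementary sensitivity for S = 1/(s - 1): H_c = 1 - (s - 1) W."""
    return lambda s: 1.0 - (s - 1.0) * W(s)


W_opt = lambda s: 1.0 / (s + 1.0)
# Hypothetical competitor: stable, and it keeps H_c strictly proper.
W_alt = lambda s: 1.0 / (s + 1.0) + 0.5 / (s + 2.0) ** 2

norm_opt = h2_norm(H_c(W_opt))
norm_alt = h2_norm(H_c(W_alt))
print(norm_opt, norm_alt)
```

The first value agrees with $\sqrt{2}$ to within the quadrature error, and
the second is strictly larger, as the completed-square argument predicts.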


Robust Stabilization 

The notion of robust stability addresses stabilization
of plants subject to modeling errors, when
the actual plant may differ from the nominal
model, using a fixed controller. The ultimate goal
is to stabilize the actual plant. The actual plant is
unknown, however, so the best one can do is to
stabilize a large enough set of plants.

Thus the basic technique to model plant uncertainty
is to model the plant as belonging to
a set. Such a set can be either structured (for
example, there is a finite number of uncertain
parameters) or unstructured (the frequency response
lies in a set in the complex plane for every
frequency). The unstructured uncertainty model
is more important for several reasons. On the
one hand, it is well suited to represent high-frequency
modeling errors, which are generically
present and caused by such effects as infinite-dimensional
electromechanical resonance, transport
delays, and diffusion processes. On the other
hand, the unstructured model of uncertainty leads
to a simple and useful design theory.

The unstructured set of plants is usually
constructed as a neighborhood of the nominal
plant, with the uncertainty represented by
additive or multiplicative perturbations. The size
of the neighborhood is measured by a suitable
norm, the most common being the $H_\infty$ norm, which is
defined for any rational function G analytic on
the imaginary axis as

$$\|G\|_\infty = \sup_{\omega} |G(j\omega)|.$$

Let us illustrate the design for robust stability
under unstructured norm-bounded multiplicative
perturbations. Consider a nominal plant with
transfer function S and its neighborhood $S_\Delta$
defined by

$$S_\Delta := (1 + F\Delta)S$$

where F is a fixed stable rational function and
$\Delta$ is a variable stable rational function such that
$\|\Delta\|_\infty \le 1$.

The idea behind this uncertainty model is that
$F\Delta$ is the normalized plant perturbation away
from 1:

$$\frac{S_\Delta}{S} - 1 = F\Delta.$$

Hence if $\|\Delta\|_\infty \le 1$, then for all frequencies $\omega$

$$\left| \frac{S_\Delta(j\omega)}{S(j\omega)} - 1 \right| \le |F(j\omega)|$$

so that $|F(j\omega)|$ provides the uncertainty profile
while $\Delta$ accounts for phase uncertainty.

Now suppose that R is a controller that stabilizes
the nominal plant S. Applying the small
gain theorem, R is seen to stabilize the entire
family of plants $S_\Delta$ if and only if

$$\|H_c F\|_\infty < 1.$$

This is a necessary and sufficient condition for
robust stabilization of the nominal plant S.

The set of all stabilizing controllers for S = b/a
is described by the formula

$$R = \frac{y - aW}{x + bW}$$

where ax + by = 1 and W is a free stable
rational parameter. The robust stability condition
then reads

$$\|b(y - aW)F\|_\infty < 1.$$

Any stable rational W that satisfies this inequality
then defines a robustly stabilizing controller R for
S. In case W actually minimizes the norm, one
obtains the best robustly stabilizing controller.
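The robust stability test lends itself to a quick check on a frequency grid.
The sketch below assumes the data of the worked example in this section
(nominal plant $S = (s + 1)/(s - 1)$, weight $F = (3s + 1)/(s + 9)$, delay
range 0 to 0.2 s, and the parameter $W = (15s + 31)/[10(s + 1)(3s + 1)]$); the
grid and the helper names are illustrative, and a finite grid can only
suggest, not prove, that the supremum is below one.

```python
import cmath

# Data assumed from the worked example of this section (S = b/a, a*x + b*y = 1).
a = lambda s: s - 1.0
b = lambda s: s + 1.0
x = lambda s: -0.5
y = lambda s: 0.5
F = lambda s: (3.0 * s + 1.0) / (s + 9.0)
W = lambda s: (15.0 * s + 31.0) / (10.0 * (s + 1.0) * (3.0 * s + 1.0))


def robust_margin(omegas):
    """Largest value of |b (y - a W) F| over the frequency grid."""
    return max(abs(b(1j * w) * (y(1j * w) - a(1j * w) * W(1j * w)) * F(1j * w))
               for w in omegas)


def delay_is_covered(omegas, taus):
    """Check |exp(-j w tau) - 1| <= |F(j w)| at every grid point."""
    return all(abs(cmath.exp(-1j * w * tau) - 1.0) <= abs(F(1j * w))
               for w in omegas for tau in taus)


omegas = [10.0 ** (k / 50.0) for k in range(-150, 151)]   # 1e-3 .. 1e3 rad/s
taus = [0.02 * k for k in range(11)]                      # 0 .. 0.2 s
margin = robust_margin(omegas)
covered = delay_is_covered(omegas, taus)
print(margin, covered)
```

On this grid the weighted complementary sensitivity has constant magnitude
0.4, below the bound of one, and the weight F covers the delay perturbation
for every sampled delay.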


Example 8.3 Consider a plant with the transfer
function

$$S_\tau(s) = \frac{s + 1}{s - 1}\, e^{-\tau s}$$

where the time delay $\tau$ is known only to the
extent that it lies in the interval $0 \le \tau \le 0.2$.
The task is to find a controller that stabilizes the
uncertain plant $S_\tau$. The time-delay factor $e^{-\tau s}$
can be treated as a multiplicative perturbation of
the nominal plant

$$S(s) = \frac{s + 1}{s - 1}$$

by embedding $S_\tau$ in the family

$$S_\Delta := (1 + F\Delta)S$$

where $\Delta$ ranges over the set of stable rational
functions such that $\|\Delta\|_\infty \le 1$. To do this, F
should be chosen so that the normalized perturbation
satisfies

$$\left| \frac{S_\tau(j\omega)}{S(j\omega)} - 1 \right| = \left| e^{-j\omega\tau} - 1 \right| \le |F(j\omega)|$$

for all $\omega$ and $\tau$. A little time with the Bode
magnitude plot shows that a suitable uncertainty
profile is

$$F(s) = \frac{3s + 1}{s + 9}.$$

Figure 2 is the Bode magnitude plot of this F and
$e^{-\tau s} - 1$ for $\tau = 0.2$, the worst value.

Polynomial/Algebraic Design Methods, Fig. 2 Bode plots of F (dotted) and $e^{-0.2s} - 1$ (solid)

The task of stabilizing the uncertain plant
$S_\tau$ is thus replaced by that of stabilizing every
element in the set $S_\Delta$, that is to say, by robustly
stabilizing the nominal plant S with respect to the
multiplicative perturbations defined by F.

The set of all stabilizing controllers for S is
found to be

$$R(s) = \frac{0.5 - (s - 1)W}{-0.5 + (s + 1)W}$$

where $W \neq 0.5/(s + 1)$ is any stable rational
parameter. The robust stability condition reads

$$\|P - QW\|_\infty < 1$$

where

$$P(s) = 0.5(s + 1)\,\frac{3s + 1}{s + 9}, \qquad Q(s) = (s - 1)(s + 1)\,\frac{3s + 1}{s + 9}.$$

Since Q has one unstable zero at s = 1, it follows
from the maximum modulus theorem that the
minimum of the $H_\infty$ norm taken over all stable
rational functions W is P(1) = 0.4 < 1 and this
minimum is achieved for

$$W(s) = \frac{P(s) - P(1)}{Q(s)} = \frac{1}{10}\,\frac{15s + 31}{(s + 1)(3s + 1)}.$$

Thus, the robust stability condition is satisfied,
and the corresponding best robustly stabilizing
controller is

$$R(s) = \frac{2}{13}\,\frac{s + 9}{s + 1}.$$


Polynomial Equation Approach


In order to determine the set of all stabilizing
controllers for a given plant, it is enough to determine
one particular solution of the Bezout equation.
It is therefore plausible that performance
specifications in addition to stability can be met
by selecting an appropriate solution of a polynomial
equation that is related to the Bezout
equation.

The reduction of controller synthesis to solving
polynomial equations is referred to as the
polynomial equation approach to control system
design. The equations involved are Diophantine
equations of the form

$$ap + bq = d$$

where a, b, and d are given polynomials and p,
q are polynomials to be found. Such an equation
is solvable for any d if and only if a and b are
coprime polynomials. Then, the solution set is
given by

$$p = p_0 + bt, \qquad q = q_0 - at$$

where $p_0$, $q_0$ is a particular solution and t is an
arbitrary polynomial.

Such an equation is in fact the pole placement
equation. Thus, pole placement is a prototype
control problem.

Pole Placement

The requirement of stability places all closed-loop
system poles within the left half-plane Re
s < 0. Very often, however, we wish to allocate
the poles to a specific region of the half-plane or
to achieve specific pole positions.

Given a plant S = b/a, the set of all stabilizing
controllers for S is

$$R = \frac{y - aW}{x + bW}$$

where x, y are polynomials such that ax + by = 1
and W is a free stable rational parameter. Let
W = w/d for a stable polynomial d. Then

$$R = \frac{dy - aw}{dx + bw} = \frac{q}{p}$$

and the closed-loop pole polynomial is given by
ap + bq = d(ax + by) = d.

Thus W parameterizes all stabilizing controllers
for S, the denominator polynomial d of W specifies
the positions of the control system poles, and
the numerator polynomial w of W represents the
remaining degrees of freedom, i.e., parameterizes
all stabilizing controllers that assign the specified
poles.

Example 8.4 Consider the plant S(s) = 1/(s - 1)
and the set of stabilizing controllers for S:

$$R(s) = \frac{1 - (s - 1)W}{W}, \qquad W \neq 0.$$

Let the desired pole locations be given by the
polynomial $d = s^2 + 2s + 1$. This is achieved
by putting W = w/d for an arbitrary numerator
polynomial $w \neq 0$.

It is to be noted that d specifies the poles at
finite positions only. Poles at $s = \infty$ will occur
whenever R is not proper rational. To avoid this
situation, w should be constrained to $w = s + \omega$
for any real $\omega$. Then the set of controllers that
achieve the desired pole placement is

$$R(s) = \frac{(3 - \omega)s + (1 + \omega)}{s + \omega}.$$

Alternatively, one can solve the pole placement
equation ap + bq = d directly. The solution set
is

$$p = t, \qquad q = s^2 + 2s + 1 - (s - 1)t$$

and q/p is proper if and only if $t = s + \omega$, $\omega$ real.
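The pole placement identity ap + bq = d can be checked with elementary
polynomial arithmetic. The sketch below uses the plant S = 1/(s - 1) and the
pole polynomial $d = s^2 + 2s + 1$ of Example 8.4; the coefficient-list
representation (lowest degree first) and the helper names are illustrative
choices, not part of the original text.

```python
def poly_add(p, q):
    """Add polynomials given as coefficient lists, lowest degree first."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [u + v for u, v in zip(p, q)]


def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, u in enumerate(p):
        for j, v in enumerate(q):
            out[i + j] += u * v
    return out


def trim(p):
    """Drop zero coefficients of the highest powers."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


a = [-1, 1]          # a(s) = s - 1
b = [1]              # b(s) = 1
d = [1, 2, 1]        # d(s) = s^2 + 2 s + 1, desired closed-loop poles


def placed_controller(omega):
    """Controller R = q/p of Example 8.4 for w = s + omega."""
    p = [omega, 1]                 # p(s) = s + omega
    q = [1 + omega, 3 - omega]     # q(s) = (3 - omega) s + (1 + omega)
    return p, q


closed_loop = {}
for omega in (0, 1, 5):            # arbitrary values of the free real parameter
    p, q = placed_controller(omega)
    closed_loop[omega] = trim(poly_add(poly_mul(a, p), poly_mul(b, q)))
print(closed_loop)
```

Every choice of the free parameter returns the same closed-loop polynomial d,
which illustrates that the numerator of W changes the controller but not the
assigned poles.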


H 2 Optimal Control 

The $H_2$ optimal control is a special case of
pole placement. Indeed, the optimal controller is
given by

$$R = \frac{y - aW}{x + bW} = \frac{y - a\,\frac{r}{\alpha\beta}}{x + b\,\frac{r}{\alpha\beta}} = \frac{\alpha y \beta - ar}{\alpha x \beta + br} = \frac{q}{p}$$

and

$$ap + bq = a(\alpha x \beta + br) + b(\alpha y \beta - ar) = \alpha\beta(ax + by) = \alpha\beta.$$

Thus, the (finite) pole positions of the $H_2$ optimal
control system are given by the pole polynomial
$d = \alpha\beta$. The system has no poles at $s = \infty$ as the
optimal complementary sensitivity function $H_c$ is
strictly proper.

The pole placement equation, however, has
more than one solution. Which one is optimal?
The one for which q/a is strictly proper. It is the
solution pair p, q with q having the least degree.

Example 8.5 Let us reconsider Example 8.2. As
an alternative, one can solve the Diophantine
equation

$$(s - 1)p + q = s + 1$$

for the solution pair p, q such that q/(s - 1)
is strictly proper. This yields the least-degree
solution pair with respect to q, namely, p = 1,
q = 2. The optimal controller is R = q/p = 2.
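For a plant with b = 1, as in Example 8.5, the least-degree solution can be
obtained by plain polynomial division: p is the quotient and q the remainder
of $\alpha\beta$ divided by a. The sketch below relies on that special case; a
general b would call for the extended Euclidean algorithm instead. The helper
`poly_divmod` is a hypothetical utility written for this illustration.

```python
from fractions import Fraction


def poly_divmod(num, den):
    """Divide polynomials given as coefficient lists, highest degree first.

    Returns (quotient, remainder); the remainder has fewer coefficients
    than den and may carry leading zeros.
    """
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        # Subtract factor * den aligned with the leading term, then drop it.
        for i, c in enumerate(den):
            num[i] -= factor * c
        num.pop(0)
    return quotient or [Fraction(0)], num or [Fraction(0)]


a = [1, -1]            # a(s) = s - 1
alpha_beta = [1, 1]    # alpha*beta = s + 1, right-hand side in Example 8.5

# With b = 1 the equation a p + q = alpha*beta is solved by division:
# p is the quotient and q the remainder, so deg q < deg a.
p, q = poly_divmod(alpha_beta, a)
print("p =", p, "q =", q)
```

The division returns p = 1 and q = 2, so R = q/p = 2, in agreement with the
controller found through the parameterization.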

Summary and Future Directions 

The benefits of representing stabilizing controllers
by a single parameter include (1)
easy accommodation of additional design
specifications by selecting an appropriate
parameter, (2) all transfer functions in a stabilized
system being linear in the parameter (while they
are nonlinear in the controller), and (3) the
parameter belonging to a smaller set of stable
rational functions (while the controller can be any
rational function).

The results presented here for linear time-invariant
systems with rational transfer functions
can be generalized to extend the scope of the
theory to include distributed-parameter systems,
time-varying systems, and even nonlinear systems.

The transfer functions of distributed- 
parameter systems are no longer rational, and 
coprime factorizations cannot be assumed 
a priori to exist. The coefficients of time- 
varying systems are functions of time, and the 
operations of multiplication and differentiation 
do not commute. In nonlinear systems, transfer 
functions are replaced by input-output maps. 


Technical assumptions may prevent one from 
parameterizing the entire set of internally 
stabilizing controllers; still, the subset may be 
large enough for practical purposes. For many 
systems of physical and engineering interest, the 
above difficulties can be circumvented and the 
algebraic/polynomial approach carries over with 
suitable modifications. 

Cross-References 

► Control of Linear Systems with Delays 

► Feedback Stabilization of Nonlinear 
Systems 

► H-Infinity Control 

► H 2 Optimal Control 

► Linear State Feedback 

► Robust Synthesis and Robustness Analysis 
Techniques and Tools 

► Spectral Factorization 

► Tracking and Regulation in Linear 
Systems 

Recommended Reading 

The use of polynomials, in one way or another, 
in feedback control system design can be traced 
back to Newton et al. (1957) and Jury (1958). 
The authors noted that for a closed-loop system 
to be stable, H c must absorb the plant unstable 
zeros. The plant was assumed to be stable; if 
this assumption were dropped, H s would have 
been found to absorb the plant unstable poles. 
These conditions are equivalent to polynomial 
divisibility conditions and hence to the Bezout 
stability equation, which appears later in Kucera 
(1974). 

The first attempt to use polynomials in an 
explicit manner is due to Volgin (1962), a 
student of Tsypkin. He obtained a solution of 
the pole placement problem through the solution 
of a polynomial equation, known as the pole 
placement equation. Astrom (1970) published a 
polynomial equation solution to the minimum 
variance control problem for minimum-phase 
plants. The ultimate publication that presents 
the polynomial equation approach to multi-input 
multi-output control system design is Kucera 
(1979). 

The underlying problem in any control system 
design is that of stability. It is logical to design 
the control system step by step: stabilization 
first and then the additional performance 
specifications. To do this, we need to know 
any and all stabilizing controllers for the given 
plant. 

This problem was first addressed and solved 
for finite-dimensional, linear time-invariant 
systems using transfer function methods; see 
Larin et al. (1971), Kucera (1975), Youla et al. 
(1976a,b), and Kucera (1979). A state-space 
representation of all stabilizing controllers was 
derived later by Nett et al. (1984). 

It took decades to appreciate the importance 
of the result and come up with applications. The 
milestones were the observations by Desoer et al. 
(1980) that the polynomial fraction approach can 
be extended to linear systems with nonrational 
transfer functions, as well as the result by 
Hammer (1985) showing that the approach is 
applicable to a broad class of nonlinear systems. 
Further generalizations were obtained by Paice 
and Moore (1990), Anderson (1998), and 
Quadrat (2003, 2006). 

The parameterization of all controllers that 
stabilize a given plant was labeled the Youla- 
Kucera parameterization in Anderson (1998). 
This result launched an entirely new area of 
research and has ultimately become a new 
paradigm for control system design. 

Tutorial textbooks on this subject include 
Vidyasagar (1985), Doyle et al. (1992), and 
Kucera (2003, 2011). The reader is further 
referred to the survey papers by Kucera (1993), 
Anderson (1998), and Kucera (2007). 

Advanced and recent applications of 
the Youla-Kucera parameterization include 
stabilization under constrained inputs (Henrion 
et al. 2001), robust stabilization with fixed-order 
controllers (Henrion et al. 2003), accommodation 
of time-domain constraints on inputs and outputs 
(Henrion et al. 2005a), and determination of 
least-order stabilizing controllers (Henrion et al. 
2005b). 
Abstract 

Voltage stability of electric power systems is a challenging topic both theoretically and in practice. This article touches briefly on the main aspects of the problem and highlights theoretical foundations and fundamental methods for voltage stability analysis. The single-load radial system is used to introduce relevant concepts, such as the PV curve and the instability mechanism, while the implications for a meshed, multiple-load system are briefly outlined. Some applications to practical problems are briefly enumerated. 


Keywords 

Active and reactive power; Load dynamics; Load 
tap changers (LTC); Maximum power transfer; 
PV curve; Stability conditions 


Introduction 

Voltage stability is related to the maximum power transfer in an AC (alternating current) network. In normal conditions, system load demand should never come close to this limit. However, as electricity demand started growing at an increasingly fast pace after the 1970s, transmission network investments could not follow closely enough. Investment cost in transmission is usually high, and difficulties with environmental constraints and the “not in my back yard” mentality of local communities did not make transmission network expansion any easier. Power systems thus rely more and more, for their continuing operation, on (reactive power) compensation and automatic controls to maintain the transmission capacity of relatively weakening networks. 

As a result, instances of voltage instability started to appear in several industrialized countries after the 1980s (Taylor 1994), leading to smaller or larger area blackouts. This came much to the surprise of the power engineering community, which was not prepared to deal with this type of event, in which a usual and expected phase of gradual voltage decline suddenly precipitates into an uncontrollable voltage drop, leading to partial or total blackout after a succession of equipment disconnections by protection devices. 

In power system engineering practice, voltage 
drops following load ramping or sudden events, 
such as equipment loss (line, generator switching, 
etc.), usually referred to in power engineering 
literature as contingencies, are calculated by solving 
a set of nonlinear algebraic equations known 
as the power flow problem. As these are “steady- 
state” equations, the dynamic aspect leading to 
an accelerating, cascading failure is not obvious. 
One should notice however in the above account 
the keyword “nonlinear”: nonlinear equations at 
the maximum power transfer limit no longer have 
a solution. This was and is one of the keys in 
understanding the voltage stability problem. To 
take it one step further, close to the loss of 
solution (loss of equilibrium), a set of dormant 
(up to this point) dynamics become dominant 
leading the system to instability. The following 
sections will explain these notions. 
Power System Voltage Stability, Fig. 1 Single-load 
radial system 


Single Generator-Load (Radial) System 

Maximum Power Transfer 

In any electric network (DC or AC), there is a 
maximum power that can be transferred between 
any two nodes. In a two-node radial system, the 
maximum power transfer coincides with the well- 
known impedance matching conditions. For a 
radial AC system, when the load is restricted to 
a constant power factor, the impedance matching 
condition is that the source (network) impedance 
is equal in magnitude to the load impedance. 

Consider the radial system of Fig. 1. In this system we assume that the load active power $P$ (and possibly reactive power $Q$) is fed through a transformer with adjustable tap ratio $r$ (in per unit). The tap is automatically adjusted by a load tap changer (LTC) so as to keep the secondary voltage $V_2$ within a deadband. We will consider throughout that the LTC is a part of the load. 

The simplest case for this radial system is when both the line and transformer are lossless ($P_G = P_1 = P$) and the load is kept at unity power factor ($Q = 0$). The generator is represented as a constant voltage source $E$. If we further assume that the transformer leakage reactance is negligible ($Q_1 = Q = 0$ in Fig. 1), the maximum power transfer in this simple case is encountered when the load impedance, as seen from the primary ($r^2/G$), is equal to the line reactance: 

$$X = r^2/G \qquad (1)$$

where the load conductance is $G = P/V_2^2$. It can be readily shown that the maximum power in this case is $P_{\max} = E^2/(2X)$. Note that this is a static condition that is not related to how the load varies with the voltage $V_2$. 
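
The maximum power value quoted above follows from a short calculation. The derivation below is a sketch under the stated simplifications (lossless line, unity power factor load, negligible leakage reactance); the symbol $R'$ for the load resistance seen from the primary is introduced here only for convenience.

```latex
% Assumption: series circuit of source E, line reactance X, and
% load resistance R' = r^2/G as seen from the primary side.
\begin{align*}
P(R') &= I^2 R' = \frac{E^2 R'}{R'^2 + X^2}, \\
\frac{dP}{dR'} &= \frac{E^2\,(X^2 - R'^2)}{(R'^2 + X^2)^2} = 0
  \;\Longrightarrow\; R' = X, \\
P_{\max} &= \frac{E^2 X}{2X^2} = \frac{E^2}{2X}, \qquad
V = I R' = \frac{E}{\sqrt{2}} \ \text{at the maximum.}
\end{align*}
```

The stationarity condition $R' = X$ is the impedance matching condition (1).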


The most popular way of visualizing the maximum power condition is through the PV curve of Fig. 2, in which the consumed (transferred) power $P$ is plotted versus the primary (transmission) side voltage $V$. 

In Fig. 2 the nose-shaped solid line is the network characteristic corresponding to all possible solutions of the network equations for a given $P$ (or $V$). The maximum power transfer is easily identified as the tip of the curve (point C). Note that PV curves can be plotted for any load power factor and line resistance. 
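
The network characteristic can be traced numerically. The sketch below assumes the simplest case discussed above (lossless line of reactance X, unity power factor load, constant source voltage E), for which eliminating the angle from the power balance gives V^4 - E^2 V^2 + (XP)^2 = 0. The function name and the per-unit values E = 1.0 and X = 0.5 are illustrative assumptions, not data from the article.

```python
import math

def pv_voltages(p, e=1.0, x=0.5):
    """Return (upper, lower) primary-side voltage solutions for load power p.

    Assumes a lossless line of reactance x, a unity power factor load and a
    constant source voltage e. Returns None when p exceeds e**2 / (2 * x),
    i.e. when the network equations have no solution.
    """
    disc = e**4 - 4.0 * (x * p) ** 2
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    upper = math.sqrt((e**2 + root) / 2.0)
    lower = math.sqrt((e**2 - root) / 2.0)
    return upper, lower

# Trace the nose curve from no load up to the maximum power E^2 / (2X) = 1.0.
for p in (0.0, 0.4, 0.8, 1.0, 1.1):
    print(p, pv_voltages(p))
```

The two branches merge at the tip of the curve, and no solution is returned beyond it.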

Load Dynamics and Voltage Stability 

As stated above, maximum power transfer is a static condition based on network equations only. To identify its relation to voltage stability, some form of load dynamics must be introduced. Load dynamics generally change the load characteristics so as to adjust the load power consumption $P$ to a given load demand $P_0$. As a disturbance usually reduces voltage (and thus the consumption of a voltage-sensitive load), load dynamics tend to restore the consumption to the pre-disturbance demand. 

Load restoration can be continuous, for instance, represented by a time-varying conductance following the ODE: 

$$T\,\dot{G} = P_0 - G V_2^2 \qquad (2)$$

Clearly in this case, the stability condition is that the consumption $P = G V_2^2$ increases with the increase of the load conductance $G$: 

$$\frac{dP}{dG} > 0 \qquad (3)$$

It is easily verified from Fig. 2 that this condition is met only in the upper part of the PV curve before point C, whereas in the lower part, after point C, increased conductance results in lower consumption, violating (3). Clearly at C, (3) holds as an equality. 
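
The role of this stability condition can be illustrated by a small simulation. The sketch below assumes a first-order restoration law of the form T dG/dt = P0 - G V2^2, a fixed tap ratio r = 1, and the lossless unity power factor network, for which V2^2 = E^2 / (1 + (XG)^2). The function name, the time constant, and all numerical values are illustrative assumptions.

```python
import math

def simulate_load_restoration(p0, e=1.0, x=0.5, t_const=10.0, g0=0.5,
                              dt=0.01, t_end=300.0):
    """Euler-integrate T dG/dt = P0 - G*V2^2 for the lossless radial system.

    Returns the final conductance, consumed power and load voltage.
    """
    g = g0
    for _ in range(int(round(t_end / dt))):
        v2_sq = e**2 / (1.0 + (x * g) ** 2)  # network: V^2 as a function of G
        g += dt * (p0 - g * v2_sq) / t_const
    v2_sq = e**2 / (1.0 + (x * g) ** 2)
    return g, g * v2_sq, math.sqrt(v2_sq)

stable = simulate_load_restoration(0.8)    # demand below P_max = 1.0
unstable = simulate_load_restoration(1.1)  # demand above P_max
print("P0 = 0.8:", stable)
print("P0 = 1.1:", unstable)
```

For a demand below the maximum power, the conductance settles on the upper part of the PV curve. For a demand above it, no equilibrium exists: the conductance keeps growing while both consumption and voltage fall.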

Assuming the load on the secondary side to be a constant admittance, we can distinguish two types of load characteristics in Fig. 2: the transient (short-term) load characteristic, shown with dotted lines, corresponds to a specific transformer tap ratio $r$, whereas the long-term load characteristic, shown with dashed lines for different load demands, corresponds to equilibrium conditions where $V_2$ is within the deadband and approximately equal to $V_0$. 

Power System Voltage Stability, Fig. 2 PV curve of the radial system 

Load dynamics can also be discrete, e.g., driven by the tap changing transformer of Fig. 1. As the LTC is trying to restore the secondary voltage, it will reduce $r$ when $V_2 < V_0 - d$ and will increase $r$ when $V_2 > V_0 + d$, where $d$ is half of the deadband. 

The effect of tap ratio increase in the upper and lower part of the PV curve is shown in Fig. 2 (points S and U). In the upper part, increased $r$ will reduce consumption, which implies that $V_2$ is also reduced, as expected. In the lower part (point U), increased $r$ will increase consumption, indicating an increased $V_2$ and thus an unstable LTC operation. The stability condition in this case is 

$$\frac{dV_2}{dr} < 0 \qquad (4)$$

Clearly for either discrete or continuous dynamics, at the maximum power point C, a stable and an unstable equilibrium branch come together, leaving no equilibrium points for higher demand. In bifurcation theory this point is known as a saddle-node bifurcation (SNB). 
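
The discrete LTC logic described above can also be sketched in code. The example assumes an ideal transformer, a constant conductance G on the secondary, and the lossless unity power factor network, so that V2 = E / sqrt(r^2 + (XG)^2 / r^2). The tap step, tap limits, and other numbers are hypothetical choices for illustration only.

```python
import math

def secondary_voltage(r, g, e=1.0, x=0.5):
    """Secondary voltage for a constant conductance g behind tap ratio r."""
    return e / math.sqrt(r**2 + (x * g) ** 2 / r**2)

def run_ltc(g, v0=1.0, d=0.01, r=1.1, step=0.00625,
            r_min=0.6, r_max=1.2, max_moves=200):
    """Move the tap one step at a time until V2 is inside the deadband.

    The tap is reduced when V2 < v0 - d and increased when V2 > v0 + d;
    the loop stops at a tap limit or after max_moves tap changes.
    """
    for _ in range(max_moves):
        v2 = secondary_voltage(r, g)
        if v2 < v0 - d and r - step >= r_min:
            r -= step
        elif v2 > v0 + d and r + step <= r_max:
            r += step
        else:
            break
    return r, secondary_voltage(r, g)

print(run_ltc(0.5))   # light load: voltage restored inside the deadband
print(run_ltc(1.25))  # heavy load: tap runs to its limit, voltage not restored
```

In the heavy-load case the demand at nominal voltage (1.25 per unit) exceeds the maximum power E^2/(2X) = 1.0, so the tap movements pass the point where they help and end up lowering the secondary voltage.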


Effect of Generation 

The generator behind the constant voltage source 
E of Fig. 1 supplies the active power consumed 
by the load (it would cover also active losses, if 
present). In practice this means that it requires 
a governor with PI (proportional plus integral) 
control, as is customary for autonomous systems, 
and a prime mover of the required capacity. The 
generator also maintains the constant voltage E 
assumed in the calculations. This requires an 
automatic voltage regulator (AVR), which can be 
assumed in this simple example as being also 
of PI type. The AVR is adjusting the DC rotor 
(field) current of the synchronous generator so as 
to maintain the terminal voltage constant. 

For a given load, the active and reactive generation $P_G + jQ_G$ required is directly calculated from the network equations. The electromotive force (EMF) corresponding to the field current can then be determined using standard synchronous machine equations, preferably taking into account the saturation of the machine iron core (Van Cutsem and Vournas 1998). It should be noted that due to thermal constraints, it is not possible to exceed a maximum rotor current in continuous operation. This results in a rotor current limit that is enforced by the generator overexcitation limiter (OEL). If loading conditions are such that the OEL is activated, the generator terminal voltage $E$ cannot be maintained constant, and thus the voltage source $E$ has to be replaced by a constant EMF in series with the generator reactance. This leads to a much more restrictive limit for the maximum power transfer. 

In power flow calculations, the generator excitation limit is usually represented by a maximum allowable reactive generation $Q^{\max}$. When this limit is reached, the reactive generation remains constant, and thus the terminal voltage is allowed to vary, i.e., the generator becomes a PQ bus. Note however that $Q^{\max}$ of an actual generator is not constant but depends on terminal voltage and on active generation. 

In any case the overexcitation limit of synchronous generators and the resulting limitation of the reactive support they offer is an important factor determining maximum power and thus voltage stability limits. In practice voltage instability is reached only after some critical generators have reached the overexcitation limit. 

Voltage Instability Mechanism 

Following the preceding discussion, it is possible 
to describe the mechanism of voltage instability 
as follows (Van Cutsem and Vournas 1998): 

Voltage instability stems from the attempt of load 
dynamics to restore power consumption beyond 
the capability of the combined transmission and 
generation system. 

A voltage instability incident can occur either 
through a gradual load increase up to the maximum 
power limit or, most commonly, following 
a contingency (or a cascade of contingencies) 
drastically reducing the maximum power transfer 
below the pre-contingency demand. Thus, any 
attempt at restoring power to the pre-contingency 
demand will induce an unstable response leading 
to voltage collapse. 

As the load dynamics are the driving force of voltage instability, the time scale of load restoration is the one characterizing voltage stability. Thus, fast recovering loads, such as induction motors and power electronics-driven devices, tend to restore load in a second or less and constitute what is known in power system dynamic analysis as the short-term time scale (Kundur et al. 2004). Study of relevant problems (motor stalling, etc.) is part of short-term voltage stability analysis. 

In a slower time scale of several seconds up to minutes, load recovery dynamics include the LTCs and thermostatically controlled loads. This is the time scale of long-term voltage stability analysis (Kundur et al. 2004). Note that for long-term voltage stability, the short-term dynamics such as those of motors and generators are considered to be in equilibrium. In system representation this assumption leads to the replacement of short-term differential equations with algebraic equilibrium equations. This assumption is known as the quasi-steady-state (QSS) approximation. 

Multiple-Load (Meshed) System 

The single-load system of Fig. 1 serves well in defining the voltage stability problem and helps visualize its significance through the PV curve representation and the maximum loading or critical point C. In actual power systems, however, there are multiple loads defining a multidimensional space where it is sometimes tricky to apply the simple concepts of Fig. 1. For instance, it is important to distinguish between the supply system, which can be represented by a Thevenin equivalent, and the consumption part, where loads affect each other and cannot be examined individually, one at a time. 

Consider the power system of Fig. 3, where multiple generators are feeding a number of loads through a meshed network represented by the complex admittance matrix $Y$. The steady-state conditions of the system including generation and load are traditionally represented by the power flow equations: 

$$(P_{Gi} + jQ_{Gi}) - (P_{Li} + jQ_{Li}) - V_i \sum_{k=1}^{N} Y_{ik}^{*} V_k^{*} = 0, \quad i = 1,\ldots,N \qquad (5)$$

Using real variables, (5) can be written as 

$$g(x, p) = 0 \qquad (6)$$

where $p$ is the vector of independent parameters (load demands, generator setpoints) and $x$ the vector of dependent variables (voltages and angles). 

Power System Voltage Stability, Fig. 3 Meshed power system 

Note that in this representation, the load is referred to the primary side of LTC transformers as in Fig. 1. This can be considered constant at equilibrium corresponding, for instance, to secondary (distribution side) voltage restoration at its setpoint value $V_{2i} = V_{0i}$. 

Concerning generators, the active power $P_{Gi}$ cannot be treated as constant when load is varying, so there has to be a participation factor attached to each generator bus that will represent the primary or secondary frequency regulation characteristic (Van Cutsem and Vournas 1998). This is sometimes referred to as the distributed slack bus approach. For reactive power the limits $Q^{\max}$ of reactive support should be set, beyond which the generator voltage is no longer constant (switch from PV to PQ bus). 

The solution of the $N$ nonlinear complex equations (5) for a given load demand determines all complex voltages in the system. As in the simple radial system case, there may exist multiple solutions, some of which are unstable, or no solutions at all. The stability limits, where (3) and (4) hold as equalities for the radial system, are now given by the singularity of the Jacobian of the equilibrium conditions (6): 

$$\det D_x g = 0 \qquad (7)$$

The stability limit can also be determined by the singularity of the state matrix (Medanic et al. 1987; Van Cutsem and Vournas 1998): 

$$\det A = 0 \qquad (8)$$

Note that the impedance matching condition for a single load amounts to the diagonal element $a_{ii} = 0$, which is much stricter than the singularity condition (8) that marks the actual onset of instability. The points satisfying (7) and (8) are critical points and form a multidimensional manifold in parameter space called the bifurcation surface. 


Applications 

The above analysis briefly touches on fundamentals. Detailed analysis tools for voltage stability include (but are not limited to) continuation power flow, VQ curves, time simulation (short-term, long-term, QSS), sensitivity, and eigenvalue/singular value analysis. Voltage security analysis is presently applied online in various control centers based on the above methods of analysis for a large number of contingencies. Countermeasures to voltage instability and collapse cover a wide spectrum, from automatic switching of reactive devices to special protection controls and load shedding as a last resort. Further details can be sought in the textbooks by Taylor (1994) and Van Cutsem and Vournas (1998). 
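
As a small numerical illustration of the Jacobian singularity condition, the sketch below evaluates the determinant of the power flow Jacobian for the single-load radial system (lossless line, unity power factor load) along the upper-voltage branch. The two-equation model, the choice of unknowns (load voltage V and angle difference delta), and the per-unit values are assumptions made for this example; it is not a continuation power flow.

```python
import math

def power_flow_jacobian_det(p, e=1.0, x=0.5):
    """det of dg/d(V, delta) on the upper-voltage branch of the radial system.

    g1 = (E V / X) sin(delta) - P        (active power balance)
    g2 = (E V / X) cos(delta) - V^2 / X  (reactive power balance, Q = 0)
    """
    disc = e**4 - 4.0 * (x * p) ** 2
    if disc < 0.0:
        raise ValueError("no power flow solution: p exceeds e**2 / (2 * x)")
    v = math.sqrt((e**2 + math.sqrt(disc)) / 2.0)
    cos_d = v / e                # from the reactive power balance
    sin_d = x * p / (e * v)      # from the active power balance
    dg1_dv = (e / x) * sin_d
    dg1_dd = (e * v / x) * cos_d
    dg2_dv = (e / x) * cos_d - 2.0 * v / x
    dg2_dd = -(e * v / x) * sin_d
    return dg1_dv * dg2_dd - dg1_dd * dg2_dv

for p in (0.0, 0.5, 0.8, 0.95, 1.0):
    print(p, power_flow_jacobian_det(p))
```

The determinant shrinks as the load approaches the maximum power E^2/(2X) = 1.0 and vanishes there, which is the critical point of the PV curve.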
Abstract 

Powertrain electrification and hybridization have rapidly become part of the portfolio of all major automotive manufacturers, ranging from hybrid-electric, to plug-in hybrid-electric, to battery-electric vehicles, to hybrid-hydraulic and hybrid-mechanical solutions. The increased complexity of the powertrain systems associated with hybrid vehicles presents interesting control challenges and problems, and this entry describes the more common architectures of hybrid-electric vehicle powertrains and their operation, focusing on the important problem of optimal control for energy management of hybrid-electric vehicles, on mode switching, and on battery management. In the conclusion, a connection is made between these problems and their interaction with intelligent transportation systems. 

Keywords 

Battery management; Intelligent transportation 
systems; Vehicle-grid interaction 

Introduction 

Increasingly stringent fuel economy and emissions regulations have required the automotive industry to consider more fuel-efficient powertrains and alternative primary sources of transportation fuels. Powertrain electrification and hybridization have rapidly become part of the portfolio of all major automotive manufacturers, ranging from hybrid-electric, to plug-in hybrid-electric, to battery-electric vehicles, to hybrid-hydraulic and hybrid-mechanical solutions. The increased complexity of the powertrain systems associated with hybrid vehicles presents interesting control challenges and problems. This entry describes control problems associated with hybrid-electric vehicles (HEVs) and battery-electric vehicles (BEVs). 

HEV Powertrains 

An HEV powertrain contains at least two power sources: a primary engine, typically a combustion engine or a fuel cell fueled by a chemical fuel (in liquid or gaseous form), and a secondary power source that makes use of a rechargeable energy storage system (RESS) that permits buffering the power demand of the vehicle so as to provide choices in the use of the power sources. While it is possible to design hybrid powertrains using secondary hydraulic or mechanical energy conversion and storage devices (hydraulic pump/motors and accumulators, mechanical flywheels), the majority of hybrid powertrains in use today employ electric machines and electrochemical energy storage devices (batteries and supercapacitors); thus, this entry focuses exclusively on hybrid-electric vehicles (HEVs). Electric vehicles (EVs) can be viewed as a special case of HEVs in which no internal combustion engine is present, and many of the considerations that follow apply also to hydraulic and mechanical hybrids. HEVs may be classified according to their powertrain architecture as shown in Fig. 1. 

Powertrain Control for Hybrid-Electric and Electric Vehicles, Fig. 1 Hybrid powertrain configurations (After Rizzoni and Peng (2013), courtesy: Dr. Chiao-Ting Li, the University of Michigan) 

A series HEV powertrain employs an electric machine (EM) to propel the vehicle while using an internal combustion engine (ICE) coupled to a second EM as an electrical generator set. In a series HEV, the electrical generator set can provide power directly to the electric traction system, via an electrical DC bus, or can charge an RESS (e.g., battery), or can perform both functions. Motive power to the vehicle is delivered by the primary EM. Thus, a series HEV blends electrical power from an RESS with electrical power generated by an ICE-powered generator set to provide motive power to the vehicle. Deciding how much electrical power to draw from each of the two power sources to meet the power demand of the vehicle is an important control objective. A further feature of interest is the ability to recover some of the kinetic energy of the vehicle during braking events by using the traction EM in generator mode to recharge the RESS. 

A parallel HEV powertrain blends mechanical power from the ICE and one or more EMs through appropriate mechanical coupling and transmission elements to deliver motive power to the vehicle or to recharge the RESS. In a parallel HEV powertrain, the same EM is used to provide power to the vehicle (motor mode) and to provide energy to the RESS (generator mode); in the latter case, the RESS can be recharged either by providing power from the ICE in excess of that required by the vehicle or by converting the kinetic energy of the vehicle into electrical power through the braking action of the EM. 

A third configuration, the one that is most commonly found among passenger vehicles in commercial production today, is the power-split HEV, in which the benefits of both series and parallel HEVs are achieved, most commonly by using one or more planetary gear sets to couple two EMs to the ICE on one side and to the driveline on the other. 

Regardless of architecture, HEV powertrains 
enable fuel savings and emissions reductions by 
operating in a variety of modes that include 
load leveling, regenerative braking, engine start-stop, 
and transmission optimization (Miller 2004; 
Rizzoni and Peng 2013). All of these functions 
benefit from the availability of an RESS and of 
bidirectional power converters, that is, the electric 
drive system(s) that can serve both motor and 
generator functions. 

HEV Operation 

An HEV is considered charge sustaining if the RESS is recharged only by power supplied by the ICE or by regenerative braking. If, on the other hand, the vehicle is designed to deplete energy stored in the RESS during the course of a trip, ending the trip with a lower state of stored energy than at the start and requiring recharging from the electrical grid, the vehicle is called charge depleting and is commonly referred to as a plug-in HEV (PHEV). PHEVs can in turn be subdivided into blended-mode PHEVs, in which stored electrical energy and fuel chemical energy are used jointly to achieve minimum overall energy use, and extended-range electric vehicles (EREVs), in which electrical energy is used exclusively to power the vehicle until a lower bound is reached, at which point the vehicle uses both ICE and EM(s) to behave like a charge-sustaining hybrid. In principle, any of the architectures of Fig. 1 can be used in any of these modes. A battery-electric vehicle, or BEV, is an extreme case of an EREV, in which the vehicle is not equipped with an ICE. Miller (2004) provides an excellent overview of the technology underlying each of the powertrain architectures mentioned so far. 


Control Problems in X-EVs 

Let us refer to the general case of a hybrid or electric vehicle as an X-EV, with the X in X-EV representing any of the architectures discussed so far: X = H, PH, ER, or B. X-EVs enable multiple configurations and operating modes of the powertrain, presenting a number of interesting control problems above and beyond those that are already present in non-hybrid powertrains (e.g., engine and transmission control). In general, the control architecture of an HEV is hierarchical, with a higher-level (supervisory) controller that manages the power flows and mode changes (e.g., from electric to hybrid in an EREV) to meet the vehicle fuel economy, emissions, performance, and drivability requirements. Figure 2 depicts a hierarchical control architecture in use in a prototype PHEV. 

In an X-EV, two problems are especially important: optimal energy management, that is, the ability to optimize the energy use of a vehicle during a trip, and mode switching, that is, the ability to select the appropriate operating mode and to smoothly switch between modes. 

The higher-level controller issues set points 
to lower-level controllers that are used to 
manage the ICE, the EM(s), the mechanical 
transmission system, the brake system, and 
the RESS, as well as other auxiliary functions 
in the vehicle. In this article we primarily 
consider the higher-level controller and focus 
on two problems that are especially relevant 
to HEVs: optimal energy management and 
mode switching. In addition, we also consider 
the battery controller (often called battery 
management system or BMS), which, while 
being a low-level controller, is very specific to 
X-EVs. 

Optimal Energy Management 

The optimal energy management problem in an X-EV consists of finding the control $u(t)$ that leads to the minimization of a performance index $J$ over the time horizon $[t_0, t_f]$, corresponding to a driving cycle, or trip; the problem is subject to constraints that are related: 

(i) To physical limitations of the actuators and the energy stored in the RESS 

(ii) To the requirement to maintain the RESS state of energy within prescribed limits (in a charge-sustaining X-EV) or to track a specified RESS stored energy trajectory (in charge-depleting X-EVs) 

Let $L(\cdot)$ be a suitable function of the system states and inputs that accounts for the quantities we wish to minimize, for example, fuel consumption or emissions of carbon dioxide. Then, we define the cost function 

$$J(x(t),u(t)) = \int_{t_0}^{t_f} L(x(t),u(t),t)\,dt \qquad (1)$$

which is to be minimized for every trip. In general, the exact driving cycle, or profile, associated with a trip is not completely known; thus, a causal solution to this problem is impossible to achieve without making some assumptions. Various approaches to solve (1) have been proposed over the years; we cite (i) dynamic programming (DP), (ii) local optimization solutions as surrogates of a global solution, (iii) Pontryagin’s minimum principle (PMP), and (iv) rule-based methods. Onori et al. (2014) provide a comprehensive overview of the problem as well as detailed examples. We briefly review approaches (i), (ii), and (iii) in the present article. 

Global Optimization by Dynamic 
Programming 

If the driving cycle, represented by the vehicle instantaneous velocity over time, $v(t)$, is known, it is possible to cast (1) in such a form that a DP solution is possible. For example, in a charge-sustaining X-EV, one can find the sequence of inputs that minimizes the trip fuel consumption while sustaining the desired state of charge of a battery and meeting the speed profile of the vehicle. In this problem, the input is the power supplied by the battery to the electric machine, and the state of charge of the battery, SOC, is the only state; all other subsystems (engine, electric drives, transmission, etc.) are modeled via quasi-static efficiency models that can be represented by algebraic equations (e.g., Willans lines, Rizzoni et al. 1999) or by maps. The vehicle velocity profile, $v(t)$, is converted to a vehicle power request, $P_{req}(t)$, knowing the vehicle load characteristics (aerodynamic, inertial, rolling and drivetrain friction, and road grade). In turn, the power required to meet a specific load profile is the sum of the power delivered by the ICE and EM, $P_{req}(t) = P_{ICE}(t) + P_{bat}(t)$. So, for example, we seek the control input, $P_{bat}(t)$, that corresponds to the minimum fuel consumption, that is, 

$$\min_{\{P_{ICE}(t),\,P_{bat}(t)\ \forall t\}} \int_{t_0}^{t_f} \dot{m}_f(t)\,dt,$$

while delivering the requested vehicle power. The problem has physical constraints in the actuators (maximum and minimum power that can be delivered by ICE and EM), as well as the requirement for the control policy to be charge sustaining, which is translated into the additional condition $SOC(t_0) = SOC(t_f)$. While this is only a sketch of the problem formulation (see Onori et al. (2014) for a detailed treatment), it should be clear that it is possible to find a DP solution. If the vehicle is charge depleting, the problem can be similarly formulated with $SOC(t_f) < SOC(t_0)$. 
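
The conversion from a velocity profile to a power request can be sketched with a standard longitudinal road-load balance. The function below is an illustrative assumption rather than the model used in the cited references: it sums inertial, rolling, grade, and aerodynamic forces and ignores drivetrain friction, and every parameter value (mass, rolling coefficient, drag area) is hypothetical.

```python
import math

def power_request(v, a, grade=0.0, mass=1500.0, c_rr=0.010,
                  rho=1.2, cd_area=0.65, g=9.81):
    """Tractive power in W at speed v (m/s) and acceleration a (m/s^2).

    grade is the road slope angle in radians; cd_area is the product of
    drag coefficient and frontal area in m^2. Drivetrain friction is ignored.
    """
    f_inertia = mass * a
    f_roll = mass * g * c_rr * math.cos(grade)
    f_grade = mass * g * math.sin(grade)
    f_aero = 0.5 * rho * cd_area * v**2
    return (f_inertia + f_roll + f_grade + f_aero) * v

# Power request along a short, made-up speed trace sampled every second.
speeds = [0.0, 2.0, 4.0, 6.0, 6.0, 4.0]
for k in range(1, len(speeds)):
    accel = speeds[k] - speeds[k - 1]  # finite difference with dt = 1 s
    print(k, round(power_request(speeds[k], accel), 1))
```

A negative value indicates a braking phase, during which part of the kinetic energy could be recovered by regenerative braking.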

In practice, this approach requires complete 
information of the vehicle velocity profile, and 
DP is not an implementable, causal solution 
to the X-EV energy management problem. It 
is, however, a very useful tool to establish 
a benchmark for a problem or as an aid in 
developing a rule base (Onori et al. 2014). 
Stochastic DP methods have been proposed to 
circumvent the need to know the driving cycle 
exactly (see, e.g., Tate et al. 2007). 

Local Optimization by Equivalent Fuel 
Consumption Minimization 

A heuristic approach that has met with success 
is to solve (1) as a local optimization problem, 
wherein 

$$\int_{t_0}^{t_f} \min_{\{P_{ICE}(t),\,P_{BAT}(t)\ \forall t\}} \dot{m}_f(t)\,dt$$

is used as an approximation for 

$$\min_{\{P_{ICE}(t),\,P_{BAT}(t)\ \forall t\}} \int_{t_0}^{t_f} \dot{m}_f(t)\,dt.$$

This approach gives rise to the Equivalent fuel 
Consumption Minimization Strategy (ECMS) 
(Paganelli et al. 2001), which accounts for 
the use of stored electrical energy, in units of 
chemical fuel use (g/s), such that one can define 
an “equivalent fuel consumption” taking into 
account the cost of the electrical energy used to 
produce $P_{BAT}(t)$ by way of the fuel that must 
be used at a future time to replenish the stored 
electrical energy in the RESS. The equivalent 
fuel consumption is defined in (2): 

$$\dot{m}_{f,eq}(t) = \dot{m}_f(t) + \dot{m}_{eq}(t) = \dot{m}_f(t) + s(t)\,\frac{E_{BAT}}{Q_{LHV}}\,\dot{SOC}(t) \qquad (2)$$

In (2), $\dot{m}_{f,eq}$ is the equivalent fuel consumption, 
$\dot{m}_f$ is the actual chemical fuel consumption, $\dot{m}_{eq}$ 
is the virtual fuel consumption corresponding to 
the use of electricity stored in the battery (to be 
replenished in the future), $E_{BAT}$ is the energy 
capacity of the battery, $Q_{LHV}$ is the lower heating 
value of the chemical fuel, and $s(t)$ is the equivalence 
factor that assigns a cost to the use of electricity. 
Then, the global minimization problem of 
(1), with $f(\cdot)$ equal to $\dot{m}_{f,eq}$, becomes the 
problem of finding 

$$\int_{t_0}^{t_f} \min_{\{P_{ICE}(t),\,P_{BAT}(t)\ \forall t\}} \dot{m}_{f,eq}(t)\,dt.$$

This approach, which can be easily implemented, has 
been used widely and has been shown to closely 
approximate the global optimal solution if sufficient 
knowledge of the vehicle driving cycle is 
available. The method does require empirical 
calibration and tuning of the equivalence factor, 
$s(t)$, the optimal value of which is dependent 
on the driving cycle. Such calibration could be 
automated by using a predictor to generate a 
short-horizon estimate of the driving cycle and an 
adaptor to generate an appropriate $s(t)$ (Musardo 
et al. 2005). 
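
The instantaneous minimization at the heart of ECMS can be sketched as follows. This is only an illustration under stated assumptions: battery power is counted positive on discharge, the virtual fuel flow is written in the common form `s * P_BAT / Q_LHV`, and the fuel map `fuel_rate`, the heating value, and the candidate power levels are hypothetical.

```python
# ECMS sketch: at each instant, pick the battery power that minimizes the
# equivalent fuel rate. All models and numbers are illustrative assumptions.

Q_LHV = 43_000.0  # J/g, assumed lower heating value (roughly gasoline)


def fuel_rate(p_ice_w):
    """Hypothetical fuel map (g/s): engine off at zero power, otherwise an
    idle loss plus a constant 35 % conversion efficiency."""
    if p_ice_w <= 0.0:
        return 0.0
    return 0.2 + p_ice_w / (0.35 * Q_LHV)


def ecms_step(p_req_w, s, p_bat_candidates_w):
    """Return the battery power (W) minimizing the equivalent fuel rate."""

    def equivalent_fuel_rate(p_bat_w):
        p_ice_w = p_req_w - p_bat_w  # power balance P_req = P_ICE + P_BAT
        # Actual fuel plus virtual fuel assigned to the electrical energy used.
        return fuel_rate(p_ice_w) + s * p_bat_w / Q_LHV

    # Keep only candidates the engine can complement (no negative engine power).
    feasible = [p for p in p_bat_candidates_w if p_req_w - p >= 0.0]
    return min(feasible, key=equivalent_fuel_rate)


CANDIDATES_W = [-10_000.0, -5_000.0, 0.0, 5_000.0, 10_000.0]
low_s = ecms_step(10_000.0, 2.0, CANDIDATES_W)   # electricity is "cheap"
high_s = ecms_step(10_000.0, 4.0, CANDIDATES_W)  # electricity is "expensive"
```

In this toy setting a low equivalence factor makes the strategy discharge the battery fully, while a high one makes it recharge through the engine, which illustrates why the choice of s(t) governs whether charge is sustained over a given cycle.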

Optimization by Pontryagin's Minimum 
Principle 

Pontryagin’s minimum principle (PMP) can also 
be employed to solve the X-EV energy management 
problem. If, again, the fast dynamics of the 
system are neglected, the state equation is 

$$\dot{x}(t) = f(x, u, t) = -\frac{1}{Q_{BAT}}\,I_{BAT}(x, u, t) \qquad (3)$$

where $x = SOC$ is the state of charge of the 
battery, $Q_{BAT}$ is the battery capacity, 
and $I_{BAT}$ is the instantaneous battery current. If 
the input is the power requested of the battery, 
$P_{BAT}(t)$, which in turn determines the engine 
power request, $P_{ICE}(t)$, and hence the fuel 
consumption, then the Hamiltonian function can be 
defined to be 

$$H(x(t), P_{BAT}(t), \lambda(t)) = \dot{m}_f(P_{BAT}(t)) + \lambda(t) \cdot f(x(t), P_{BAT}(t), t) \qquad (4)$$

In (4), $f(\cdot)$ is given by Eq. (3), and the control 
$P^{*}_{BAT}(t)$ that minimizes Eq. (4) at each time 
instant is 

$$P^{*}_{BAT}(t) = \arg\min_{P_{BAT}} H(x(t), P_{BAT}(t), \lambda(t)) \qquad (5)$$

The co-state variable, $\lambda(t)$, is the solution of 

$$\dot{\lambda}(t) = -\lambda(t)\,\frac{\partial f(x(t), u(t))}{\partial x} \qquad (6)$$

Eqs. (3) and (5), with boundary conditions $x(t_0)$ and 
$x(t_f)$, can be solved numerically; in Serrao et al. 
(2009, 2011) it is shown that the co-state $\lambda(t)$ is 
related to the equivalence factor of Eq. (2), confirming 
that the intuitive ECMS solution is in fact 
the PMP solution, provided that the equivalence 
factor (or co-state) is time varying and satisfies 

$$H(t, x, u, \lambda) = \dot{m}_f + \lambda(t)\,\dot{x}(t) \quad \text{and} \quad s(t) = -\lambda(t)\,\frac{Q_{LHV}}{E_{BAT}} \qquad (7)$$

Powertrain Control for Hybrid-Electric and Electric Vehicles, Fig. 3 State diagram illustrating mode switching in 
a PHEV (Courtesy: The Ohio State University EcoCAR 2 Team) 


The PMP solution is also cycle dependent, as 
the optimal initial condition for the co-state is 
dependent on the driving cycle. This dependence 
on the driving cycle, whether expressed in 
terms of an equivalent fuel consumption in 
the ECMS solution or as the initial condition 
of the co-state in the PMP solution, is an 
unavoidable consequence of the fact that the fuel 
consumption of a vehicle is strongly dependent 
on the driving conditions, which affect the vehicle 
load. 

The basic concepts outlined above continue 
to be the subject of further development; for 
example, integrating trip information 
available from navigation and geographical 
information systems into predictive energy 
management algorithms and considering battery 
aging as a cost in the optimization function 
are but two of the research areas being 
pursued. 


Mode Switching 

X-EV architectures permit multiple operating 
modes to exploit the design and control flexibility 
available in the powertrain. Some examples are 
the following: an X-EV could operate in pure 
EV mode or in hybrid mode (whether series, 
parallel, or power-split), could use special control 
algorithms during regenerative braking events 
to provide maximum energy recovery without 
adversely affecting brake and vehicle stability 
control systems, and could implement special 
start-stop control strategies that minimize fuel 
consumption at idle without adversely affecting 
engine cold- or warm-start emissions and 
without inducing unwanted transient vibrations 
(Canova et al. 2009). Figure 3 depicts an 
example of a state flow diagram that could be 
implemented in a finite state machine. Mode 
switching can result in drivability problems 
(Wei and Rizzoni 2004), that is, in undesirable 
transient response characteristics during mode 
changes. An X-EV can, in this context, be 
represented as a hybrid system (Koprubasi et al. 
2007). 


Battery Management Systems 

The most common RESS in hybrid vehicles is 
the electrochemical battery. A hybrid or electric 
vehicle uses a battery pack that is typically composed 
of modules, which in turn consist 
of battery cells connected in series and parallel. 
Battery management systems are necessary to 
provide charge balancing, cell protection, state of 
charge and state of health estimation, and other 
functions related to the management of the stored 
energy. A good overview of battery systems and 
associated control problems may be found in 
Rahn and Wang (2013). 

Two important problems related to battery 
management are state of charge (SOC) and state 
of health (SOH) estimation. SOC estimation is a 
necessary component of any battery management 
system. The SOC of a battery is defined by the 
following equations, in which $x$ is the SOC, $Q_{BAT}$ 
is the battery capacity in ampere-hours, and $\eta$ is 
the battery charging/discharging efficiency: 


$$\dot{x}(t) = \frac{\eta}{Q_{BAT}}\,I_{BAT}(t), \qquad x(t) = x(t_0) + \frac{1}{3{,}600\,Q_{BAT}} \int_{t_0}^{t} \eta\,I_{BAT}(\tau)\,d\tau \qquad (8)$$


In practice, there are two problems with 
using current integration (also called Coulomb 
counting) to estimate SOC: (i) errors in 
numerical integration accumulate and may cause 
significant bias error in the estimate, and (ii) 
the actual capacity of the battery is unknown 
during vehicle operation, as it changes over 
time due to battery aging. A second SOC 
estimation approach consists of correlating 
the battery open-circuit voltage to the SOC, 
but this approach also suffers from significant 
uncertainty, as the open-circuit voltage-SOC 
correlation curves are only accurate in stationary 
conditions (constant temperature, with battery 
at rest). SOC estimation has been the subject of 


much research and has seen the use of Kalman 
filters, extended Kalman filters, particle filters, 
and other estimation approaches (Chaturvedi 
et al. 2010). 
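
The drift problem of Coulomb counting can be made concrete with a minimal sketch. It assumes a sign convention in which positive current charges the battery, a fixed sample time, and a known capacity; the function name `coulomb_count` and all numbers are hypothetical.

```python
# Coulomb-counting sketch: discrete-time integration of the battery current.
# Assumptions: positive current charges the battery, capacity is known exactly.


def coulomb_count(soc0, currents_a, dt_s, q_bat_ah, eta=1.0):
    """Integrate current samples (A) taken every dt_s seconds into an SOC estimate."""
    soc = soc0
    for i_bat in currents_a:
        # 3,600 converts the capacity from ampere-hours to ampere-seconds.
        soc += eta * i_bat * dt_s / (3600.0 * q_bat_ah)
    return soc


# Battery at rest for one hour, but the current sensor reads a 50 mA offset:
# the estimate drifts although the true SOC has not changed.
drifted_soc = coulomb_count(soc0=0.5, currents_a=[0.05] * 3600,
                            dt_s=1.0, q_bat_ah=5.0)

# Discharging a 5 Ah battery at 5 A for half an hour removes half its charge.
discharged_soc = coulomb_count(soc0=0.8, currents_a=[-5.0] * 1800,
                               dt_s=1.0, q_bat_ah=5.0)
```

In this example a small sensor offset alone shifts the estimate by one percentage point per hour, and an error in the assumed capacity `q_bat_ah` would scale every increment, which is why model-based estimators are used to correct the integration.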

The SOH of a battery degrades over time 
due to two principal factors: capacity fade 
and power fade (which can also be thought 
of as conductance fade caused by an increase 
in the internal resistance of the battery). 
These phenomena are the result of complex 
electrochemical interactions that are specific 
to battery chemistry. The ability to estimate 
the capacity and resistance of a battery during 
actual operation is a very important aspect of 
battery management. As in the case of SOC 
estimation, no direct measurement is possible 
outside of controlled laboratory conditions; 
hence, estimation algorithms must be employed 
(Chaturvedi et al. 2010). It is important to 
observe that SOC and SOH estimation algorithms 
operate on two completely different time scales, 
as the SOC of a battery fluctuates over time 
windows of minutes or hours, while the SOH 
changes very slowly over time, with measurable 
changes occurring over periods of months or 
years. 


Summary and Future Directions 

In summary, the control of X-EV powertrains is 
a rich subject for control theoreticians and practitioners, 
presenting topics related to optimization 
and optimal control (for energy management, 
battery aging), hybrid control (for drivability), 
adaptive and predictive control, and estimation. 
Further, the electrification of ground vehicles 
presents interesting opportunities to integrate vehicles 
with the electric power and communication 
network infrastructures. The following 
paragraphs describe two such opportunities. 


Vehicle-Grid Interaction 

As the penetration of plug-in vehicles, PHEVs 
and BEVs, increases, their impact on the electric 
power grid cannot be neglected; the consideration 
of increased electric power demand and of the 
timing of vehicle charging must be included in 
the control/optimization of the electric power 
grid. 

The electric grid and the transportation system 
are the two largest sectors that produce greenhouse 
gas emissions. When large numbers of 
vehicles are electrified and draw power from the 
electric grid, it is important to aim for reduced 
overall greenhouse gas emissions rather than just 
shifting emissions from tailpipes to power plant 
stacks. Controlling the charging of plug-in vehicles 
to alleviate the impact on the grid has 
been studied, including the idea of using plug-in 
vehicles as ancillary services to the grid, possibly 
with significant renewable power sources 
connected to the grid. Modeling and simulating 
this integrated system require information on detailed 
grid load profiles, power generation pricing 
and carbon emissions, wind statistics, and vehicle 
usage statistics. In addition, charging control 
must balance multiple factors: grid stability, fully 
charging all vehicles, minimizing data collection 
and communication, and overall system carbon 
emission minimization. 

Intelligent Transportation Systems 

X-EVs, as well as conventional vehicles, will 
benefit from the ability to analyze traffic and geographical 
information in real time to quantify the 
effects of infrastructure, environment, and traffic 
flow on vehicle fuel economy and emissions, 
and to permit the application of forecasting and 
optimization methods for energy management 
(Gong et al. 2011; Wollaeger et al. 2012). There 
are substantial opportunities to achieve significant 
fuel savings and emissions reduction by considering 
the large-scale interactions of vehicles with 
one another and with the infrastructure, further 
exploiting the flexibility inherent in X-EVs. 

Cross-References 

► Engine Control 

► Optimal Control and Pontryagin’s Maximum Principle 

► Optimal Control and the Dynamic Programming Principle 

Abstract 

Programmable logic controllers (PLCs) are a special 
form of computing hardware and software 
tailored for use in industrial control. The hardware 
is built for rough environments and offers 
various input and output ports for industrial sensor 
and actuator signals as well as communication 
systems. The main software features are hard 
real-time capabilities and a set of standardized 
programming languages specifically designed for 
the realization of automation functions. 

Introduction 

Since the 1970s, the programmable logic 
controller (PLC) has been the primary workhorse 
of industrial automation. For a long time, 
it has provided a distinct field of research, 
development, and application, mainly for control 
engineering. This area has produced its own 
design methods and programming languages. 
Due to their importance for industrial application, 
many of these methods have been standardized by 
the International Electrotechnical Commission 
(IEC). Currently the most influential standards 
are IEC 61131 (John and Tiegelkamp 2010) 
and IEC 61499 (Vyatkin 2011). While the 
latter is dedicated to distributed systems, 
IEC 61131 covers the PLC as such. This standard 
consists of several parts. The most important ones 
are: 

Part 1: General information. This part covers 
the CONCEPT of PLCs. It describes the 
general idea and typical functionalities, most 
importantly, the cyclic processing of the 
application program working on a stored 
image of the input and output values. 

Part 2: Equipment requirements and tests. Here 
requirements on the PLC HARDWARE (electrical, 
mechanical, and functional) and corresponding 
tests are defined. 

Part 3: Programming languages. This is the 
most important part of the standard. Based 
on already existing PLC programming languages, 
a harmonization of the SOFTWARE 
structure was achieved. This includes a 
general software model together with a 
set of different standardized programming 
languages. IEC 61131-3 paved the way from 
proprietary programming solutions to a set 
of well-accepted languages, allowing easier 
training of PLC programmers and - to some 
extent - the reuse of application solutions on 
different hardware platforms. 

While Part 2 is of importance for PLC manufacturers 
only, Parts 1 and 3 contain relevant 
information for PLC users, especially for designers 
of PLC control applications. Before discussing 
these points, the definition of PLC from 
IEC 61131-1 is reproduced and discussed: 


A PLC is a digital electrical system used 
in manufacturing. It utilizes programmable 
memory to store practice-oriented control 
programs. Thus is suitable for implementation 
of specific functions such as 
combinatorial control, sequence control, 
time-, count- and arithmetic functions. Due 
to its special arrangement of digital or analog 
input/output, it is used for controlling various 
machines and processes. (...) 


This definition is focused on the usage of the 
device and would - taken out of context - also 
cover industrial PCs or microcontroller-based 
control solutions. The specifics of PLC hardware 
are discussed in Part 2 of the standard. However, 
much more important for distinguishing a PLC 
from other control hardware are the properties 
of the execution model described in Part 1 and 
discussed in the following. 

Execution Model 

In designing PLC applications, the execution 
model has to be considered. The main idea is 
the cyclic execution together with an I/O image. 
While microcontrollers and PCs typically use 
an event-based execution model (the application 
waits for external events from the environment - 
interrupts - and reacts accordingly), the PLC 
follows a time-based scheme (the application 
scans the environment at instances in time - 
often a fixed cycle time - and reacts on the new 
status of the input ports). 


[Figure: timing diagram of the PLC cycle along the time axis, showing the sequence reading (input image), processing (application code), writing (output image), reading (input image), with the cycle time marked.] 


A PLC cycle consists of three iterated steps: 
input reading, program execution, and output 
writing. Together with the concept of the process 
image - a reserved memory space where input 
and output variables are stored - this execution 
model leads to the following: 

(a) During one cycle, input and output values are 
kept fixed, i.e., a change in input signal values 
during a cycle will not be seen by the program 
executed. This means that a temporal change 
in an input signal value that is shorter than the 
cycle time may not be registered by the PLC 
at all. 

(b) Changes in output signal settings by the program 
will be switched to the actual output 
ports only after execution of the complete 
program. This actually means that for an output 
signal where the value is changed several 
times during one program execution, only the 
last change will be set to the hardware output 
of the PLC. 

(c) The response time of a PLC, i.e., the time 
between a change in an input signal and the 
corresponding reaction at the output port of a 
PLC, lies between one and two PLC cycles, 
depending on when the change at the input 
port occurs relative to the PLC cycle. 
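
The three consequences above can be reproduced with a small simulation of the scan cycle. The sketch below is illustrative only: the helper `run_plc`, the signal names, and the timing values are hypothetical, and it assumes a fixed cycle time with inputs read at the start of each cycle and outputs written at its end.

```python
# Scan-cycle sketch: read inputs, execute the program on the process image,
# write outputs. Times are in integer milliseconds; all names are hypothetical.


def run_plc(read_inputs, program, cycle_ms, n_cycles):
    """Simulate n_cycles scan cycles; return [(write_time_ms, outputs), ...]."""
    output_image = {}
    log = []
    for k in range(n_cycles):
        t_read = k * cycle_ms
        input_image = read_inputs(t_read)      # step 1: snapshot of the inputs
        program(input_image, output_image)     # step 2: works on the images only
        t_write = t_read + cycle_ms            # step 3: outputs written at cycle end
        log.append((t_write, dict(output_image)))
    return log


def program(inputs, outputs):
    outputs["lamp"] = False             # intermediate value, never reaches the port
    outputs["lamp"] = inputs["button"]  # only the last assignment is written


def short_pulse(t_ms):
    return {"button": 12 <= t_ms < 15}  # 3 ms pulse that falls between two reads


def step(t_ms):
    return {"button": t_ms >= 12}       # button pressed at t = 12 ms and held


pulse_log = run_plc(short_pulse, program, cycle_ms=10, n_cycles=5)
step_log = run_plc(step, program, cycle_ms=10, n_cycles=5)

pulse_seen = any(out["lamp"] for _, out in pulse_log)
first_on_ms = next(t for t, out in step_log if out["lamp"])
response_ms = first_on_ms - 12
```

Here the 3 ms pulse is never registered, and the held input is first read at 20 ms and reaches the output at 30 ms, a response time of 18 ms, which lies between one and two cycle times.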

While the time needed for input reading and 
output writing is constant over all cycles, the time 
for program execution may vary due to conditional 
execution of some program parts. However, 
normally the PLC is operated with a fixed cycle 
time set high enough to allow for the worst-case 
execution time of the application program. 

The advantage of the described concept is the 
deterministic behavior of the resulting system 
with a very simple way to determine the timing 
behavior. This is important for most PLC applications: 

(a) Open-loop control, where the reaction to a 
change of an input signal has to be reached 
in a limited time, especially in safety-critical 
applications. 

(b) Closed-loop control, where the design of a 
discrete-time control algorithm is based on 
the assumption of a fixed sample-time. 

To realize control functions, an application program 
has to be written for the PLC. To this end, 
Part 3 of the IEC 61131 defines a software model 
together with a set of programming languages. 

Software Model and Programming 

The original idea that led to the development of 
the first programmable logic controller (PLC) in 
1968 was to replace hardwired control equipment 
at machines. Back then, the controllers of 
machines, for example, lathes or grinders, 
typically consisted of a cabinet of interconnected 
relays. The size of such a controller could be 
considerable and its failure rate was high due to 
mechanical defects of single relays. Furthermore, 
the initial setup was very time-consuming and 
error prone, because the relays (often hundreds 
of them) had to be wired by hand. The biggest 
drawback of this technology, however, was 
the difficulty of changing a controller to 
add a new function or to adjust 
to a new production task. The hardwired 
structure then had to be at least partially disassembled 
and rewired. Here lay the main advantage of 
a controller that could be adjusted by changing 
software instead of hardware. 

Since the first PLCs reached the market in the 
early seventies, graphical programming 
methods have been used to develop the control 
algorithms. These are ladder diagram (LD, 
sometimes also referred to as ladder logic) 
and, later, function block diagram (FBD). The 
implementation of LD on the very first PLC 
(the Modicon 084) was intended to give 
easy access to the people who had worked with 
hardwired relay logic until then. (More on the history of 
PLCs can be found on the website of Dick 
Morley, commonly known as the father of the 
PLC (http://www.barn.org/FILES/historyofplc.html).) 

LD, at least in its early forms, is basically 
the graphical representation of its hardwired forefather. 
The name ladder comes from the fact 
that on both sides of the drawing, there is a 
power rail and horizontally between those rails, 
like rungs on a ladder, sequences of logical elements 
are drawn. The most basic of these elements 
are relays (switches), depending on input signals 
or internal variables, and coils (memories 
to store variables and set output signals). The 
ladder is processed in a top-down and left-right 
fashion. 

Figure 1 shows an example of an LD. Every 
rung can be read as an IF THEN ELSE statement. 
The first rung of the ladder means IF (Var1 = 1 
AND Var2 = 1) THEN (Var3 := 1; Var4 := 0) 
ELSE (Var3 := 0; Var4 := 1). The second rung 
is IF (Var3 = 1 OR Var4 = 0) THEN (Var1 := 
1) ELSE (Var1 := 0). 
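
The reading of the two rungs given above can be transcribed directly into code. The following sketch is only an illustration of that IF THEN ELSE reading: the function name `scan` is hypothetical, and the Boolean variables stand in for the relay and coil states.

```python
# One top-down pass over the two rungs described in the text.


def scan(var1, var2):
    """Evaluate rung 1, then rung 2; return the new (Var1, Var3, Var4)."""
    # Rung 1: IF (Var1 AND Var2) THEN (Var3 := 1; Var4 := 0) ELSE (Var3 := 0; Var4 := 1)
    if var1 and var2:
        var3, var4 = True, False
    else:
        var3, var4 = False, True
    # Rung 2: IF (Var3 = 1 OR Var4 = 0) THEN (Var1 := 1) ELSE (Var1 := 0)
    # It already sees the values of Var3 and Var4 written by rung 1.
    var1 = var3 or not var4
    return var1, var3, var4


both_on = scan(True, True)
second_off = scan(True, False)
```

Because the rungs are processed top-down, the second rung works on the values just produced by the first one, so after one pass Var1 holds the conjunction of the old Var1 and Var2.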

While LD resembles relay logic, FBD is a 
graphical mimicking of the wiring of simple logic 
gates, like AND, OR, NOT, or FLIP-FLOP. Both 
languages (LD as well as FBD) are still part of 
the IEC 61131-3. However, they are not well 
suited for the description of sequential and concurrent 
algorithms because they have no means 
for the visual description of the control flow in a 
program. 

Programmable Logic Controllers, Fig. 1 Example of a 
PLC program written in Ladder Diagram (LD) 

The IEC 61131-3 standard also contains a 
language that is intended for the graphical description 
of sequential and concurrent behavior: 
the sequential function chart (SFC). The SFC is 
based on Grafcet (David 1995) and represents 
a form of Petri net (with very special dynamics 
and functionality). Due to its high functionality, 
SFC can be easily applied for the structuring 
of a PLC program on a high level. However, 
it is cumbersome (and by the standard also not 
intended) to use for the specification of a low-level 
sequential algorithm such as, for example, the 
alternating switching between two motors. 

In addition to the three graphical languages, 
there are also two textual languages in the standard: 
the assembler-like Instruction List (IL) and 
the Pascal-like Structured Text (ST). 

The decision for one of the languages is based 
on functional aspects of the application to be 
realized (high-level languages SFC and ST vs. 
low-level languages LD, IL, and FBD) but also 
on traditions in the application domain (e.g., LD 
in automotive manufacturing vs. FBD in process 
industry), the geographical region (e.g., LD in the 
US vs. IL in Germany), and the preferences of 
the programmer (graphical vs. textual). To allow 
for flexible solutions and the optimal choice of 
languages, IEC 61131-3 allows the use of different 
languages for different parts of the control 
application. 

An application in IEC 61131-3 is structured 
into program organization units (POUs). Each of 
the POUs contains a header in a unified syntax for 
parameter and variable definitions and a body for 
the actual program code. This body can be written 
in any one of the defined PLC languages. 

There are three types of POUs: Program, 
Function Block, and Function. A Program is the 
top-level POU of a PLC application. Only in a 
Program can variables be linked to actual input 
and output ports. A Program can call Function 
Blocks, which in turn may call other Function 
Blocks. Programs and Function Blocks can also 
call Functions. A POU of type Function has no 
internal memory, while a Function Block has 
memory. 

IEC 61131-3 introduced the type and instance 
concept into PLC programming. A Function 
Block is always the instantiation of a Function 
Block Type. Each instantiation gets its own 
name and variable space. This concept is 
similar to - but much older than - the class- 
object instantiation idea of object-oriented 
programming languages. The exclusive use of 
symbolic variables without direct references 
to hardware addresses or ports in Function 
Blocks allows their easy reuse in one or 
more applications and the definition of widely 
applicable Function Block (Type) Libraries. 
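
The distinction between a Function Block Type, its instances, and a memoryless Function can be mimicked in a general-purpose language. The sketch below uses Python rather than an IEC 61131-3 language, and the names `UpCounter` and `clamp` are hypothetical stand-ins chosen for illustration, not blocks taken from the standard.

```python
# Type/instance sketch: a class plays the role of a Function Block Type,
# its objects are Function Block instances, and a plain function has no memory.


class UpCounter:
    """Function Block Type: counts rising edges of its input (has memory)."""

    def __init__(self, preset):
        self.preset = preset
        self.count = 0           # internal memory, private to each instance
        self._last_input = False

    def __call__(self, count_up, reset=False):
        if reset:
            self.count = 0
        elif count_up and not self._last_input:
            self.count += 1      # rising edge detected
        self._last_input = count_up
        return self.count >= self.preset


def clamp(low, value, high):
    """Function: the result depends only on the arguments (no memory)."""
    return max(low, min(value, high))


# Two instances of the same type, each with its own name and variable space.
part_counter = UpCounter(preset=2)
box_counter = UpCounter(preset=1)

results = [part_counter(True), part_counter(True),
           part_counter(False), part_counter(True)]
```

Calling `part_counter` does not affect `box_counter`, and neither refers to a hardware address, which is what makes such blocks reusable across applications.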

Summary and Future Directions 

PLCs are a proven technology in industrial automation. 
They follow a simple but deterministic 
execution and software model. This is the main 
reason why PLCs are still here and will be here 
for quite some time to come even if there is 
faster and fancier technology like embedded PCs 
available. 

Currently the third edition of IEC 61131-3 is 
nearly ready for publishing. In addition to minor 
corrections, this new edition adds some concepts 
from object-oriented programming to the existing 
software model. First tools on the market already 
support these extensions. 

For the future, two trends can be seen. First, 
there is a growing trend to integrate PLC programming 
into model-based software development 
processes: either by generating PLC code 
from existing model-based toolchains or by integrating 
model-based approaches, especially from 
the object-oriented domain, into PLC programming 
environments. Either way this is due to 
the fact that the complexity of PLC applications 
is rising while the development time is expected 
to decrease. 

Second, there is a growing interest in the use 
of formal methods in the PLC domain. In recent 
years, a lot of interdisciplinary work was aimed 
in this direction. This work results in the formalization 
of different steps in the control design 
process depending on what problems are to be 
solved (Frey and Litz 2000): 

1. The demand for reduced development time 
and the possible reuse of existing software 
modules result in the need for a formal approach 
to the development of the PLC programs. 

2. The demand for high-quality solutions and 
especially the application of PLC in safety-critical 
processes result in the need for validation 
procedures, i.e., formal methods to prove 
specific static and dynamic properties of the 
programs. 

3. The large numbers of already installed PLC 
programs, together with the high expense of 
programming, lead to the search for verification 
and validation methods that can be 
applied directly to programs written in PLC-specific 
programming languages such as ladder 
diagram. 

To conclude, more than 50 years after its invention, 
the PLC is still an industrial success 
story, and due to ever-increasing demands on the 
complexity and correctness of its applications, it 
also still provides much room for further research 
and development. 


Cross-References 

Abstract 

Differential games arose from the investigation, 
by Rufus Isaacs in the 1950s, of pursuit- 
evasion problems. In these problems, closed-loop 
strategies are of the essence, although defining 
what is exactly meant by this phrase, and what is 
the “Value” of a differential game, is difficult. 
For closed-loop strategies, there is no such 
thing as a “two-sided maximum principle,” 
and one must resort to the analysis of Isaacs’ 
equation, a Hamilton-Jacobi equation. The 
concept of viscosity solutions of Hamilton-Jacobi 
equations has helped solve several of these 
issues. 


Keywords 

Closed loop strategies; Isaacs’ condition; Viscosity 
solutions 

Historical Perspective 

The history of differential games (DG in 
short) starts with Rufus Isaacs, who coined 
the phrase in his pioneering work of the 
early 1950s (Isaacs 1951), which was largely 
ignored until the publication of his book 
(Isaacs 1965). Through the investigation of 
particular problems, Isaacs invented by himself 
(with his own names) the concepts of state 
and control variables, of feedback, his “tenet 
of transition” - better known as Bellman’s 
optimality principle - the (Hamilton-Jacobi-Carathéodory-)Isaacs 
equation, barriers, some 
difficult corner conditions (“equivocal lines”), 
singular arcs (“universal lines”), etc. 

Another very early work was Kelendzheridze’s 
chapter “A Pursuit Problem” in the historical 
book by Pontryagin et al. (1962), but it lacked 
closed-loop strategies. 

John Breakwell and a few followers (Breakwell 
and Merz 1969; Breakwell 1977) picked up 
Isaacs’ work where he had left it, still working on 
particular problems, but adding the power of the 
computer to analyze the solution of Isaacs’ equation 
via the structure and singularities of fields 
of extremal trajectories, while most of the literature 
concentrated on making precise the concepts 
of closed-loop strategies and of the Value of 
the game. Prominent figures in that quest are 
Krasovskii and Subbotin (1977), Fleming (1961), 
Friedman (1971), Blaquiere et al. (1969), Elliot 
and Kalton (1972), Emilio and Roxin (1969), and 
Varaiya and Lin (1969) who together invented the 
concept of non-anticipative strategies. 

The major later innovation was Crandall and 
Lions’ viscosity solutions of PDEs (Crandall and 
Lions 1983; Lions 1982) applied to DGs and their 
Isaacs equation by Evans and Souganidis (1984) 
and Lions and Souganidis (1985). 

We also refer the reader to the entry (Quincampoix 
2009) of another Springer Encyclopedia. 

General Setup 

We shall be interested in (continuous time) two-person 
zero-sum DGs with complete information, 
this last phrase meaning that both players 
know exactly and instantly the state of the 
system, but (usually) not their opponent’s 
control. 

The available space of a short article does 
not allow us to attempt to give the most general 
setup of a zero-sum two-person perfect-information 
differential game. We shall therefore 
concentrate on a typical class, with a finite dimensional 
state space, as follows. The data are: 

1. A two-player dynamical system with state $x \in \mathbb{R}^n$, 
control variables $u \in U \subset \mathbb{R}^{\ell}$, $v \in V \subset \mathbb{R}^m$ 
($U$ and $V$ will often be assumed compact), 
and its dynamics 

$$\dot{x} = f(t, x, u, v), \qquad x(t_0) = x_0.$$

Denoting $\mathcal{U}$ and $\mathcal{V}$ the sets of measurable 
functions from $\mathbb{R}$ to $U$ and $V$ respectively, 
one assumes regularity and growth conditions 
on $f$ to guarantee existence and uniqueness 
of the solution $x(\cdot)$ for all $(t_0, x_0)$ and all 
$(u(\cdot), v(\cdot)) \in \mathcal{U} \times \mathcal{V}$. 

2. A termination condition, often given by a target 
set $\mathcal{T} \subset \mathbb{R} \times \mathbb{R}^n$, open or closed according 
to necessity, defining a final time as 
$t_1 = \inf\{t \mid (t, x(t)) \in \mathcal{T}\}$. If $\mathcal{T} = \{T\} \times \mathbb{R}^n$, 
final time is fixed and equal to $T$. The question 
of whether there is a finite $t_1$ is one of central 
interest in pursuit-evasion games. 

3. Sets of admissible closed-loop strategies $\Phi$ 
and $\Psi$. One should choose them in such a 
way that replacing $(u, v)$ by a pair $(\varphi, \psi) \in \Phi \times \Psi$ 
in the dynamics always produces a 
(unique) admissible pair of control functions 
$(u(\cdot), v(\cdot)) = \Gamma(t_0, x_0; \varphi, \psi) \in \mathcal{U} \times \mathcal{V}$. 

4. A performance measure, or payoff, typically 

$$J(t_0, x_0; u(\cdot), v(\cdot)) = \begin{cases} \displaystyle\int_{t_0}^{t_1} L(t, x(t), u(t), v(t))\,dt & \text{if } t_1 < \infty, \\ \infty & \text{if } t_1 = \infty. \end{cases}$$

We let 

$$G(t_0, x_0; \varphi, \psi) := J(t_0, x_0; \Gamma(t_0, x_0; \varphi, \psi)).$$

5. A concept of “solution,” where the first player 
wants to minimize the performance index 
while the second one wishes to maximize it. 
(In our choice of definition of $J$, we have 
assumed that player one wants over anything 
else to make the game terminate. If we define 
$J$ as the integral even for infinite end-time, 
Isaacs’ tenet of transition may not hold.) 

If 

$$\inf_{\varphi \in \Phi} \sup_{\psi \in \Psi} G(t_0, x_0; \varphi, \psi) = \sup_{\psi \in \Psi} \inf_{\varphi \in \Phi} G(t_0, x_0; \varphi, \psi) = V(t_0, x_0),$$

then $V$ is called the Value function of the 
game. Several concepts of upper Value and 
lower Value may be defined (including the first 
and second terms above) that have to coincide 
for a Value to exist. 

Isaacs’ Condition In the framework of this 
short entry, we shall always assume that the 
game satisfies Isaacs’ condition. It bears on the 
Hamiltonian $H(t, x, p, u, v) := L(t, x, u, v) + \langle p, f(t, x, u, v) \rangle$ 
and reads 

$$\forall (t, x, p) \in \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}^n, \quad \inf_{u \in U} \sup_{v \in V} H(t, x, p, u, v) = \sup_{v \in V} \inf_{u \in U} H(t, x, p, u, v). \qquad (1)$$
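
Isaacs’ condition can be checked numerically on simple examples by comparing the two sides of (1) over a grid of control values. The sketch below is only an illustration: the helper `minimax_gap`, the control sets (a grid on [-1, 1] for both players), and the two Hamiltonians are assumptions chosen for the example, with $t$, $x$, and $p$ held fixed.

```python
# Grid check of Isaacs' condition: inf_u sup_v H versus sup_v inf_u H
# for fixed (t, x, p). Control sets and Hamiltonians are illustrative.


def minimax_gap(hamiltonian, us, vs):
    """Return (inf_u sup_v H) - (sup_v inf_u H) over the given grids."""
    upper = min(max(hamiltonian(u, v) for v in vs) for u in us)
    lower = max(min(hamiltonian(u, v) for u in us) for v in vs)
    return upper - lower


GRID = [i / 10 - 1 for i in range(21)]  # 21 points from -1 to 1, includes 0

P = 0.7  # fixed adjoint value


def separated(u, v):
    """H = L + p * f with L = 1 and f = u + v: the controls are separated."""
    return 1.0 + P * (u + v)


def coupled(u, v):
    """H = p * f with p = 1 and f = (u - v)**2: the controls are coupled."""
    return (u - v) ** 2


gap_separated = minimax_gap(separated, GRID, GRID)
gap_coupled = minimax_gap(coupled, GRID, GRID)
```

For the separated Hamiltonian the two sides coincide, so the gap is zero and the condition holds; for the coupled one the upper value is 1 and the lower value is 0 on this grid, so the condition fails.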

Strategies and Value 

In pursuit-evasion games, the concept of closed-loop 
strategies is of the essence, and it is extremely 
important for all DGs. Yet, allowing state 
feedback strategies such as $u(t) = \varphi(t, x(t))$, 
$v(t) = \psi(t, x(t))$, poses a difficult problem: 
what classes $\Phi$ and $\Psi$ of functions $\varphi$ and $\psi$ 
to allow? The notations $\inf_{\varphi}$ or $\sup_{\psi}$ have no 
meaning if one does not answer that question. 
Experience tells us that discontinuous feedbacks 
are necessary to find the solution of many examples, 
but then existence, or uniqueness, of the 
solution of the dynamical equation cannot be 
guaranteed. 

Isaacs’ K-strategies were a partial attempt 
to address this issue. More developed concepts 
were proposed, from limit of piecewise 
constant, or piecewise open-loop, controls 
(Fleming 1961; Friedman 1971) to extensions 
of the notion of solution of a differential 
equation (Krasovskii and Subbotin 1977), 
also proving the existence of a Value. The 
equivalence of all these Values was an issue 
until the advent of viscosity solutions of Isaacs’ 
equation. 

A tool used to accommodate state-feedback
strategies (Bernhard 1977) is

Lemma 1 (Berkovitz) If $\mathcal{V} \subset \Psi$, then, $\forall \varphi$ for
which this expression is well defined,

$$\sup_{\psi \in \Psi} G(t_0, x_0; \varphi, \psi) = \sup_{v(\cdot) \in \mathcal{V}} G(t_0, x_0; \varphi, v(\cdot)).$$

As a consequence, a saddle-point $(\varphi^*, \psi^*)$ solution
is defined by

$$\forall u(\cdot) \in \mathcal{U}, \ \forall v(\cdot) \in \mathcal{V}, \quad
G(t_0, x_0; \varphi^*, v(\cdot)) \le V(t_0, x_0)
\le G(t_0, x_0; u(\cdot), \psi^*), \qquad (2)$$

confronting the closed-loop saddle point strategies
to open-loop controls only. (This proves
useful in the analysis of Nash equilibria of
nonzero-sum DGs.)

Another consequence of Berkovitz’ lemma is
that if a DG has a saddle point in open-loop
controls, it is a saddle point over closed-loop
controls as well. (But the existence condition may
be less stringent for the latter.) The relationship
between different forms of the strategies has been
further clarified by Başar (1977) and Başar and
Olsder (1982).

As far as the existence of the Value is concerned,
the problem for a large class of DGs is
solved with non-anticipative strategies defined as
$\varphi : \mathcal{V} \to \mathcal{U}$ such that

$$\forall t, \quad [\forall s \le t, \ v_1(s) = v_2(s)] \Rightarrow [\varphi(v_1(\cdot))(t) = \varphi(v_2(\cdot))(t)],$$

and likewise for $\psi$ (notice that for this concept
of strategies, (2) is the natural formulation of a
saddle point) and with the notion of viscosity solution
of Isaacs’ equation. See Theorem 1 below.


Games of Pursuit Evasion 

An important class of DGs is the game of pursuit
evasion. Typically, in these games the state
$x$ is composed of a sub-vector $y$ of Pursuer
state(s) and a sub-vector $z$ of Evader state(s).
The dynamical function $f$ is separated likewise,
the dynamics of the Pursuer depending on the
Pursuer’s control(s) and that of the Evader on
the Evader’s control(s). Typically, the payoff is
the time until capture, capture being defined as
$(t, x(t)) \in \mathcal{T}$ (the target is often called capture set). This form of
DG automatically satisfies Isaacs’ condition (1).

Qualitative Game 

In pursuit-evasion games, the main issue is to 
distinguish initial states, called capturable , for 
which a Pursuer’s strategy causing finite-time 
capture against any defense exists, from those, 
called safe , for which the Evader has a strategy 
guaranteeing escape against any defense. This 
is the topic of the qualitative game or game of 
kind (Isaacs). A theorem of the alternative is 
one which states that for a particular (class of) 
game(s), every initial state is either capturable or 
safe. Such theorems have been proved for classes 
of pursuit-evasion games covering essentially all 
cases of interest, under Isaacs’ condition (1) with
L = 0 (Cardaliaguet 1996; Cardaliaguet et al. 
2001; Krasovskii and Subbotin 1977).

Capturable states are separated from safe
states by a barrier, a piecewise smooth manifold
which has to be semipermeable. This means that
for all $(t, x)$ on the barrier where this barrier is a
smooth manifold with normal $\nu(t, x)$, it should
hold that

$$\min_{u \in U} \max_{v \in V} \langle \nu(t, x), f(t, x, u, v) \rangle
= \max_{v \in V} \min_{u \in U} \langle \nu(t, x), f(t, x, u, v) \rangle = 0.$$

A minimax pair $(u, v) = (\varphi(t, x, \nu), \psi(t, x, \nu))$
is called a pair of semipermeable strategies.
If the boundary of the capture set is a
smooth manifold with local outward normal
$n(t, x)$, its usable part is the region where
$\inf_{u \in U} \sup_{v \in V} \langle n(t, x), f(t, x, u, v) \rangle < 0$. The
natural barrier is a semipermeable manifold
constructed backward from its boundary
(the BUP), with $n$ as final $\nu$ and using the
characteristic equations:

$$\dot{x} = f(t, x, \varphi, \psi), \qquad
\dot{\nu}^{t} = -\nu^{t}\, \frac{\partial f(t, x, \varphi, \psi)}{\partial x}.$$

(These trajectories are abnormal trajectories of
the calculus of variations.) In most examples,
only part of the manifold thus constructed is
a barrier, and the complete barrier is made of
manifolds pieced together according to a junction
condition ensuring that the corners “do not leak”
(Breakwell), analogous to the corner conditions
of the next section.

Quantitative Game

The quantitative game, or game of degree
(Isaacs), is played inside the capture zone,
typically with time of capture as the payoff. It
is ruled by Isaacs’ equation in a fashion similar
to that of games of finite duration (see below).
Yet, the interplay between the qualitative and the
quantitative game may be quite subtle and plays a
prominent role in determining the actual capture
zone. The Value function is usually discontinuous
across other barriers inside the capture zone.

Other Approaches

Other approaches have been developed to solve
games of pursuit evasion.

An early approach by Pontryagin (1967), extended
by Pshenichnyi (1968), used geometric
methods for linear pursuit-evasion games with
convex compact control sets. Krasovskii’s stable
bridges (Krasovskii and Subbotin 1977) are a
concept close to Isaacs’ semipermeability. Patsko
and Turova (2001) have developed, for some
families of DGs, an efficient numerical procedure
to compute recursively hypersurfaces of constant
time-to-capture, whose discontinuities display
the barriers. Cardaliaguet et al. (1999) have
developed a theoretical and numerical procedure
building on Aubin’s viability theory, which requires
less regularity on the data than other approaches.

Provided that care be applied, a quantitative
game may be transformed into a family of qualitative
games - an approach used by Krasovskii,
Blaquière et al., and Cardaliaguet et al. - and
conversely, a fruitful approach is to investigate
capturability of initial states as a function of a
parameter defining the “size” of the capture set,
imbedding the qualitative game into a quantitative
game of the type game of approach.


Games of Finite Duration 

Wherever termination of the game is not an issue,
the major tool in investigating a DG is Isaacs’
equation, a partial differential equation bearing
on the Value function:

$$\forall (t, x) \notin \mathcal{T}, \quad \frac{\partial V}{\partial t}(t, x)
+ \min_{u \in U} \max_{v \in V} H(t, x, \nabla_x V, u, v) = 0,$$
$$\forall (t, x) \in \mathcal{T}, \quad V(t, x) = K(t, x). \qquad (3)$$

For any DG where all trajectories are transverse
to the boundary $\partial\mathcal{T}$, and with adequate regularity
conditions on the data (and still under condition
(1)), it holds that

Theorem 1 The DG has a Value in non-anticipative
strategies, which is the only bounded,
uniformly continuous viscosity solution of the
equation obtained by changing signs in (3) as
$-\partial V/\partial t - \min_{u \in U} \max_{v \in V} H = 0$. And all other
Values coincide.

One possible way to solve Isaacs’ equation is
via the investigation of its field of characteristics.
Their equations are Isaacs’ retrograde path
equations: let $(u, v) = (\varphi(t, x, p), \psi(t, x, p))$ be
the saddle point of $H(t, x, p, u, v)$, assumed here
to be unique; one integrates from the target set
backward:

$$\dot{x} = f(t, x, u, v), \qquad (4)$$

$$\dot{p} = -\left( \frac{\partial H(t, x, p, u, v)}{\partial x} \right)^{t}. \qquad (5)$$

The above equations are similar to Pontryagin’s 
maximum principle equations. However, a major 
difference lies in the corner conditions. While 
Pontryagin’s theorem extends to control theory
the Erdmann-Weierstrass condition stating that the
adjoint vector (here $p$) is continuous along an
extremal trajectory, in (4) and (5), $p$ is to coincide
with $\nabla_x V$ and may be discontinuous along
an extremal trajectory. These discontinuities
cannot be found by a local analysis along an 
isolated trajectory and require that a complete 
field of extremals be constructed, synthesizing a 
state feedback strategy. 

The analysis of the conditions that hold at
such corners, equivocal manifolds (Isaacs), envelope
manifolds (Breakwell), and focal manifolds
(Merz), has been a large part of the early Isaacs-Breakwell
theory. It has been for its larger part
synthesized by Bernhard (1977), except a general
constructive analysis of focal manifolds which
had to wait until Melikyan and Bernhard (2005).

The absence of a “two-sided Pontryagin principle”
for closed-loop differential games forces
one to resort to the solution of Isaacs’ equation
or an equivalent. This is the reason why no
practical method of solution exists beyond a state
dimension of 3 or 4, counting time if the game
is not time invariant. An exception is the linear
quadratic game. (See article ► Linear Quadratic
Zero-Sum Two-Person Differential Games in this
encyclopaedia.)


Conclusion 

Except for very particular games, “solving” a
DG remains a difficult task. Numerical methods
suffer the famous “curse of dimensionality.”
Moreover, many of them strive to compute
the Value function. But the optimal strategies
typically depend on the gradient of the Value
function, requiring a stronger convergence of the
approximation algorithms than pointwise, or $C^0$
or $L^2$, if they are to be computed as well. Further
advances in numerical algorithms tackling this
problem would be useful, as well as uncovering
new classes of DGs for which further analytical
results could be obtained.


Cross-References 

► Dynamic Noncooperative Games 

► Game Theory: Historical Overview 

► Linear Quadratic Zero-Sum Two-Person Differential Games


Synonyms

QFT 


Abstract 

Designing reliable and high-performance control 
systems is an essential priority of every 
control engineering project. In many practical 
circumstances the presence of model uncertainty 
challenges the design. One robust control 
approach for these cases, deeply rooted in 
the classical frequency domain, is quantitative 
feedback theory (QFT). Providing a control 
solution that guarantees the achievement of a 
multi-objective set of performance specifications 
for every plant within the model uncertainty 
(quantification), QFT balances the trade-off 
between the simplicity of the compensator 


structure and the minimization of the activity 
of the controller at each frequency (“cost of 
feedback”). Previous results indicate that the 
QFT methodology has been able to provide 
successful control solutions to a large variety 
of real applications, including linear and 
non-linear plants, stable and unstable systems, 
multi-input multi-output processes, minimum 
and non-minimum phase plants, containing time- 
delay, lumped or distributed parameters, etc. 


Keywords 

Frequency domain control; Quantitative controller
design; Robust control


Definition 

Quantitative Feedback Theory (QFT) is a robust
control engineering design methodology that
uses the feedback to simultaneously and quantitatively:
(1) reduce the effects of plant uncertainty
and (2) satisfy performance control specifications.
The method searches for a controller that
guarantees the satisfaction of the required performance
specifications for every plant within the
model uncertainty (robust control).

QFT is rooted in the classical frequency
domain. It involves Bode diagrams and Nichols
charts (magnitude/phase diagrams). It relies on
the observation that feedback is needed when
the plant presents model uncertainty and/or
there are uncertain disturbances. QFT balances
quantitatively: (a) the simplicity of the controller
structure, (b) the minimization of the so-called
cost of feedback (controller magnitude at each
frequency), (c) the plant model uncertainty and
(d) the achievement of the desired performance
specifications, all at each frequency of interest.
The technique has been successfully applied to a
wide variety of real-world control problems.


Historical Notes 

Many of the frequency domain fundamentals 
were established by Hendrik Bode in his seminal 
book Network Analysis and Feedback Amplifier 
Design , published in 1945 (Van Nostrand). The 
book strongly influenced the understanding 
of automatic control theory for many years, 
especially where system sensitivity and feedback 
constraints are concerned. 

Almost 20 years later, in 1963, a new influential
book entitled Synthesis of Feedback Systems
(Academic Press), written by Isaac Horowitz,
proposed for the first time a formal combination
of the frequency domain methodology with
plant model uncertainty (robust control) under a
quantitative analysis. The new book addressed
an extensive set of sensitivity problems in feedback
control and was the first work in which
a control problem was treated quantitatively in
a systematic way. The book laid the foundation
for a new control design methodology that had
been introduced briefly in a previous paper by
Horowitz in 1959: the one that became known
as Quantitative Feedback Theory (or QFT) in the
early 1970s.


Fundamentals 

A detailed study of the QFT fundamentals 
and applications can be found in the books 
written by Garcia-Sanz and Houpis (2012), 
Houpis et al. (2006), Sidi (2002), Yaniv (1999), 
and Horowitz (1993); see the “Recommended 
Reading” section. 

The QFT methodology provides a multi-criteria
engineering understanding of the
controller design process, as it quantifies the
balance among the controller structure, cost of
feedback, performance specifications, and plant
model uncertainty at each frequency of interest.
The basic steps of the QFT methodology are
summarized in Fig. 1 and are presented in the
following sub-sections.

Define Plant Model and Uncertainty: 
Templates Generation (Steps 1, 2 & 3) 

First of all, the dynamics of the plant to
be controlled are described in the frequency
domain. Taking the plant model in terms of
transfer functions with mixed parametric, non-parametric
and even model structure uncertainty,
the frequency domain description is carried out
by calculating “templates”, which are sets of
complex numbers at each frequency of interest
$\omega \in [\omega_{\min}, \omega_{\max}]$ rad/second: a projection of
the $n$-dimensional parameter space through the
transfer function/functions onto the Nichols
chart.

Step 1: Define plant models and associated uncertainty
Step 2: Obtain templates representation at specified frequencies $\omega_i$
Step 3: Select nominal plant $P_0(j\omega)$
Step 4: Define control specifications: Stability
Step 5: Define control specifications: Performance
Step 6: Calculate stability bound (U-contour) on Nichols Chart
Step 7: Calculate performance bounds (reference tracking, disturbance rejection, etc.)
Step 8: Calculate combined (worst case scenario) bounds
Step 9: Synthesize feedback controller $G(j\omega)$ s.t. $L_0(j\omega) = P_0(j\omega) G(j\omega)$ satisfies all bounds
Step 10: Synthesize prefilter controller $F(j\omega)$
Step 11: Analysis in the frequency domain
Step 12: Analysis in the time domain (linear)
Step 13: Analysis in the time domain (nonlinear)

Quantitative Feedback Theory, Fig. 1 Summary of QFT controller design methodology

Quantitative Feedback Theory, Fig. 2 From the parameter space to the Nichols chart: (a) 2-dimensional parameter
space, (b) Template of the plant $P(j\omega)$ at $\omega = 1$ rad/s, (c) Typical templates for frequencies $\omega \in [\omega_{\min}, \omega_{\max}]$

As an example, and for $\omega = 1$ rad/s,
Fig. 2b represents the QFT template of the
3-parameter plant $P(j\omega) = \exp(-j\omega\tau) /
[(j\omega)^2 + 2\zeta\omega_n (j\omega) + \omega_n^2]$ with $\omega_n \in
[0.7, 1.2]$, $\tau \in [0, 2]$, and $\zeta = 0.02$.


Each template $\mathcal{T}P(j\omega_i) = \{P(j\omega_i)\}$ represents
on the Nichols chart and at a specific
frequency $\omega_i$ all the possible plants within the
model uncertainty (see Fig. 2c). One particular
case, defined as a set of specific parameters of
the $n$-dimensional parameter space, is arbitrarily
selected to define the nominal plant $P_0(j\omega)$, a
member of the family of plants within the uncertainty
(see Fig. 2b).
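
A template can be approximated numerically by gridding the uncertain parameters and evaluating the plant at the frequency of interest. The sketch below is only an illustration: it assumes the delayed second-order plant family of the example has the form exp(-jωτ) / ((jω)^2 + 2ζω_n(jω) + ω_n^2), with ω_n in [0.7, 1.2], τ in [0, 2] and ζ = 0.02, and the function names and grid size are hypothetical. Phases are left wrapped to (-180°, 180°]; plotting on a Nichols chart would require unwrapping them.

```python
import cmath
import math

def plant(w, wn, tau, zeta=0.02):
    """Assumed plant: exp(-j*w*tau) / ((j*w)^2 + 2*zeta*wn*(j*w) + wn^2)."""
    s = 1j * w
    return cmath.exp(-s * tau) / (s * s + 2.0 * zeta * wn * s + wn * wn)

def template(w, n_grid=11):
    """Grid the (wn, tau) box and return (phase_deg, magnitude_dB) points."""
    points = []
    for i in range(n_grid):
        wn = 0.7 + (1.2 - 0.7) * i / (n_grid - 1)
        for k in range(n_grid):
            tau = 2.0 * k / (n_grid - 1)
            p = plant(w, wn, tau)
            mag_db = 20.0 * math.log10(abs(p))
            phase_deg = math.degrees(cmath.phase(p))  # wrapped phase
            points.append((phase_deg, mag_db))
    return points

pts = template(1.0)
print(len(pts), "template points at w = 1 rad/s")
print("magnitude range (dB):", min(m for _, m in pts), max(m for _, m in pts))
```

A finer grid, or a grid restricted to the edges of the parameter box, trades accuracy of the template contour against the number of plants carried into the bound computation.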

Define Control Specifications (Steps 4 & 5) 

The standard two-degree-of-freedom (2DOF)
control system diagram is shown in Fig. 3.

Quantitative Feedback Theory, Fig. 3 Multi-Input-Single-Output
2DOF feedback control system, $s = j\omega$

It includes the set of uncertain plants $P(j\omega)$ to
be controlled, the disturbance dynamics $M(j\omega)$,
the feedback path dynamics $H(j\omega)$, and the
loop controller $G(j\omega)$ and prefilter $F(j\omega)$,
both to be designed. On the other hand, $R(j\omega)$,
$E(j\omega)$, $U(j\omega)$, $Y(j\omega)$, $D(j\omega)$, and $N(j\omega)$ are
vectors representing respectively the reference
input, signal error, controller output, plant output,
disturbance input, and sensor noise input. From
the diagram (Fig. 3), it is easy to derive the
following three input/output equations (note that the
dependency on $j\omega$ is omitted):


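$$Y = \frac{PG}{1 + PGH}\, F R + \frac{M}{1 + PGH}\, D - \frac{PGH}{1 + PGH}\, N,$$

$$U = \frac{G}{1 + PGH}\, F R - \frac{GH}{1 + PGH}\, (M D + N), \quad \text{and}$$

$$E = \frac{1}{1 + PGH}\, F R - \frac{HM}{1 + PGH}\, D - \frac{H}{1 + PGH}\, N.$$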


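Without losing generality, and with a straightforward
block diagram manipulation, $F(s)$ and
$G(s)$ can be modified to have $H(s) = 1$. Now,
the stability and performance specifications are
defined by limiting the magnitude of each transfer
function of the three previous equations at each
frequency of interest, $|T_k(j\omega)| \le \delta_k(\omega)$, $k = 1, \ldots, 4$,
such that,

Stability and noise reduction:

$$|T_1(j\omega)| = \left| \frac{Y(j\omega)}{R(j\omega) F(j\omega)} \right|
= \left| \frac{Y(j\omega)}{N(j\omega)} \right|
= \left| \frac{P(j\omega) G(j\omega)}{1 + P(j\omega) G(j\omega)} \right|
\le \delta_1(\omega), \quad \omega \in \Omega_1,$$

Disturbance rejection:

$$|T_2(j\omega)| = \left| \frac{Y(j\omega)}{D(j\omega)} \right|
= \left| \frac{M(j\omega)}{1 + P(j\omega) G(j\omega)} \right|
\le \delta_2(\omega), \quad \omega \in \Omega_2,$$

Control effort reduction:

$$|T_3(j\omega)| = \left| \frac{U(j\omega)}{M(j\omega) D(j\omega)} \right|
= \left| \frac{U(j\omega)}{N(j\omega)} \right|
= \left| \frac{U(j\omega)}{R(j\omega) F(j\omega)} \right|
= \left| \frac{G(j\omega)}{1 + P(j\omega) G(j\omega)} \right|
\le \delta_3(\omega), \quad \omega \in \Omega_3,$$

Reference tracking:

$$\delta_{4\inf}(\omega) \le |T_4(j\omega)| = \left| \frac{Y(j\omega)}{R(j\omega)} \right|
= \left| F(j\omega)\, \frac{P(j\omega) G(j\omega)}{1 + P(j\omega) G(j\omega)} \right|
\le \delta_{4\sup}(\omega), \quad \omega \in \Omega_4,$$

which, for any two plants $P_d$ and $P_e$ of the template, requires

$$\frac{|G(j\omega) P_d(j\omega)|\, |1 + G(j\omega) P_e(j\omega)|}{|G(j\omega) P_e(j\omega)|\, |1 + G(j\omega) P_d(j\omega)|}
\le \delta_5(\omega) = \frac{\delta_{4\sup}(\omega)}{\delta_{4\inf}(\omega)}, \quad \omega \in \Omega_4.$$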

QFT Bounds (Steps 6, 7 & 8)

For the nominal plant $P_0(j\omega)$, the QFT methodology
converts the stability and performance
specifications $\delta_k(\omega)$ and the plant model
uncertainty into a set of constraints or bounds
for each frequency of interest on the Nichols
chart (the Horowitz-Sidi bounds).

The plant template at $\omega_i$, $\mathcal{T}P(j\omega_i) = \{P(j\omega_i)\}$,
is approximated by a finite set of plants $\{P_r(j\omega_i),
r = 1, 2, \ldots\}$. Each plant can be expressed in its
polar form as $P_r(j\omega_i) = p(\omega_i) \exp(j\theta(\omega_i)) =
p\angle\theta$. Likewise the controller polar form is
$G(j\omega_i) = g(\omega_i) \exp(j\phi(\omega_i)) = g\angle\phi$, with a
controller phase $\phi$ that varies from $-2\pi$ to
0. Therefore, and for every frequency $\omega_i$, the
previous control specifications $\{|T_k(j\omega_i)| \le
\delta_k(\omega_i), k = 1, \ldots, 4\}$ are translated into a
set of quadratic inequalities with the format
$I^{k}_{\omega_i}(p, \theta, \delta_k, \phi) = a g^2 + b g + c \ge 0$, such
that,




































































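Stability and noise reduction:
$$p^2 \left( 1 - \frac{1}{\delta_1^2} \right) g^2 + 2 p \cos(\phi + \theta)\, g + 1 \ge 0,$$

Disturbance rejection:
$$p^2 g^2 + 2 p \cos(\phi + \theta)\, g + \left( 1 - \frac{m^2}{\delta_2^2} \right) \ge 0,$$
with $m = |M|$ and typically $m = p$, 1, or other options,

Control effort reduction:
$$\left( p^2 - \frac{1}{\delta_3^2} \right) g^2 + 2 p \cos(\phi + \theta)\, g + 1 \ge 0, \quad \text{and}$$

Reference tracking:
$$p_e^2 p_d^2 \left( 1 - \frac{1}{\delta_5^2} \right) g^2
+ 2 p_e p_d \left( p_e \cos(\phi + \theta_d) - \frac{p_d}{\delta_5^2} \cos(\phi + \theta_e) \right) g
+ \left( p_e^2 - \frac{p_d^2}{\delta_5^2} \right) \ge 0.$$

Quantitative Feedback Theory, Fig. 4 (a) QFT bounds
and $G(j\omega)$ design (loop-shaping), (b) Prefilter $F(j\omega)$
design: reference tracking specifications $\delta_{4\sup}(\omega)$ and
$\delta_{4\inf}(\omega)$, and upper and lower limits of $T_4$ due to the plant
uncertainty: $\delta_{4\inf}(\omega) \le |T_4| \le \delta_{4\sup}(\omega)$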

Now, with an appropriate algorithm (see 
references), the above quadratic inequalities are 
translated into a set of curves on the Nichols 
chart for each frequency of interest and type 
of specification: the individual specification 
bounds. Then, the more demanding (worst 
case) bound, i.e., the most restrictive one at 
every phase and each frequency of interest 
is computed to obtain the intersection of 
bounds, or the combined QFT bounds (see 
Fig. 4a). 
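
The link between a magnitude specification and its quadratic inequality can be verified numerically. The sketch below is only an illustration with hypothetical numbers: for one plant p∠θ, one controller phase φ, and a stability specification δ_1, it checks that the sign of p^2(1 - 1/δ_1^2)g^2 + 2p cos(φ + θ)g + 1 agrees with the direct test |L/(1 + L)| ≤ δ_1, where L = pg exp(j(θ + φ)).

```python
import cmath
import math

def stability_quadratic(p, theta, delta1, phi, g):
    """Value of p^2 (1 - 1/delta1^2) g^2 + 2 p cos(phi + theta) g + 1."""
    a = p * p * (1.0 - 1.0 / (delta1 * delta1))
    b = 2.0 * p * math.cos(phi + theta)
    return a * g * g + b * g + 1.0

def t1_magnitude(p, theta, phi, g):
    """|L / (1 + L)| for L = p*g*exp(j*(theta + phi))."""
    loop = p * g * cmath.exp(1j * (theta + phi))
    return abs(loop / (1.0 + loop))

# Illustrative values, not taken from the entry.
p, theta = 2.0, math.radians(-120.0)   # one plant of the template
phi = math.radians(-30.0)              # candidate controller phase
delta1 = 1.3                           # stability specification

gains = (0.1, 0.4, 0.6, 2.0)
for g in gains:
    ok_quadratic = stability_quadratic(p, theta, delta1, phi, g) >= 0.0
    ok_direct = t1_magnitude(p, theta, phi, g) <= delta1
    print(g, ok_quadratic, ok_direct)
```

Sweeping φ over its range and solving the quadratic for the admissible values of g, for every plant of the template, is one way to trace a single-specification bound; the loop above only checks individual gain values.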


Controller $G(j\omega)$ Design: Loop-Shaping
(Step 9) 

Although the objective of designing a controller
for an infinite number of plants seems to be a very
arduous task (there is an infinite number of plants
due to the model uncertainty), the integration
of all the information (uncertainty and specifications)
in a set of simple curves (the QFT bounds)
will allow the designer to use just a single plant,
the nominal plant $P_0$, and the bounds to design
the controller.

Then, in the design stage (loop-shaping), the
controller $G(j\omega)$ is synthesized on the Nichols
chart by adding poles and zeros until the nominal
loop, defined as $L_0(j\omega) = P_0(j\omega) G(j\omega)$,
lies near its bounds (see Fig. 4a). The bounds
express the plant models with uncertainty and
the performance specifications at each frequency.
An optimal controller in the sense of QFT will
be obtained if $L_0(j\omega)$ lies exactly on the bounds
at each frequency. Practically speaking, a good
design will place $L_0(j\omega)$ above the continuous-line
bounds and below the dashed-line bounds,
and will have the minimum possible magnitude
at every frequency. A general formulation for
the controller structure $G(s)$ is expressed by the
following transfer function:

$$G(s) = \frac{k_c \prod_{i=1}^{m_{rz}} \left( \frac{s}{-z_i} + 1 \right)
\prod_{i=1}^{m_{cz}/2} \left( \frac{s^2}{|z_i|^2} - \frac{2\,\mathrm{Re}(z_i)}{|z_i|^2}\, s + 1 \right)}
{s^r \prod_{j=1}^{m_{rp}} \left( \frac{s}{-p_j} + 1 \right)
\prod_{j=1}^{m_{cp}/2} \left( \frac{s^2}{|p_j|^2} - \frac{2\,\mathrm{Re}(p_j)}{|p_j|^2}\, s + 1 \right)}$$

where $k_c$ is the controller gain, $z_i$ is a zero (real
or complex) with $m_{rz}$ and $m_{cz}$ the number of real
and complex zeroes respectively, and $p_j$ is a pole
(real or complex) with $m_{rp}$ and $m_{cp}$ the number
of real and complex poles respectively ($m_{cz}$ and
$m_{cp}$ even numbers). The controller may have also
some poles at the origin (integrators), with $r = 0$,
1 or 2, etc.
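
During loop-shaping, a controller of this gain/zero/pole form has to be evaluated repeatedly at the frequencies of interest. The sketch below is only an illustration under stated assumptions: each real root contributes a factor s/(-root) + 1, each complex-conjugate pair (listed once) contributes s^2/|root|^2 - 2 Re(root) s/|root|^2 + 1, and the function name, argument names, and numerical values are hypothetical.

```python
def controller(s, kc, real_zeros=(), complex_zeros=(), real_poles=(),
               complex_poles=(), r=0):
    """Evaluate G(s); complex roots are given once per conjugate pair."""
    value = kc / (s ** r)  # gain and r integrators
    for z in real_zeros:
        value *= s / (-z) + 1.0
    for z in complex_zeros:
        m2 = abs(z) ** 2
        value *= s * s / m2 - 2.0 * z.real / m2 * s + 1.0
    for p in real_poles:
        value /= s / (-p) + 1.0
    for p in complex_poles:
        m2 = abs(p) ** 2
        value /= s * s / m2 - 2.0 * p.real / m2 * s + 1.0
    return value

# PI-like controller with one real zero at -2, one real pole at -50
# and one integrator (all values are illustrative).
G = controller(1j * 1.0, kc=3.0, real_zeros=[-2.0], real_poles=[-50.0], r=1)
print(abs(G))
```

With this normalization every factor equals 1 at s = 0, so for r = 0 the gain k_c is the low-frequency gain of the controller.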

Prefilter $F(j\omega)$ Design (Step 10)

If the feedback system includes a reference tracking
problem, then the best choice is to use a
prefilter $F(s)$ - the second degree of freedom.
While the feedback controller $G(s)$ reduces the
effect of the uncertainty and improves stability,
disturbance rejection, and other specifications,
the prefilter $F(s)$ is designed to fulfill reference
tracking requirements. Figure 4b shows a typical
prefilter design in the Bode diagram. $\delta_{4\sup}(\omega)$ and
$\delta_{4\inf}(\omega)$ are the reference tracking specifications,
defined as a band (outer dashed lines, Fig. 4b).
The transfer function $T_4$ shows an upper and a
lower limit (inner dashed lines, Fig. 4b) due to the
plant uncertainty. After an appropriate prefilter
design, the $T_4$ limits will be in the middle of the
$\delta_{4\sup}$-$\delta_{4\inf}$ band:

$$\delta_{4\inf}(\omega) \le |T_4| \le \delta_{4\sup}(\omega), \qquad
|T_4(j\omega)| = \left| \frac{Y(j\omega)}{R(j\omega)} \right|
= \left| \frac{P(j\omega) G(j\omega)}{1 + P(j\omega) G(j\omega)}\, F(j\omega) \right|$$

Validation (Steps 11,12,13) 

Once the design of the controller (and prefilter
if needed) is finished, it will be convenient to
analyze the performance of the complete control
system under different scenarios, including: (a)
frequency domain analysis of each specification
for all the significant plants within the model
uncertainty and (b) time domain simulations, typically
using a Monte Carlo campaign for the
uncertainty, first with the linear system and then
with nonlinear elements (saturation, etc.).


Programs and Data 

Computer-aided design (CAD) tools have definitely
facilitated the use of QFT. The MATLAB code
of the interactive object-oriented QFT CAD
tool developed by Garcia-Sanz et al. for ESA-ESTEC
(2014) can be found at
http://cesc.case.edu/OurQFTCT.htm (free download). Another
popular QFT CAD tool in the 1990s, developed
by Borghesani, Chait & Yaniv, can be found
at http://www.terasoft.com/products/QFT/index.html.


Applications and Future Directions 

• QFT has been successfully applied to a
wide variety of control problems, including
stable and unstable plants, minimum and
non-minimum phase systems, single-input
single-output and multiple-input multiple-output
processes, with linear and nonlinear
characteristics, long time delay, distributed
parameter systems, and time-varying plants;
and has been combined with feed-forward
control topologies, multi-loop systems, etc.
Also, QFT has been used in many real-world
applications: e.g., flight control, wind energy,
water treatment plants, spacecraft, power
systems, mechanical systems, motion control,
chemical reactors, etc. (see Garcia-Sanz and
Houpis 2012; Houpis et al. 2006).

• Future research on QFT includes, among others,
new multiple-input multiple-output techniques,
nonlinear plants, distributed parameter
systems, load-sharing control, etc.


Cross-References

► Classical Frequency-Domain Design Methods 

► Polynomial/Algebraic Design Methods 

► Robust Adaptive Control 

► Spectral Factorization


Abstract

This article briefly describes the topic of quantized
control with limited data rates. The focus
is on the problem of stabilizing a linear time-invariant
plant over a digital channel and the
associated data rate theorems. It is shown that
the deepest results in this area require a unified
treatment of its communications and control
aspects.


Keywords 

Control under communication constraints; Quantization;
Quantized control


Introduction 

One of the standard assumptions of classical control
theory is that the signals sent from sensors to
controllers and from controllers to actuators take
continuous values with infinite precision. The advent
of computer-based and digitally networked
control systems challenged this assumption, since
the analog plant outputs or control variables in
such systems must be reduced to finite bit strings
or discrete symbols for storage, manipulation,
and transmission. This process of converting a
continuous-valued variable into a finite-valued
one is called quantization and entails a potentially
significant loss of resolution and closed-loop performance.
Quantized control is concerned with
the analysis and design of control systems which
feature such analog-to-digital conversions in the
feedback loop.

There is a vast literature on this topic and the
aim of this article is to briefly explain some of
its key ideas. For reasons of space, the discussion
is largely confined to the question of how
to stabilize a linear time-invariant plant over a
digital channel. It is shown that the deepest results
here emerge from treating the communications
and control aspects jointly, instead of separately.
The reader is referred to the survey (Nair et al.
2007) and the references therein for a discussion
of other issues such as optimality and transient
performance.


Quantization

Quantization has long been an object of study
in communications and information theory -
see Gersho and Gray (1993) and the references
therein. In its simplest form, a signal $x(\cdot) : \mathbb{R} \to
\mathbb{R}^n$ is first sampled at regular time intervals
$t = 0, \tau, 2\tau, \ldots$ to yield a discrete-time signal
$(x(k\tau))_{k \in \mathbb{Z}}$, with the sampling frequency $1/\tau$
chosen to be greater than the Nyquist frequency
of $x$ (i.e., twice its bandwidth). Each sample
$x_k := x(k\tau)$ is then passed through a static,
memoryless quantizer $Q$ to yield a quantized
discrete-time signal

$$x^q_k = Q(x_k) \in \{q^1, \ldots, q^M\} \subset \mathbb{R}^n, \quad k \in \mathbb{Z}_{\ge 0}, \qquad (1)$$

which can take $M$ distinct values in $\mathbb{R}^n$. If the
quantizer is known to both transmitter and receiver,
each of these $M$ values can be represented
by a binary string with $\lceil \log_2 M \rceil$ bits. When the
input dimension $n = 1$, the quantizer is called
scalar; otherwise, it is a vector quantizer. The
regions $R^i := Q^{-1}(q^i)$, $1 \le i \le M$, are called
the quantizer cells and together form a partition of
$\mathbb{R}^n$. Thus an $M$-valued quantizer is fully defined
by its quantizer cells $R^i$ and associated quantizer
points $q^i$, $1 \le i \le M$.
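
The definitions above can be made concrete with a scalar uniform quantizer. The sketch below is only an illustration: the interval [-1, 1], the choice M = 8, the clamping of out-of-range inputs to the end cells, and the helper names are all assumptions. It returns the cell index of a sample, the binary string of ⌈log_2 M⌉ bits that a coder could transmit, and the quantizer point that a receiver knowing the quantizer would reconstruct.

```python
import math

def make_uniform_quantizer(a, b, M):
    """Return (index, point) maps for M equal cells on [a, b]."""
    width = (b - a) / M

    def index(x):
        # Clamp overload inputs to the first or last cell.
        i = int(math.floor((x - a) / width))
        return min(max(i, 0), M - 1)

    def point(i):
        # Quantizer point q^i: midpoint of cell i.
        return a + (i + 0.5) * width

    return index, point

M = 8
index, point = make_uniform_quantizer(-1.0, 1.0, M)
bits = math.ceil(math.log2(M))      # bits needed per transmitted value
x = 0.3
i = index(x)
print(i, format(i, "0{}b".format(bits)), point(i))
```

Only the index needs to cross the channel; the reconstruction point is recovered on the receiving side from the shared description of the quantizer.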

The quantization error or quantizer noise is
defined as $n_k := x^q_k - x_k$. When the inputs $x_k$
are identically distributed random variables, then
a standard goal is to design $Q$ so as to minimize
the mean-square quantizer noise

$$D := E[\|Q(x_k) - x_k\|^2], \qquad (2)$$

where $E[\cdot]$ is the expectation functional. This
yields an optimal quantizer $Q^*$ with cells that
satisfy the nearest-neighbor property, i.e.,

$$x \in R^{i*} \Rightarrow \|x - q^{i*}\| \le \|x - q^{j*}\|, \quad \forall j \ne i.$$

When $\| \cdot \|$ is the Euclidean norm (possibly
weighted), the quantizer cells $R^{i*}$, $1 \le i \le M$,
are convex polygons and form a Voronoi partition
of $\mathbb{R}^n$, and furthermore $q^{i*}$ is the centroid of $R^{i*}$
with respect to the stationary distribution $F_x$ of
$x_k$, i.e., $q^{i*} = E[x_k \mid x_k \in R^{i*}]$. As a consequence,
the optimal quantizer is statistically unbiased,
i.e., $E[n_k] = 0$, and furthermore $x_k$ and the
quantizer noise $n_k$ are uncorrelated at time $k$, i.e.,
$E[x_k n_k] = 0$. However, note that $n_k$ and $x_j$ may
be correlated for $j \ne k$, and $(n_k)$ may itself be a
correlated process.

If $Q$ is not optimal but $M$ is large (i.e.,
the quantizer is high resolution or fine), then
$E[x_k n_k] = o(1/M)$, provided that $q^i$ is the centroid
of $R^i$ with respect to Lebesgue measure $\mu$
and $x_k$ has a probability density function (pdf) $f_x$
with suitable continuity properties. The reasoning
here is that each region $R^i$ will typically be very
small, so that $f_x$ will not vary much on each $R^i$,
yielding a conditional pdf of $x_k$ given $R^i$ that is
approximately uniform on $R^i$.

When $Q$ is a scalar uniform quantizer on
an interval $[a, b]$, these considerations yield the
asymptotic formula

$$D \approx (b - a)^2 / (12 M^2), \qquad (3)$$

provided that the overload regions - i.e., the tails
of $f_x(x)$ on the regions $x < a$ or $x > b$ -
make negligible contributions to $D$. Note that this
expression does not depend on the distribution of
the input. For large $M$, it can be shown that the
optimal vector quantizer has a normalized point
density proportional to $f_x^{n/(n+2)}$ and yields

$$D_{\min} \approx \frac{c_n}{M^{2/n}} \left( \int f_x(x)^{n/(n+2)}\, d\mu(x) \right)^{(n+2)/n}, \qquad (4)$$

where the constant $c_n$ depends only on $n$.
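
Formula (3) can be checked numerically for a scalar uniform quantizer. The sketch below is only an illustration under stated assumptions: the input is taken to be uniformly distributed on [a, b] (so there is no overload), the quantizer points are the cell midpoints, and the expectation is approximated by a deterministic midpoint rule rather than by random sampling. The function name and sample count are hypothetical.

```python
def uniform_quantizer_mse(a, b, M, samples_per_cell=1000):
    """Mean-square error of a midpoint uniform quantizer, uniform input."""
    width = (b - a) / M
    total, count = 0.0, 0
    for i in range(M):
        q = a + (i + 0.5) * width          # quantizer point of cell i
        for k in range(samples_per_cell):
            # Evenly spaced sample inside cell i (midpoint rule).
            x = a + i * width + (k + 0.5) * width / samples_per_cell
            total += (q - x) ** 2
            count += 1
    return total / count

a, b = -1.0, 1.0
for M in (4, 16, 64):
    print(M, uniform_quantizer_mse(a, b, M), (b - a) ** 2 / (12 * M ** 2))
```

For a uniform input the two columns agree up to the discretization error of the midpoint rule; for other smooth densities the agreement would be expected only as M grows, which is the sense in which (3) is asymptotic.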

Quantized Control: Basic Formulation 

Much of the theory of quantized control concerns 
finite-dimensional linear time-invariant (LTI) 
plants. A formulation is provided in this section 
to help fix ideas, for the case of a single feedback 
loop containing a single errorless digital channel. 

Consider the discrete-time plant

$$x_{k+1} = A x_k + B u_k + v_k, \quad y_k = F x_k + w_k, \qquad (5)$$

where at every time $k \in \mathbb{Z}_{\ge 0}$, $x_k \in \mathbb{R}^n$ is the
state with $x_0$ unknown, $u_k \in \mathbb{R}^m$ is the control
input, $y_k \in \mathbb{R}^p$ is the measured output, $v_k \in \mathbb{R}^n$
is unknown process noise, $w_k \in \mathbb{R}^p$ is unknown
measurement noise, and $A$, $B$, and $F$ are constant
known matrices of appropriate dimensions. For
the problem to be well posed, assume that the
matrix pairs $(A, B)$ and $(F, A)$ are, respectively,
reachable and observable. Suppose that the output
sensors communicate with the controller over
a digital channel that can carry one symbol $s_k$
from a finite, possibly time-varying alphabet $\mathcal{S}_k$
of cardinality $M_k \ge 1$ during the $(k + 1)$-th
sampling interval. Assume for simplicity that the
channel is errorless, with negligible propagation
delay. The asymptotic average rate at which the
channel transports data may then be defined as


$$R := \liminf_{k \to \infty} \frac{1}{k} \sum_{j=0}^{k-1} \log_2 M_j \quad \text{(bits/sample)}. \qquad (6)$$

Note that if the channel alphabet $\mathcal{S}_k$ is constant
or varies periodically with $k$, the inferior limit
reduces to a straight limit.
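
The average in (6) is easy to evaluate for a periodic alphabet. The sketch below is only an illustration: the alphabet sizes 1, 4, 16 repeating with period 3 are a hypothetical choice, and the function simply computes the finite-horizon average whose limit defines R.

```python
import math

def average_rate(cardinalities):
    """(1/k) * sum_j log2(M_j) over the first k channel uses."""
    return sum(math.log2(m) for m in cardinalities) / len(cardinalities)

# Alphabet sizes repeating with period 3: 1 symbol (no data), 4, 16.
period = [1, 4, 16]
for k in (3, 30, 300):
    print(k, average_rate(period * (k // 3)))
```

Over whole periods the average is constant at (0 + 2 + 4)/3 = 2 bits/sample, which illustrates why the inferior limit reduces to an ordinary limit in the periodic case.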

In full generality, each transmitted symbol
may depend on all past and present measurements
and past symbols,

$$s_k = \gamma_k(y_0, \ldots, y_k, s_0, \ldots, s_{k-1}) \in \mathcal{S}_k, \quad \forall k \in \mathbb{Z}_{\ge 0}, \qquad (7)$$

where $\gamma_k$ is the coder mapping at time $k$. At time
$k$ the controller has $s_0, \ldots, s_k$ available and then
applies a control law of the general form

$$u_k = \delta_k(s_0, \ldots, s_k) \in \mathbb{R}^m, \quad \forall k \in \mathbb{Z}_{\ge 0}, \qquad (8)$$

where $\delta_k$ is the controller mapping at time $k$.

In practice, additional memory or structural
constraints are usually placed on the general
coding and control rules (7) and (8). For instance,
if a static quantizer of the form (1) is used, then
the coding alphabet $\mathcal{S}_k = \mathcal{S}$ will be constant
and $s_k = \gamma(y_k)$ will represent the index of
the quantizer cell that contains $y_k$. Similarly,
a static, memoryless controller is captured by
setting $u_k = \delta(s_k)$ in (8).

Finite-dimensional coding and control laws
may be formulated by defining internal coder and
controller states $\psi_k$ and $\psi'_k$ with local updates of
the form

$$s_k = \gamma(y_k, \psi_{k-1}), \qquad \psi_k = \phi(s_k, \psi_{k-1}), \qquad (9)$$
$$\psi'_k = \eta(s_k, \psi'_{k-1}), \qquad u_k = \delta(\psi'_k). \qquad (10)$$

If the states $\psi_k$ and $\psi'_k$ are finite valued, then the
coding and control laws are called finite-state.


Additive Noise Model

Early approaches to quantized control modeled
quantization errors as additive noise, in order to
allow the use of well-developed tools from linear
stochastic control (Curry 1970). While this was
reasonable at high quantizer resolution, it failed
to capture two key properties.

A simple example illustrates this. Consider
a scalar, noiseless, fully observed, unstable LTI
plant - i.e., (5) with $n = 1$, $A = a$ with $|a| > 1$,
$B = F = 1$, and $w_k = v_k = 0$ - where $x_0$ is
a random variable. Under static, high-resolution
uniform quantization, the data available to the
controller is expressed as a noisy linear measurement

$$y'_k := Q(x_k) = x_k + n_k, \quad k \in \mathbb{Z}_{\ge 0},$$

where the quantizer error process $(n_k)$ is treated
as zero mean white noise uncorrelated with $(x_k)$
and having constant variance given by (3).

The first shortcoming of this approach is that
it precludes the possibility of asymptotic mean-square
stability, which would effectively require
the controller to estimate the initial state $x_0$ with a
mean-square error diminishing strictly faster than
$a^{-2k}$. This turns out to be impossible under the
uncorrelatedness assumption and the constraint

M > i- 

However, in the seminal paper (Delchamps
1990), it was shown that asymptotic stability
could in fact be achieved, by using a nonlinear
controller that exploited the correlation between
successive quantizer errors. To see this, suppose
that the unknown initial state $x_0$ is confined to a
known interval $[-l_0, l_0]$, and let $l_k > 0$,
$k = 1, 2, \ldots$, represent bounds to be
determined on the future states $x_k$. Let $Q$ be a
static one-bit quantizer - i.e., with $M = 2$ - such
that $Q(x) = 1$ if $x \ge 0$ and $Q(x) = -1$ if $x < 0$.
At time $k$ let $u_k = -0.5\, a\, l_k\, Q(x_k)$ so that

$$x_{k+1} = \begin{cases} a(x_k - 0.5\, l_k) & \text{if } 0 \le x_k \le l_k \\ a(x_k + 0.5\, l_k) & \text{if } -l_k \le x_k < 0 \end{cases}
\quad \Rightarrow \quad |x_{k+1}| \le 0.5\, |a|\, l_k =: l_{k+1}.$$

If $|a| < 2$ then $l_k \to 0$, and asymptotic stability
is achieved uniformly and with exponential
convergence.

However, the main drawback of the additive
white noise model is that it does not predict the
loss of closed-loop stability that can result when
the quantizer resolution is too coarse. This is
because the number $M$ of quantizer points only
serves to determine the variance of the additive
noise $n_k$: reducing $M$ increases the variance of
$n_k$ and the mean-square states, but they remain
bounded over time. In contrast, a rigorous analysis
reveals that stability is impossible by any
means, linear or nonlinear, when $M$ drops below
a certain threshold.

Numerous proofs of this loss of stability exist.
In a stochastic setting, the argument is based on
fixing the coder and controller and expanding out
the closed-loop dynamics of the scalar LTI plant
to write

$$x_k = a^k x_0 - a^k z_k, \qquad (11)$$

where $z_k := -a^{-k} \sum_{j=0}^{k-1} a^{k-1-j} u_j$. As $z_k$ is a
function of $(s_0, \ldots, s_{k-1}) \in \mathcal{S}^k$, it can take at most $M^k$
values. Furthermore, in the absence of noise, it
is fully determined by $x_0$, for a given coding
and control policy (7) and (8). Thus $z_k$ can be
regarded as the output $Q'_k(x_0)$ of an $M^k$-valued
quantizer. Substituting this into (11) yields

$$x_k = a^k \left( x_0 - Q'_k(x_0) \right).$$

From the asymptotic quantizer result (4), applied
in the scalar case to the $M^k$-valued quantizer $Q'_k$,
it then follows that for large $k$,

$$E\left[x_k^2\right] \gtrsim c \left( \frac{a^2}{M^2} \right)^{k} \left( \int f_{x_0}(x)^{1/3}\, dx \right)^{3}.$$

Thus a necessary condition for asymptotic mean-square
stabilizability is that $M > |a|$ - see Nair
and Evans (2000) for details.


The Data Rate Theorem


The discussions above emphasized the need for a
more rigorous approach to quantized control. In
the literature, the necessary condition $M > |a|$
was first derived in a nonrandom setting, where
it was shown to be both sufficient and necessary
to be able to ensure uniform stability (Baillieul
1999; Wong and Brockett 1999).

The sufficiency argument is constructive. Let
$Q$ be an $M$-level uniform quantizer on $[-1, 1]$,
with cells formed by partitioning $[-1, 1]$ into
$M$ subintervals $R^1, \ldots, R^M$ of equal length and
setting $Q(z)$ to be the midpoint of $R^i$ when
$z \in R^i$. Suppose that at time $k$ the unknown
state $x_k$ lies in a known interval $[-l, l]$, and set
$u_k = -a\, l\, Q(x_k / l)$. Since each cell has length $2/M$,
the midpoint is within $1/M$ of any point in its cell. Thus

$$|x_{k+1}| = |a|\, |x_k - l\, Q(x_k / l)| = |a|\, l \left| \frac{x_k}{l} - Q\!\left(\frac{x_k}{l}\right) \right| \le \frac{|a|\, l}{M}.$$

When $M > |a|$, the right-hand side is less than $l$. Thus
$x_{k+1} \in [-l, l]$ as well, and boundedness is
achieved. Uniform asymptotic stability can be
achieved by replacing the constant parameter $l$ in
the argument above with a time-varying bound $l_k$,
updated as $l_{k+1} = |a|\, l_k / M$, so that $l_k \to 0$.
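The constructive scheme above can be simulated directly. The following sketch assumes the scalar noiseless plant $x_{k+1} = a x_k + u_k$ used in this section; the numerical values and the function names are illustrative choices made here, not taken from the article.

```python
def quantize_midpoint(z, M):
    """M-level uniform quantizer on [-1, 1]: midpoint of the cell containing z."""
    idx = min(max(int((z + 1.0) * M / 2.0), 0), M - 1)  # clamp at the edges
    return -1.0 + (2 * idx + 1) / M

def simulate(a, M, x0, l0, steps):
    """Run u_k = -a*l_k*Q(x_k/l_k) with the shrinking bound l_{k+1} = |a|*l_k/M."""
    x, l = x0, l0
    history = [(x, l)]
    for _ in range(steps):
        u = -a * l * quantize_midpoint(x / l, M)
        x = a * x + u          # scalar plant with B = 1 and no noise
        l = abs(a) * l / M     # bound update; contracts only if M > |a|
        history.append((x, l))
    return history

# Assumed example: |a| = 2.5 needs M >= 3 levels.
history = simulate(a=2.5, M=3, x0=0.7, l0=1.0, steps=50)
x_final, l_final = history[-1]
print(f"|x_50| = {abs(x_final):.2e}, l_50 = {l_final:.2e}")
```

Both the state and the bound decay geometrically at rate $|a|/M$, in line with the argument in the text.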

The necessity argument is based on volume
partitioning. The basic idea is to fix an arbitrary
coding and control policy and let $m_k$ be the
Lebesgue measure of the set of values that $x_k$ can
take at time $k \in \mathbb{Z}_{\ge 0}$. After $k$ time steps, the plant
dynamics expand this uncertainty volume $m_0$ by a
factor $|a|^k$. However, the coder effectively divides
this region into $M^k$ disjoint, exhaustive pieces,
each of which is shifted by the controller. As
Lebesgue measure is translation invariant, it then
follows that $m_k \ge |a|^k m_0 / M^k$. Consequently $M$
must exceed $|a|$ if the closed loop is uniformly
asymptotically stable.

The tight criterion $M > |a|$, or equivalently
$R > \log_2 |a|$, was the first instance of the data
rate theorem. Volume-partitioning arguments and
Jordan canonical forms can be used to generalize
it to LTI plants with vector-valued states, yielding
the necessary and sufficient condition

$$R > \sum_{i:\, |\lambda_i| \ge 1} \log_2 |\lambda_i| =: H, \qquad (12)$$

where $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of $A$. This
criterion is remarkably universal, having been
shown to be tight for a variety of settings and
objectives: e.g., for asymptotic $r$-th moment
stabilizability with random, unbounded $x_0$ and no
process or measurement noise (Nair and Evans
2003); uniform stabilizability with bounded $x_0$
and no process or measurement noise (Baillieul
2002); uniform stabilizability with bounded initial
state, process, and measurement noise (Hespanha
et al. 2002; Tatikonda and Mitter 2004);
and mean-square stabilizability with random,
unbounded initial state, process, and measurement
noise (Nair and Evans 2004).
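For a concrete plant, the threshold $H$ in (12) is a simple function of the open-loop eigenvalues. The helper below is a minimal sketch: the function names and the example eigenvalues are hypothetical, and the eigenvalues are assumed to have been computed elsewhere.

```python
import math

def intrinsic_entropy_rate(eigenvalues):
    """H = sum of log2|lambda_i| over eigenvalues with |lambda_i| >= 1 (bits/sample)."""
    return sum(math.log2(abs(lam)) for lam in eigenvalues if abs(lam) >= 1.0)

def min_levels_scalar(a):
    """Smallest integer M with M > |a| (scalar plant, constant alphabet)."""
    return math.floor(abs(a)) + 1

# Hypothetical eigenvalues of A; abs() also handles complex values.
eigs = [2.0, 0.5, -4.0, 1j]
H = intrinsic_entropy_rate(eigs)
print(f"H = {H} bits/sample; stabilization requires a rate R > H")
print("scalar a = 2.5 needs at least M =", min_levels_scalar(2.5))
```

Stable eigenvalues contribute nothing, and eigenvalues on the unit circle contribute zero bits, so only the strictly unstable modes raise the required rate.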

The deep nature of (12) becomes even clearer 
when it is noted that the right-hand side of (12) 
coincides with the intrinsic entropy generation 
rate H of the (open-loop) plant, in both the 
Kolmogorov-Sinai and topological senses; that 
is, it describes the growth rate of the number of 
distinguishable state trajectories. Thus the data 
rate theorem states that stability is possible iff 
the communication rate in the feedback loop 
exceeds the rate at which the plant generates 
uncertainty. This interpretation leads to the notion 
of feedback entropy (see cross-reference to article 
by C. Kawan). 

Zooming Quantized Control 

When the plant noise and initial state of the plant
(5) are bounded, stability (in a uniform sense) can
be guaranteed by applying a linear observer to
track the plant states with bounded error and then
applying a suitable static, memoryless coding and
control policy on the observer states $x^o_k$.

However, if the noise or initial state has 
unbounded support - e.g., when they are 
Gaussian or when prior bounds on them are 
not known - then stability cannot be achieved 
by any such static memoryless scheme or indeed 
by any scheme where the control inputs (8) are 
bounded (Nair and Evans 2004). The explanation 
is simple: due to the infinite support, there is a 
nonzero probability that the propagated state $A x_t$
will be beyond reach of the control input at some 
time t. The unstable plant dynamics then amplify 
this shortfall, causing the same phenomenon to 
occur with increasing probability at subsequent 
times and inevitably leading to instability. 

One solution is to use a zooming quantizer,
i.e., having a dynamic range $l_k > 0$ that is
not bounded a priori but expands or contracts
according to the most recent symbol (Brockett
and Liberzon 2000). In the noiseless case, if this
symbol corresponds to the "overload region" of
the quantizer (as indicated by a special symbol),
then the range is updated as $l_{k+1} := \varphi_{\mathrm{out}}\, l_k$,
where $\varphi_{\mathrm{out}} > 1$ is the "zoom-out" factor.
Otherwise $l_{k+1} := \varphi_{\mathrm{in}}\, l_k$, where $\varphi_{\mathrm{in}} < 1$ is the
"zoom-in" coefficient.
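The range update can be sketched for a scalar noiseless plant $x_{k+1} = a x_k + u_k$. Several choices below are assumptions made for illustration and are not prescribed by the article: no control is applied on overload, the zoom-in coefficient is set to $|a|/M$, and the numerical values are arbitrary.

```python
def quantize_midpoint(z, M):
    """M-level uniform quantizer on [-1, 1]: midpoint of the cell containing z."""
    idx = min(max(int((z + 1.0) * M / 2.0), 0), M - 1)
    return -1.0 + (2 * idx + 1) / M

def zooming_control(a, M, x0, l0, phi_out, steps):
    """Zooming quantizer with range l_k; the symbol is a cell index or 'overload'."""
    phi_in = abs(a) / M  # assumed zoom-in coefficient (< 1 when M > |a|)
    x, l = x0, l0
    trace = []
    for _ in range(steps):
        trace.append((x, l))
        if abs(x) > l:
            # Overload symbol: apply no control (assumption) and zoom out.
            u, l_next = 0.0, phi_out * l
        else:
            # In range: cancel the quantized state and zoom in.
            u, l_next = -a * l * quantize_midpoint(x / l, M), phi_in * l
        x, l = a * x + u, l_next
    trace.append((x, l))
    return trace

# The initial state lies far outside the initial range l0.
trace = zooming_control(a=1.5, M=4, x0=100.0, l0=1.0, phi_out=3.0, steps=60)
x_final, l_final = trace[-1]
print(f"peak |x| = {max(abs(x) for x, _ in trace):.1f}, final |x| = {abs(x_final):.2e}")
```

Because the update depends only on the transmitted symbol, a controller that starts from the same `l0` can reproduce `l` without extra communication. Here `phi_out` exceeds $|a|$, so the range eventually captures the growing state.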

In the communications literature such
schemes are called adaptive quantizers (Goodman
and Gersho 1974). If $\varphi_{\mathrm{out}}$ is sufficiently large
compared to the unstable open-loop eigenvalues,
and if $\varphi_{\mathrm{in}}$ is not too small, then global asymptotic
stability ensues. With unbounded noise in the
plant, variants of this scheme guarantee mean-square
stability at any data rate satisfying (12)
(Nair and Evans 2004) or input-to-state stability
(Liberzon and Nesic 2007).

Zooming quantization is an important example
of a finite-dimensional coder-controller (9) and
(10), with $l_k$ playing the role of an internal state
variable. As the range update is driven by the
symbols, the coder and controller can each generate
identical copies of $l_k$, provided that there are
no errors in the channel and they both start from
the same initial range $l_0$. The important issue
of how to design a scheme that can cope with
mismatched initial internal states or a small level
of channel errors is as yet largely unexplored.

Erroneous Digital Channels

The information-theoretic aspects of quantized 
control become especially pronounced when the 
channel is not error-free. In this case, the data rate 
theorem (12) can be extended, but in ways that 
are highly dependent on the precise setting and 
stability objective. 

A common figure of merit for a stochastic discrete
memoryless channel (DMC) is its ordinary
capacity C. This is defined operationally as the 
largest block-code bit rate that can be transmitted 
across the channel with negligible probability 
of decoding error, and also coincides with the 
largest rate of Shannon information across the 
channel (Shannon 1948). For a noiseless LTI 
plant with random initial state controlled over a 
DMC, the condition C > H is a tight criterion 
for almost sure (a.s.) asymptotic stabilizability 
(Matveev and Savkin 2007a). This is a natural 
generalization of (12). 

On the other hand, if the objective is to bound
the state moments of a scalar LTI plant subject to
bounded process noise, then the achievability of
this goal is determined by the anytime capacity
$C_{\mathrm{any}}$ (Sahai and Mitter 2006): this is essentially
given by the fastest decay rate of the decoding
error probability.

However, if the aim is a.s. boundedness of an
LTI plant with random initial state and bounded,
nonstochastic process noise, then the stabilizability
criterion changes again to $C_{0f} > H$ (Matveev
and Savkin 2007b). Here $C_{0f}$ is the zero-error
feedback capacity of the channel, defined as the
largest block-code bit rate that can be transmitted
across the channel with exactly zero probability
of decoding error and with perfect channel feedback
(Shannon 1956).

As $C_{0f} < C_{\mathrm{any}} < C$ for most channels, these
conditions do not coincide. This suggests that
there is no universal, operationally relevant
information theory for feedback control over
error-prone channels: such a theory must instead be
tailored to match the underlying objectives and
assumptions. For systems with nonstochastic
disturbances, preliminary steps in this direction have
been taken in Nair (2012, 2013). The reader is
also referred to You and Xie (2011) and Minero
et al. (2013) for information-theoretic analyses of
stochastic linear systems controlled via Markov
channels.


Summary and Future Directions 

This article described the key elements of quantized
control with finite data rates, emphasizing
the interplay between coding and control.
A great deal is now known about the fundamental
limitations on stability in quantized control
systems consisting of a single feedback loop.
Two major directions for future research suggest
themselves:

- Little work has been done on designing 
optimal coding and control schemes or 
determining optimal costs at a given rate, 
apart from one or two special cases and 
structural results - see Nair et al. (2007) 
and the references therein. It is very unlikely 
that explicit, closed-form solutions will be 
possible. However, numerical approaches 
based on the Lloyd-Max algorithm, particle 
filtering, and model-predictive control may 
prove fruitful. 

- Networked control systems usually consist of 
a number of subsystems interconnected over a 
network. Furthermore, in multi-agent systems 
the main objective may not be stability, but 
rather coordination or consensus to a common
state. Comparatively little is known about
the data rate requirements and information-theoretic
aspects of these problems.

Abstract

In this article, we study the tools and methodologies
for the analysis and design of control
systems in the presence of random uncertainty.
For analysis, the methods are largely based on
the Monte Carlo simulation approach, while for
design new randomized algorithms have been developed.
These methods have been successfully
employed in various application areas, which include
systems biology; aerospace control; control
of hard disk drives; high-speed networks; quantized,
embedded, and electric circuits; structural
design; and automotive and driver assistance.


Keywords 

Chernoff bound; Hoeffding inequality; Monte 
Carlo simulation; Randomization algorithms 


Preliminaries 

Randomized methods for control deal with the
design of uncertain and complex systems. They
were originally developed for linear systems
affected by structured uncertainty, usually
expressed in the so-called $M$-$\Delta$ configuration.
A similar approach may be followed when dealing
with uncertainty in other contexts, such as
uncertainty in the environment (random disturbances),
or even when there is no uncertainty in
the problem formulation but the complexity of
the problem is such that randomized methods
may be the best approach, since these methods
are known to break the curse of dimensionality;
see Tempo et al. (2013) for details.

For the sake of simplicity, we consider here an
uncertain plant transfer function $P(s, q)$ affected
by parametric uncertainty

$$q = [q_1 \cdots q_\ell]^T$$

bounded in a set $Q \subset \mathbb{R}^\ell$. The objective is to
design the parameters $\theta \in \mathbb{R}^n$ of a controller
transfer function $C(s, \theta)$ so as to robustly guarantee
some desired performance. This is reformulated
as the problem of finding a design satisfying some
uncertain constraints of the form

$$f(\theta, q) \le 0 \quad \text{for all } q \in Q.$$

In other words, the goal is to design a robust controller
which satisfies the uncertain constraints.
Specific examples of these constraints include an
$H_\infty$ or $H_2$ norm bound on the closed-loop sensitivity
function or time-domain specifications.

Since this objective may be too hard to achieve 
in many situations, we are relaxing it as follows: 


J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9, 
© Springer-Verlag London 2015 
we would like to design controller parameters $\theta \in \mathbb{R}^n$
such that a certain violation is allowed, i.e.,

$$\begin{cases} f(\theta, q) \le 0 & \text{for all } q \in Q_{\mathrm{good}}; \\ f(\theta, q) > 0 & \text{for all } q \in Q_{\mathrm{bad}}, \end{cases}$$

where the good and bad sets satisfy the equations

$$\begin{cases} Q_{\mathrm{good}} \cup Q_{\mathrm{bad}} = Q; \\ Q_{\mathrm{good}} \cap Q_{\mathrm{bad}} = \emptyset, \end{cases}$$

and the goal is to guarantee that the bad set $Q_{\mathrm{bad}}$
is "small" enough. To state this concept more precisely,
we assume that $q \in Q$ is a random vector
with given probability density function (pdf), and
we introduce the probability of violation and the
controller reliability.

Definition 1 (Probability of Violation and Reliability)
The probability of violation for the
controller parameters $\theta \in \mathbb{R}^n$ is defined as

$$V(\theta) = \mathrm{Prob}\{q \in Q : f(\theta, q) > 0\}.$$

The reliability of the design $\theta \in \mathbb{R}^n$ is given by

$$R(\theta) = 1 - V(\theta).$$

In this context, we are satisfied if, given a violation
level $\rho \in (0, 1)$, the probability of violation
is sufficiently small, i.e., $V(\theta) \le \rho$. We remark
that relaxing the requirement of robust satisfaction
of the uncertain constraints $f(\theta, q) \le 0$ to a
probabilistic one (by means of the probability of
violation) does not by itself remove the computational
difficulty, because computing the probability $V(\theta)$
exactly is very hard in general: it requires solving a
multidimensional integral over the nonconvex domain
defined by $f(\theta, q) > 0$, with $q \in Q \subset \mathbb{R}^\ell$.
The problem is then resolved by introducing Monte
Carlo randomized algorithms (formally defined
in the next section). This computational
approach leads to solutions which are often
denoted as PAC (probably approximately correct)
(Vidyasagar 2002).

More precisely, for fixed design $\theta \in \mathbb{R}^n$, to
compute a Monte Carlo approximation based
on $N$ random simulations, we generate $N$
independent identically distributed (iid) random
samples of the uncertainty $q \in Q$, called the multisample,

$$q^{(1 \ldots N)} = \{q^{(1)}, \ldots, q^{(N)}\},$$

according to the given probability density function.
The cardinality $N$ of the multisample $q^{(1 \ldots N)}$
is often referred to as the sample complexity
(Vidyasagar 2001). The empirical violation of the
design $\theta$ is then defined.

Definition 2 (Empirical Violation) For given
$\theta \in \mathbb{R}^n$, the empirical violation of $f(\theta, q)$
with respect to the multisample
$q^{(1 \ldots N)} = \{q^{(1)}, \ldots, q^{(N)}\} \in Q^N$ is given by

$$\hat{V}_N(\theta, q^{(1 \ldots N)}) = \frac{1}{N} \sum_{i=1}^{N} \mathbb{I}_f(\theta, q^{(i)}),$$

where $\mathbb{I}_f(\theta, q^{(i)})$ is the indicator function

$$\mathbb{I}_f(\theta, q^{(i)}) = \begin{cases} 0 & \text{if } f(\theta, q^{(i)}) \le 0 \\ 1 & \text{otherwise.} \end{cases}$$


Monte Carlo Randomized Algorithms 
for Analysis 

In this section, we study Monte Carlo randomized
algorithms for analysis, i.e., when the controller
parameters are fixed, and in particular we concentrate
on a PAC computation of the probability
of violation. In agreement with classical notions
in computer science (Mitzenmacher and Upfal
2005; Motwani and Raghavan 1995), a randomized
algorithm (RA) is formally defined as an
algorithm that makes random choices during its
execution to produce a result. This implies that,
even for the same input data, the algorithm might
produce different results at different runs, and,
moreover, the results may be incorrect. Therefore,
statements regarding properties of these
algorithms are necessarily of probabilistic nature.

Formally, the probabilistic parameters $\epsilon$,
$\delta \in (0, 1)$, called accuracy and confidence,
respectively, are introduced. For any $\theta$, the PAC
approach provides an empirical violation which is
an approximation $\hat{V}_N(\theta, q^{(1 \ldots N)})$ to $V(\theta)$ within
accuracy $\epsilon$, and this event holds with confidence
$1 - \delta$.


Monte Carlo Randomized Algorithm 

Given a design $\theta \in \mathbb{R}^n$, a Monte Carlo
randomized algorithm (MCRA) is a randomized
algorithm that provides an approximation
$\hat{V}_N(\theta, q^{(1 \ldots N)})$ to $V(\theta)$ based on the multisample
$q^{(1 \ldots N)}$. Given accuracy $\epsilon$ and confidence $\delta$, the
approximation may be incorrect, i.e.,

$$\left| V(\theta) - \hat{V}_N(\theta, q^{(1 \ldots N)}) \right| > \epsilon,$$

but the probability of such an event is bounded,
and it is smaller than $\delta$.

In general, the results obtained by an MCRA
as well as its running time would be different
from one run to another since the algorithm is
based on random sampling. As a consequence,
the computational complexity of such an algorithm
is usually measured in terms of its expected
running times. MCRA are efficient because the
expected running time is of polynomial order
in the problem size (Tempo et al. 2013). One-sided
and two-sided Monte Carlo randomized
algorithms may also be defined (Tempo and Ishii
2007).

To derive the probabilistic properties of 
MCRA, we need to state the so-called Hoeffding 
inequality, which provides a bound on the error 
between the probability of violation and the 
empirical violation (Vidyasagar 2002). 


Two-Sided Hoeffding Inequality 

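For fixed $\theta \in \mathbb{R}^n$ and $\epsilon \in (0, 1)$, we have

$$\mathrm{Prob}\left\{ q^{(1 \ldots N)} \in Q^N : \left| V(\theta) - \hat{V}_N(\theta, q^{(1 \ldots N)}) \right| > \epsilon \right\} \le 2 e^{-2 N \epsilon^2}.$$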

For fixed accuracy $\epsilon$, we observe that the right-hand
side of this inequality approaches zero
exponentially in $N$. Furthermore, if we bound the
right-hand side of this inequality by the confidence
parameter $\delta$, we immediately obtain the classical
(additive) Chernoff bound (Chernoff 1952), which is
stated next.


Chernoff Bound 

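For any $\epsilon \in (0, 1)$ and $\delta \in (0, 1)$, if

$$N \ge \frac{1}{2 \epsilon^2} \log \frac{2}{\delta}$$

then, with probability greater than $1 - \delta$, we have

$$\left| V(\theta) - \hat{V}_N(\theta, q^{(1 \ldots N)}) \right| \le \epsilon.$$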

The Chernoff bound provides an indication of the
required sample size, i.e., it provides the so-called
sample complexity. More precisely, the sample
complexity of a randomized algorithm is defined
as the minimum cardinality of the multisample
$q^{(1 \ldots N)}$ that needs to be drawn in order to achieve
the desired accuracy $\epsilon$ and confidence $\delta$. Notice
that the confidence enters the Chernoff bound
in a logarithmic fashion, while the accuracy enters
quadratically; tightening the accuracy is therefore
much more expensive computationally than tightening
the confidence. Other large deviation
inequalities and sample complexity bounds are
discussed in the literature, including in particular
the (multiplicative) Chernoff bound and the
log-over-log bound for computing the so-called
empirical maximum (Tempo et al. 1997). We refer
to Vidyasagar (2002) for additional details.
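As an illustration of how the Chernoff bound is used, the sketch below computes the sample size for a given accuracy and confidence and then evaluates the empirical violation of Definition 2. The toy constraint $f(\theta, q) = q - \theta$ with $q$ uniform on $[0, 1]$ is an assumption chosen so that $V(\theta) = 1 - \theta$ is known exactly. The function names are likewise hypothetical.

```python
import math
import random

def chernoff_sample_size(eps, delta):
    """Smallest integer N with N >= log(2/delta) / (2*eps^2)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))

def empirical_violation(f, theta, samples):
    """Fraction of samples q for which the constraint f(theta, q) <= 0 is violated."""
    return sum(1 for q in samples if f(theta, q) > 0) / len(samples)

# Toy constraint (assumption): violated whenever q exceeds theta.
f = lambda theta, q: q - theta
eps, delta, theta = 0.01, 1e-3, 0.9

N = chernoff_sample_size(eps, delta)
rng = random.Random(1)
multisample = [rng.random() for _ in range(N)]  # iid samples of q
V_hat = empirical_violation(f, theta, multisample)
print(f"N = {N}, empirical violation = {V_hat:.4f}, true V = {1 - theta:.4f}")
```

Halving `eps` roughly quadruples `N`, whereas shrinking `delta` by a factor of ten adds only about $\log(10)/(2\epsilon^2)$ samples, which is the asymmetry described above.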

Remark 1 (Las Vegas Randomized Algorithms)
Las Vegas randomized algorithms (LVRA) are
based on random samples generated according
to a discrete probability density function, instead
of a continuous pdf as in the case of Monte
Carlo. Therefore, contrary to MCRA, LVRA provide
the "correct answer" with probability one
because the entire search space can be fully explored.
However, because of randomization, the
running time of an LVRA is random (similarly to
MCRA) and may be different in each execution.
Hence, it is of interest to study the expected
running time of the algorithm. It is noted that
the expectation is with respect to the random
samples generated during the execution of the
algorithm and not to the problem data. Classical
examples of LVRA are within computer science
and include the well-known randomized quicksort
(RQS) algorithm for ranking numbers, which
is implemented in a C library of the UNIX operating
system (Knuth 1998). Other more recent
developments in systems and control regarding
these algorithms are for the PageRank computation
in the Google search engine (Ishii and
Tempo 2010), consensus over large-scale networks
(Fagnani and Zampieri 2008), localization
and coverage control of robotic networks (Bullo
et al. 2012), and opinion dynamics (Frasca et al.
2013). These problems are generally formulated
in a graph theoretic setting consisting of nodes
and links, and either the nodes or the links are
randomly selected according to a given "local"
protocol (often called gossip) based on a given
discrete pdf.


Randomized Algorithms for Control 
Design 

This section deals with control problems which
require computing a design $\theta \in \mathbb{R}^n$ satisfying
some probabilistic properties on the uncertain
constraints $f(\theta, q)$. Two classes of problems,
feasibility and optimization, are considered.

Feasibility Problem

Given uncertain constraints $f(\theta, q)$ and level
$\rho \in (0, 1)$, compute $\theta \in \mathbb{R}^n$ such that

$$V(\theta) = \mathrm{Prob}\{q \in Q : f(\theta, q) > 0\} \le \rho. \qquad (1)$$

The second problem relates to the optimization of
a linear function of the design parameters under
probability constraints.

Optimization Problem

Given uncertain constraints $f(\theta, q)$, a linear objective
function $c^T \theta$, and level $\rho \in (0, 1)$, solve
the constrained optimization problem

$$\min_{\theta}\; c^T \theta \quad \text{subject to } V(\theta) = \mathrm{Prob}\{q \in Q : f(\theta, q) > 0\} \le \rho. \qquad (2)$$

Optimization problems subject to constraints of
the form $V(\theta) = \mathrm{Prob}\{q \in Q : f(\theta, q) > 0\} \le \rho$
are often called chance constraint optimization
(Uryasev 2000).

Most of the algorithms that have been studied
in the literature follow two main paradigms
and are often based on the following convexity 
assumption. 

Convexity Assumption 

The uncertain constraint / (9, q) is convex in 9 
for any fixed value of q e Q. 

The two solution paradigms that have been 
proposed are now summarized. The algorithms 
have been implemented in the Toolbox RACT 
(Randomized Algorithms Control Toolbox) for 
probabilistic analysis and control design in the 
presence of uncertainty (Tremba et al. 2008). 

Paradigm 1 (Sequential Approach) 

Under the convexity assumption, we study
the Feasibility Problem (1). The algorithms
presented in the literature for finding a probabilistic
feasible design (see, e.g., Calafiore et al. 2011)
follow a general iterative scheme (Fig. 1),
which consists of successive randomization steps
to handle uncertainty and optimization steps to
update the design parameters. In particular, these
algorithms share two fundamental ingredients:

1. A probabilistic oracle, which performs a
random check to assess whether the probability
of violation $V(\theta^{(k)})$ of the current candidate
solution $\theta^{(k)}$ is smaller than a given level $\rho$
and, when the candidate solution is found
unfeasible, returns a certificate of unfeasibility,
that is, a value $q^{(k)}$ such that
$f(\theta^{(k)}, q^{(k)}) > 0$

2. An update rule $\psi_{\mathrm{upd}}$, which exploits the
convexity of the problem for constructing a new
candidate solution $\theta^{(k+1)}$ based on the
probabilistic oracle outcome

Randomized Methods for Control of Uncertain Systems,
Fig. 1 Paradigm for sequential design consisting
of probabilistic oracle and update rule

In this paradigm, the algorithm returns a design
$\theta_k$ such that the probability that

$$V(\theta_k) = \mathrm{Prob}\{q \in Q : f(\theta_k, q) > 0\} \le \rho$$

is larger than $1 - \delta$. That is, the violation probability
associated to the design $\theta_k$ is smaller than the
level $\rho$, and this event holds with large confidence
$1 - \delta$.
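The oracle-plus-update loop can be sketched on a toy problem. Everything specific below is an assumption made for illustration: the scalar constraint $f(\theta, q) = q - \theta$ with $q$ uniform on $[0, 1]$, the fixed number of oracle checks per iteration, and a gradient-type update that moves $\theta$ just far enough to satisfy the violated sample. Algorithms in the literature choose the oracle sample size so as to obtain the stated confidence, which this sketch does not attempt.

```python
import random

def probabilistic_oracle(f, theta, sample_q, n_checks):
    """Return a violating sample (certificate of unfeasibility) or None if all checks pass."""
    for _ in range(n_checks):
        q = sample_q()
        if f(theta, q) > 0:
            return q
    return None

def sequential_design(f, update, theta0, sample_q, n_checks, max_iter=10_000):
    """Alternate random checks and design updates until the oracle finds no violation."""
    theta = theta0
    for _ in range(max_iter):
        certificate = probabilistic_oracle(f, theta, sample_q, n_checks)
        if certificate is None:
            return theta
        theta = update(theta, certificate)
    raise RuntimeError("no probabilistically feasible design found")

rng = random.Random(2)
f = lambda theta, q: q - theta                 # affine, hence convex, in theta
update = lambda theta, q: theta + f(theta, q)  # step onto the boundary f(theta, q) = 0
theta = sequential_design(f, update, theta0=0.0, sample_q=rng.random, n_checks=200)
print(f"returned design theta = {theta:.4f}, violation V = {1 - theta:.4f}")
```

For this toy constraint the violation probability of the returned design is simply $1 - \theta$, so the outcome of the random checks can be compared with the exact value.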

Paradigm 2 (Scenario Approach) 

Under the convexity assumption, we study the
optimization problem (2). We remark that, even
under this assumption, solving this problem
is very hard computationally because the probabilistic
constraint is nonconvex. To alleviate this
difficulty, we reformulate problem (2) as a so-called
scenario problem, introduced in Calafiore
and Campi (2006), which is now described.

For randomly extracted scenarios $q^{(1 \ldots N)}$, this
approach requires computing $\hat{\theta}_N \in \mathbb{R}^n$ that solves
the convex optimization problem subject to a
finite number of sampled constraints

$$\hat{\theta}_N = \arg\min_{\theta}\; c^T \theta \quad \text{subject to } f(\theta, q^{(i)}) \le 0, \quad i = 1, \ldots, N. \qquad (3)$$

In this paradigm, the algorithm returns in one
shot a design $\hat{\theta}_N$, and the sample complexity $N$
is such that the probability that

$$V(\hat{\theta}_N) = \mathrm{Prob}\{q \in Q : f(\hat{\theta}_N, q) > 0\} \le \rho$$

is larger than $1 - \delta$. That is, the violation probability
associated to the design $\hat{\theta}_N$ is smaller than the
level $\rho$, and this event holds with large confidence
$1 - \delta$.
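A one-dimensional instance shows the mechanics of the scenario problem (3). The setting is an assumption chosen so that the sampled program has a closed-form solution: $c = 1$, $f(\theta, q) = q - \theta$, and $q$ uniform on $[0, 1]$, so the minimizer is the largest sampled scenario and its exact violation probability is $1 - \hat{\theta}_N$. The sample size used is arbitrary and is not derived from a sample complexity bound.

```python
import random

def scenario_design_1d(scenarios):
    """Solve min theta subject to q_i - theta <= 0 for all i; the solution is max(q_i)."""
    return max(scenarios)

rng = random.Random(3)
N = 1000  # arbitrary number of scenarios (assumption)
scenarios = [rng.random() for _ in range(N)]

theta_hat = scenario_design_1d(scenarios)
violation = 1.0 - theta_hat  # exact V(theta_hat) when q is uniform on [0, 1]
print(f"theta_hat = {theta_hat:.5f}, violation probability = {violation:.5f}")
```

All sampled constraints hold by construction, while unseen values of $q$ above $\hat{\theta}_N$ remain possible. This is why the guarantee on $V(\hat{\theta}_N)$ is probabilistic.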


Concluding Remarks 

Other probabilistic approaches have been proposed
in the literature for control design, which
are not based on the convexity assumption. A
notable example is the strategy based on statistical
learning theory (Valiant 1984; Vapnik 1998),
which has the objective of designing a controller
without any convexity assumptions (Alamo et al.
2009). In particular, in Alamo et al. (2013), the
general class of sequential probabilistic validation
(SPV) algorithms has been introduced. A
specific SPV algorithm tailored to scenario problems,
providing a sequential scheme for dealing
with the optimization problem, has been recently
studied in Chamanbaz et al. (2013).


Cross-References 

► Markov Chains and Ranking Problems in Web 
Search 

► Stability and Performance of Complex Systems 
Affected by Parametric Uncertainty 

► Uncertainty and Robustness in Dynamic Vision 

Bibliography 

Alamo T, Tempo R, Camacho E (2009) A randomized
strategy for probabilistic solutions of uncertain
feasibility and optimization problems. IEEE Trans Autom
Control 54:2545-2559

Alamo T, Tempo R, Luque A, Ramirez D (2013) The
sample complexity of randomized methods for analysis
and design of uncertain systems. arXiv:1304.0678
(accepted for publication)















Bullo F, Carli R, Frasca P (2012) Gossip coverage control
for robotic networks: dynamical systems on the space
of partitions. SIAM J Control Optim 50(1):419-447

Calafiore G, Campi M (2006) The scenario approach
to robust control design. IEEE Trans Autom Control
51(1):742-753

Calafiore G, Dabbene F, Tempo R (2011) Research
on probabilistic methods for control system design.
Automatica 47:1279-1293

Chamanbaz M, Dabbene F, Tempo R, Venkataramanan
V, Wang Q (2013) Sequential randomized algorithms
for convex optimization in the presence of uncertainty.
arXiv:1304.2222

Chernoff H (1952) A measure of asymptotic efficiency for
tests of a hypothesis based on the sum of observations.
Ann Math Stat 23:493-507

Fagnani F, Zampieri S (2008) Randomized consensus
algorithms over large scale networks. IEEE J Sel Areas
Commun 26(4):634-649

Frasca P, Ravazzi C, Tempo R, Ishii H (2013) Gossips
and prejudices: ergodic randomized dynamics in social
networks. In: Proceedings of the 4th IFAC workshop
on distributed estimation and control in networked
systems, Koblenz

Ishii H, Tempo R (2010) Distributed randomized algorithms
for the PageRank computation. IEEE Trans
Autom Control 55:1987-2002

Knuth D (1998) The art of computer programming. Sorting
and searching, vol 3. Addison-Wesley, Reading

Mitzenmacher M, Upfal E (2005) Probability and computing:
randomized algorithms and probabilistic analysis.
Cambridge University Press, Cambridge

Motwani R, Raghavan P (1995) Randomized algorithms.
Cambridge University Press, Cambridge

Tempo R, Ishii H (2007) Monte Carlo and Las Vegas
randomized algorithms for systems and control: an
introduction. Eur J Control 13:189-203

Tempo R, Bai EW, Dabbene F (1997) Probabilistic robustness
analysis: explicit bounds for the minimum
number of samples. Syst Control Lett 30:237-242

Tempo R, Calafiore G, Dabbene F (2013) Randomized
algorithms for analysis and control of uncertain systems,
with applications. Communications and control
engineering series, 2nd edn. Springer, London

Tremba A, Calafiore G, Dabbene F, Gryazina E, Polyak B,
Shcherbakov P, Tempo R (2008) RACT: randomized
algorithms control toolbox for MATLAB. In: Proceedings
17th IFAC world congress, Seoul, pp 390-395

Uryasev SP (ed) (2000) Probabilistic constrained optimization:
methodology and applications. Kluwer Academic,
New York

Valiant L (1984) A theory of the learnable. Commun ACM
27(11):1134-1142

Vapnik V (1998) Statistical learning theory. Wiley, New
York

Vidyasagar M (2001) Randomized algorithms for robust
controller synthesis using statistical learning theory.
Automatica 37:1515-1528

Vidyasagar M (2002) Learning and generalization: with
applications to neural networks, 2nd edn. Springer,
New York

Realizations in Linear Systems 
Theory 

Panos J. Antsaklis 1 and A. Astolfi 2,3 

1 Department of Electrical Engineering, 
University of Notre Dame, Notre Dame, IN, USA 

2 Department of Electrical and Electronic 
Engineering, Imperial College London, London, UK 

3 Dipartimento di Ingegneria Civile e Ingegneria 
Informatica, Università di Roma Tor Vergata, 
Roma, Italy 

Abstract 

When a state variable description of a linear system is known, its input-output behavior can be easily realized by interconnecting simpler components. The problem of realization is the following: given an input-output description such as the impulse response, or the transfer function in the case of time-invariant systems, find a state variable description whose impulse response is the given one. Existence and minimality conditions are discussed. We are interested in realizations of minimum order, which is the case when the realization is both controllable and observable. Realizations of both continuous-time and discrete-time systems are discussed. 

Keywords 

Controllability; Irreducible; Minimal order; 
Observability; Realizations 

Introduction 

The problem of system realization is as follows: 
given an external description of a linear system, 
specifically its impulse response (or its transfer 
function in the case of a time-invariant system), 
determine an internal state variable description 
that generates the given impulse response (or the 
transfer function). The name reflects the fact that 
if a (continuous-time) state variable description 
is known, an operational amplifier circuit can 
be easily built to realize (actually simulate) the 
system response. 

Before we discuss realizations, we review the 
key relations between internal state variable and 
external impulse response or transfer function 
descriptions. 

Consider a system described by 

$$\dot{x} = A(t)x + B(t)u, \qquad y = C(t)x + D(t)u, \tag{1}$$

where $x(t)$, the state vector, is a column vector of dimension $n$ ($x(t) \in \mathbb{R}^n$) and $u(t) \in \mathbb{R}^m$, $y(t) \in \mathbb{R}^p$ are the inputs and outputs of the system. The matrices $A(t) \in \mathbb{R}^{n \times n}$, $B(t) \in \mathbb{R}^{n \times m}$, $C(t) \in \mathbb{R}^{p \times n}$, and $D(t) \in \mathbb{R}^{p \times m}$ have entries that are continuous functions. The output response is given by 

$$y(t) = C(t)\Phi(t, t_0)x_0 + \int_{t_0}^{t} H(t, \tau)u(\tau)\,d\tau, \tag{2}$$

where $\Phi(t, t_0)$ is the $n \times n$ transition matrix of $\dot{x} = A(t)x$, $x(t_0) = x_0$ is the initial condition, and $H(t, \tau)$ is the $p \times m$ impulse response matrix given by 

$$H(t, \tau) = \begin{cases} C(t)\Phi(t, \tau)B(\tau) + D(t)\delta(t - \tau), & t \ge \tau, \\ 0, & t < \tau, \end{cases} \tag{3}$$

where $\delta(t - \tau)$ is the impulse (delta or Dirac) function applied at time $t = \tau$. Recall that $H(t, \tau)$ denotes the response at time $t$ when an impulse input is applied at time $\tau$ assuming zero initial conditions. 

In the time-invariant case, (1) becomes 

$$\dot{x} = Ax + Bu, \qquad y = Cx + Du, \tag{4}$$

and the output response in this case is 

$$y(t) = Ce^{At}x_0 + \int_{0}^{t} H(t, \tau)u(\tau)\,d\tau, \tag{5}$$

where, without loss of generality, the initial time $t_0$ was taken to be zero. In this case, the impulse response is 

$$H(t, \tau) = \begin{cases} Ce^{A(t - \tau)}B + D\delta(t - \tau), & t \ge \tau, \\ 0, & t < \tau. \end{cases} \tag{6}$$

Recall that time invariance implies that $H(t, \tau) = H(t - \tau, 0)$, and so $\tau$, which is the time an impulse input is applied to the system, can be taken to be zero ($\tau = 0$), without loss of generality, to give $H(t, 0)$. The transfer function of the system is the (one-sided or unilateral) Laplace transform of $H(t, 0)$, namely, 

$$H(s) = \mathcal{L}[H(t, 0)] = C(sI - A)^{-1}B + D. \tag{7}$$

A realization of $H(t, \tau)$ is any state variable description (1), $\{A(t), B(t), C(t), D(t)\}$, the impulse response of which is $H(t, \tau)$, that is, (3) is satisfied; similarly for the time-invariant case, when (6) is satisfied. 

In the time-invariant case, a realization is commonly defined in terms of the transfer function matrix $H(s)$. Then a realization of $H(s)$ is any state variable description (4), $\{A, B, C, D\}$, the transfer function of which is $H(s)$, that is, (7) is satisfied. 

There are alternative conditions under which a set $\{A, B, C, D\}$ is a realization of some $H(s)$. To this end, expand $H(s)$ in a Laurent series to obtain 

$$H(s) = H_0 + H_1 s^{-1} + H_2 s^{-2} + \cdots. \tag{8}$$

The matrices $H_i$, $i = 0, 1, 2, \ldots$, are called the Markov parameters of the system and can be determined as follows: 

$$H_0 = \lim_{s \to \infty} H(s), \qquad H_1 = \lim_{s \to \infty} s\,(H(s) - H_0), \qquad H_k = \lim_{s \to \infty} s^k \Big(H(s) - \sum_{i=0}^{k-1} H_i s^{-i}\Big), \quad k \ge 1.$$

It can be shown that a set $\{A, B, C, D\}$ is a realization of $H(s)$ if and only if 

$$H_0 = D \quad \text{and} \quad H_i = CA^{i-1}B, \quad i = 1, 2, \ldots. \tag{9}$$

Below we introduce conditions for the existence of a realization given $H(t, \tau)$ or $H(s)$. 

Note that if a realization does exist, then there is an infinite number of realizations. One can see this, for example, by considering equivalent descriptions of a realization: all have the same impulse response or transfer function. 

Existence and Minimality 

It can be shown that $H(t, \tau)$ is realizable as the impulse response of a system described by (1) if and only if $H(t, \tau)$ can be decomposed into 

$$H(t, \tau) = M(t)N(\tau) + D(t)\delta(t - \tau), \tag{10}$$

for $t \ge \tau$, where $M$, $N$, and $D$ are $p \times n$, $n \times m$, and $p \times m$ matrices, respectively, with continuous real-valued entries and with $n$ finite. If in addition to (10), $M(t)$ and $N(t)$ are differentiable and 

$$H(t, \tau) = H(t - \tau, 0), \tag{11}$$

then $H(t, \tau)$ is realizable as the impulse response of a time-invariant system (4). 

In the time-invariant case, it is more common to work with a given transfer function $H(s)$. Then $H(s)$ is realizable as the transfer function matrix of a time-invariant system described by (4) if and only if $H(s)$ is a matrix of rational functions and satisfies 

$$\lim_{s \to \infty} H(s) < \infty, \tag{12}$$

that is, if and only if $H(s)$ is a proper rational matrix or, equivalently, if and only if 

$$\lim_{s \to \infty} H(s) = D \tag{13}$$

is a constant. 

We are interested in realizations (4) of a given transfer function matrix $H(s)$ of least order $n$ ($A \in \mathbb{R}^{n \times n}$), called minimal or irreducible realizations of $H(s)$. 

The following two results completely solve the 
minimal realization problem. 


Theorem 1 An $n$-dimensional realization $\{A, B, C, D\}$ of $H(s)$ is minimal (irreducible, of least order) if and only if it is both reachable (or controllable) and observable. 

Note that if $(A, B)$ is not controllable, then by separating the controllable and uncontrollable parts of the system by an equivalence transformation and taking only the controllable part, one can still obtain $H(s)$, because the uncontrollable part of the system cancels out in $H(s)$. The same holds for observability. So controllability and observability are necessary conditions for minimality. It can be shown that they are also sufficient. 

Theorem 2 If a minimal realization of order $n$ is found, then any other minimal realization may be obtained via an equivalence transformation. 

Specifically, if $\{A, B, C, D\}$ and $\{\bar{A}, \bar{B}, \bar{C}, \bar{D}\}$ are realizations of $H(s)$ and $\{A, B, C, D\}$ is minimal, then $\{\bar{A}, \bar{B}, \bar{C}, \bar{D}\}$ is also minimal if and only if there exists a nonsingular matrix $P$ such that 

$$\bar{A} = PAP^{-1}, \quad \bar{B} = PB, \quad \bar{C} = CP^{-1}, \quad \bar{D} = D. \tag{14}$$
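
Theorems 1 and 2 can be checked numerically on a small case. The following is a minimal sketch, assuming NumPy is available; the matrices, the transformation $P$, and the helper names (`ctrb`, `obsv`, `is_minimal`, `markov`) are illustrative choices and do not come from the article. It tests minimality through the ranks of the controllability and observability matrices and confirms that the transformation (14) leaves the Markov parameters in (9) unchanged.

```python
import numpy as np

def ctrb(A, B):
    """Controllability matrix [B, AB, ..., A^(n-1) B]."""
    blocks = [B]
    for _ in range(A.shape[0] - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def obsv(A, C):
    """Observability matrix [C; CA; ...; C A^(n-1)]."""
    blocks = [C]
    for _ in range(A.shape[0] - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

def is_minimal(A, B, C):
    """Theorem 1: minimal if and only if controllable and observable."""
    n = A.shape[0]
    return bool(np.linalg.matrix_rank(ctrb(A, B)) == n
                and np.linalg.matrix_rank(obsv(A, C)) == n)

def markov(A, B, C, D, count):
    """First `count` Markov parameters: H_0 = D, H_i = C A^(i-1) B."""
    params, AkB = [D], B
    for _ in range(count - 1):
        params.append(C @ AkB)
        AkB = A @ AkB
    return params

# Illustrative minimal realization of H(s) = 1 / (s^2 + 3s + 2).
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

# Illustrative non-minimal realization of 1 / (s + 1):
# the mode at s = -2 is not controllable.
A_nm = np.diag([-1.0, -2.0])
B_nm = np.array([[1.0], [0.0]])
C_nm = np.array([[1.0, 1.0]])

# Equivalence transformation (14) with an arbitrary nonsingular P.
P = np.array([[1.0, 1.0], [0.0, 1.0]])
P_inv = np.linalg.inv(P)
A_bar, B_bar, C_bar, D_bar = P @ A @ P_inv, P @ B, C @ P_inv, D

same_markov = all(
    np.allclose(h, h_bar)
    for h, h_bar in zip(markov(A, B, C, D, 6),
                        markov(A_bar, B_bar, C_bar, D_bar, 6))
)
print(is_minimal(A, B, C), is_minimal(A_nm, B_nm, C_nm), same_markov)
```

Rank tests of this kind are sensitive to scaling and rounding for larger or poorly conditioned systems, so the sketch is meant for small illustrative cases only.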

Discrete-Time Linear Systems 

The definitions and results for the discrete-time 
case are completely analogous to the ones in 
the continuous-time case. They are summarized 
below for completeness. 

Consider systems described by 

$$x(k + 1) = A(k)x(k) + B(k)u(k), \qquad y(k) = C(k)x(k) + D(k)u(k). \tag{15}$$

The output response is 

$$y(k) = C(k)\Phi(k, k_0)x_0 + \sum_{i = k_0}^{k - 1} H(k, i)u(i), \quad k > k_0, \tag{16}$$

where $\Phi(k, k_0)$ is the $n \times n$ transition matrix and $H(k, i)$ is the $p \times m$ discrete-impulse (pulse) response: 

$$H(k, i) = \begin{cases} C(k)\Phi(k, i + 1)B(i), & k > i, \\ D(k), & k = i, \\ 0, & k < i. \end{cases} \tag{18}$$

In the time-invariant case, Eqs. (15) and (16) become 

$$x(k + 1) = Ax(k) + Bu(k), \qquad y(k) = Cx(k) + Du(k), \tag{19}$$

and 

$$y(k) = CA^{k}x_0 + \sum_{i = 0}^{k - 1} H(k, i)u(i), \quad k > 0, \tag{20}$$

where, without loss of generality, $k_0$ is taken to be zero. 

The discrete-impulse (pulse) response is now given by 

$$H(k, i) = \begin{cases} CA^{k - (i + 1)}B, & k > i, \\ D, & k = i, \\ 0, & k < i. \end{cases} \tag{21}$$

Since the system is time invariant, $H(k, i) = H(k - i, 0)$ and $i$, the time the pulse input is applied, can be taken to be zero. The transfer function matrix for (19) is the (one-sided or unilateral) $z$-transform of $H(k, 0)$: 

$$H(z) = \mathcal{Z}\{H(k, 0)\} = C(zI - A)^{-1}B + D. \tag{22}$$

It can be shown that a given $p \times m$ matrix $H(k, i)$, $k \ge i$, is realizable as the pulse response of a system (15) if and only if $H(k, i)$ can be decomposed as 

$$H(k, i) = \begin{cases} M(k)N(i), & k > i, \\ D(k), & k = i. \end{cases} \tag{23}$$

If, in addition, $H(k, i) = H(k - i, 0)$, then it is realizable via a time-invariant description (19). 

Similarly to the continuous-time case, $H(z)$ is realizable as the transfer function matrix of a system described by (19) if and only if 

$$\lim_{z \to \infty} H(z) < \infty. \tag{24}$$

A realization (19) of $H(z)$ is minimal if and only if it is reachable (controllable from the origin) and observable. And if (19) is a minimal realization of $H(z)$, then any other minimal realization is equivalent to (19). 


Realization Algorithms 

Given a transfer function H(s ) (or H(z)), we are 
interested in finding a minimal (irreducible, or of 
least order) realization of the form (4) (or (19)). 

First note that there are methods to determine 
the order n of a minimal realization directly 
from H(s ) via the characteristic polynomial 
and notions such as McMillan degree of H(s ) 
or via the Markov parameters of H(s ) and 
the Hankel matrix. This can be done without 
finding a minimal realization. Knowing the 
order of a minimal realization in advance 
is useful as it provides a guide as to what 
we should expect when we determine an 
actual realization. Details may be found in the 
references below. 
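
The Hankel-matrix route to the order can be illustrated with a short numerical sketch. It relies on the standard result that the order of a minimal realization equals the rank of a sufficiently large Hankel matrix built from the Markov parameters $H_1, H_2, \ldots$ The code assumes NumPy, and the function names and test systems are hypothetical choices made only for illustration.

```python
import numpy as np

def markov_parameters(A, B, C, count):
    """Return [H_1, ..., H_count] with H_i = C A^(i-1) B."""
    params, AkB = [], B
    for _ in range(count):
        params.append(C @ AkB)
        AkB = A @ AkB
    return params

def hankel_rank(H, r):
    """Rank of the r x r block Hankel matrix whose (i, j) block is H_{i+j+1}.

    `H` is the list [H_1, H_2, ...]; at least 2r - 1 entries are needed.
    """
    rows = [np.hstack([H[i + j] for j in range(r)]) for i in range(r)]
    return int(np.linalg.matrix_rank(np.vstack(rows)))

# Second-order realization of 1 / (s^2 + 3s + 2): expected order 2.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
order_a = hankel_rank(markov_parameters(A, B, C, 5), 3)

# Second-order but non-minimal realization of 1 / (s + 1): expected order 1.
A_nm = np.diag([-1.0, -2.0])
B_nm = np.array([[1.0], [0.0]])
C_nm = np.array([[1.0, 1.0]])
order_b = hankel_rank(markov_parameters(A_nm, B_nm, C_nm, 5), 3)

print(order_a, order_b)
```

Here the Markov parameters are generated from a known realization only to keep the example self-contained; in practice they would be computed from $H(s)$ as in (8).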

In special cases, it is possible that the realization algorithm results directly in a controllable and observable, and therefore minimal, realization. More commonly, however, the algorithm yields a realization that is either controllable or observable but not both, in which case an extra step is needed to isolate the uncontrollable part, say, of the realization and retain only the part that is both controllable and observable. The reader should consult any of the references below for detailed descriptions of several realization algorithms. 

Here an example is given of a single-input, single-output system where the resulting realization is controllable and observable, and therefore minimal. 

Example 1 Let 

$$H(s) = \frac{b_2 s^2 + b_1 s + b_0}{s^3 + a_2 s^2 + a_1 s + a_0}.$$

The realization 

$$A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -a_0 & -a_1 & -a_2 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad C = [b_0, b_1, b_2],$$

is controllable ($(A, B)$ have a form called controller form) and observable, and therefore a minimal realization of $H(s)$; note that all cancellations are assumed to have already taken place between the numerator and denominator of the transfer function $H(s)$. This algorithm easily generalizes to the case when the degree of the denominator of $H(s)$ is $n$ (in this example it is 3). Note that if $\lim_{s \to \infty} H(s) = D \ne 0$, then apply the previous algorithm to $\hat{H}(s) = H(s) - D$ to obtain $A$, $B$, and $C$. 
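
The construction in Example 1 can be sketched in a few lines for a strictly proper transfer function with a monic denominator of degree $n$. This is an illustrative sketch assuming NumPy; the function names and the numerical coefficients are hypothetical and chosen only so that no pole-zero cancellation occurs.

```python
import numpy as np

def controller_form(num, den):
    """Controller-form realization of a strictly proper SISO H(s).

    num = [b_0, ..., b_{n-1}] and den = [a_0, ..., a_{n-1}] list the
    coefficients in ascending powers of s; the denominator is monic,
    s^n + a_{n-1} s^(n-1) + ... + a_0.
    """
    n = len(den)
    A = np.zeros((n, n))
    A[:-1, 1:] = np.eye(n - 1)            # ones on the superdiagonal
    A[-1, :] = -np.asarray(den, float)    # last row: -a_0, ..., -a_{n-1}
    B = np.zeros((n, 1))
    B[-1, 0] = 1.0
    C = np.asarray(num, float).reshape(1, n)
    return A, B, C

def transfer_function(A, B, C, s):
    """Evaluate C (sI - A)^(-1) B at a single point s."""
    n = A.shape[0]
    return (C @ np.linalg.solve(s * np.eye(n) - A, B))[0, 0]

# Illustrative data: H(s) = (s^2 + 3s + 1) / (s^3 + 6s^2 + 11s + 6).
A, B, C = controller_form([1.0, 3.0, 1.0], [6.0, 11.0, 6.0])

s0 = 1.0
from_realization = transfer_function(A, B, C, s0)
from_formula = (s0**2 + 3 * s0 + 1) / (s0**3 + 6 * s0**2 + 11 * s0 + 6)
print(from_realization, from_formula)
```

Agreement at a single test point is only a sanity check; comparing the Markov parameters as in (9) would be a more complete verification.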


Summary 

The state variable realization of impulse 
responses and transfer functions was one of 
the early problems addressed by system theory. 
Its solution provides clear understanding of the 
relations between external (input-output) and 
internal descriptions of systems. A key result is 
that any minimal order realization is controllable 
and observable. Many realization algorithms 
may be found in the literature. Extensions to 
polynomial matrix descriptions can also be found 
in the literature, as well as extensions to partial 
realizations. 

Cross-References 

► Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions 

► Linear Systems: Continuous-Time, Time-Varying State Variable Descriptions 

► Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions 

► Linear Systems: Discrete-Time, Time-Varying, State Variable Descriptions 

► Linear Systems: Continuous-Time Impulse Response Descriptions 

► Linear Systems: Discrete-Time Impulse Response Descriptions 


Recommended Reading 

A clear understanding of the relationship between 
external and internal descriptions of systems is 
one of the principal contributions of systems 
theory. This topic was developed in the early 
1960s with original contributions from Gilbert 
(1963) and Kalman (1963). The role of controllability 
and observability in minimal realization 
is due to Kalman (1963); see also Kalman et al. 
(1969). For extensive historical comments, see 
Kailath (1980). The time-varying case is treated 
in Brockett (1970), Antsaklis and Michel (2006), 
and Rugh (1996). 

Real-Time Optimization of Industrial Processes 

Abstract 

Real-time optimization (RTO) aims to optimize the operation of a process by taking economic terms directly into account. Several fundamental components must work together for an RTO solution to operate smoothly. The RTO loop is an extension of a feedback control system and consists of subsystems for (a) steady-state detection, (b) data reconciliation and measurement validation, (c) process model updating, and (d) model-based optimization followed by solution validation and implementation. There are several alternatives for each of these subsystems. This contribution introduces some of the currently used approaches and gives some perspectives for future work in this area. 

Keywords 

Data reconciliation; Model updating; Online 
optimization; Parameter selection; Steady-state 
detection 

Introduction 

Effectiveness, efficiency, product quality, process 
safety, and low environmental impact are the 
main driving forces for the improvement of the 
operation of industrial processes. Real-time (or 
online) optimization (RTO) is one of the options 
that are available to achieve these goals and is 
attracting considerable industrial interest due to 
its direct and indirect benefits. 

RTO systems are model-based, closed-loop process control systems whose objective is to maintain the process operation as close as possible to the optimal operating point. Such RTO systems use rigorous process models and current economic information to predict the optimal process operating conditions. Additionally, RTO can mitigate and reject long-term disturbances and performance losses (e.g., due to fouling of heat exchangers or deactivation of catalysts). 

The direct benefit from applying RTO is 
improving the economic performance in terms 
of increasing the profit of the plant and reducing 
energy consumption and pollutant emissions. 


These are also called the online benefits. The 
indirect benefits result from the tools used in the 
implementation of RTO. For instance, a better 
understanding of the processes can be employed 
to debottleneck the plant and to reduce operating 
difficulties. In addition, abnormal measurement 
information obtained from gross error detection 
can help instrumentation and process engineers 
to troubleshoot the plant instrument errors. 
Parameter estimation is very useful for process 
engineers to evaluate the equipment conditions 
and to identify decreasing efficiencies and other 
sources of problems. Furthermore, the detailed 
process simulation of the model used in online 
optimization can be used for process monitoring 
and serve as a training tool for new operators. 
Finally, the rigorous process model can be 
used for process maintenance, advanced process 
control, process design, facility planning, and 
process monitoring. 

Real-time optimization (RTO) solutions have been developed since the early 1980s, and nowadays there are many petrochemical and chemical applications, especially in the production of ethylene and propylene in fluid catalytic cracking units (FCCUs) (Darby et al. 2011). Other successful industrial applications are mentioned in Alkaya et al. (2009) with the respective economic returns. 

Control Layers and the RTO Concept 

Usually the process control is stratified into several layers, which have different response times and control objectives. RTO is located in an intermediate layer that provides the connection between plant scheduling (medium-term planning) and the control system (short-term process performance). In a plant control hierarchy, process disturbances are controlled using process controllers, whereas the RTO system must track changes in the optimum operating conditions caused by low-frequency process changes (e.g., raw material quality and composition, catalyst deactivation). 

The typical structure of an RTO system is shown in Fig. 1, which depicts the elements of the closed-loop system. The RTO loop is an extension of the feedback control system and consists of subsystems for (a) steady-state detection, (b) data reconciliation and measurement validation, (c) process model updating, and (d) model-based optimization followed by solution validation and implementation. Once the plant operation has reached a steady state, the plant data ($y = [Y_e, Y_s, Y_r]$) are gathered and validated to detect and correct gross errors in the process measurements, and at the same time the measurements may be reconciled using material and energy balances to ensure that the data set used for model updating is consistent. These validated measurements are used to estimate the model parameters ($\theta$) to ensure that the model represents the plant faithfully at the current operating point. Then, the optimum controller set points ($Y_s^{set}$) and manipulated targets ($U^{targets}$ and $U_e^{targets}$) are calculated using the updated model and are transferred to the advanced process controllers after they have been validated. 

Real-Time Optimization of Industrial Processes, Fig. 1 Basic structure of the traditional RTO 

Each layer in Fig. 1 has its own specific tasks 
as discussed in the following: 


1. Regulatory layer. This layer is focused on basic (e.g., temperature, flow rate) and inventory (e.g., level and pressure) control, ensuring safety and operational stability for the industrial plant. The holdups of vapors and gases are measured by pressure sensors, while the holdups of liquids and solids are measured by level and weighing sensors. In the case of unstable processes, the regulatory layer is also responsible for their stabilization, e.g., by temperature control of industrial reactors. No industrial process can operate without this control layer. The typical operation time scale is on the order of seconds. For its design, typical questions that have to be answered are the following: "How to ensure safe unit operation?" "How to quickly meet the demands coming from the supervisory layer or from the operators?" "How to prevent disturbances from propagating throughout the plant?" The control technology that prevails in this layer is SISO (single-input-single-output) PID controllers, with very few cases where the derivative action is effectively employed. In Fig. 1, $Y_r$ are the controlled variables of this layer (e.g., levels, flow rates, pressures, temperatures, pH), and $U_r$ are the corresponding manipulated variables, typically control valves. 

2. Supervisory layer. This layer is concerned with the quality of the final product. The goal is to meet the specifications without violating the operating limits of the equipment. Typically, in this layer there is a strong interaction between the controlled variables, requiring tailored multiloop control structures or the use of multivariable control techniques. The dominant advanced technology in this layer is model predictive control (MPC). In this layer the calculations and updates are performed on the time scale of minutes, and the typical associated question is "How to ensure the quality of the final product while satisfying the operating constraints and improving the profitability by reducing the variability of the product parameters?" Here the controlled variables ($Y_s$) are usually related to the product quality, and the manipulated variables are the set points for the regulatory layer ($Y_r^{set}$) and additional manipulated variables ($U_s$) not used by the regulatory layer (e.g., variable-frequency drives). 

3. RTO layer. Here the main focus is the profitability of the process. Specifications and operating points (i.e., set points and targets for the manipulated variables) are determined by solving an optimization problem that aims at maximizing the profitability of the process under stationary conditions. When the optimal operating point is close to the operational limits, the real-time optimization is quite straightforward, since it is enough to take the process to these limits, which is usually done by solving a linear programming (LP) optimization problem. Such simple solutions are effective especially in cases where it is known that maximizing or minimizing the flow rate of a given stream will maximize the profitability. As this kind of solution can be easily implemented, most commercial predictive controllers already have an LP or QP layer integrated, using as a model the gain of the dynamic model used in the MPC. However, for processes with large recycle streams and pronounced nonlinearity, this type of solution is not enough to bring the system to its optimal operating conditions. In this case, it is essential to use a nonlinear optimizer that aims at driving the system to operate in the best operating region. When the industrial process works essentially in steady state, the problem can be solved using stationary models. The solutions offered on the market typically involve the use of a stationary process simulator (e.g., Aspen Plus, PRO II). The RTO sampling times are on the time scale of hours, and the questions to be answered are the following: "What is the best way of operation?" "How to increase the profitability of the process?" "How to decrease energy consumption and to increase the process efficiency?" 


Four Elements of Classical Real-Time 
Optimization 

A standard RTO solution requires that all four calculation blocks illustrated in Fig. 1 work together smoothly. In fact, each block can be formulated as an optimization problem by itself. Sometimes these optimization problems are combined. The alternative techniques that can be applied to each of these subsystems are discussed below. 

Steady-State Detection (SSD) 

As indicated in Fig. 1, the RTO loop execution begins with the detection of a steady state. Identifying a steady state may be difficult because process variables are noisy and measurements do not settle at a constant value. Being at a steady state can be defined as an acceptable constancy of the measurements over a given period of time. Therefore, tests for stationarity are commonly based on checking the constancy of the measured quantities. 

Mejia et al. (2010) compared six different approaches to SSD using 5,760 simulated data sets. They concluded that the method based on the estimation of the absolute value of the first and the second derivatives defined by 

$$\left|\frac{dy}{dt}\right| + 10\left|\frac{d^2 y}{dt^2}\right| \tag{1}$$

gives the best results. Although this idea is quite simple, as being at a steady state means zero derivatives by definition, it has some implementation issues, due to signal noise and outliers. These problems can be reduced by smoothing the plant data using smoothing splines, noncausal Butterworth filters, or wavelet decompositions. The second best approach in the comparison was the local autocorrelation (Mejia et al. 2010), followed by the two statistical nonparametric tests of the independence hypothesis proposed by Bebar (2005) and by the method of Cao and Rhinehart (1995). 
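
The derivative-based index in (1) can be sketched as follows. This is only an illustration under stated assumptions: the derivatives are approximated by finite differences with NumPy, the signal is a noise-free first-order step response, and the threshold value and function names are hypothetical. Real plant data would first need the smoothing mentioned above.

```python
import numpy as np

def steady_state_index(y, dt, weight=10.0):
    """Finite-difference estimate of |dy/dt| + weight * |d2y/dt2|."""
    dy = np.gradient(y, dt)       # first derivative
    d2y = np.gradient(dy, dt)     # second derivative
    return np.abs(dy) + weight * np.abs(d2y)

def is_steady(y, dt, threshold):
    """Flag samples whose index falls below a user-chosen threshold."""
    return steady_state_index(y, dt) < threshold

# Illustrative noise-free signal: first-order response settling at 1.
dt = 0.1
t = np.arange(0.0, 20.0, dt)
y = 1.0 - np.exp(-t)

flags = is_steady(y, dt, threshold=0.01)   # threshold is an assumption
first_steady_time = t[np.argmax(flags)]
print(first_steady_time)
```

For this signal the index decays roughly like $11e^{-t}$, so the samples are flagged as steady only after the transient has died out; the threshold would have to be tuned to the noise level of each measurement.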

Data Reconciliation (DR) 

Within the mathematical models of industrial processes, the balance equations that result from conservation laws (mass, energy, etc.) are the core that is not open to debate. If the measured data do not satisfy the balance equations, this must be attributed either to measurement errors or to fundamental model inadequacies. Ruling out the latter, and given that measurement errors are always present, the measured data should be adjusted before use so that they obey the conservation laws and other constraints, e.g., on their ranges. This adjustment, using optimization techniques combined with the statistical theory of errors, is called data reconciliation. Unfortunately, the adjustment of all variables can be greatly affected by "gross errors" in one variable, so such errors must be detected. 

The relationship between a measurement of a variable and its true value can be represented by 

$$y_m = y + e_r + e_g \tag{2}$$

where $y_m$ and $y$ are the measured and true values, while $e_r$ and $e_g$ are the random and gross errors, respectively. The random errors ($e_r$) are assumed to be zero mean and normally distributed (Gaussian), since they are the result of the simultaneous effect of several causes. The gross errors ($e_g$) are caused by large nonrandom events. They can be subdivided into measurement-related errors, such as malfunctioning sensors (e.g., incorrect calibration, sensor degradation, or damage to the electronics), and process-related errors, such as process leaks. 

In the absence of gross errors, the simplest version of data reconciliation can be stated as a quadratic programming (QP) problem 

$$\min_{y} \; \frac{1}{2}(y - y_m)^T Q^{-1}(y - y_m) \tag{3}$$

subject to the linear or linearized constraint related to the process model, written as 

$$A y - c = 0.$$

The covariance matrix ($Q$), which is usually diagonal, captures the variance of the sensors and determines how the errors are distributed among the measurements ($y_m$). The solution of this problem is the reconciled value, which for this simple case is given analytically by 

$$\hat{y} = \left[I - QA^T(AQA^T)^{-1}A\right]y_m + QA^T(AQA^T)^{-1}c.$$

A rigorous formulation of the reconciliation problem is possible even with nonlinear constraints, although the existence and uniqueness of a solution are then not guaranteed theoretically in general. 
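
The analytical solution of the linear problem (3) can be sketched directly. The example below assumes NumPy and uses a hypothetical flow splitter with one mass balance, $F_1 = F_2 + F_3$; the measurements, variances, and function name are invented for illustration.

```python
import numpy as np

def reconcile(y_m, Q, A, c):
    """Closed-form solution of min 0.5 (y - y_m)^T Q^-1 (y - y_m), A y = c.

    Equivalent to y_hat = [I - Q A^T (A Q A^T)^-1 A] y_m
                          + Q A^T (A Q A^T)^-1 c.
    """
    residual = A @ y_m - c                       # constraint residual r_c
    multiplier = np.linalg.solve(A @ Q @ A.T, residual)
    return y_m - Q @ A.T @ multiplier

# Hypothetical splitter: F1 - F2 - F3 = 0.
A = np.array([[1.0, -1.0, -1.0]])
c = np.array([0.0])
y_m = np.array([100.0, 61.0, 42.0])   # raw measurements (imbalance of -3)
Q = np.diag([4.0, 1.0, 1.0])          # assumed sensor variances

y_hat = reconcile(y_m, Q, A, c)
print(y_hat)
```

The measurement with the largest variance receives the largest correction, which is how $Q$ distributes the imbalance among the sensors.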

Several statistical tests have been constructed for the detection of gross errors. Some of them are based on the distribution of the constraint residuals, i.e., $r_c = A y_m - c$, and others are based on the distribution of the estimated error after the reconciliation procedure, i.e., $e = y_m - \hat{y}$. The evaluation of $r_c$ does not require first solving the associated data reconciliation problem. For a complete discussion and review, see Narasimhan and Jordache (1999) and Sequeira et al. (2002). 

Model Updating 

A key, yet difficult, decision in model parameter adaptation is to select the parameters that are adapted. These parameters should be identifiable and represent actual changes in the process, and their adaptation should help to approach the true process optimum. Clearly, the smaller the subset of parameters, the better the confidence in the parameter estimates and the lower the required excitation (also the better the regularization effect). But too few adjustable parameters can lead to misleading models and thus wrong proposals for operational changes. 

In general, the parameter estimation and updating are limited not only by the lack of information available from experimental data but also by the correlation between the parameters that are identified. The estimation of correlated parameters leads to a high degree of uncertainty in the model, since different combinations of parameter values lead to the same value of the objective function in the estimation problem. 

The selection of the right number of parameters to be identified can be done by the analysis of the sensitivity matrix ($S$). The elements of $S$, $s_{ij}$, are the partial derivatives of the measurement $y_i$ with respect to the parameter $\theta_j$ evaluated at the current value of the parameter ($\theta_0$), i.e., 

$$s_{ij} = \left.\frac{\partial y_i}{\partial \theta_j}\right|_{\theta = \theta_0} \tag{4}$$

In general, different parameters and measurements have distinct magnitudes. Therefore, scaling is a key issue that has a strong impact on the results. Traditionally each element of the matrix $S$ is scaled by the initial guess $\theta_{0j}$ of the parameter and by the average value of the measurement $i$ ($\bar{y}_i$). The scaled elements $s^s_{ij}$ are then given by 

$$s^s_{ij} = \frac{\theta_{0j}}{\bar{y}_i}\left.\frac{\partial y_i}{\partial \theta_j}\right|_{\theta = \theta_0} \tag{5}$$

This scaling procedure has some problems, since it requires both a good initial guess for the parameters and representative average values for the measurements. But the main drawback is that it does not consider the multivariable nature of the relationship among all parameters and outputs. To overcome these drawbacks, Botelho et al. (2012) proposed to apply diagonal scaling matrices $L$ and $R$ that result from the solution of the convex optimization problem of minimizing the condition number of the scaled sensitivity matrix, $\gamma(LSR)$, i.e., 

$$\begin{aligned} \min_{L, R} \;\; & \gamma(LSR) \\ \text{s.t.} \;\; & L \in \mathbb{R}^{n_y \times n_y}, \text{ diagonal and nonsingular} \\ & R \in \mathbb{R}^{n_\theta \times n_\theta}, \text{ diagonal and nonsingular} \end{aligned} \tag{6}$$


This convex optimization problem can be solved using the LMI (linear matrix inequality) approach as described by Boyd et al. (1994). With the optimized scaling matrices $L$ and $R$, the scaled sensitivity matrix $S_s$ is given by 

$$S_s = LSR \tag{7}$$

With $S_s$, the best subset of parameters to be estimated can be determined using the non-square relative gain array (NSRGA) matrix, as also proposed by Botelho et al. (2012). The NSRGA can be easily calculated for the scaled sensitivity matrix by 

$$\mathrm{NSRGA}(S_s) \overset{\mathrm{def}}{=} S_s \circ (S_s^{\dagger})^T \tag{8}$$

where $(S_s^{\dagger})^T$ is the transpose of the pseudo-inverse of $S_s$ and $\circ$ is the entrywise product (also known as the Hadamard or element-wise product). The rows of $\mathrm{NSRGA}(S_s)$ are related to the output measurements, whereas the columns are related to the parameters. The sum of the values in each column, which can vary between 0 and 1, reflects the relevance of each parameter, and it can be used to sort the parameters in descending order of their influence on the outputs. When the sum of a column is close to 1, the corresponding parameter has a small correlation with the other ones and a strong influence on the output measurements. 
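
The calculation in (8) and the column-sum ranking can be sketched as follows, assuming NumPy and its `numpy.linalg.pinv` pseudo-inverse. The sensitivity matrix is a hypothetical, already scaled example with two measurements and three parameters, in which the first and third parameters affect the outputs identically and are therefore perfectly correlated.

```python
import numpy as np

def nsrga(S):
    """Non-square relative gain array: S o (pinv(S))^T (entrywise product)."""
    return S * np.linalg.pinv(S).T

# Hypothetical scaled sensitivity matrix S_s:
# rows = measurements, columns = parameters.
S_s = np.array([[1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])

Lam = nsrga(S_s)
column_sums = Lam.sum(axis=0)            # one relevance value per parameter
ranking = np.argsort(-column_sums)       # parameter indices, best first
print(column_sums, ranking)
```

In this example the second parameter obtains a column sum of 1, while the two indistinguishable parameters share the remaining relevance with 0.5 each, so only the second one would be a clear candidate for updating.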

Figure 2 illustrates the typical ordering produced by sorting the $\mathrm{NSRGA}(S_s)$ in descending order. Thus, it is possible to have an idea of which parameters should be selected for estimation. The values presented in this figure suggest that the parameters P11 and P4 have very small correlation with the others and should be selected as updated parameters, whereas the parameters P12 and P1 show the opposite behavior and should not be updated. 

Real-Time Optimization of Industrial Processes, Fig. 2 Illustrative example of using $\mathrm{NSRGA}(S_s)$ to rank the parameters 

Solving the Optimization Problem 

Nonlinear programs for RTO can be formulated using models of different complexities. For example, RTO can be based on process models similar to those used for design and analysis, using commercial simulators (e.g., Aspen Plus, PRO II, HYSYS, etc.). On the other hand, because these problems need to be solved at regular intervals (at least every few hours), detailed simulation models can be partly replaced by correlations or operating curves that are fitted to the process and updated on a longer time scale. 

If a rigorous process model is used, the num¬ 
ber of nonlinear equations can be very large. The 
model is usually built by linking smaller sub¬ 
models. The optimization problem can be for¬ 
mulated as the following nonlinear programming 
problem (NLP): 

$\min_{x_M, S_M} f(x_M, S_M)$

s.t.

$M_1(x_{M1}, S_{M1}, \theta_{M1}) = 0$

$\vdots$

$M_n(x_{Mn}, S_{Mn}, \theta_{Mn}) = 0$

$OC(x_M, S_M) \le 0$

$x_M = [x_{M1}, \ldots, x_{Mn}], \quad S_M = [S_{M1}, \ldots, S_{Mn}]$ (9)


where $M_i$ are the unit modules that can be solved
by a tailored procedure in the modular approach
or all together in the equation-oriented approach.
Each unit model $M_i$ has internal variables $x_{Mi}$
and parameters $\theta_{Mi}$. These unit models are
connected by the input and output streams $S_{Mi}$.
Additionally, there are operating constraints $OC$
to capture the possible lower and upper bounds
and other equipment constraints. The objective
function $f(x_M, S_M)$ is based on an economic
model that involves the raw materials, products,
and operating costs.

Successive quadratic programming (SQP) has
become the most popular method for solving
these nonlinear constrained optimization
problems. SQP converges the equality and inequality
constraints simultaneously with the optimality
conditions. This strategy requires relatively few
function evaluations and often performs
efficiently for process optimization problems. The
NLP solver can be implemented in a nonintrusive
way, similar to recycle convergence modules that
are already in place. As a result, the structure of
the simulation environment and the unit operation
blocks does not need to be modified in order
to include the optimization, so that SQP can
be easily incorporated within existing modular
simulators and therefore be applied directly
to flow sheets modeled in these commercial
simulators. However, in this case derivative
information must be obtained by numerical
differentiation, which increases the effort and
slows down convergence near the optimum.

For fully equation-oriented models with
exact first and second derivatives for all
constraints and objective functions, efficient NLP
algorithms have been developed. For instance, large
equation-based models can be solved efficiently
with structured barrier NLP solvers (see Biegler
2010 for a detailed overview). But for problems
where function evaluations are expensive, and
gradients and Hessians are difficult to obtain, it
is not clear that large-scale NLP solvers should
be applied. Black-box optimization models with
inexact (or approximated) derivatives and few
decision variables are poorly served by large-scale
NLP solvers, and derivative-free optimization
algorithms should be considered instead. For the
standard RTO problems formulated using modular
process simulator models, SQP and reduced-space
SQP methods are expected to perform well
(see Alkaya et al. 2009; Biegler 2010 for detailed
discussion).

After solving the RTO optimization problem,
it is necessary to decide whether the solution can be
implemented. For this, it is necessary to verify
whether the dominant cause of the plant changes
is noise, since in this case implementing these
changes could lower the profit. Thus, an important
challenge in RTO results analysis is to determine
when to implement the calculated changes
(Miletic and Marlin 1996).

Summary and Future Directions 

RTO aims at optimizing the operation of the
process taking into account economic terms directly.
There are several fundamental needs for a smooth
operation of an RTO solution. The central point is
the mathematical modeling, which can be a complex
first-principles model or be based on simple
operating curves. If a good model is available, it
is necessary to have a good characterization of
the inlet streams (properties and composition)
and to employ data reconciliation, gross error
detection, and steady-state identification. Finally,
the efficiency of the optimizer is a key issue.


Due to the time and resources needed to implement
and maintain an RTO solution, a full RTO
project involves considerable risk. Therefore, in
cases where simpler and easier approaches can be
applied with equivalent economic benefits, they
should be used instead. For processes with large
recycle streams, it is worthwhile to apply the
classical RTO strategy, i.e., the one discussed in
the last section. In this case the optimal solution
is not trivial, since it is not simply the maximal
capacity of the plant. For the cases where the
optimal operating constraint is a direct consequence
of the operating process capacity, the economic
optimization can be easily included in the LP or
QP layer that is usually implemented within a model
predictive controller.

In the previous section, the so-called two-step
approach was described, in which the measurements
are used to refine the process model, which is then
used to repeat the optimization.
Several RTO schemes have emerged since the
development of this two-step approach in the
1970s. Recently, it has been proposed to update
the model differently. Instead of adjusting the
model parameters, one updates correction terms
that are added to the cost and constraint functions
of the optimization problem. The technique,
labeled modifier adaptation (RTO-MA), forces
the modeled cost and constraints to match the
plant values (Gao and Engell 2005; Marchetti
et al. 2009). The main advantage of RTO-MA
compared to the two-step approach lies in its
ability to converge to the true plant optimum,
even in the presence of structural plant-model
mismatch. RTO-MA is a static optimization
method, which means that its application to a
continuous process requires waiting for the steady
state to be reached before taking measurements,
updating the correction terms, and repeating the
numerical optimization. Hence, several iterations
are generally required before convergence can
be achieved. In contrast, implicit methods, such
as self-optimizing control (Skogestad 2000) and
NCO tracking (Francois et al. 2005), propose
to adjust the inputs online in a control-inspired
manner. Especially simple to implement is
the “self-optimizing” approach, where a feedback
control structure is chosen so that maintaining
some function of the measured variables constant
automatically maintains the process near an
economically optimal steady state in the presence
of disturbances. The problem is posed from a
plantwide perspective, since the economics are
determined by overall plant behavior.
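
The iterative nature of modifier adaptation can be illustrated with a deliberately tiny, unconstrained, single-input sketch. Everything in it is an assumption made for illustration: the plant cost $(u-2)^2$, the structurally wrong model cost $2(u-1)^2$, the availability of a noise-free plant gradient at each steady state, and the use of a first-order (gradient) correction term only. It is not the formulation of the cited papers.

```python
def plant_gradient(u):
    # Hypothetical "true" plant cost (u - 2)^2, unknown to the optimizer;
    # only its gradient at the current steady state is assumed to be measurable.
    return 2.0 * (u - 2.0)

def model_gradient(u):
    # Hypothetical structurally mismatched model cost 2 * (u - 1)^2.
    return 4.0 * (u - 1.0)

def modifier_adaptation(u0, iterations=30):
    u = u0
    history = [u]
    for _ in range(iterations):
        # Correction term: plant/model gradient mismatch at the current operating point.
        lam = plant_gradient(u) - model_gradient(u)
        # Minimize the modified model cost 2 * (v - 1)^2 + lam * (v - u) over v;
        # stationarity gives 4 * (v - 1) + lam = 0.
        u = 1.0 - lam / 4.0
        history.append(u)
    return history

history = modifier_adaptation(u0=0.0)
# The iterates approach the plant optimum u = 2, although the model alone would predict u = 1.
```

Each pass through the loop stands for one wait-for-steady-state, measure, update, and re-optimize cycle, which is why several plant iterations are needed before convergence.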

The classical steady-state RTO has some
drawbacks related to its low frequency of
execution. It is normally run two or three
times per day, and the cost of transitioning from
one operating condition to another is not
considered. Some plants need to respond to
market changes very quickly, as with grade changes
in polymerization and petroleum processes. In
these processes, market competition requires
the capability to accommodate fast and
cost-effective transitions so that companies can
produce and sell on demand at favorable prices.
To provide this capability, dynamic RTO is
being developed and implemented in industrial
processes. The main difference between
steady-state and dynamic RTO is that traditional RTO
only provides optimal operating conditions at
the steady state, while dynamic RTO provides a
trajectory of changes of operating conditions.
Dynamic RTO does not require steady-state
conditions to be applied. The formulation and
solution of the DRTO problem are very similar to
the approach used for nonlinear model predictive
controllers (NMPC), with the primary difference
being the inclusion of economic aspects in the
objective function (Engell 2007).

Cross-References 

► Control Structure Selection 

► Industrial MPC of Continuous Processes 

► Model-Based Performance Optimizing Control 

► Model-Predictive Control in Practice 

Recommended Reading 

As a number of design decisions must be made
in the construction of an RTO system, there
is no single way to implement it.
The elements of the solution discussed
here should be viewed as a starting
point for further reading. The review paper by
Engell (2007) discusses and compares several
approaches for RTO and DRTO, giving a quite
general and broad perspective of the area. For
the reader interested more in the formulation and
solution of the optimization problems, the
book by Biegler (2010) is a very good starting
point and gives a complete discussion of the
solvers currently used, illustrating the application
with several examples. For data reconciliation
and gross error detection, the book by Narasimhan
and Jordache (1999) is a good starting point.
Finally, an industrial discussion of RTO
and alternative approaches that have been used
in industry can be found in Darby et al.
(2011).

Abstract

Redundancy may occur in different ways in
a robotic system. This entry focuses on the
resolution of kinematic redundancy, i.e., on
the techniques for exploiting the redundant
degrees of freedom in the solution of the inverse
kinematics problem; this is indeed an issue of
major relevance for motion planning and control
purposes.

Keywords

Algorithmic singularity; Kinematic singularity;
Optimization; Redundancy; Task-oriented kinematics;
Task-space augmentation

Introduction 

Redundant robots possess more resources than
those strictly required to execute their task; this
gives the robot an increased capacity to cope
with real-world applications by making it possible
to handle performance issues besides the mere
achievement of a given motion trajectory.

Redundancy may occur in the sensory
system, in the mechanical structure, and/or in
the actuation system, thus allowing, e.g., fault
accommodation, multisensory perception,
dexterous motion, and load sharing. Nevertheless,
unless otherwise specified, a redundant robot is
understood to be one that has a kinematically
redundant mechanical structure, i.e., one provided
with more degrees of freedom than those strictly
required to execute its task; this also typically
leads to a redundancy in the number of actuators
and sensors. Notably, kinematic redundancy is
usually the key to handling the avoidance of singular
configurations, the occurrence of joint limits, the
engagement of obstacles in the workspace, and the
minimization of joint torques or energy. In practice, if
properly managed, the increased dexterity
characterizing kinematically redundant robots may
allow them to achieve a higher degree of autonomy.

In principle, no robot is inherently redundant;
rather, there are certain tasks with respect to
which it may become redundant. Nevertheless,
since most papers in the classical literature on the
topic have dealt with robotic manipulators (for
which a general task consists in tracking an
end-effector motion trajectory requiring six degrees
of freedom), a robot arm with seven or more
joints is often considered as a typical example
of an inherently redundant manipulator. However,
even robot arms with fewer degrees of freedom,
like conventional six-joint industrial manipulators,
may become kinematically redundant for
specific tasks, such as simple end-effector
positioning without constraints on the orientation.

In the case of traditional industrial applications
involving nonredundant mechanical structures,
the occurrence of singular configurations
and/or the presence of obstacles in the workcell
resulted in the need for a carefully structured (and
static) working space where the motion of the
manipulator could be planned in advance.

On the other hand, the presence of redundant
degrees of freedom allows motions of the
manipulator that do not displace the end effector (the
so-called self-motions or internal motions); this
implies that the same end-effector task can be
executed with several different joint motions, giving
the possibility of better exploiting the workspace
of the manipulator and ultimately resulting in
a more versatile robotic arm (see Fig. 1). This
feature is key to allowing operation in the unstructured
and/or dynamically varying environments that
characterize advanced industrial applications and
service robotics scenarios.

The biological archetype of a robotic manipulator
is the human arm, which, not surprisingly,
also inspires the terminology used to characterize
the serial-chain structure of a robot arm.
Remarkably, a simple look at the human arm kinematics
from the torso to the hand reveals
seven degrees of freedom (three at the shoulder,
one at the elbow, and three at the wrist), which make
such a manipulator kinematically redundant.

Redundant Robots, Fig. 1 A self-motion of the arm that
keeps the end-effector positioned at the blue spot. It is
possible to choose configurations that both reach the blue
spot and avoid the red obstacle

The kinematic arrangement of the human arm
has been replicated in a number of robots often
termed human-armlike manipulators (see,
e.g., Fig. 2). Manipulators with a larger number
of joints are often called hyperredundant robots
and include, among others, snakelike robots
(Fig. 3).

The use of two or more robotic structures
to execute a task (as in the case of cooperating
manipulators or multifingered hands or
multiarm/multilegged robots) also gives rise
to kinematic redundancy. A headed multilimb
structure is typical of a humanoid robot (Fig. 4).
Redundant mechanisms also include
vehicle-manipulator systems (Fig. 5).

Although the realization of a kinematically
redundant structure raises a number of issues
from the point of view of mechanical design, this
entry focuses on the techniques for exploiting
the redundant degrees of freedom in the solution
of the inverse kinematics problem. This is an
issue of major relevance for motion planning and
control purposes.

Redundant Robots, Fig. 2 The Mitsubishi PA-10 manipulator

Redundant Robots, Fig. 3 The SnakeRobots.com S7
snake robot prototype

Redundant Robots, Fig. 4 The Honda ASIMO

Redundant Robots, Fig. 5 The KUKA youBot

Task-Oriented Kinematics 

The relationship between the $N$ variables
representing the configuration $q$ of an articulated
manipulator in the joint space and the $M$ variables
describing an assigned task $t$ in an appropriate
task space constitutes a task-oriented
kinematics; this can be established at the position,
velocity, or acceleration level. Typically, one has
$N \ge M$, so that the joints can provide at least the
degrees of freedom required for the end-effector
task. If $N > M$ strictly, the manipulator is
kinematically redundant.

At the position level, the direct kinematics
equation takes on the form

$t = k_t(q)$, (1)

where $k_t$ is a nonlinear vector function.

Besides the direct kinematics expressed at the
position level, it is useful to consider the
first-order differential kinematics (Whitney 1969)

$\dot{t} = J_t(q)\,\dot{q}$, (2)

that can be obtained by differentiating Eq. (1)
w.r.t. time. In (2), the mapping between the
task-space and the joint-space velocities is given by the
$M \times N$ task Jacobian matrix $J_t(q) = \partial k_t / \partial q$
(also called analytic Jacobian).
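
As a concrete instance of the direct kinematics and of the task Jacobian, the following sketch considers a planar arm with three revolute joints whose task is the end-effector position only, so that N = 3 and M = 2. The arm, its unit link lengths, and the function names are assumptions chosen for illustration; the finite-difference routine is there only to check the analytic Jacobian.

```python
import numpy as np

LINKS = np.array([1.0, 1.0, 1.0])  # hypothetical link lengths

def direct_kinematics(q):
    """k_t(q): end-effector position (x, y) of a planar 3R arm."""
    angles = np.cumsum(q)  # absolute orientation of each link
    return np.array([np.sum(LINKS * np.cos(angles)),
                     np.sum(LINKS * np.sin(angles))])

def task_jacobian(q):
    """J_t(q) = dk_t/dq, an M x N = 2 x 3 matrix."""
    angles = np.cumsum(q)
    J = np.zeros((2, 3))
    for i in range(3):
        # Joint i rotates every link from i outward.
        J[0, i] = -np.sum(LINKS[i:] * np.sin(angles[i:]))
        J[1, i] = np.sum(LINKS[i:] * np.cos(angles[i:]))
    return J

def numerical_jacobian(q, eps=1e-6):
    """Central finite-difference approximation of dk_t/dq, used only as a check."""
    J = np.zeros((2, 3))
    for i in range(3):
        dq = np.zeros(3)
        dq[i] = eps
        J[:, i] = (direct_kinematics(q + dq) - direct_kinematics(q - dq)) / (2 * eps)
    return J

q = np.array([0.3, 0.5, -0.4])
J_analytic = task_jacobian(q)
J_numeric = numerical_jacobian(q)
```

With this Jacobian, a joint velocity `q_dot` maps to the task velocity `J_analytic @ q_dot`, as in the first-order differential kinematics.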

Remarkably, $\dot{t}$ expresses the rate of change of
the variables adopted to describe the task and thus
does not necessarily have the meaning of an
end-effector velocity. In general, by denoting the
end-effector spatial velocity $v_N$ as the stack of the 3D
translational and angular end-effector velocities,
the following relationship holds

$\dot{t} = T(t)\,v_N$, (3)

where $T$ is an $M \times 6$ transformation matrix.

For a given manipulator, the mapping

$v_N = J(q)\,\dot{q}$ (4)

relates a joint-space velocity to the corresponding
end-effector velocity through the $6 \times N$ geometric
Jacobian matrix $J$.

By comparing (2)-(4), the relation between
the geometric Jacobian and the task Jacobian can
be found as

$J_t(q) = T(t)\,J(q)$. (5)

Further differentiation of (2) w.r.t. time provides
the following relationship between the
acceleration variables:

$\ddot{t} = J_t(q)\,\ddot{q} + \dot{J}_t(q, \dot{q})\,\dot{q}$. (6)

This equation is also known as second-order
differential kinematics.

Singularities 

A robot configuration $q$ is singular if the task
Jacobian matrix is rank deficient at it. Considering
the role of $J_t$ in (2) and (6), it is easy to realize
that at a singular configuration, it is impossible to
generate end-effector task velocities or accelerations
in certain directions. Further insight can be
gained by looking at (5), which indicates that a
singularity may be due to a loss of rank of the
transformation matrix $T$ and/or of the geometric
Jacobian matrix $J$.

Rank deficiencies of $T$ are only related to the
mathematical relationship between $v_N$ and $\dot{t}$;
for this reason, a configuration at which $T$ is
singular is referred to as a representation singularity.
A representation singularity is not directly related
to the true motion capabilities of the manipulator
structure, which can be instead inferred by the
analysis of the geometric Jacobian matrix. Rank
deficiencies of $J$ are in fact related to loss of
mobility of the manipulator end effector; indeed,
end-effector velocities exist in this case that are
unfeasible for any velocity commanded at the
joints. A configuration at which $J$ is singular is
referred to as a kinematic singularity.
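
A loss of rank can be checked numerically through the singular values of the Jacobian. The sketch below reuses a hypothetical planar three-joint arm with unit links whose task is the end-effector position, for which the task Jacobian coincides with the translational rows of the geometric Jacobian; the tolerance and the function names are assumptions.

```python
import numpy as np

def task_jacobian(q, links=(1.0, 1.0, 1.0)):
    """Position Jacobian of a hypothetical planar arm with revolute joints."""
    links = np.asarray(links)
    angles = np.cumsum(q)
    J = np.zeros((2, len(q)))
    for i in range(len(q)):
        J[0, i] = -np.sum(links[i:] * np.sin(angles[i:]))
        J[1, i] = np.sum(links[i:] * np.cos(angles[i:]))
    return J

def is_singular(J, tol=1e-9):
    """A configuration is singular when the Jacobian loses rank."""
    smallest = np.linalg.svd(J, compute_uv=False)[-1]
    return bool(smallest < tol)

J_regular = task_jacobian(np.array([0.3, 0.5, -0.4]))
J_stretched = task_jacobian(np.zeros(3))  # arm fully stretched along the x axis
```

In the stretched configuration no joint velocity can move the end effector along the arm axis, which is the loss of mobility described above.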

Since redundancy resolution methods involve
the inversion of the task differential kinematics
(2) and (6), the handling of singularities
through proper treatment of the Jacobian matrix
is very important. However, due to space
limitations, this topic is out of the scope of
this entry and in the following, we will assume
that the Jacobian matrices at issue are all full
rank.

Null-Space Velocities 

With a full-rank task Jacobian, at each configuration
an $(N - M)$-dimensional null space of $J_t$
exists, made of the set of joint-space velocities
that yield zero task velocity; these are thus called
null-space velocities in short.

Remarkably, the components of $\dot{q}$ in the null
space of $J_t$ produce a change in the configuration
of the manipulator without affecting its task
velocity. This can be exploited to achieve additional
goals, like obstacle or singularity avoidance, in
addition to the realization of a desired task motion
and constitutes the core of redundancy resolution
approaches.

Inverse Differential Kinematics 

The inverse kinematics problem can be solved
by inverting the direct kinematics equation (1),
the first-order differential kinematics (2), or the
second-order differential kinematics (6). With a
time-varying desired task reference, it is convenient
to solve the differential kinematic relationships
because these represent linear equations
with the task Jacobian as the coefficient
matrix.

For a kinematically redundant manipulator,
the general solution of (2) or (6) can be expressed
by resorting to the pseudoinverse $J_t^{\dagger}$ of the task
Jacobian matrix (Whitney 1969).

The general solution of (2) can be written as

$\dot{q} = J_t^{\dagger}(q)\,\dot{t} + N_{J_t}(q)\,\dot{q}_0$, (7)

where

$N_{J_t}(q) = I - J_t^{\dagger}(q)\,J_t(q)$

is an orthogonal projection matrix into the null
space of $J_t$, and $\dot{q}_0$ is an arbitrary joint-space
velocity; the second part of the solution is therefore
a null-space velocity. The particular solution
obtained by setting $\dot{q}_0 = 0$ in (7) is known as the
pseudoinverse solution.
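
The general solution (7) and the null-space projector can be written down directly with NumPy. In the sketch below the 2 x 3 matrix `J_t`, the task velocity `t_dot`, and the function name `general_solution` are hypothetical values and names chosen only to exercise the formula.

```python
import numpy as np

def general_solution(J_t, t_dot, q0_dot):
    """Eq. (7): q_dot = J_t^+ t_dot + (I - J_t^+ J_t) q0_dot."""
    J_pinv = np.linalg.pinv(J_t)
    N = np.eye(J_t.shape[1]) - J_pinv @ J_t  # orthogonal projector onto null(J_t)
    return J_pinv @ t_dot + N @ q0_dot

# Hypothetical full-rank 2 x 3 task Jacobian and desired task velocity.
J_t = np.array([[-1.4, -1.1, -0.4],
                [ 2.6,  1.6,  0.9]])
t_dot = np.array([0.1, -0.2])

q_dot_min = general_solution(J_t, t_dot, np.zeros(3))                # pseudoinverse solution
q_dot_alt = general_solution(J_t, t_dot, np.array([1.0, 0.0, 0.0]))  # another valid solution
```

Both joint velocities realize the same task velocity; they differ only by a null-space velocity, and the pseudoinverse solution is the one with the smaller Euclidean norm.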

As for the second-order kinematics (6), its
solution can be expressed in the general form

$\ddot{q} = J_t^{\dagger}(q)\left(\ddot{t} - \dot{J}_t(q, \dot{q})\,\dot{q}\right) + N_{J_t}(q)\,\ddot{q}_0$, (8)

where $\ddot{q}_0$ is an arbitrary joint-space acceleration.

In summary, for a kinematically redundant
manipulator, the inverse kinematics problem admits
an infinite number of solutions, so that a
methodology to select one of them is needed.

Redundancy Resolution via 
Optimization 

An approach to redundancy resolution is based on 
the optimization of suitable performance criteria. 

Performance Criteria 

The availability of redundant degrees of freedom 
can be used to improve the value of performance 
criteria during the motion. These criteria may 
depend on the robot joint configuration only or 
involve also velocities and/or accelerations. 

The most frequently considered performance
objective for trajectory tracking tasks is
singularity avoidance. In fact, singularities lead
to decreased mobility, and adding kinematic
redundancy makes it possible to reduce the extent of
the workspace region where the manipulator
is necessarily at a singular configuration
(unavoidable singularities; Baillieul et al. 1984).
Possible performance criteria to drive the
manipulator motion out of avoidable singularities
are configuration-dependent functions
that characterize the distance from singular
configurations, such as the manipulability measure,
the condition number, and the smallest singular
value of $J_t$.
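
The three measures just listed can all be obtained from the singular values of the task Jacobian. The sketch below is illustrative only: the matrix `J_t` is a hypothetical full-rank 2 x 3 Jacobian, and the manipulability measure is taken in its usual form $\sqrt{\det(J_t J_t^T)}$, which equals the product of the singular values.

```python
import numpy as np

def singularity_measures(J_t):
    """Distance-from-singularity measures for a full-row-rank task Jacobian."""
    s = np.linalg.svd(J_t, compute_uv=False)  # singular values in descending order
    return {
        # Product of the singular values, equal to sqrt(det(J_t J_t^T)).
        "manipulability": float(np.prod(s)),
        "condition_number": float(s[0] / s[-1]) if s[-1] > 0.0 else float("inf"),
        "smallest_singular_value": float(s[-1]),
    }

J_t = np.array([[-1.4, -1.1, -0.4],
                [ 2.6,  1.6,  0.9]])  # hypothetical task Jacobian
measures = singularity_measures(J_t)
```

All three quantities signal the approach to a singular configuration: the manipulability and the smallest singular value tend to zero, while the condition number grows without bound.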

Since kinematic inversion produces very high
joint velocities in the vicinity of singular
configurations, a conceptually different possibility is to
minimize the norm of the joint velocity generated
by the redundancy resolution scheme.

Redundancy can also be used to keep a robot
away from undesired regions of the joint space or
of the task space. For example, it might be desired
that a manipulator avoids reaching mechanical
joint limits (Liegeois 1977). Another interesting
application is obstacle avoidance, which can be
enforced by minimizing suitable artificial potential
functions defined on the basis of the image of
the obstacle region in the configuration space.

Many other performance criteria can be found 
in the literature. 

Local Optimization 

Equation (7) provides least-squares solutions to
the end-effector task constraint (2), so that it
minimizes $\|\dot{t} - J_t \dot{q}\|$.

The simplest form of local optimization is
represented by the pseudoinverse solution that
provides the joint velocity with the minimum norm
among those which realize the task constraint.
Clearly, the joint movement generated by this
locally optimal solution does not provide global
velocity minimization along the entire manipulator
motion; therefore, singularity avoidance is not
guaranteed (Baillieul et al. 1984).

In terms of the inverse differential kinematics
problem, the least-squares property may quantify
the accuracy of the end-effector task realization,
while the minimum norm property may be relevant
for the feasibility of the joint-space velocities.

Another possibility is to use the general
solution (7), choosing $\dot{q}_0$ as

$\dot{q}_0 = -k_H \nabla H(q)$, (9)

where $k_H$ is a scalar stepsize and $\nabla H(q)$ denotes
the gradient of a scalar configuration-dependent
performance criterion $H$ that is to be minimized
(Liegeois 1977).

As for the second-order solution (8), choosing
$\ddot{q}_0 = 0$ gives the minimum norm acceleration
solution.
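
The gradient-projection choice (9) for the velocity-level solution (7) can be sketched as follows. The criterion used here, a quadratic distance from the mid-range of the joints, is one common way of expressing joint-limit avoidance and is an assumption of this example, as are the numerical values of `J_t`, `q`, and the joint limits.

```python
import numpy as np

def joint_limit_gradient(q, q_min, q_max):
    """Gradient of H(q) = 0.5 * sum(((q - q_mid) / (q_max - q_min))**2)."""
    q_mid = 0.5 * (q_min + q_max)
    return (q - q_mid) / (q_max - q_min) ** 2

def projected_gradient_step(J_t, t_dot, grad_H, k_H):
    """Eq. (7) with q0_dot = -k_H * grad H(q), as in Eq. (9)."""
    J_pinv = np.linalg.pinv(J_t)
    N = np.eye(J_t.shape[1]) - J_pinv @ J_t
    return J_pinv @ t_dot - k_H * (N @ grad_H)

J_t = np.array([[-1.4, -1.1, -0.4],
                [ 2.6,  1.6,  0.9]])     # hypothetical task Jacobian
q = np.array([0.3, 0.5, -0.4])           # hypothetical current configuration
q_min, q_max = -np.ones(3), np.ones(3)   # hypothetical joint limits
t_dot = np.zeros(2)                      # task held still: pure self-motion

grad_H = joint_limit_gradient(q, q_min, q_max)
q_dot = projected_gradient_step(J_t, t_dot, grad_H, k_H=1.0)
```

Because the gradient is projected into the null space of the task Jacobian, the resulting self-motion reduces the criterion to first order without disturbing the task velocity.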

Global Optimization 

Local optimization algorithms can lead to
unsatisfactory performance over long-duration tasks.
It is therefore natural to consider the possibility
of selecting $\dot{q}_0$ in (7) so as to minimize integral
criteria of the form

$\int_{t_0}^{t_f} H(q)\,dt$

defined over the whole duration of the task.
Unfortunately, the solution of these problems
(naturally formulated within the Calculus of Variations
framework) may not exist and does not admit a
closed form in general. One way to make the
problem solvable is to use an integral criterion
in quadratic form in the joint velocities or
accelerations. However, this is more easily done
at the second-order kinematic level (see
section “Second-Order Redundancy Resolution”).


Redundancy Resolution via Task
Augmentation

Another approach to redundancy resolution
consists in augmenting the task vector so as
to tackle additional objectives expressed as
constraints.


Extended Jacobian

The extended Jacobian technique (Baillieul 1985)
enforces an appropriate number of functional
constraints to be fulfilled along with the original
end-effector task.

Given an objective function $g(q)$, if $J_t$ has full
rank, a set of $N - M$ independent constraints can
be obtained from the equation

$\left.\dfrac{\partial g(q)}{\partial q}\right|_{q=\bar{q}} N_{J_t}(\bar{q}) = 0^T$

where $\bar{q}$ is the current joint configuration such
that the function $g(q)$ is at an extreme; these
$N - M$ independent constraints can be written
in vector form as

$h(q) = 0$.

For a motion that tracks a trajectory $t(t)$ by
keeping $g(q)$ extremized at each time, it is

$\begin{bmatrix} t(t) \\ 0 \end{bmatrix} = \begin{bmatrix} k_t(q(t)) \\ h(q(t)) \end{bmatrix}$

that, similarly to (1) and (2), leads to the definition
of an extended Jacobian matrix as

$J_{ext}(q) = \begin{bmatrix} J_t(q) \\ \dfrac{\partial h(q)}{\partial q} \end{bmatrix}$.

Therefore, if the initial joint configuration
extremizes $g(q)$ and provided that $J_{ext}$ does not
become singular, the time integral of the inverse
mapping

$\dot{q} = J_{ext}^{-1}(q) \begin{bmatrix} \dot{t} \\ 0 \end{bmatrix}$ (10)

tracks the assigned end-effector trajectory $t(t)$
propagating joint configurations that extremize
$g(q)$.

The extended Jacobian method has a major
advantage over the pseudoinverse solution in that
it is cyclic, i.e., it generates repetitive joint motion
from a repetitive task motion. Moreover, solution
(10) can be made equivalent to (7) via suitable
choice of the vector $\dot{q}_0$ (Baillieul 1985).


Augmented Jacobian 

The task-space augmentation approach is based
on the direct definition of a constraint task to be
fulfilled along with the end-effector task
(Sciavicco and Siciliano 1988).

In detail, let $t_c$ collect $P$ variables that describe
the additional tasks to be fulfilled besides
the end-effector task $t$. In the general case, it is
$P \le N - M$, although full exploitation of the
redundancy suggests considering exactly $P = N - M$.

The relation between the joint-space and the
constraint-task coordinates can be considered as
a direct kinematics equation

$t_c = k_c(q)$,

where $k_c$ is a continuous nonlinear vector function.
At this point, an augmented task can be
defined by stacking the end-effector task with the
constraint task as

$t_a = \begin{bmatrix} t \\ t_c \end{bmatrix} = \begin{bmatrix} k_t(q) \\ k_c(q) \end{bmatrix}$.

According to this definition, finding a joint
configuration $q$ that brings $t_a$ to some desired value
means satisfying both the end-effector and the
constraint task at the same time.

A solution to this problem can be found at the
differential level by inverting the mapping

$\dot{t}_a = J_a(q)\,\dot{q}$ (11)

where the matrix

$J_a(q) = \begin{bmatrix} J_t(q) \\ J_c(q) \end{bmatrix}$

is termed augmented Jacobian and $J_c(q) =
\partial k_c / \partial q$ is the $P \times N$ constraint-task Jacobian
matrix.

A particular choice for the constraint-task vector
is $t_c = h(q)$, with $h$ defined as explained in
section “Extended Jacobian”, that allows the
augmented Jacobian method to embed the extended
Jacobian one.
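
For the fully constrained case P = N - M the augmented Jacobian is square, and the mapping (11) can be inverted by solving a linear system. The following sketch assumes a hypothetical three-joint planar arm whose constraint task is the absolute orientation of the last link (the sum of the joint angles), so that its constraint-task Jacobian is a row of ones; the numerical values are illustrative.

```python
import numpy as np

def augmented_inverse_kinematics(J_t, J_c, t_dot, tc_dot):
    """Invert Eq. (11) when P = N - M, i.e., when the augmented Jacobian is square."""
    J_a = np.vstack([J_t, J_c])
    ta_dot = np.concatenate([t_dot, tc_dot])
    # np.linalg.solve raises LinAlgError if J_a is exactly singular.
    return np.linalg.solve(J_a, ta_dot)

J_t = np.array([[-1.4, -1.1, -0.4],
                [ 2.6,  1.6,  0.9]])  # hypothetical end-effector task Jacobian (M = 2)
J_c = np.array([[1.0, 1.0, 1.0]])     # constraint task: orientation of the last link (P = 1)
t_dot = np.array([0.1, -0.2])
tc_dot = np.array([0.0])              # keep the last-link orientation constant

q_dot = augmented_inverse_kinematics(J_t, J_c, t_dot, tc_dot)
```

The single joint velocity obtained this way satisfies both tasks at once, as long as the augmented Jacobian stays well conditioned.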


Algorithmic Singularities 

The specification of additional goals besides 
tracking the end-effector task raises the 
possibility that configurations exist at which 
the augmented kinematics problem is singular 
while the sole end-effector task kinematics is 
not; these configurations are termed algorithmic 
singularities (Baillieul 1985). With reference 
to the velocity mappings (10) and (11), an 
algorithmically singular configuration is one at 
which the extended and the augmented Jacobians, 
respectively, are singular while J t is full rank. 

Remarkably, algorithmic singularities arise
from the way in which the constraint task
conflicts with the end-effector task and are not
a problem of the specific inverse kinematic
technique (Baillieul 1985). This is easily
understandable in simple situations such as
that of a desired trajectory passing through an
obstacle, where either the trajectory is tracked
or the obstacle is avoided, so that both tasks
cannot be achieved together. If the origin of
the conflict between the two tasks has a clear
meaning, the algorithmic singularity may be
avoided by carefully specifying the constraint task
case by case; otherwise, analytical tools must be
adopted.

Task Priority 

Conflicts between the end-effector task and the 
constraint task are handled in the framework of 
the task-priority strategy by suitably assigning 
an order of priority to the given tasks and then 
satisfying the lower-priority task only in the null- 
space of the higher-priority task (Maciejewski 
and Klein 1985; Nakamura et al. 1987). The idea 
is that, when an exact solution does not exist, the 
reconstruction error should only affect the lower- 
priority task. 

With reference to solution (7), the task-priority
method consists in computing $\dot{q}_0$ so as to suitably
achieve the $P$-dimensional constraint-task velocity
$\dot{t}_c$. Remarkably, the projection of $\dot{q}_0$ onto
the null space of $J_t$ ensures lower priority of the
constraint task with respect to the end-effector
task since it results in a null-space velocity for
the end-effector task.

Consistently with the defined order of priority
between the two tasks, a reasonable choice is
then to guarantee exact tracking of the primary-task
velocity while minimizing the constraint-task
velocity reconstruction error $\dot{t}_c - J_c \dot{q}$; this
gives (Maciejewski and Klein 1985)

$\dot{q} = J_t^{\dagger}(q)\,\dot{t} + \left(J_c(q)\,N_{J_t}(q)\right)^{\dagger}\left(\dot{t}_c - J_c(q)\,J_t^{\dagger}(q)\,\dot{t}\right)$. (12)

It can be recognized that the problem of
algorithmic singularities still remains; in fact, the
matrix $J_c N_{J_t}$ may lose rank with full-rank
$J_t$ and $J_c$. However, differently from the
task-space augmentation approach, correct primary-task
solutions are expected as long as the sole
primary-task Jacobian matrix is full rank. On the
other hand, out of the algorithmic singularities,
the task-priority strategy gives the same solution
as the task-space augmentation approach; this
implies that close to an algorithmic singularity,
the solution becomes ill-conditioned and large
joint velocities may result.

Another approach is to relax minimization of
the secondary-task velocity reconstruction
constraint and simply pursue tracking of the components
of $J_c^{\dagger}\dot{t}_c$ that do not conflict with the primary
task (Chiaverini 1997), namely,

$\dot{q} = J_t^{\dagger}(q)\,\dot{t} + N_{J_t}(q)\,J_c^{\dagger}(q)\,\dot{t}_c$. (13)

A nice property of solution (13) is that algorithmic
singularities are decoupled from the singularities
of $J_c$.
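
The two prioritized solutions (12) and (13) differ only in how the secondary task enters the null space of the primary one, which a short NumPy sketch makes explicit. The Jacobians and velocities below are hypothetical, non-conflicting values, and the function names are chosen for illustration.

```python
import numpy as np

def null_projector(J):
    """Orthogonal projector onto the null space of J."""
    return np.eye(J.shape[1]) - np.linalg.pinv(J) @ J

def task_priority(J_t, J_c, t_dot, tc_dot):
    """Eq. (12): the secondary task is solved within the null space of the primary task."""
    q_primary = np.linalg.pinv(J_t) @ t_dot
    N = null_projector(J_t)
    return q_primary + np.linalg.pinv(J_c @ N) @ (tc_dot - J_c @ q_primary)

def projected_secondary(J_t, J_c, t_dot, tc_dot):
    """Eq. (13): the secondary pseudoinverse solution is projected onto null(J_t)."""
    return np.linalg.pinv(J_t) @ t_dot + null_projector(J_t) @ (np.linalg.pinv(J_c) @ tc_dot)

J_t = np.array([[-1.4, -1.1, -0.4],
                [ 2.6,  1.6,  0.9]])  # hypothetical primary-task Jacobian
J_c = np.array([[1.0, 1.0, 1.0]])     # hypothetical secondary-task Jacobian
t_dot = np.array([0.1, -0.2])
tc_dot = np.array([0.3])

q_dot_12 = task_priority(J_t, J_c, t_dot, tc_dot)
q_dot_13 = projected_secondary(J_t, J_c, t_dot, tc_dot)
```

In both cases the primary task is tracked exactly. Away from algorithmic singularities, (12) also realizes the secondary velocity exactly, whereas (13) generally tracks only its non-conflicting components in exchange for better conditioning.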

Second-Order Redundancy 
Resolution 

Redundancy resolution at the acceleration level
allows the consideration of dynamic performance
along the manipulator motion. Moreover, the
obtained acceleration profiles (together with the
corresponding positions and velocities) can be
directly used as reference signals of task-space
dynamic controllers.

The simplest scheme operating at the acceleration
level is represented by (8) with $\ddot{q}_0 = 0$. Similar
to the velocity-level pseudoinverse solution,
the joint motion generated by this locally optimal
solution does not result in global acceleration
minimization. Remarkably, provided that the
appropriate boundary conditions are satisfied, this
solution leads to the minimization of the integral
of $\dot{q}^T \dot{q}$ (Kazerounian and Wang 1988).

More flexibility in the choice of performance criteria is obviously obtained by considering the full second-order solution (8). Let the manipulator dynamic model be expressed as

$$\tau = H(q)\,\ddot q + c(q,\dot q) + \tau_g(q),$$

where $\tau$ is the vector of actuator torques, $H$ is the manipulator inertia matrix, $c$ is the vector of centrifugal/Coriolis terms, and $\tau_g$ is the gravitational torque vector. Choosing the null-space acceleration in (8) as

$$\ddot q_0 = -\bigl(H(q) N_{J_t}(q)\bigr)^\dagger \Bigl(H(q) J_t^\dagger(q)\bigl(\ddot t - \dot J_t(q,\dot q)\,\dot q\bigr) + c(q,\dot q) + \tau_g(q)\Bigr)$$

leads to the local minimization of the actuator torque norm $\tau^T \tau$ (Hollerbach and Suh 1987).
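The torque-minimizing choice of the null-space acceleration is a linear least-squares solution, which can be checked numerically. The sketch below assumes randomly generated data standing in for $H$, $J_t$, $\dot J_t \dot q$, $c$, and $\tau_g$ at one configuration (all values and names are hypothetical); it compares the torque norm obtained with the formula above against random perturbations of the null-space acceleration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 2
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)             # symmetric positive definite inertia matrix
J = rng.standard_normal((m, n))         # task Jacobian J_t
Jdot_qdot = rng.standard_normal(m)      # stands for \dot J_t(q, \dot q) \dot q
t_ddot = rng.standard_normal(m)         # desired task acceleration
c = rng.standard_normal(n)              # centrifugal/Coriolis terms
tau_g = rng.standard_normal(n)          # gravitational torques

J_pinv = np.linalg.pinv(J)
N = np.eye(n) - J_pinv @ J              # null-space projector of J_t

def torque(qdd0):
    """Actuator torque for the second-order solution with null-space acceleration qdd0."""
    qdd = J_pinv @ (t_ddot - Jdot_qdot) + N @ qdd0
    return H @ qdd + c + tau_g

# Null-space acceleration from the formula above
qdd0_star = -np.linalg.pinv(H @ N) @ (H @ J_pinv @ (t_ddot - Jdot_qdot) + c + tau_g)
best = np.linalg.norm(torque(qdd0_star))
trials = [np.linalg.norm(torque(qdd0_star + 0.1 * rng.standard_normal(n)))
          for _ in range(200)]
print(best, min(trials))
```

No perturbed null-space acceleration should yield a smaller torque norm than the one produced by the formula, which is what "local minimization" means here: the norm is minimized instant by instant, not along the whole trajectory.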

Another interesting inverse solution, which minimizes the integral of the manipulator kinetic energy, is the following (Kazerounian and Wang 1988):

$$\ddot q = J_{t,H}^\dagger(q)\bigl(\ddot t - \dot J_t(q,\dot q)\,\dot q\bigr) + \bigl(I - J_{t,H}^\dagger(q) J_t(q)\bigr) H^{-1}(q)\, c(q,\dot q),$$

where the inertia-weighted task Jacobian pseudoinverse can be computed as

$$J_{t,H}^\dagger(q) = H^{-1}(q) J_t^T(q) \bigl(J_t(q) H^{-1}(q) J_t^T(q)\bigr)^{-1}.$$

Once again, the correct boundary conditions must be used.
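The inertia-weighted pseudoinverse can be illustrated with a short numerical check. This is a sketch under assumed random data (matrix sizes, seed, and names are hypothetical): it verifies that the weighted pseudoinverse is a right inverse of the task Jacobian and that, among all vectors mapped by the Jacobian to the same task-space vector, it returns the one with the smallest $H$-weighted norm.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 5, 2
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)             # symmetric positive definite inertia matrix
J = rng.standard_normal((m, n))         # task Jacobian J_t (assumed full row rank)

H_inv = np.linalg.inv(H)
J_H = H_inv @ J.T @ np.linalg.inv(J @ H_inv @ J.T)   # inertia-weighted pseudoinverse

v = rng.standard_normal(m)              # arbitrary task-space vector
x_star = J_H @ v
N = np.eye(n) - np.linalg.pinv(J) @ J   # any null-space component leaves J x unchanged

def weighted_norm(x):
    """Quadratic form x^T H x."""
    return float(x @ H @ x)

others = [weighted_norm(x_star + N @ rng.standard_normal(n)) for _ in range(100)]
print(weighted_norm(x_star), min(others))
```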

Summary and Future Directions 

To discuss kinematic redundancy, the concept of task-oriented kinematics has been first recalled with the basic methods for its inversion at the velocity and acceleration level. Next, different methods to solve kinematic redundancy at the velocity level have been arranged in two main categories, namely, those based on the optimization of suitable performance criteria and those relying on the augmentation of the task space. Finally, redundancy resolution methods at the acceleration level have been considered in order to take into account dynamics issues, e.g., torque or kinetic energy minimization.

Besides the classical linear algebra methods and optimization tools, which are still under investigation, new methodological approaches to redundancy resolution recently include learning algorithms (Rolf et al. 2010) and soft computing techniques (Liu and Li 2006). Active fields of new applications are in sensorial redundancy for data fusion (Luo and Chang 2012) and in systems (like the one in Fig. 6) with a large number of degrees of freedom, namely, hyperredundant robots (Salvietti et al. 2009), humanoids (Kanoun and Laumond 2010), and multirobot systems (Antonelli et al. 2010).

Redundant Robots, Fig. 6 The DLR Rolling Justin

Cross-References 

► Cooperative Manipulators 

► Optimal Control and Mechanics 

► Robot Motion Control 


Recommended Reading 

Because of space and scope limitations, in drawing an overview of such a mature and well-developed topic, a number of techniques and details are inevitably neglected; a slightly more extensive treatment of kinematic redundancy, including a brief discussion of singularity robustness, cyclicity, and hyperredundant manipulators, with a related first-reading bibliography, can be found in Chiaverini et al. (2008). Other major issues of interest that could not be covered here are the use of kinematic redundancy for fault tolerance, for improved grasping, and for motion/force control; see, e.g., Roberts et al. (2008), Prats et al. (2011), and Khatib (1990), respectively.
Abstract

A classical problem in control theory is the design of feedback laws such that the effect of exogenous inputs on selected output variables is asymptotically rejected. This includes problems of asymptotic tracking and disturbance rejection. In this entry, the fundamentals of the theory are presented, as well as constructive procedures for the design of a controller, which embeds an “internal model” of the generator of the exogenous inputs. Current and future research directions are also discussed.


Keywords 

Nonlinear output regulation; Robust control; Stabilization of nonlinear systems; Tracking

Introduction 

The problem of controlling a dynamical system in such a way that a “regulated” output tracks reference signals or rejects exogenous disturbances is ubiquitous in control theory. Among the various possible approaches to the solution of this problem, in this entry we present the so-called theory of nonlinear output regulation. A distinctive feature of this theory is that reference/disturbance signals to be tracked/rejected are thought of as unknown functions of time, which belong to the set of all trajectories generated by an autonomous nonlinear system (the so-called exosystem). Fundamental in this setting is the concept of internal model, developed in the early 1970s for linear systems by Francis and Wonham (1976) and subsequently extended, beginning with the work of Isidori and Byrnes (1990), to the case of nonlinear systems. Since these early contributions, nonlinear output regulation has been an active research domain, in which constant improvements have brought the theory to a stage of full maturity. In this entry we introduce the fundamental principles of nonlinear output regulation theory and the associated design tools. The entry ends with an overview of current research trends and future research directions.


The Generalized Tracking Problem 
for Nonlinear Systems 

We consider the class of time-invariant smooth nonlinear systems described in the form

$$\begin{aligned} \dot x &= f(w, x, u) \\ e &= h(w, x) \\ y &= k(w, x) \end{aligned} \tag{1}$$

in which $x \in \mathbb{R}^n$ is the state, $u \in \mathbb{R}^m$ is the control input, $y \in \mathbb{R}^q$ is the measured output, and $e \in \mathbb{R}^p$ is the regulation error. The input $w \in \mathbb{R}^s$ models exogenous signals that might represent references to be tracked, exogenous disturbances to be rejected, or also parametric uncertainties. In this framework the problem is to design a controller of the form

$$\begin{aligned} \dot \xi &= \varphi(\xi, y), \qquad \xi \in \mathbb{R}^\nu \\ u &= \gamma(\xi, y) \end{aligned} \tag{2}$$

such that the associated closed-loop system (1)-(2) has bounded trajectories $(x(t), \xi(t))$ and the resulting error $e(t) = h(w(t), x(t))$ is asymptotically vanishing, i.e., $\lim_{t \to \infty} e(t) = 0$. The previous framework encompasses several standard control problems, such as the problem in which a system of the form $\dot x = f(x, u)$, with measured output $y = k(x)$ and regulated output $y_r = h_r(x)$, must be controlled in such a way that $y_r(t)$ asymptotically tracks a reference signal $y^*(t)$. This is the case, in fact, if we set $w(t) = y^*(t)$, define $e = h(w, x) = w - h_r(x)$, and drop the dependence on $w$ in the functions $f(\cdot)$ and $k(\cdot)$ in (1). Similarly, the previous framework lends itself to capturing a scenario of disturbance suppression, in which, in a system of the form $\dot x = f(d, x, u)$, with measured output $y = k(x)$ and regulated output $y_r = h(x)$, the effect of a disturbance $d(t)$ on the regulated output $y_r(t)$ must be asymptotically rejected. This is the case if we set $w(t) = d(t)$ and drop the dependence on $w$ in $h(\cdot)$ and $k(\cdot)$ in (1). Similarly, by letting $h(x) = x$ and interpreting the variable $w$ as parametric uncertainty, the previous setting captures the problem of robust output feedback stabilization, at the origin, of an uncertain system of the form $\dot x = f(w, x, u)$ with measured output $y = k(w, x)$. Of course, the general case of tracking reference signals in the presence of exogenous disturbances can be cast in a similar manner.

The ability to solve the problem in question strongly depends on the amount of knowledge one assumes about the exogenous variable $w$ in the design of the controller (2). Among the different options available, in this entry we present the so-called theory of output regulation, in which the exogenous variable is assumed to be an unknown member of a known family of functions of time. Specifically, it is assumed that $w(t)$ is an unspecified member of the set of all trajectories generated by an autonomous nonlinear system of the form

$$\dot w = s(w) \tag{3}$$

as its initial condition $w(0)$ ranges on a prescribed set $W \subset \mathbb{R}^s$. In this framework, system (3), usually referred to as the “exosystem,” is assumed to be known and its knowledge potentially exploitable in the design of (2). The specific “member” $w(t)$ of the family, however, is unspecified as the initial condition $w(0)$ is not known. Regarding $w(t)$ as an unknown member of a known family seems to be the right trade-off between the favorable but unrealistic situation in which $w(t)$ is assumed to be perfectly known and the opposite realistic but conservative situation in which $w(t)$ is regarded as a totally unknown signal. An elementary, and yet meaningful, example is given by the case in which $w(t)$ belongs to the family of periodic functions of time with an unspecified frequency, phase, and amplitude. In this case the exosystem (3) is a nonlinear system of the form ($w \in \mathbb{R}^3$)

$$\dot w_1 = w_2, \qquad \dot w_2 = -w_3^2\, w_1, \qquad \dot w_3 = 0.$$

Solutions of the previous system, in fact, are periodic functions, with frequency $w_3(t) = w_3(0)$ and amplitude and phase depending on the specific initial condition $(w_1(0), w_2(0))$. Other situations, such as exosystems modeling nonlinear oscillators, can be dealt with in a similar fashion.
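The behavior of this three-dimensional exosystem can be sketched with a few lines of code. The example below is an illustration only: it integrates the equations with a hand-written fourth-order Runge-Kutta step (the step count, the initial condition, and the function names are assumptions) and checks that the trajectory returns to its initial condition after one period $2\pi / w_3(0)$, so that the third state indeed plays the role of an unknown but constant frequency.

```python
import math

def exosystem_rhs(w):
    """Right-hand side s(w) of the exosystem: w1' = w2, w2' = -w3^2 w1, w3' = 0."""
    w1, w2, w3 = w
    return (w2, -(w3 ** 2) * w1, 0.0)

def rk4_step(w, dt):
    """One classical Runge-Kutta step for the exosystem."""
    def shifted(base, direction, scale):
        return tuple(b + scale * d for b, d in zip(base, direction))
    k1 = exosystem_rhs(w)
    k2 = exosystem_rhs(shifted(w, k1, dt / 2))
    k3 = exosystem_rhs(shifted(w, k2, dt / 2))
    k4 = exosystem_rhs(shifted(w, k3, dt))
    return tuple(wi + dt / 6 * (a + 2 * b + 2 * c + d)
                 for wi, a, b, c, d in zip(w, k1, k2, k3, k4))

def simulate(w0, t_final, steps):
    """Integrate from w0 over [0, t_final] and return the final state."""
    w = tuple(w0)
    dt = t_final / steps
    for _ in range(steps):
        w = rk4_step(w, dt)
    return w

omega = 2.0                      # frequency, stored in the initial condition w3(0)
w0 = (1.0, 0.5, omega)           # amplitude and phase are fixed by (w1(0), w2(0))
period = 2 * math.pi / omega
w_T = simulate(w0, period, 2000)
w_half = simulate(w0, period / 2, 1000)
print(w_T, w_half)
```

Changing `w0` changes frequency, amplitude, and phase of the generated signal without changing the exosystem itself, which is the sense in which $w(t)$ is an unknown member of a known family.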

In the previous context, the problem of output regulation can be formally cast as follows. Let $X \subset \mathbb{R}^n$ be a set of initial conditions for (1). Then, the problem consists in finding a controller of the form (2), with initial conditions in a set $\Xi \subset \mathbb{R}^\nu$, such that the trajectories of the closed-loop system (1)-(2) augmented with (3), originating from an initial condition $(w(0), x(0), \xi(0)) \in W \times X \times \Xi$, are bounded and $\lim_{t \to \infty} e(t) = 0$ uniformly in the initial conditions. (The property of “uniformity” is relevant in the context of output regulation. It reflects the requirement that the time needed for the error $e(t)$ to reach an $\epsilon$-neighborhood of the origin only depends on the set $W \times X \times \Xi$, where the initial conditions are supposed to range, and on $\epsilon$ but not on the particular value of the initial condition within $W \times X \times \Xi$.) Depending on the assumptions on the set $X$ where the initial conditions of the plant are assumed to range, the problem is further classified as semiglobal output regulation, if the set $X$ is a known compact but otherwise arbitrary subset of $\mathbb{R}^n$, or global output regulation, if $X = \mathbb{R}^n$.

Output Regulation Principles 

Steady State for Nonlinear Systems 
and Internal Model Principle 

Since the objective is to design the controller such that the effect of the exogenous variable is asymptotically rejected by the regulation error, it is apparent that any approach to the solution of the problem of output regulation must necessarily be grounded on a precise characterization of the notion of “steady state” for a nonlinear system. As is the case for the familiar version of this concept in linear systems theory, a notion of “steady state” for the system consisting of (1)-(3) should be able to capture the “limiting behavior” - if any - that such a system asymptotically approaches when the “transient behavior,” due to the effect of specific initial conditions of plant and controller, fades out and a “persistent behavior,” induced only by the specific exogenous input, emerges. In this respect, the mathematical tool that has been shown to be at the core of a rigorous notion of steady state for nonlinear systems is that of the $\omega$-limit set of a set. We refer the reader to Hale et al. (2002) for a definition of this notion and to Byrnes and Isidori (2003) for a description of its use in the characterization of the steady-state behavior of a nonlinear system. In this entry, we simply observe that if the trajectories of the system (3)-(1)-(2) that originate from the set of initial conditions $W \times X \times \Xi$ are bounded (which, in turn, is one of the requirements of the problem in question), then there exists a compact set $\mathcal{A} \subset W \times \mathbb{R}^n \times \mathbb{R}^\nu$, which is precisely the $\omega$-limit set of the set $W \times X \times \Xi$ under the dynamics of (3)-(1)-(2), that is invariant for the closed-loop system and that uniformly attracts its trajectories. The set $\mathcal{A}$ is usually referred to as the steady-state locus, while the restriction of the closed-loop dynamics to the set $\mathcal{A}$ is the steady-state dynamics of the closed-loop system. The latter characterizes the “limiting behavior” of the system towards which all the closed-loop trajectories converge. Unlike the case of linear systems, though, in a nonlinear context we cannot expect, in general, that the steady-state behavior is only governed by the exogenous $w$, namely, that the asymptotic behavior of the closed-loop system is totally independent of the initial conditions of the plant and of the regulator. Assuming that the set $W$ is compact and invariant for (3), it can be proven (see Isidori and Byrnes 2008) that the set $\mathcal{A}$ is the graph of a set-valued map defined on $W$, namely, that there exists a map $\sigma : W \to \mathbb{R}^n \times \mathbb{R}^\nu$, which is set-valued in general, such that

$$\mathcal{A} = \{(w, x, \xi) \in W \times \mathbb{R}^n \times \mathbb{R}^\nu : (x, \xi) \in \sigma(w)\}.$$

Clearly the steady-state locus and the associated steady-state dynamics of the closed-loop system depend on the design of the controller (2). The role of the latter is not only to enforce the existence of a steady state, which is equivalent to enforcing bounded closed-loop trajectories, but also to guarantee that the error converges asymptotically to zero uniformly in the initial conditions. In this respect, by bearing in mind the asymptotic properties of the set $\mathcal{A}$, it can be seen that a sufficient condition under which a regulator of the form (2) solves the problem of output regulation is that the steady-state locus is “shaped” in such a way that the regulation error is zero on it, namely,

$$\mathcal{A} \subset \{(w, x, \xi) : h(w, x) = 0\}. \tag{4}$$

In fact, it can be proved that condition (4) is not only sufficient but also necessary (see Byrnes and Isidori 2003) as a consequence of the requirement that the error converges to zero uniformly in the initial conditions. That is, any regulator that solves the problem in question necessarily enforces a steady state such that the steady-state locus fulfills (4).
In view of the previous considerations, a crucial property required of any regulator is to induce a steady-state locus $\mathcal{A}$ fulfilling (4). This key feature can be further elaborated by highlighting two necessary conditions, involving separately the plant and the regulator, leading to the notions of regulator equations and internal model property. To this purpose, consider the simplified, yet relevant, case in which the map $\sigma(\cdot)$ is single-valued and smooth, and let $\pi(w)$ and $\tau(w)$ be the two components of $\sigma(w)$ associated to $x$ and $\xi$, respectively. By letting

$$c(w) = \gamma(\tau(w), k(w, \pi(w))) \tag{5}$$

it is immediately realized that the fact that $\mathcal{A}$ is invariant for (1)-(3) implies that the functions $\pi(\cdot)$ and $\tau(\cdot)$ necessarily fulfill

$$\frac{\partial \pi(w)}{\partial w}\, s(w) = f(w, \pi(w), c(w)) \tag{6}$$

and

$$\frac{\partial \tau(w)}{\partial w}\, s(w) = \varphi(\tau(w), k(w, \pi(w))) \tag{7}$$

for all $w \in W$. Furthermore, the fact that $e$ must be zero on $\mathcal{A}$ (see (4)) implies that necessarily

$$h(w, \pi(w)) = 0 \tag{8}$$

for all $w \in W$. Equations (6) and (8), interpreted as equations in the unknowns $\pi(w)$ and $c(w)$, involve only the regulated plant (1) and are known as regulator equations (see Isidori 1995; Isidori and Byrnes 1990). The functions $c(w(t))$ and $\pi(w(t))$, with $w(t)$ a solution of (3), represent, respectively, the desired steady-state control input and state towards which the actual control input $u$ and state $x$ of (1) should converge in order to have the regulation goal fulfilled. On the other hand, Eqs. (5) and (7), interpreted as equations in the unknown $\tau(w)$, point out the so-called internal model property required of any regulator solving the output regulation problem, that is, the ability of the regulator to reproduce the ideal steady-state input $c(w(t))$, for all possible solutions $w(t)$ of (3), once it is driven by the measured output of the plant in the ideal steady state (namely, by the function $k(w(t), \pi(w(t)))$). In fact, this property can be achieved by incorporating in the controller an appropriate “internal model” of the exogenous dynamics (3).
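As a minimal worked illustration of the regulator equations (6) and (8), consider a hypothetical scalar tracking problem that is not taken from the entry: the plant $\dot x = -x + u$, the error $e = h(w, x) = x - w_1$, and a harmonic exosystem of fixed unit frequency, $\dot w_1 = w_2$, $\dot w_2 = -w_1$. Under these assumptions the two equations can be solved by inspection:

```latex
\begin{align*}
  h(w, \pi(w)) = 0
    &\;\Longrightarrow\; \pi(w) = w_1, \\
  \frac{\partial \pi(w)}{\partial w}\, s(w) = f(w, \pi(w), c(w))
    &\;\Longrightarrow\; w_2 = -w_1 + c(w)
     \;\Longrightarrow\; c(w) = w_1 + w_2 .
\end{align*}
```

Here $\pi(w(t)) = w_1(t)$ is the state trajectory on which the error vanishes, and $c(w(t)) = w_1(t) + w_2(t)$ is the feedforward input that keeps the plant on it; a regulator with the internal model property must be able to generate this signal without measuring $w$.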

Regulator Design 

As emphasized in the previous discussion, the design of the regulator involves the fulfillment of two crucial properties. The first is the internal model property, namely, the ability of generating, by means of the regulator outputs, all possible “feedforward inputs” which force an identically zero regulation error and, in turn, of guaranteeing the existence of an invariant steady-state set on which the error is identically zero. The second property asks that such a steady-state set is asymptotically stable for the closed-loop system with a domain of attraction including the set of initial conditions. A systematic design procedure of regulators simultaneously fulfilling the previous two properties can be found under sufficient conditions that essentially restrict the class of regulated plants (1). In particular, in the following, we consider the class of single-input single-output systems that are affine in the input $u$, with a measurable error variable (i.e., $e = y$), and that after an appropriate change of coordinates can be written in the form

$$\begin{aligned} \dot z &= f_z(w, z, e), & z &\in \mathbb{R}^{n-1} \\ \dot e &= a(w, z, e) + b(w, z, e)\,u, & e, u &\in \mathbb{R} \end{aligned} \tag{9}$$

with $f_z(\cdot, \cdot, \cdot)$, $a(\cdot, \cdot, \cdot)$, and $b(\cdot, \cdot, \cdot)$ smooth functions with $b(w, z, e) \neq 0$ for all $(w, z, e)$. Systems of this kind possess a well-defined unitary relative degree (The restriction to systems with unitary relative degree is just made for the sake of simplicity. Higher relative degree can be equally dealt with; see Isidori (1995).) between the input $u$ and the output $e$, and Eqs. (9) are said to be in normal form (see Isidori 1995, 2013). In these coordinates an easy calculation shows that the solution of the regulator Eqs. (6) and (8) takes the form $\pi(w) = (\pi_z(w), 0)$, where $\pi_z(\cdot) : W \to \mathbb{R}^{n-1}$ is a solution of

$$\frac{\partial \pi_z(w)}{\partial w}\, s(w) = f_z(w, \pi_z(w), 0), \qquad c(w) = -\frac{a(w, \pi_z(w), 0)}{b(w, \pi_z(w), 0)}.$$

In addition, we further restrict the class of systems by asking for a minimum-phase property (Isidori 1995, 2013). In the present context, the property in question amounts to asking that the set $\mathcal{B} = \{(w, z) \in W \times \mathbb{R}^{n-1} : z = \pi_z(w)\}$ is asymptotically stable for the system

$$\begin{aligned} \dot w &= s(w) \\ \dot z &= f_z(w, z, 0) \end{aligned}$$

with a domain of attraction containing $W \times Z$, with $Z$ the set where the initial condition of $z$ is expected to range.

Existence of the relative degree and the minimum-phase property are all that is needed to design a regulator. The regulator takes the form

$$\begin{aligned} \dot \xi &= F\xi + G(\gamma(\xi) + \kappa(e)), \qquad \xi \in \mathbb{R}^\nu \\ u &= \gamma(\xi) + \kappa(e) \end{aligned} \tag{10}$$

in which $(F, G)$ is a controllable pair and $\gamma(\cdot)$ and $\kappa(\cdot)$ are real-valued functions to be properly designed. In particular, it can be shown (see Marconi et al. 2007) that if $\nu$, the dimension of the regulator, is taken sufficiently large relative to the dimension of $w$ (specifically, $\nu > 2s + 2$) and if $F$ is any matrix whose eigenvalues have negative real part, there exist a continuously differentiable function $\tau(\cdot)$ and a continuous function $\gamma(\cdot)$ such that

$$\begin{aligned} \frac{\partial \tau(w)}{\partial w}\, s(w) &= F\tau(w) + G c(w) \\ c(w) &= \gamma(\tau(w)) \end{aligned} \tag{11}$$

for all $w \in W$. This being the case, it is seen that the regulator (10) fulfills conditions (5)-(7) and therefore has the internal model property, regardless of how $\kappa(\cdot)$ is chosen, provided that $\kappa(0) = 0$. In particular, in the closed-loop system (3), (9), and (10), the invariant set $\mathcal{A} = \{(w, z, e, \xi) \in W \times \mathbb{R}^{n-1} \times \mathbb{R} \times \mathbb{R}^\nu : z = \pi_z(w),\ e = 0,\ \xi = \tau(w)\}$ fulfills (4) regardless of how $\kappa(\cdot)$ is chosen. The function $\kappa(\cdot)$ is a degree of freedom that can be chosen to make the steady-state set $\mathcal{A}$ asymptotically stable. In this respect the minimum-phase assumption and the fact that $F$ is a Hurwitz matrix play a role. In fact, the closed-loop system (3), (9), and (10), interpreted as a system with state $(w, z, \xi, e)$, input $\kappa(\cdot)$, and output $e$, has relative degree one and is minimum-phase. This fact makes it possible to use standard high-gain arguments to show that there exists a function $\kappa(\cdot)$ such that the set $\mathcal{A}$ is asymptotically stable for the closed-loop system with a domain of attraction containing any (arbitrarily large) compact set of initial conditions (see Marconi et al. 2007; Teel and Praly 1995).
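The structure of regulator (10) can be illustrated on a deliberately simple linear special case. The sketch below is an assumption-laden toy example, not the general construction of Marconi et al. (2007): the plant is $\dot e = -w_1 + u$ (so that $c(w) = w_1$), the exosystem is a harmonic oscillator of known frequency, $\gamma(\xi) = \Psi \xi$ is linear with $F + G\Psi$ reproducing the exosystem eigenvalues, and $\kappa(e) = -k e$ is a high-gain term. All numerical values and names are hypothetical. The simulation compares the regulator with internal model against the high-gain term alone.

```python
import numpy as np

omega = 1.0                                      # exosystem frequency (assumed known here)
S = np.array([[0.0, omega], [-omega, 0.0]])      # exosystem: w' = S w
F = np.array([[0.0, 1.0], [-2.0, -3.0]])         # Hurwitz matrix (eigenvalues -1, -2)
G = np.array([0.0, 1.0])                         # (F, G) is controllable
Psi = np.array([2.0 - omega ** 2, 3.0])          # F + G Psi has eigenvalues +/- j*omega
k = 20.0                                         # gain of kappa(e) = -k e

def rhs(state, use_internal_model):
    """Closed loop: w' = S w, xi' = F xi + G u, e' = -w1 + u."""
    w, xi, e = state[:2], state[2:4], state[4]
    u = (Psi @ xi if use_internal_model else 0.0) - k * e
    return np.concatenate((S @ w, F @ xi + G * u, [-w[0] + u]))

def simulate(use_internal_model, t_final=30.0, dt=2e-3):
    """Integrate with classical Runge-Kutta and return |e(t)| at every step."""
    state = np.array([1.0, 0.0, 0.0, 0.0, 0.5])  # (w1, w2, xi1, xi2, e) at t = 0
    errors = []
    for _ in range(round(t_final / dt)):
        k1 = rhs(state, use_internal_model)
        k2 = rhs(state + dt / 2 * k1, use_internal_model)
        k3 = rhs(state + dt / 2 * k2, use_internal_model)
        k4 = rhs(state + dt * k3, use_internal_model)
        state = state + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        errors.append(abs(state[4]))
    return np.array(errors)

tail_with_model = simulate(True)[-4000:].max()      # worst error over the last 8 s
tail_without_model = simulate(False)[-4000:].max()
print(tail_with_model, tail_without_model)
```

In this toy setting the high-gain feedback alone leaves a persistent oscillating error whose size shrinks only as the gain grows, while the internal model lets the error vanish asymptotically with a fixed gain.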

The delicate part in the procedure illustrated above is the design of the function $\gamma(\cdot)$ that is required to fulfill (11) for a suitable $\tau(\cdot)$. Exact, although hardly implementable in practice, expressions for the function $\gamma(\cdot)$ can be found in Marconi and Praly (2008). More constructive design procedures can be found at the price of restricting the class of systems and exosystems that can be dealt with. Such procedures require that the autonomous dynamical system with “output” $u^*$

$$\begin{aligned} \dot w &= s(w) \\ u^* &= c(w), \end{aligned}$$

namely, the system characterizing all possible ideal steady-state inputs, is “immersed” into a system exhibiting certain structural properties (Loosely speaking, the autonomous system with output $\Sigma$ is immersed into the autonomous system with output $\Sigma'$ if the set of all possible functions of time generated as outputs of $\Sigma$ is a subset of the set of all possible functions generated as outputs of $\Sigma'$.). In this respect a number of alternative solutions have been proposed in the literature that differ in the kind of underlying immersion assumption and the consequent regulator design procedure. Immersion into a linear known observable system (see Byrnes et al. 1997; Huang and Lin 1994; Khalil 1994; Serrani et al. 2000), immersion into a linear unknown (but linearly parameterized) system (Serrani et al. 2001), immersion into a linear system having a nonlinear output map (Chen and Huang 2004), immersion into a nonlinear system linearizable by output injection (Delli Priscoli 2004), immersion into a system in canonical observability form (Byrnes and Isidori 2004), and immersion into a system in a nonlinear adaptive observability form (Delli Priscoli et al. 2006a,b) are a few examples of approaches proposed in the literature.

Summary and Future Directions 

The theory of output regulation for nonlinear systems is an active area of investigation. Research efforts are, in particular, addressed to the problems of weakening the minimum-phase assumption and of identifying robust design procedures, not necessarily based on high-gain principles, to asymptotically stabilize the steady-state locus. Recently, the problem of output regulation for multivariable systems has also been addressed (Isidori and Marconi 2012). In this case a paradigm shift in the design of the regulator and of the stabilizer is expected to be needed to deal with the problem in its full generality. Finally, it is worth mentioning that the theory of output regulation and internal model-based design methods are being used for the problem of reaching a consensus between the outputs of a network of nonlinear systems exchanging relative information over a communication graph. In this case the necessity of internal model-based regulators has been proved (Wieland 2010), and research activity is now directed toward identifying constructive design strategies for classes of nonlinear systems and network topologies.

Cross-References 

► Differential Geometric Methods in Nonlinear 
Control 

► Lyapunov’s Stability Theory 
Abstract

Risk-sensitive control was developed out of the effort to understand “robustness” from the viewpoint of stochastic control. The idea was then applied to portfolio optimization problems in mathematical finance, which in turn gave rise to a new kind of stochastic control problem, named “large deviation control,” whose study is currently in progress.

Keywords 

Large deviation control; Mathematical finance; 
Robustness 

Risk-Sensitive Criterion 

Risk-sensitive stochastic control has the criterion

$$J(x, 0; T; \gamma) = \frac{1}{\gamma} \log E\Bigl[e^{\gamma\{\int_0^T f(X_s, u_s)\,ds + \varphi(X_T)\}}\Bigr] \tag{1}$$

with $\gamma \neq 0$, where $X_t$ is the state variable process defined by the controlled stochastic differential equation

$$dX_t = \sigma(X_t)\,dB_t + b(X_t, u_t)\,dt, \qquad X_0 = x \tag{2}$$

with the control parameter process $u_t$. Here $\sigma(x) : \mathbb{R}^N \to \mathbb{R}^N \otimes \mathbb{R}^d$ and $b(x, u) : \mathbb{R}^N \times \mathbb{R}^m \to \mathbb{R}^N$. When $\gamma \to 0$, the criterion behaves as

$$J(x, 0; T; \gamma) \sim E\Bigl[\int_0^T f(X_s, u_s)\,ds + \varphi(X_T)\Bigr] + \frac{\gamma}{2} \mathrm{Var}\Bigl[\int_0^T f(X_s, u_s)\,ds + \varphi(X_T)\Bigr] + O(\gamma^2).$$

Then, minimizing the criterion with $\gamma > 0$ is considered to be risk averse, while with $\gamma < 0$ it is risk seeking. The problem of minimizing the classical criterion $E[\int_0^T f(X_s, u_s)\,ds + \varphi(X_T)]$ corresponds to the case $\gamma = 0$, which is risk neutral.
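The small-$\gamma$ expansion can be checked on a toy random cost. The sketch below is only an illustration: it replaces the path cost $\int_0^T f\,ds + \varphi(X_T)$ by a hypothetical discrete random variable $Z$ with three values (the values, probabilities, and function name are assumptions), computes $\frac{1}{\gamma}\log E[e^{\gamma Z}]$ exactly, and compares it with the mean plus $\gamma/2$ times the variance.

```python
import math

def risk_sensitive_value(values, probs, gamma):
    """(1/gamma) * log E[exp(gamma * Z)] for a discrete random cost Z."""
    expectation = sum(p * math.exp(gamma * z) for z, p in zip(values, probs))
    return math.log(expectation) / gamma

values = [0.0, 1.0, 3.0]        # possible total costs (hypothetical)
probs = [0.5, 0.3, 0.2]         # their probabilities

mean = sum(p * z for z, p in zip(values, probs))
var = sum(p * (z - mean) ** 2 for z, p in zip(values, probs))

gamma = 1e-3
exact = risk_sensitive_value(values, probs, gamma)
second_order = mean + 0.5 * gamma * var
print(exact, second_order)

# Sign of gamma: the criterion penalizes (gamma > 0) or rewards (gamma < 0) variability.
averse = risk_sensitive_value(values, probs, 0.5)
seeking = risk_sensitive_value(values, probs, -0.5)
print(averse, mean, seeking)
```

With $\gamma > 0$ the criterion exceeds the plain expectation, so a minimizer is pushed toward low-variance costs; with $\gamma < 0$ the opposite holds.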

When $f(x, u) = \frac{1}{2} x^* Q x + \frac{1}{2} u^* S u$, $\varphi(x) = \frac{1}{2} x^* U x$, $b(x, u) = Ax + Cu$, and $\sigma(x) = E$ with constant matrices $Q$, $S$, $U$, $A$, $C$, $E$, minimizing the criterion subject to the state variable processes $X_t$ is called a linear exponential quadratic Gaussian (LEQG) control problem, where one may assume $Q$ and $U$ to be nonnegative definite and $S$ positive definite.

H-J-B Equations 

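The Hamilton-Jacobi-Bellman (H-J-B) equation for the problem of minimizing the criterion $J$ defined by (1) among the controlled processes governed by (2) is seen to be

$$\begin{cases} \dfrac{\partial v}{\partial t} + \dfrac{1}{2} \mathrm{tr}[a(x) D^2 v] + H(x, \nabla v) = 0 \\ v(T, x) = \varphi(x), \end{cases} \tag{3}$$

where $a(x) := (a^{ij}(x)) = ((\sigma\sigma^*)^{ij}(x))$ and

$$H(x, p) = \frac{\gamma}{2}\, p^* a(x) p + \inf_u \{b(x, u)^* p + f(x, u)\}.$$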

In an LEQG case, where we assume that $Q$, $U$ are nonnegative definite and $S$ positive definite, the H-J-B equation has the solution expressed as

$$v(t, x) = \tfrac{1}{2}\, x^* P(t) x + G(t),$$

by using the solutions $G(t)$ of the ordinary differential equation






$$\dot G(t) + \tfrac{1}{2} \mathrm{tr}[E E^* P(t)] = 0, \qquad G(T) = 0$$

and $P(t)$ of the Riccati equation

$$\dot P(t) + PA + A^* P - P(C S^{-1} C^* - \gamma E E^*) P + Q = 0$$

with the terminal condition $P(T) = U$, provided that it has a nonnegative definite solution $P(t)$. However, it may occur that the Riccati equation does not have any solution if $\gamma$ is large. In that case, we say that the risk-sensitive control problem “breaks down.” Namely, there is no control which makes the criterion have a finite value. On the other hand, if it has a solution, then the optimal feedback control is seen to be $-S^{-1} C^* P(t) x$ and the optimal diffusion process turns out to be the solution to

$$dX_t = E\,dB_t + (A - C S^{-1} C^* P(t)) X_t\,dt, \qquad X_0 = x.$$

The situation can extend to certain general cases. Under sufficiently general conditions one can say that if H-J-B equation (3) has a solution, then no “breakdown” occurs in the corresponding risk-sensitive stochastic control problem (cf. Bensoussan and Nagai 2000; Bensoussan et al. 1998; Nagai 1996).
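The “breakdown” phenomenon can be made concrete in the scalar case. The following sketch assumes a one-dimensional LEQG problem with hypothetical coefficients $A = a$, $C = c$, $S = s$, $E = e$, $Q = q$, $U = 0$ (all names and values are illustrative), rewrites the Riccati equation in the time-to-go $T - t$, and integrates it with a Runge-Kutta scheme. For $a = 0$ and $c = s = e = q = 1$ the solution is $\tanh(T - t)$ when $\gamma = 0$ and $\tan(T - t)$ when $\gamma = 2$, so in the second case no solution exists on horizons longer than $\pi/2$.

```python
import math

def riccati_backward(gamma, horizon, a=0.0, c=1.0, s=1.0, e=1.0, q=1.0,
                     p_terminal=0.0, steps=20000, cap=1e6):
    """Integrate the scalar Riccati equation backward from P(T) = p_terminal.

    Returns (P(0), None) if the solution exists on [0, T], or (None, tau) where
    tau is the time-to-go at which |P| exceeded `cap` (numerical blow-up).
    """
    kappa = c * c / s - gamma * e * e          # coefficient C S^{-1} C^* - gamma E E^*

    def rhs(p):
        # dP/dtau with tau = T - t, obtained from P' + 2aP - kappa P^2 + q = 0
        return 2 * a * p - kappa * p * p + q

    p = p_terminal
    dt = horizon / steps
    for i in range(steps):
        k1 = rhs(p)
        k2 = rhs(p + dt / 2 * k1)
        k3 = rhs(p + dt / 2 * k2)
        k4 = rhs(p + dt * k3)
        p += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        if not math.isfinite(p) or abs(p) > cap:
            return None, (i + 1) * dt
    return p, None

p_neutral, blow_neutral = riccati_backward(gamma=0.0, horizon=2.0)
p_averse, blow_averse = riccati_backward(gamma=2.0, horizon=2.0)
print(p_neutral, blow_neutral, p_averse, blow_averse)
```

The blow-up time shrinks as $\gamma$ grows, which is one way to see why only a limited degree of risk aversion is admissible on a given horizon.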

The LEQG problems were first investigated in Jacobson (1973), and then a theory of LEQG control with complete or partial state information was developed in Whittle (1981) and Bensoussan and Van Schuppen (1985). Developments in the study of nonlinear risk-sensitive control can be seen in Bensoussan et al. (1998), Nagai (1996), Fleming and McEneaney (1995), etc.

Singular Limits and H∞ Control

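The large deviation theory of Freidlin-Wentzell applies to the risk-sensitive control problem with the criterion

$$J^\epsilon(x, 0; T) = \frac{\epsilon}{\theta} \log E\Bigl[e^{\frac{\theta}{\epsilon} \int_0^T \{\frac{1}{2} u_s^* S(X_s) u_s + V(X_s)\}\,ds}\Bigr] \tag{4}$$

and the controlled dynamics

$$dX_t = \sqrt{\epsilon}\,\sigma(X_t)\,dB_t + \{b(X_t) + C(X_t) u_t\}\,dt. \tag{5}$$

The corresponding H-J-B equation is

$$\begin{cases} \dfrac{\partial v^\epsilon}{\partial t} + \dfrac{\epsilon}{2} \mathrm{tr}[a(x) D^2 v^\epsilon] + H_0(x, \nabla v^\epsilon) + V = 0 \\ v^\epsilon(T, x) = 0, \end{cases} \tag{6}$$

$$\begin{aligned} H_0(x, p) &= \frac{\theta}{2}\, p^* a(x) p + b(x)^* p + \inf_{u \in \mathbb{R}^m} \{u^* C(x)^* p + \tfrac{1}{2} u^* S(x) u\} \\ &= b(x)^* p - \tfrac{1}{2}\, p^* \{C S^{-1} C^*(x) - \theta a(x)\} p. \end{aligned}$$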

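By employing viscosity solution theory, we can see that, when sending $\epsilon \to 0$, the solution $v^\epsilon$ of (6) converges to the viscosity solution $w$ of the equation

$$\begin{cases} \dfrac{\partial w}{\partial t} + H_0(x, \nabla w) + V = 0 \\ w(T, x) = 0. \end{cases} \tag{7}$$

Noting that $H_0(x, p)$ can be regarded as

$$H_0(x, p) = \sup_z \{z^* p - \tfrac{1}{2\theta}\, z^* a(x)^{-1} z\} + b(x)^* p + \inf_u \{u^* C(x)^* p + \tfrac{1}{2} u^* S(x) u\},$$

Equation (7) is written as

$$\begin{cases} \dfrac{\partial w}{\partial t} + \sup_z \{z^* Dw - \tfrac{1}{2\theta}\, z^* a(x)^{-1} z\} + b(x)^* Dw + \inf_u \{u^* C(x)^* Dw + \tfrac{1}{2} u^* S(x) u\} + V = 0 \\ w(T, x) = 0. \end{cases}$$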

This equation has a unique viscosity solution under suitable conditions. Further, $w(0, x)$ is characterized as the lower value of the differential game with the criterion

$$J(0, T; z_\cdot, u(z_\cdot)) = \int_0^T \psi(x_s, z_s, u(z_\cdot)_s)\,ds, \qquad \psi(x, z, u) = -\frac{1}{2\theta}\, z^* a(x)^{-1} z + \frac{1}{2}\, u^* S(x) u + V(x)$$

and the controlled dynamics

$$dx_s = \{b(x_s) + z_s + C(x_s) u(z_\cdot)_s\}\,ds, \qquad x_0 = x,$$

where $z_s$ is a measurable, $\mathbb{R}^N$-valued function on $[0, T]$ such that $\int_0^T |z_s|^2\,ds < \infty$, and the set of such $\{z_\cdot\}$ is denoted by $\mathcal{Z}_T$. Further, let $\mathcal{U}$ be the totality of measurable, $\mathbb{R}^m$-valued functions $u_s$ such that $\int_0^T |u_s|^2\,ds < \infty$, and let $u(z_\cdot)$ be a map defined on $\mathcal{Z}_T$ with values in $\mathcal{U}$ such that, for each $0 < r < T$ and $z^{(1)}_\cdot, z^{(2)}_\cdot \in \mathcal{Z}_T$, whenever $z^{(1)}(s) = z^{(2)}(s)$ almost everywhere on $0 \le s \le r$, then $u(z^{(1)}_\cdot)_s = u(z^{(2)}_\cdot)_s$ a.e. on $0 \le s \le r$; the set of such $u(z_\cdot)$ is denoted by $\Gamma_{\mathcal{U}}$. Thus, the lower value of the game is defined as

$$w(0, x) = \inf_{u(z_\cdot) \in \Gamma_{\mathcal{U}}}\ \sup_{z_\cdot \in \mathcal{Z}_T} J(0, T; z_\cdot, u(z_\cdot))$$

(cf. Bensoussan and Nagai 1997 and references therein). The differential game is known to be related to $H^\infty$ or robust control. If $\theta$ is large, then H-J-B equation (6) may fail to have a solution (cf. Bensoussan and Nagai 1997, 2000). The size of $\theta$ ensuring the existence of a solution to (6) is related to the level of robustness which the above differential game concerns (Basar and Bernhard 1991; Bensoussan and Nagai 1997, 2000; Bensoussan et al. 1998; Whittle 1990).


Risk-Sensitive Asset Management 

The idea of risk-sensitive control applies to mathematical finance (Bielecki and Pliska 1999; Fleming 1995). Consider a market model with $m + 1$ securities, where the security prices are defined by

$$dS^0(t) = r(X_t) S^0(t)\,dt,$$

$$dS^i(t) = S^i(t) \Bigl\{\alpha^i(X_t)\,dt + \sum_{k=1}^{n+m} \sigma^i_k(X_t)\,dB^k_t\Bigr\}, \qquad i = 1, \dots, m,$$

with an $n + m$ dimensional Brownian motion process $B_t = (B^1_t, B^2_t, \dots, B^{n+m}_t)$ defined on a filtered probability space $(\Omega, \mathcal{F}, P; \mathcal{F}_t)$. The volatilities $\sigma$, the instantaneous mean returns $\alpha$ of the risky assets, and the interest rate $r$ of the riskless asset are affected by the economic factors $X_t$ defined as the solution of the stochastic differential equation

$$dX_t = \beta(X_t)\,dt + \lambda(X_t)\,dB_t, \qquad X(0) = x \in \mathbb{R}^n.$$

Let us set the total wealth $W_t$ of an investor to be $W_t = \sum_{i=0}^{m} N^i_t S^i_t$, with $N^i_t$ the number of shares invested in the $i$th security $S^i_t$ at time $t$, and $W_0$ the initial wealth. Expected power utility maximization, i.e., maximizing $\frac{1}{\gamma} E[W_T^\gamma] = \frac{1}{\gamma} E[e^{\gamma \log W_T}]$, $\gamma < 1$, $\gamma \neq 0$ (Merton 1990), is equivalent to

$$\sup \frac{1}{\gamma} \log E[e^{\gamma \log W_T}], \tag{8}$$

and it has been studied in terms of “risk-sensitive asset management.” When introducing the portfolio proportion $h^i_t$ invested in the $i$th security, defined by $h^i(t) = N^i_t S^i_t / W_t$ for each $i = 0, \dots, m$, and setting $h(t)^* = (h^1(t), h^2(t), \dots, h^m(t))$, the total wealth $W_t$ turns out to satisfy

$$\frac{dW(t)}{W(t)} = \{r(X_t) + h(t)^* \hat\alpha(X_t)\}\,dt + h(t)^* \sigma(X_t)\,dB_t,$$

under the self-financing condition, where $\hat\alpha(x) = \alpha(x) - r(x)\mathbf{1}$, $\mathbf{1} = (1, 1, \dots, 1)^*$. In considering the maximization problem, the portfolio proportion $h_t$ is considered an investment strategy to be controlled and assumed to be $\mathcal{G}_t := \sigma(S(u), X(u), u \le t)$ progressively measurable in the case of full information. The problem is often considered under partial information, where $h_t$ is assumed to be $\mathcal{G}^S_t := \sigma(S(u), u \le t)$ measurable. Here we first discuss the case of full information, and the set of admissible strategies $\mathcal{A}(T)$ (or $\mathcal{A}$) is determined as the totality of $\mathcal{G}_t$ progressively measurable investment strategies satisfying some suitably defined integrability conditions.

Considering (8) for $\gamma < 0$ amounts to studying the minimization problem

$$\hat v(0,x) = \inf_{h\in\mathcal{A}(T)}\log E[e^{\gamma\log W_T(h)}]. \qquad (9)$$

Then, introducing a probability measure $P^h$ defined by

$$P^h(A) = E\Big[e^{\gamma\int_0^T h_s^*\sigma(X_s)\,dB_s - \frac{\gamma^2}{2}\int_0^T h_s^*\sigma\sigma^*(X_s)h_s\,ds};\ A\Big], \quad A \in \mathcal{F}_T,$$

the value function is expressed as

$$\hat v(0,x) = \gamma\log W_0 + \inf_{h\in\mathcal{A}(T)}\log E^h\Big[e^{-\gamma\int_0^T\eta(X_s,h_s)\,ds}\Big]$$

with the initial wealth $W_0$, where

$$\eta(x,h) = -h^*\hat\alpha(x) + \frac{1-\gamma}{2}h^*\sigma\sigma^*(x)h - r(x)$$

and $\hat\alpha(x) = \alpha(x) - r(x)\mathbf{1}$. By using the Brownian motion $B^h_t := B_t - \gamma\int_0^t\sigma^*(X_s)h_s\,ds$ under the new probability measure $P^h$, the dynamics of the economic factor $X_t$ is written as

$$dX_t = \{\beta(X_t) + \gamma\lambda\sigma^*(X_t)h_t\}\,dt + \lambda(X_t)\,dB^h_t. \qquad (10)$$

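Thus, we arrive at the risk-sensitive control problem with the value function $\hat v(0,x)$ and the controlled dynamics $X_t$ governed by (10). Note that $1 - \gamma > 0$ for $\gamma < 1$ and that the case where $\gamma < 0$ is called risk averse, which we mainly discuss here. Then the corresponding H-J-B equation is deduced as

$$\begin{cases}\dfrac{\partial v}{\partial t} + \dfrac12\mathrm{tr}[\lambda\lambda^*D^2v] + \dfrac12(Dv)^*\lambda\lambda^*Dv + \inf_h\{[\beta + \gamma\lambda\sigma^*h]^*Dv - \gamma\eta(x,h)\} = 0,\\[4pt] v(T,x) = \gamma\log W_0,\end{cases} \qquad (11)$$

which can be rewritten as

$$\begin{cases}\dfrac{\partial v}{\partial t} + \dfrac12\mathrm{tr}[\lambda\lambda^*D^2v] + \beta_\gamma^*Dv + \dfrac12(Dv)^*\lambda N^{-1}\lambda^*Dv - U_\gamma = 0,\\[4pt] v(T,x) = \gamma\log W_0.\end{cases} \qquad (12)$$

Here $U_\gamma = -\gamma\big\{\frac{1}{2(1-\gamma)}\hat\alpha^*(\sigma\sigma^*)^{-1}\hat\alpha + r(x)\big\}$, $\beta_\gamma = \beta + \frac{\gamma}{1-\gamma}\lambda\sigma^*(\sigma\sigma^*)^{-1}\hat\alpha$, and $N^{-1} = I + \frac{\gamma}{1-\gamma}\sigma^*(\sigma\sigma^*)^{-1}\sigma$. Under suitable conditions H-J-B equation (12) has a solution with sufficient regularities (Bensoussan et al. 1998; Nagai 2003).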
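Moreover, the identification

$$v(0,x;T) = v(0,x) = \hat v(0,x) \qquad (13)$$

of the solution $v$ of (12) with the value function $\hat v$ of problem (9) can be verified. Further, $\hat h(t,X_t) = \frac{1}{1-\gamma}(\sigma\sigma^*)^{-1}\{\hat\alpha(X_t) + \sigma\lambda^*Dv(t,X_t)\}$ is the optimal investment strategy for problem (9) (Nagai 2003).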
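
As a worked step, the minimizer inside the H-J-B equation (11) can be obtained from a first-order condition. The sketch below assumes that $\sigma\sigma^*$ is invertible, that $\gamma < 0$ (so the bracket is convex in $h$), and that $\eta(x,h) = -h^*\hat\alpha + \frac{1-\gamma}{2}h^*\sigma\sigma^*h - r$; it is meant only to show where the strategy formula and the coefficients of (12) come from.

```latex
\begin{aligned}
0 &= \frac{\partial}{\partial h}\Big\{[\beta + \gamma\lambda\sigma^* h]^* Dv - \gamma\,\eta(x,h)\Big\}
   = \gamma\,\sigma\lambda^* Dv + \gamma\,\hat\alpha - \gamma(1-\gamma)\,\sigma\sigma^* h \\
\Longrightarrow\quad \hat h &= \frac{1}{1-\gamma}\,(\sigma\sigma^*)^{-1}\big(\hat\alpha + \sigma\lambda^* Dv\big), \\
\inf_h\{\cdots\} &= \beta^* Dv + \gamma r
   + \frac{\gamma}{2(1-\gamma)}\big(\hat\alpha + \sigma\lambda^* Dv\big)^*(\sigma\sigma^*)^{-1}\big(\hat\alpha + \sigma\lambda^* Dv\big).
\end{aligned}
```

Expanding the quadratic form and collecting the terms that are linear and quadratic in $Dv$ gives the drift $\beta + \frac{\gamma}{1-\gamma}\lambda\sigma^*(\sigma\sigma^*)^{-1}\hat\alpha$ and the matrix $I + \frac{\gamma}{1-\gamma}\sigma^*(\sigma\sigma^*)^{-1}\sigma$, while the terms free of $Dv$ form the zeroth-order coefficient of (12).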

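A typical example is the linear Gaussian model, in which $r(x) = r$, $\alpha(x) = Ax + a$, $\sigma(x) = \Sigma$, $\beta(x) = Bx + b$, $\lambda(x) = \Lambda$, where $A$, $B$, $\Sigma$, $\Lambda$ are constant matrices; $a$, $b$ are constant vectors; and $r$ is a constant. Then the solution to (12) has the explicit representation $v(t,x) = \frac12 x^*P(t)x + q(t)^*x + k(t)$, where $P(t)$ is the negative semi-definite solution to the Riccati equation

$$\dot P(t) + P(t)\Lambda N^{-1}\Lambda^*P(t) + K_1^*P(t) + P(t)K_1 + \frac{\gamma}{1-\gamma}A^*(\Sigma\Sigma^*)^{-1}A = 0, \quad P(T) = 0, \qquad (14)$$

and $q(t)$, $k(t)$ are, respectively, the solutions to

$$\dot q(t) + (K_1 + \Lambda N^{-1}\Lambda^*P(t))^*q(t) + P(t)b + \frac{\gamma}{1-\gamma}(A^* + P(t)\Lambda\Sigma^*)(\Sigma\Sigma^*)^{-1}\hat a = 0, \quad q(T) = 0, \qquad (15)$$

$$\dot k(t) + \frac12\mathrm{tr}[\Lambda\Lambda^*P(t)] + \frac12 q(t)^*\Lambda\Lambda^*q(t) + b^*q(t) + \gamma r + \frac{\gamma}{2(1-\gamma)}(\hat a + \Sigma\Lambda^*q(t))^*(\Sigma\Sigma^*)^{-1}(\hat a + \Sigma\Lambda^*q(t)) = 0, \quad k(T) = \gamma\log W_0, \qquad (16)$$

where $K_1 := B + \frac{\gamma}{1-\gamma}\Lambda\Sigma^*(\Sigma\Sigma^*)^{-1}A$, $\hat a = a - r\mathbf{1}$, and $N^{-1} := I + \frac{\gamma}{1-\gamma}\Sigma^*(\Sigma\Sigma^*)^{-1}\Sigma$. In this case the optimal strategy has a more explicit form: $\hat h_t = \frac{1}{1-\gamma}(\Sigma\Sigma^*)^{-1}[\hat a + AX_t] + \frac{1}{1-\gamma}(\Sigma\Sigma^*)^{-1}[\Sigma\Lambda^*q(t) + \Sigma\Lambda^*P(t)X_t]$ (cf.
Davis and Lleo 2008; Kuroda and Nagai 2002).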
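
To make the Riccati equation concrete, the following is a minimal sketch for the scalar case (one factor, one risky asset), assuming the equation reads $\dot P + cP^2 + 2K_1P + \theta A^2/(\Sigma\Sigma^*) = 0$ with $\theta = \gamma/(1-\gamma)$, $c = \Lambda\Lambda^* + \theta(\Lambda\Sigma^*)^2/(\Sigma\Sigma^*)$ and $K_1 = B + \theta\Lambda\Sigma^*A/(\Sigma\Sigma^*)$. The function name, the argument names, and the explicit Euler scheme are illustrative choices, not part of the cited works.

```python
def riccati_scalar_p0(gamma, A, B, sigma_sq, lam_sq, lam_sigma, T, steps=10_000):
    """Integrate the scalar Riccati equation backward from P(T) = 0 and return P(0).

    Hypothetical scalar stand-ins (assumptions of this sketch):
      sigma_sq  ~ Sigma Sigma^*,  lam_sq ~ Lambda Lambda^*,  lam_sigma ~ Lambda Sigma^*.
    """
    theta = gamma / (1.0 - gamma)
    k1 = B + theta * lam_sigma * A / sigma_sq        # K_1
    c = lam_sq + theta * lam_sigma ** 2 / sigma_sq   # Lambda N^{-1} Lambda^*
    d = theta * A ** 2 / sigma_sq                    # constant term
    dt = T / steps
    p = 0.0                                          # terminal condition P(T) = 0
    for _ in range(steps):
        p_dot = -(c * p * p + 2.0 * k1 * p + d)      # dP/dt from the Riccati equation
        p -= dt * p_dot                              # explicit Euler step from t to t - dt
    return p
```

In the risk-averse case $\gamma < 0$ the constant term is negative, so the computed $P(0)$ is negative, in line with the negative semi-definiteness stated above; with $A = 0$ the solution stays at zero.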

The economic factor $X_t$ may be more suitably considered to be unobservable, and then the problem should be formulated as a risk-sensitive stochastic control problem under partial information. Indeed, one can formulate the problem by regarding the log prices $Y^i_t := \log S^i_t$, $i = 0, 1, 2, \dots, m$, as the observable quantities and the economic factor $X_t$ as the unobservable system process. For linear Gaussian models and hidden Markov models, the problems are reduced to ones of full information by obtaining the relevant controlled dynamics in finite dimensions through deducing the filtering equation by the methods of change of measure (Nagai 1999; Nagai and Runggaldier 2008). Further, one can obtain the explicit form of the optimal strategy, which is measurable with respect to $\mathcal{G}^S_t = \sigma(S(u), u \le t)$, in the case of the linear Gaussian model (Nagai 1999), in parallel to the result above.

Linear Gaussian models for $0 < \gamma < 1$ are extensively studied in Fleming and Sheu (1999, 2002). In that case, one considers the problems

$$\sup_h\log E[e^{\gamma\log W_T(h)}], \qquad (17)$$

or

$$\hat\chi(\gamma) = \sup_h\lim_{T\to\infty}\frac1T\log E[e^{\gamma\log W_T(h)}]. \qquad (18)$$

If $\gamma > 0$ is small, there is a stationary solution of (14) and the verification theorem holds for the problem on infinite horizon (and likewise for the problem on a finite time horizon). Further, under some conditions there is a threshold $\bar\gamma$ such that $\hat\chi(\gamma) = \infty$ for $\bar\gamma < \gamma$. Knowing the size of $\bar\gamma$ explicitly is important, but so far this has been possible only in the one-dimensional case.

Problems on Infinite Horizon 

The value for the infinite time horizon counterpart of (9) is defined as

$$\hat\chi(\gamma) = \inf_{h\in\mathcal{A}}\chi(h;\gamma), \qquad \chi(h;\gamma) = \lim_{T\to\infty}\frac1T\log E[e^{\gamma\log W_T(h)}], \qquad (19)$$

when suitably setting the set $\mathcal{A}$ of admissible strategies. The corresponding H-J-B equation of ergodic type for the problem is seen to be

$$\chi(\gamma) = \frac12\mathrm{tr}[\lambda\lambda^*D^2w] + \beta_\gamma^*Dw + \frac12(Dw)^*\lambda N^{-1}\lambda^*Dw - U_\gamma. \qquad (20)$$

However, when setting $\mathcal{A} = \{h : h|_{[0,T]} \in \mathcal{A}(T), \forall T\}$, identification of the value $\hat\chi(\gamma)$ with the solution $\chi(\gamma)$ to the H-J-B equation (20) cannot be seen in general. Indeed, even in the case of the linear Gaussian model, such identification does not always hold (Fleming and Sheu 1999; Kuroda and Nagai 2002; Nagai 2003) if $\gamma < 0$. Instead, introduce the asymptotic value

$$\tilde\chi(\gamma) = \lim_{T\to\infty}\frac1T v(0,x;T).$$

Then we can see that $\chi(\gamma) = \tilde\chi(\gamma)$ under sufficiently general conditions (cf. Hata et al. 2010; Nagai 2012).

In the case of the linear Gaussian model, the solution to the H-J-B equation of ergodic type is given by $w(x) = \frac12 x^*Px + q^*x$ with the stationary solutions $P$ of (14) and $q$ of (15), and if

$$P\Lambda\Sigma^*(\Sigma\Sigma^*)^{-1}\Sigma\Lambda^*P < A^*(\Sigma\Sigma^*)^{-1}A$$

holds, then one can see that $\hat\chi(\gamma) = \chi(\gamma)$ (Kuroda and Nagai 2002). Further, the optimal strategy is given by $\hat h_t = \hat h(X_t)$, with $\hat h(x) = \frac{1}{1-\gamma}(\Sigma\Sigma^*)^{-1}[\hat a + \Sigma\Lambda^*q + (A + \Sigma\Lambda^*P)x]$ (Kuroda and Nagai 2002). The decomposition

$$\hat h_t = \frac{1}{1-\gamma}h^1_t + \frac{1}{1-\gamma}(\Sigma\Sigma^*)^{-1}[\Sigma\Lambda^*q + \Sigma\Lambda^*PX_t], \qquad h^1_t := (\Sigma\Sigma^*)^{-1}[\hat a + AX_t],$$

is regarded as a generalization of Merton's Mutual Funds Theorem (Davis and Lleo 2008; Merton 1990). Here $h^1_t$ is a log utility portfolio (Kelly portfolio) (Kelly 1956). See also Nagai and Peng (2002) concerning the partial information counterparts of the results in Kuroda and Nagai (2002).

In relation to the above problems in mathematical finance, a new kind of problem arises, namely studying

$$J(\kappa) = \lim_{T\to\infty}\frac1T\inf_{h\in\mathcal{A}(T)}\log P(\log W_T(h) \le \kappa T) \qquad (21)$$

for a given constant $\kappa$; it is called "downside risk minimization." The problem is regarded as "large deviation control" and can be discussed as the dual to risk-sensitive asset management (19) in the risk-averse case $\gamma < 0$ (Hata et al. 2010; Nagai 2011, 2012). Indeed, we obtain

$$J(\kappa) = -\inf_{k\in(-\infty,\kappa]}\sup_{\gamma<0}\{\gamma k - \hat\chi(\gamma)\}.$$

Further, an asymptotically optimal strategy is given as follows. For given $\kappa$, take $\gamma(\kappa)$ which attains the supremum in $\sup_{\gamma<0}\{\gamma\kappa - \hat\chi(\gamma)\}$; then the optimal strategy $\hat h(t,X_t)$, $0 \le t \le T$, for problem (9) with $\gamma = \gamma(\kappa)$ forms an asymptotically optimal strategy for (21). Historically, studies of "upside maximization" concerning

$$J(\kappa) = \sup_{h\in\mathcal{A}}\lim_{T\to\infty}\frac1T\log P(\log W_T(h) \ge \kappa T)$$

came first (cf. Pham 2003), and the duality relationship between this and (18) was discussed. Further study of this problem is hindered by the difficulty of determining the size of $\bar\gamma$ (cf. Fleming and Sheu 1999, 2002).


Cross-References 

► Credit Risk Modeling 

► Financial Markets Modeling 

► Investment-Consumption Modeling 

► Option Games: The Interface Between Optimal 
Stopping and Game Theory 
Abstract

Robotic grasping is the process of establishing 
a physical connection between the robot (or an 
appendage of the robot called the gripper) and 
an external object in such a way that the robot 
can exert forces and torques on the object. Grasp 
control requires the satisfaction of contact con¬ 
straints, of which two types are considered. Form 
constraints specify geometric configurations of 
the gripper that bring it into contact with the 
object to be grasped. This article is principally 
concerned with force constraints and force clo¬ 
sure that specify forces exerted on the object 
that are sufficient to lift, move, or otherwise 
manipulate it. 


Keywords 

Force constraints; Force closure; Grasp con¬ 
straints; Grasp matrix; Hand Jacobian; Twists; 
Wrenches 


Introduction 

Grasp control refers to the art of controlling the 
motion of an object by constraining its dynamics 
through contacts with a hand. The process of con¬ 
trolling the grasp is not limited to robotic hands 
only but also applies to human hands (Johansson 
and Edin 1991) and to all other mechanisms using 
contact constraints to control the motion of the 
manipulated object (Brost and Goldberg 1996). 

A crucial role in the control of grasping is 
played by contact constraints. All the interactions 
between the robotic hand and the grasped object 
occur at the contacts whose understanding is 
paramount (Salisbury and Roth 1983). The uni¬ 
lateral nature of contact interaction in grasping 


makes the control problems much more challeng¬ 
ing than cooperative manipulation where multiple 
arms hold the object rigidly allowing bilateral 
force transmission at each contact point (Chiac- 
chio et al. 1991). 

The importance of unilateral contact con¬ 
straints in grasping led a large part of the 
literature to focus on the closure properties 
of the grasp (Bicchi 1995). Those properties 
refer to the ability of a grasp to prevent 
motions of the grasped object relying only 
on unilateral frictionless constraints in case 
of form closure (Reuleaux 1876) and on 
contact constraints with friction in case of force 
closure (Nguyen 1988). While form closure is 
a purely geometric property of the grasp and 
depends on where the unilateral contact points 
are on the object, force closure depends on the 
ability that the robotic hand has to resist and 
apply forces to the object through the contacts 
while satisfying the friction constraints. In other 
terms force closure directly involves the control 
of the robotic hand kinematics and not only the 
geometry of the contacts (Bicchi 1995). This 
entry focuses on force-closed grasps. 

The optimal choice of the contact points on the object surface is a critical issue known as grasp planning. Among the many optimality criteria that have been proposed in the literature to choose the contact points, a notable one is that of Ferrari and Canny (1992), where the grasping configuration is evaluated according to the magnitude of the largest worst-case disturbance wrench that can be resisted by the grasp.

Many approaches have been studied in the 
literature on grasp planning in the presence of 
uncertainties. The uncertainty can be either due to 
the shape of the object which is partially known 
or partially sensed as in Goldfeder et al. (2009) 
or due to the errors in positioning the fingers on 
the object during the grasping (Roa and Suarez 
2009). In what follows all the parameters of the 
grasp including those related to the hand, the 
object, and the contact points are assumed to be 
known with no uncertainties. 

The main objective of grasp control is that of 
tracking a desired trajectory with the grasped 
object by applying a set of contact forces 
satisfying the friction constraints (Bicchi and
Kumar 2000). Complex in-hand object motions 
can be obtained by rolling and sliding the contact 
points on the object surface as proposed in 
Montana (1988) or by using finger gaiting to 
get large-scale motions (Han and Trinkle 1998). 
This entry deals with non-rolling and non-sliding 
contact points and summarizes the fundamental 
theory of computed-torque control for object 
trajectory and internal force control proposed 
in Li et al. (1989). For a comprehensive review of 
the theory of grasping and its control, the reader 
is referred to Murray et al. (1994), Shimoga 
(1996), Okamura et al. (2000), Bicchi and Kumar 
(2000), and Prattichizzo and Trinkle (2008). 

Contact and Grasp Model 

Notations and definitions on grasping are taken from Prattichizzo and Trinkle (2008). Refer to Fig. 1 and let $\{N\}$ represent the inertial frame fixed to the palm of the robotic hand. Let $u = [p^T, \phi^T]^T \in \mathbb{R}^6$ denote the vector describing the position and orientation of frame $\{B\}$, fixed to the object, relative to $\{N\}$. Vector $\phi$ expresses the Euler angles, the pitch-roll-yaw variables, or the exponential coordinates parameterizing $SO(3)$. Denote by $\nu = [v^T\ \omega^T]^T \in \mathbb{R}^6$ the twist of the object. It is worth noting that $\nu$ is not equal to $\dot u$ but satisfies $\nu = U(u)\dot u$, where matrix $U \in \mathbb{R}^{6\times6}$ is such that $UU^T$ is the identity matrix, and the dot over a variable denotes differentiation with respect to time (Murray et al. 1994). The joint variables of the robotic hand are defined by $q = [q_1 \dots q_{n_q}]^T \in \mathbb{R}^{n_q}$. Let $n_c$ be the number of contact points. The position of contact point $i$ in $\{N\}$ is defined by the vector $c_i \in \mathbb{R}^3$, and the contact point frame $\{C\}_i$ has axes $\{\hat n_i, \hat t_i, \hat o_i\}$, where the unit vector $\hat n_i$ is normal to the tangent plane at the contact and directed toward the object, while the other two unit vectors are orthogonal and lie in the tangent plane.

Two matrices are of utmost importance in grasp analysis: the grasp matrix $G$ and the hand Jacobian $J$. These two matrices are computed using the complete grasp matrix, the complete Jacobian, and the contact selection matrix, which are defined as follows: the transpose of the complete grasp matrix $\tilde G^T \in \mathbb{R}^{6n_c\times6}$ maps the object twist to the $n_c$ twist vectors of the contact frames $\{C\}_i$ regarded as attached to the object, $\nu_{c,obj} = \tilde G^T\nu$, while the complete hand Jacobian matrix $\tilde J \in \mathbb{R}^{6n_c\times n_q}$ maps the joint velocities to the twists of the contact frames regarded as attached to the hand, $\nu_{c,hnd} = \tilde J\dot q$.

Robot Grasp Control, Fig. 1 A two-fingered hand grasping an object

When a contact occurs between the hand and the object, assuming no sliding, some components of the relative contact twist between the object and the hand are set to zero according to the contact model used. In this entry the hard-finger (HF) and the soft-finger (SF) contact models are considered (Mason and Salisbury 1985). Those components are selected by the contact selection matrix $H$, which selects $m$ components of the relative contact twists for all the contacts and sets them to zero: $H(\nu_{c,hnd} - \nu_{c,obj}) = 0$. For more details on how to compute the contact selection matrix, the reader is referred to Prattichizzo and Trinkle (2008). Then the following contact constraint equation is obtained:


$$\begin{bmatrix} J & -G^T \end{bmatrix}\begin{bmatrix}\dot q \\ \nu\end{bmatrix} = 0 \qquad (1)$$

where the transpose of the grasp matrix and the hand Jacobian are finally defined by multiplying the contact selection matrix by the transpose of the complete grasp matrix and by the complete hand Jacobian, respectively:

$$G^T = H\tilde G^T \in \mathbb{R}^{m\times6}, \qquad J = H\tilde J \in \mathbb{R}^{m\times n_q}.$$

In the force domain, the wrenches that the hand applies to the object at the contact points are collected in the vector $\lambda$. Correspondingly, on the hand, a force vector $-\lambda$, opposite to the preceding one, is applied by the object through the contact points. At each contact point, the contact wrenches have components only along the directions constrained by the contact model. Furthermore, contact force components must satisfy the friction constraints (see section "Force Closure and Grasp Control"). More specifically, the $m$-dimensional vector $\lambda = [\lambda_1^T \dots \lambda_{n_c}^T]^T$ contains the contact wrench components applied to the object through the $n_c$ contacts, where the wrench at contact $i$ is defined, for the different contact models considered here, as $\lambda_i = [f_{in}\ f_{it}\ f_{io}]^T$ for the HF contact model and $\lambda_i = [f_{in}\ f_{it}\ f_{io}\ m_{in}]^T$ for the SF contact model. The subscripts indicate one normal ($n$) and two tangential ($t$, $o$) components of contact force $f_i$ and moment $m_i$ at contact $i$.

In terms of forces, the grasp matrix maps the transmitted contact wrenches $\lambda$ to the set of wrenches that the hand can apply to the object, $G\lambda$, and the transpose of the hand Jacobian maps the contact forces $-\lambda$ to the corresponding vector of joint loads, $-J^T\lambda$.

Grouping all the noncontact wrenches applied to the object in $g \in \mathbb{R}^6$ and all the noncontact contributions to the joint loads of the robotic hand in $\tau \in \mathbb{R}^{n_q}$, the rigid-body dynamic equations of the whole system, consisting of the hand and of the grasped object, are

$$M_{obj}(u)\dot\nu + N_{obj}(u,\nu) = G\lambda + g$$
$$M_{hnd}(q)\ddot q + N_{hnd}(q,\dot q) = -J^T\lambda + \tau$$

where $M_{obj}(\cdot)$ and $M_{hnd}(\cdot)$ are symmetric, positive definite inertia matrices and $N_{obj}(\cdot,\cdot)$ and $N_{hnd}(\cdot,\cdot)$ are the velocity-product terms for the object and the hand, respectively. For the sake of simplicity, the gravity terms are disregarded.

The dynamics of the hand and object are not independent but depend on the kinematic constraints imposed by the contact model (1):

$$\begin{bmatrix} G \\ -J^T \end{bmatrix}\lambda = -\begin{bmatrix}\tilde g \\ \tilde\tau\end{bmatrix} \qquad (2)$$

subject to $J\dot q = G^T\nu$

where

$$\tilde g = g - M_{obj}(u)\dot\nu - N_{obj}(u,\nu), \qquad \tilde\tau = \tau - M_{hnd}(q)\ddot q - N_{hnd}(q,\dot q).$$

It is worth underlining that the dynamics can be disregarded for slow motions of the hand and of the object, while they become very relevant in applications with high-speed grasping and manipulation, as discussed in Namiki et al. (2003).
Controllable Wrenches and Twists

From the first equation in (2), to impose any motion on the object by contact forces, the grasp matrix $G$ must be full row rank, i.e., $\mathrm{rank}(G) = 6$, which is equivalent to having a trivial null space of $G^T$, i.e., $\mathcal{N}(G^T) = 0$. This is an important property of the grasp, which has been referred to as non-indeterminate in Prattichizzo and Trinkle (2008) to reflect the idea that the contacts on the object are placed in such a way that there are no twists of the object that are not controllable by contact wrenches.

However, this condition depends only on the contacts on the object and does not consider the role of the hand kinematics, which comes from the second equation in (2) and from the contact constraint. Under the simplifying assumption that $\mathcal{N}(J^T) = 0$, referred to as non-defective grasp in Prattichizzo and Trinkle (2008), it is simple to verify that, for any given contact wrench $\lambda$, a control torque $\tau$ exists which is able to apply the given contact wrench. The mechanical interpretation of this assumption is that when $\mathcal{N}(J^T) = 0$, there are no contact forces resisted by the robotic hand constraints, i.e., with zero joint load. The simplifying assumption of non-defective grasps ensures that $\mathcal{N}(J^T) \cap \mathcal{N}(G) = 0$, which is a necessary condition to determine the contact force $\lambda$ from the rigid-body equation (2), as shown in Prattichizzo and Trinkle (2008).

If a grasp is non-defective, it means that each 
finger of the robotic hand involved in the contact 
with the object must have a number of joints 
sufficient to control all the components of the 
contact wrench. For example, in the case of two 
HF contact points occurring at the fingertips of a 
two-fingered robotic hand, each finger must have 
at least three joints and must be in a non-singular 
configuration. 

This entry does not consider whole-hand or power grasps which, unlike fingertip grasps, exploit the whole surface of the fingers, including the palm, to constrain the object. The analysis of controllable wrenches and twists for whole-arm grasps, taking into account the hand and object dynamics, can be found in Prattichizzo and Bicchi (1997).


Force Closure and Grasp Control 

The dynamic formulation of the grasp with the contact kinematic constraints given in (2) holds only if the contact forces satisfy the friction law, which imposes constraints on the components of the contact force and moment. Limiting the analysis to HF contact models, the Coulomb friction law requires that the components of the contact force $\lambda_i$ at the $i$-th contact lie inside the friction cone $\mathcal{F}_i$:

$$\mathcal{F}_i = \Big\{(f_{in}, f_{it}, f_{io}) \;\Big|\; \sqrt{f_{it}^2 + f_{io}^2} \le \mu_i f_{in}\Big\} \qquad (3)$$

where $\mu_i$ represents the friction coefficient at the $i$-th contact. Extending to all contact points, $\lambda$ is constrained to lie in $\mathcal{F}$, where $\mathcal{F}$ is the generalized friction cone defined as $\mathcal{F} = \mathcal{F}_1 \times \cdots \times \mathcal{F}_{n_c} = \{\lambda \in \mathbb{R}^m \mid \lambda_i \in \mathcal{F}_i,\ i = 1, \dots, n_c\}$.
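
The friction cone test in (3) is simple to state in code. The following is a minimal sketch for hard-finger contacts; the function names and the representation of each contact force as a tuple `(f_n, f_t, f_o)` are illustrative assumptions.

```python
import math


def in_friction_cone(f_n, f_t, f_o, mu):
    """Return True if a hard-finger contact force satisfies the Coulomb condition (3)."""
    # Tangential magnitude must not exceed mu times the normal component.
    return math.hypot(f_t, f_o) <= mu * f_n


def in_generalized_cone(contact_forces, mus):
    """Check every contact: contact_forces is a list of (f_n, f_t, f_o), mus one coefficient each."""
    return all(in_friction_cone(f_n, f_t, f_o, mu)
               for (f_n, f_t, f_o), mu in zip(contact_forces, mus))
```

Note that a negative normal component (a contact that would have to pull on the object) always fails the test, which reflects the unilateral nature of the contacts.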

While grasping an object, the applied contact forces must be consistent with the friction constraints. This is not straightforward for grasp control and requires exploiting the beneficial characteristics of the internal forces. From the object dynamics in (2), for a given $\tilde g$, one gets

$$\lambda = -G^+\tilde g + N(G)\gamma \qquad (4)$$

where $G^+$ denotes the generalized inverse of the grasp matrix, $N(G)$ denotes a matrix whose columns form a basis for $\mathcal{N}(G)$, and $\gamma$ is a vector parameterizing the solution set. The contact force $\lambda$ consists of a particular solution balancing the $\tilde g$ term and of a homogeneous solution belonging to the null space of the grasp matrix.

In general, the particular solution $-G^+\tilde g$ does not satisfy the friction constraint (3) at all the contact points and needs the homogeneous solution $\lambda_h = N(G)\gamma$ to keep the contact forces within the friction cones. Contact forces $\lambda_h$ in $\mathcal{N}(G)$ are referred to as internal forces since they do not contribute to the object dynamics, i.e., $G\lambda_h = 0$. Instead, these forces affect the tightness of the grasp and play a crucial role in maintaining grasps that rely on friction. The existence of a nontrivial null space of the grasp matrix is a desirable property and has been referred to as graspability (Prattichizzo and Trinkle 2008).
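
The role of the internal force can be illustrated with a deliberately simple planar case, sketched below under stated assumptions: two hard-finger contacts on opposite sides of an object, contact normals pointing toward each other, and a load of magnitude `weight` acting along the tangential direction. The particular solution then puts half the load on each contact tangentially with zero normal force, which violates the friction cone; adding a squeezing (internal) force of equal magnitude at both contacts restores feasibility. All names are hypothetical.

```python
def min_squeeze(weight, mu):
    """Smallest internal (squeezing) force that keeps both contacts inside their cones."""
    f_tangential = abs(weight) / 2.0   # particular solution: each contact carries half the load
    return f_tangential / mu           # Coulomb: |f_t| <= mu * f_n  =>  f_n >= |f_t| / mu


def contact_forces(weight, squeeze):
    """(f_n, f_t) at each of the two contacts for a given internal force magnitude."""
    f_t = weight / 2.0
    return [(squeeze, f_t), (squeeze, f_t)]


def within_cones(forces, mu):
    """Planar Coulomb check applied to every contact."""
    return all(abs(f_t) <= mu * f_n for f_n, f_t in forces)
```

For example, with `weight = 10.0` and `mu = 0.5` the minimum squeeze is 10.0; any smaller internal force leaves the tangential load outside the cone, while the squeeze itself produces no net wrench on the object.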

Another relevant and desirable property of the grasp is frictional force closure, which means that for any noncontact wrench $g$, an internal force $\lambda_h$ exists such that the contact force $\lambda$ in (4) belongs to the generalized friction cone $\mathcal{F}$. In Murray et al. (1994) the authors state that a grasp has frictional form closure if and only if the grasp matrix is full row rank (non-indeterminate grasp) and there exists $\lambda_h$ such that $G\lambda_h = 0$ and $\lambda_h$ belongs to the interior of the generalized friction cone $\mathcal{F}$.

Grasp control is about using contact forces, which must satisfy the friction constraints, so that the object tracks a given trajectory. This is also referred to as dexterous manipulation (Bicchi and Kumar 2000). In Li et al. (1989), a computed-torque controller is proposed to track both the desired trajectory of the grasped object $u_{des}$ and the desired internal force $\lambda_{h,des}$. Under the additional simplifying assumption that the robotic hand Jacobian is invertible, i.e., there are no redundant motions of the fingers, the computed-torque control law

$$\tau = N_{hnd}(q,\dot q) + J^TG^+N_{obj}(u,\nu) - M_{hnd}J^{-1}\dot J\dot q + M_{ho}\dot U\dot u + M_{ho}U(\ddot u_{des} - K_v\dot e_u - K_ue_u) + J^T\Big(\lambda_{h,des} - K_s\int e_h\Big),$$

with $M_{ho} = M_{hnd}(q)J^{-1}G^T + J^TG^+M_{obj}$, guarantees that both the trajectory and the internal force errors

$$e_u = u - u_{des}, \qquad e_h = \lambda_h - \lambda_{h,des}$$

with respect to the desired object trajectory $u_{des}$ and internal force $\lambda_{h,des}$ converge to zero according to second- and first-order dynamics, respectively:

$$\ddot e_u + K_v\dot e_u + K_ue_u = 0, \qquad e_h + K_s\int e_h = 0,$$

where $K_v$, $K_u$, and $K_s$ are positive definite
matrices.
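
The second-order error dynamics can be checked numerically in the scalar case. The following sketch integrates $\ddot e_u + K_v\dot e_u + K_ue_u = 0$ with scalar gains; the function name, the integration scheme, and the example gains are illustrative assumptions and are not taken from Li et al. (1989).

```python
def simulate_tracking_error(k_v, k_u, e0, de0, t_end=10.0, dt=1e-3):
    """Integrate e'' + k_v e' + k_u e = 0 with semi-implicit Euler; return e(t_end)."""
    e, de = e0, de0
    for _ in range(int(round(t_end / dt))):
        dde = -k_v * de - k_u * e   # acceleration of the error from the closed-loop equation
        de += dt * dde              # update the error rate first
        e += dt * de                # then the error, using the updated rate
    return e
```

With positive gains such as `k_v = 4.0` and `k_u = 4.0` (a critically damped choice), an initial error of 1.0 decays to a negligible value within a few seconds.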


The computed-torque controller proposed 
in Li et al. (1989) guarantees only that the desired 
object trajectory and the desired internal forces 
are asymptotically tracked, but it does not ensure 
the non-violation of friction constraints by the 
contact forces. To guarantee that the contact force 
vectors remain in the friction cone during the 
manipulation, a force distribution problem must 
be solved at each time instant. The force closure 
assumption ensures that a solution exists that sat¬ 
isfies the friction constraints during the manipula¬ 
tion. This solution, which becomes the reference 
for the internal force control, can be found with 
an efficient algorithm, based on the minimization 
of a convex function that checks the force closure 
property at each time instant (Bicchi 1995). 

Summary and Future Directions 

The basic foundation of grasp control has been
reviewed with particular attention to the modeling
of contact constraints, force closure, and control
of object motion and internal forces. This entry 
did not explicitly address grasp stability that 
is often equated to grasp closure, because all 
external forces can be balanced by the hand. A 
more formal analysis of grasp stability in terms 
of deflection from an equilibrium point has been 
proposed for hands with general kinematics in 
Jen et al. (1996). 

The computed-torque control is a classical 
approach to the grasp control. For a deeper study 
of other approaches to grasp control based on pas¬ 
sivity theory, the reader is referred to Wimboeck 
et al. (2011). 

Recent developments in underactuated robotic hands (Birglen et al. 2008) have led to a renewed interest in grasp control. Designing hands with fewer actuators has many advantages in terms of robustness and reliability but dramatically reduces the dexterous manipulation abilities, which can be recovered only by designing new control algorithms (Prattichizzo et al. 2013).

Cross-References 

► Force Control in Robotics 

► Parallel Robots 
► Robot Visual Control

► Walking Robots 


Recommended Reading 

Grasp synthesis and dexterous manipulation 
are important research topics. Grasp synthesis 
is the problem of choosing the posture of the 
hand and contact point locations to optimize 
a grasp quality metric. One of the first studies 
of grasp synthesis for multi-fingered hands was 
undertaken in Jameson (1985) where the author 
proposed a Levenberg-Marquardt algorithm 
to search the surface of an object for the 
locations of three points that would achieve force 
closure. Since this work, many other metrics 
and approaches to searching for high-quality 
grasps have been implemented as discussed in 
Nguyen (1988), Pollard (1997), Park and Starr 
(1992), Chen and Burdick (1993), and references 
therein. 

Dexterous manipulation is the capability of 
manipulating an object so as to arbitrarily steer 
its configuration in space. Research on dexter¬ 
ous manipulation first appeared in Hanafusa and 
Asada (1979) where the authors developed a plan 
to turn a nut onto a bolt. Since then a progression 
of increasingly complex manipulation tasks have 
been studied to varying degrees of detail. For 
the planar case the reader is referred to Mason 
(1982), Brost (1991), Peshkin and Sanderson 
(1988), Lynch (1996), and references therein. 
Several approaches to planning and executing dexterous manipulation tasks in three dimensions have been proposed in Cherif and Gupta (1999), Han et al. (2000), and Higashimori et al. (2007). Dexterous manipulation can be evaluated
with manipulability ellipsoids of velocity and 
force as proposed in Chiacchio et al. (1991) 
for multiple-fingered systems and more recently 
in Prattichizzo et al. (2012) for underactuated 
robotic hands. 
Abstract

The motion control problem for robots, both for 
manipulator arms and for wheeled mobile robots, 
is to determine a time sequence of control inputs 
to achieve a desired motion, or output, response. 
The control inputs are usually motor currents 
but can be translated into torques or velocities 
for the purpose of control design. The desired 
motion is typically given by a reference trajec¬ 
tory, consisting of positions and velocities that 
are generated from motion planning and trajec¬ 
tory generation algorithms designed to calculate 
collision-free paths, taking into account various 
kinematic and dynamic constraints on the robot. 
In this chapter we give an overview of some 
common control methods for motion control of 
robots, concentrating on the control of manipula¬ 
tor arms. 


Keywords 

Adaptive control; Feedback linearization; Inverse 
kinematics; Motion planning; Passivity-based 
control; PID control; Robust control 
Introduction

We consider the motion control problem for an $n$-degree-of-freedom robot manipulator, such as shown schematically in Fig. 1. The variables $\theta_1, \dots, \theta_n$ are the joint variables, which define the configuration of the robot at each instant of time.

A robot manipulator is fundamentally a posi¬ 
tioning device designed to move material, parts, 
tools, or specialized devices through variable 
programmed motions for the performance of 
a variety of tasks (Robot Institute of America, 
1980). Thus, manipulator tasks, such as materials 
transfer, welding, and painting, and even tasks 
involving the control of interaction forces, such 
as assembly or grinding, are performed through 
the coordination and control of the motion of the 
joints of the robot. 

A typical robot control architecture is shown 
in Fig. 2, which is designed to translate sensing 
into action , through motion planning, trajectory 
generation, and feedback control. In this entry we 
concentrate on the function of the controller. 


Motion Planning 

The desired joint motions are specified as reference trajectories (positions and velocities) generated from motion planning algorithms that must determine collision-free paths taking into account various kinematic and dynamic constraints on the robot (Lavelle 2006). A detailed discussion of motion planning is outside the scope of this entry. The motion planning problem begins by decomposing a given task into a discrete set of end-effector motions. A continuous path for the end-effector in the task space is then computed, taking into account issues of joint limits and collisions with objects in the workspace, including self-collisions.

Robot Motion Control, Fig. 1 Six-link robot manipulator

Finding optimal paths in configuration space 
is computationally complex, and methods 
have been developed to determine feasible, 
suboptimal paths using various methods such 
as artificial potential functions, grid search, and 
roadmaps (Lavelle 2006). 

Once a feasible path in task space is 
determined, a trajectory , which is a time- 
parameterized function in task space or 
configuration space, is computed. To compute 
configuration space or joint space trajectories 
from task space trajectories, the inverse 
kinematics of the manipulator is used. 


Trajectory Generation 

To simplify computation, joint-level trajectories 
are typically generated by calculating the inverse 
kinematics only at discrete points along the task 
space trajectory and then interpolating between 
these points. Two of the most common interpola¬ 
tion schemes utilize either polynomials in time or 
trapezoidal velocity profiles. 

For example, a cubic polynomial reference trajectory, $\theta_r(t)$, may be specified as

$$\theta_r(t) = a_0 + a_1t + a_2t^2 + a_3t^3$$

Robot Motion Control, Fig. 2 Control architecture for robot control

Robot Motion Control, Fig. 3 A cubic polynomial reference trajectory

Robot Motion Control, Fig. 4 Trapezoidal velocity profile

If the desired positions and velocities of the joint variable are specified at initial and final times, $t_0$ and $t_f$, respectively, it is a simple calculation to determine the four polynomial coefficients, $a_0, \dots, a_3$. The reference velocity and acceleration are then given by

$$\dot\theta_r(t) = a_1 + 2a_2t + 3a_3t^2$$

$$\ddot\theta_r(t) = 2a_2 + 6a_3t$$

A typical cubic polynomial trajectory is shown in
Fig. 3.
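
The "simple calculation" of the four coefficients can be sketched as follows, assuming for convenience that the initial time is $t_0 = 0$, so that the boundary conditions are imposed at $t = 0$ and $t = t_f$. The function and argument names are illustrative.

```python
def cubic_coefficients(theta0, dtheta0, thetaf, dthetaf, tf):
    """Coefficients a0..a3 of theta_r(t) = a0 + a1 t + a2 t^2 + a3 t^3 on [0, tf]."""
    delta = thetaf - theta0
    a0 = theta0                                              # position at t = 0
    a1 = dtheta0                                             # velocity at t = 0
    a2 = (3.0 * delta - (2.0 * dtheta0 + dthetaf) * tf) / tf ** 2
    a3 = (-2.0 * delta + (dtheta0 + dthetaf) * tf) / tf ** 3
    return a0, a1, a2, a3


def evaluate(coeffs, t):
    """Return reference position, velocity, and acceleration at time t."""
    a0, a1, a2, a3 = coeffs
    pos = a0 + a1 * t + a2 * t ** 2 + a3 * t ** 3
    vel = a1 + 2.0 * a2 * t + 3.0 * a3 * t ** 2
    acc = 2.0 * a2 + 6.0 * a3 * t
    return pos, vel, acc
```

For a rest-to-rest move from 0 to 1 in 2 s, `cubic_coefficients(0.0, 0.0, 1.0, 0.0, 2.0)` gives `(0.0, 0.0, 0.75, -0.25)`, and evaluating at `t = 2.0` recovers the final position 1 with zero velocity.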

A trapezoidal velocity profile is illustrated in Fig. 4. In this case, the velocity of the joint angle increases linearly to a maximum value, $V_{max}$, which remains constant for a period of time and
then decreases linearly.
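
A trapezoidal profile of this kind can be sketched as a piecewise function of time. The version below assumes a rest-to-rest move over a positive `distance`, equal acceleration and deceleration magnitudes `accel`, and a move long enough for the cruise phase to exist (`distance >= v_max**2 / accel`); these names and assumptions are illustrative.

```python
def trapezoidal_velocity(t, distance, v_max, accel):
    """Velocity at time t for a rest-to-rest move with a trapezoidal profile."""
    t_ramp = v_max / accel                  # duration of each linear ramp
    t_total = distance / v_max + t_ramp     # total move time (area under the profile = distance)
    if t < 0.0 or t > t_total:
        return 0.0                          # at rest before and after the move
    if t < t_ramp:
        return accel * t                    # speeding up linearly
    if t <= t_total - t_ramp:
        return v_max                        # cruising at the maximum value
    return accel * (t_total - t)            # slowing down linearly
```

With `distance = 2.0`, `v_max = 1.0`, and `accel = 2.0`, each ramp lasts 0.5 s and the whole move takes 2.5 s.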


Independent Joint Control 

The simplest approach to control design for 
a multi-degree-of-freedom manipulator is to 
treat each link of the robot as a single- 
input/single-output (SISO) system and design 
the controllers independently for each link. 
Proportional, integral, derivative (PID) control 
is the most common method employed in this 
case. This approach works well for highly geared 
manipulators moving at relatively low speeds, 
since the large gear reduction and low speed 
tend to reduce the coupling effects among the 
various links. More advanced linear or nonlinear 
control methods can be used to achieve higher 
performance at the expense of added complexity 
of the control system. 

The basic architecture of such a system, using 
a linear model to represent the dynamics of each 
joint of the robot, is shown in the frequency 
domain in Fig. 6. 

The control design objective is to choose the
compensator in such a way that the plant output
$\theta$ tracks or follows a desired output, given
by the reference signal, $\theta_r$. The control signal,
however, is not the only input acting on the
system. Disturbances, which are really inputs
that we do not control, also influence the behavior
of the output. Therefore, the controller must be
designed, in addition, so that the effects of the
disturbance, $D$, on the plant output are reduced. If
this is accomplished, the plant is said to reject the 
disturbances. The twin objectives of tracking and 
disturbance rejection are central to any control 
methodology. 

The plant transfer function, $P(s)$, represents
the dynamics of a single degree-of-freedom system,
typically inertia and damping,

$$P(s) = \frac{1}{Js^2 + Bs} \qquad (1)$$

and $C(s)$ is a PID compensator,

$$u(s) = \left(K_p + \frac{K_i}{s} + K_d s\right)\left(\theta_r(s) - \theta(s)\right) \qquad (2)$$

where $K_p$, $K_i$, $K_d$ are the proportional, integral,
and derivative gains, respectively, and $\theta_r(s) - \theta(s)$
is the tracking error between the reference
trajectory $\theta_r(s)$ and joint variable $\theta(s)$.


Set-Point Tracking 


If the reference trajectory $\theta_r$ is a constant set
point, then the closed-loop transfer function,
$T(s)$, from $\theta_r$ to $\theta$ (with $D = 0$) is

$$T(s) = \frac{P(s)C(s)}{1 + P(s)C(s)} = \frac{K_d s^2 + K_p s + K_i}{J s^3 + (B + K_d) s^2 + K_p s + K_i}$$

Applying the Routh-Hurwitz criterion, it follows
that the closed-loop system is stable if the gains
are positive and

$$K_i < \frac{(B + K_d) K_p}{J} \qquad (3)$$

In addition, the presence of the integral control
term guarantees zero steady-state error to a
constant disturbance term $D$.
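
The stability condition can be checked numerically with a few lines. The sketch below uses hypothetical function names and assumes $J > 0$ and $B \ge 0$; it simply evaluates inequality (3) for the characteristic polynomial $Js^3 + (B + K_d)s^2 + K_p s + K_i$.

```python
def integral_gain_limit(J, B, Kp, Kd):
    """Upper bound on Ki from inequality (3): Ki < (B + Kd) * Kp / J."""
    return (B + Kd) * Kp / J


def pid_loop_is_stable(J, B, Kp, Ki, Kd):
    """Routh-Hurwitz test for J s^3 + (B + Kd) s^2 + Kp s + Ki,
    under the text's condition that all three gains are positive."""
    gains_positive = Kp > 0 and Ki > 0 and Kd > 0
    return gains_positive and Ki < integral_gain_limit(J, B, Kp, Kd)
```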


Feedforward Control 

In order to track nonconstant reference signals,
such as a cubic polynomial trajectory or trapezoidal
velocity trajectory, a feedforward term
may be superimposed on the PID control signal as
shown in Fig. 5. Under the condition that the plant
$P(s)$ is minimum phase, the feedforward transfer
function $F(s)$ can be taken as $1/P(s)$, the inverse
of the plant. This guarantees asymptotic tracking
of any time-varying reference trajectory provided
the closed-loop transfer function is stable.
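
For the plant (1), $1/P(s) = Js^2 + Bs$, so the feedforward term is $J\ddot\theta_r + B\dot\theta_r$ in the time domain. The following sketch compares PID control with and without this term on a cubic reference. The numerical values, the explicit Euler integration, and the function name are illustrative assumptions, not taken from the entry.

```python
def simulate(use_feedforward, J=1.0, B=0.5, Kp=100.0, Ki=50.0, Kd=20.0,
             t_final=2.0, dt=1e-3):
    """Return the largest tracking error |theta_r - theta| over the move."""
    a2, a3 = 0.75, -0.25      # cubic from 0 to 1 rad in 2 s, zero end velocities
    theta, omega, integral, max_err = 0.0, 0.0, 0.0, 0.0
    steps = int(round(t_final / dt))
    for k in range(steps):
        t = k * dt
        ref = a2 * t**2 + a3 * t**3
        ref_vel = 2 * a2 * t + 3 * a3 * t**2
        ref_acc = 2 * a2 + 6 * a3 * t
        err = ref - theta
        integral += err * dt
        # PID compensator, Eq. (2)
        u = Kp * err + Ki * integral + Kd * (ref_vel - omega)
        if use_feedforward:
            # F(s) = 1/P(s) = J s^2 + B s applied to the reference
            u += J * ref_acc + B * ref_vel
        alpha = (u - B * omega) / J   # plant: J*theta'' + B*theta' = u
        theta += omega * dt           # explicit Euler step
        omega += alpha * dt
        max_err = max(max_err, abs(err))
    return max_err
```

Calling `simulate(True)` and `simulate(False)` gives the peak error with and without the feedforward term. In this discretized model the feedforward case still shows a small residual error caused by the integration step.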

PID control is, by far, the most common type
of control used in industry due to its simplicity.
The main problem in implementing PID control
is in the tuning, that is, in the choice of the
proportional, derivative, and integral gains. As
we see from the inequality (3), the magnitude of
the integral gain $K_i$ is limited by the stability
constraint. Therefore, one common design rule
of thumb is to first set $K_i = 0$ and design the
proportional and derivative gains, $K_p$ and $K_d$, to
achieve the desired transient behavior (rise time,
settling time, and so forth) and then to choose $K_i$
within the limits imposed by (3) to remove the
steady-state error.

Robot Motion Control, Fig. 5 Feedforward control architecture

Robot Motion Control, Fig. 6 Single-axis control
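
One way to apply this rule of thumb is sketched below. With $K_i = 0$ the closed-loop characteristic polynomial is $Js^2 + (B + K_d)s + K_p$, which can be matched to a desired natural frequency and damping ratio. The second-order specification, the `ki_fraction` safety factor, and the function name are assumptions made for illustration and are not prescribed by the entry.

```python
def tune_pid(J, B, omega_n, zeta=1.0, ki_fraction=0.1):
    """Pick Kp, Kd for s^2 + 2*zeta*omega_n*s + omega_n^2 (with Ki = 0),
    then set Ki to a fraction of the stability limit in inequality (3)."""
    if not 0.0 < ki_fraction < 1.0:
        raise ValueError("ki_fraction must lie strictly between 0 and 1")
    Kp = J * omega_n**2
    Kd = 2.0 * zeta * omega_n * J - B
    if Kd <= 0:
        raise ValueError("natural damping already exceeds the requested damping")
    Ki = ki_fraction * (B + Kd) * Kp / J   # strictly inside the bound (3)
    return Kp, Ki, Kd
```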


Advanced Control Methods 

Advanced control methods for robots generally
aim to take into account issues such as dynamic
coupling between joints; compliance in the joints
or links; uncertainty in the inertia parameters,
such as the masses and moments of inertia of the
links; and robustness to sensor noise and other
effects. A common model of the dynamics of
$n$-link, rigid robots, i.e., without consideration of
friction, elasticity in the joints or links, and other
effects, is given by the so-called Euler-Lagrange
equations

$$M(\theta)\ddot\theta + C(\theta, \dot\theta)\dot\theta + g(\theta) = \tau \qquad (4)$$

where $\theta = (\theta_1, \theta_2, \ldots, \theta_n)^T$ is the vector of
configuration (joint) variables as in Fig. 1. The
$n$-dimensional vectors, $\dot\theta$ and $\ddot\theta$, are then the
joint velocities and accelerations, respectively.
The $n \times n$ matrix, $M(\theta)$, is called the inertia
matrix. The vectors $C(\theta, \dot\theta)\dot\theta$ and $g(\theta)$ are Coriolis
and centrifugal forces and gravitational forces,
respectively.

Equation (4) is a system of $n$ coupled, nonlinear,
second-order equations and is, in fact, a
representation of Newton’s Second Law of Motion,
where the (generalized) forces acting on the joints
of the robot, $\tau - C(\theta, \dot\theta)\dot\theta - g(\theta)$, equate to the
mass times acceleration, given by $M(\theta)\ddot\theta$.

In this case, the control problem becomes one
of choosing the control input torque vector $\tau(t)$,
as a function of time, so that the solution,
$(\theta(t), \dot\theta(t))$, of Eq. (4) tracks a reference trajectory of
joint positions and velocities, $(\theta_r(t), \dot\theta_r(t))$.
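
As a concrete instance of Eq. (4), the sketch below evaluates $M$, $C$, and $g$ for a planar two-link arm and solves for the joint accelerations. The model is an assumption chosen for brevity: point masses at the link tips, the first angle measured from the horizontal, and the second angle measured relative to the first link. The function names and default parameters are hypothetical.

```python
import math


def two_link_terms(theta, dtheta, m1=1.0, m2=1.0, l1=1.0, l2=1.0, grav=9.81):
    """Return M(theta), C(theta, dtheta), g(theta) as nested lists."""
    t1, t2 = theta
    d1, d2 = dtheta
    c2, s2 = math.cos(t2), math.sin(t2)
    m12 = m2 * l2**2 + m2 * l1 * l2 * c2
    M = [[(m1 + m2) * l1**2 + m2 * l2**2 + 2 * m2 * l1 * l2 * c2, m12],
         [m12, m2 * l2**2]]
    h = -m2 * l1 * l2 * s2
    C = [[h * d2, h * (d1 + d2)],      # Coriolis/centrifugal matrix
         [-h * d1, 0.0]]
    g = [(m1 + m2) * grav * l1 * math.cos(t1) + m2 * grav * l2 * math.cos(t1 + t2),
         m2 * grav * l2 * math.cos(t1 + t2)]
    return M, C, g


def forward_dynamics(theta, dtheta, tau, **params):
    """Solve Eq. (4) for the joint accelerations given the torques tau."""
    M, C, g = two_link_terms(theta, dtheta, **params)
    rhs = [tau[i] - C[i][0] * dtheta[0] - C[i][1] * dtheta[1] - g[i]
           for i in range(2)]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]   # positive: M is positive definite
    return [(M[1][1] * rhs[0] - M[0][1] * rhs[1]) / det,
            (M[0][0] * rhs[1] - M[1][0] * rhs[0]) / det]
```

Applying the gravity torque `g` at rest yields zero acceleration, which is a quick sanity check on the sign conventions.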


Feedback Linearization Control 

An intuitive method of control for this system
is the method of feedback linearization, which
computes the input torque $\tau$ according to

$$\tau = M(\theta)a + C(\theta, \dot\theta)\dot\theta + g(\theta) \qquad (5)$$

$$a = \ddot\theta_r + K_d(\dot\theta_r - \dot\theta) + K_p(\theta_r - \theta) \qquad (6)$$

with $K_d$, $K_p$ matrices of appropriate velocity and
position error gains.

The control law given by Eqs. (5) and (6) 
is often referred to as the method of inverse 
dynamics although historically, the method of 
inverse dynamics control was implemented as a 
feedforward control 

$$\tau = M(\theta_r)a + C(\theta_r, \dot\theta_r)\dot\theta_r + g(\theta_r) \qquad (7)$$

$$a = \ddot\theta_r + K_d(\dot\theta_r - \dot\theta) + K_p(\theta_r - \theta) \qquad (8)$$

using the reference position and velocity in place
of the measured state. The primary reason for
implementing the inverse dynamics in this fashion
was the lack of sufficiently fast computation
to enable computation of the terms in Eq. (5) in
real time. The nonlinearities in Eq. (7) could be
precomputed offline and stored to facilitate real-time
implementation. With the advent of faster
computers, the feedback linearization control is
now feasible in real time.

Equations (5) and (6) form a so-called
inner-loop/outer-loop architecture (Fig. 7). The significance
of this architecture is that the nonlinear
inner-loop control term (5) results in a linear
system with input $a$ and output $\theta$: substituting
(5) into (4) gives $M(\theta)(\ddot\theta - a) = 0$, and since the
inertia matrix is invertible, $\ddot\theta = a$. The design of
the outer-loop control can then take advantage of
control methods for linear systems. In fact, the
control (6) in this case is simply a PD control with
feedforward acceleration.

The result of the controller (5) and (6) is a
closed-loop system in terms of the tracking error,
$e(t) = \theta(t) - \theta_r(t)$, that satisfies the linear
equation

$$\ddot e + K_d \dot e + K_p e = 0 \qquad (9)$$

and therefore, the tracking error converges
exponentially to zero for any given reference
trajectory.
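
The inner-loop/outer-loop structure can be illustrated on a single-link pendulum, for which $M = ml^2$, $C = 0$, and $g(\theta) = m g_0 l \sin\theta$. The plant, the gains, the sinusoidal reference, and the explicit Euler integration below are assumptions chosen to keep the sketch short. The model is also assumed to be known exactly, so the cancellation is perfect.

```python
import math


def simulate_computed_torque(Kp=25.0, Kd=10.0, m=1.0, l=1.0, grav=9.81,
                             theta0=0.5, t_final=5.0, dt=1e-3):
    """Track theta_r(t) = sin(t); return the errors e = theta - theta_r."""
    theta, omega = theta0, 0.0
    errors = []
    for k in range(int(round(t_final / dt))):
        t = k * dt
        ref, ref_vel, ref_acc = math.sin(t), math.cos(t), -math.sin(t)
        errors.append(theta - ref)
        # Outer loop, Eq. (6): PD control with feedforward acceleration
        a = ref_acc + Kd * (ref_vel - omega) + Kp * (ref - theta)
        # Inner loop, Eq. (5), with M = m*l^2, C = 0, g = m*grav*l*sin(theta)
        tau = m * l**2 * a + m * grav * l * math.sin(theta)
        # Plant, Eq. (4), integrated with explicit Euler
        alpha = (tau - m * grav * l * math.sin(theta)) / (m * l**2)
        theta += omega * dt
        omega += alpha * dt
    return errors
```

With these gains, Eq. (9) has a double pole at $s = -5$, so the initial error of 0.5 rad decays quickly. The small residual error is due to the integration step.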


Task Space Linearization 

The inner-loop/outer-loop control architecture
above can be modified to track trajectories
directly in the task space. Moreover, one can
achieve task space tracking by modifying
only the outer-loop control $a$ in Eq. (6) while
leaving the inner-loop control (5) unchanged. Let
$X \in \mathbb{R}^6$ represent the end-effector position and
orientation and let $X_r(t)$ be a reference trajectory
in task space. Since $X$ is a function of the joint
variables $\theta$, we have

$$\dot X = J(\theta)\dot\theta \qquad (10)$$

$$\ddot X = J(\theta)\ddot\theta + \dot J(\theta)\dot\theta \qquad (11)$$

where $J$ is the manipulator Jacobian. If we now
choose the outer-loop term $a$ according to

$$a = J^{-1}\{a_X - \dot J \dot\theta\} \qquad (12)$$

with

$$a_X = \ddot X_r - K_0(X - X_r) - K_1(\dot X - \dot X_r) \qquad (13)$$

we see that the result is a linear system in the task
space tracking error $\tilde X(t) = X(t) - X_r(t)$

$$\ddot{\tilde X} + K_1 \dot{\tilde X} + K_0 \tilde X = 0 \qquad (14)$$

Robot Motion Control, Fig. 7 Inner-loop/outer-loop control architecture

Therefore, a modification of the outer-loop control
achieves a linear and decoupled system directly
in the task space coordinates without the
need to compute a joint space trajectory and
without the need to modify the nonlinear
inner-loop control.

It is important to note that the above result 
is valid in the case of six degree-of-freedom 
manipulators when the Jacobian J is square and 
invertible. The case when the Jacobian is not 
invertible, for example, at kinematic singularities, 
or when the number of joints is not equal to the 
dimension of the task space is outside the scope 
of this entry. 
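
The outer-loop terms (12) and (13) can be sketched for a planar two-link arm, whose Jacobian is square (2 x 2) when the task space is the tip position. This reduced example stands in for the six degree-of-freedom case discussed above. The link lengths, the scalar gains `K0` and `K1`, and the function names are assumptions made for illustration.

```python
import math


def jacobian(theta, l1=1.0, l2=1.0):
    """Tip-position Jacobian of a planar two-link arm."""
    t1, t12 = theta[0], theta[0] + theta[1]
    return [[-l1 * math.sin(t1) - l2 * math.sin(t12), -l2 * math.sin(t12)],
            [l1 * math.cos(t1) + l2 * math.cos(t12), l2 * math.cos(t12)]]


def jacobian_dot(theta, dtheta, l1=1.0, l2=1.0):
    """Time derivative of the Jacobian along the current motion."""
    t1, t12 = theta[0], theta[0] + theta[1]
    w1, w12 = dtheta[0], dtheta[0] + dtheta[1]
    return [[-l1 * math.cos(t1) * w1 - l2 * math.cos(t12) * w12,
             -l2 * math.cos(t12) * w12],
            [-l1 * math.sin(t1) * w1 - l2 * math.sin(t12) * w12,
             -l2 * math.sin(t12) * w12]]


def outer_loop(theta, dtheta, X, dX, X_r, dX_r, ddX_r, K0, K1):
    """Eqs. (12)-(13): joint-space acceleration command a."""
    J, Jd = jacobian(theta), jacobian_dot(theta, dtheta)
    a_X = [ddX_r[i] - K0 * (X[i] - X_r[i]) - K1 * (dX[i] - dX_r[i])
           for i in range(2)]
    rhs = [a_X[i] - Jd[i][0] * dtheta[0] - Jd[i][1] * dtheta[1]
           for i in range(2)]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    if abs(det) < 1e-9:
        # Kinematic singularity: J is not invertible and Eq. (12) does not apply.
        raise ValueError("Jacobian is singular at this configuration")
    return [(J[1][1] * rhs[0] - J[0][1] * rhs[1]) / det,
            (J[0][0] * rhs[1] - J[1][0] * rhs[0]) / det]
```

The returned vector `a` is passed unchanged to the inner loop (5).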


Robust and Adaptive Control

There are several theoretical and practical challenges
to the method of feedback linearization
control discussed in the previous section. For
example, in order to compute Eq. (5), one must
have exact knowledge of the parameters defining
Eq. (4). In addition, effects of compliance, friction,
and so on are not modeled by Eq. (4) and so
the stability and performance of the system predicted
by Eq. (9) may not be achieved in practice.
This has stimulated a great deal of research into
robust and adaptive control, control of elasticity,
and other issues.

In distinguishing between robust control and
adaptive control, we follow the commonly accepted
notion that a robust controller is a fixed
controller designed to satisfy performance specifications
over a given range of uncertainties,
whereas an adaptive controller incorporates some
sort of online parameter estimation. This distinction
is important. For example, in a repetitive
motion task, the tracking errors produced by a
fixed robust controller would tend to be repetitive
as well, whereas tracking errors produced by an
adaptive controller might be expected to decrease
over time as the plant and/or control parameters
are updated based on runtime information. At
the same time, adaptive controllers that perform
well in the face of parametric uncertainty may
not perform well in the face of other types of
uncertainty such as external disturbances or unmodeled
dynamics.

Robust Feedback Linearization

If we denote by $\hat M(\theta)$, $\hat C(\theta, \dot\theta)$, and $\hat g(\theta)$
expressions for the terms $M(\theta)$, $C(\theta, \dot\theta)$, and $g(\theta)$ in
the equations of motion (4) based on nominal or
estimated values of the true parameters, we can
define a control input

$$u = \hat M(\theta)(a + \delta a) + \hat C(\theta, \dot\theta)\dot\theta + \hat g(\theta) \qquad (15)$$

where $a$ is as defined in Eq. (6) and $\delta a$ represents
an additional control intended to compensate for
the parameter uncertainty. This leads to the state
space model in terms of the tracking error $e$

$$\dot e = Ae + B\{\delta a + \eta\}$$

where $\eta$ represents the uncertainty resulting from
inexact cancellation of nonlinearities and

$$A = \begin{bmatrix} 0 & I \\ -K_0 & -K_1 \end{bmatrix}; \quad B = \begin{bmatrix} 0 \\ I \end{bmatrix}$$

Under the assumption that the uncertainty is
bounded as $\|\eta\| \le \rho(e, t)$, the control term $\delta a$
can be chosen as

$$\delta a = \begin{cases} -\rho(e, t)\dfrac{B^T P e}{\|B^T P e\|} & \text{if } \|B^T P e\| > \epsilon \\[2mm] -\dfrac{\rho(e, t)}{\epsilon} B^T P e & \text{if } \|B^T P e\| \le \epsilon \end{cases}$$

The Lyapunov function

$$V = e^T P e \qquad (16)$$

where $P$ is a symmetric, positive definite matrix
satisfying a Lyapunov equation

$$A^T P + PA + Q = 0 \qquad (17)$$

for a given symmetric positive definite matrix $Q$
can be used to show uniform ultimate boundedness
of all trajectories, where the size of the
ultimate boundedness set depends on $\epsilon$. This is
a practical notion of asymptotic stability in the
sense that the tracking errors can be made small.
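
For a single joint, the robust term and the Lyapunov equation (17) can be written out explicitly. The sketch below assumes scalar gains `k0`, `k1`, the choice $Q = I$, and a constant bound `rho`. These choices and the function names are illustrative assumptions rather than part of the entry.

```python
def lyapunov_P(k0, k1):
    """Solve A^T P + P A + Q = 0 for A = [[0, 1], [-k0, -k1]] and Q = I."""
    p12 = 1.0 / (2.0 * k0)
    p22 = (1.0 + k0) / (2.0 * k0 * k1)
    p11 = k1 * p12 + k0 * p22
    return [[p11, p12], [p12, p22]]


def robust_term(e, e_dot, P, rho, eps):
    """Additional control delta_a for one joint.
    With B = [0, 1]^T, the quantity B^T P e is a scalar."""
    w = P[1][0] * e + P[1][1] * e_dot
    if abs(w) > eps:
        return -rho * w / abs(w)      # unit-vector (switching) term
    return -rho * w / eps             # linear inside the boundary layer
```

The boundary layer of width `eps` replaces the discontinuous switching term near $B^T P e = 0$, which is why the resulting guarantee is ultimate boundedness rather than asymptotic stability.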

Passivity-Based Control

Passivity-based control is an alternative to feedback
linearization control considered previously
and relies on some fundamental structural properties
of the Euler-Lagrange equations, primarily
linearity in the parameters and passivity.

The passivity property (Ortega and Spong
1989) of robot dynamics follows from the fact
that the matrix $N(\theta, \dot\theta) = \dot M(\theta) - 2C(\theta, \dot\theta)$ is
skew symmetric, that is, the components $n_{jk}$ of
$N$ satisfy $n_{jk} = -n_{kj}$ (Spong et al. 2006). This
property implies that the total energy $E$ of the
robot satisfies

$$\dot E = \dot\theta^T u \qquad (18)$$

and can be used to design provably correct robust
and adaptive control laws.

Linearity in the Parameters

The robot equations of motion are defined in
terms of certain parameters, such as link masses,
moments of inertia, etc. The complexity of the
dynamic equations makes the determination of
these parameters a difficult task. Fortunately, the
equations of motion are linear in these inertia
parameters in the following sense: There exists
an $n \times \ell$ matrix function $Y(\theta, \dot\theta, \ddot\theta)$ and an
$\ell$-dimensional constant vector $\Theta$ such that the
Euler-Lagrange equations can be written as

$$M(\theta)\ddot\theta + C(\theta, \dot\theta)\dot\theta + g(\theta) = Y(\theta, \dot\theta, \ddot\theta)\Theta \qquad (19)$$

The function $Y(\theta, \dot\theta, \ddot\theta)$ is called the regressor
and $\Theta \in \mathbb{R}^\ell$ is the parameter vector. The dimension
of the parameter space, that is, the number
of parameters needed to write the dynamics in
this way, is not unique, and finding a minimal set
of parameters that can parameterize the dynamic
equations is difficult in general.

Passivity-Based Robust Control

The passivity and linearity-in-the-parameters
properties of the robot dynamics can be exploited
to design robust and adaptive controllers that do
not attempt to cancel the system nonlinearities as
in the inverse dynamics approach. A passivity-based
robust controller may be defined as

$$u = \hat M(\theta)a + \hat C(\theta, \dot\theta)v + \hat g(\theta) - Kr \qquad (20)$$

where the quantities $v$, $a$, and $r$ are given as

$$v = \dot\theta_r - \Lambda\tilde\theta$$

$$a = \dot v = \ddot\theta_r - \Lambda\dot{\tilde\theta}$$

$$r = \dot\theta - v = \dot{\tilde\theta} + \Lambda\tilde\theta$$

with $\tilde\theta = \theta - \theta_r$ the tracking error, and $K$ and $\Lambda$
are diagonal matrices of positive gains.

Using the linearity-in-the-parameters property,
the closed-loop system can be written as

$$M(\theta)\dot r + C(\theta, \dot\theta)r + Kr = Y(\theta, \dot\theta, a, v)(\hat\Theta - \Theta) \qquad (21)$$

In the robust passivity-based approach, the term
$\hat\Theta$ is chosen as

$$\hat\Theta = \Theta_0 + \delta\Theta$$

where $\Theta_0$ is a fixed nominal parameter vector and
$\delta\Theta$ is an additional control term. The additional
term $\delta\Theta$ can be designed according to

$$\delta\Theta = \begin{cases} -\rho\dfrac{Y^T r}{\|Y^T r\|} & \text{if } \|Y^T r\| > \epsilon \\[2mm] -\dfrac{\rho}{\epsilon} Y^T r & \text{if } \|Y^T r\| \le \epsilon \end{cases}$$

where $\rho$ is a (constant) bound on the parameter
uncertainty. Uniform ultimate boundedness of
the tracking errors follows using the Lyapunov
function

$$V = \frac{1}{2} r^T M(\theta) r + \tilde\theta^T \Lambda K \tilde\theta$$

where, as before, the size of the ultimate boundedness
set depends on the parameter $\epsilon$.
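
The linearity-in-the-parameters property and the robust term $\delta\Theta$ can be made concrete for a single-link pendulum, $ml^2\ddot\theta + m g_0 l \sin\theta = u$, for which one valid (assumed) parameterization is $\Theta = (ml^2,\; m g_0 l)^T$. The function names below are hypothetical.

```python
import math


def regressor(theta, acc):
    """Y such that m*l^2*acc + m*grav*l*sin(theta) = Y . Theta,
    with Theta = [m*l^2, m*grav*l]. There is no Coriolis term for one link."""
    return [acc, math.sin(theta)]


def robust_parameter_term(Y, r, rho, eps):
    """delta-Theta: a term along -Y^T r, saturated outside a boundary layer."""
    w = [y * r for y in Y]                        # Y^T r for a single joint
    norm = math.sqrt(sum(x * x for x in w))
    scale = rho / norm if norm > eps else rho / eps
    return [-scale * x for x in w]
```

In the control law (20), the second argument of `regressor` would be the quantity $a$ rather than the measured acceleration.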

Passivity-Based Adaptive Control 

In the adaptive version of this approach, we consider
again the control law (20) and the resulting
closed-loop system

$$M(\theta)\dot r + C(\theta, \dot\theta)r + Kr = Y(\hat\Theta - \Theta)$$

In this case, the term $\hat\Theta$ is taken as the output of
an estimator

$$\dot{\hat\Theta} = -\Gamma^{-1} Y^T(\theta, \dot\theta, a, v)\, r \qquad (22)$$

The Lyapunov function

$$V = \frac{1}{2} r^T M(\theta) r + \tilde\theta^T \Lambda K \tilde\theta + \frac{1}{2}\tilde\Theta^T \Gamma \tilde\Theta$$

where $\tilde\theta = \theta - \theta_r$ and $\tilde\Theta = \hat\Theta - \Theta$,
can be used to show global convergence of the
tracking errors to zero and boundedness of the
parameter estimates.

One of the problems with the adaptive
control approaches considered here is the so-called
parameter drift problem. It can be
shown that the estimated parameters converge
to the true parameters provided the reference
trajectory satisfies the condition of persistency of
excitation

$$\alpha I \le \int_{t_0}^{t_0 + T} Y^T(\theta_r, \dot\theta_r, \ddot\theta_r)\, Y(\theta_r, \dot\theta_r, \ddot\theta_r)\, dt \le \beta I$$

for all $t_0$, where $\alpha$, $\beta$, and $T$ are positive
constants.
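
The adaptive law can be illustrated on a single-link pendulum with unknown parameters $\Theta = (ml^2,\; m g_0 l)^T$. The plant, gains, sinusoidal reference, initial estimates, and explicit Euler integration below are all assumptions chosen for a short sketch. With a single joint the Coriolis term vanishes, so the regressor depends only on $a$ and $\theta$.

```python
import math


def simulate_adaptive(t_final=20.0, dt=1e-3, K=10.0, lam=5.0, gamma_inv=10.0):
    """Return (|tracking error| history, final parameter estimate)."""
    true = [1.0, 9.81]               # true Theta = [m*l^2, m*grav*l]
    est = [0.5, 5.0]                 # deliberately wrong initial estimate
    theta, omega = 0.0, 0.0
    errors = []
    for k in range(int(round(t_final / dt))):
        t = k * dt
        ref, ref_vel, ref_acc = math.sin(t), math.cos(t), -math.sin(t)
        err, err_vel = theta - ref, omega - ref_vel
        errors.append(abs(err))
        a = ref_acc - lam * err_vel          # a = dv/dt
        r = err_vel + lam * err              # r = dtheta/dt - v
        Y = [a, math.sin(theta)]             # regressor Y(theta, dtheta, a, v)
        u = Y[0] * est[0] + Y[1] * est[1] - K * r           # control law (20)
        alpha = (u - true[1] * math.sin(theta)) / true[0]   # true plant, Eq. (4)
        # Estimator, Eq. (22), with Gamma^{-1} = gamma_inv * I
        est = [est[i] - gamma_inv * Y[i] * r * dt for i in range(2)]
        theta += omega * dt
        omega += alpha * dt
    return errors, est
```

Comparing the average error over the first and last few seconds of the run shows the decrease in tracking error described above. The parameter estimates themselves need not converge unless the reference is persistently exciting.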


Summary and Future Directions 

We have discussed the commonly applied methods
of PID control, feedback linearization control,
as well as robust and adaptive control for
motion control of robot manipulators. There is a
large and relatively mature body of literature on
these methods, and in fact, the material here is
now contained in standard textbooks in robotics,
such as Siciliano et al. (2010) and Spong et al.
(2006).

Future directions in robot motion control include
the full integration of vision, force, and
position feedback, cooperative control of multiple
arms, and advances in machine learning
and human-robot interaction. Direct control of
robots through brain-machine interfaces is also
an active area of research and will enable new
areas of applications such as medical assistive
robots.


Cross-References

► Adaptive Control, Overview 

► Cooperative Manipulators 

► Flexible Robots 

► Force Control in Robotics 

► Lyapunov’s Stability Theory 

► Robot Teleoperation 

Recommended Reading 

Many of the fundamental theoretical problems
in motion control of robot manipulators were
solved during an intense period of research from
about the mid-1980s until the early 1990s, during
which time researchers first began to exploit
the structural properties of manipulator dynamics
such as feedback linearizability, skew symmetry
and passivity, multiple time-scale behavior, and
other properties. For a more advanced treatment
of some of these topics, the reader is referred
to Spong et al. (1992) and Canudas de Wit et al.
(1996).

A survey of robust control of robots up to
about 1990 is found in Abdallah et al. (1991). The
passivity-based robust control result here is due
to Spong (1992). The first results in passivity-based
adaptive control of manipulators were
in Horowitz and Tomizuka (1986) and Slotine
and Li (1987). The Lyapunov stability proof
of passivity-based adaptive control is due
to Spong et al. (1990). A unifying treatment
of adaptive manipulator control from a passivity
perspective was presented in Ortega and Spong
(1989).

Abstract

Robots may allow human beings to physically
interact with remote objects and environments.
This possibility is known as robot teleoperation
and makes it possible to operate in conditions or
environments that are dangerous for human operators.
Although teleoperation was among the first developments
in robotics, back in the 1950s, it still poses
important and difficult challenges for
researchers and scientists, which reflects the intrinsic
difficulties of this fascinating field of robotics.


Keywords 

Bilateral control; Force reflection; Robot teleoperation;
Time delay


Introduction 

A robotic teleoperation system makes it possible
to reproduce the actions of a human operator and to
interact physically with objects and environments
located at a distance. This possibility has long
attracted interest, and telemanipulation
was one of the first fields to be developed in
robotics: the first modern applications of this type
of technology date back to the 1940s and
the early 1950s for handling radioactive material
(Goertz and Thompson 1954), for underwater
and space applications (Martin and Kuban 1985;
Vertut and Coiffet 1986), and for human prostheses
(Kobrinskii 1960). For an overview of
applications, see Sheridan (1992), Hokayem and
Spong (2006), Ferre et al. (2007) and the related
references. Nevertheless, despite the research
interest and the many existing devices, many
challenging problems have yet to be fully solved
from both the technical and the control points of view.

In these notes, an overview of robot teleoperation
is presented. In particular, the following
points are illustrated:

• General description of a telemanipulation system
and of its key components: the “master,”
the “slave,” and the “communication channel”

• Overview of applications and existing devices

• Some control techniques for telemanipulation
systems: “traditional” force reflection, shared
compliance control, passivity-based control,
predictive control, four-channel architecture

General Description of a Telemanipulation System

A telemanipulator is a complex mechatronic system
in which the main elements are a master (or
local) and a slave (or remote) device, interconnected
by a communication channel. The overall
system is interfaced on one side (the master) with
a human operator and on the other (the slave) with
the environment: see Fig. 1.

Both the master and slave devices have a local
controller, with a hardware/software complexity
that can be quite different depending on the
system and task to be executed. Key features
of this type of device, usually not present in a
typical robotic manipulation system, are:

1. A human operator is involved in the loop for 
the (high-level) control of the task execution. 

2. It is necessary to provide the operator,
possibly in real time, with data related to the task.
This implies the presence of a suitable user
interface and the selection of proper signals
transmitted to the operator. These signals, e.g.,
related to forces applied to the environment,
relevant positions of the slave, graphical video
data, and tactile or acoustic information, have
strong implications for the control properties
and performance of the system.

3. A communication channel is present between
the master and the slave. This channel may
represent a source of problems when time
delays are present since, as is well known from
control theory, delays in a feedback loop
may generate instability. Problems of this
type, first observed in a force feedback
scheme in 1965 (Ferrel 1966), arise, for
example, in underwater or space operations.
Note that even time delays on the order of
a tenth of a second may create instability
problems.
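
The destabilizing effect of a delay in a feedback loop can be seen in a toy example. The sketch below is not a teleoperator model: it simulates the scalar loop $\dot x(t) = -k\,x(t - T)$, which is known to be stable only when $kT < \pi/2$. The gain, the delays, the initial history $x = 1$, and the function name are assumptions made for illustration.

```python
from collections import deque


def peak_after_delay(gain, delay, t_final=20.0, dt=1e-3):
    """Simulate dx/dt = -gain * x(t - delay), with x = 1 for t <= 0.
    Returns max |x| over the final 20% of the run."""
    n_delay = max(1, int(round(delay / dt)))
    # Buffer holding x over the last `delay` seconds (oldest sample first).
    history = deque([1.0] * n_delay, maxlen=n_delay)
    x = 1.0
    steps = int(round(t_final / dt))
    tail_peak = 0.0
    for k in range(steps):
        delayed = history[0]          # x(t - delay)
        history.append(x)             # store x(t); the oldest sample is dropped
        x += -gain * delayed * dt     # explicit Euler step
        if k >= 0.8 * steps:
            tail_peak = max(tail_peak, abs(x))
    return tail_peak
```

With `gain = 10`, a delay of 0.1 s gives $kT = 1$ and a decaying response, whereas a delay of 0.2 s gives $kT = 2$ and a growing oscillation. In this toy loop, a delay on the order of a tenth of a second is therefore enough to destabilize it.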



Robot Teleoperation, Fig. 1 A telemanipulation system and its block scheme representation. Subscripts m and s refer
to variables at the master and slave site, respectively

In the block diagram of Fig. 1, some implicit
choices have been made. The operator specifies
a desired velocity ($\dot x_m$) to be applied to the environment
through the master, the communication
channel, and the slave and receives back a force
signal ($f_{md}$). In the figure, the flow of the signals
could be reversed as well, letting the operator
specify a force to the environment and receive
back velocity information. This is equivalent
to reversing the roles of the master and slave
devices. When this operation is possible, the teleoperation
system is said to be bilaterally controlled
(Bejczy and Handlykken 1981).

One of the goals of the control system is to
have, in steady state, the slave velocity equal to
the master velocity, i.e., $\dot x_s = \dot x_m$, and similarly
for the forces, $f_{md} = f_s$. When this is accomplished,
the teleoperator is said to be transparent
(Lawrence 1993).

In this general framework, the main features 
of the components of a telemanipulator are the 
following. 

The Master 

The master, or local system, is the interface
through which the operator specifies commands
to the whole device. Typical features of the
master are:

- Capability of assigning tasks to the slave and
providing the operator with relevant information
about the task development. In fact,
an important feature of the master is its capability
of providing the operator with
telepresence, i.e., the sensation of being in
some manner involved with task execution.
In this respect, several solutions have been
adopted, varying from joysticks and/or consoles
(Hirzinger et al. 1992) to exoskeletons
(Bergamasco et al. 2007; Smith et al. 1992)
and so on. In these devices, different types
of signals may be reflected to the operator,
from simple graphical data to full kinetostatic
information.

- Capability of acquiring and processing data
from both the operator and the slave. Typical
processing operations are filtering, prediction, delay
compensation, modeling of remote and local
dynamics, and so on.


The Slave 

The slave, or remote system, is the part of the
teleoperator which directly interacts with the environment
for task execution. Requirements similar
to those of the master may be specified for the slave
system:

- A robotic system for the interaction with the
environment and the execution of the task
planned by the operator. This part, usually
provided with autonomous features, has to be
in some way customized to operate in particular
environments, e.g., submarine, outer space,
and nuclear areas. Note that the kinematics
and the dynamics of the remote manipulator
may be different from those of the local one
(when present), giving rise to several problems
when telepresence is needed for task execution
(Colgate 1993).

- Signal acquisition and processing. Sensory
capability is a main requirement for the slave
device, which is often equipped with video
cameras, force/tactile sensors, proximity sensors,
and so on.

- Capability of data processing. The
remote site must also be able to process the
information needed for task execution. In fact,
besides other considerations, the destabilizing
effects caused by communication delays
and/or restricted transmission bandwidth
must be compensated locally, providing
the slave system with a certain degree of
autonomy.


The Communication Line 

The communication line represents the link between
the master and slave sites. Different platforms
may be used for this purpose, from radio
connections by means of satellites to cables for
underwater operations. The main drawback that
can be introduced by this element is a delay, due
both to a physical delay in the transmission line
(e.g., in a long satellite communication) and to
limited bandwidth of the hardware. This delay,
which sometimes is not even constant, can cause
noticeable instability problems if proper compensating
actions are not taken.

An Overview of Applications

Telemanipulators, in the broader sense of
the term, may be found in a number of
different applications developed since the early
1950s. The first examples of these devices were
designed and built for operations in radioactive
environments and for human limb prostheses.
At present, this type of technology is applied
in a number of different fields: space, underwater,
medicine, hazardous environments, security,
simulators, and so on.

Space Applications 

Robotics is used in space for exploration, scientific
experiments, and commercial activities.
The main reasons for space telerobotics are the high
costs and the hostile environment for human
beings. For many years, the main example of
teleoperation in space was its application in space
shuttle activities, where the operators had direct
control of the task executed by the manipulator.
Nowadays, an important application of robot
technology is for planetary missions, where autonomous
telerobots are required and the operator
has only supervisory control of the task. The main
directions of current research activity for space
robotics are the development of arms for both
intravehicular and extravehicular activities,
free-flying platforms, and planetary rovers.

Among the best-known examples of robot
arms for space is the Canadian Remote
Manipulator System (RMS), installed on the US
space shuttles. The 6 degree-of-freedom (dof)
arm, built by the Canadian firm SPAR, had a
flexible, 15 m long structure and was capable
of executing preprogrammed and/or teleoperated
tasks. Five arms were built, working on
space shuttles from 1981 to 2011. Since 2001
the Canadarm 2 has been used on the ISS (International
Space Station). This 7 dof, 17.6 m long
arm is used for assembly and maintenance purposes.

Concerning planetary exploration, a first
successful space telerobotic program was the
Mars Viking Program, which performed scientific
experiments on Mars in 1976. More recently,
NASA has sent to Mars the rovers Sojourner
in 1997 (working for about 3 months) and
Spirit and Opportunity, which arrived in 2004.
Opportunity is still working (January 2014), see
http://marsrovers.jpl.nasa.gov/home/index.html.
New missions to Mars with other, more complex,
rovers are currently planned by NASA.

With the current technological possibilities,
further substantial developments in this field are
slowed down by the large amount of money and
time required to guarantee a successful mission.
However, relevant technical problems still exist
due to reliability requirements, weight constraints,
hostile environments, and communication
time delays (ranging from 1 s in earth orbits
to 4-40 min or more for planetary missions).

Underwater 

After the first successful military applications of
underwater telerobotics (in 1966 the US Navy’s
CURV, the Cable-controlled Underwater Recovery
Vehicle, was successfully employed to retrieve
a nuclear bomb from the ocean), extensive use of
ROVs (remotely operated vehicles) started in
the 1980s for offshore operations in the oil/gas
industry. At present, underwater telerobotics is
mainly used for business, military missions, and
scientific expeditions. Telerobotic (autonomous)
tasks are usually limited to small routine tasks
rather than complete activities, for example, simple
tool switching operations, repetitive bolt/nut
screwing, and piloting to new locations. The first
examples of underwater teleoperation were mainly
based on manned submersibles, either free swimming
or connected to a surface ship, and with
teleoperated arms on the outer structure. In more
recent operations, human operators remotely control
the submersibles by long fiber-optic cables
for data communication, increasing the costs and
complexity of the missions.

Probably the most important users are in the
business field, where it is more convenient to use
teleoperated devices rather than human divers
to perform inspections and repairs on deep sea
equipment. The main users of telesubmersibles
are the oil and communication (telephone) industries,
where underwater pipes and cables require
routine operations. The scientific community
uses this technology for marine biological,
geological, and archeological missions, while the
military have used telerobotics in many salvage
operations, such as plane or watercraft recovery.

The conditions of the water environment, e.g.,
the high pressures, the poor visibility, and the
communication difficulties, cause the major problems
in this field. In order to solve the problems
due to the high pressure, a very robust mechanical
structure and (typically) hydraulic actuators are
employed. On the other hand, vision problems
are not so easily solved, being related to several
factors of the environment. External lighting is
necessary, and other technologies (e.g., sonar) are
sometimes used. Computer graphic simulation
may help the user during task execution in partially
known environments. For references, see,
e.g., Ridao et al. (2007).

Medical Telerobotics 

Several teleoperated devices are found in the
medical field. In fact, robotic manipulators are
used to perform surgery, diagnose illnesses or injuries,
help impaired people, and train specialized
medical personnel.

Robotic systems of different complexities 
have been developed since the 1950s for aid 
to impaired people. Among the most common 
systems are automated wheelchairs, controlled 
by voice or by joysticks for hand, mouth, eye, or 
head movements. 

At present, there is considerable interest
in applying teleoperated devices in microsurgery
operations, e.g., eye surgery, where small precise
movements are needed. The movements of the
operator are scaled down by the mechanism so
that very fine operations can be performed while
maintaining a suitable telepresence effect. Another
important class of surgical process consists
of the so-called minimally invasive procedures. In
this case, the surgeon operates through small incisions
using thin medical instruments and small
video cameras. The increased difficulties for the
surgeon are partially compensated by computers,
which are used to create virtual environments
where the use of telepresence plays a fundamental
role.

A very attractive application is the use of
telemanipulators in remote surgery operations.
Telediagnosis may also broaden the reach of a
single doctor by allowing him or her to examine a
patient visually or to view records on a computer
interface. Finally, telepresence is becoming very
important for the instruction of specialized doctors
and for performing rehearsals before the actual
operation.

Security 

Applications in this area aim to employ telerobotic
devices for the protection of persons and property.
Most systems used in this area are teleoperated
devices, since these tasks require decision
capabilities and intelligence levels not currently
possible for machines, although the use of
autonomous systems is more and more frequent.

In the area of security, robots may be used
for patrolling buildings and for protection
purposes. These devices can be either autonomous
or teleoperated. Military applications mainly adopt
teleoperation, above all for locating enemies
or dangerous equipment without direct risk for
human personnel. Unmanned aeroplanes or teleoperated
devices for the detection and destruction
of mines or bombs are well-known examples of
this technology. Teleoperation is also used for fire
extinguishing, in order to spray water or chemical
agents with remotely operated vehicles.

Telerobotics in Hazardous Environments 

Robots may substitute for human beings in operations
in hazardous environments; as a matter of
fact, the nuclear industry was the first important user
of modern teleoperating devices. Telerobotics is
applied in several nuclear or chemical plants and
also for military applications (e.g., for building
military equipment and arms) in a variety
of tasks. Besides direct handling of radioactive
or chemical material, robots are used in waste
cleanup/disposal and plant inspection. Ammunition
disposal also makes use of telerobotic machines.

Telerobotics in Mining and Other Industries

Besides the typical use of robots in a number of
industrial applications (assembly, welding, painting,
and so on), other applications of robotic
systems in nonconventional production processes
have been developed, for example, in mines,
construction, agriculture, warehousing, and many
other activities.

The use of telemanipulators for mining applications,
despite strong motivations such as the
high costs and risks of human work, meets
difficulties and limitations caused by the particular
environment and by the high level of autonomy
required to operate in mines. As a matter of fact,
the mining industry has only recently started to
experiment with teleoperated devices; see, e.g., Duff
et al. (2010). These machines are being developed
to perform frame wall building, structure testing,
hole drilling, wall blasting, mine digging, and
so on.

In construction tasks, leaving aside the fact that all
construction/demolition machinery controlled by
a human can be regarded as an example of a
teleoperator (e.g., cranes and front-end loaders),
applications of real telerobotic systems are not so
numerous because of the unstructured environments and
the nonrepetitive tasks. Current work in this area
concerns the development of machines for earth
movement, construction of structures, building
window washing, bridge inspection and maintenance,
and power line repair.

The Control Problem 

For the development of a reliable teleoperation 
system, providing force feedback to the user, the 
problems caused by the interaction of the robotic 
device with the environment and the possible 
time-delays caused by the communication 
channel have to be properly considered and 
solved. 

In telemanipulation without either force feedback
to the operator or a local compliance control,
the remote manipulator is strictly controlled
according to the master position signal. As a
consequence, the system is stiff,
and errors between the master and slave positions
may cause excessive and undesired contact
forces.


In bilateral telemanipulation, it has been
shown that an effective way of increasing
system performance (e.g., in terms of task
completion time, total contact time, and cumulative
contact force) is to reflect back to the operator
information about the force applied to the
environment. On the other hand, the
force reflection gain, which gives the operator
the feeling of the interaction, tends to destabilize
the system, especially when time-delays are
present.

Control schemes for robotic teleoperation devices
can be classified according to the general
structures reported in Fig. 2, showing the direct
teleoperation, the coordinated teleoperation, and
the supervisory control schemes. In the direct
teleoperation scheme, possible only for negligible
time delays, the operator has direct control
of the slave robot and receives feedback
in real time. In the coordinated teleoperation
scheme, the operator still controls the remote
robot, but low-level control loops are present in
the slave system because time delays do not
allow the operator to control the actuators
directly. In the supervisory control scheme, the
remote site has more autonomy and task execution
is controlled locally, while the operator
gives mainly high-level commands and acts as
a supervisor. A local loop is present also at the
master side, indicating the presence of (usually) a
model (graphical, mathematical, etc.) of the slave
site to improve performance in case of large time
delays.

Some of the main control architectures for
teleoperation devices presented in the literature
to deal with the problems of time-delay and
force reflection are now briefly described
and commented on. The architectures considered
are the "traditional" force reflection, the
shared compliance control, the passivity-based
teleoperation, the predictive control, and the
four-channel scheme. Many other control
schemes have been presented in the literature;
see, e.g., Arcara and Melchiorri (2002) and Hirche
et al. (2007). What is presented here is therefore
a brief, though significant, overview intended
to focus on the major problems encountered in
this field and on some of the approaches for their
solution.

Robot Teleoperation, Fig. 2 Possible structures of bilateral control schemes for robotic teleoperation: direct teleoperation, coordinated teleoperation, and supervisory control

Robot Teleoperation, Fig. 3 The "traditional force reflection" transmission scheme


Traditional Force Reflection Teleoperation 

The simplest manner of transmitting the remote
force to the operator is to reflect it directly,
without any particular elaboration, as shown
in Fig. 3. The resulting transmission equations
are

$$
\begin{cases}
f_{md}(t) = f_s(t - T) \\
\dot{x}_{sd}(t) = \dot{x}_m(t - T)
\end{cases}
\tag{1}
$$

where $T$ is the time-delay introduced by the
communication network and subscript $d$ indicates the
desired set point for the master ($m$) and slave ($s$)
controllers.
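As an illustration, the transmission equations (1) can be sketched in discrete time as two fixed delay lines, one per direction. This is only a minimal sketch under stated assumptions: the class names `DelayLine` and `ForceReflectionChannel`, the representation of the delay $T$ as an integer number of samples, and the zero initial content of the channel are all hypothetical choices made here, not part of the original scheme. The optional gain `g_fr` plays the role of a force reflection gain applied to the reflected force.

```python
from collections import deque


class DelayLine:
    """Fixed delay of `steps` samples; outputs `initial` until the buffer fills."""

    def __init__(self, steps, initial=0.0):
        if steps < 0:
            raise ValueError("steps must be non-negative")
        self._buf = deque([initial] * steps)

    def push(self, value):
        """Store the newest sample and return the one delayed by `steps` samples."""
        self._buf.append(value)
        return self._buf.popleft()


class ForceReflectionChannel:
    """Discrete-time sketch of Eq. (1) with an optional force reflection gain."""

    def __init__(self, delay_steps, g_fr=1.0):
        self.g_fr = g_fr
        self._to_slave = DelayLine(delay_steps)   # carries the master motion signal
        self._to_master = DelayLine(delay_steps)  # carries the slave force

    def step(self, x_m, f_s):
        """Return (f_md, x_sd): the delayed slave force and master motion signal."""
        x_sd = self._to_slave.push(x_m)
        f_md = self.g_fr * self._to_master.push(f_s)
        return f_md, x_sd
```

With `delay_steps=2`, for example, the set points returned at sample k are the (scaled) slave force and the master signal of sample k - 2. Nothing in this channel dissipates energy, which is in line with the stability problems discussed in the text.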


This technique presents significant instability
problems due to time-delays. As a matter of fact,
it is possible to verify that the communication
channel does not present strictly passive
properties, even for limited bandwidths of the
input signals $\dot{x}_m$ and $f_s$. This result is valid
also when an attenuation is introduced between $f_{md}$
and $f_s$, i.e., when a force reflection
gain $G_{fr} < 1.0$ is introduced in (1), so that
$f_{md}(t) = G_{fr} f_s(t - T)$. The
attenuation reduces the telepresence sensation
and degrades the performance, but still does
not yield a passive (and hence stable) network.
The nonpassive channel has the global effect
of introducing into the overall system energy
flows that, if not properly reduced by the
local controllers, contribute to destabilizing the
telemanipulator.

The dynamics of the overall system may
be described by the following two sets of
equations: the first taking into account the
master dynamics and the force transmitted by
the communication channel and the second
including the slave dynamics, the position
signals of the channel, and the local position
controller:

$$
\begin{cases}
M_m \ddot{x}_m(t) = -f_{md}(t) - B_m \dot{x}_m(t) - K_h x_m(t) \\
f_{md}(t) = f_s(t - T)
\end{cases}
$$

$$
\begin{cases}
M_s \ddot{x}_s(t) = f_s(t) - B_s \dot{x}_s(t) \\
\dot{x}_{sd}(t) = S_p \dot{x}_m(t - T) \\
f_s(t) = K_p [x_{sd}(t) - x_s(t)]
\end{cases}
$$

In the above equations, $M_i$ and $B_i$, $i = m, s$,
are masses and damping factors at the master and
slave sites, $K_h$ represents the operator (simply
modeled as a stiffness) and $K_p$ the slave
position controller. The gain $S_p$ has been added,
with respect to Eq. (1), in order to scale velocity
variables between the two robotic systems.

It can be shown that this control scheme does 
not guarantee stability in the presence of time 
delays, although in practical applications stability 
may still be achieved for small time delays due 
to dissipation introduced by friction and the local 
controllers. 

Shared Compliance Control 

As previously mentioned, both the interactions of
the robotic device with the environment and the
effects of time-delays have to be considered in
the definition of control strategies for
telemanipulation systems. The position-error based force
reflection scheme deals with both these effects
(Kim et al. 1992). This scheme is based on the
computation of the feedback signal $f_{md}$ as a
force proportional to the error between master
and slave positions:

$$ f_{md}(t) = G_{fr} [x_m(t) - x_s(t - T)] $$

This signal gives the operator a sensation
related to the difference between the postures of
the robotic devices caused either by interactions
or delays. Note that in this manner an elastic
(proportional) element is introduced between the
positions of the robots. This makes it possible to
obtain a stable behavior of the overall system,
including the local controllers, at least for limited
values of $G_{fr}$.

An additional feature for dealing with problems
due to time-delays is the so-called shared
compliance control (SCC). A local, autonomous
force feedback is realized at the slave site in order
to program active compliance and damping of the
robotic device. This control action is important
when compliance has to be realized between the
(stiff) mechanical device and its environment and
during the collision or contact phases. The overall
control system is therefore based on sharing
autonomous and human-driven control actions. A
block diagram of the whole system, including the
master and slave dynamics ($1/(M_m s^2 + B_m s + K_h)$
and $1/(M_s s^2 + B_s s)$, respectively), the force
reflection gain ($G_{fr}$), the shared compliance
controller ($G_{cc}$), and an environment model ($K_e$), is
shown in Fig. 4.

For a given time-delay, the force reflection
gain $G_{fr}$ can be increased with respect to the
traditional force reflection scheme. In any case,
when the time-delay increases, the gain has to be
correspondingly decreased to guarantee stability,
i.e., the value of $G_{fr}$ depends directly on the
amount of time-delay. In fact, this control
scheme too does not in general present passivity
features, although it can be shown that it may be
stable (for a limited range of time delays) with a
proper choice of the control parameters.

Passivity-Based Teleoperation 

A control scheme inspired by passivity theory
(Van der Schaft 2000) is now described. The basic
consideration is that the communication channel
may represent, if proper actions are not taken,
a non-passive element between the master and
slave. With proper modifications, the transmission
line presents passive properties, and therefore
the stability of the overall system may be
achieved for any value of the time-delay $T$.

Lossless Transmission Line 
Results of passivity and scattering theories can be
used to show that in traditional force reflection
teleoperation, Eq. (1), the instability of the overall
system in the presence of time-delays is caused by
the non-passive properties of the communication
channel (Niemeyer and Slotine 1991). On the
other hand, it has been shown that the definition
of a communication network based on a lossless
transmission line provides the system with passivity
features for any time-delay (Anderson and
Spong 1989), thereby facilitating the stability of
the overall system.

Robot Teleoperation, Fig. 4 Position-error based force reflection with SCC at the remote site

For the definition of a lossless transmission
line, it is convenient to refer, instead of the
velocity and force variables $\dot{x}$, $f$ at each port
(see Fig. 3), to the equivalent wave variables $u$
and $v$, which are related to the passivity formalism
and whose definition derives from the theory of
electric circuits. By using these variables, it is
possible to describe the power balance in a circuit
as the difference of two positive terms which
consider the input and the output power. In fact,
by introducing the input wave $u = [u_m^T, u_s^T]^T$
and the output wave $v = [v_m^T, v_s^T]^T$, the power
balance in the teleoperator can be expressed as

$$
P = \tfrac{1}{2}\left(u^T u - v^T v\right) = f^T \dot{x}
  = [f_{md}^T, f_s^T] \begin{bmatrix} \dot{x}_m \\ -\dot{x}_{sd} \end{bmatrix}
$$

By considering a proper scaling factor $b$, defined
as the characteristic impedance of the transmission
line, the previous equation defines the following
transformations between power and wave
variables:

$$
u_m = \frac{1}{\sqrt{2b}}\,(f_{md} + b\,\dot{x}_m), \qquad
u_s = \frac{1}{\sqrt{2b}}\,(f_s - b\,\dot{x}_{sd})
$$

$$
v_m = \frac{1}{\sqrt{2b}}\,(f_{md} - b\,\dot{x}_m), \qquad
v_s = \frac{1}{\sqrt{2b}}\,(f_s + b\,\dot{x}_{sd})
$$


The resulting network is described by

$$
\begin{aligned}
f_{md}(t) &= f_s(t - T) + b\,[\dot{x}_m(t) - \dot{x}_{sd}(t - T)] \\
\dot{x}_{sd}(t) &= \dot{x}_m(t - T) + \frac{1}{b}\,[f_{md}(t - T) - f_s(t)]
\end{aligned}
$$

In terms of wave variables, the passivity-based
communication network is described as (see
Fig. 5)

$$
\begin{aligned}
f_{md}(t) &= b\,\dot{x}_m(t) + \sqrt{2b}\,v_m(t) \\
\dot{x}_{sd}(t) &= -\frac{1}{b}\,[f_s(t) - \sqrt{2b}\,v_s(t)]
\end{aligned}
$$

In analogy with electric networks, impedance 
adaptation should be added to both extremities 
of the transmission line, as described e.g., in 
Niemeyer and Slotine (1991). 
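The passivity of the wave-based channel can be checked numerically. The sketch below is illustrative only and rests on assumptions stated here: the waves are taken in the usual scattering form, u_m = (f_md + b ẋ_m)/√(2b), v_m = (f_md - b ẋ_m)/√(2b), u_s = (f_s - b ẋ_sd)/√(2b), v_s = (f_s + b ẋ_sd)/√(2b); the channel simply delays each outgoing wave, so that v_m(t) = u_s(t - T) and v_s(t) = u_m(t - T); the delay is an integer number of samples (at least one); and the channel starts at rest. The class name `WaveChannel` and its methods are hypothetical. Under these assumptions, the net energy that has entered the channel equals the energy of the waves still in flight, and is therefore never negative.

```python
import math
from collections import deque


class WaveChannel:
    """Wave-variable communication channel with `delay_steps` samples of delay."""

    def __init__(self, b, delay_steps):
        if b <= 0:
            raise ValueError("characteristic impedance b must be positive")
        if delay_steps < 1:
            raise ValueError("delay_steps must be at least 1")
        self.b = b
        self.k = math.sqrt(2.0 * b)
        # Waves in flight, initialised to zero (channel at rest).
        self._m_to_s = deque([0.0] * delay_steps)  # u_m travelling to the slave
        self._s_to_m = deque([0.0] * delay_steps)  # u_s travelling to the master
        self.energy = 0.0  # time integral of the net power entering the channel

    def step(self, xdot_m, f_s, dt=1.0):
        """Return (f_md, xdot_sd) for the current master velocity and slave force."""
        # Waves arriving now: v_m(t) = u_s(t - T), v_s(t) = u_m(t - T).
        v_m = self._s_to_m.popleft()
        v_s = self._m_to_s.popleft()
        # Decode the arriving waves into set points.
        f_md = self.b * xdot_m + self.k * v_m
        xdot_sd = (self.k * v_s - f_s) / self.b
        # Encode and send the outgoing waves.
        self._m_to_s.append((f_md + self.b * xdot_m) / self.k)
        self._s_to_m.append((f_s - self.b * xdot_sd) / self.k)
        # Net power entering the channel: f_md * xdot_m - f_s * xdot_sd.
        self.energy += (f_md * xdot_m - f_s * xdot_sd) * dt
        return f_md, xdot_sd

    def stored_energy(self, dt=1.0):
        """Energy carried by the waves still in flight (never negative)."""
        waves = list(self._m_to_s) + list(self._s_to_m)
        return 0.5 * dt * sum(w * w for w in waves)
```

Feeding arbitrary master velocities and slave forces into `step` and comparing `energy` with `stored_energy` gives a quick sanity check of the lossless behavior. Note that the sketch does not include the impedance adaptation mentioned in the text, so wave reflections are not attenuated.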

Predictive Control 

In a well-known example of space telerobotics,
the ROTEX project (Hirzinger et al. 1992), the
problems introduced by force feedback and
time-delays have been solved in a different manner. In
fact, in this case the force information is not
transmitted to the operator, and an extensive use of
graphic simulation and telesensor programming
is made to help control of the task execution.

In particular, the predictive display technique
(Sheridan 1992) has been employed for generating
and extrapolating beforehand visual indications,
such as cursors or wire frame models
of the manipulator and its environment. This
information is generated by the control system
and assists the operator in driving the task
execution more efficiently. In this case, a proper
prediction algorithm has to be set on the basis of
current initial conditions of the manipulator and,
possibly, of current control variables.

Robot Teleoperation, Fig. 5 Transmission line based on passivity

In telerobotics, predictive displays have to 
be purposely designed in order to consider the 
prediction of motions of the manipulator. Usually, 
the task is graphically simulated in real time, 
without time-delay, exploiting a model of the 
remote environment and of the slave device. The 
operator can observe the task executed by the 
remote system on the screen, where a simulated 
copy (with T = 0 s) of the robotic device can 
be superimposed on the real operating device in 
the scene of the remote site. In this manner, the 
operator may program appropriate actions for the 
interaction with the environment. 

This type of task planning helps when a noticeable
time-delay occurs. In fact, when operators
deal with significant time-delays (e.g., larger
than 1 s), they usually operate with a "move
and wait" strategy, conservatively specifying only
small displacements to the remote robot. By using
a predictive display, the time required to
execute complex tasks is greatly reduced. On the
other hand, the operator has only visual information
about the remote environment and the task
execution.

Four-Channel Scheme 

A generalization of the scheme of Fig. 1, the
so-called four-channel architecture (Hirche et al.
2007; Lawrence 1993), is shown in Fig. 6. In this
scheme, both the velocity and force signals of
the master and slave are transmitted, and with
a proper choice of the four blocks $C_1$, $C_2$, $C_3$,
and $C_4$ many design goals can be achieved, in
particular concerning the stability and the
transparency of the overall system. For instance, if
$C_3 = C_4 = 0$, the standard velocity-force
transmission scheme is obtained, while ideal
transparency is achieved if $C_1 = Z_{cs}$, $C_2 = C_3 = I$,
$C_4 = -Z_{cm}$. In the figure, the blocks $Z_m$ and
$Z_s$ represent the master and the slave dynamics
(impedances), respectively, while $C_m$ and $C_s$ are
the local master and slave controllers, $f_h^*$ is an
external force applied by the user, and $f_e^*$ an
exogenous force from the environment.


Summary and Future Directions 

In these notes, an overview on telemanipulation
has been presented with the aim of giving a
general presentation of the impact of this area of
robotics on both industry and research, of outlining
typical problems encountered in dealing with
remote manipulation systems, and of illustrating
some approaches for their solution.

In this respect, it has to be pointed out that,
besides the control schemes considered in these
notes (purposely developed for telerobotic systems),
many other schemes have been presented
in the literature (see, e.g., Hokayem and Spong
2006, Ferre et al. 2007, and Arcara and Melchiorri
2002). More generally, a substantial
literature exists, and important results have
been presented from a methodological point of
view to address control problems of time-delay
systems; see, for example, Gu et al. (2003).

Robot Teleoperation, Fig. 6 Block scheme of the four-channel architecture. (a) The human operator, (b) Master controller, (c) Communication line, (d) Slave controller, (e) Environment

There are, however, other important aspects of 
telemanipulation which, for space constraints, 
can only be mentioned here, such as the 
“impedance shaping” (typical in applications in 
which there is a relevant dynamic/mechanical 
difference between the master and slave 
mechanisms) or criteria for defining (and 
measuring) performance of teleoperator systems, 
such as the “time to completion,” criteria based 
on energy consumption, dexterity, and so on. 
Other interesting, and important, extensions are 
the possibility of controlling remote teams of 
robots cooperating for the execution of a common 
task (e.g., for aerial inspections, transport of 
heavy loads, etc.). 

Future developments of robotic teleoperation
systems will deal with the technological improvements
of the user interface, giving the operator
more "realistic" feedback of the remote environment,
the application of this type of technology
to more complex situations, and the use of
multi-robot systems controlled either by one or more
cooperating users. Control will play in any case a
fundamental role in these scenarios.

Cross-References 

► Advanced Manipulation for Underwater Sampling

► Control of Linear Systems with Delays 


► Disaster Response Robot 

► Force Control in Robotics 

► Model-Predictive Control in Practice 

► Redundant Robots 

► Robot Visual Control 

Recommended Reading 

Introductory and historical perspectives of 
telemanipulation, along with descriptions 
of several interesting applications of this 
technology, may be found in Ferre et al. 
(2007), Hokayem and Spong (2006), Sheridan 
(1992), and Vertut and Coiffet (1986). Specific 
applications, e.g., space, underwater, medical, 
and hazardous environment, are described in 
Duff et al. (2010), Hirzinger et al. (1992), 
http://marsrovers.jpl.nasa.gov/home/index.html, 
and Ridao et al. (2007). Some of the main control 
schemes specifically developed for this type of 
robotic devices are reported in Anderson and 
Spong (1989), Arcara and Melchiorri (2002), 
Colgate (1993), Hirche et al. (2007), Kim et al. 
(1992), and Niemeyer and Slotine (1991), while 
some basic background material on control 
theory is available in Gu et al. (2003) and Van der 
Schaft (2000). 
Robot Visual Control

Francis Chaumette 
Inria, Rennes, France 

Abstract 

This article presents the basic concepts of
vision-based control, that is, the use of visual data
to control the motions of a robotic system. It
details the modeling steps allowing the design
of kinematic control schemes. Applications are
also described.


Keywords 

Jacobian; Kinematics; Robot control; Visual servoing


Introduction 

Visual control, also named visual servoing, refers
to the use of computer vision data as input of
real-time closed-loop control schemes to control the
motion of a dynamic system, typically a robot
(Chaumette and Hutchinson 2008; Hutchinson
et al. 1996). It can be seen as sensor-based control
from a vision sensor and relies on techniques
from image processing, computer vision, and
control theory.

Robot Visual Control, Fig. 1 A few images acquired during two visual servoing tasks: on the top, pedestrian tracking using a pan-tilt camera; on the bottom, controlling the 6 degrees of freedom of an eye-in-hand system so that an object appears at a particular position in the image (shown in blue)

An iteration of the control scheme consists of
the following steps:

• Acquire an image.

• Extract some useful image measurements.

• Compute the current value of the visual features
used as inputs of the control scheme.

• Compute the error between the current and the
desired values of the visual features.

• Update the control outputs, which are usually
the robot velocity, to regulate that error to
zero, i.e., to minimize its norm.

For instance, for the first example depicted in
Fig. 1, the image processing part consists in
extracting and tracking the center of gravity of the
moving people, the visual features are composed
of the two Cartesian coordinates of this center
of gravity, and the control scheme computes the
pan and tilt velocities so that the center of gravity
is as near as possible to the image center
despite the unknown motion of the people. In the
second example, where a camera mounted on a
six-degrees-of-freedom robot arm is considered,
the image measurements are a set of segments
that are tracked in the image sequence. From
these measurements and the knowledge of the 3D
object model, the pose from the camera to the
object is estimated and used as visual features.
The control scheme now computes the six components
of the robot velocity so that this pose
reaches a particular desired value corresponding
to the object position depicted in blue on the
images.
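The five steps of one iteration can be sketched as a small function that chains user-supplied callables. This is only a structural sketch: the function name `servo_iteration` and its parameters are hypothetical, and the image acquisition, measurement extraction, feature computation, and control law are deliberately left to the caller, since they depend entirely on the sensor, the task, and the robot.

```python
def servo_iteration(acquire_image, extract_measurements, compute_features,
                    desired_features, control_law):
    """Run one pass through the five steps and return (error, command)."""
    image = acquire_image()              # 1. acquire an image
    m = extract_measurements(image)      # 2. extract image measurements
    s = compute_features(m)              # 3. current value of the visual features
    # 4. error between the current and the desired feature values
    e = [si - sdi for si, sdi in zip(s, desired_features)]
    command = control_law(e)             # 5. update the control outputs
    return e, command
```

For the pedestrian-tracking example, `extract_measurements` would return the tracked center of gravity, `compute_features` its two image coordinates, `desired_features` the image center, and `control_law` the pan and tilt velocities.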


Basic Theory 

Most if not all visual servoing tasks can be
expressed as the regulation to zero of an error $e(t)$,
which is defined by

$$ e(t) = s(m(r(t)), a) - s^*(t). \tag{1} $$

The parameters in (1) are defined as follows
(Chaumette and Hutchinson 2008). The vector
$m(r(t))$ is a set of image measurements (e.g.,
the image coordinates of interest points, or the
area, the center of gravity, and other geometric
characteristics of an object). These image
measurements depend on the pose $r(t)$ between the
camera and the environment. They are used to
compute a vector $s(m(r(t)), a)$ of visual features,
in which $a$ is a set of parameters that represent
potential additional knowledge about the system
(e.g., coarse camera intrinsic parameters or
3D model of objects). The vector $s^*(t)$ contains
the desired value of the features, which can be
either constant in the case of a fixed goal or
varying if the task consists in following a specified
trajectory.

Visual servoing schemes mainly differ in the
way that the visual features are designed. As
represented in Fig. 2, the two most classical
approaches are named image-based visual servoing
(IBVS), in which $s$ consists of a set of 2D
parameters that are directly expressed in the image
(Espiau et al. 1992; Weiss et al. 1987), and
pose-based visual servoing (PBVS), in which $s$
consists of a set of 3D parameters related to the
pose between the camera and the target (Weiss
et al. 1987; Wilson et al. 1996). In the latter case,
the 3D parameters have to be estimated from
the image measurements either through a pose
estimation process using the knowledge of the 3D
target model, or through a partial pose estimation
process using the properties of the epipolar
geometry between the current and the desired
images, or finally through a triangulation process
if a stereovision system is considered. Within the
IBVS and PBVS approaches, many possibilities
exist depending on the choice of the features.
Each choice will induce a particular behavior of
the system. There also exist hybrid approaches,
named 2-1/2D visual servoing, which combine
2D and 3D parameters in $s$ in order to benefit
from the advantages of IBVS and PBVS while
avoiding their respective drawbacks (Malis et al.
1999).

The design of the control scheme is based on
the link between the time variation of the features
and the robot control inputs, which are usually
the velocity $\dot{q}$ of the robot joints. This relation is
given by

$$ \dot{s} = J_s \dot{q} + \frac{\partial s}{\partial t} \tag{2} $$

where $J_s$ is the features Jacobian matrix, defined
from the equation above similarly to the classical
robot Jacobian. For an eye-in-hand system (see
the left part of Fig. 3), the term $\partial s / \partial t$ represents
the time variation of $s$ due to a potential object
motion, while for an eye-to-hand system (see the
right part of Fig. 3) it represents the time variation
of $s$ due to a potential sensor motion.

As for the features Jacobian, in the eye-in-hand
configuration, it can be decomposed as
(Chaumette and Hutchinson 2008)

$$ J_s = L_s \, {}^{c}V_n \, J(q) \tag{3} $$

where

• $J(q)$ is the robot Jacobian such that $\dot{x}_n = J(q)\dot{q}$,
where $\dot{x}_n$ is the robot end effector velocity.

• ${}^{c}V_n$ is the spatial motion transform matrix
from the vision sensor to the end effector. It
is given by

$$
{}^{c}V_n = \begin{bmatrix} {}^{c}R_n & [{}^{c}t_n]_\times \, {}^{c}R_n \\ 0 & {}^{c}R_n \end{bmatrix} \tag{4}
$$

where ${}^{c}R_n$ and ${}^{c}t_n$ are, respectively, the rotation
matrix and the translation vector between
the sensor frame and the end effector frame
and where $[{}^{c}t_n]_\times$ is the skew symmetric matrix
associated with ${}^{c}t_n$. Matrix ${}^{c}V_n$ is constant when
the vision sensor is rigidly attached to the end
effector, which is usually the case. Thanks to
the robustness of closed-loop control schemes,
a coarse approximation of ${}^{c}R_n$ and ${}^{c}t_n$ is
sufficient in practice to get an estimation of ${}^{c}V_n$.
If needed, an accurate estimation is possible
through classical hand-eye calibration methods.

• $L_s$ is the interaction matrix of $s$, defined such
that $\dot{s} = L_s v$, where $v$ is the relative velocity
between the camera and the environment.

Robot Visual Control, Fig. 2 If the goal is to move the camera from frame $R_c$ to the desired frame $R_{c^*}$, two main approaches are possible: IBVS on the left, where the features $s$ and $s^*$ are expressed in the image, and PBVS on the right, where the features $s$ and $s^*$ are related to the pose between the camera and the observed object

Robot Visual Control, Fig. 3 In visual servoing, the vision sensor can be either mounted on the robot (eye-in-hand configuration) or remote and observing the robot (eye-to-hand configuration). For the same robot motion, the motion produced in the image will be opposite from one configuration to the other

In the eye-to-hand configuration, the features
Jacobian $J_s$ is composed of (Chaumette and
Hutchinson 2008)

$$ J_s = -L_s \, {}^{c}V_f \, {}^{f}V_n \, J(q) \tag{5} $$

where

• ${}^{f}V_n$ is the spatial motion transform matrix
from the robot reference frame to the end
effector frame. It is known from the robot
kinematics model.

• ${}^{c}V_f$ is the spatial motion transform matrix
from the camera frame to the reference frame.
It is constant as long as the camera does not
move. In that case, similarly to the eye-in-hand
configuration, a coarse approximation
of ${}^{c}R_f$ and ${}^{c}t_f$ is usually sufficient to get an
estimation of ${}^{c}V_f$.


Many works have addressed the modeling
of the visual features and the determination of
the analytical form of the interaction matrix. To
give just one example, in the case of an image
point with normalized Cartesian coordinates
$\mathbf{x} = (x, y)$ and whose corresponding 3D point has
depth $Z$, its interaction matrix is given by (Espiau
et al. 1992)

$$
L_{\mathbf{x}} = \begin{bmatrix}
-1/Z & 0 & x/Z & xy & -(1 + x^2) & y \\
0 & -1/Z & y/Z & 1 + y^2 & -xy & -x
\end{bmatrix} \tag{6}
$$

where the first three columns contain the elements
related to the three components of the
translational velocity and where the last three
columns contain the elements related to the three
components of the rotational velocity.
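The interaction matrix of an image point is simple enough to be written out directly. The sketch below is a plain transcription of the 2x6 matrix above, using nested lists; the function names `interaction_matrix_point` and `feature_velocity` are hypothetical, and the velocity vector is assumed to be ordered as three translational components followed by three rotational components, as stated in the text.

```python
def interaction_matrix_point(x, y, Z):
    """2x6 interaction matrix of a normalized image point (x, y) at depth Z."""
    if Z <= 0:
        raise ValueError("depth Z must be positive")
    return [
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
    ]


def feature_velocity(L, v):
    """Image velocity (x_dot, y_dot) = L v for a 6-component velocity v."""
    return [sum(lij * vj for lij, vj in zip(row, v)) for row in L]
```

For a point at the image center with depth Z = 2, a unit translational velocity along the first axis gives `feature_velocity(interaction_matrix_point(0.0, 0.0, 2.0), [1, 0, 0, 0, 0, 0])`, i.e., an image velocity of -0.5 along x and 0 along y: the larger the depth, the smaller the image motion induced by a translation.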

By just changing the parameters representing
the same image point, that is, by using the
cylindrical coordinates defined by $\gamma = (\rho, \theta)$ with
$\rho = \sqrt{x^2 + y^2}$ and $\theta = \arctan(y/x)$, the
interaction matrix of these parameters has a
completely different form (Chaumette and Hutchinson
2008):

$$
L_{\gamma} = \begin{bmatrix}
-c/Z & -s/Z & \rho/Z & (1 + \rho^2)\,s & -(1 + \rho^2)\,c & 0 \\
s/(\rho Z) & -c/(\rho Z) & 0 & c/\rho & s/\rho & -1
\end{bmatrix} \tag{7}
$$

where $c = \cos\theta$ and $s = \sin\theta$. This implies
that using the Cartesian coordinates or the
cylindrical coordinates as visual features will induce
a different behavior, that is, a different robot
trajectory and a different trajectory of the point
in the image.

Currently, the analytical form of the interaction
matrix is available for most classical geometrical
primitives, such as segments, straight lines,
ellipses, moments related to planar objects of any
shape (Chaumette 2004), and also coordinates of
3D points and pose parameters. Methods also
exist to estimate off-line or online a numerical
value of the interaction matrix. Omnidirectional
vision sensors, the coupling between a camera
and structured light, and even 2D echographic
probes have also been studied. A large variety of
visual features is thus available for many vision
sensors.

Once the modeling step has been performed,
the design of the control scheme can be quite
simple. The most classical control scheme has
the following form (Chaumette and Hutchinson
2008):

$$
\dot{q} = -\lambda\, \widehat{J_s}^{+} (s - s^*) + \widehat{J_s}^{+} \frac{\partial s^*}{\partial t} - \widehat{J_s}^{+} \widehat{\frac{\partial s}{\partial t}} \tag{8}
$$

where $\lambda$ is a positive gain tuning the rate of
convergence of the system and $\widehat{J_s}^{+}$ is the
Moore-Penrose pseudo-inverse of an approximation or
an estimation of the features Jacobian. The exact
value of all its elements is indeed generally
unknown since it depends on the intrinsic and
extrinsic camera parameters, as well as on some
3D parameters such as the depth of the point in
Eqs. (6) and (7).

The second term of the control scheme anticipates
the variation of $s^*$ in the case of a
nonconstant desired value. The third term
compensates as much as possible for a possible target
motion in the eye-in-hand case and a possible
camera motion in the eye-to-hand case. They are
both null in the case of a fixed desired value and a
motionless target or camera. They try to remove
the tracking error in the other cases.
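As a worked illustration of the first term of the control scheme, consider centering a static image point with a camera that only rotates. The sketch below makes several simplifying assumptions that are not in the text: the camera rotates about its optical center, the two controlled velocities are directly the rotational components ω_x and ω_y of the camera velocity (so the robot Jacobian is the identity), the interaction matrix of the point is known exactly, and the target does not move, so the two feedforward terms vanish. Keeping only the ω_x and ω_y columns of the point interaction matrix gives a 2x2 matrix whose determinant is 1 + x² + y², hence always invertible. The function names `pan_tilt_command` and `simulate` are hypothetical.

```python
def pan_tilt_command(x, y, x_des=0.0, y_des=0.0, gain=1.0):
    """Rotational velocities (w_x, w_y) driving the point (x, y) to (x_des, y_des)."""
    # Columns of the point interaction matrix related to w_x and w_y.
    a, b = x * y, -(1.0 + x * x)
    c, d = 1.0 + y * y, -x * y
    det = a * d - b * c  # equals 1 + x^2 + y^2, always positive
    ex, ey = x - x_des, y - y_des
    # Solve L w = -gain * e with the closed-form 2x2 inverse.
    w_x = -gain * (d * ex - b * ey) / det
    w_y = -gain * (-c * ex + a * ey) / det
    return w_x, w_y


def simulate(x, y, gain=2.0, dt=0.01, steps=300):
    """Euler simulation of the closed loop for a static point; returns the final (x, y)."""
    for _ in range(steps):
        w_x, w_y = pan_tilt_command(x, y, gain=gain)
        # Image motion induced by the camera rotation.
        x_dot = x * y * w_x - (1.0 + x * x) * w_y
        y_dot = (1.0 + y * y) * w_x - x * y * w_y
        x, y = x + dt * x_dot, y + dt * y_dot
    return x, y
```

Because the exact matrix is inverted here, the error obeys ė = -λ e and each Euler step scales it by (1 - gain·dt); with an approximate matrix, as is the case in practice, the decrease is no longer exactly exponential.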

Following the Lyapunov theory, the stability
of the system can be studied (Chaumette
and Hutchinson 2008). Generally, visual servoing
schemes can be demonstrated to be locally
asymptotically stable (i.e., the robot will converge
if it starts from a local neighborhood of
the desired pose) if the errors introduced in $\widehat{J_s}$
are not too large. Some particular visual servoing
schemes can be demonstrated to be globally
asymptotically stable (i.e., the robot will
converge whatever its initial pose) under similar
conditions.

Finally, when the visual features do not con¬ 
strain all the robot degrees of freedom, it is 
possible to combine the visual task with supple¬ 
mentary tasks such as, for instance, joint limits 
avoidance or the visibility constraint (to be sure 
that the target considered will always remain 
in the camera field of view). In that case, the 
redundancy framework can be applied and the 
error to be regulated to zero has the following 
form: 

$$e = \widehat{J_s}^{+}(s - s^*) + (I - \widehat{J_s}^{+}\widehat{J_s})\,e_2 \qquad (9)$$

where $(I - \widehat{J_s}^{+}\widehat{J_s})$ is a projection operator onto the null space of the visual task, so that the supplementary task $e_2$ is achieved as well as possible under the constraint that the visual task is realized (Espiau et al. 1992). A control scheme similar to (8) is now given by

$$\dot{q} = -\lambda e - \frac{\widehat{\partial e}}{\partial t}. \qquad (10)$$

This scheme has, for instance, been applied in the first example depicted in Fig. 4, where the rotational motion of the mobile robot is controlled by vision, while its translational motion is controlled by odometry to move at a constant velocity.
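The redundancy framework can be sketched numerically as follows. This is a minimal example under assumed values: the Jacobian, the feature error, and the secondary task vector are hypothetical, and the error is formed as the pseudoinverse term plus the secondary task projected onto the null space of the visual task.

```python
import numpy as np

def redundant_task_error(J_hat, s, s_star, e2):
    """Combined error: visual task plus secondary task projected on its null space."""
    J_pinv = np.linalg.pinv(J_hat)
    n = J_hat.shape[1]
    N = np.eye(n) - J_pinv @ J_hat            # null-space projector of the visual task
    return J_pinv @ (s - s_star) + N @ e2, N

# Two visual features constrain only the first two of three degrees of freedom.
J = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
e, N = redundant_task_error(J, np.array([0.3, -0.1]), np.zeros(2), np.ones(3))
print(e)          # secondary task acts only on the unconstrained third axis
print(J @ N)      # zero matrix: the projected task does not disturb the visual task
```

Because $\widehat{J_s}(I - \widehat{J_s}^{+}\widehat{J_s}) = 0$, any motion generated by the projected secondary task leaves the visual features unchanged to first order.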

Applications 

Potential applications of visual servoing are nu¬ 
merous. It can be used as soon as a vision sensor 
is available and a task is assigned to a dynamic 
system to control its motion. A non-exhaustive 
list of examples is (see Fig. 4): 

• The control of a pan-tilt-zoom camera, as 
illustrated in Fig. 1 for the pan-tilt case 

• Grasping using a robot arm 

• Locomotion and dexterous manipulation with
a humanoid robot

• Micro- or nano-manipulation of MEMS or
biological cells

• Pipe inspection by an underwater autonomous
vehicle

• Autonomous navigation of a mobile robot in
indoor or outdoor environment

• Aircraft landing

• Autonomous satellite rendezvous

• Biopsy using ultrasound probes or heart motion
compensation in medical robotics

• Virtual cinematography in animation

Robot Visual Control, Fig. 4 Few applications of visual servoing: navigation of a mobile robot to follow a wall using an omnidirectional vision sensor (top row), grasping a ball with a humanoid robot (middle row), assembly of MEMS and film of a dialogue within the constraints of a script in animation (bottom row)

Summary and Future Directions 

Visual servoing is basically a nonlinear control 
problem. Several modeling works have been re¬ 
alized to design visual features so that the control 
problem is transformed as much as possible to a
linear control problem, leading to better stability 
properties. On one hand, improvements on this 
topic are still expected. On the other hand, the 
design of advanced control schemes, such as 
optimal control or model predictive control, is 
another way to make improvements. Then, taking 
into account dynamic constraints, such as non- 
holonomic constraints or underactuated systems, 
also necessitates the design of specific control 
laws. 

Cross-References 

► Lyapunov’s Stability Theory 

► Redundant Robots 

► Robot Motion Control 

Recommended Reading 

In addition to the classical tutorial Hutchinson 
et al. (1996) and the most recent one Chaumette 
and Hutchinson (2008), the books Corke (1997, 
2011) and the collection of papers Hashimoto 
(1993), Kriegman et al. (1998), and Chesi et al. 
(2010) provide a good overview of past and 
recent works in the field. The other references 
below cited in text present the main pioneering 
contributions in visual servoing. 

Abstract

Robust adaptive control pertains to the satis¬ 
factory behavior of adaptive control systems 
in the presence of nonparametric perturbations 
such as disturbances, unmodeled dynamics, and 
time delays. This article covers the highlights of 
robust adaptive controllers, methods used, and 
results obtained. Both methods of achieving 
robustness, which include modifications in 
the adaptive law and persistent excitation 
in the reference input, are presented. In 
both cases, results obtained for robustness 
to disturbances and unmodeled dynamics are 
discussed. 


Keywords 

Dead zone; Global boundedness; Parameter 

Introduction

The central problem in adaptive control pertains 
to regulation and tracking of systems in the pres¬ 
ence of parametric uncertainties. The classical 
adaptive control problem solved in 1980 assumed 
that the underlying transfer function had un¬ 
known parameters, but no other uncertainties. No 
disturbances, delays, time variations in parame¬ 
ters, or unmodeled dynamics were assumed to 
be present. Under these ideal conditions, it was 
shown that an adaptive controller can be designed 
so that the closed-loop system has bounded sig¬ 
nals and that asymptotic regulation and tracking 
were possible. 

With asymptotic regulation and tracking 
achieved under such ideal conditions, the goal 
of robust adaptive control was to ensure globally 
bounded signals in the closed-loop adaptive 
system when the plant was subjected to a 
variety of nonparametric perturbations such as 
external disturbances, time-varying parameters, 
unmodeled dynamics, and time delays. With 
adaptation in the control parameters in the ideal 
case accommodating parametric uncertainties, 
the approaches developed in robust adaptive 
control focused on developing solutions in the 
perturbed case to accommodate nonparametric 
uncertainties and improving on the classical 
adaptive controller which either underperformed 
or even exhibited instability with the introduction 
of nonparametric perturbations. 

We briefly present the adaptive control solu¬ 
tions for the ideal case before proceeding with 
robust adaptive control. 


Classical Adaptive Control 

Adaptive Control of Plants with State 
Feedback 

One of the very first problems for which stable adaptive control was solved is the case when the states are accessible (Narendra and Kudva 1972), with the plant given by (the argument $t$ is suppressed for the sake of convenience, except for emphasis)

$$\dot{x}_p = A_p x_p + b\lambda u \qquad (1)$$

where $A_p \in \mathbb{R}^{n\times n}$ and the scalar $\lambda$ are unknown parameters, with $b$ and the sign of $\lambda$ known and $(A_p, b)$ controllable. An adaptive controller that ensures global boundedness and asymptotic regulation and tracking for such plants is of the form

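$$u = \theta_x^T(t)\,x_p + \theta_r(t)\,r, \qquad (2)$$

and the adaptive laws for adjusting the unknown parameters are given by

$$\dot{\theta} = -\operatorname{sign}(\lambda)\,\Gamma\,\omega\,b_m^T P e, \qquad (3)$$

where $\omega = [x_p^T, r]^T$, $\theta = [\theta_x^T, \theta_r]^T$, $e = x_p - x_m$ is the tracking error, and $x_m$ is the state of a reference model

$$\dot{x}_m = A_m x_m + b\,r \qquad (4)$$

with $A_m$ Hurwitz, $b_m = b$ the input vector of the reference model, and $P$ the solution of the Lyapunov equation $A_m^T P + P A_m = -Q$, $Q > 0$. The reference model in (4) is to be chosen so that certain matching conditions are satisfied, which are of the form

$$A_p + b\lambda\theta_x^{*T} = A_m, \qquad \lambda\theta_r^* = 1 \qquad (5)$$

for some $\theta^* = [\theta_x^{*T}, \theta_r^*]^T$. In such a case, the controller in (2) and (3) guarantees stability and ensures that $x_p(t)$ tracks $x_m(t)$. The underlying Lyapunov function is quadratic in $e$ and the parameter error $\tilde{\theta} = \theta - \theta^*$, given by

$$V = \tfrac{1}{2}\left[e^T P e + \lambda\,\tilde{\theta}^T\Gamma^{-1}\tilde{\theta}\right] \qquad (6)$$

with a negative semi-definite time derivative $\dot{V}$ given by

$$\dot{V} \le -\tfrac{1}{2}\,e^T Q e. \qquad (7)$$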
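The state-feedback scheme can be illustrated with a scalar simulation. The sketch below is not taken from the article: the plant and reference-model values, the adaptation gain, the sinusoidal reference, and the forward-Euler integration are all assumptions chosen for the example, with $b = 1$ so that $b^T P e$ reduces to $pe$.

```python
import numpy as np

def simulate_mrac(a_p=1.0, lam=2.0, a_m=-2.0, gamma=5.0, q=1.0, dt=0.005, t_end=200.0):
    """Euler simulation of scalar state-feedback adaptive control (b = 1)."""
    p = q / (2.0 * abs(a_m))                       # solves 2 * a_m * p = -q
    theta_star = np.array([(a_m - a_p) / lam, 1.0 / lam])   # matching conditions
    theta = np.zeros(2)                            # [theta_x, theta_r]
    x = x_m = 0.0
    n = int(t_end / dt)
    t_hist = np.arange(n) * dt
    e_hist = np.empty(n)
    v_hist = np.empty(n)
    for k in range(n):
        r = np.sin(t_hist[k])
        omega = np.array([x, r])
        e = x - x_m
        u = theta @ omega
        th_err = theta - theta_star
        e_hist[k] = e
        v_hist[k] = 0.5 * (p * e * e + abs(lam) * (th_err @ th_err) / gamma)
        x += dt * (a_p * x + lam * u)              # plant (a_p and lam unknown to the controller)
        x_m += dt * (a_m * x_m + r)                # reference model
        theta += dt * (-np.sign(lam) * gamma * omega * p * e)   # adaptive law
    return t_hist, e_hist, v_hist

t, e, V = simulate_mrac()
print(np.abs(e[t < 20]).max(), np.abs(e[t > 180]).max(), V[0], V[-1])
```

In this idealized setting, the Lyapunov function evaluated along the simulated trajectory decreases and the tracking error shrinks over time, even though the open-loop plant is unstable. The next sections discuss why this picture changes once disturbances or unmodeled dynamics are present.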

Adaptive Control of Plants with Output 
Feedback 

Consider the single-input single-output (SISO) system of equations

$$y(t) = W(s)u(t) \qquad (8)$$

where $u \in \mathbb{R}$ is the input, $y \in \mathbb{R}$ is the measurable output, and $s$ is the differential operator. The transfer function of the plant is parameterized as

$$W(s) = k_p\frac{Z(s)}{P(s)} \qquad (9)$$

where $k_p$ is a scalar and $Z(s)$ and $P(s)$ are monic polynomials with $\deg(Z(s)) < \deg(P(s))$. The following assumptions will be made throughout:

Assumption 1 $W(s)$ is minimum phase.

Assumption 2 The sign of $k_p$ is known.

Assumption 3 The relative degree $n^*$ and the order of $W(s)$ are known.

The goal is to design a control input $u$ so that the output $y$ in (8) tracks the output $y_m$ of the reference system

$$y_m(t) = W_m(s)r(t) = k_m\frac{Z_m(s)}{P_m(s)}r(t) \qquad (10)$$

where $k_m$ is a scalar and $Z_m(s)$ and $P_m(s)$ are monic polynomials, with $W_m(s)$ of relative degree $n^*$.

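The structure of the adaptive controller is now presented:

$$\dot{\omega}_1(t) = \Lambda\omega_1 + b_\lambda u(t) \qquad (11)$$

$$\dot{\omega}_2(t) = \Lambda\omega_2 + b_\lambda y(t) \qquad (12)$$

$$\omega(t) = [r(t), \omega_1^T(t), y(t), \omega_2^T(t)]^T \qquad (13)$$

$$\theta(t) = [k(t), \theta_1^T(t), \theta_0(t), \theta_2^T(t)]^T \qquad (14)$$

$$u = \theta^T(t)\,\omega \qquad (15)$$

where $\Lambda \in \mathbb{R}^{(n-1)\times(n-1)}$ is Hurwitz, $b_\lambda \in \mathbb{R}^{n-1}$, $\omega_1, \omega_2 \in \mathbb{R}^{n-1}$, and $\theta \in \mathbb{R}^{2n}$ is an adaptive gain vector with $k(t) \in \mathbb{R}$, $\theta_1(t) \in \mathbb{R}^{n-1}$, $\theta_2(t) \in \mathbb{R}^{n-1}$, and $\theta_0(t) \in \mathbb{R}$.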

The update law for the adaptive parameter 
differs depending on whether the relative degree 
of W m (s) is unity or greater than one and can be 
described as follows: 


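$$\dot{\theta}(t) = -\operatorname{sign}(k_p)\,\Gamma\,e_y\,\omega, \qquad n^* = 1 \qquad (16)$$

and

$$\dot{\theta}(t) = -\operatorname{sign}(k_p)\,\Gamma\,\frac{e_a\,\zeta}{1 + \zeta^T\zeta}, \qquad n^* \ge 2 \qquad (17)$$

where $e_y = y - y_m$, $e_a$ is an augmented error, and $\zeta$ is a modified regressor, both of which are determined by the following equations:

$$\zeta = W_m(s)\,\omega, \qquad \omega = [r, \omega_1^T, y, \omega_2^T]^T, \qquad (18)$$

$$e_2 = \theta^T\zeta - W_m(s)[\theta^T\omega] \qquad (19)$$

$$e_a = e_y + k_1(t)\,e_2 \qquad (20)$$

$$\dot{k}_1 = -\frac{e_a\,e_2}{1 + \zeta^T\zeta} \qquad (21)$$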

The results of Narendra and Annaswamy (2005) guarantee that, with the above adaptive controller in Eqs. (11)-(21), $e_y(t)$ tends to zero as $t \to \infty$ with all signals remaining bounded in both the $n^* = 1$ and $n^* \ge 2$ cases.

Need for Robust Adaptive Control 

When a disturbance $\eta$ is present, the plant dynamics often take the form

$$\dot{x}_p = A_p x_p + b\lambda(u + \eta(t)) \qquad (22)$$

while the reference model and the controller remain the same as in (4) and (2), respectively. This in turn necessitates new tools for the analysis and synthesis of adaptive systems. The main reason for this is the fact that the standard Lyapunov function candidate given by

$$V = \tfrac{1}{2}\,e^T P e + \tfrac{1}{2}\,\lambda\,\tilde{\theta}^T\Gamma^{-1}\tilde{\theta} \qquad (23)$$

together with the parameter adjustment as in (3) yields a time derivative

$$\dot{V} \le -\tfrac{1}{2}\,e^T Q e + k_1\|e\|\,d_0, \qquad k_1 > 0, \qquad (24)$$

where $d_0$ is an upper bound on the perturbation $\eta$. The second term on the right-hand side of (24) causes $\dot{V}$ to be sign indefinite. This is because $V$ is a function of both $e$ and $\tilde{\theta}$, whereas the bound on $\dot{V}$ involves only $e$; the second term can therefore be large compared to the first while the second argument of $V$, $\tilde{\theta}$, is arbitrary. The same property is what caused $\dot{V}$ to be only semi-definite in the ideal case. Hence, in this perturbed case, no guarantees of boundedness can be provided. In fact, it can be shown that if $\eta(t)$ is chosen in a particular manner, the closed-loop signals become unbounded, either in the presence of bounded disturbances (Narendra and Annaswamy 2005) or with unmodeled dynamics (Rohrs et al. 1985). This in turn led to the area of robust adaptive control.

Various approaches that have been developed under the rubric of robust adaptive control can be grouped into two categories. The first of these is related to modifications in the adaptive laws so as to ensure boundedness. These changes consist of modifications in the adaptive law (3) as

$$\dot{\theta} = -\Gamma\,\omega(t)\,b_m^T P e - \sigma\,g(\theta, e) \qquad (25)$$

The problem then reduces to finding a suitable $g(\theta, e)$. This is discussed in detail in the next section. The second approach used in adaptive control pertains to the use of a persistently exciting reference signal $r$. The latter ensures parameter convergence of the adaptive system and therefore exponential stability. This in turn ensures robustness of the overall system. These details are addressed in section "Robust Adaptive Control with Persistently Exciting Reference Input."


Robust Adaptive Control with 
Modifications in the Adaptive Law 

Robustness to Bounded Disturbances 

When a bounded input disturbance $\eta$ is present, the plant dynamics become

$$\dot{x}_p = A_p x_p + b\lambda(u + \eta(t)), \qquad (26)$$

while the reference model and the controller remain the same as in (4) and (2), respectively. As mentioned above, a modification to the adaptive law as in (25) is needed. Over the years, different choices have been suggested for the nonlinear function $g(\theta, e)$. For example, these are chosen as


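$$g(\theta, e) = \begin{cases} \theta & \text{Ioannou and Sun (2013)} \\ \|e\|\,\theta & \text{Narendra and Annaswamy (2005)} \\ \theta\left(1 - \dfrac{\|\theta\|}{\theta_{\max}}\right)^{2} & \text{Kreisselmeier and Narendra (1982)} \end{cases} \qquad (27)$$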


where $\theta_{\max}$ is a known bound on the parameter $\theta$. (One can choose to set $\sigma$ to zero if $\|\theta\| \le \theta_{\max}$, as is done in Ioannou and Sun (2013), Tsakalis and Ioannou (1987), and many other references in the literature.) An alternate approach that is different from (25) is to not have an additive term $g(\cdot,\cdot)$ but rather to set $\dot{\theta} = 0$ whenever the error $e$ is small in some sense. Such a dead zone approach was suggested, for example, in Egardt (1979) and Peterson and Narendra (1982). It can be shown that each one of these choices leads to boundedness, which is described below. Without loss of generality, we assume that $\lambda > 0$.


With the same Lyapunov function candidate as in (23), its time derivative now becomes

$$\dot{V} \le -\tfrac{1}{2}\,e^T Q e + k_1\|e\|\,\|\eta\| - \lambda\sigma\,\tilde{\theta}^T g(\theta, e), \qquad k_1 > 0 \qquad (28)$$

The property of $g(\cdot,\cdot)$, together with the fact that $\eta$ is bounded, ensures that $\dot{V} < 0$ outside a compact set $\Omega$ in the $(e, \tilde{\theta})$ space. This ensures global boundedness of both $e$ and $\tilde{\theta}$. Boundedness of $x_p$ follows.








In all of the above methods, the idea behind adding the term $g(\theta, e)$ is this: the parameter $\theta$ can drift away from the correct direction due to the term $k_1\|e\|\,\|\eta\|$, and the construction of $g(\theta, e)$ is such that it counteracts this drift and keeps the parameter in check, by adding a negative quadratic term in $\tilde{\theta}$. The boundedness of both $e$ and $\tilde{\theta}$ is simultaneously assured in the above since $V$ has a time derivative $\dot{V}$ that is nonpositive outside a compact set in the $(e, \tilde{\theta})$ space. It should be noted, however, that this was possible to a large extent because $\eta$ was bounded, and as a result, the sign-indefinite term remained linear in $\|e\|$.
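A modified adaptive law of this kind can be sketched as below. This is a minimal illustration, assuming the first two choices of the leakage function (the one proportional to the parameter and the one scaled by the error norm); all function names and numerical values are hypothetical.

```python
import numpy as np

def g_sigma(theta, e):
    """Leakage proportional to the parameter itself."""
    return theta

def g_e_mod(theta, e):
    """Leakage scaled by the norm of the tracking error."""
    return np.linalg.norm(e) * theta

def modified_law(theta, e, omega, b_m, P, Gamma, sigma, g):
    """Right-hand side of the adaptive law with an added leakage term."""
    nominal = -(Gamma @ omega) * (b_m @ P @ e)     # standard gradient-like term
    return nominal - sigma * g(theta, e)           # leakage pulls theta toward the origin

# Hypothetical values for a plant with two states (omega = [x_p, r] has three entries).
theta = np.array([2.0, -1.0, 0.5])
omega = np.array([0.5, 1.0, -1.0])
b_m, P, Gamma = np.array([0.0, 1.0]), np.eye(2), np.eye(3)

rate_sigma = modified_law(theta, np.zeros(2), omega, b_m, P, Gamma, 0.1, g_sigma)
rate_emod = modified_law(theta, np.zeros(2), omega, b_m, P, Gamma, 0.1, g_e_mod)
print(rate_sigma, rate_emod)
```

With zero tracking error the first choice still drives the parameter estimate toward the origin, whereas the second switches the leakage off; this is one way to see the trade-off between the two modifications.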

An alternative procedure, originally proposed in Pomet and Praly (1992) and revised and refined in Khalil (2001) and Lavretsky (2010), proceeds in a slightly different manner. Here, the boundedness of $\theta$ is first established, independent of the error equation. It should be noted that a similar approach is adopted in the context of output feedback in plants with higher relative degree by using normalization and an augmented error approach (Narendra and Annaswamy 2005). In Khalil (2001) and Lavretsky (2010), no normalization is used; a projection algorithm is used instead. This is described below.

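The projection algorithm for adjusting the parameter $\theta$ is given by

$$\dot{\theta} = \operatorname{Proj}(\theta, y), \qquad (29)$$

where

$$\operatorname{Proj}(\theta, y) = \begin{cases} y - \dfrac{\nabla f(\theta)\,(\nabla f(\theta))^T}{\|\nabla f(\theta)\|^2}\,y\,f(\theta) & \text{if } [f(\theta) > 0 \wedge y^T\nabla f(\theta) > 0] \\ y & \text{otherwise} \end{cases} \qquad (30)$$

$$y = -e^T P b\,\omega \qquad (31)$$

$$f(\theta) = \frac{\|\theta\|^2 - \theta_{\max}'^{\,2}}{\varepsilon^2 + 2\varepsilon\,\theta'_{\max}} \qquad (32)$$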
where $\theta'_{\max}$ and $\varepsilon$ are arbitrary positive constants, and $\Omega_0$ and $\Omega_1$ are defined as

$$\Omega_0 = \{\theta \in \mathbb{R}^n \mid f(\theta) \le 0\}, \qquad \Omega_1 = \{\theta \in \mathbb{R}^n \mid f(\theta) \le 1\}. \qquad (33)$$

From the above relations, one can show that

$$\theta(0) \in \Omega_0 \implies \theta(t) \in \Omega_1 \quad \forall t \ge 0.$$

In addition,

$$\theta'_{\max} = \max_{\theta \in \Omega_0}\|\theta\|, \qquad \theta_{\max} = \max_{\theta \in \Omega_1}\|\theta\| \qquad (34)$$

where $\theta_{\max} = \theta'_{\max} + \varepsilon$ (Matsutani et al. 2011).

Robustness to Unmodeled Dynamics

One of the major observations in the early eighties was the stark difference between the system signals in the ideal adaptive system and the perturbed adaptive system when the perturbation was due to commonly present unmodeled dynamics, such as those of an actuator used for control implementation. Among other references, Rohrs et al. (1985) pointed out that when an adaptive controller prescribed for a first-order plant is evaluated with unmodeled dynamics present, instability occurs readily and for a wide range of command signals. A number of solutions have been suggested to alleviate this instability; they form the subject matter of this section.

We consider the plant in (26) with additional unmodeled dynamics so that

$$\dot{x}_p = A_p x_p + b\lambda v, \qquad \dot{x}_\eta = A_\eta x_\eta + b_\eta u, \quad v = c_\eta^T x_\eta \qquad (35)$$

where $A_\eta$ is a Hurwitz matrix. If $\eta = v - u$, then
the plant dynamics can be rewritten as

$$\dot{x}_p = A_p x_p + b\lambda(u + \eta) \qquad (36)$$

Unlike the bounded disturbance case, no upper bound $d_0$ can be assumed to exist, as $\eta$ is a state-dependent disturbance. It is this that causes a major difference between deriving boundedness in section "Robustness to Bounded Disturbances" and here in section "Robustness to Unmodeled Dynamics." Significant effort has been expended in the adaptive control community in this regard. The results fall into two categories: (i) those that assure global boundedness for a narrow class of unmodeled dynamics and (ii) those that assure semi-global boundedness for a slightly larger class of unmodeled dynamics. More recently, some results have been obtained that are able to establish global boundedness with minimal restrictions on the unmodeled dynamics. In what follows, we give examples of each of the above two cases as well as the recent results.

Global Boundedness in the Presence of a Small Class of Unmodeled Dynamics
For the plant in (26), under assumptions in (5), the plant can be rewritten as

$$\dot{x}_p = A_m x_p + b\lambda(u - \theta_x^{*T} x_p + \eta) \qquad (37)$$

where $\lambda$ and $\theta^*$ are unknown, $A_m$ and $b$ are known, and $\eta = v - u$, whose state-space representation can be shown to be of the form

$$\dot{x}_\eta = A_\eta x_\eta + b_\eta u, \qquad \eta = c_\eta^T x_\eta \qquad (38)$$

for some vector $c_\eta$.

For a class of unmodeled dynamics $\{c_\eta, A_\eta, b_\eta\}$, if the control input in (2) and the projection algorithm in (29) with $y$ and $f(\theta)$ chosen as in (31) and (32) are used, one can guarantee boundedness. In particular, if the inequality

kO x , max An ... < 1 (39)

is satisfied, where $b_0$ is an upper bound on $\|b_\eta\|$ and $\sigma_A$ denotes the singular value of the matrix $A_\eta$, then boundedness follows. That is, robustness of adaptive controllers can be ensured if the unmodeled dynamics is fast and their zeros are restricted in some sense.

A specific example of such an unmodeled dynamics is given by

$$c_\eta^T(sI - A_\eta)^{-1} b_\eta = -\frac{\mu s}{1 + \mu s}. \qquad (40)$$

Global Boundedness for a Large Class of Unmodeled Dynamics: A First-Order Example
A different approach can be taken for the problem of unmodeled dynamics which allows a global result, for a class of adaptive systems (Hussain et al. 2013). The main idea here is to use the projection algorithm and use properties of adaptive systems in conjunction with linear time-varying systems. This is presented in this section using a first-order plant.

We consider the control of

$$\dot{x}_p(t) = a_p x_p(t) + k_p v(t) \qquad (41)$$

where $a_p$ is unknown and $k_p$ is known. It is assumed that $|a_p| \le \bar{a}$, where $\bar{a}$ is a known positive constant. The unmodeled dynamics is given by (38) with

$$G_\eta(s) = c_\eta^T(sI - A_\eta)^{-1} b_\eta \qquad (42)$$

The goal is to design the control input such that $x_p(t)$ follows $x_m(t)$, which is specified by the reference model

$$\dot{x}_m(t) = a_m x_m(t) + k_m r(t) \qquad (43)$$

where $a_m < 0$ and $r(t)$ is the reference input. The adaptive controller we propose is given by

$$u(t) = \theta(t)\,x_p(t) + \frac{k_m}{k_p}\,r(t) \qquad (44)$$

where the parameter $\theta(t)$ is updated using a projection algorithm given by
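$$\dot{\theta}(t) = \gamma\operatorname{Proj}\big(\theta(t), -x_p(t)(x_p(t) - x_m(t))\big), \qquad \gamma > 0 \qquad (45)$$

and

$$\operatorname{Proj}(\theta, y) = \begin{cases} \dfrac{\theta_{\max}^2 - \theta^2}{\theta_{\max}^2 - \theta_{\max}'^{\,2}}\,y & \text{if } [\theta \in \Omega_A,\ y\theta > 0] \\ y & \text{otherwise} \end{cases} \qquad (46)$$

$$\Omega_0 = \{\theta \in \mathbb{R}^1 \mid -\theta'_{\max} \le \theta \le \theta'_{\max}\}, \quad \Omega_1 = \{\theta \in \mathbb{R}^1 \mid -\theta_{\max} \le \theta \le \theta_{\max}\}, \quad \Omega_A = \Omega_1 \setminus \Omega_0 \qquad (47)$$

with positive constants $\theta'_{\max}$ and $\theta_{\max}$ given by

$$\theta'_{\max} > \frac{\bar{a}}{k_p}, \qquad \theta_{\max} = \theta'_{\max} + \varepsilon_0, \qquad (48)$$

where $\varepsilon_0$ is an arbitrary constant. It can be shown that if $\theta_{\max}$ is chosen as in (48), then the closed-loop adaptive system specified by Eqs. (41)-(48) always has guaranteed bounded solutions for a class of unmodeled dynamics $G_\eta(s)$. There is an optimal value of $\varepsilon_0$, however, for which a largest class of $G_\eta(s)$ can be found.

It should be noted that in the Rohrs example in Rohrs et al. (1985), the plant is first order, with $a_p = -1$, and

$$G_\eta(s) = \frac{\omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}, \qquad (49)$$

for $\zeta = 1$, $\omega_n = 15$. It is easy to show that for these values of $\zeta$ and $\omega_n$, if $\theta_{\max} = 17$, then Eq. (48) is satisfied and the closed-loop system is robust to $G_\eta$.

In general, for a first-order plant as in (41), it can be shown that the adaptive system is robust for $G_\eta$ for all $(\zeta, \omega_n)$ that satisfy the following inequalities for all $|a_p| \le \bar{a}$:
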


5.2 , rs \ S- kp $max 

-Clpt, + f\dp , &>«)£ - 


> 0 


(50) 


(D n > ^n n 


where 


f(a p ,co n ) 


a \ + <4 

2 (O n 


= max 


2 ^ 



cik p 0 mdx 


(51) 


When a time delay $\tau$ is present in the plant to be controlled, the plant under consideration can be represented as in (37) where

$$\eta(t) = u(t - \tau) - u(t)$$

Similar results of global boundedness can be derived in this case as well (Matsutani 2013; Matsutani et al. 2012, 2013).
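The scalar projection idea used for the first-order plant above can be sketched in a few lines. This is an illustrative sketch only: it assumes that, in the boundary layer between the inner and outer bounds, an outward-pointing update is scaled by $(\theta_{\max}^2 - \theta^2)/(\theta_{\max}^2 - \theta_{\max}'^{\,2})$, and the bounds, gain, and constant driving signal are hypothetical values.

```python
def proj_scalar(theta, y, theta_inner, theta_outer):
    """Scalar projection: scale outward-pointing updates in the boundary layer."""
    if abs(theta) >= theta_inner and y * theta > 0.0:
        scale = (theta_outer**2 - theta**2) / (theta_outer**2 - theta_inner**2)
        return max(scale, 0.0) * y        # clamp guards against numerical overshoot
    return y                               # inside the inner set, or pointing inward

def run(y_const=1.0, gamma=1.0, theta_inner=1.5, theta_outer=2.0, dt=0.01, steps=1000):
    """Euler integration of theta_dot = gamma * Proj(theta, y) with a constant y."""
    theta = 0.0
    for _ in range(steps):
        theta += dt * gamma * proj_scalar(theta, y_const, theta_inner, theta_outer)
    return theta

theta_final = run()
print(theta_final)   # stays inside the outer bound although y keeps pushing outward
```

The point of the construction is that the parameter estimate remains bounded regardless of the error signal that drives it, which is what allows boundedness of the parameter to be established independently of the error equation.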


Robust Adaptive Control with Persistently Exciting Reference Input

We return to the plant in (26) with the control input as in (2) and the adaptive law as in (3). When $\eta(t)$ is bounded with a finite upper bound $d_0$, it can be shown that no modifications are necessary in the adaptive law to ensure boundedness if the reference input is persistently exciting. It can be shown that if the reference input $r(t)$ is such that the vector $\omega^*$ defined as $\omega^* = [x_m^T, r]^T$ is persistently exciting with

$$\frac{1}{T}\int_{t}^{t+T}\left|\omega^{*T}(\tau)\,w\right| d\tau > k d_0 \qquad \forall t \ge t_0,\ \forall w \in \mathbb{R}^n$$

where $k$, $T$ are finite constants and $w$ is a unit vector, then the adaptive system is well behaved, i.e., has globally bounded solutions (Narendra and Annaswamy 2005).
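A persistent excitation level of this type can be estimated numerically. The sketch below assumes that the condition is read as a time average of $|\omega^{*T}(\tau)w|$ over a window of length $T$, minimized over unit vectors $w$; the two test signals, the window, and the sampled directions are hypothetical choices for illustration.

```python
import numpy as np

def excitation_level(omega, t0, T, directions, n_samples=20000):
    """Min over the given unit vectors w of the windowed average of |omega(tau)^T w|."""
    tau = t0 + T * np.arange(n_samples) / n_samples
    W = omega(tau)                          # shape (n, n_samples)
    proj = np.abs(directions @ W)           # shape (n_dirs, n_samples)
    return proj.mean(axis=1).min()

# Unit vectors in the plane; half a turn suffices because |.| is even in w.
angles = np.linspace(0.0, np.pi, 181)
unit_vectors = np.column_stack([np.cos(angles), np.sin(angles)])

rich = lambda tau: np.vstack([np.sin(tau), np.cos(tau)])   # excites every direction
poor = lambda tau: np.vstack([np.sin(tau), np.sin(tau)])   # confined to one direction

level_rich = excitation_level(rich, 0.0, 2 * np.pi, unit_vectors)
level_poor = excitation_level(poor, 0.0, 2 * np.pi, unit_vectors)
print(level_rich, level_poor)
```

For the first signal the level equals the mean of $|\sin|$ over a period, $2/\pi$, in every direction. For the second it drops to zero along $w = [-1, 1]^T/\sqrt{2}$, so no disturbance bound $d_0 > 0$ could be accommodated by excitation alone.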

An alternative approach for achieving robustness, addressed in Anderson et al. (1986), concerns local stability in the presence of persistently exciting signals. The starting point for this investigation is (35), but when not all states are accessible. Assuming that an output $y = c_p^T x_p$ is measurable and that a controller as in (11)-(15) and a reference model as in (10) are used, the underlying error equation can be written as

$$e_1 = W_m(s)\left(\tilde{\theta}^T\omega + v\right) \qquad (52)$$

where $W_m(s)$ is asymptotically stable, $\tilde{\theta}$ is the parameter error vector, and $v$ is the effect of the unmodeled dynamics $\eta$ at the output. Suppose the standard adaptive law is used and, as a first step, the perturbation $v$ is ignored; the underlying error equation and the adaptive law are then given by

$$e_1 = W_m(s)\,\tilde{\theta}^T\omega \qquad (53)$$

$$\dot{\tilde{\theta}} = -\mu\,e_1\,\omega, \qquad \mu > 0. \qquad (54)$$

If the origin in the $(e_1, \tilde{\theta})$ space of (53) and (54) is exponentially stable, all solutions of (52) are bounded for sufficiently small initial conditions and $v(t)$. Therefore, the question of interest is the set of conditions of persistent excitation that will assure such exponential stability. This is addressed in Anderson et al. (1986). The underlying tool is the method of averaging (Arnold 1982; Hale 1969; Krylov and Bogoliuboff 1943), used in the study of nonlinear oscillations, which addresses the stability property of the differential equation

$$\dot{x} = \mu f(x, t, \mu), \qquad x(0) = x_0 \qquad (55)$$

where $\mu$ is a small parameter. By a process of averaging, the nonautonomous system in (55) is approximated by an autonomous differential equation in $x_{av}$, an averaged value of $x$. This autonomous system, which is easier to analyze, can be used to derive stability properties of (55).

In order to use the method of averaging for robust adaptive control, we write Eqs. (53) and (54) as

$$\begin{bmatrix} \dot{e} \\ \dot{\tilde{\theta}} \end{bmatrix} = \begin{bmatrix} A & b\,\omega^T \\ -\mu\,\omega h^T & 0 \end{bmatrix} \begin{bmatrix} e \\ \tilde{\theta} \end{bmatrix} \qquad (56)$$

where $h^T(sI - A)^{-1}b = W_m(s)$ and $e_1 = h^T e$.

Theorem 1 Let $\omega(t)$ be bounded, almost periodic, and persistently exciting. Then there exists a $c^* > 0$ such that for all $\mu \in (0, c^*]$, the origin of (56) is exponentially stable if

$$\operatorname{Re}\lambda_i\left[\lim_{T\to\infty}\frac{1}{T}\int_0^T \omega(t)\,W_m(s)\,\omega^T(t)\,dt\right] > 0, \qquad \forall i = 1, \ldots, n \qquad (57)$$

and is unstable if

$$\operatorname{Re}\lambda_j\left[\lim_{T\to\infty}\frac{1}{T}\int_0^T \omega(t)\,W_m(s)\,\omega^T(t)\,dt\right] < 0, \qquad \text{for some } j = 1, \ldots, n \qquad (58)$$


In Kokotovic et al. (1985), it is further shown that $\omega(t)$ can be expressed as $\omega(t) = \sum_{k=-\infty}^{\infty}\Omega(i\nu_k)\exp(i\nu_k t)$ and the inequality in (57) can be satisfied if the condition

$$\sum_{k=-\infty}^{\infty}\operatorname{Re}[W_m(i\nu_k)]\,\operatorname{Re}\left[\Omega(i\nu_k)\,\bar{\Omega}^T(i\nu_k)\right] > 0 \qquad (59)$$

is satisfied, where $\bar{\Omega}(i\nu_k)$ is the complex conjugate of $\Omega(i\nu_k)$. Given a general transfer function $W_m(s)$, there exists a large class of functions $\omega$ that satisfies (59), even when $W_m(s)$ is not SPR.

The signal $\omega$ in Theorem 1 is not an independent variable but rather an internal variable of the nonlinear system in (56). Hence, it cannot be shown a priori to be bounded or persistently exciting. If $\omega^*$ represents the signal corresponding to $\omega$ in the reference model, it can be made to satisfy (57) by the proper choice of the reference input. Expressing $\omega = \omega^* + \omega_e$, $\omega$ will also be bounded, persistently exciting, and satisfy (57) if $\omega_e$ is small. This can be achieved by choosing the initial conditions $e(t_0)$ and $\tilde{\theta}(t_0)$ in (56) to be sufficiently small. The conditions of Theorem 1 are then verified, and for a sufficiently small $\mu$, exponential stability of the origin of (56) follows.

Theorem 1 provides conditions for exponential stability and instability when the solutions of the adaptive system are sufficiently close to the tuned solutions. These are very valuable in understanding the stability and instability mechanisms peculiar to adaptive control in the presence of different types of perturbations. Many of these results have been summarized and presented in a unified fashion in Anderson et al. (1986).

Cross-References 

► Adaptive Control, Overview 

► History of Adaptive Control 

► Nonlinear Adaptive Control 

► Stochastic Adaptive Control 

Abstract

Robust control needs to start with a model of 
system uncertainty. What is a good uncertainty 
model? First it needs to capture the possible 
system perturbations and uncertainties. Second 
it needs to be mathematically tractable. The gap 
metric was introduced by Zames and El-Sakkary 
for this purpose. Its study climaxed in an award-winning paper by Georgiou and Smith. A modified gap, called the ν-gap, was later discovered by Vinnicombe and was shown to have advantages. With these metrics in hand, robust stabilization issues can be nicely addressed.

Keywords 

Gap metric; H-infinity control; ν-gap metric;
Robust stabilization; Uncertain system 

Introduction 

The gap is rooted in the mathematical literature, where it serves to measure the distance between unbounded operators (Kato 1976). It was introduced to control theory by Zames and El-Sakkary (1980) to measure the distance between systems and subsequently to model an uncertain system, with the recognition that a possibly unstable system is simply a possibly unbounded operator. Here only continuous-time systems will be treated. Discrete-time systems can be treated in an analogous way. Let us identify a linear time-invariant (LTI) system with its transfer function. The set of $m$-input $p$-output finite-dimensional LTI systems is then identified with the set $\mathcal{R}^{p\times m}$ of $p \times m$ real rational matrices. Such a system can be considered as a linear operator from the input space $\mathcal{H}_2^m$ to the output space $\mathcal{H}_2^p$, defined by the input-output relation $y = Pu$. Here $\mathcal{H}_2$ is the collection of all bounded-energy signals $x(s)$ satisfying

$$\|x\|_2 := \sup_{\sigma > 0}\left(\frac{1}{2\pi}\int_{-\infty}^{\infty}|x(\sigma + j\omega)|^2\,d\omega\right)^{1/2} < \infty.$$

This operator is possibly unbounded since for an input $u \in \mathcal{H}_2^m$, the corresponding output $y = Pu$ is not necessarily in $\mathcal{H}_2^p$. It is bounded if and only if $P$ is stable, i.e., if and only if $P \in \mathcal{RH}_\infty^{p\times m}$, the set of $p \times m$ real rational matrices bounded over $\operatorname{Re} s > 0$. In this case, the induced operator norm is the $\mathcal{H}_\infty$ norm of $P$:

$$\|P\|_\infty = \sup_{\operatorname{Re} s > 0}\bar{\sigma}[P(s)] = \sup_{\omega \in \mathbb{R}}\bar{\sigma}[P(j\omega)].$$

No matter whether or not $P$ is stable, we define the graph of $P$ as

$$\mathcal{G}_P = \left\{\begin{bmatrix} u \\ Pu \end{bmatrix} : u \in \mathcal{H}_2^m,\ Pu \in \mathcal{H}_2^p\right\},$$

i.e., the graph is the set of all finite-energy input-output pairs. It is easy to see that $\mathcal{G}_P$ is a linear subspace of $\mathcal{H}_2^{m+p}$, and a little more effort shows that it is closed. Hence it uniquely corresponds to a bounded linear operator on $\mathcal{H}_2^{m+p}$, called the orthogonal projection onto $\mathcal{G}_P$, denoted by $\Pi_{\mathcal{G}_P}$. Now with two systems $P_1, P_2 \in \mathcal{R}^{p\times m}$, the gap between them is defined as

$$\delta(P_1, P_2) = \|\Pi_{\mathcal{G}_{P_1}} - \Pi_{\mathcal{G}_{P_2}}\|.$$

That the gap is a metric on $\mathcal{R}^{p\times m}$ follows from the fact that the induced operator norm used above defines a metric on the set of all orthogonal projections.

With the gap between two systems, an uncertain system described by the gap is simply a gap metric ball with a center $P$, called the nominal system, and a radius $r$, quantifying the amount of uncertainty:

$$\mathcal{B}(P, r) = \{\tilde{P} \in \mathcal{R}^{p\times m} : \delta(P, \tilde{P}) < r\}.$$

Gap Computation and Robust Stabilization


With the basic definitions constructed above, the 
following questions are then asked: 
Computation: How can the gap between two 
systems be computed? 

Analysis: How much stability robustness does 
a stable feedback system have against gap 
uncertainty in the plant or in both the plant and 
the controller? 

Synthesis: How can a feedback controller be de¬ 
signed so that the feedback system has optimal 
robustness against gap uncertainty? 

For the question on computation, it is rather easy to see that if $P_1$ and $P_2$ are static (also said to be memoryless) systems, i.e., $P_1(s) = K_1$ and $P_2(s) = K_2$, then

$$\delta(P_1, P_2) = \bar{\sigma}\left[(I + K_1 K_1')^{-1/2}(K_1 - K_2)(I + K_2' K_2)^{-1/2}\right].$$

In the single-input-single-output case, this is exactly the chordal distance between two numbers $K_1$ and $K_2$. Hence the expression above generalizes the chordal distance between two complex numbers to constant matrices. What if $P_1$ and $P_2$ are dynamic systems? It was not until Georgiou (1988) that the computation of the gap was settled by using the coprime factorization.
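The static-gain formula can be evaluated directly. The following sketch assumes real gain matrices (so the prime denotes the transpose) and uses an eigendecomposition for the inverse square roots; the function names and test gains are hypothetical.

```python
import numpy as np

def inv_sqrt_spd(A):
    """Inverse square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(A)
    return (V / np.sqrt(w)) @ V.T           # V diag(w^-1/2) V^T

def static_gap(K1, K2):
    """Gap between two static (memoryless) systems with real gains K1 and K2."""
    K1 = np.atleast_2d(np.asarray(K1, dtype=float))
    K2 = np.atleast_2d(np.asarray(K2, dtype=float))
    p, m = K1.shape
    left = inv_sqrt_spd(np.eye(p) + K1 @ K1.T)
    right = inv_sqrt_spd(np.eye(m) + K2.T @ K2)
    return np.linalg.norm(left @ (K1 - K2) @ right, 2)   # largest singular value

print(static_gap(1.0, 0.0))     # scalar case: chordal distance |1 - 0| / sqrt(2)
print(static_gap(1e6, -1e6))    # very large gains of opposite sign are close in the gap
```

The second call illustrates why the gap is a natural measure for feedback purposes: two very large gains of opposite sign are far apart in the usual norm but nearly indistinguishable in the chordal sense.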

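For each $P \in \mathcal{R}^{p\times m}$, there are

$$\begin{bmatrix} \tilde{V} & -\tilde{U} \\ -\tilde{N} & \tilde{M} \end{bmatrix},\ \begin{bmatrix} M & U \\ N & V \end{bmatrix} \in \mathcal{RH}_\infty$$

such that $P = NM^{-1} = \tilde{M}^{-1}\tilde{N}$ and

$$\begin{bmatrix} \tilde{V} & -\tilde{U} \\ -\tilde{N} & \tilde{M} \end{bmatrix}\begin{bmatrix} M & U \\ N & V \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}.$$

These matrices are said to give a doubly coprime factorization of $P$. Also $P = NM^{-1}$ and $P = \tilde{M}^{-1}\tilde{N}$ are said to be right and left coprime factorizations, respectively. In the doubly coprime factorization, we can further require

$$M^T(-s)M(s) + N^T(-s)N(s) = I \quad \text{and} \quad \tilde{M}(s)\tilde{M}^T(-s) + \tilde{N}(s)\tilde{N}^T(-s) = I.$$

In this case, the coprime factorizations are said to be normalized.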

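Theorem 1 (Computation of the gap) Let $P_i = N_i M_i^{-1}$, $i = 1, 2$, be normalized right coprime factorizations. Then

$$\delta(P_1, P_2) = \max\left\{\inf_{Q \in \mathcal{RH}_\infty^{m\times m}}\left\|\begin{bmatrix} M_1 \\ N_1 \end{bmatrix} - \begin{bmatrix} M_2 \\ N_2 \end{bmatrix}Q\right\|_\infty,\ \inf_{Q \in \mathcal{RH}_\infty^{m\times m}}\left\|\begin{bmatrix} M_2 \\ N_2 \end{bmatrix} - \begin{bmatrix} M_1 \\ N_1 \end{bmatrix}Q\right\|_\infty\right\}.$$

The problems of finding the two infima above are $\mathcal{H}_\infty$ model-matching problems, special forms of $\mathcal{H}_\infty$ control problems. See article ► Optimal Control via Factorization and Model Matching and article ► H-Infinity Control in this encyclopedia. In principle, they can be solved using the standard methods.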

The analysis and synthesis questions are 
satisfactorily answered by Georgiou and Smith 
(1990). Let us consider the feedback system 
shown in Fig. 1. 

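Such a feedback system is denoted by a plant
and controller pair or simply a feedback pair
$(P, C) \in \mathcal{R}^{p\times m} \times \mathcal{R}^{m\times p}$. This closed-loop system
is stable if the transfer matrix from
$\begin{bmatrix} w_1 \\ -w_2 \end{bmatrix}$ to $\begin{bmatrix} u_1 \\ y_2 \end{bmatrix}$,
nicknamed the Gang of Four matrix,

$$\mathrm{GoF} = \begin{bmatrix} (I + PC)^{-1} & (I + PC)^{-1}P \\ C(I + PC)^{-1} & C(I + PC)^{-1}P \end{bmatrix},$$

is stable, i.e., belongs to $\mathcal{RH}_\infty$.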

Theorem 2 (Stability margin) Assume $(P, C)$
form a stable closed-loop system. All feedback
systems $(\tilde P, C)$ with $\tilde P \in B(P, r)$ are stable if
and only if $r \le \|\mathrm{GoF}\|_\infty^{-1}$.
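
For a scalar loop, the stability margin in Theorem 2 can be estimated on a frequency grid. The sketch below assumes a SISO pair with a stable closed loop; in that case the Gang of Four matrix $[1;\, C](1 + PC)^{-1}[1 \ \ P]$ has rank one, so its largest singular value has a closed form. The plant, controller, grid, and function name are hypothetical choices for illustration, and a grid search gives only a lower estimate of the true norm.

```python
import math


def gof_norm_siso(P, C, omegas):
    """Grid estimate of the H-infinity norm of the SISO Gang of Four matrix.

    For scalar P and C the matrix is rank one, so at each frequency
    sigma_max = sqrt(1 + |C|^2) * sqrt(1 + |P|^2) / |1 + P C|.
    """
    peak = 0.0
    for w in omegas:
        s = 1j * w
        p, c = P(s), C(s)
        sigma = math.sqrt(1 + abs(c) ** 2) * math.sqrt(1 + abs(p) ** 2) / abs(1 + p * c)
        peak = max(peak, sigma)
    return peak


# Hypothetical example: P(s) = 1/(s + 1) with unity feedback C(s) = 1.
omegas = [10 ** (-3 + 7 * i / 2000) for i in range(2001)]  # 1e-3 ... 1e4 rad/s
norm = gof_norm_siso(lambda s: 1 / (s + 1), lambda s: 1.0, omegas)
margin = 1 / norm  # estimated stability margin ||GoF||^{-1}

if __name__ == "__main__":
    print(norm, margin)
```

For this example the pointwise value is $\sqrt{2}\,\sqrt{(2+\omega^2)/(4+\omega^2)}$, which increases toward $\sqrt{2}$ at high frequency, so the estimated margin is close to $1/\sqrt{2}$.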

It follows from Theorem 2 that $\|\mathrm{GoF}\|_\infty^{-1}$ is
the stability margin of the closed-loop system
in Fig. 1. The natural design problem is then
to design a controller $C$ for a given $P$ such
that $\|\mathrm{GoF}\|_\infty^{-1}$ is maximized or, equivalently,
$\|\mathrm{GoF}\|_\infty$ is minimized. This is again
an $\mathcal{H}_\infty$ control problem, which is the
topic of article ► H-Infinity Control in this
encyclopedia. Georgiou and Smith (1990) realized
that this particular $\mathcal{H}_\infty$ control
problem has some unique features. Let $P$ have
a normalized doubly coprime factorization and
let

$$R(s) = M^T(-s)U(s) + N^T(-s)V(s).$$


































Robust Control in Gap Metric, Fig. 1 An uncertain 
feedback system 

$$\inf_{C} \|\mathrm{GoF}\|_\infty = \Bigl(1 + \inf_{Q \in \mathcal{RH}_\infty} \|R - Q\|_\infty^2\Bigr)^{1/2}.$$

The minimization over $Q$ above is a one-block
$\mathcal{H}_\infty$ model-matching problem. It can
be solved rather easily, much more easily
than the $\mathcal{H}_\infty$ model-matching problem arising
in the computation of the gap. After finding an
optimal $Q$, an optimal controller is obtained
as

$$C = -(U - MQ)(V - NQ)^{-1}.$$

Qiu and Davison (1992a) extended Theorem 2 
to the case when both the plant and controller are 
subject to uncertainty. 

Theorem 3 (The arcsin theorem) Assume
$(P, C)$ form a stable closed-loop system. All
feedback systems $(\tilde P, \tilde C) \in B(P, r_P) \times B(C, r_C)$
are stable if and only if

$$\arcsin r_P + \arcsin r_C \le \arcsin \|\mathrm{GoF}\|_\infty^{-1}.$$

This theorem further strengthens the role of
$\|\mathrm{GoF}\|_\infty^{-1}$ as the measure of stability robustness of the
feedback system $(P, C)$.
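
The arcsin condition gives an explicit trade-off between the two uncertainty radii. As a small illustration (the function name and the sample numbers are hypothetical), the largest controller radius $r_C$ compatible with a given plant radius $r_P$ and margin $b = \|\mathrm{GoF}\|_\infty^{-1}$ can be computed as follows.

```python
import math


def max_controller_radius(b: float, r_p: float) -> float:
    """Largest r_C with arcsin(r_P) + arcsin(r_C) <= arcsin(b); 0 if none is left."""
    slack = math.asin(b) - math.asin(r_p)
    return math.sin(slack) if slack > 0 else 0.0


if __name__ == "__main__":
    # With margin b = 1 and plant radius 0.6, the controller radius can be up to 0.8.
    print(max_controller_radius(1.0, 0.6))
```

With $r_P = 0$ the bound reduces to $r_C \le b$, which is the controller-side counterpart of Theorem 2.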


The ν-Gap

Partly because of the lack of efficient ways of
computing the gap, there were efforts to seek
other metrics on $\mathcal{R}^{p\times m}$ with better numerical
and analytical properties. Several such metrics
were proposed, including the graph metric by
Vidyasagar (1984), the pointwise gap metric by
Qiu and Davison (1992b), and the ν-gap metric by
Vinnicombe (1993). The winner is the ν-gap,
which is defined by ingeniously exploiting the
special structures and properties of rational
matrices in $\mathcal{R}^{p\times m}$. For $P_1, P_2 \in \mathcal{R}^{p\times m}$, let
$P_i = N_i M_i^{-1}$, $i = 1, 2$, be normalized right
coprime factorizations. Define the ν-gap metric as

$$\delta_\nu(P_1, P_2) = \sup_{\omega \in \mathbb{R}} \bar\sigma\{[I + P_1(j\omega)P_1(j\omega)^*]^{-1/2}[P_1(j\omega) - P_2(j\omega)][I + P_2(j\omega)^*P_2(j\omega)]^{-1/2}\}$$

if $\det[M_2^T(-s)M_1(s) + N_2^T(-s)N_1(s)]$ has an
equal number of unstable poles and zeros, and
$\delta_\nu(P_1, P_2) = 1$ otherwise. Apparently the ν-gap is
easier to compute than the gap. When the pole-zero
number condition is satisfied, the ν-gap is
the peak of the chordal distance between the
system frequency responses. The ν-gap is no
greater than the gap, i.e.,

$$\delta_\nu(P_1, P_2) \le \delta(P_1, P_2).$$

Hence the ν-gap ball

$$B_\nu(P, r) = \{\tilde P \in \mathcal{R}^{p\times m} : \delta_\nu(\tilde P, P) < r\}$$

is a superset of the gap ball with the same center
and radius. Theorems 2 and 3 can be restated
with the gap balls $B$ replaced by the ν-gap
balls $B_\nu$. Consequently the restated Theorems 2
and 3 are less conservative than the original
versions for the gap. The optimal robust stabilization
problems for the gap and the ν-gap are the same:
design $C$ to maximize $\|\mathrm{GoF}\|_\infty^{-1}$ for a given
$P$.
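
When the pole-zero number condition holds, the ν-gap of two scalar plants is the peak over frequency of the pointwise chordal distance, which is easy to approximate on a grid. The sketch below assumes that condition is satisfied (it is not checked in the code); the two plants, the grid, and the function names are hypothetical.

```python
import math


def pointwise_chordal(p1: complex, p2: complex) -> float:
    """Chordal distance between two frequency-response values."""
    return abs(p1 - p2) / (math.sqrt(1 + abs(p1) ** 2) * math.sqrt(1 + abs(p2) ** 2))


def nu_gap_siso_peak(P1, P2, omegas):
    """Peak of the pointwise chordal distance over a frequency grid.

    Equals the nu-gap only if the winding-number (pole-zero) condition holds,
    which is assumed here rather than verified.
    """
    return max(pointwise_chordal(P1(1j * w), P2(1j * w)) for w in omegas)


# Hypothetical example: P1(s) = 1/(s + 1), P2(s) = 2/(s + 1).
omegas = [10 ** (-3 + 6 * i / 2000) for i in range(2001)]  # 1e-3 ... 1e3 rad/s
peak = nu_gap_siso_peak(lambda s: 1 / (s + 1), lambda s: 2 / (s + 1), omegas)

if __name__ == "__main__":
    print(peak)
```

For these two plants the pointwise distance is $\sqrt{(1+\omega^2)/((2+\omega^2)(5+\omega^2))}$, whose maximum $1/3$ is attained at $\omega = 1$.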


Summary and Future Directions 

The gap, as well as the ν-gap, and the associated
robust control theory can be extended, in varying degrees,
to infinite-dimensional systems as in Georgiou and Smith
(1992) and Ball and Sasane (2012), time-varying
systems as in Foias, Georgiou, and Smith
(1993) and Feintuch (1998), and even nonlinear
systems as in Georgiou and Smith (1997), Anderson
et al. (2002), James et al. (2005), and Bian
and French (2005). There are
still research opportunities in these extensions.
The use of normalized coprime factorizations
seems to be an obstacle in these extensions.

For a plant $P$, the controller optimizing
$\|\mathrm{GoF}\|_\infty$ is not always a good controller. This
gives another example where “optimal” is not
always equal to “good.” One reason is that the
actual plant uncertainty cannot necessarily be
well described by a gap ball or a ν-gap ball.
Another reason is that performance issues other
than stability robustness, such as tracking and
disturbance rejection, are not taken into account
in the optimization. The actual plant uncertainty
might be better described by a gap ball centered
at a frequency-shaped plant $\tilde P = W_o P W_i$, where
$W_i$ and $W_o$ are real rational weighting matrices
which can also be chosen to address tracking and
disturbance rejection requirements. In this case,
an optimal controller $\tilde C$ can then be designed
to optimize the GoF matrix corresponding to the
shaped plant $\tilde P$. Finally, $C = W_i \tilde C W_o$ is used
as the designed controller for the original plant
$P$. With a proper choice of $W_i$ and $W_o$, it is
more likely that a good controller will result.
This loop-shaping design method was proposed
in McFarlane and Glover (1992) and further
developed in Vinnicombe (2001).

In the process of obtaining the arcsin theorem,
it has been realized that the gap and even more
so the ν-gap are closely related to the canonical
angles between linear subspaces. In fact the gap
is the sine of the largest canonical angle between
certain subspaces, and the largest canonical angle
itself is also a metric, a better one in some
geometric sense. For the latest developments on
canonical angles, see Qiu et al. (2008) and Zhang
and Qiu (2010).

In addition to the effort in deepening and 
expanding the notion of gap and its use in robust 
control, there is also effort in making it more 
accessible and more closely related to classical 
frequency response analysis; see Qiu and Zhou 
(2013). It again appears that the use of coprime
factorizations in the current theory is hindering
this effort. Hence, circumventing the use of


coprime factorizations, normalized or not, in the 
development of the gap would help its extension 
and popularization. 

Cross-References 

► H-Infinity Control 

► Optimal Control via Factorization and Model 
Matching 

► Robust Synthesis and Robustness Analysis 
Techniques and Tools 


Recommended Reading 

The most authoritative work on the gap, the ν-gap, and
the associated robust stabilization theory is the
comprehensive monograph Vinnicombe (2001).
This theory is inherently an input-output frequency
domain theory. However, many related
computations, such as those of doubly normalized
coprime factorizations, $\mathcal{H}_\infty$ model matching,
and the optimal controller synthesis, can
be done using state-space formulas and, further,
using MATLAB programs. Vinnicombe (2001)
contains a list of such state-space formulas. This
theory provides a good example of the once popular
and successful philosophy behind linear
multivariable control theory: thinking in terms of
transfer functions and computing in terms of
state-space equations.

The system and control background needed 
to understand and study the gap, the v-gap, and 
robust stabilization, in particular the coprime 
factorization and frequency domain stabilization 
theory, can be found in Vidyasagar (1985). 

The book Zhou and Doyle (1998) also contains
considerable content on gap-based robust
control.
Abstract

Basic robust control problems are studied for
feedback systems where the underlying plant
model is infinite dimensional. The $\mathcal{H}_\infty$ optimal
controller formula is given for the mixed sensitivity
minimization problem with rational weights.
Key steps of the numerical computations required
to determine the controller parameters are illustrated
with an example where the plant model
includes time delay terms.

Keywords 

Coprime factorizations; Direct design methods; 
Inner-outer factorizations 

Introduction 

Robust control deals with the feedback system
shown in Fig. 1, where $P_\Delta$ represents the uncertain
physical plant and $C$ is a fixed controller to
be designed.

Robust Control of Infinite Dimensional Systems,
Fig. 1 Feedback system $\mathcal{F}(C, P_\Delta)$ with fixed controller
$C$ and uncertain plant $P_\Delta$

Here, it is assumed that the controller and the
plant are linear time invariant (LTI) systems and
they are represented by their transfer functions.
Furthermore, $P_\Delta$ satisfies the following conditions:

$$P_\Delta(s) = P(s) + \Delta(s)$$

where $P$ is the nominal plant model, with $P(s)$
and $P_\Delta(s)$ having the same number of poles in
$\mathbb{C}_+$; and there is a known uncertainty bound $W(s)$
satisfying

$$|\Delta(j\omega)| < |W(j\omega)| \quad \forall \omega \in \mathbb{R}.$$

Definition 1 All $P_\Delta$ satisfying the above conditions
are said to be in the set of uncertain plants
$\mathcal{P}$, which is characterized by the given functions
$P(s)$ and $W(s)$.

Depending on physical system modeling, 
other forms of uncertainty representations can be 
more convenient than the additive unstructured 
uncertainty model taken here; see, e.g., Doyle 
et al. (1992), Ozbay (2000), and Zhou et al. 
(1996) for the examples of multiplicative, 
coprime factor, parametric, and structured 
uncertainty descriptions. Note that for notational 
convenience and simplicity of the presentation, 
single-input-single-output (SISO) plants are 
considered here; for extensions to multi-input- 
multi-output (MIMO) plants, see, e.g., Curtain 
and Zwart (1995). 

When the plant under consideration is infinite
dimensional, the transfer function $P(s)$ is
irrational, i.e., it cannot be expressed as a ratio
of two polynomials (it does not admit a finite-dimensional
state-space representation). Typical
examples of such systems are spatially distributed
parameter systems modeled by partial differential
equations, fractional-order systems, and systems
with time delays. The reader is referred to Curtain
and Morris (2009) for examples of transfer functions
of distributed parameter systems. There are
many interesting industrial applications where
fractional-order transfer functions are used for
modeling and control; see, e.g., Monje et al.
(2010). Typically, such functions are rational in
$s^\alpha$, where $\alpha$ is a rational number in the open
interval $(0, 1)$. Transfer functions of systems
with time delays involve terms like $e^{-hs}$, where
$h > 0$ is the delay; see Sipahi et al. (2011) for
various real-life examples where time-delay models
appear. Transfer functions considered here are
functions of the complex variable $s$ with real
coefficients, so $\overline{P(s)} = P(\bar s)$, where $\bar s$ denotes the
complex conjugate of $s$.

Definition 2 A linear time invariant system $H$
is said to be stable if its transfer function $H(s)$
is bounded and analytic in $\mathbb{C}_+$. In this case, the
system norm is

$$\|H\| = \|H\|_\infty = \sup_{\mathrm{Re}(s) > 0} |H(s)|,$$

which is equivalent to the energy amplification
through the system $H$; see Doyle et al. (1992)
and Foias et al. (1996).

Definition 2 is sometimes called $\mathcal{H}_\infty$-stability,
and in this setting, the set of all stable
plants is the function space $\mathcal{H}_\infty$. It is worth noting
that for infinite-dimensional systems, there are
other definitions of stability (Curtain and Zwart
1995; Desoer and Vidyasagar 2009), leading to
different measures of the system norm.
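
For a stable transfer function, the supremum over the open right half-plane is reached on the imaginary axis (maximum modulus principle), so the norm in Definition 2 can be approximated by sampling $|H(j\omega)|$. The sketch below does this for a hypothetical delay system $H(s) = e^{-hs}/(s+1)$; the delay value, the grid, and the function name are assumptions made for illustration, and a grid search only gives a lower estimate in general.

```python
import cmath


def hinf_norm_grid(H, omegas):
    """Lower estimate of the H-infinity norm from samples on the imaginary axis."""
    return max(abs(H(1j * w)) for w in omegas)


h = 0.5  # hypothetical delay in seconds


def H(s):
    """Irrational but stable transfer function: a first-order lag with delay."""
    return cmath.exp(-h * s) / (s + 1)


omegas = [0.0] + [10 ** (-3 + 6 * i / 600) for i in range(601)]  # 0 and 1e-3 ... 1e3
norm = hinf_norm_grid(H, omegas)

if __name__ == "__main__":
    print(norm)
```

The delay term has unit modulus on the imaginary axis, so the norm is that of the lag $1/(s+1)$, namely 1 at $\omega = 0$.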

Robust Control Design Objectives 

Let $\mathcal{F}(C, P_\Delta)$ denote the feedback system shown
in Fig. 1. This system is said to be robustly stable
if all the transfer functions from external inputs
$(r, v)$ to internal signals $(e, u)$ are in $\mathcal{H}_\infty$ for all
$P_\Delta \in \mathcal{P}$. In the controller design, robust stability
of the feedback system is the primary constraint.

The feedback system $\mathcal{F}(C, P_\Delta)$ is robustly
stable if and only if the following conditions hold
(see, e.g., Doyle et al. (1992) and Foias et al.
(1996)):

(a) $S, CS, PS \in \mathcal{H}_\infty$, where $S = (1 + PC)^{-1}$,
and

(b) $\|WCS\|_\infty \le 1$.

In order to illustrate these design constraints
for a robustly stabilizing controller, as an example,
consider a strictly proper stable plant, i.e.,

$$P \in \mathcal{H}_\infty \quad \text{with} \quad \lim_{|s| \to \infty} |P(s)| = 0.$$

In this case, all controllers in the form $C = Q/(1 - PQ)$
satisfy condition (a) for any $Q \in \mathcal{H}_\infty$
(moreover, any controller $C$ satisfying (a)
must be in this form for some $Q \in \mathcal{H}_\infty$). Now
consider a rational $W$ and a stable $Q$ such that
$|Q(j\omega)|$ is a continuous function of $\omega \in \mathbb{R}$. Then,
condition (b) becomes

$$\|WQ\|_\infty \le 1 \iff |Q(j\omega)| \le |W(j\omega)|^{-1} \quad \forall \omega \in \mathbb{R}.$$

So, whenever the modeling uncertainty is “large”
on a frequency band $\omega \in \Omega$, the magnitude of $Q$
should be “small” in this region.

When the plant is unstable, say $p \in \mathbb{C}_+$ is
a pole of $P(s)$ of multiplicity one, conditions (a)
and (b) impose a restriction on the controller that
leads to

$$1 \ge \|WCS\|_\infty \ge \left| \frac{W(p)}{N(p)} \right| \quad \text{where} \quad N(p) = \lim_{s \to p} \frac{(s - p)}{(s + p)} P(s).$$

So, a necessary condition for (b) to hold in this
case is $|W(p)| \le |N(p)|$, which means that the
modeling uncertainty at the unstable pole of the
plant should be small enough for the existence of
a robustly stabilizing controller. This is one of the
fundamental quantifiable limitations of feedback
systems with unstable plants; see Stein (2003) for
further discussions on other limitations.

Many other performance-related design objectives,
such as reference tracking and disturbance
attenuation, are captured by the sensitivity minimization,
which is defined as finding a controller
satisfying (a) and achieving

(c) $\|W_1 S\|_\infty \le \gamma$

for the smallest possible $\gamma > 0$, for a given stable
sensitivity weight $W_1(s)$. Selection of $W_1$ depends
on the class of reference signals and disturbances
considered; see Doyle et al. (1992), Ozbay
(2000), and Stein (2003) for general guidelines.
Stability robustness and performance objectives
defined above can be blended to define a single
$\mathcal{H}_\infty$-optimization problem, known as the mixed
sensitivity minimization: given $W_1$, $W_2$, $P$, find a
controller $C$ satisfying (a) and achieving

$$\text{(d)} \quad \left\| \begin{bmatrix} W_1 S \\ W_2 T \end{bmatrix} \right\|_\infty := \sup_{\mathrm{Re}(s) > 0} \left( |W_1(s)S(s)|^2 + |W_2(s)T(s)|^2 \right)^{1/2} \le \gamma$$

for the smallest possible $\gamma > 0$, where
$T(s) := 1 - S(s)$ and $W_2(s)$ represents
the multiplicative uncertainty bound, with
$|W_2(j\omega)| = |W(j\omega)|/|P(j\omega)|$, $\forall \omega \in \mathbb{R}$. The
smallest achievable $\gamma$ is the optimal performance
level $\gamma_{opt}$ and the corresponding controller is
denoted by $C_{opt}$. Typically, when $P$ is infinite
dimensional so is the optimal controller.


Design Methods 

Approximation of the Plant 

One possible way to design a robust controller
for an infinite-dimensional plant $P$ is to design
a robust controller $C_a$ for an approximate finite-dimensional
plant $P_a$ (for a frequency domain
approximation technique for infinite-dimensional
systems, see Gu et al. 1989). When $W_1$, $W_2$, and
$P_a$ are finite dimensional, standard state-space
methods (Zhou et al. 1996) can be used to find
an $\mathcal{H}_\infty$ controller $C_a$ achieving

$$\left\| \begin{bmatrix} W_1 S_a \\ W_2 T_a \end{bmatrix} \right\|_\infty \le \gamma_a$$

for the smallest possible $\gamma_a$, where $S_a := (1 + P_a C_a)^{-1}$
and $T_a := 1 - S_a$. Then, the controller
$C = C_a$ satisfies (a) and achieves the performance
objective (d) with

$$\gamma = (\gamma_a + \varepsilon)\,\frac{1}{1 - \varepsilon}, \qquad \varepsilon := \|C_a S_a (P - P_a)\|_\infty,$$

where it is assumed that the approximation of the
plant is made in such a way that $\varepsilon < 1$. Clearly,
if $\gamma_a \to \gamma_{opt}$ as $\varepsilon \to 0$, then $\gamma \to \gamma_{opt}$ as
$\varepsilon \to 0$. The conditions under which $\gamma_a \to \gamma_{opt}$
are discussed in Morris (2001).

Direct Design Methods 

The classical two-Riccati equation approach
(Zhou et al. 1996), developed for finite-dimensional
systems, has been extended to
various classes of infinite-dimensional systems
by using state-space techniques in which
semigroup theory plays an important role; see
van Keulen (1993) for further details.

In order to illustrate some of the key steps of
a frequency domain method developed in Foias
et al. (1996), consider a specific example where
the plant is given as

$$P(s) = \frac{e^{-hs}(s - 1)(s + 2)}{(s^2 + 2s + 2)(s + 1 - 3e^{-2hs})}, \qquad h = \ln(2) \approx 0.693. \qquad (1)$$

First, compute the location of the poles in $\mathbb{C}_+$
using available numerical tools for finding the
roots of quasi-polynomials; see, e.g., Sipahi et al.
(2011) for references. For the simple example
chosen here, $P(s)$ has only one pole in $\mathbb{C}_+$, at
$s = 0.5$ (for larger values of $h$, the number of
unstable poles of $P$ may be higher). Now, the
plant can be factored as follows:

$$P(s) = \frac{M_N(s)N_O(s)}{M_D(s)} \qquad (2)$$

where

$$M_N(s) = \frac{s - 1}{s + 1}\,e^{-hs}, \qquad M_D(s) = \frac{s - 0.5}{s + 0.5}$$

are all-pass (inner) transfer functions and

$$N_O(s) = \frac{(s + 2)(s + 1)}{(s^2 + 2s + 2)(s + 0.5)} \left( \frac{s - 0.5}{s + 1 - 3e^{-2hs}} \right)$$

is a minimum-phase (outer) transfer function.
Note that

$$\frac{s - 0.5}{s + 1 - 3e^{-2hs}} = \frac{1}{1 + H_F(s)}, \qquad H_F(s) = 1.5\,\frac{1 - e^{-2h(s - 0.5)}}{s - 0.5}. \qquad (3)$$

The impulse response of $H_F$ is $h_F(t) = 1.5e^{t/2}$
when $t \in [0, 2h]$ and $h_F(t) = 0$ otherwise.
Stability of $N_O$ can also be verified from the
Nyquist graph of $H_F$. Also, note that $N_O(s)$ can
be factored as $N_O(s) = N_1(s)N_2(s)$ where

$$N_1(s) = \frac{(s + 2)(s + 1)}{(s^2 + 2s + 2)} \left( \frac{1}{1 + H_F(s)} \right), \qquad N_2(s) = \frac{1}{s + 0.5}$$

with $N_1, N_1^{-1} \in \mathcal{H}_\infty$ and $N_2$ is finite-dimensional
(first order in this example).

The above steps illustrate coprime factorizations
and inner-outer factorizations for systems
with time delays (retarded case). For systems
represented by PDEs or integrodifferential equations,
the plant transfer function can be factored similarly,
provided that the poles and zeros in $\mathbb{C}_+$ can
be computed numerically.

When the plant is in the form (2) given above
and the weights $W_1$ and $W_2$ are rational, the
optimal performance level and the corresponding
optimal controller are obtained by the following
procedure (see Foias et al. (1996) for details).

• Controller parameterization transforms the
mixed sensitivity minimization to a problem
of finding the smallest $\gamma > 0$ for which there
exists $Q \in \mathcal{H}_\infty$ such that

$$\left\| \begin{bmatrix} W_1 \\ 0 \end{bmatrix} - \begin{bmatrix} W_1 N_2 \\ -W_2 N_2 \end{bmatrix} M_N (R + M_D Q) \right\|_\infty \le \gamma$$

where $R(s)$ is a rational function (whose order
is one less than the order of $M_D$) satisfying
certain interpolation conditions at the zeros of
$M_D(s)$.

• A spectral factorization determines $W_0 \in \mathcal{H}_\infty$
such that $W_0^{-1} \in \mathcal{H}_\infty$ and

$$\left( |W_1(j\omega)|^2 + |W_2(j\omega)|^2 \right) |N_2(j\omega)|^2 = |W_0(j\omega)|^2 \quad \forall \omega \in \mathbb{R}$$

(here, it is assumed that $W_2 N_2$ and $(W_2 N_2)^{-1}$
are in $\mathcal{H}_\infty$).

• By using the norm preserving property of the
unitary matrices and the commutant lifting
theorem, it has been shown that

$$\gamma_{opt} = \left\| \begin{bmatrix} \Gamma \\ \mathcal{T} \end{bmatrix} \right\|$$

where $\Gamma$ is the Hankel operator whose symbol
is

$$M_D(-s)\left[ M_N(-s)W_0^{-1}(-s)N_2(-s)W_1(-s)W_1(s) - W_0(s)R(s) \right]$$

and $\mathcal{T}$ is the Toeplitz operator whose symbol
is $W_1(s)W_2(s)N_2(s)W_0^{-1}(s)$. Moreover, under
mild technical assumptions, the optimal controller
is obtained from a nonzero $\psi_o \in \mathcal{H}_2$
satisfying

$$\left( \gamma_{opt}^2 - (\Gamma^*\Gamma + \mathcal{T}^*\mathcal{T}) \right) \psi_o = 0.$$

The operator $(\Gamma^*\Gamma + \mathcal{T}^*\mathcal{T})$ is in the form of
a skew-Toeplitz operator that gives the name
to this approach. See Foias et al. (1996) for a
detailed exposition.

Optimal $\mathcal{H}_\infty$-Controller

The above steps have been implemented, and
the final optimal controller expression has been
obtained in a simplified form described below.

Let $\alpha_1, \ldots, \alpha_\ell \in \mathbb{C}_+$ be the zeros of
$M_D(s)$, i.e., unstable poles of the plant (for
simplicity of the exposition, they are assumed
to be distinct). The sensitivity weight can
be written as $W_1(s) = nW_1(s)/dW_1(s)$,
for two coprime polynomials $nW_1$ and $dW_1$
with $\deg(nW_1) \le \deg(dW_1) =: n_1 \ge 1$.
Define

$$E_\gamma(s) := \frac{W_1(-s)W_1(s)}{\gamma^2} - 1$$

and let $\beta_1, \ldots, \beta_{2n_1}$ be the zeros of $E_\gamma(s)$,
enumerated in such a way that $-\beta_{n_1 + k} = \beta_k \in \overline{\mathbb{C}}_+$,
for $k = 1, \ldots, n_1$. For notational
convenience, assume that the zeros of $E_\gamma$ are
distinct for $\gamma = \gamma_{opt}$.

Now define a rational function depending on
$\gamma > 0$ and the weights $W_1$ and $W_2$,

$$F_\gamma(s) := \gamma\,\frac{dW_1(-s)}{nW_1(s)}\,G_\gamma(s)$$

where $G_\gamma \in \mathcal{H}_\infty$ is an outer function determined
from the spectral factorization

$$G_\gamma(-s)G_\gamma(s) = \left( 1 + \frac{W_2(-s)W_2(s)}{W_1(-s)W_1(s)} - \frac{W_2(-s)W_2(s)}{\gamma^2} \right)^{-1}.$$

With the above definitions, under certain mild
conditions (satisfied generically in most practical
cases), the optimal controller can be expressed as

$$C_{opt}(s) = \frac{E_\gamma(s)M_D(s)F_\gamma(s)L(s)}{1 + M_N(s)F_\gamma(s)L(s)}\,N_O(s)^{-1}, \qquad (4)$$

where $\gamma = \gamma_{opt}$ and $L(s) = L_2(s)/L_1(s)$ with
polynomials $L_1$ and $L_2$, of degree $n_1 + \ell - 1$,
determined from the interpolation conditions:

$$L_1(\beta_k) + M_N(\beta_k)F_\gamma(\beta_k)L_2(\beta_k) = 0, \quad k = 1, \ldots, n_1$$
$$L_1(\alpha_k) + M_N(\alpha_k)F_\gamma(\alpha_k)L_2(\alpha_k) = 0, \quad k = 1, \ldots, \ell$$
$$L_2(-\beta_k) + M_N(\beta_k)F_\gamma(\beta_k)L_1(-\beta_k) = 0, \quad k = 1, \ldots, n_1$$
$$L_2(-\alpha_k) + M_N(\alpha_k)F_\gamma(\alpha_k)L_1(-\alpha_k) = 0, \quad k = 1, \ldots, \ell$$
















The above system of equations can be rewritten
in the matrix form

$$\mathcal{R}_\gamma \Phi = 0 \qquad (5)$$

where the $2(n_1 + \ell) \times 1$ vector $\Phi$ contains the
coefficients of $L_1$ and $L_2$, and $\mathcal{R}_\gamma$ is a
$2(n_1 + \ell) \times 2(n_1 + \ell)$ matrix which can be computed
numerically when $\gamma$ is fixed. The optimal performance
level $\gamma_{opt}$ is the largest $\gamma$ which makes
$\mathcal{R}_\gamma$ singular. The corresponding nonzero $\Phi$ gives
$L(s)$, and hence, all the components of $C_{opt}$ are
computed.

Example 1 Consider the weighted sensitivity
minimization for the plant (1) with the following
first-order weights:

$$W_1(s) = \frac{1}{s}, \qquad W_2(s) = ks \qquad (6)$$

where $k > 0$ represents the relative importance
of the multiplicative uncertainty with respect to
the tracking performance under steplike reference
inputs (see Doyle et al. 1992; Ozbay 2000). With
(6), the functions $E_\gamma(s)$ and $F_\gamma(s)$ are computed
as

$$E_\gamma(s) = \frac{1 + \gamma^2 s^2}{-\gamma^2 s^2}, \qquad F_\gamma(s) = \frac{-\gamma s}{ks^2 + k_\gamma s + 1}, \quad \text{where } k_\gamma := \sqrt{2k - \frac{k^2}{\gamma^2}}. \qquad (7)$$

In this example $\ell = 1$ and $n_1 = 1$, with $\alpha_1 = 0.5$,
$\beta_1 = j/\gamma$. For $k = 0.1$, the largest $\gamma$
which makes $\mathcal{R}_\gamma$ singular is $\gamma_{opt} = 7.452$, and
the coefficients of the corresponding $L(s)$ are
computed from the SVD of $\mathcal{R}_{\gamma_{opt}}$,

$$L(s) = \frac{-0.0867 - 0.99623\,s}{-0.0867 + 0.99623\,s} = \frac{0.087 + s}{0.087 - s}.$$
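
The reported values can be cross-checked against the interpolation conditions. The sketch below evaluates the four conditions at $\gamma = 7.452$ with the rounded coefficients of $L_1$ and $L_2$ quoted in the example. It assumes $M_N(s) = e^{-hs}(s-1)/(s+1)$, $F_\gamma(s) = -\gamma s/(ks^2 + k_\gamma s + 1)$, and $k_\gamma = \sqrt{2k - k^2/\gamma^2}$ (the form implied by the spectral factorization for the weights $W_1 = 1/s$, $W_2 = ks$); the function names are illustrative. Because the inputs are rounded, the residuals are small but not exactly zero.

```python
import cmath
import math

h = math.log(2.0)   # delay parameter of the example plant
k = 0.1             # weight parameter in W2(s) = k s
gamma = 7.452       # optimal performance level reported in the example


def M_N(s):
    """Inner numerator factor of the plant (assumed form, includes the delay)."""
    return (s - 1) / (s + 1) * cmath.exp(-h * s)


def F_gamma(s):
    """Rational function F_gamma for W1 = 1/s, W2 = k s (assumed closed form)."""
    k_gamma = math.sqrt(2 * k - k ** 2 / gamma ** 2)
    return -gamma * s / (k * s ** 2 + k_gamma * s + 1)


def L1(s):
    return -0.0867 + 0.99623 * s   # denominator polynomial of L(s)


def L2(s):
    return -0.0867 - 0.99623 * s   # numerator polynomial of L(s)


alpha = 0.5          # unstable pole of the plant (zero of M_D)
beta = 1j / gamma    # zero of E_gamma in the closed right half-plane


def residuals():
    """Absolute residuals of the four interpolation conditions."""
    c_a = M_N(alpha) * F_gamma(alpha)
    c_b = M_N(beta) * F_gamma(beta)
    return [
        abs(L1(beta) + c_b * L2(beta)),
        abs(L1(alpha) + c_a * L2(alpha)),
        abs(L2(-beta) + c_b * L1(-beta)),
        abs(L2(-alpha) + c_a * L1(-alpha)),
    ]


if __name__ == "__main__":
    print(residuals())
```

Small residuals indicate that the quoted $\gamma_{opt}$ and $L(s)$ are mutually consistent with the conditions; the check does not by itself show that 7.452 is the largest singular value of $\gamma$.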


Note that zeros of $E_\gamma(s)M_D(s)$ in $\mathbb{C}_+$ appear as
roots of the equation

$$1 + M_N(s)F_\gamma(s)L(s) = 0.$$

Hence, there are internal unstable pole-zero cancellations
in the representation (4). An internally
stable implementation of this controller is shown
in Gumussoy (2011) using a realization similar to
(3).
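
The realization (3) rests on an identity that is easy to confirm numerically, together with the location of the unstable pole. The following sketch assumes $h = \ln 2$ as in the example; the test points and function names are arbitrary illustrative choices.

```python
import cmath
import math

h = math.log(2.0)  # delay parameter of the example plant


def quasi_poly(s):
    """Quasi-polynomial s + 1 - 3 exp(-2 h s) from the plant denominator."""
    return s + 1 - 3 * cmath.exp(-2 * h * s)


def H_F(s):
    """Finite-impulse-response term H_F(s) of the realization (3)."""
    return 1.5 * (1 - cmath.exp(-2 * h * (s - 0.5))) / (s - 0.5)


def lhs(s):
    return (s - 0.5) / quasi_poly(s)


def rhs(s):
    return 1 / (1 + H_F(s))


# Compare both sides of (3) at a few arbitrary points away from s = 0.5.
test_points = [1j * w for w in (0.1, 1.0, 10.0)] + [2 + 3j]
max_err = max(abs(lhs(s) - rhs(s)) for s in test_points)

# s = 0.5 is a root of the quasi-polynomial, i.e., an unstable pole of P.
pole_residual = abs(quasi_poly(0.5))

if __name__ == "__main__":
    print(max_err, pole_residual)
```

The root at $s = 0.5$ is cancelled by the numerator $s - 0.5$, which is why $1/(1 + H_F(s))$ is stable even though the quasi-polynomial has a right half-plane root.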

The above approach can also be extended to a 
class of infinite-dimensional plants with infinitely 
many poles in C+; see Gumussoy and Ozbay 
(2004) for technical details. 


Summary and Future Directions 

This entry briefly summarized robust control 
problems involving linear time invariant infinite-dimensional
plants with dynamic uncertainty
models. Salient features of these robust control 
problems are captured by the mixed sensitivity 
minimization problem, for which a numerical 
computational procedure is outlined under 
the assumption that the weights are rational 
functions. Note that different types of plant 
models involving probabilistic, parametric, or 
structured (MIMO case) uncertainty are left out 
in this entry. Other robust control problems that 
are not discussed here include simultaneous 
stabilization (control of finitely many plant 
models by a single robust controller) and strong 
stabilization (robust control with the added 
restriction that the controller must be stable) 
of infinite-dimensional systems. Stable robust 
controller design techniques for different types of 
systems with time delays are illustrated in Ozbay 
(2010) and Wakaiki et al. (2013); see also their 
references. 

For practical implementation of infinite-dimensional
robust controllers, it is important to
find low-order approximations of stable irrational
transfer functions with prescribed $\mathcal{H}_\infty$ error
bound. There exist many different approximation
techniques for various types of transfer functions,
but there is still need for computationally efficient
algorithms in this area. Another interesting topic
along the same lines is direct computation
of fixed-order $\mathcal{H}_\infty$ controllers for infinite-dimensional
plants. In fact, computation of $\mathcal{H}_\infty$-optimal
PID controllers is still a challenging
problem for infinite-dimensional plants, except
for some time-delay systems satisfying certain
simplifying structural assumptions. Advances in
numerical optimization tools will play critical
roles in the computation of low (or fixed)-order
robust controllers for infinite-dimensional
plants; see, e.g., Gumussoy and Michiels (2011)
for recent results along this direction.

In the past, robust control of infinite-dimensional
systems found applications in
many different areas such as chemical processes,
flexible structures, robotic systems,
transportation systems, and aerospace. Robust
control problems involving systems with time-varying
and uncertain time delays appear in
control of networks and control over networks.
Ongoing research in the networked systems area
includes generalization of these problems to more
complex and interconnected systems.

There are also many interesting robust control 
problems in biological systems, where typical 
underlying plant models are nonlinear and 
infinite dimensional. Some of these problems 
are solved under simplifying assumptions; it 
is expected that robust control theory will 
make significant contributions to this field by 
extensions of the existing results to more realistic 
plant and uncertainty models. 

Cross-References 

► Control of Linear Systems with Delays 

► Flexible Robots 

► H-Infinity Control 

► Model Order Reduction: Techniques and Tools 

► Networked Systems 

► Optimal Control via Factorization and Model 
Matching 

► Optimization Based Robust Control 

► PID Control 

► Robust Control in Gap Metric 

► Spectral Factorization 

► Stability and Performance of Complex Systems 
Affected by Parametric Uncertainty 

► Structured Singular Value and Applications: 
Analyzing the Effect of Linear Time-Invariant 
Uncertainty in Linear Systems 
Abstract

Aiming at increasing system reliability and availability,
integration of fault diagnosis into feedback
control systems and integrated design of
control and diagnosis receive considerable attention
in research and industrial applications. In the
framework of robust control, integrated diagnosis
and control systems are designed to meet the
demand for system robustness. The core of such
systems is an observer that delivers the information
needed for robust fault detection and feedback
control.

Keywords 

Observer-based fault diagnosis and control; 
Residual generation 

Introduction 

Advanced automatic control systems are marked
by a high degree of integration of digital electronics,
intelligent sensors, and actuators. In parallel
to this development, a new trend of integrating
model-based fault detection and isolation (FDI)
into the control systems can be observed (Blanke
et al. 2006; Ding 2013; Gertler 1998; Isermann
2006; Patton et al. 2000), which is strongly driven
by the enhanced needs for system reliability and
availability.

A critical issue surrounding the integration 
of a diagnostic module into a feedback control 
loop is the interaction between the control and 


diagnosis. Initiated by Nett et al. (1988), study 
on the integrated design of control and diagnosis 
has received much attention, both in the research 
and application domains. The original idea of the 
integrated design scheme proposed by Nett et al. 
(1988) is to manage the interactions between the 
control and diagnosis in an integrated manner 
(Ding 2009; Jacobson and Nett 1991). 

Robustness is an essential performance requirement for
model-based control and diagnostic systems. In
the control and diagnosis frameworks, robustness
is often addressed in different contexts (Ding
2013) and thus calls for special attention in
the integrated design of control and diagnostic
systems. In their study on fault-tolerant controller
architecture, Zhou and Ren (2001) proposed
to deal with the integrated design in the
framework of the Youla parametrization of
stabilizing controllers (Zhou et al. 1996),
which also forms the basis for achieving high
robustness in an integrated control and diagnosis
system. Below, we present the basic ideas and
some representative schemes and methods for the
integrated design of robust diagnosis and control
systems.

Plant Model and Factorization 
Technique 

Consider linear time invariant (LTI) systems
given in the state space representation

$$\dot x(t) = Ax(t) + Bu(t) + E_d d(t) + E_f f(t)$$
$$y(t) = Cx(t) + Du(t) + F_d d(t) + F_f f(t)$$
$$z(t) = C_z x(t) + D_z u(t)$$

where $x \in \mathcal{R}^n$, $y \in \mathcal{R}^m$, $u \in \mathcal{R}^{k_u}$ stand
for the plant state, output, and input vectors,
respectively, $z \in \mathcal{R}^{k_z}$ is the controlled
output vector, and $d \in \mathcal{R}^{k_d}$, $f \in \mathcal{R}^{k_f}$ denote
the disturbance and fault vectors, respectively.
$A, B, C, D, C_z, D_z, E_d, E_f, F_d, F_f$ are known
matrices of appropriate dimensions.

A transfer matrix $G(s) = D + C(sI - A)^{-1}B$
with the minimal state space realization
$(A, B, C, D)$ can be factorized into

$$G(s) = M^{-1}(s)N(s)$$
$$M(s) = I - C(sI - A_L)^{-1}L$$
$$N(s) = D + C(sI - A_L)^{-1}B_L$$
$$A_L = A - LC, \quad B_L = B - LD$$

where $L$ is selected so that $A_L$ is stable and can
be interpreted as an observer gain matrix. This
factorization is called left coprime (Zhou et al.
1996).
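
The observer-based formulas for $M(s)$ and $N(s)$ can be verified on a scalar example. The sketch below uses a hypothetical unstable first-order plant and an arbitrary stabilizing observer gain; all numerical values and names are assumptions chosen for illustration.

```python
def lcf_scalar(a, b, c, d, L):
    """Left coprime factors M(s), N(s) of G(s) = d + c b / (s - a), scalar case."""
    a_L = a - L * c   # observer dynamics, must be stable (negative here)
    b_L = b - L * d

    def M(s):
        return 1 - c * L / (s - a_L)

    def N(s):
        return d + c * b_L / (s - a_L)

    return M, N


# Hypothetical unstable plant G(s) = 0.5 + 2 / (s - 1).
a, b, c, d = 1.0, 2.0, 1.0, 0.5
L = 3.0  # observer gain giving a_L = a - L*c = -2


def G(s):
    return d + c * b / (s - a)


M, N = lcf_scalar(a, b, c, d, L)
s0 = 1 + 2j                      # arbitrary test point
err = abs(N(s0) / M(s0) - G(s0))  # should vanish up to rounding

if __name__ == "__main__":
    print(err)
```

In this scalar case $M(s) = (s - a)/(s - a_L)$, so the unstable pole of $G$ becomes a zero of the stable factor $M$, while both factors share the stable pole $a_L$.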

Parametrization of Stabilizing 
Controllers 

Let

$$u(s) = K(s)y(s)$$

be an LTI feedback controller. By means of the
well-known Youla parametrization (Zhou et al.
1996), all stabilizing controllers can be described
and parametrized by

$$K(s) = (X(s) - Q_c(s)N(s))^{-1}(Y(s) - Q_c(s)M(s))$$
$$X(s) = I - F(sI - A_L)^{-1}B_L$$
$$Y(s) = F(sI - A_L)^{-1}L$$

where $Q_c(s)$ is a stable parameter matrix, and $F$
is selected so that $A_F = A + BF$ is stable and can
be interpreted as a state feedback gain matrix.


Parametrizations of Residual
Generators

Given the system under consideration, an LTI
residual generator is a dynamic system with
$u(t), y(t)$ as its inputs and $r(t)$ as output which
satisfies, for $d(t) = 0$, $f(t) = 0$,

$$\forall x(0), \quad \lim_{t \to \infty} r(t) = 0.$$

Residual generation is the first step for a successful
fault diagnosis. The generated residual vector
is an indicator for the occurrence of a fault. It is
well known that all LTI residual generators can
be parametrized by

$$r(s) = R(s)\left( M(s)y(s) - N(s)u(s) \right)$$

where $R(s)$ is a stable parameter matrix
called the post-filter (Ding 2013).


Integration of Controller and
Residual Generator into a Control
Loop

It is remarkable that both the feedback controllers
and residual generators can be parametrized
based on the left coprime factorization of the
plant model. This is the basis for an integration
of diagnosis and control into a feedback control
system. In Ding et al. (2010), it is demonstrated
that the abovementioned Youla parametrization
form is in fact an observer-based feedback
controller, which can be expressed by

$$u(s) = F\hat x(s) + Q_c(s)\left( N(s)u(s) - M(s)y(s) \right)$$

where $\hat x(s)$ is a state estimate delivered by a
full-order state observer (Anderson 1998; Zhou et al.
1996). Moreover, the residual generator can also
be written as

$$r(s) = R(s)r_0(s), \quad r_0(s) = y(s) - \hat y(s)$$

with $\hat y(s)$ being the output estimate delivered by
an observer (Ding 2013). As a result, a stabilizing
feedback controller and residual generator
can be integrated into a dynamic system of the
following form:

$$\dot{\hat x}(t) = A\hat x(t) + Bu(t) + Lr_0(t) = A_L\hat x(t) + B_L u(t) + Ly(t)$$
$$r_0(t) = y(t) - \hat y(t), \quad \hat y(t) = C\hat x(t) + Du(t)$$
$$\begin{bmatrix} u(s) \\ r(s) \end{bmatrix} = \begin{bmatrix} F\hat x(s) \\ 0 \end{bmatrix} + \begin{bmatrix} -Q_c(s) \\ R(s) \end{bmatrix} r_0(s).$$

The core of the above control and diagnostic
system is a state observer that delivers a state
estimate $\hat x(t)$ and the primary residual vector $r_0(t)$.
The design parameters of this integrated control
and diagnosis system are $L$ and $F$, the observer and
state feedback control gain matrices, as well as
$Q_c(s)$ and $R(s)$.


Robustness of Diagnostic and Control Systems

While in the robust control framework the controller design is typically formulated as minimizing a system norm of the transfer function matrix from the disturbance vector $d$ to the control output $z$ (Zhou et al. 1996), the design objective of a robust fault detection system consists in an optimal trade-off between the robustness against $d$ and the sensitivity to the fault vector $f$. Considering that

\[ r(s) = R(s)\left(y(s) - \hat y(s)\right) = R(s)\left(\hat N_d(s)d(s) + \hat N_f(s)f(s)\right), \]
\[ \hat N_d(s) = F_d + C(sI - A_L)^{-1}(E_d - LF_d), \]
\[ \hat N_f(s) = F_f + C(sI - A_L)^{-1}(E_f - LF_f) \]

(Ding 2013), the design objective can be formulated as

\[ \sup_{R(s)} \frac{\|R(s)\hat N_f(s)\|_{\mathrm{index}}}{\|R(s)\hat N_d(s)\|} \]

or in a suboptimum form as finding $R(s)$ so that for some given $\alpha > 0$, $\beta > 0$

\[ \|R(s)\hat N_d(s)\| < \alpha, \quad \|R(s)\hat N_f(s)\|_{\mathrm{index}} > \beta. \]

Similar to the robust controller design, a (system) norm like the $\mathcal{H}_2$ or $\mathcal{H}_\infty$ norm, denoted by $\|\cdot\|$, is applied for the evaluation of the influence of the disturbances. In contrast, the evaluation of the sensitivity to the fault vector, expressed by $R(s)\hat N_f(s)$, can be realized using either a system norm or the so-called $\mathcal{H}_-$ index, denoted by $\|\cdot\|_-$, which indicates the minimum influence of $f$ on $r$ (Ding 2013).

In order to detect the fault occurrence reliably and successfully, a decision-making procedure is needed. It consists of a further evaluation of the residual signal and a detection logic. Typically, a signal norm of $r$, e.g., the $\mathcal{L}_2$ norm, and a simple detection logic like

\[ \|r\| > J_{th} \Rightarrow \text{alarm for fault}, \qquad \|r\| \le J_{th} \Rightarrow \text{fault-free} \]

are adopted for this purpose, where $J_{th}$ is a further design parameter called the threshold (Ding 2013). The threshold setting depends on the dynamics of $r$ and on its norm-based evaluation, and it has significant influence on the fault detection performance. For the purpose of reducing false alarms, the threshold is often set as

\[ J_{th} = \sup_{f=0,\ \|d\|\le\delta_d} \|r\| = \sup_{f=0,\ \|d\|\le\delta_d} \|R(s)\hat N_d(s)d(s)\|. \]

That is, the threshold is set to be the maximum value of the influence of the disturbances on the residual signal in the fault-free case. Thus, different designs of the residual generator will result in different threshold settings. In this context, an optimal design of a fault diagnosis system is understood as an integrated design of the residual generator, the evaluation function, and the threshold (Ding 2013).


An Integrated Design Scheme for Robust Diagnosis and Control


Assume that the system under our consideration satisfies the following conditions:

• $\|d\|_2 \le \delta_d$.

• $(A, B)$ is stabilizable and $(C, A)$ is detectable.

• $D = 0$.

• $D_z^T D_z > 0$ and $F_d F_d^T > 0$.

• $\begin{bmatrix} A - j\omega I & B \\ C_z & D_z \end{bmatrix}$ has full column rank for all $\omega$.

• $\begin{bmatrix} A - j\omega I & E_d \\ C & F_d \end{bmatrix}$ has full row rank for all $\omega$.

Then, the following observer and state feedback gain matrices

\[ L^* = (E_dF_d^T + YC^T)(F_dF_d^T)^{-1}, \]
\[ F^* = -(D_z^TD_z)^{-1}(B^TX + D_z^TC_z), \]

as well as

\[ Q_c^*(s) = 0, \quad R^*(s) = (F_dF_d^T)^{-1/2} \]

result in an optimal integrated design of the robust diagnostic and control system with

• $\mathcal{H}_2$ optimal control performance

• Maximal fault detectability and the optimal threshold setting

\[ J_{th} = \sup_{f=0,\ \|d\|_2\le\delta_d} \|r\|_2 = \delta_d, \]

where $Y \ge 0$, $X \ge 0$ are respectively the solutions of the following two Riccati equations:

\[ AY + YA^T + E_dE_d^T - (E_dF_d^T + YC^T)(F_dF_d^T)^{-1}(E_dF_d^T + YC^T)^T = 0, \]
\[ A^TX + XA + C_z^TC_z - (C_z^TD_z + XB)(D_z^TD_z)^{-1}(C_z^TD_z + XB)^T = 0. \]

That $L^*, F^*$ lead to minimizing the $\mathcal{H}_2$ norm of the transfer matrix from $d$ to $z$ is a well-known result (Zhou et al. 1996). The optimal fault detection performance can be understood from two different viewpoints:

• Optimum in the sense of

\[ \forall\omega,\ \sup_{R(s)} \frac{\sigma_i\left(R(j\omega)\hat N_f(j\omega)\right)}{\|R(s)\hat N_d(s)\|_\infty} = \sigma_i\left((F_dF_d^T)^{-1/2}\hat N_f^*(j\omega)\right), \]

where $\sigma_i\left(R(j\omega)\hat N_f(j\omega)\right)$ is the $i$-th singular value of the matrix $R(j\omega)\hat N_f(j\omega)$, $i = 1, \ldots, k_f$, and $\hat N_f^*(s) = \hat N_f(s)|_{L=L^*}$ (Ding 2013).

• A fault that can be detected by any LTI detection system will also be detected using the detection system with the above parameter and threshold setting. Thus, this detection system provides the maximal fault detectability (Ding 2013).

It is worth remarking that:

• The assumptions mentioned above are standard in $\mathcal{H}_2$ optimal control (Zhou et al. 1996).

• The optimization problem

\[ \forall\omega,\ \sup_{R(s)} \frac{\sigma_i\left(R(j\omega)\hat N_f(j\omega)\right)}{\|R(s)\hat N_d(s)\|_\infty} \]

is a more general form of the so-called $\mathcal{H}_-/\mathcal{H}_\infty$ or $\mathcal{H}_\infty/\mathcal{H}_\infty$ optimization of observer-based fault detection systems, and thus its solution is called the unified solution (Ding 2013).

• The solution given above is a state space realization of the solution to the robust fault detection problems, which is described, e.g., by Ding (2013) in Theorem 7.16.

• This integrated design scheme can also be applied to discrete-time and stochastic systems (Ding 2013).
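As a rough numerical illustration of the gain formulas and the threshold logic, the following sketch works through a scalar instance in plain Python. The plant data (a = b = c = 1, two disturbance channels, z = (x, u)), the bound `delta_d`, and all helper names are assumptions made for this example and are not taken from Ding (2013); the residual norm is approximated by a Riemann sum.

```python
import math

def scalar_riccati(a, c, q, s, r):
    """Stabilizing root P >= 0 of the scalar Riccati equation
    2*a*P + q - (s + P*c)**2 / r = 0   (requires c != 0, r > 0)."""
    g = c * c / r                 # quadratic coefficient
    a_eff = a - s * c / r         # cross term folded into the drift
    q_eff = q - s * s / r
    return (a_eff + math.sqrt(a_eff * a_eff + g * q_eff)) / g

# Hypothetical scalar plant (illustrative numbers only):
#   dx/dt = a x + b u + E_d d,  y = c x + F_d d,  z = C_z x + D_z u
a, b, c = 1.0, 1.0, 1.0
Ed_EdT, Ed_FdT, Fd_FdT = 1.0, 0.0, 1.0   # E_d = [1, 0], F_d = [0, 1]
CzT_Cz, CzT_Dz, DzT_Dz = 1.0, 0.0, 1.0   # C_z = [1, 0]^T, D_z = [0, 1]^T

Y = scalar_riccati(a, c, Ed_EdT, Ed_FdT, Fd_FdT)   # filter Riccati equation
X = scalar_riccati(a, b, CzT_Cz, CzT_Dz, DzT_Dz)   # control Riccati equation

L_star = (Ed_FdT + Y * c) / Fd_FdT       # L* = (E_d F_d^T + Y C^T)(F_d F_d^T)^{-1}
F_star = -(b * X + CzT_Dz) / DzT_Dz      # F* = -(D_z^T D_z)^{-1}(B^T X + D_z^T C_z)
R_star = Fd_FdT ** -0.5                  # R* = (F_d F_d^T)^{-1/2}

def l2_norm(samples, dt):
    """Riemann-sum approximation of the L2 norm of a sampled scalar signal."""
    return math.sqrt(sum(v * v for v in samples) * dt)

def fault_alarm(residual, dt, j_th):
    """Detection logic: alarm if and only if ||r|| > J_th."""
    return l2_norm(residual, dt) > j_th

delta_d = 0.5          # assumed bound on ||d||_2
J_th = delta_d         # threshold of the integrated design

print(L_star, F_star, R_star)
print(a - L_star * c, a + b * F_star)    # observer pole and state-feedback pole
```

In this instance the open-loop plant is unstable (a = 1), while both `a - L_star * c` and `a + b * F_star` come out negative, which is the stabilizing behavior expected from the two Riccati solutions.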


Summary and Future Directions 

Increasing reliability and availability of advanced 
automatic control systems is of considerable 
practical interest. Integration of fault diagnosis
into feedback control systems and integrated 
design of robust control and diagnosis are useful 
solutions for real-time applications (Ding 2009). 
They can also be integrated into a fault-tolerant 
control system (Blanke et al. 2006; Zhou and 
Ren 2001). A further potential application field 
is fault diagnosis in feedback control loops using 
embedded residual signals (Ding et al. 2010). 

From the viewpoint of research, the integrated design of robust control and diagnosis in nonlinear and time-varying dynamic systems is a challenging issue. The $\mathcal{L}_2$-gain technique for nonlinear control (Van der Schaft 2000) and the fault detection scheme proposed by Li and Zhou (2009) are promising and useful results for future investigations in this area.


Cross-References 

► Fault Detection and Diagnosis 

► Fault-Tolerant Control 

► Robust H2 Performance in Feedback Control
Abstract

This entry discusses an important compromise in feedback design: reconciling the superior performance characteristics of the $\mathcal{H}_2$ optimization criterion with robustness requirements expressed through induced norms such as $\mathcal{H}_\infty$. The fact that both criteria have frequency-domain characterizations and involve similar state-space machinery motivated many researchers to seek an adequate combination. We review here robust $\mathcal{H}_2$ analysis methods based on convex optimization developed in the 1990s and comment on their implications for controller synthesis.


Keywords 

Linear matrix inequalities; Mixed $\mathcal{H}_2/\mathcal{H}_\infty$ control; Robustness analysis; Structured uncertainty


Introduction 

Can mathematics help us deal with the inevitable 
theory-practice gap? Should we be optimistic 
and assume that discrepancies between models 
and nature are random and neutral towards our 
actions or be pessimistic and design for the worst 
such discrepancies? Feedback control theory has 
struggled with these questions, perhaps more so 
than other fields. 

During the surge of optimal control in the 1960s, optimism carried the day. A prominent example is the LQG ($\mathcal{H}_2$) regulator, which minimizes the effect of random disturbances and has an elegant state-space solution; in comparison, the frequency-domain designs of classical control appeared primitive and conservative. But the pessimists struck back in the late 1970s, showing that things could go very wrong (unstable) with LQG when a parameter variation was introduced in the plant model. This ushered in the robust control era of the 1980s, with its worst-case analysis of stability over deterministic sets of plants, leading to other design metrics such as $\mathcal{H}_\infty$ control. In this mentality, exogenous disturbances were also treated as an adversary to be protected against in the worst case, perhaps an excess of pessimism.

The robust $\mathcal{H}_2$ problem incarnates the search for a middle ground, where stability is treated with the conservatism it deserves, but performance is optimized for a more neutral noise. This entry summarizes efforts made around the 1990s to seek this compromise.

H2 Optimal Control

In the feedback diagram of Fig. 1, signals are 
vector valued, and we focus on continuous time. 
G is a linear system with a given state-space 
representation. Initially omit the upper loop (set
$\Delta = 0$). The LQG regulator is the controller K
that internally stabilizes the feedback loop and 
minimizes the variance of the error variable z, 
assuming the input v is white Gaussian noise. 

For an alternative description, denote by $T_{zv}(s)$ the closed-loop transfer function from $v$ to $z$; we wish to design $K$ such that $T_{zv}(s)$ is analytic in $\mathrm{Re}(s) > 0$ and has minimum $\mathcal{H}_2$ norm, defined by

\[ \|T_{zv}\|_{\mathcal{H}_2} := \left( \int_{-\infty}^{\infty} \mathrm{Tr}\left(T_{zv}(j\omega)^* T_{zv}(j\omega)\right) \frac{d\omega}{2\pi} \right)^{1/2}; \tag{1} \]

here Tr denotes matrix trace and * denotes conjugate transpose. The equivalence between this $\mathcal{H}_2$-optimal control and LQG follows from classical filtering, modeling $v$ as uncorrelated components of unit power spectral density over all frequencies. By adding a filter in the input of G, noise of known, colored spectrum can be accommodated as well.
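The norm in (1) can be approximated numerically for a scalar transfer function. The following is a minimal sketch, assuming a stable, strictly proper SISO system supplied as a Python callable; the function name `h2_norm_siso`, the substitution ω = tan θ, and the test system 1/(s + 1) are choices made for this illustration only.

```python
import math

def h2_norm_siso(G, n=20000):
    """Approximate the H2 norm in (1) for a stable, strictly proper SISO G(s).

    The substitution w = tan(theta) maps the whole frequency axis onto
    (-pi/2, pi/2); the midpoint rule avoids the endpoints.
    """
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -math.pi / 2 + (k + 0.5) * h
        w = math.tan(theta)
        # |G(jw)|^2 dw, with dw = d(theta) / cos(theta)^2
        total += abs(G(1j * w)) ** 2 / math.cos(theta) ** 2
    return math.sqrt(total * h / (2 * math.pi))

def G(s):
    """Hypothetical first-order lag 1/(s + 1)."""
    return 1.0 / (s + 1.0)

h2 = h2_norm_siso(G)
print(h2)
```

For this example the impulse response is $e^{-t}$, whose energy is 1/2, so the computed value should be close to $\sqrt{1/2}$.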

A different motivation, in the case of scalar $v$, is to observe that $\|T_{zv}\|_{\mathcal{H}_2}^2$ is the energy ($\mathcal{L}_2$-norm squared) of the system impulse response. Thus it measures the transient error in response to known inputs or initial conditions which may be generated by an impulse.

Robust H2 Performance in Feedback Control, Fig. 1 Feedback control and model uncertainty

The $\mathcal{H}_2$ (LQG) optimal feedback has an elegant solution, computable in state space through two algebraic Riccati equations (AREs). Its quick popularity was, however, hampered by its lack of stability margins: a small error in model parameters can make the closed loop unstable (Doyle 1978). This motivated methods to explicitly address such modeling errors.

Model Uncertainty and Robustness 

Suppose some parameter in the model of G is uncertain, $a = a_0 + k\delta$, $\delta \in [-1, 1]$; often, the normalized variation $\delta$ can be “pulled out” into the uncertainty block $\Delta$ of Fig. 1. The same technique can account for unmodeled linear time-invariant (LTI) dynamics, e.g., high-frequency effects: they can be “covered” by a normalized transfer function $\Delta(j\omega)$ and frequency weights that connect it to G. Even further, a nonlinear or time-varying (NL, TV) modeling error can be represented through an operator $\Delta$ in signal space. The references contain details on this modeling technique.

To analyze the effect of such errors, suppose K has been chosen to stabilize G and M is the resulting closed-loop system, with state-space representation

\[ \dot x = Ax + B_p p + B_v v, \quad q = C_q x, \quad z = C_z x. \tag{2} \]

$A$ is an $n \times n$ stable (Hurwitz) matrix, and for simplicity there are no feed-through terms. Figure 2 represents the interconnection of M with the uncertainty.

Robust H2 Performance in Feedback Control, Fig. 2 Robustness analysis setup

To quantify the size of uncertainty, it is convenient to use an induced norm (gain) in signal space and constrain $\Delta$ to the normalized ball $\{\|\Delta\| \le 1\}$. If the subsystem $M_{qp}$ in feedback with $\Delta$ itself satisfies the induced norm constraint $\|M_{qp}\| < 1$, the small gain theorem implies robust stability over the entire ball. Focusing for the rest of this article on the $\mathcal{L}_2$ signal space (square-integrable functions), the latter induced norm is equivalent to the $\mathcal{H}_\infty$ norm of the transfer function:

\[ \|M_{qp}(s)\|_{\mathcal{H}_\infty} := \operatorname*{ess\,sup}_{\omega \in \mathbb{R}} \bar\sigma\left(M_{qp}(j\omega)\right), \]

where $\bar\sigma(\cdot)$ denotes matrix maximum singular
value.
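For a scalar transfer function the maximum singular value is simply the magnitude, so the supremum over frequency can be estimated by a sweep. The sketch below assumes a logarithmic grid and a hypothetical lightly damped second-order system; the function name and grid parameters are illustrative. A grid sweep only yields a lower bound on the true supremum, which is adequate when the grid is fine relative to the sharpest resonance.

```python
import math

def hinf_norm_siso(G, w_min=1e-3, w_max=1e3, n=20001):
    """Estimate sup_w |G(jw)| on a log-spaced grid (DC is included as well).

    Returns (peak gain, frequency at which it was found).
    """
    ratio = (w_max / w_min) ** (1.0 / (n - 1))
    peak, w_peak = abs(G(0j)), 0.0
    w = w_min
    for _ in range(n):
        gain = abs(G(1j * w))
        if gain > peak:
            peak, w_peak = gain, w
        w *= ratio
    return peak, w_peak

def G(s):
    """Hypothetical resonant system 1/(s^2 + 0.2 s + 1), damping ratio 0.1."""
    return 1.0 / (s * s + 0.2 * s + 1.0)

peak, w_peak = hinf_norm_siso(G)
print(peak, w_peak)
```

For a damping ratio ζ the exact peak of this system is $1/(2\zeta\sqrt{1-\zeta^2})$, attained at $\omega = \sqrt{1-2\zeta^2}$, which gives a reference value for the sweep.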

This motivates $\mathcal{H}_\infty$-optimal control: design K to minimize the above quantity with internal stability. This problem also admits state-space solutions based on AREs and thus is a valid competing paradigm to $\mathcal{H}_2$.

To accommodate multiple sources of uncertainty within Fig. 1, we can use a block diagonal structure:

\[ \Delta = \mathrm{diag}[\Delta_1, \ldots, \Delta_d]. \tag{3} \]

Here, different uncertainty blocks (parametric, LTI, LTV, or NL) enter in separate “channels”; $\mathbf{B}_{\boldsymbol{\Delta}}$ denotes the unit ball of operators with the prescribed structure. For stability studies, causality of the operator is required.

Robust stability under structured uncertainty is a rich topic: we refer to the article on the structured singular value ($\mu$) in this encyclopedia. We invoke here robustness conditions based on the set $\boldsymbol{\Lambda}$ of positive definite matrix scalings or multipliers of the form:

\[ \Lambda = \mathrm{diag}[\lambda_1 I, \ldots, \lambda_d I], \tag{4} \]

with submatrices of the same dimensions as the blocks in (3), thus commuting with a matrix $\Delta$ of that structure.

Consider the frequency family of matrix inequalities

\[ M_{qp}^*(j\omega)\Lambda(\omega)M_{qp}(j\omega) - \Lambda(\omega) < 0 \quad \forall\omega; \qquad \Lambda(\omega) \in \boldsymbol{\Lambda}. \tag{5} \]

At each $\omega$, this is a linear matrix inequality (LMI); testing its feasibility is a convex, tractable problem. A solution implies the scaled small-gain condition

\[ \bar\sigma\left(\Lambda(\omega)^{1/2} M_{qp}(j\omega) \Lambda(\omega)^{-1/2}\right) < 1; \]

this “$\mu$ upper bound” implies robust stability when uncertainty is LTI, through commuting $\Lambda(\omega)$ with $\Delta(j\omega)$.

If uncertainty is NLTV, (5) must be strengthened to enforce $\Lambda(\omega) = \Lambda$, constant in frequency. This condition turns out to be both necessary and sufficient for robust stability. Here the LMIs would be coupled in frequency; however, the Kalman-Yakubovich-Popov lemma reduces them to an equivalent LMI in terms of the state-space matrices in (2), with variables $\Lambda \in \boldsymbol{\Lambda}$ and an $n \times n$ matrix $P > 0$:

\[ \begin{bmatrix} A^*P + PA + C_q^*\Lambda C_q & PB_p \\ B_p^*P & -\Lambda \end{bmatrix} < 0. \tag{6} \]
What about performance? The mapping $T_{zv}(\Delta)$ between the disturbance $v$ and the error $z$ now depends on the uncertainty. The default procedure in robust control has been to measure performance with the same induced norm, evaluating $\|T_{zv}(\Delta)\|_{\mathcal{L}_2 \to \mathcal{L}_2}$ in the worst case over $\Delta \in \mathbf{B}_{\boldsymbol{\Delta}}$. This can be computed with similar complexity to establishing robust stability. It amounts, however, to treating noise with the same worst-case mentality as stability, a questionable choice. For instance, in LTI systems the worst-case signals are sinusoids at the worst frequency and spatial direction; while one should protect against such signals arising in the $\Delta$-loop due to instability, it is not natural to expect them as external disturbances, which are usually of broad spectrum.

Robust H2 Performance Analysis

In the absence of uncertainty, the $\mathcal{H}_2$ norm of the nominal mapping $T_{zv}(0) = M_{zv}$ provides a natural performance criterion, measuring the response to flat-spectrum disturbances or the transient response. When uncertainty is present, it motivates a worst-case analysis of stability; a natural combination is to impose robust $\mathcal{H}_2$ performance: evaluating the worst-case $\mathcal{H}_2$ norm of $T_{zv}(\Delta)$ over the uncertainty class $\mathbf{B}_{\boldsymbol{\Delta}}$. We will highlight some methods based on semidefinite programming to perform such evaluations; for further details and comparisons, we refer to Paganini and Feron (1999).


A Frequency Domain Robust Performance Criterion

Consider the following optimization:

\[ J_f := \inf \int_{-\infty}^{\infty} \mathrm{Tr}\left(Y(\omega)\right) \frac{d\omega}{2\pi}, \]

subject to

\[ M(j\omega)^* \begin{bmatrix} \Lambda(\omega) & 0 \\ 0 & I \end{bmatrix} M(j\omega) - \begin{bmatrix} \Lambda(\omega) & 0 \\ 0 & Y(\omega) \end{bmatrix} < 0 \tag{7} \]

for each $\omega$, and $\Lambda(\omega) \in \boldsymbol{\Lambda}$.

Here M is the transfer function in Fig. 2; a submatrix of the above includes (5), implying robust stability under structured LTI uncertainty. Furthermore, we have the robust $\mathcal{H}_2$ performance bound (Paganini 1999):

\[ \sup_{\Delta \in \mathbf{B}_{\boldsymbol{\Delta}}^{LTI}} \|T_{zv}(\Delta)\|_{\mathcal{H}_2}^2 \le J_f. \tag{8} \]

We sketch the argument based on the Fourier transforms $p(j\omega)$, etc., for signals in Fig. 2. Applying the quadratic form in (7) to the joint vector of $p$ and $v$ gives

\[ \sum_{i=1}^{d} \lambda_i(\omega)|q_i|^2 + |z|^2 \le \sum_{i=1}^{d} \lambda_i(\omega)|p_i|^2 + v^*Y(\omega)v. \]

The subvectors $p_i$, $q_i$ correspond to uncertainty blocks, $p_i = \Delta_i(j\omega)q_i$; since $\bar\sigma(\Delta_i(j\omega)) \le 1$, $|q_i| \ge |p_i|$. Also $\lambda_i(\omega) > 0$, so these terms can be simplified, leading to

\[ |T_{zv}(j\omega)v|^2 = |z|^2 \le v^*Y(\omega)v. \]

This means $T_{zv}(j\omega)^*T_{zv}(j\omega) \le Y(\omega)$ for every $\Delta$, and therefore the $\mathcal{H}_2$ norm bound

\[ \|T_{zv}(\Delta)\|_{\mathcal{H}_2}^2 \le \int_{-\infty}^{\infty} \mathrm{Tr}\left(Y(\omega)\right) \frac{d\omega}{2\pi} \]

holds, from which (8) follows.

The computation involved in (7) at each frequency is a semidefinite program (SDP): minimizing the linear cost $\mathrm{Tr}(Y(\omega))$ subject to an LMI constraint, a tractable problem. Adding a frequency sweep, we have a practical method to bound the desired robust performance.

The inequality (8) is in general strict. Beyond the usual conservatism of convex bounds for $\mu$, when noise is of dimension $m$, a conservatism of up to this order may appear; an improvement to address this issue with augmented SDPs is given in Sznaier et al. (2002). Finally, causality of the uncertainty is not imposed in the frequency-domain criterion.
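To make the per-frequency SDP in (7) concrete, the sketch below treats the simplest case: one scalar uncertainty block, scalar $v$ and $z$, and a single frequency. Under these assumptions the LMI is a 2 × 2 Hermitian matrix, a Schur complement gives the smallest feasible $Y$ for each scaling $\lambda$, and the remaining one-dimensional convex search over $\lambda$ is done by golden-section search instead of an SDP solver. The numerical entries of M and all function names are hypothetical.

```python
import cmath
import math

def golden_min(f, lo, hi, iters=100):
    """Golden-section search for the minimum value of a convex f on [lo, hi]."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - phi * (hi - lo), lo + phi * (hi - lo)
    f1, f2 = f(x1), f(x2)
    for _ in range(iters):
        if f1 < f2:
            hi, x2, f2 = x2, x1, f1
            x1 = hi - phi * (hi - lo)
            f1 = f(x1)
        else:
            lo, x1, f1 = x1, x2, f2
            x2 = lo + phi * (hi - lo)
            f2 = f(x2)
    return f(0.5 * (lo + hi))

def y_bound(M11, M12, M21, M22):
    """Infimum of Y such that M* diag(lam, 1) M - diag(lam, Y) <= 0 for some lam > 0."""
    M11, M12, M21, M22 = (complex(m) for m in (M11, M12, M21, M22))
    if abs(M11) >= 1.0:
        raise ValueError("need |M11| < 1 (scaled small-gain condition)")
    lam_min = abs(M21) ** 2 / (1.0 - abs(M11) ** 2)

    def y_of(lam):
        a = lam * (1.0 - abs(M11) ** 2) - abs(M21) ** 2          # minus the (1,1) entry
        c = lam * M11.conjugate() * M12 + M21.conjugate() * M22  # the (1,2) entry
        return lam * abs(M12) ** 2 + abs(M22) ** 2 + abs(c) ** 2 / a  # Schur complement

    return golden_min(y_of, lam_min + 1e-9, lam_min + 1e3)

def worst_case_gain_sq(M11, M12, M21, M22, n=720):
    """Sampled sup of |T_zv(delta)|^2 over |delta| = 1, where
    T_zv = M22 + M21*delta*M12 / (1 - M11*delta); by the maximum modulus
    principle the supremum over the unit disc is attained on its boundary."""
    best = 0.0
    for k in range(n):
        d = cmath.exp(2j * math.pi * k / n)
        best = max(best, abs(M22 + M21 * d * M12 / (1.0 - M11 * d)) ** 2)
    return best

M = (0.5, 1.0, 0.5, 1.0)   # hypothetical value of M(jw) at one frequency
bound = y_bound(*M)
worst = worst_case_gain_sq(*M)
print(bound, worst)
```

In this small instance the bound and the sampled worst case coincide (both equal 4), so no conservatism appears; integrating such per-frequency values over a frequency grid would approximate $J_f$.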

As in the study of robust stability, we wish to extend the analysis to NLTV uncertainty blocks. Now the mapping $T_{zv}(\Delta)$ can no longer be represented by a transfer function, so what is the “$\mathcal{H}_2$” cost? We return to our motivation for this performance notion: to measure the effect of disturbances of flat spectrum.

In Paganini (1999), the flat-spectrum property is imposed as a deterministic constraint on the input disturbances. For the scalar $v$ case, define $W_{\eta,B} \subset \mathcal{L}_2$ by the family of integral quadratic constraints:

\[ W_{\eta,B} := \left\{ v \in \mathcal{L}_2 : \int_{-\beta}^{\beta} |v(j\omega)|^2 \frac{d\omega}{2\pi} \le \frac{\beta}{\pi} + \eta \ \ \forall\beta; \quad \int_{-\beta}^{\beta} |v(j\omega)|^2 \frac{d\omega}{2\pi} \ge \frac{\beta}{\pi} - \eta, \ \beta \in [0, B] \right\}. \tag{9} \]

This imposes that the cumulative spectrum is approximately linear (to a tolerance $\eta > 0$), up to bandwidth $B$, and has sublinear growth beyond that. Extensions to vector-valued signals are also given. For a stable LTI system $T_{zv}$, it is not difficult to verify that

\[ \|T_{zv}\|_{\mathcal{H}_2} = \lim_{\eta \to 0,\ B \to \infty} \ \sup_{v \in W_{\eta,B}} \|T_{zv}v\|_2, \]

but the right-hand side applies to NLTV systems as well. The following result can be established in the latter case:

\[ \lim_{\eta \to 0,\ B \to \infty} \ \sup_{\Delta \in \mathbf{B}_{\boldsymbol{\Delta}}^{NLTV},\ v \in W_{\eta,B}} \|T_{zv}(\Delta)v\|_2^2 = J_f', \]

where the right-hand side is the variant of (7) with the restriction that $\Lambda(\omega) = \Lambda$, constant in frequency. In this case the characterization is exact, with equality above. This follows from a duality argument in function space, where $Y(\omega)$ appears as the multiplier for the constraint in (9). While coupled in frequency, $J_f'$ is again equivalent to a finite-dimensional SDP in state space.

Let us review, instead, a different state-space method, motivated by alternate definitions of the $\mathcal{H}_2$ cost.


A State-Space Criterion Invoking Causality

Consider the semidefinite program

\[ J_s := \inf \mathrm{Tr}(B_v^*PB_v) \quad \text{subject to } P > 0,\ \Lambda \in \boldsymbol{\Lambda}, \]
\[ \begin{bmatrix} A^*P + PA + C_q^*\Lambda C_q + C_z^*C_z & PB_p \\ B_p^*P & -\Lambda \end{bmatrix} < 0. \tag{10} \]

The LMI above is very similar to (6); indeed it provides a robust stability certificate and in addition a bound on a generalized cost, for arbitrary (NLTV) causal uncertainty blocks. Again, we sketch the argument.

For stability, consider the system of Fig. 2 with $v = 0$ and initial condition $x(0) = x_0$. Define the storage function $V(x) = x^*Px$; differentiating it under (2) and applying the LMI (10) to the joint vector of $x(t)$, $p(t)$ yield

\[ \dot V + |z|^2 \le -q^*\Lambda q + p^*\Lambda p = \sum_{i=1}^{d} \lambda_i\left(|p_i|^2 - |q_i|^2\right). \]

Integrating the above over $(0, t)$, the sum on the right becomes nonpositive because $\lambda_i > 0$ and the operator $\Delta_i : q_i \to p_i$ is causal and contractive. This leads to

\[ V(x(t)) + \int_0^t |z(\tau)|^2 \, d\tau \le V(x_0), \tag{11} \]

which implies Lyapunov stability; the bound can be sharpened to prove asymptotic stability. Also, letting $t \to \infty$ yields the energy bound $\|z\|_2^2 \le V(x_0)$.

Suppose now that $x_0$ is generated by applying to the (causal) system at rest an impulse $v(t) = \delta(t)$, assumed scalar. The result is $x(0+) = B_v$, so $V(x_0) = B_v^*PB_v$; the impulse response energy of $T_{zv}(\Delta)$ is thus bounded. Minimizing over $P$, $\Lambda$ leads to the robust $\mathcal{H}_2$ performance bound

\[ \sup_{\Delta \in \mathbf{B}_{\boldsymbol{\Delta}}^{NLTV}} \|T_{zv}(\Delta)\delta(t)\|_2^2 \le J_s, \]

where the $\mathcal{H}_2$ cost is generalized as the impulse response energy. An extension to multiple impulse channels is available. This kind of result was first obtained by Stoorvogel (1993) for unstructured uncertainty.
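The state-space bound can be checked by hand on a first-order example. The sketch below assumes a scalar state, one scalar uncertainty block, and scalar $v$, $z$, so that the LMI (10) reduces to a quadratic inequality in $P$ for each scaling $\lambda$; the smallest feasible $P$ is then minimized over $\lambda$ by golden-section search rather than by an SDP solver. The plant numbers and function names are hypothetical. The comparison uses constant gains $\delta \in [-1, 1]$, which are one particular causal, contractive choice of the uncertainty.

```python
import math

def golden_min(f, lo, hi, iters=100):
    """Golden-section search for the minimum value of a convex f on [lo, hi]."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - phi * (hi - lo), lo + phi * (hi - lo)
    f1, f2 = f(x1), f(x2)
    for _ in range(iters):
        if f1 < f2:
            hi, x2, f2 = x2, x1, f1
            x1 = hi - phi * (hi - lo)
            f1 = f(x1)
        else:
            lo, x1, f1 = x1, x2, f2
            x2 = lo + phi * (hi - lo)
            f2 = f(x2)
    return f(0.5 * (lo + hi))

def js_scalar(a, b_p, b_v, c_q, c_z):
    """Infimum of b_v^2 * P over P > 0, lam > 0 with
    2*a*P + lam*c_q^2 + c_z^2 + (P*b_p)^2 / lam <= 0   (scalar form of (10))."""
    k = a * a - (b_p * c_q) ** 2
    if a >= 0 or b_p == 0 or k <= 0:
        raise ValueError("need a < 0, b_p != 0 and |b_p * c_q| < |a|")
    lam_min = (b_p * c_z) ** 2 / k          # below this no P is feasible

    def p_min(lam):
        disc = max(0.0, k - (b_p * c_z) ** 2 / lam)
        return lam * (-a - math.sqrt(disc)) / (b_p * b_p)   # smaller root in P

    return b_v * b_v * golden_min(p_min, lam_min, lam_min + 1e3)

def impulse_energy(a, b_p, b_v, c_q, c_z, delta):
    """Energy of z for dx/dt = (a + b_p*delta*c_q) x, x(0+) = b_v, z = c_z x."""
    pole = a + b_p * delta * c_q
    return (c_z * b_v) ** 2 / (-2.0 * pole)

plant = (-2.0, 1.0, 1.0, 1.0, 1.0)          # hypothetical (a, b_p, b_v, c_q, c_z)
J_s = js_scalar(*plant)
energies = [impulse_energy(*plant, -1.0 + 0.01 * i) for i in range(201)]
print(J_s, max(energies))
```

Here the bound evaluates to 0.5, which is also the impulse response energy at $\delta = 1$, so for this example the certificate is tight.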

An alternate notion of $\mathcal{H}_2$ cost for NLTV systems, also considered in Stoorvogel (1993), is the average output variance when the input is random white noise. This is formalized by replacing (2) with a stochastic differential equation (e.g., Oksendal 1985) and extending the bound (11) using Ito calculus; for details see Paganini and Feron (1999). The following robust $\mathcal{H}_2$ performance bound is obtained:

\[ \limsup_{T \to \infty} \frac{1}{T} \int_0^T E|z(t)|^2 \, dt \le J_s \quad \forall \Delta \in \mathbf{B}_{\boldsymbol{\Delta}}^{NLTV}. \]

What if the uncertainty is time invariant? Incorporating frequency-dependent scalings, with causality, into the state-space approach must be done approximately, generating $\Lambda(j\omega)$ through the span of a predefined finite basis of causal, rational transfer functions. Searching over this basis for a bound on the impulse response energy can be pursued with state-space SDPs, now of a size increasing with the basis dimensionality. We refer to Feron (1997) for details.


Robust H2 Synthesis

Prior sections have focused on the robustness analysis of a closed-loop system M, obtained from G after designing a nominally stabilizing controller. Can we synthesize K with robust $\mathcal{H}_2$ performance as an objective? We overview some contributions to this question.

Multiobjective H2/H∞ Control
Let us discuss first the more modest objective of optimizing nominal $\mathcal{H}_2$ performance while guaranteeing robust stability. If the uncertainty block $\Delta$ in Fig. 2 is unstructured, the problem is equivalent to

\[ \text{Minimize } \|M_{zv}\|_{\mathcal{H}_2}, \quad \text{subject to } \|M_{qp}\|_{\mathcal{H}_\infty} < 1. \]

Using a Youla parameterization of stabilizing controllers, $M(s)$ depends affinely on a stable parameter $Q(s)$; this makes the optimization over $Q$ convex. However, it has been shown to give infinite-dimensional solutions that must be approximated by suitable truncations; see Sznaier et al. (2000) and references therein.


To better exploit the state-space structure common to $\mathcal{H}_2$ and $\mathcal{H}_\infty$ synthesis, Bernstein and Haddad (1989) proposed a simplification: minimize an auxiliary cost that upper bounds the $\mathcal{H}_2$ norm while imposing the $\mathcal{H}_\infty$ constraint, through a common storage function. This cost is optimized by controllers of the order of the plant, characterized in terms of coupled AREs; later on Khargonekar and Rotea (1991) recast this problem using convex optimization. Also Zhou et al. (1994) and Doyle et al. (1994) studied the dual (transpose) structure.

The latter version is in fact directly related to the analysis condition (10), with a fixed $\Lambda = \lambda I$. A matrix $P$ satisfying this condition imposes the $\mathcal{H}_\infty$ norm restriction and upper bounds the nominal $\mathcal{H}_2$ cost. This idea of imposing multiple objectives through a common storage function has more general applicability: Scherer et al. (1997) showed that all such problems admit tractable synthesis based on LMIs, with solutions of the same order as the plant.

Synthesis for Robust Performance 

We have seen that, rather than just an upper bound on nominal performance, (10) ensures the more stringent robust $\mathcal{H}_2$ performance requirement; therefore it becomes the basis of a robust $\mathcal{H}_2$ synthesis technique. In Stoorvogel (1993) this method is laid out for unstructured uncertainty: search linearly over the scalar $\lambda$ and solve the auxiliary cost synthesis problem for each $\lambda$.

What about structured uncertainty? We run here into a general difficulty of such synthesis questions, even for robust stability alone. In that case, seeking simultaneously a controller K and a scaling $\Lambda$ so that conditions (5) or (6) are satisfied by the resulting M is not a computationally friendly problem. In the absence of a general solution method, iterating between an $\mathcal{H}_\infty$ design of K for fixed $\Lambda$ and the analysis conditions to find $\Lambda$ is commonly used for design.

Things can be no easier for robust $\mathcal{H}_2$ performance, but the iterative procedure does generalize to the conditions in (10): for fixed K, the SDP will return structured $\Lambda$'s, which can then be fixed for a multiobjective synthesis step based on the “auxiliary cost” in (10) as discussed above. If constant $\Lambda$ are used (designing for NLTV uncertainty), all controllers obtained are of the order of the plant.

If uncertainty is LTI, an alternative is to carry out the analysis step in the frequency domain, finding $\Lambda(\omega)$, $Y(\omega)$ through (7). In the corresponding situation for $\mu$-synthesis, where only $\Lambda(\omega)$ is found, a step of fitting and spectral factorization is needed to approximate such scalings through rational weights, which are then incorporated into $\mathcal{H}_\infty$ synthesis. A similar frequency weight in the performance channel can approximate the effect of $Y(\omega)$, thus relying on weighted $\mathcal{H}_\infty$ synthesis to pursue the $\mathcal{H}_2$ performance objective. Of course, the order of the resulting controllers is increased.

Summary and Future Directions 

The tradeoff between performance and robustness is essential to feedback control. In the case of linear multivariable design, it motivated a compromise between $\mathcal{H}_2$ performance and $\mathcal{H}_\infty$-type robustness, pursued with the state-space and frequency-domain tools common
to these metrics. We have highlighted robust 
H 2 analysis conditions obtained in the 1990s 
based on semidefinite programming, which 
provided the greatest flexibility to integrate 
the aforementioned tools and different points 
of view (worst-case, average case) present in 
this problem. As in other situations, the robust 
synthesis question has proven more difficult: 
design cannot be “automated” to the degree that 
was once envisioned. 

The passage of time makes issues that once 
attracted strong attention look narrow in scope, 
so it is not natural to indicate directions that 
directly follow on this work. Perhaps the best 
legacy that the robust $\mathcal{H}_2$ generation can take
to other problems is the willingness to integrate 
various disciplines (dynamics, operator theory, 
stochastics, optimization) to face the demands of 
applied mathematical research. 

Cross-References 

► H-Infinity Control 

► KYP Lemma and Generalizations/Applications 


► Linear Quadratic Optimal Control 

► LMI Approach to Robust Control 

► Structured Singular Value and Applications: 
Analyzing the Effect of Linear Time-Invariant 
Uncertainty in Linear Systems 

Recommended Reading 

LQG control is covered in many textbooks, e.g., Anderson and Moore (1990). A standard text for robust control with an $\mathcal{H}_\infty$ perspective, including structured singular values, the Youla parameterization, and the Riccati equation solution for $\mathcal{H}_\infty$ synthesis, is Zhou et al. (1996); see also Sanchez-Pena and Sznaier (1998) with application examples. The textbook of Dullerud and Paganini (2000) incorporates the more recent developments based on LMIs; see Boyd and Vandenberghe (2004) for background on semidefinite programming.
Abstract

Model-predictive control (MPC) is indisputably one of the rare modern control techniques that has significantly affected control engineering practice due to its unique ability to systematically handle constraints and optimize performance. Robust MPC (RMPC) is an improved form of the nominal MPC that is intrinsically robust in the face of uncertainty. The main objective of RMPC is to devise an optimization-based control synthesis method that accounts for the intricate interactions of the uncertainty with the system, constraints, and performance criteria in a theoretically rigorous and computationally tractable way. RMPC has become an area of theoretical relevance and practical importance but still offers the fundamental challenge of reaching a meaningful compromise between the quality of structural properties and the computational complexity.


Keywords 

Model-predictive control; Robust optimal control; Robust stability

Introduction 

RMPC is an optimization-based approach to the synthesis of robust control laws for constrained control systems subject to bounded uncertainty. RMPC synthesis can be seen as an adequately defined repetitive decision-making process, in which the underlying decision-making process is a suitably formulated robust optimal control (ROC) problem. The underlying ROC problem is specified so as to ensure that all possible predictions of the controlled state and the corresponding control action sequences satisfy constraints and that the “worst-case” cost is minimized. The decision variable in the corresponding ROC problem is a control policy (i.e., a sequence of control laws) ensuring that different control actions are allowed at different predicted states, while the uncertainty takes on the role of the adversary. RMPC recursively utilizes the solution to the associated ROC problem in order to implement the feedback control law that is, in fact, equal to the first control law of an optimal control policy.

A theoretically rigorous approach to RMPC 
synthesis can be obtained either by employing, 
in a repetitive fashion, the dynamic programming 
solution of the corresponding ROC problem 
or by solving online, in a recursive manner, 
an infinite-dimensional optimization problem 
(Rawlings and Mayne 2009). In either case, the 
associated computational complexity renders the 
exact RMPC synthesis hardly ever tractable. 
This computational impracticability of the 
theoretically exact RMPC, in conjunction with 
the convoluted interactions of the uncertainty 
with the evolution of the controlled system, 
constraints, and control objectives, has made 
RMPC an extremely challenging and active 
research field. It has become evident that a 
prominent challenge is to develop a form 
of RMPC synthesis that adequately handles 
the effects of the uncertainty and yet is
computationally plausible. Contemporary 
research proposals aim to address the inevitable 
trade-off between the quality of guaranteed 
structural properties and the corresponding 
computational complexity. A categorization of 
the existing proposals for RMPC synthesis can 
be based on the treatment of the effects of 
the uncertainty. In this sense, two alternative 
approaches to RMPC synthesis appear to be 
dominant. 

The first category of the alternative approaches 
is represented by the methods that utilize, 
when possible, inherent robustness of nominal 
MPC synthesis. These proposals deploy a 
nominal MPC, albeit designed for a suitably 
modified control system, constraints, and control 
objectives. Such approaches are computationally 
practicable. However, the effects of the uncertainty
are taken care of in an indirect way; the
robustness properties of the controlled dynamics 
are frequently addressed via an a posteriori 
input-to-state stability analysis, which might be 
unnecessarily conservative and geometrically 
insensitive. Equally important drawbacks of 
these approaches to RMPC synthesis arise due 
to the fact that the nominal MPC synthesis is 
itself an inherently fragile (nonrobust) process; 
in particular, the stability property of the 
conventional MPC might fail to be robust 
(Grimm et al. 2004) and, furthermore, the optimal 
control of constrained discrete time systems, 
employed for the nominal MPC synthesis, can be 
a fragile process itself (Rakovic 2009). 

The second category of RMPC design 
methods encapsulates the approaches that take 
the effects of the uncertainty into account more 
directly. These proposals are compatible with 
the emerging consensus: there is a need for the 
deployment of the simplifying approximations 
of the underlying control policy and sensible 
prioritization and modification of control 
objectives so as to simultaneously enhance 
computational tractability and ensure a priori 
guarantees of the desirable topological properties 
and system-theoretic rigor. The simplifying 
parameterizations of the control policy are employed 
primarily to allow for a computationally 
efficient handling of the interactions of the 
uncertainty with the evolution of the controlled 
system and constraints. The control objectives are 
prioritized and modified when necessary, in order 
to ensure that the corresponding ROC problem 
is computationally tractable. The effectiveness 
of such methods depends crucially on the ability 
to detect a sufficiently rich parameterization of 
control policy and to devise a systematic way for 
meaningful simplification of control objectives. 

In stark contrast to the well-matured theory 
of nominal MPC synthesis, a systematic assessment 
of, and unified exposure to, the current 
state of affairs in the RMPC field is highly 
demanding. Nevertheless, it is possible to 
outline the main aspects of the exact RMPC synthesis 
and to provide an overview of the dominant 
simplifying approximations. 


Contemporary Setting and Uncertainty Effect 

The contemporary approach to the exact RMPC 
synthesis is now delineated in a step-by-step 
manner. 

The system: The most common setting in 
RMPC synthesis considers control systems 
modelled, in discrete time, by 

$$x^+ = f(x, u, w), \tag{1}$$

where $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$, $w \in \mathbb{R}^p$, and 
$x^+ \in \mathbb{R}^n$ are, respectively, the current state, 
control, and uncertainty, and the successor state, while 
$f(\cdot,\cdot,\cdot) : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^p \to \mathbb{R}^n$ is the state 
transition map, assumed to be continuous. Thus, 
when $x_k$, $u_k$, and $w_k$ are the state, the control, 
and the uncertainty at the time instance $k$, then 
$x_{k+1} = f(x_k, u_k, w_k)$ is the state at the time 
instance $k+1$. 

The constraints: The system variables $x$, $u$, and 
$w$ are subject to hard constraints: 

$$(x, u, w) \in \mathbb{X} \times \mathbb{U} \times \mathbb{W}, \tag{2}$$

where the constraint sets $\mathbb{X}$ and $\mathbb{U}$ represent state 
and control constraints, while the constraint set 
$\mathbb{W}$ specifies geometric bounds on the uncertainty. 
The constraint sets $\mathbb{X} \subset \mathbb{R}^n$, $\mathbb{U} \subset \mathbb{R}^m$, and 
$\mathbb{W} \subset \mathbb{R}^p$ are assumed to be compact. 

The control policy: It is necessary to specify, 
in a manner that is compatible with the type 
and nature of the uncertainty, the information 
available for the RMPC synthesis. The traditional 
state feedback setting treats the case in which, 
at any time instance $k$, the state $x_k$ is known 
when the current control $u_k$ is determined, while 
the values of the current and future uncertainty 
$w_{k+i}$ are not known but are guaranteed to take 
values within the uncertainty constraint set $\mathbb{W}$ 
(i.e., $w_{k+i} \in \mathbb{W}$). Within this setting, the use of a 
control policy, 

$$\Pi_{N-1} := \{\pi_0(\cdot), \pi_1(\cdot), \ldots, \pi_{N-1}(\cdot)\}, \tag{3}$$

where $N$ is the prediction horizon and each 
$\pi_k(\cdot) : \mathbb{R}^n \to \mathbb{R}^m$ is a control law, is structurally 
permissible and desirable. 

The generalized state and control predictions: 
Because of the uncertainty, the ordinary state and 
control predictions, as employed in the nominal 
MPC, are not suitable. Clearly, when $x$ and $\kappa(x)$ 
are the current state and control, then the successor 
state can take any value in the possible set 
of successor states $\{f(x, \kappa(x), w) : w \in \mathbb{W}\}$. 
Consequently, it is necessary to consider suitably 
generalized state and control predictions. The 
interaction of the uncertainty with the predicted 
behavior of the system is captured naturally by 
invoking the maps $F(\cdot,\cdot)$ and $G(\cdot,\cdot)$ specified, 
for any subset $X$ of $\mathbb{R}^n$ and any control function 
$\kappa(\cdot) : \mathbb{R}^n \to \mathbb{R}^m$, by 

$$F(X, \kappa) := \{f(x, \kappa(x), w) : x \in X,\ w \in \mathbb{W}\} \quad\text{and}\quad G(X, \kappa) := \{\kappa(x) : x \in X\}. \tag{4}$$

Within the considered setting, the corresponding 
state and control predictions are, in fact, set-valued 
and, for each relevant $k$, obey the relations 

$$X_{k+1} = F(X_k, \pi_k) \quad\text{and}\quad U_k = G(X_k, \pi_k), \quad\text{with } X_0 := \{x\}. \tag{5}$$

The set sequences $\mathbf{X}_N := \{X_0, X_1, \ldots, X_{N-1}, X_N\}$ 
and $\mathbf{U}_{N-1} := \{U_0, U_1, \ldots, U_{N-1}\}$ represent the 
possible sets of the predicted states and control 
actions, which are commonly known as the state 
and control tubes. Evidently, the state and control 
tubes are functions of the initial state $x$ and 
the control policy $\Pi_{N-1}$. Conversely, for a given 
initial state $x$, any structurally permissible control 
policy $\Pi_{N-1}$ results in the possible sets of the 
predicted states and control actions. 

The robust constraint satisfaction: One of 
the primary objectives in RMPC synthesis is 
to ensure that the generalized state and control 
predictions satisfy state and control constraints. 
Because of the repetitive nature of RMPC, it 
would be ideal to consider the control policy and 
generalized state and control predictions over the 
infinite horizon (i.e., for $N = \infty$). Unfortunately, 
this is hardly ever practicable in a direct fashion. 
When the prediction horizon is finite, the robust 
constraint satisfaction reduces to the conditions 
that for all $k = 0, 1, \ldots, N-1$, the set inclusions 

$$X_k \subseteq \mathbb{X} \quad\text{and}\quad U_k \subseteq \mathbb{U} \tag{6}$$

hold true and that the possible set of states $X_N$ 
at the prediction time instance $N$ satisfies the set 
inclusion 

$$X_N \subseteq \mathbb{X}_f, \tag{7}$$

where $\mathbb{X}_f \subseteq \mathbb{X}$ is a suitable terminal constraint 
set. 
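
The set recursion (5) and the inclusions (6) and (7) can be illustrated on a deliberately simple case. The sketch below assumes a scalar system f(x, u, w) = a x + b u + w, a linear law pi_k(x) = K x at every step, and interval sets; for such a system the image of an interval is an interval, so the tubes can be propagated exactly through their endpoints. The function names and all numerical values are illustrative assumptions, not part of the original text.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # closed interval [lo, hi]


def scale(s: float, X: Interval) -> Interval:
    """Image of the interval X under the map x -> s * x."""
    lo, hi = s * X[0], s * X[1]
    return (min(lo, hi), max(lo, hi))


def minkowski_sum(X: Interval, Y: Interval) -> Interval:
    """Set addition of two intervals."""
    return (X[0] + Y[0], X[1] + Y[1])


def subset(A: Interval, B: Interval) -> bool:
    """True when the interval A is contained in the interval B."""
    return B[0] <= A[0] and A[1] <= B[1]


def tubes(x0: float, a: float, b: float, K: float, w_bar: float,
          N: int) -> Tuple[List[Interval], List[Interval]]:
    """State and control tubes of (5) for x+ = a x + b u + w, u = K x, |w| <= w_bar."""
    W = (-w_bar, w_bar)
    X = [(x0, x0)]                                            # X_0 := {x}
    U = []
    for _ in range(N):
        U.append(scale(K, X[-1]))                             # U_k = G(X_k, pi_k)
        X.append(minkowski_sum(scale(a + b * K, X[-1]), W))   # X_{k+1} = F(X_k, pi_k)
    return X, U


def robustly_feasible(X: List[Interval], U: List[Interval], Xc: Interval,
                      Uc: Interval, Xf: Interval) -> bool:
    """Inclusions (6) for k = 0..N-1 and the terminal inclusion (7)."""
    return (all(subset(Xk, Xc) for Xk in X[:-1])
            and all(subset(Uk, Uc) for Uk in U)
            and subset(X[-1], Xf))
```

For example, `tubes(1.0, 1.0, 1.0, -0.5, 0.1, 2)` propagates two steps of the tube from the initial state 1, and `robustly_feasible` then reports whether a given choice of state, control, and terminal intervals is respected for every admissible uncertainty realization.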

The terminal constraint set: In order to account 
for the utilization of the control policy 
$\Pi_{N-1}$ and generalized state and control predictions 
over the finite horizon $N$ and to ensure that 
these can be prolonged indirectly over the infinite 
horizon, a terminal constraint set is employed. 
This set is obtained by considering the uncertain 
dynamics 

$$x^+ = f(x, \kappa_f(x), w) \tag{8}$$

controlled by a local control function $\kappa_f(\cdot)$. The 
design of a control law $\kappa_f(\cdot)$ is usually performed 
offline in an optimal manner by considering the 
unconstrained version of the system (1), while 
the terminal constraint set $\mathbb{X}_f$ accounts locally 
for the state and control constraints. The terminal 
constraint set $\mathbb{X}_f$ is assumed to be compact and 
robust positively invariant for the dynamics (8) 
and constraint sets (2). Thus, the set $\mathbb{X}_f$ and a 
local control function $\kappa_f(\cdot)$ satisfy 

$$F(\mathbb{X}_f, \kappa_f) \subseteq \mathbb{X}_f \subseteq \mathbb{X} \quad\text{and}\quad \mathbb{U}_f := G(\mathbb{X}_f, \kappa_f) \subseteq \mathbb{U}, \tag{9}$$

or, equivalently, $\mathbb{X}_f \subseteq \mathbb{X}$, and for all $x \in \mathbb{X}_f$, 
it holds that $\kappa_f(x) \in \mathbb{U}$ and, for all $w \in \mathbb{W}$, 
$f(x, \kappa_f(x), w) \in \mathbb{X}_f$. The most appropriate 
choice for $\mathbb{X}_f$ is the maximal robust positively 
invariant set for the dynamics (8) and constraint 
sets (2). 
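
As a hedged illustration of condition (9), the sketch below treats the scalar local dynamics x+ = phi x + w with phi = a + bK, |phi| < 1, |w| <= w_bar, and symmetric interval constraints |x| <= x_max, |u| <= u_max. Under these assumptions (which are made here for illustration only) a symmetric interval [-c, c] is robust positively invariant exactly when |phi| c + w_bar <= c, and the largest admissible candidate is the half-width of {x : |x| <= x_max, |Kx| <= u_max}. All names are hypothetical.

```python
from typing import Optional


def is_rpi_interval(c: float, phi: float, w_bar: float) -> bool:
    """[-c, c] is robust positively invariant for x+ = phi*x + w, |w| <= w_bar."""
    return abs(phi) * c + w_bar <= c


def terminal_set_ok(c: float, phi: float, K: float, w_bar: float,
                    x_max: float, u_max: float) -> bool:
    """Condition (9) for X_f = [-c, c], X = [-x_max, x_max], U = [-u_max, u_max]."""
    return (is_rpi_interval(c, phi, w_bar)   # F(X_f, kappa_f) inside X_f
            and c <= x_max                   # X_f inside X
            and abs(K) * c <= u_max)         # G(X_f, kappa_f) inside U


def maximal_rpi_halfwidth(phi: float, K: float, w_bar: float,
                          x_max: float, u_max: float) -> Optional[float]:
    """Half-width of the maximal RPI interval (assuming |phi| < 1), or None."""
    r = x_max if K == 0 else min(x_max, u_max / abs(K))
    return r if is_rpi_interval(r, phi, w_bar) else None
```

In this scalar setting the maximal set is simply the whole admissible interval whenever that interval is invariant; in higher dimensions the maximal robust positively invariant set generally has to be computed iteratively.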

The generalized origin: Due to the presence 
of the uncertainty, the stabilization of the origin 
might not be attainable and, thus, it might be 
necessary to consider the origin in a generalized 
sense. The most natural candidate for the 
generalized origin is a minimal robust positively 
invariant set for the dynamics (8) and constraint 
sets (2). This set is entirely determined by the 
associated state set dynamics 

$$X^+ = F(X, \kappa_f), \tag{10}$$

which are completely induced by the local dynamics 
(8) and the uncertainty constraint set 
$\mathbb{W}$. The generalized origin, namely, the minimal 
robust positively invariant set, is compact and 
well defined in the case when the local control 
function $\kappa_f(\cdot)$ ensures that the corresponding 
map $F(\cdot, \kappa_f)$ is a contraction on the space of 
compact subsets of $\mathbb{X}_f$ (Artstein and Rakovic 
2008), which we assume to be the case. The 
generalized origin $X_{\mathcal{O}}$ is the unique solution to 
the fixed-point set equation 

$$X = F(X, \kappa_f), \tag{11}$$

and is an exponentially stable attractor for the 
state set dynamics (10) with the basin of attraction 
being the space of compact subsets of 
$\mathbb{X}_f$. Thus, the conventional $(0, 0)$ fixed-point pair 
ought to be replaced by the fixed-point pair of sets 
$(X_{\mathcal{O}}, U_{\mathcal{O}})$ required to satisfy 

$$X_{\mathcal{O}} = F(X_{\mathcal{O}}, \kappa_f) \subseteq \operatorname{interior}(\mathbb{X}_f) \quad\text{and}\quad U_{\mathcal{O}} := G(X_{\mathcal{O}}, \kappa_f) \subseteq \mathbb{U}_f. \tag{12}$$

The generalized cost functions: The performance 
requirements are, as usual, expressed via 
a cost function, which is obtained by considering 
a stage cost function $\ell(\cdot,\cdot) : \mathbb{X} \times \mathbb{U} \to \mathbb{R}_+$ and 
a terminal cost function $V_f(\cdot) : \mathbb{X}_f \to \mathbb{R}_+$. The 
stage cost function $\ell(\cdot,\cdot)$ is continuous and, due to 
the uncertainty, adequately lower bounded w.r.t. 
the generalized origin $X_{\mathcal{O}}$. The latter condition 
requires that for all $x \in \mathbb{X}$ and all $u \in \mathbb{U}$, the 
function $\ell(\cdot,\cdot)$ satisfies 

$$\alpha_1(\operatorname{dist}(X_{\mathcal{O}}, x)) \le \ell(x, u), \tag{13}$$

where $\alpha_1(\cdot)$ is a $\mathcal{K}$-class (Kamke's) function and 
$\operatorname{dist}(X_{\mathcal{O}}, \cdot)$ is the distance function from the set 
$X_{\mathcal{O}}$. The consideration of the generalized origin 
requires the additional condition that for all $x \in X_{\mathcal{O}}$, 
the use of the local control function $\kappa_f(\cdot)$ is "free 
of charge" w.r.t. $\ell(\cdot,\cdot)$, i.e., that for all $x \in X_{\mathcal{O}}$, 
we have 

$$\ell(x, \kappa_f(x)) = 0. \tag{14}$$

As in the case of the terminal constraint set $\mathbb{X}_f$, 
the terminal cost function $V_f(\cdot)$ is employed to 
account for the utilization of the finite prediction 
horizon $N$, and it should provide locally a 
theoretically suitable upper bound of the highly 
desired infinite horizon cost. The terminal cost 
function $V_f(\cdot)$ is assumed to be continuous and 
adequately upper bounded w.r.t. the generalized 
origin $X_{\mathcal{O}}$. The latter bound reduces to the 
requirement that for all $x \in \mathbb{X}_f$, we have 

$$V_f(x) \le \alpha_2(\operatorname{dist}(X_{\mathcal{O}}, x)), \tag{15}$$

where, as above, $\alpha_2(\cdot)$ is a $\mathcal{K}$-class function. In 
addition, the terminal cost function $V_f(\cdot)$ satisfies 
locally a usual condition for robust stabilization, 
which is expressed by the requirement that for all 
$x \in \mathbb{X}_f$ and all $w \in \mathbb{W}$, it holds that 

$$V_f(f(x, \kappa_f(x), w)) - V_f(x) \le -\ell(x, \kappa_f(x)). \tag{16}$$
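
The fixed-point characterization (11) of the generalized origin introduced above can be explored numerically in a simple case. The sketch below assumes scalar local dynamics x+ = phi x + w with |phi| < 1 and |w| <= w_bar, so that the set map of (10) acts on intervals and is a contraction; iterating it from any compact interval then approaches the minimal robust positively invariant interval, whose half-width is w_bar / (1 - |phi|). Function names, tolerances, and numbers are assumptions made for illustration.

```python
from typing import Tuple

Interval = Tuple[float, float]  # closed interval [lo, hi]


def set_map(X: Interval, phi: float, w_bar: float) -> Interval:
    """F(X, kappa_f) of (10) for an interval X: the set phi*X (+) [-w_bar, w_bar]."""
    lo, hi = sorted((phi * X[0], phi * X[1]))
    return (lo - w_bar, hi + w_bar)


def generalized_origin(phi: float, w_bar: float,
                       X_init: Interval = (0.0, 0.0),
                       tol: float = 1e-12,
                       max_iter: int = 10_000) -> Interval:
    """Iterate the set dynamics (10) until the fixed point of (11) is reached."""
    X = X_init
    for _ in range(max_iter):
        X_next = set_map(X, phi, w_bar)
        if max(abs(X_next[0] - X[0]), abs(X_next[1] - X[1])) < tol:
            return X_next
        X = X_next
    raise RuntimeError("no convergence; the sketch assumes |phi| < 1")
```

Starting the iteration from different compact intervals and obtaining the same limit is a small numerical reflection of the attractor property stated for the state set dynamics (10).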

The cost function $V_N(\cdot,\cdot,\cdot)$ is defined, for 
all $x \in \mathbb{X}$, all $\Pi_{N-1}$, and all $\mathbf{w}_{N-1} := 
\{w_0, w_1, \ldots, w_{N-1}\}$, by 

$$V_N(x, \Pi_{N-1}, \mathbf{w}_{N-1}) := \sum_{k=0}^{N-1} \ell(x_k, u_k) + V_f(x_N), \tag{17}$$

where, for notational simplicity, $u_k := \pi_k(x_k)$ 
and $x_k := x_k(x, \Pi_{N-1}, \mathbf{w}_{N-1})$ denote the solution 
of (1) when the initial state is $x$, the control 
policy is $\Pi_{N-1}$, and the uncertainty realization is 
$\mathbf{w}_{N-1}$. 

The exact ROC: In view of the uncertainty, 
the corresponding exact ROC problem $\mathbb{P}_N(x)$, 
for any $x \in \mathbb{X}$, aims to optimize the "worst-case" 
performance so that it takes the form of an 
infinite-dimensional minimaximization: 

$$\begin{aligned}
J_N(x, \Pi_{N-1}) &:= \max_{\mathbf{w}_{N-1} \in \mathbb{W}^N} V_N(x, \Pi_{N-1}, \mathbf{w}_{N-1}), \\
V_N^0(x) &:= \min_{\Pi_{N-1} \in \boldsymbol{\Pi}_{N-1}(x)} J_N(x, \Pi_{N-1}), \\
\Pi_{N-1}^0(x) &\in \arg\min_{\Pi_{N-1} \in \boldsymbol{\Pi}_{N-1}(x)} J_N(x, \Pi_{N-1}),
\end{aligned} \tag{18}$$

where $\boldsymbol{\Pi}_{N-1}(x)$ denotes the set of the constraint 
admissible control policies defined, for all $x \in \mathbb{X}$, 
by 

$$\boldsymbol{\Pi}_{N-1}(x) := \{\Pi_{N-1} : \text{conditions (5)--(7) hold}\}. \tag{19}$$

The value function $V_N^0(\cdot)$ might not admit a 
unique optimal control policy, so that $\Pi_{N-1}^0(\cdot)$ 
represents a selection from the set of optimal 
control policies (this selection is usually induced 
by a numerical solver employed for the online 
calculations). The effective domain $\mathcal{X}_N$ of the 
value function $V_N^0(\cdot)$ and associated optimal control 
policy $\Pi_{N-1}^0(\cdot)$ is given by 

$$\mathcal{X}_N := \{x \in \mathbb{R}^n : \boldsymbol{\Pi}_{N-1}(x) \neq \emptyset\} \tag{20}$$

and is known in the literature as the $N$-step min-max 
controllable set to a target set $\mathbb{X}_f$. Within the 
considered setting, the set $\mathcal{X}_N$ is a compact subset 
of $\mathbb{X}$ such that $\mathbb{X}_f \subseteq \mathcal{X}_N$. 
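
The structure of the minimaximization (18) can be sketched on a heavily restricted toy problem. The code below is not the exact ROC: it assumes a scalar system x+ = a x + b u + w, restricts the policy to a single linear gain chosen from a finite candidate list, ignores the constraints (5) to (7), and uses ordinary quadratic stage and terminal costs instead of the set-distance costs of (13) to (16). Because the closed-loop state is affine in the uncertainty and the costs are convex, the inner maximum over the box of uncertainty sequences is attained at a vertex, which is why only extreme realizations are enumerated. All names and values are illustrative assumptions.

```python
from itertools import product
from typing import Sequence


def worst_case_cost(x: float, K: float, a: float, b: float,
                    w_bar: float, N: int) -> float:
    """Inner maximization of (18) for pi_k(x) = K x, over vertex realizations of W^N."""
    worst = float("-inf")
    for ws in product((-w_bar, w_bar), repeat=N):
        xk, cost = x, 0.0
        for w in ws:
            u = K * xk
            cost += xk * xk + u * u      # assumed stage cost l(x, u) = x^2 + u^2
            xk = a * xk + b * u + w      # system (1), scalar linear case
        cost += xk * xk                  # assumed terminal cost V_f(x) = x^2
        worst = max(worst, cost)
    return worst


def min_max_gain(x: float, gains: Sequence[float], a: float, b: float,
                 w_bar: float, N: int) -> float:
    """Outer minimization of (18), restricted to a finite list of candidate gains."""
    return min(gains, key=lambda K: worst_case_cost(x, K, a, b, w_bar, N))
```

The number of vertex realizations grows as 2^N even in this scalar sketch, which gives a small-scale impression of why the exact problem is rarely tractable.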

The exact RMPC: The exact RMPC synthesis 
requires online solution of the minimaximization 
(18) in order to implement numerically the control 
law $\pi_0^0(\cdot)$. The control law $\pi_0^0(\cdot)$ is well 
defined for all $x \in \mathcal{X}_N$, and it induces the 
controlled uncertain dynamics specified, for all 
$x \in \mathcal{X}_N$, by 

$$x^+ \in \mathcal{F}(x), \quad \mathcal{F}(x) := \{f(x, \pi_0^0(x), w) : w \in \mathbb{W}\}. \tag{21}$$

Within the considered setting, the exact RMPC 
law $\pi_0^0(\cdot)$ renders the $N$-step min-max controllable 
set $\mathcal{X}_N$ robust positively invariant. Namely, 
for all $x \in \mathcal{X}_N$, it holds that 

$$\mathcal{F}(x) \subseteq \mathcal{X}_N \subseteq \mathbb{X} \quad\text{and}\quad \pi_0^0(x) \in \mathbb{U}. \tag{22}$$

Furthermore, the associated value function 
$V_N^0(\cdot) : \mathcal{X}_N \to \mathbb{R}_+$ is, by construction, 
a Lyapunov certificate verifying the robust 
asymptotic stability of the generalized origin $X_{\mathcal{O}}$ 
for the controlled uncertain dynamics (21) with 
the basin of attraction being equal to the $N$-step 
min-max controllable set $\mathcal{X}_N$. More precisely, 
for all $x \in \mathcal{X}_N$, it holds that 

$$\alpha_1(\operatorname{dist}(X_{\mathcal{O}}, x)) \le V_N^0(x) \le \alpha_3(\operatorname{dist}(X_{\mathcal{O}}, x)), \tag{23}$$

where $\alpha_3(\cdot)$ is a suitable $\mathcal{K}$-class function, while 
for all $x \in \mathcal{X}_N$ and all $x^+ \in \mathcal{F}(x)$, it holds that 

$$V_N^0(x^+) - V_N^0(x) \le -\alpha_1(\operatorname{dist}(X_{\mathcal{O}}, x)). \tag{24}$$

Clearly, under fairly natural conditions, the exact 
RMPC synthesis induces rather strong structural 
properties, but the associated computational 
complexity is overwhelming. However, in the 
above overview, the effects of the uncertainty 
have been "dissected" and the "basic building 
blocks" employed for the exact RMPC synthesis 
have been clearly identified. In turn, this step-by-step 
overview suggests indirectly the meaningful 
and simplifying approximations in order to enhance 
computational practicability. 

Computational Simplifications 

The computational intractability of the exact 
RMPC synthesis can be tackled by considering 
suitable parameterizations of the control policy 
$\Pi_{N-1}$ and the associated state and control tubes 
$\mathbf{X}_N$ and $\mathbf{U}_{N-1}$ and by adopting computationally 
simpler performance criteria. 

The core simplification is the use of a finite-dimensional 
parameterization of the control policy. 
The control policy should be suitably parameterized 
so as to allow for the utilization of both the 
least conservative generalized state and control 
predictions and a range of simpler, but sensible, 
cost functions. 

The explicit form of the exact state and control 
tubes is usually highly complex, and it is computationally 
beneficial to employ, when feasible, 
the implicit representation of the possible sets of 
predicted state and control actions. An alternative 
is to utilize outer-bounding approximations of the 
exact state and control tubes; these are obtained 
by making use of simpler sets that usually admit 
finite-dimensional parameterizations. In the latter 
case, the exact set dynamics of the state and 
control tubes given by (5) are usually relaxed to 
the set inclusions 

$$\{x\} \subseteq X_0, \quad F(X_k, \pi_k) \subseteq X_{k+1}, \quad\text{and}\quad G(X_k, \pi_k) \subseteq U_k.$$

The generalized origin, i.e., the minimal robust 
positively invariant set $X_{\mathcal{O}}$, is an integral component 
of the analysis. Its explicit computation 
is rather demanding and, hence, its use for the 
online calculations might not be convenient. A 
computationally feasible alternative is to deploy 
the terminal constraint set $\mathbb{X}_f$ as a "relaxed form" 
of the generalized origin; this is particularly beneficial 
when the local control function $\kappa_f(\cdot)$ is 
optimal w.r.t. the infinite horizon cost associated with 
the unconstrained version of the system (1). 

The performance requirements should be carefully 
prioritized and modified when necessary, so 
as to be expressible by cost 
functions that do not require intractable minimax 
optimization but still ensure that the associated 
value function verifies the robust stability and 
attractivity of the generalized origin $X_{\mathcal{O}}$ or the 
terminal constraint set $\mathbb{X}_f$. 

The outlined guidelines have played a pivotal 
role in devising a number of theoretically 
sound and computationally efficient parameterized 
RMPC syntheses within the setting of linear 
control systems subject to additive disturbances 
and polytopic constraints. In this linear-polytopic 
setting, the state transition map $f(\cdot,\cdot,\cdot)$ of (1) is 
linear: 

$$f(x, u, w) = Ax + Bu + w, \tag{25}$$

where the matrix pair $(A, B) \in \mathbb{R}^{n \times n} \times \mathbb{R}^{n \times m}$ 
is assumed to be known and strictly stabilizable. 
The local control function $\kappa_f(\cdot)$ and associated 
local uncertain dynamics are linear: 

$$u = Kx \quad\text{and}\quad x^+ = (A + BK)x + w. \tag{26}$$

The matrix $K \in \mathbb{R}^{m \times n}$ is designed offline and 
is such that the eigenvalues of the matrix $A + BK$ 
are strictly inside the unit circle. The constraint 
sets $\mathbb{X}$ and $\mathbb{U}$ are polytopes (a polytope is a convex 
and compact set specified by finitely many 
linear/affine inequalities, or by a convex hull of 
finitely many points) in $\mathbb{R}^n$ and $\mathbb{R}^m$ that contain 
the origin in their interior. The uncertainty constraint 
set $\mathbb{W}$ is a polytope in $\mathbb{R}^n$ that contains the 
origin. 

The terminal constraint set $\mathbb{X}_f$ is the maximal 
robust positively invariant set for $x^+ = (A + BK)x + w$ 
and constraint set $(\mathbb{X}_K, \mathbb{W})$, where $\mathbb{X}_K := \{x \in 
\mathbb{X} : Kx \in \mathbb{U}\}$. The set $\mathbb{X}_f$ is assumed to be 
a polytope in $\mathbb{R}^n$ that contains the generalized 
origin $X_{\mathcal{O}}$ (which is the minimal robust positively 
invariant set for $x^+ = (A + BK)x + w$ and 
constraint set $(\mathbb{X}_K, \mathbb{W})$) in its interior. 

It has recently been demonstrated that the 
major simplified RMPC syntheses in the linear-polytopic 
setting employ control policies within 
the class of separable state feedback (SSF) 
control policies (Rakovic 2012). More precisely, 
the predictions of the overall states $x_k$ and 
associated control actions $u_k$ are parameterized 
in terms of the predictions of the partial states 
$x_{(j,k)}$, $j = 0, 1, \ldots, k$ and partial control actions 
$u_{(j,k)}$, $j = 0, 1, \ldots, k$ via 

$$x_k = \sum_{j=0}^{k} x_{(j,k)} \quad\text{and}\quad u_k = \sum_{j=0}^{k} u_{(j,k)}, \tag{27}$$

where, for notational simplicity, $u_k := \pi_k(x_k)$ 
and $u_{(j,k)} := \pi_{(j,k)}(x_{(j,k)})$. To ensure the dynamical 
consistency with (25), the predicted partial 
states $x_{(j,k)}$ evolve according to 

$$x_{(j,k+1)} = A x_{(j,k)} + B u_{(j,k)}, \tag{28}$$

(for $j = 0, 1, \ldots, N-1$ and $k = j, j+1, \ldots, N-1$), 
while the "partial" initial conditions 
$x_{(k,k)}$ satisfy 

$$x_{(0,0)} = x \quad\text{and} \tag{29a}$$

$$x_{(k,k)} = w_{k-1} \quad\text{for } k = 1, 2, \ldots, N. \tag{29b}$$
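
The consistency of the decomposition (27) with the dynamics (25) can be checked numerically. The sketch below assumes a scalar system and, purely for illustration, uses the linear partial laws u_(j,k) = K x_(j,k) for every j; it propagates the partial states by (28) with the initial conditions (29a) and (29b) and, in parallel, the overall state by (25). The function name and the particular choice of partial control laws are assumptions, not the SSF parameterization itself.

```python
from typing import Dict, List, Sequence, Tuple


def ssf_rollout(x0: float, ws: Sequence[float], a: float, b: float,
                K: float) -> Tuple[List[float], Dict[Tuple[int, int], float]]:
    """Overall states x_k and partial states x_(j,k) for a given disturbance sequence."""
    partial: Dict[Tuple[int, int], float] = {(0, 0): x0}      # (29a)
    xs = [x0]
    for k in range(len(ws)):
        # illustrative partial control laws u_(j,k) = K * x_(j,k)
        us = {j: K * partial[(j, k)] for j in range(k + 1)}
        u_k = sum(us.values())                                # (27), control part
        xs.append(a * xs[-1] + b * u_k + ws[k])               # (25)
        for j in range(k + 1):
            partial[(j, k + 1)] = a * partial[(j, k)] + b * us[j]   # (28)
        partial[(k + 1, k + 1)] = ws[k]                       # (29b)
    return xs, partial
```

Summing the partial states x_(j,k) over j = 0, ..., k should reproduce x_k at every step, since (28) and (29) simply split the linear dynamics into the response to the initial state and the responses to the individual disturbances.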

As elaborated on in Rakovic (2012) and Rakovic 
et al. (2012), the utilization of the SSF control 
policy allows for: 

• The deployment of the highly desirable implicit 
representation of the exact state and control 
tubes induced by the SSF control policy. 
This implicit representation is parameterized 
via $O(N^2)$ decision variables. 

• The numerically convenient formulation of 
the robust constraint satisfaction via $O(N^2)$ 
linear/affine inequalities and equalities. 

• The computationally efficient minimization of 
an upper bound of the "worst-case" cost for 
which the stage and terminal cost functions are 
specified in terms of the weighted distances 
from the terminal constraint set $\mathbb{X}_f$ and the 
associated control set $\mathbb{U}_f = K\mathbb{X}_f$. 

As shown in Rakovic (2012) and Rakovic et al. 
(2012), the RMPC control laws, based on the use 
of the SSF control policy, can be implemented 
online by solving a standard convex optimization 
problem whose complexity (in terms of the 
numbers of decision variables and affine inequalities 
and equalities) is $O(N^2)$. The corresponding 
RMPC synthesis ensures directly that the terminal 
constraint set $\mathbb{X}_f$ is robustly exponentially 
stable, and it also induces indirectly the robust 
exponential stability of the generalized origin 
$X_{\mathcal{O}}$. 

The previously dominant control policy parameterizations 
include time-invariant affine state 
feedback (TIASF), time-varying affine state feedback 
(TVASF), and affine in the past disturbances 
feedback (APDF) control policies. All of 
these parameterizations are subsumed by the SSF 
control policy, as all of them induce additional 
structural restrictions on the parameterizations 
of the predicted state and control actions specified 
in (27) and on the associated dynamics 
given by (28), (29) and (30). In particular, the 
TIASF control policy (Chisci et al. 2001; Gossner 
et al. 1997) imposes structural restrictions that, 
for each relevant $k$, 

$$u_{(j,k)} = K x_{(j,k)} \quad\text{for } j = 1, 2, \ldots, k, \tag{30}$$

where $K$ is the local control matrix of (26). The 
TVASF control policy (Lofberg 2003) induces 
less restrictive requirements that, for each relevant 
$k$, 

$$u_{(j,k)} = K_{(j,k)} x_{(j,k)} \quad\text{for } j = 1, 2, \ldots, k, \tag{31}$$

where the matrices $K_{(j,k)} \in \mathbb{R}^{m \times n}$ are part of 
the decision variable. The APDF control policy 
(Goulart et al. 2006; Lofberg 2003) is an algebraic 
reparameterization of the TVASF control 
policy, which requires the conditions that, for 
each relevant $k$, 

$$u_{(j,k)} = M_{(j,k)} x_{(j,j)} \quad\text{for } j = 1, 2, \ldots, k, \tag{32}$$

where the matrices $M_{(j,k)} \in \mathbb{R}^{m \times n}$ are part of 
the decision variable. A comprehensive trade-off 
analysis between the quality of guaranteed 
structural properties and the associated computational 
complexity and a theoretically meaningful 
ranking of the existing RMPC syntheses in 
the linear-polytopic setting is reported in the 
recent plenary paper (Rakovic 2012). Therein, it 
is demonstrated that the dominant approach is the 
RMPC synthesis utilizing the SSF control policy 
(Rakovic 2012) (also known as the parameterized 
tube MPC (Rakovic et al. 2012)). 

Summary and Future Directions 

The exact RMPC synthesis has reached a remarkable 
degree of theoretical maturity in the 
general setting. The corresponding theoretical 
advances are, however, accompanied by impeding 
computational complexity. On the bright 
side, a number of rather sophisticated 
RMPC synthesis methods, which are both 
computationally efficient and theoretically sound, 
have been developed for the frequently encountered 
linear-polytopic case. 

Further advances in the RMPC field might 
be driven by the utilization of more structured 
types and models of the uncertainty. The challenge 
of devising a computationally efficient and 
theoretically sound RMPC synthesis might need 
to be tackled in several phases; the initial steps 
might focus on adequate RMPC synthesis for 
particular classes of nonlinear control systems. 
Finally, it would seem reasonable to expect that 
the lessons learned in the RMPC field might play 
an important role in the research developments 
in the fields of stochastic and adaptive MPC. 

Cross-References 

► Nominal Model-Predictive Control 

► Stochastic Model Predictive Control 


Recommended Reading 

The recent monograph (Rawlings and Mayne 
2009) provides an in-depth systematic exposure 
to the RMPC field and is also a rich source 
of relevant references. An invaluable overview 
of the theory and computations of the maximal 
and minimal robust positively invariant sets 
can be found in Artstein and Rakovic (2008), 
Kolmanovsky and Gilbert (1998), Rakovic et al. 
(2005), and Blanchini and Miani (2008). The 
important paper (Scokaert and Mayne 1998) 
points out the theoretical benefits of the use of the 
control policy, but it also indicates indirectly the 
computational impracticability of the associated 
feedback min-max RMPC. The early tube MPC 
synthesis (Mayne et al. 2005) is both computationally 
efficient and theoretically sound, and 
it represents an important step forward in the 
linear-polytopic setting. The so-called homothetic 
tube MPC synthesis (Rakovic et al. 2013) is 
a recent improvement of the first generation of 
the tube MPC synthesis (Mayne et al. 2005), and 
it has a high potential to effectively handle the 
parametric uncertainty of the matrix pair $(A, B)$. 
The current state of the art in the linear-polytopic 
setting is reached by the RMPC synthesis using 
the SSF control policy (Rakovic 2012; Rakovic 
et al. 2012). The output feedback RMPC synthesis 
in the linear-polytopic setting can be handled 
with direct extensions of the tube MPC syntheses 
(Mayne et al. 2009). 

Abstract 

This entry provides a brief summary of the synthesis 
and analysis tools that have been developed 
by the robust control community. Many software 
tools have been developed to implement the major 
theoretical techniques in robust control. These 
software tools have enabled robust synthesis and 
analysis techniques to be successfully applied to 
numerous industrial applications. 

Introduction 

Robust control is a methodology to address 
the effect of uncertainty on feedback systems. 
This approach includes techniques and tools to 
model system uncertainty, assess stability and/or 
performance characteristics of the uncertain 
system, and synthesize controllers for uncertain 
systems. The theory was developed over a 
number of years. The foundational results can 
be found in classical papers Packard and Doyle 
(1993a), Desoer et al. (1980), Doyle (1978, 
1982), Doyle et al. (1989), Doyle and Stein 
(1981), Megretski and Rantzer (1997), Safonov 
(1982), Willems (1971), and Zames (1981) 
and more recent textbooks Boyd et al. (1994), 
Desoer and Vidyasagar (2008), Dullerud and 
Paganini (2000), Francis (1987), Skogestad and 
Postlethwaite (2005), Vidyasagar (1985), and 
Zhou et al. (1996). It should be emphasized that 
this entry is not meant to be a survey and more 
complete references to the literature can be found 
in the cited textbooks. The remainder of this entry 
discusses the main theoretical and computational 
tools for robust synthesis and robustness analysis. 

Notation 

$\mathbb{R}$ and $\mathbb{C}$ denote the sets of real and complex 
numbers, respectively. $\mathbb{R}^{m \times n}$ and $\mathbb{C}^{m \times n}$ denote 
the sets of $m \times n$ matrices whose elements are in 
$\mathbb{R}$ and $\mathbb{C}$, respectively. A single superscript index 
is used for vectors, e.g., $\mathbb{R}^n$ denotes the set of 
$n \times 1$ vectors whose elements are in $\mathbb{R}$. For a 
matrix $M \in \mathbb{C}^{m \times n}$, $M^T$ denotes the transpose 
and $M^*$ denotes the complex conjugate transpose. 
A matrix $M$ is Hermitian (skew-Hermitian) 
if $M = M^*$ ($M = -M^*$). The maximum 
singular value of a matrix $M$ is denoted by $\bar{\sigma}(M)$. 
The trace of a matrix $M$, denoted $\operatorname{tr}[M]$, is the 
sum of the diagonal elements. $M = M^*$ is a 
positive semidefinite matrix, denoted $M \succeq 0$, 
if all eigenvalues are nonnegative. $M = M^*$ 
is negative semidefinite, denoted $M \preceq 0$, if 
$-M \succeq 0$. $\mathcal{L}_2[0, \infty)$ is the space of functions 
$u : [0, \infty) \to \mathbb{R}^n$ satisfying $\|u\| < \infty$, where 
$\|u\| := \left[\int_0^\infty u(t)^T u(t)\, dt\right]^{0.5}$. For $u \in \mathcal{L}_2[0, \infty)$, 
$u_T$ denotes the truncated function $u_T(t) = u(t)$ 
for $t \le T$ and $u_T(t) = 0$ otherwise. The extended 
space, denoted $\mathcal{L}_{2e}$, is the set of functions $u$ such 
that $u_T \in \mathcal{L}_2$ for all $T > 0$. The Fourier transform 
$\hat{v} := \mathcal{F}(v)$ maps the time domain signal $v \in 
\mathcal{L}_2[0, \infty)$ to the frequency domain by 

$$\hat{v}(j\omega) := \int_0^\infty e^{-j\omega t} v(t)\, dt. \tag{1}$$

Capital letters are used to represent dynamical 
systems. For linear systems, the same letter is 
used to represent the system, its convolution 
kernel, as well as its frequency-response function. 
Lowercase letters denote time-signals, and 
$\omega$ represents the continuous-time frequency 
variable. For an $m \times n$ system $G$, define the 
$H_\infty$ and $H_2$ norms as $\|G\|_\infty = \sup_\omega \bar{\sigma}(G(j\omega))$ 
and $\|G\|_2 = \sqrt{\frac{1}{2\pi} \int_{-\infty}^{\infty} \operatorname{tr}[G(j\omega)^* G(j\omega)]\, d\omega}$. 
The $\mathcal{L}_1$ norm of $G$ is defined as $\|G\|_1 = 
\max_{1 \le i \le m} \sum_{j=1}^{n} \int_0^\infty |g_{ij}(t)|\, dt$, where $g_{ij}(t)$ is 
the response of the $i$th output due to a unit 
impulse in the $j$th input. The entry describes 
continuous-time systems. Most results carry over, 
in a similar form, to discrete-time systems. 
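
As a rough numerical illustration of the three system norms defined above, the sketch below evaluates them for an assumed single-input, single-output example G(s) = 1/(s + 1), whose impulse response is exp(-t). For a scalar system the maximum singular value and the trace reduce to |G(jw)| and |G(jw)|^2. The frequency grid, the substitution w = tan(theta) used for the infinite integral, and the truncation time are choices made here for illustration; gridding only approximates the supremum.

```python
import math
from typing import Callable, Iterable


def G(s: complex) -> complex:
    """Assumed example system G(s) = 1 / (s + 1)."""
    return 1.0 / (s + 1.0)


def g(t: float) -> float:
    """Impulse response of the example system."""
    return math.exp(-t)


def hinf_norm(G: Callable[[complex], complex], omegas: Iterable[float]) -> float:
    """Grid approximation of sup over w of |G(jw)| (SISO case)."""
    return max(abs(G(1j * w)) for w in omegas)


def h2_norm(G: Callable[[complex], complex], n: int = 20000) -> float:
    """sqrt((1/(2 pi)) * integral of |G(jw)|^2 dw), via w = tan(theta), midpoint rule."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi / 2 + (i + 0.5) * h
        w = math.tan(theta)
        total += abs(G(1j * w)) ** 2 / math.cos(theta) ** 2 * h
    return math.sqrt(total / (2 * math.pi))


def l1_norm(g: Callable[[float], float], T: float = 50.0, n: int = 20000) -> float:
    """Integral of |g(t)| over [0, T] by the midpoint rule (truncated at T)."""
    h = T / n
    return sum(abs(g((i + 0.5) * h)) for i in range(n)) * h
```

For this example the three values can be compared with the closed-form results: the peak gain 1 at zero frequency, the H2 norm sqrt(1/2), and the L1 norm 1.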


Theoretical Tools 

Uncertainty Modeling 

In order to analyze and/or design for the degrading 
effects of uncertainty, it is imperative 
that explicit models of uncertainty be characterized. 
Two distinct forms of uncertainty are 
considered: signal uncertainty and model uncertainty. 
Signal uncertainty represents external signals 
(plant disturbances, sensor noise, reference 
signals) as sets of time functions, with explicit 
descriptions. For example, a particular reference 
input might be characterized as belonging to the 
set $\{W_d d : d \in \mathcal{L}_2, \|d\|_2 \le 1\}$. This set is often 
referred to as a weighted ball in $\mathcal{L}_2$. The transfer 
function $W_d$ is called a weighting function and 
it shapes the normalized signals $d$, in a manner 
that its output represents the actual traits of the 
reference inputs that occur in practice. 

Model uncertainty represents unknown or partially 
specified gains (more generally, operators) 
that relate pairs of signals in the model. For example, 
$z$ and $w$ are signals within a model and are 
related by an operator $\mathcal{N}$ as $w = \mathcal{N}(z)$. Typical 
partial specifications either constrain $\mathcal{N}$ to be 
drawn from a specified set or describe the set of 
signals $(w, z)$ that $\mathcal{N}$ allows. An uncertain parameter 
$\delta$ is modeled as time-invariant (i.e., constant), 
belonging to the interval $[a, b]$ and relating 
$z$ and $w$ as $w(t) = \delta z(t)$. An uncertain linear 
dynamic element $\Delta$ is modeled as a linear, time-invariant, 
causal system, described by a convolution 
kernel $\delta$ whose frequency-response function 
(i.e., Fourier transform) satisfies 
$\max_\omega \bar{\sigma}(\hat{\delta}(j\omega)) \le 1$, 
and relating $z$ and $w$ as $w = \delta \star z$. More 
generally, consider an $\mathcal{L}_2$ bounded, causal operator 
$\Delta$, mapping $\mathcal{L}_{2,e} \to \mathcal{L}_{2,e}$ and relating the signals as 
$w = \Delta(z)$. The behavior of $\Delta$ is unknown but 
constrained by a family of multipliers, $\{\Pi_\alpha\}_{\alpha \in \mathcal{A}}$. 
Specifically, each $\Pi_\alpha$ is a Hermitian, matrix-valued 
function of frequency, and for any $z \in \mathcal{L}_2$, 
the mapping $\Delta$ is known to satisfy 

$$\int_{-\infty}^{\infty} \begin{bmatrix} \hat{z}(j\omega) \\ \hat{w}(j\omega) \end{bmatrix}^* \Pi_\alpha(\omega) \begin{bmatrix} \hat{z}(j\omega) \\ \hat{w}(j\omega) \end{bmatrix} d\omega \ge 0.$$

This is called an integral quadratic constraint 
(IQC) description of $\Delta$, as the input/output pairs 
of $\Delta$ satisfy a family of quadratic, integral constraints. 
These different descriptions of model 
uncertainty are related. For example, if $w(t) = 
\delta z(t)$, with $w(t) \in \mathbb{R}^n$ and $z(t) \in \mathbb{R}^n$, and 
$\delta \in \mathbb{R}$, $|\delta| \le 1$, then for any Hermitian-valued 
$X : \mathbb{R} \to \mathbb{C}^{n \times n}$ with $X(\omega) \succeq 0$ for all $\omega \in \mathbb{R}$ and 
skew-Hermitian $Y : \mathbb{R} \to \mathbb{C}^{n \times n}$, 

$$\int_{-\infty}^{\infty} \begin{bmatrix} \hat{z}(j\omega) \\ \hat{w}(j\omega) \end{bmatrix}^* \begin{bmatrix} X(\omega) & Y(\omega) \\ Y^*(\omega) & -X(\omega) \end{bmatrix} \begin{bmatrix} \hat{z}(j\omega) \\ \hat{w}(j\omega) \end{bmatrix} d\omega = \int_{-\infty}^{\infty} \hat{z}^*(j\omega) \left[(1 - \delta^2) X(\omega)\right] \hat{z}(j\omega)\, d\omega,$$

which is always $\ge 0$. Hence, the uncertain parameter 
can be recast as an operator satisfying an 
infinite family of IQCs. Nonlinear operators may 
also satisfy IQCs and it is common to "model" 
known nonlinear elements (e.g., saturation) by 
enumerating IQCs that they satisfy (Megretski 
and Rantzer 1997). An uncertain dynamic model 
is made up of an interconnection of these uncertain 
elements with a known (usually) linear 
system $G$. 
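
The identity behind the IQC for an uncertain real parameter can be checked pointwise in frequency. The sketch below assumes scalar signals (n = 1), so that X is a nonnegative real number and the skew-Hermitian Y is purely imaginary; with w = delta z the cross terms cancel and the quadratic form reduces to (1 - delta^2) X |z|^2. The function name and the sample values are illustrative assumptions.

```python
def iqc_integrand(z: complex, w: complex, X: float, Y: complex) -> complex:
    """Quadratic form [z; w]^* [[X, Y], [Y^*, -X]] [z; w] for scalar signals."""
    return (z.conjugate() * (X * z + Y * w)
            + w.conjugate() * (Y.conjugate() * z - X * w))


def check_parameter_iqc(z: complex, delta: float, X: float, y: float) -> float:
    """Difference between the quadratic form and (1 - delta^2) X |z|^2 when w = delta z."""
    Y = 1j * y                       # a scalar skew-Hermitian Y is purely imaginary
    value = iqc_integrand(z, delta * z, X, Y)
    return abs(value - (1.0 - delta ** 2) * X * abs(z) ** 2)
```

Evaluating `check_parameter_iqc` for any complex z, any real delta, any X >= 0 and any real y should return a value at rounding-error level; nonnegativity of the integrand then follows whenever |delta| <= 1.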

Performance Metric 

The main goal of robustness analysis is to assess 
the degrading effects of uncertainty. For this, a 
concrete notion of performance is needed, resulting 
in a mathematical/computational exercise 
to quantify the average or worst-case effects of 
the two types of uncertainty, signal and model, 
described earlier. In the robust control framework, 
adequate performance is characterized in 
terms of the variability of possible behavior of 
particular signals. For instance, in the presence 
of reference inputs and disturbance inputs, as 
well as parameter uncertainty, it is required that 
tracking errors ($e$) and control inputs ($u$) remain 
small. A common measure of smallness 
is the $\mathcal{L}_2$ norm of signals. Typically, frequency-dependent 
weighting functions are used to preferentially 
weight one frequency range over another 
and/or to weight one signal relative to another. 
In this way, adequate performance can be defined 
as $\left\| W \begin{bmatrix} e \\ u \end{bmatrix} \right\|_2 \le 1$, where $W$ is a stable, linear 
system, called the "output" weighting function. 
Weighting functions are often used to transform 
a collection of performance objectives into a 
single norm bound objective in the robust control 
framework. 

Robustness Analysis 

Robustness analysis refers to the task of
ascertaining the stability and/or performance
characteristics of the uncertain system, given the
limited knowledge about the uncertain
information. The main result from Megretski and Rantzer
(1997) concerns the stability of the
interconnection shown in Fig. 1, where $G$ is a known, stable,
linear system and $\Delta$ is an operator that satisfies
the IQC defined by $\Pi$. Under some important
technical conditions, the theorem states “if there
exists an $\epsilon > 0$ such that


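$$
\begin{bmatrix} G(j\omega) \\ I \end{bmatrix}^{*}
\Pi(j\omega)
\begin{bmatrix} G(j\omega) \\ I \end{bmatrix}
\leq -\epsilon I \qquad (2)
$$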


for all $\omega \in \mathbb{R}$, then the interconnection is stable.”
Stability here refers to finite $\mathcal{L}_2$ gain from inputs
($r_1$, $r_2$) to loop signals ($u_1$, $u_2$).
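
The frequency-domain condition (2) can be sampled on a frequency grid. The following is a minimal sketch, assuming a hypothetical plant $G(s) = 0.5/(s+1)$ and the small-gain multiplier $\Pi = \mathrm{diag}(1, -1)$, which is satisfied by any $\Delta$ whose $\mathcal{L}_2$ gain is at most one; neither choice comes from the entry. A finite grid gives only a necessary check, not a proof that the inequality holds at every frequency.

```python
import numpy as np

def G(s):
    """Hypothetical stable SISO plant G(s) = 0.5 / (s + 1)."""
    return np.array([[0.5 / (s + 1.0)]])

# Small-gain multiplier (assumption): encodes ||v||^2 - ||w||^2 >= 0.
Pi = np.diag([1.0, -1.0])

def iqc_margin(omega):
    """Largest eigenvalue of [G; I]^* Pi [G; I] at frequency omega."""
    Gjw = G(1j * omega)
    stacked = np.vstack([Gjw, np.eye(Gjw.shape[1])])
    M = stacked.conj().T @ Pi @ stacked
    return np.linalg.eigvalsh(M).max()

# Frequency grid including omega = 0.
omegas = np.concatenate([[0.0], np.logspace(-3, 3, 400)])
worst = max(iqc_margin(w) for w in omegas)

# Condition (2) holds on the grid for any epsilon up to -worst.
eps = -worst
print(worst, eps)
```

For this multiplier the test reduces to $|G(j\omega)|^{2} - 1 \leq -\epsilon$, which is the classical small-gain condition.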

Multiple IQCs satisfied by $\Delta$ can be
incorporated into the analysis. In particular, assume that
$\Delta$ satisfies the IQCs defined by the multipliers
$\{\Pi_k\}_{k=1}^{N}$. Then $\Delta$ satisfies the IQC defined by
any multiplier of the form $\Pi_\alpha := \sum_{k=1}^{N} \alpha_k \Pi_k$
where $\alpha_k \geq 0$. The stability test amounts to
a semi-infinite, semidefinite feasibility problem:
find nonnegative scalars $\{\alpha_k\}_{k=1}^{N}$ such that for
some $\epsilon > 0$,


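$$
\begin{bmatrix} G(j\omega) \\ I \end{bmatrix}^{*}
\Pi_\alpha(j\omega)
\begin{bmatrix} G(j\omega) \\ I \end{bmatrix}
\leq -\epsilon I \qquad (3)
$$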


for all $\omega \in \mathbb{R}$. This infinite family of matrix
inequalities (one for each frequency) can be
equivalently expressed as a finite-dimensional linear
matrix inequality (LMI) under some additional
restrictions.

The structured singular value ($\mu$) approach
provides an alternative robust stability test in
the case of only linear, time-invariant uncertainty
(parametric or dynamic). Suppose $\Delta$ is drawn
from a set of matrices $\boldsymbol{\Delta} \subset \mathbb{C}^{m \times n}$ of the form

$$
\boldsymbol{\Delta} = \left\{ \mathrm{diag}\left[ \delta_1^{r} I_{t_1}, \ldots, \delta_V^{r} I_{t_V},
\delta_1^{c} I_{r_1}, \ldots, \delta_S^{c} I_{r_S}, \Delta_1, \ldots, \Delta_F \right] :
\delta_k^{r} \in \mathbb{R},\ \delta_i^{c} \in \mathbb{C},\ \Delta_j \in \mathbb{C}^{m_j \times n_j} \right\}
$$

The inclusion of complex-valued, uncertain
matrices within $\boldsymbol{\Delta}$ may seem unusual and hard to
motivate. However, in terms of their effect on
stability, these are equivalent to the uncertain
linear dynamic element introduced earlier in the
Uncertainty Modeling section. This is discussed
in more detail in the entry ► Structured Singular
Value and Applications: Analyzing the Effect
of Linear Time-Invariant Uncertainty in Linear
Systems.

Using the Nyquist stability criterion, the
$(G, \Delta)$ interconnection is stable for all $\Delta \in \boldsymbol{\Delta}$
with $\bar{\sigma}(\Delta) < \beta$ if and only if $G$ is stable, and

$$
\det\left( I - G(j\omega) \Delta \right) \neq 0
$$

for all $\Delta \in \boldsymbol{\Delta}$ with $\bar{\sigma}(\Delta) < \beta$ and all $\omega \in \mathbb{R}$
including $\omega = \infty$. The importance of the
nonvanishing determinant condition warrants a
definition of its own, the structured singular value. For
a matrix $M \in \mathbb{C}^{n \times m}$, and $\boldsymbol{\Delta}$ as given, define


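$$
\mu_{\boldsymbol{\Delta}}(M) := \frac{1}{\min \left\{ \bar{\sigma}(\Delta) : \Delta \in \boldsymbol{\Delta},\ \det(I - M\Delta) = 0 \right\}}
$$

unless no $\Delta \in \boldsymbol{\Delta}$ makes $(I - M\Delta)$ singular, in which
case $\mu_{\boldsymbol{\Delta}}(M) := 0$. In this parlance, the $(G, \Delta)$
interconnection is stable for all $\Delta \in \boldsymbol{\Delta}$ with
$\bar{\sigma}(\Delta) < \beta$ if and only if

$$
\mu_{\boldsymbol{\Delta}}(G(j\omega)) \leq \frac{1}{\beta}
$$

for all $\omega \in \mathbb{R}$ including $\omega = \infty$.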
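
For two simple block structures the structured singular value has a closed form, which makes the definition easy to test. The sketch below relies on two standard facts that are not stated in the entry itself: for a single full complex block $\mu$ equals the largest singular value of $M$, and for a single repeated complex scalar block it equals the spectral radius. The matrix $M$ is random, hypothetical data.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

# Case 1: one full complex block. Then mu(M) = sigma_max(M).
U, s, Vh = np.linalg.svd(M)
mu_full = s[0]
u1 = U[:, 0]
v1 = Vh[0, :].conj()
# Smallest singularizing perturbation: Delta = v1 u1^* / sigma_1.
Delta_full = np.outer(v1, u1.conj()) / s[0]
det_full = np.linalg.det(np.eye(3) - M @ Delta_full)

# Case 2: one repeated complex scalar, Delta = delta * I.
# Then mu(M) = spectral radius of M.
eigvals = np.linalg.eigvals(M)
lam = eigvals[np.argmax(np.abs(eigvals))]
mu_scalar = abs(lam)
Delta_scalar = (1.0 / lam) * np.eye(3)
det_scalar = np.linalg.det(np.eye(3) - M @ Delta_scalar)

print(mu_full, abs(det_full))
print(mu_scalar, abs(det_scalar))
```

In both cases the constructed $\Delta$ has norm $1/\mu$ and makes $I - M\Delta$ singular, and the more restrictive scalar structure gives the smaller value of $\mu$.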

In summary, the structured singular value
approach employs a Nyquist-based argument,
resulting in a nonvanishing determinant condition,
which must hold over all frequencies and all
possible frequency-response values of the uncertain
elements. However, checking the nonvanishing
determinant is difficult, and sufficient conditions
in the form of semidefinite programs (Doyle
1982; Fan et al. 1991) to ensure this are derived.
This results in semidefinite feasibility problems
which must hold at all frequencies. It is common
to verify these only on a finite grid of frequencies,
which is equivalent to ensuring that the
closed-loop poles cannot migrate across the stability
boundary at these frequencies. Semidefinite
programs can be defined which carve out
intervals around these fixed frequencies to completely
guarantee stability.
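
One widely used sufficient condition of this kind is the scaled upper bound $\mu_{\boldsymbol{\Delta}}(M) \leq \inf_D \bar{\sigma}(D M D^{-1})$, where $D$ commutes with the uncertainty structure. The sketch below is an illustration under stated assumptions rather than the algorithm of the cited references: it takes two scalar complex uncertainty blocks, a hypothetical $2 \times 2$ matrix $M$, and replaces the semidefinite program by a plain grid search over a single positive scaling $d$.

```python
import numpy as np

# Hypothetical, badly scaled matrix seen by Delta = diag(delta_1, delta_2).
M = np.array([[1.0, 10.0],
              [0.01, 1.0]], dtype=complex)

def scaled_norm(d):
    """sigma_max(D M D^{-1}) for D = diag(d, 1) with d > 0."""
    D = np.diag([d, 1.0])
    Dinv = np.diag([1.0 / d, 1.0])
    return np.linalg.norm(D @ M @ Dinv, 2)

# Grid search over the scaling (stand-in for a convex optimization).
ds = np.logspace(-4, 4, 2001)
upper = min(scaled_norm(d) for d in ds)

rho = max(abs(np.linalg.eigvals(M)))  # lower bound on mu
sigma = np.linalg.norm(M, 2)          # unscaled upper bound

print(rho, upper, sigma)
```

For this example the scaled bound closes the gap to the spectral-radius lower bound, while the unscaled singular value is very conservative.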


Robust Synthesis 

Synthesis refers to the mathematical design of the
control law. The nominal synthesis problem (with
no uncertainty) is formulated using the generic
feedback structure shown in Fig. 2. The various
signals in the diagram are the control inputs $u$,
measurements $y$, exogenous disturbances $d$, and
regulated variables $e$. $P$ is a generalized plant that
contains all information required to specify the
synthesis problem. This includes the dynamics
of the actual plant being controlled as well as
any frequency domain weights that are used to
specify the performance objective. The objective
of an optimal control problem is to synthesize
a controller $K$ that minimizes the closed-loop
(e.g., $\mathcal{H}_2$, $\mathcal{H}_\infty$, $\mathcal{L}_1$) norm from disturbances ($d$)
to regulated variables ($e$), i.e., solve


$$
\min_{\text{allowable } K} \left\| F_L(P, K) \right\|
$$


where $F_L(P, K)$ denotes the system obtained by
closing the controller $K$ around the lower loop
of $P$. The $\mathcal{H}_2$, $\mathcal{H}_\infty$, and $\mathcal{L}_1$ optimal control
problems refer to the choice of the specific norm
$\| F_L(P, K) \|$ used to specify the performance. A
generalization of the $\mathcal{H}_\infty$ performance objective
is simply to require that the closed-loop map
from $d \to e$ satisfy an IQC defined by a given
multiplier $\Pi$, called the performance multiplier
(Apkarian and Noll 2006). The $\mathcal{H}_2$, $\mathcal{H}_\infty$, and $\mathcal{L}_1$
optimal control problems formulated as in Fig. 2
only involve signal uncertainty. In other words,
these design problems do not explicitly account
for the effects of model uncertainty.
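
Closing the lower loop corresponds to the lower linear fractional transformation $F_L(P, K) = P_{11} + P_{12} K (I - P_{22} K)^{-1} P_{21}$. The sketch below evaluates it for constant matrices, which can be read as frequency-response values at a single frequency; the partition sizes and the numerical values of `P` and `K` are hypothetical.

```python
import numpy as np

def lower_lft(P, K, n_e, n_d):
    """F_L(P, K) = P11 + P12 K (I - P22 K)^{-1} P21 for constant matrices.

    P maps [d; u] to [e; y]; n_e and n_d are the sizes of e and d.
    """
    P11, P12 = P[:n_e, :n_d], P[:n_e, n_d:]
    P21, P22 = P[n_e:, :n_d], P[n_e:, n_d:]
    n_y = P22.shape[0]
    return P11 + P12 @ K @ np.linalg.solve(np.eye(n_y) - P22 @ K, P21)

# Scalar example: e = d + 2u, y = 3d + 4u, u = 0.5 y.
P = np.array([[1.0, 2.0],
              [3.0, 4.0]])
K = np.array([[0.5]])
F = lower_lft(P, K, n_e=1, n_d=1)
print(F)  # closed-loop gain from d to e
```

Eliminating $u$ and $y$ by hand gives $y = -3d$ and $e = -2d$, which matches the computed value.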
Robust Synthesis and Robustness Analysis
Techniques and Tools, Fig. 2 Feedback interconnection for
$\mathcal{H}_2$, $\mathcal{H}_\infty$, and $\mathcal{L}_1$ optimal control

Robust synthesis refers to control design that
explicitly accounts for model uncertainty. It is
usually formulated as a worst-case optimization,
where the controller is chosen to minimize the
worst-case effect of the signal and model
uncertainty, loosely

$$
\min_{\text{allowable } K} \ \max_{\text{allowable } d,\, \Delta} \left\| T(d, \Delta, K) \right\|
$$

where $d$ is a set of exogenous disturbances
and $\Delta$ corresponds to the model uncertainty
set. $T$ represents the closed-loop relationship
between $d$, $\Delta$, and the controller $K$. $\mu$-synthesis
is a specific technique developed to synthesize
control algorithms which achieve robust
performance, i.e., performance in the presence
of signal and model uncertainty. The objective
of $\mu$-synthesis is to minimize, over all stabilizing
controllers $K$, the peak value of $\mu_{\boldsymbol{\Delta}}(F_L(P, K))$
of the closed-loop transfer function defined by
the interconnection in Fig. 3. $P$ is the generalized
plant model. The $\Delta$ block is the uncertain element
from the set $\boldsymbol{\Delta}$, which parameterizes all of the
assumed model uncertainty in the problem. The
$\mu$-synthesis optimization has high computational
complexity (it is an NP-hard problem), though
practical algorithms and software have been
developed to design controllers using this control
technique (Balas et al. 2013). Alternative robust
synthesis approaches exist and often involve
nonlinear optimization algorithms (Apkarian and
Noll 2006). Drastic simplifications regarding the
models and uncertainty can be made, resulting
in problems that can be solved using LMI and
semidefinite programming techniques (Boyd and
Barratt 1991; Boyd et al. 1994).

Robust Synthesis and Robustness Analysis
Techniques and Tools, Fig. 3 Feedback interconnection for
$\mu$ synthesis


Computational Tools 

The MATLAB Robust Control Toolbox is a
commercially available software product that
is part of the MathWorks control product line.
It is tightly integrated with Control System
Toolbox and Simulink products (Balas et al.
2013). The Robust Control Toolbox includes
tools to analyze and design multi-input,
multi-output control systems with uncertain elements.
The primary building blocks, called uncertain
elements or atoms, are uncertain real parameters
and uncertain linear, time-invariant objects.
These can be used to create coarse and simple
or detailed and complex descriptions of model
uncertainty. The uncertain object data structure
eliminates the need to generate models of
uncertainty and control analysis and design
problem formulations, thereby allowing the
practicing engineer to apply advanced robust
control theory to their applications. Functions are
available to analyze the robust stability, robust
performance, and worst-case performance of
uncertain multivariable system models using
the structured singular value, $\mu$. The Robust
Control Toolbox also includes multivariable
control synthesis tools to compute controllers that
optimize worst-case performance and identify
worst-case parameter values.

The IQC-Beta Toolbox is a publicly
available robust analysis toolbox based on the IQC
framework (Jonsson et al. 2004). A wide range
of robust stability and performance analysis tests
are available for uncertain, nonlinear, and
time-varying systems. IQC-Beta is written in
MATLAB and works seamlessly with the Control
System Toolbox objects and basic
interconnection functions. The user's manual nicely
complements the literature on IQCs. The Computer
Aided Control System Design package in Scilab,
an open source numerical computation software,
includes functionality for robustness analysis and
the synthesis of robust control algorithms for
multivariable systems (http://www.scilab.org/).


Conclusions 

Robust control analysis and synthesis software 
tools are widely available and have been 
extensively used by industry since the late 
1980s. The availability of software tools for 
robustness analysis and synthesis played a major 
role in their wide and ubiquitous adoption in 
industry. They have been successfully applied 
to a variety of applications including aircraft 
flight control, launch vehicles, satellites, compact 
disk players, disk drives, backhoe excavators, 
nuclear power plants, helicopters, thin film 
extrusion, gas- and diesel-powered engines, 
missile autopilots, heating and ventilation 
systems, process control, and active suspension 
systems. 


Cross-References 

► LMI Approach to Robust Control 

► Optimization Based Robust Control 

► Structured Singular Value and Applications: 
Analyzing the Effect of Linear Time-Invariant 
Uncertainty in Linear Systems 
Abstract

Robustness analysis is the process of checking
whether a system’s function is maintained despite
perturbations. Robustness analysis of biological
models is typically applied to differential
equation models of biochemical reaction networks.
While robustness is primarily a yes-or-no
question, for many applications in biological
models, it is also desired to compute a quantitative
robustness measure. Such a measure is usually
defined to be the maximum size of perturbations
that the system can still tolerate. In addition, it
is often of interest to specifically compute fragile
perturbations, i.e., perturbations for which the
system loses its function.

Keywords 

Biochemical reaction networks; Fragile
perturbations; Parametric uncertainty; Robustness
measure; Structural uncertainty

Introduction 

In biological systems analysis, robustness is the
property that a system maintains its function
in the face of internal or external perturbations
(Kitano 2007). For a robustness analysis, one
therefore needs to specify the system to be
analyzed, the function that should be maintained, and
the perturbation class.

The models to which robustness analysis is
applied are mostly differential equation models
of biochemical reaction networks. They are
generally written as

$$
\dot{x} = S v(x), \qquad (1)
$$

where $x \in \mathbb{R}^{n}$ is the vector of intracellular
concentrations; $S \in \mathbb{R}^{n \times m}$ is the stoichiometric
matrix, containing the information on how the
individual network components participate in the
reactions; and $v(x) \in \mathbb{R}^{m}$ is the reaction rate
vector, in most cases a nonlinear function of the
concentrations $x$.
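
The model structure (1) can be illustrated with a small example. The sketch below uses a hypothetical three-reaction chain (constant production of $X_1$, conversion of $X_1$ to $X_2$, and degradation of $X_2$) with made-up rate constants; it is not a network from the cited literature. A forward Euler loop stands in for a proper ODE solver.

```python
import numpy as np

# Hypothetical network: 0 -> X1, X1 -> X2, X2 -> 0.
# Rows of S: species X1, X2. Columns of S: the three reactions.
S = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
k = np.array([2.0, 1.0, 4.0])  # assumed rate constants

def v(x):
    """Reaction rate vector v(x) with mass-action kinetics."""
    return np.array([k[0], k[1] * x[0], k[2] * x[1]])

def xdot(x):
    """Right-hand side of equation (1): dx/dt = S v(x)."""
    return S @ v(x)

# Analytical steady state of this particular chain.
x_ss = np.array([k[0] / k[1], k[0] / k[2]])

# Simple forward Euler simulation from zero initial concentrations.
x = np.zeros(2)
dt = 0.001
for _ in range(20000):
    x = x + dt * xdot(x)

print(x, x_ss)  # the simulated state approaches the steady state
```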

The biological functions that are being studied
by robustness analysis are very broad,
pertaining to the wide range of biological functions
implemented by biochemical reaction networks.
Specific problems being considered are:

1. The occurrence of qualitative dynamical
patterns such as sustained oscillations or
multistability, where the system converges to one
of multiple stable steady states depending on
initial conditions or external stimuli (Eissing
et al. 2005; Ma and Iglesias 2002).

2. The steady-state concentration value for a
subset of the biochemical network’s components
(Shinar and Feinberg 2010; Steuer et al. 2011).

3. Quantitative measures derived from the
network’s dynamics, for example, the period of
sustained oscillations (Stelling et al. 2004).

For the perturbation classes, two approaches
can be distinguished. In parametric robustness
analysis, a parametrized biological model is
given, and the perturbation consists in varying
the values of the parameters away from their
nominal value. In structural robustness analysis,
perturbations to the interaction structure of the
network or the functional form of the reaction
rate functions $v(x)$ are considered. Robustness
analysis with these perturbation classes is
presented in more detail below.

The perturbation class is also relevant for
two applications of robustness analysis which
go beyond simply deciding whether a system
is robust or not. First, it is often of interest to
get a better quantification of robustness than a
binary decision. Then, it is common to define
a robustness measure, which usually quantifies
how large perturbations can be without affecting
the system’s function (Ma and Iglesias 2002;
Morohashi et al. 2002). Such a measure requires
an appropriate definition of the perturbation size.
With parametric perturbations, norms in
parameter space are often useful (Ma and Iglesias 2002;
Waldherr and Allgöwer 2011). With structural
perturbations, the proximity of interaction
functions in function space (Breindl et al. 2011) or the
number of changes in the interaction structure can
be evaluated.

Second, one often desires to compute
specific non-robust perturbations, i.e., perturbations
within the given class for which the system loses
the considered functionality. There is a close
relation between non-robust perturbations and the
robustness measure, in that the norm of the smallest
non-robust perturbation is equal to the robustness
measure. Yet, it is often easier to compute a
robustness measure than a non-robust perturbation.
In particular, algorithms that give a lower bound on
the robustness measure will usually not provide a
non-robust perturbation.

An illustration of the key characteristics in
robustness analysis is shown in Fig. 1. This
also illustrates that any norm-based robustness
measure depends on the nominal situation, where
no perturbation is present.

Fig. 1 Illustration of key characteristics in robustness analysis

When performing robustness analysis on a
mathematical model of the considered system,
the potential mismatch between model and
system has to be kept in mind. By comparing the
mathematical analysis results to experimental
observations, robustness analysis methods are also
useful for the validation or invalidation of
biological network models (Bates and Cosentino 2011).

Robustness Analysis with Parametric 
Perturbations 

Robustness analysis with parametric
perturbations is applied to parametrized differential
equation models of biochemical reaction networks,
which are described by an equation of the form

$$
\dot{x} = S v(x, \mu), \qquad (2)
$$

where $\mu$ is a real vector of parameters. Such
parameters may, for example, represent the total
expression level of proteins involved in the
reaction network, where usually a large variability
due to the stochastic process of gene expression
occurs.

This entry focuses on two specific system
functionalities for robustness with respect
to parametric perturbations: the qualitative
dynamical behavior and the steady-state level of
a subset of the network’s components. These are
particularly relevant for biological models: the
dynamical behavior often represents qualitative
biological regulatory mechanisms, whereas the
steady-state level of network components with
a downstream regulatory effect is important for
the stimulus-response relation of a biological
network.

The Qualitative Dynamical Behavior 

Considering the qualitative dynamical behavior,
it is of interest to distinguish situations of a
globally stable equilibrium point, multiple locally
stable equilibrium points, or sustained
oscillations due to a limit cycle or more complex
attractors. Since changes in these dynamical patterns
correspond to the occurrence of bifurcations in
the model dynamics (2), this type of robustness
analysis is closely related to bifurcation analysis.
In the case of a scalar, positive parameter $\mu$ with
nominal value $\mu_0$, a corresponding robustness
measure DOR has been defined by Ma and Iglesias (2002) as

$$
\mathrm{DOR} = 1 - \max \left\{ \frac{\underline{\mu}}{\mu_0}, \frac{\mu_0}{\overline{\mu}} \right\}, \qquad (3)
$$

where $\underline{\mu}$ and $\overline{\mu}$ are the closest bifurcation points
smaller and larger than $\mu_0$, respectively. The
robustness measure DOR is between 0 and 1 and
indicates how much the parameter can be varied
before reaching a bifurcation: for any
multiplicative perturbation of less than $(1 - \mathrm{DOR})^{-1}$,
no bifurcation will occur. A generalization to
multiparametric models has been proposed in
Waldherr and Allgöwer (2011): their robustness
measure $\varrho$ is defined as

$$
\varrho = \sup \left\{ \rho > 1 \mid \text{no bifurcation occurs in the hyperrectangle } [\rho^{-1} \mu_0, \rho \mu_0] \right\}. \qquad (4)
$$

The measure $\varrho$ directly gives the multiplicative
parameter variation up to which no bifurcation
occurs.
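
For a scalar parameter, the measure in (3) is a one-line computation once the neighboring bifurcation points are known. The sketch below assumes those points have already been obtained from a separate bifurcation analysis; the function name and the numerical values are hypothetical.

```python
import math

def degree_of_robustness(mu0, mu_lower, mu_upper):
    """DOR = 1 - max(mu_lower / mu0, mu0 / mu_upper), as in equation (3).

    mu_lower and mu_upper are the closest bifurcation points below and
    above the nominal value mu0. Use 0.0 or math.inf if none exists.
    """
    if not (0.0 <= mu_lower < mu0 < mu_upper):
        raise ValueError("expected 0 <= mu_lower < mu0 < mu_upper")
    return 1.0 - max(mu_lower / mu0, mu0 / mu_upper)

# Nominal value 1.0 with bifurcations at 0.5 and 4.0 (made-up numbers).
dor = degree_of_robustness(1.0, 0.5, 4.0)
factor = 1.0 / (1.0 - dor)  # tolerated multiplicative perturbation

# No bifurcation on either side: the measure reaches its maximum of 1.
dor_max = degree_of_robustness(1.0, 0.0, math.inf)
print(dor, factor, dor_max)
```

In the first case the parameter can be scaled by any factor between 1/2 and 2 without reaching a bifurcation, in line with the interpretation of $(1 - \mathrm{DOR})^{-1}$ given above.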

In general, the information required for a
bifurcation-based robustness measure will only
be available from a complete bifurcation analysis
of the model. When restricting the types of
bifurcations that are considered to bifurcations
of equilibrium points, one can however check
robustness by studying linear approximations
at the system’s equilibrium points. Since the
reaction rates $v(x, \mu)$ are usually modeled as
polynomial or rational functions, polynomial
programming methods can be applied to compute
a robustness measure (Waldherr and Allgöwer
2011) in this case.

The Steady-State Output Concentration 

In biochemical network analysis, mostly linear
outputs of the form

$$
y = Cx, \qquad (5)
$$

with $C \in \mathbb{R}^{q \times n}$ are considered. A common
special case is that the rows of $C$ are a subset of
the rows of the identity matrix in $\mathbb{R}^{n \times n}$, i.e.,

$$
C = \left( e_i^{\mathsf{T}} \right)_{i \in \mathcal{I}_y}, \qquad (6)
$$

where $e_i$ is the $i$-th unit vector and
$\mathcal{I}_y \subseteq \{1, 2, \ldots, n\}$ is the index set defining
the output concentrations.

A biochemical network has a robust
steady-state output concentration if the steady-state
output $y$ is independent of the parameters $\mu$ (Steuer
et al. 2011). For a steady-state map $y = h(\mu)$,
this corresponds to the condition

$$
h'(\mu) = 0. \qquad (7)
$$

For the special case of an output given
by (6), a necessary and sufficient condition
for steady-state output robustness has been
discovered by Steuer et al. (2011). The
condition amounts to checking that a vector
P, which describes the perturbation of the
reaction rates under parameter variations, is
in a subspace X = im M + ker S diag(a)
for any a in the kernel of S, where M is a
matrix composed of the normalized derivatives
of the reaction rates with respect to the
concentrations which do not appear in the output.
A notable underlying assumption here is that
the network’s steady state does not undergo
any local bifurcations within the considered
parameter region, which directly relates back to
the robustness analysis discussed in the previous
section.

For the special case where parameters are the
concentrations of conserved chemical species, a
sufficient condition for steady-state output
robustness has also been discovered by Shinar and
Feinberg (2010). They propose the term absolute
concentration robustness for this property. Here,
the assumption that no local bifurcations occur
within the considered parameter region is not
required a priori but rather is also a consequence
of the proposed condition.

Robustness Analysis with Structural
Perturbations

Robustness analysis with parametric
perturbations is based on the assumption that the
reaction rate expressions are exact and that all
perturbations are captured by parameter
variations. This assumption can hardly be justified
for many practical models, and an analysis with
structural perturbations becomes necessary. Such
analyses have discovered models which are very
robust against parametric perturbations but
non-robust against structural perturbations (Jacobsen
and Cedersund 2008).

The biological functions for which rigorous
results on structural robustness are available are
again related to the nonoccurrence of bifurcations
in the model. For the restriction to local
bifurcations of equilibria, linear systems theory offers
efficient analysis tools for structural robustness.

As a first approach, a structural perturbation of
the network’s interaction graph was suggested
(Jacobsen and Cedersund 2008). This approach
considers the network’s Jacobian

$$
A = S \frac{\partial v}{\partial x}(\bar{x}) \qquad (8)
$$

evaluated at a steady state $\bar{x}$. The Jacobian is then
perturbed to

$$
\tilde{A} = \mathrm{diag}\, A + (A - \mathrm{diag}\, A)(I + \Delta), \qquad (9)
$$

where $\mathrm{diag}\, A$ is the diagonal of $A$ and $\Delta$ is a
perturbation matrix, containing uncertain
time-invariant linear systems as elements.

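As an alternative approach, Waldherr et al.
(2009) have suggested a structural perturbation
of the reaction rate expressions. Thereby, the
network’s Jacobian is perturbed to

$$
\tilde{A} = S \left( J(\bar{x}) + \Delta \right), \qquad (10)
$$

where $J = \partial v / \partial x$ is the Jacobian of the reaction
rates, so that $\Delta = 0$ recovers (8).
In the case of real $\Delta$, this perturbation simply
corresponds to a change in the reaction rate slopes
at steady state.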
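
The effect of a real perturbation of the reaction rate slopes on local stability can be checked directly from the eigenvalues of the perturbed Jacobian $S(J + \Delta)$. The sketch below uses a hypothetical two-species chain with made-up rate constants and a perturbation that adds a positive dependence of the production rate on the second species; it is a plain eigenvalue test, not the structured singular value analysis itself.

```python
import numpy as np

# Hypothetical network: 0 -> X1, X1 -> X2, X2 -> 0.
S = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
k2, k3 = 1.0, 4.0  # assumed rate constants

# Slopes of the reaction rates with respect to (x1, x2) at steady state.
J = np.array([[0.0, 0.0],
              [k2, 0.0],
              [0.0, k3]])

def max_real_eig(A):
    """Largest real part among the eigenvalues of A."""
    return np.linalg.eigvals(A).real.max()

def perturbed_jacobian(Delta):
    """Perturbed Jacobian S (J + Delta) for a real slope perturbation."""
    return S @ (J + Delta)

A_nominal = S @ J

# Perturbation: the production rate gains a positive slope c w.r.t. x2.
def feedback(c):
    Delta = np.zeros_like(J)
    Delta[0, 1] = c
    return Delta

stable_case = max_real_eig(perturbed_jacobian(feedback(1.0)))
unstable_case = max_real_eig(perturbed_jacobian(feedback(5.0)))
print(max_real_eig(A_nominal), stable_case, unstable_case)
```

In this example the equilibrium loses stability once the added slope exceeds $k_3$, so the larger perturbation is a non-robust one in the sense used above.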


With both approaches, robustness analysis
with structured singular values can be applied
to test for changes in the local dynamics at the
considered equilibrium point. This allows one to
evaluate a model’s robustness against this type
of structural perturbation and also yields
non-robust perturbations.

Summary and Future Directions 

Robustness analysis of biological models is well
established in biological network theory.
Mathematical methods rooted in systems and control
are particularly beneficial for approaching this
task.

While this entry focuses on models of
biochemical reaction networks given by differential
equations, the robustness analysis problem has
also been studied in other model frameworks,
for example, discrete dynamical models (Chaves
et al. 2006). Yet, beyond simulation-based
studies, robustness analysis is still an open problem
in many practically relevant biological model
classes. This concerns, for example, stochastic
models or models on the cell population level.

In a similar manner, it will be important to
extend the perturbation classes that are being
considered and to include, for example,
time-varying or other perturbations that are relevant
for biological models. Concerning the biological
function, most robustness analysis methods focus
on the steady-state behavior. In the future, it
will be of interest to also take, for example, the
transient dynamics into account.

In linear systems theory, the concept of robust
performance is well established. While efforts
have been made to transfer that concept to
biochemical networks (Doyle and Stelling 2005),
it remains difficult to quantify performance of
such networks, thus impeding the development
of stringent robustness analysis tools. One of
the reasons for this difficulty is certainly that
biological performance is more naturally defined
in the time domain than in the frequency domain,
which narrows the conclusions that could be
drawn from a direct application of classical robust
performance analysis methods.





Cross-References 

► Computational Complexity Issues in Robust 
Control 

► Deterministic Description of Biochemical Net¬ 
works 

► Structured Singular Value and Applications: 
Analyzing the Effect of Linear Time-Invariant 
Uncertainty in Linear Systems 


Bibliography 

Bates D, Cosentino C (2011) Validation and invalidation
of systems biology models using robustness analysis.
IET Syst Biol 5(4):229-244

Breindl C, Waldherr S, Wittmann DM, Theis FJ, Allgöwer
F (2011) Steady-state robustness of qualitative gene
regulation networks. Int J Robust Nonlinear Control
21(15):1742-1758. doi:10.1002/rnc.1786,
http://dx.doi.org/10.1002/rnc.1786

Chaves M, Sontag ED, Albert R (2006) Methods of
robustness analysis for Boolean models of gene
control networks. IEE Proc Syst Biol 153(4):154-167.
doi:10.1049/ip-syb:20050079,
http://dx.doi.org/10.1049/ip-syb:20050079

Doyle FJ, Stelling J (2005) Robust performance in
biophysical networks. In: Proceedings of the 16th IFAC
World Congress, Prague

Eissing T, Allgöwer F, Bullinger E (2005) Robustness
properties of apoptosis models with respect to
parameter variations and intrinsic noise. IEE Proc Syst Biol
152(4):221-228. doi:10.1049/ip-syb:20050046

Jacobsen EW, Cedersund G (2008) Structural robustness 
of biochemical network models—with application to 
the oscillatory metabolism of activated neutrophils. 
IET Syst Biol 2(1):39-47.
http://link.aip.org/link/?SYB/2/39/1

Kitano H (2007) Towards a theory of biological
robustness. Mol Syst Biol 3:137. doi:10.1038/msb4100179,
http://dx.doi.org/10.1038/msb4100179

Ma L, Iglesias PA (2002) Quantifying robustness of
biochemical network models. BMC Bioinform 3:38

Morohashi M, Winn AE, Borisuk MT, Bolouri H, Doyle
J, Kitano H (2002) Robustness as a measure of
plausibility in models of biochemical networks. J Theor
Biol 216(1):19-30. doi:10.1006/jtbi.2002.2537,
http://dx.doi.org/10.1006/jtbi.2002.2537

Shinar G, Feinberg M (2010) Structural sources of
robustness in biochemical reaction networks. Science
327(5971):1389-1391. doi:10.1126/science.1183372,
http://dx.doi.org/10.1126/science.1183372

Stelling J, Gilles ED, Doyle III FJ (2004) Robustness
properties of circadian clock architectures.
Proc Natl Acad Sci 101(36):13210-13215.
doi:10.1073/pnas.0401463101,
http://dx.doi.org/10.1073/pnas.0401463101

Steuer R, Waldherr S, Sourjik V, Kollmann M (2011)
Robust signal processing in living cells. PLoS
Comput Biol 7(11):e1002218.
http://dx.doi.org/10.1371%2Fjournal.pcbi.1002218

Waldherr S, Allgöwer F (2011) Robust stability
and instability of biochemical networks with
parametric uncertainty. Automatica 47:1139-1146.
doi:10.1016/j.automatica.2011.01.012,
http://dx.doi.org/10.1016/j.automatica.2011.01.012

Waldherr S, Allgöwer F, Jacobsen EW (2009) Kinetic
perturbations as robustness analysis tool for
biochemical reaction networks. In: Proceedings of the 48th
IEEE Conference on Decision and Control,
Shanghai, pp 4572-4577. doi:10.1109/CDC.2009.5400939,
http://dx.doi.org/10.1109/CDC.2009.5400939


Robustness Issues in Quantum 
Control 

Ian R. Petersen 

School of Engineering and Information
Technology, University of New South Wales, the
Australian Defence Force Academy, Canberra,
Australia


Abstract 

Robust quantum control theory is concerned with
the design of controllers for quantum systems
taking into account uncertainty in the model of the
system. The robust open-loop control of quantum
systems is discussed in this entry. Also discussed
is the robust stability analysis problem for
quantum systems, and two forms of quantum small
gain theorem are presented. In addition, the entry
discusses the design of robust quantum feedback
control systems.

Keywords 

Ensemble controllability; $H^\infty$ control; Minimax
control; Quantum control; Robustness; Robust
stability


This work was supported by the Australian Research 
Council (ARC). 








Introduction 

The control of systems whose dynamics are
governed by the laws of quantum mechanics is the
subject of quantum control theory. The topic of
quantum control theory is covered in the
companion article Petersen (2014). As in the case
of classical control theory, the models used in
quantum control are often subject to
uncertainties. This motivates the study of robust quantum
control, in which the quantum systems to be
controlled are modeled as uncertain quantum
systems, e.g., see Mabuchi and Khaneja (2005). A
related problem is the problem of robust
estimation and filtering for uncertain quantum systems,
e.g., see Yamamoto and Bouten (2009). The issue
of robust stability is particularly important in the
case of quantum feedback control since in this
case, there is always the possibility of instability.
An important area of quantum control theory is
open-loop quantum control; see Petersen (2014).
Since uncertainties arise in the quantum system
models being considered, the robustness of
open-loop quantum control systems is also important,
e.g., see Li and Khaneja (2009), Rabitz (2002),
and Owrutsky and Khaneja (2012).

This entry surveys some of the important
research results on robust quantum control which
have arisen in various application areas. These
include some results on robust open-loop
control of quantum systems; see Zhang and
Rabitz (1994). Also considered are some recent
robust stability analysis results for
uncertain quantum systems, which amount to
quantum versions of the classical small gain
theorem; see Petersen et al. (2012). Finally, the
entry looks at robust quantum feedback controller
design; see James et al. (2008) and Dong et al.
(2009).


Robust Open-Loop Control of 
Quantum Systems 

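In the robust open-loop control of quantum
systems, the quantum system is modeled in the
Schrödinger picture. The models can be given
either in terms of the Schrödinger equation for the
system state $|\psi(t)\rangle$:

$$
i \frac{\partial}{\partial t} |\psi(t)\rangle = \left[ H_0 + \sum_{k=1}^{m} u_k(t) H_k \right] |\psi(t)\rangle \qquad (1)
$$

or the master equation for the system density
operator $\rho$:

$$
\dot{\rho}(t) = -i \left[ \left( H_0 + \sum_{k=1}^{m} u_k(t) H_k \right), \rho(t) \right] \qquad (2)
$$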

e.g., see Petersen (2014). In these equations, $H_0$
is the free Hamiltonian of the system and $H_k$ are
corresponding control Hamiltonians. In the
robust open-loop control of quantum systems, these
quantities are assumed to be uncertain and the
control law $u_k(t)$ is to be designed to guarantee
an adequate level of performance for all possible
values of the uncertainties. Here, performance is
measured in terms of the fidelity between the
actual final state or density matrix of the system
and the desired final state or density matrix, e.g.,
see Nielsen and Chuang (2000).

In the minimax optimal control approach to
robust open-loop control of quantum systems, the
uncertainties in the Hamiltonian are represented
in terms of a vector quantity $w$ which is subject
to constraints. Then, the robust control problem
is the minimax optimal control problem

$$
\min_{u} \max_{w} J(u, w)
$$

where $J(u, w)$ is a suitable cost function, and the
problem is subject to the constraints defined by
the system dynamics (1) and the constraints on
the uncertainty $w$; see Zhang and Rabitz (1994).
Some standard numerical procedures have been
proposed to solve this minimax optimal control
problem with applications in chemical physics;
see Zhang and Rabitz (1994).
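
The structure of the minimax problem can be illustrated on a toy case. The sketch below is not the procedure of Zhang and Rabitz (1994): it assumes a hypothetical two-level system with Hamiltonian $\frac{w}{2}\sigma_z + \frac{u}{2}\sigma_x$, a constant control amplitude $u$ applied for a fixed time $T$, a scalar uncertain detuning $w$ in a bounded interval, and the cost $J(u, w) = 1 - \text{fidelity}$ with respect to a target state. Both the inner maximization and the outer minimization are done by brute force on grids.

```python
import numpy as np

# Pauli matrices and the (assumed) initial and target states.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
psi0 = np.array([1, 0], dtype=complex)
target = np.array([0, 1], dtype=complex)
T = 1.0  # fixed control duration (assumption)

def cost(u, w):
    """J(u, w) = 1 - fidelity for constant control u and detuning w."""
    H = 0.5 * w * sz + 0.5 * u * sx
    evals, evecs = np.linalg.eigh(H)
    # Propagator exp(-i H T) from the eigendecomposition of H.
    U = evecs @ np.diag(np.exp(-1j * evals * T)) @ evecs.conj().T
    psiT = U @ psi0
    return 1.0 - abs(np.vdot(target, psiT)) ** 2

us = np.linspace(0.5 * np.pi, 1.5 * np.pi, 201)  # candidate controls
ws = np.linspace(-0.5, 0.5, 41)                  # uncertainty set

# Inner maximization over w, then outer minimization over u.
worst = np.array([max(cost(u, w) for w in ws) for u in us])
u_star = us[np.argmin(worst)]

print(u_star, worst.min(), cost(np.pi, 0.0))
```

Without uncertainty the amplitude $u = \pi$ reaches the target exactly; the minimax solution trades a small amount of nominal fidelity for a better guaranteed fidelity over the whole detuning interval.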

Related to the robust open-loop control of
quantum systems is the control of
inhomogeneous quantum ensembles. In this problem,
the same control signal $u_k(t)$ is applied to
a large number of quantum particles in an
ensemble. Also, the Hamiltonians corresponding
to individual particles may have different
parameter values, and so this problem is
equivalent to a robust open-loop quantum
control problem, e.g., see Li and Khaneja
(2006). In studying this problem, the issue of
controllability has been considered (see, e.g., Li
and Khaneja 2009) as in the standard open-loop
quantum control problem; see Petersen (2014).
Also, numerical methods have been proposed
for constructing an optimal control law for
inhomogeneous ensembles, e.g., see Ruths and Li
(2012) and Owrutsky and Khaneja (2012). This
approach has arisen in applications to chemical
physics.


Robustness Analysis for Uncertain 
Quantum Systems 


The problem of robust stability analysis for uncertain
quantum systems was considered in
D’Helon and James (2006), which was concerned
with the feedback interconnection of two
quantum optical systems as shown in Fig. 1. In
this interconnection, each of the quantum systems
is a linear quantum optical system described in the
Heisenberg picture by linear quantum stochastic
differential equations (QSDEs) of the form

$$dx(t) = A x(t)\,dt + B\,du(t);$$
$$dy(t) = C x(t)\,dt + D\,du(t); \qquad (3)$$

see James et al. (2008) and Petersen (2014) for
more details on this class of quantum system
models. Here, $x(t)$ is a vector of system variables
which are operators on the underlying Hilbert
space of the system. Also, the input and output
fields are decomposed as $du(t) = \beta_u(t)\,dt + d\tilde{u}(t)$
and $dy(t) = \beta_y(t)\,dt + d\tilde{y}(t)$, where $\beta_u(t)$, $\beta_y(t)$
denote the signal parts of the quantities $du(t)$,
$dy(t)$, respectively. Furthermore, $d\tilde{u}(t)$, $d\tilde{y}(t)$
denote the noise parts of the quantities $du(t)$, $dy(t)$,
respectively, e.g., see James et al. (2008). Such a
system is stable and has a finite gain $g > 0$ if
there exist constants $\mu > 0$ and $\lambda > 0$ such that

$$\int_0^t \langle \|\beta_y(\tau)\|^2 \rangle\, d\tau \le \mu + \lambda t + g^2 \int_0^t \langle \|\beta_u(\tau)\|^2 \rangle\, d\tau \quad \forall t \ge 0;$$

e.g., see D’Helon and James (2006) and James
and Gough (2010). Here, $\langle \cdot \rangle$ denotes quantum
expectation.

Robustness Issues in Quantum Control, Fig. 1
Feedback interconnection of two quantum optical systems

The two quantum optical systems shown in 
Fig. 1 are interconnected via beam splitters which 
are described by equations 


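$$u_1 = \epsilon_1 w_1 - \sqrt{1-\epsilon_1^2}\, y_2; \quad z_1 = \sqrt{1-\epsilon_1^2}\, w_1 + \epsilon_1 y_2;$$
$$u_2 = \epsilon_2 w_2 - \sqrt{1-\epsilon_2^2}\, y_1; \quad z_2 = \sqrt{1-\epsilon_2^2}\, w_2 + \epsilon_2 y_1$$

where $\epsilon_1 \in (0, 1)$ and $\epsilon_2 \in (0, 1)$ are given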
constants. The quantum small gain theorem
established in D’Helon and James (2006) shows
that if each of the quantum systems in Fig. 1 is
stable and has finite gains $g_1 > 0$ and $g_2 > 0$,
respectively, such that $\sqrt{1-\epsilon_1^2}\,\sqrt{1-\epsilon_2^2}\; g_1 g_2 < 1$,

then the feedback interconnected system will also 
be stable and have a finite gain. This result can be 
thought of as a stability robustness result if the 
first quantum system is regarded as the nominal 
quantum system and the second quantum system 
is regarded as being the uncertain part of the 
system subject to the given finite gain constraint. 

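An alternative approach to the robust stability
analysis of uncertain quantum systems considers
an uncertain quantum system described using the
(S, L, H) description (see Petersen (2014) and
Gough and James (2009) for more details on
this class of quantum systems). Here, the system
Hamiltonian is described in terms of vectors of
annihilation and creation operators $a$ and $a^\#$,
respectively, as

$$H = \frac{1}{2} \begin{bmatrix} a^\dagger & a^T \end{bmatrix} M \begin{bmatrix} a \\ a^\# \end{bmatrix} + \frac{1}{2} \begin{bmatrix} \zeta^\dagger & \zeta^T \end{bmatrix} \Delta \begin{bmatrix} \zeta \\ \zeta^\# \end{bmatrix}$$

where $M$ is a known complex Hermitian matrix
describing the nominal Hamiltonian, $\Delta$ is a complex
Hermitian uncertainty matrix subject to the
norm bound $\|\Delta\| \le \frac{2}{\gamma}$, and
$\begin{bmatrix} \zeta \\ \zeta^\# \end{bmatrix} = E \begin{bmatrix} a \\ a^\# \end{bmatrix}$. Also,
$E$ is a known complex matrix describing the
uncertainty structure. Furthermore, it is assumed
that $S = I$ and the coupling operator vector $L$ is
such that
$\begin{bmatrix} L \\ L^\# \end{bmatrix} = N \begin{bmatrix} a \\ a^\# \end{bmatrix}$, where $N$ is a known
complex matrix. This uncertain quantum system
is robustly mean square stable if the $H^\infty$ norm
bound condition

$$\left\| E \left( sI + iJM + \frac{1}{2} J N^\dagger J N \right)^{-1} J E^\dagger \right\|_\infty < \frac{\gamma}{2}$$

is satisfied, where $J = \begin{bmatrix} I & 0 \\ 0 & -I \end{bmatrix}$; see Petersen
et al. (2012).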


Robust Feedback Control of Quantum 
Systems 

Schrödinger Picture Approaches to Robust
Measurement-Based Quantum Feedback 
Control 

A number of results have appeared which use
Schrödinger picture models (see Petersen 2014)
in robust measurement-based quantum feedback
control. These results are based on uncertain
quantum system models of the form (1) or (2) and
extend the results mentioned above by allowing
for measurements of the quantum system in order
to achieve improved robustness against uncertainties
in the system Hamiltonian. For example,
consider a quantum system of the form (1) with
uncertainties in the system Hamiltonian. Then a
measurement feedback robust control scheme can
be constructed which involves periodic projective
measurements on the system. In a projective
measurement of the quantum system (1), the state
$|\psi(t)\rangle$ collapses to an eigenstate of $H_0$
corresponding to the measurement outcome obtained.
The sliding mode control algorithm uses open-loop
time optimal control (see Petersen 2014) to
steer the state of the system back to a specified
eigenstate of the system whenever a measurement
is obtained which does not correspond to this
desired eigenstate; see Dong and Petersen (2009).
This desired eigenstate is referred to as the sliding
mode domain, and the state of the system
is guaranteed to stay within the sliding mode
domain with a specified probability provided that
the measurement sampling period in the proposed
feedback control algorithm is chosen to be
sufficiently fast; see Dong and Petersen (2009).
In the case of two-level quantum systems, this
sliding mode control approach is implemented
using a Lyapunov method for open-loop quantum
control to steer the system back to the sliding
mode domain; see Petersen (2014) and Dong and
Petersen (2012). In all of these cases, robustness
is ensured by including uncertainty in the underlying
quantum system models and then taking
this into account in the design of the control laws
and sampling period.
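
The link between the sampling period and the probability of staying in the desired eigenstate can be made concrete with a toy calculation. The sketch below is not the design procedure of Dong and Petersen (2009); it assumes, purely for illustration, a two-level system that starts in the desired eigenstate and is disturbed by an unknown Hamiltonian term $\epsilon \sigma_x$ with $|\epsilon| \le \epsilon_{max}$, so that the probability of measuring the desired eigenstate after time $\tau$ is $\cos^2(\epsilon \tau)$. The function names are hypothetical.

```python
import math


def stay_probability(eps, tau):
    """Probability that a projective measurement at time tau still returns
    the initial eigenstate, for the assumed disturbance eps * sigma_x."""
    return math.cos(eps * tau) ** 2


def max_sampling_period(eps_max, p0):
    """Largest period tau with stay_probability >= p0 for all |eps| <= eps_max.

    Valid on the first quarter period (eps_max * tau <= pi / 2), where the
    worst case occurs at |eps| = eps_max.
    """
    if not 0.0 < p0 < 1.0 or eps_max <= 0.0:
        raise ValueError("need 0 < p0 < 1 and eps_max > 0")
    return math.acos(math.sqrt(p0)) / eps_max


# Hypothetical numbers: disturbance bound 0.2, required probability 0.99.
tau_max = max_sampling_period(0.2, 0.99)
```

A tighter uncertainty bound or a lower required probability allows a longer sampling period, which matches the qualitative statement in the text.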

Another approach to the measurement- 
based robust quantum feedback control problem 
involves an extension of the robust open-loop 
control results considered in section “Robust 
Open-Loop Control of Quantum Systems.” In this 
approach, robust open-loop control results are 
extended to solve the problem of stabilization of 
an ensemble of quantum particles; see Beauchard 
et al. (2012). 

Heisenberg Picture Approaches to Robust 
Quantum Feedback Control 

Consider a quantum linear system modeled in 
the Heisenberg picture by quantum stochastic 
differential equations (QSDEs) as follows: 

$$dx(t) = A x(t)\,dt + B\,dw(t);$$
$$dy(t) = C x(t)\,dt + D\,dw(t); \qquad (4)$$

see Petersen (2014) for details on this class of
quantum system models, which arises in the area
of quantum optics. In the robust quantum feedback
control problem, the matrices $A$, $B$, $C$,
$D$ may be uncertain and a feedback controller
can be designed using the quantum $H^\infty$ control
approach to ensure that the resulting closed-loop
system is robustly stable; see James et al. (2008).
In the case of measurement-based feedback control,
the controller is a classical system described
by linear stochastic differential equations of the
form

$$dx_K(t) = A_K x_K(t)\,dt + B_K\,dy(t)$$
$$\beta_u(t)\,dt = C_K x_K(t)\,dt; \qquad (5)$$

see Petersen (2014). In the case of coherent feedback
control, the controller is another quantum
linear system described by QSDEs of the form

$$dx_K(t) = A_K x_K(t)\,dt + B_K\,dy(t) + \bar{B}_K\,dw_K(t)$$
$$du(t) = C_K x_K(t)\,dt + D_K\,dw_K(t); \qquad (6)$$

see Petersen (2014).

In this approach to robust quantum feedback
control, the uncertainty in the quantum system
being controlled is represented by uncertainty
in the matrix $A$ as $A = A_0 + B \Delta C$, where
$A_0$ denotes the nominal value of $A$ and
$\Delta$ is a constant but unknown uncertain matrix
satisfying the bound $\Delta^T \Delta \le I$. The controller,
which may be either a classical controller or a
coherent controller, is designed using the quantum
$H^\infty$ approach. Then the resulting closed-loop
system will be robustly stable; see James et al.
(2008). Similarly, the plant may have uncertainty
in its Hamiltonian matrix, or uncertainty in the form
of an uncertain subsystem connected optically to
the plant in feedback, both as considered in section
“Robustness Analysis for Uncertain Quantum
Systems.” In these cases, the quantum $H^\infty$ approach
combined with the robust stability analysis results
of that section shows that the quantum $H^\infty$
method can also be used to design robustly
stabilizing controllers.


Summary and Future Directions 

To date there have been only a few papers
published in the general area of robust quantum
control. The results which were considered in this
entry covered open-loop and feedback quantum
control problems along with stability robustness
analysis problems. A common theme in the
results which were considered is that they were
based on uncertain quantum mechanical models.
It is expected that future research in this area will
intensify as the use of feedback control becomes
more prevalent in areas of experimental quantum
technology.


Cross-References 

► Control of Quantum Systems 

► H-Infinity Control 

► LMI Approach to Robust Control 

► Optimization Based Robust Control 
Run-to-Run Control in
Semiconductor Manufacturing

James Moyne
University of Michigan, Ann Arbor, MI, USA

Abstract 

Run-to-run (R2R) control is a form of adaptive
model-based process control that can be
tailored to environments where the process is
discrete, dynamic, and highly unobservable; this
is characteristic of processes in the semiconductor
manufacturing industry. It generally has, at
its roots, a rather straightforward approach to
adaptive model-based control such as a first-order
linear plant model with moving average weighting
applied to adapt the (zeroth-order) constant
term in the model. Most of the complexity of
R2R control science lies and will continue to
lie in extensions to support practical application
of R2R control in semiconductor manufacturing
facilities of the future; these extensions
include support for weighting and bounding
of parameters, run-time modeling of a large
number of disturbance types, and incorporating
prediction information such as virtual metrology
and yield prediction into the control solution.


Keywords 

Adaptive control; Advanced process control 
(APC); EWMA control; Feed-forward and 
feedback control; Model-based control; R2R 
control; Run-to-run control; Single-threaded 
control; Virtual metrology; Wafer-to-wafer 
control; Yield prediction 

Introduction 

The semiconductor manufacturing industry 
involves the processing of semiconductor 
“wafers” using a variety of physical and chemical 
processes to produce dies or “chips” that contain 
a number of nanometer size features organized 
in layers. As feature sizes shrink, the industry 
must innovate to maintain acceptable product 
yield and throughput. One effective dimension 
of innovation that has been utilized since the 
early 1990s is model-based process control. 
The use of this technology in semiconductor
manufacturing has been largely industry
specific due to unique industry requirements
and has been given the name “run-to-run (R2R)
control.”

R2R control is defined as “... a form of discrete
process and machine control in which the
product recipe with respect to a particular machine
process is modified ex situ, i.e., between
machine ‘runs,’ so as to minimize process drift,
shift, and variability” (Moyne et al. 2000). (The
“recipe” is the group of process settings for a process
or process step, e.g., temperature, flow, and
pressure.) The term “R2R control” was coined in
the early 1990s in the semiconductor industry as
the industry struggled to come up with mechanisms
to keep critical semiconductor manufacturing
processes such as chemical vapor deposition
(CVD), chemical mechanical polishing (CMP),
and reactive ion etching (RIE) under control.
The processes are highly unobservable and are
subjected to a number of disturbances. However,
many of these disturbances can be modeled or
tracked as they create measurable shifts in the
process (e.g., after a maintenance operation) or
gradual drifts in the process (e.g., chamber wall
“seasoning” of an etch process over time, resulting
in polymer buildup on chamber walls, causes
changes to the operational effectiveness of the
tool). Process and product quality is generally
assessed through metrology measurements made
ex situ, i.e., after the process is complete; examples
of post-process metrology parameters are
wafer average deposited or removed film thickness
and film uniformity. R2R control generally
uses statistically developed models of tool process
operation updated or “tuned” with process
metrology feedback information on a “run-to-run”
basis to keep the process under control and
process quality high, in the face of these process
drifts and shifts, as shown in Fig. 1. Note that the
granularity of control could be wafer-to-wafer, or
batch-to-batch (“lot-to-lot”), etc.

Run-to-Run Control in Semiconductor Manufacturing, Fig. 1 Input/output structure of a typical R2R control
solution


Run-to-Run Control Approach 

Because the processes are highly unobservable 
and dynamic, rather simple model forms are 
usually employed with filtering techniques 
used to track process shift and drift. The most 
commonly utilized R2R controller in the industry 
is the exponentially weighted moving average 
(EWMA) controller. The algorithm uses a 
linear model with an additional constant term. 
(Equations will use the following notation: 
arrays of vectors will be capitals, vectors 
will be lower case, and indexing within a 
vector or matrix will be lower case with 
subscripts. In addition, the special subscript 
“t” will be reserved for time or run number 
information.) 

$$y = Ax + c \qquad (1)$$

where:

y = System output,
x = Input (recipe),
A = Slope coefficients for equation,
c = Constant term for linear model.

Each output represents a target of control (usually
measured by pre- and post-process metrology
tools), and each input represents an adjustable
parameter in the recipe.

$$\begin{aligned} y_1 &= a_{11} x_1 + a_{12} x_2 + \dots + a_{1m} x_m + c_1 \\ &\ \ \vdots \\ y_n &= a_{n1} x_1 + a_{n2} x_2 + \dots + a_{nm} x_m + c_n \end{aligned} \qquad (2)$$
The models are generally developed by executing
a design of experiments (DOE), where
the process area is explored with respect to the
allowed variation of the process inputs by processing
wafers with various input settings (see,
e.g., Box and Draper 1987). Statistical packages
are then used to determine the base model of
the form described in (1) at the normal process
operating point. As the processes are dynamic,
the base model is updated on a “run-to-run” basis
to compensate for model error. The algorithm
operates under the assumption that the underlying
process is locally approximated by a first-order
linear polynomial model in the form of equation (1)
and that this polynomial model can be
maintained near a local optimal point solely by
updating the constant term “c.”

The control process involves updating the
model and then using that model to compute
a recipe update. The model is updated by first
comparing the actual process output, $y_t$, to the
model-predicted process output, $A x_t$. Using an
EWMA filtering technique as an example, the
constant term $c_t$ can be updated as follows:

$$c_t = \alpha (y_t - A x_t) + (1 - \alpha) c_{t-1} \qquad (3)$$

where $\alpha$ is a weighting factor between 0 and
1, often called a “forgetting factor.” Note that
because of the additive nature of the EWMA
series, the $c_t$ calculation only requires knowledge
(and storage) of the previous run measurements;
this, combined with its relative simplicity, led to
the widespread adoption of EWMA as the R2R
controller filter of choice in this industry during
the 1990s and early 2000s.
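
The EWMA update of the constant term in equation (3) can be sketched in a few lines. This is a minimal illustration rather than a production controller; the function name `ewma_update` and the list-based representation of vectors and the slope matrix are assumptions made here.

```python
def ewma_update(c_prev, y_t, A, x_t, alpha):
    """Return the updated constant term for one run.

    c_prev : previous constant term (list, one entry per output)
    y_t    : measured outputs for this run (list)
    A      : slope coefficients (list of rows, one row per output)
    x_t    : recipe used for this run (list, one entry per input)
    alpha  : EWMA weighting factor between 0 and 1
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    # Slope part of the model prediction, A x_t, one value per output.
    slope_part = [sum(a_ij * x_j for a_ij, x_j in zip(row, x_t)) for row in A]
    # Blend the newly observed offset with the previous estimate.
    return [
        alpha * (y_i - p_i) + (1.0 - alpha) * c_i
        for y_i, p_i, c_i in zip(y_t, slope_part, c_prev)
    ]
```

Only the previous constant term and the current run's recipe and measurement are needed, which reflects the low storage requirement noted in the text.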

Once the model is updated, the process recipe
is calculated. Since there are generally more inputs
that can be tuned than outputs measured, the
process is underdetermined and there is an infinite
solution space. Approaches such as Lagrange
multipliers are used to determine the solution that
is closest to the previous solution (Moyne et al.
2000).
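
For a single controlled output, the "closest to the previous solution" recipe has a closed form. Minimizing the squared distance to the previous recipe subject to the model hitting the target, with one Lagrange multiplier, gives the update used below. The sketch assumes one output, a Euclidean distance measure, and no bounds or weights on the inputs; the name `next_recipe` is hypothetical and the approach in Moyne et al. (2000) may differ in detail.

```python
def next_recipe(a, c_t, target, x_prev):
    """Minimum-change recipe for a single-output model y = a . x + c_t.

    Solves: minimize ||x - x_prev||^2 subject to a . x + c_t = target.
    """
    norm_sq = sum(a_j * a_j for a_j in a)
    if norm_sq == 0.0:
        raise ValueError("the output does not depend on any input")
    # Gap between the target and what the previous recipe would give.
    residual = target - (sum(a_j * x_j for a_j, x_j in zip(a, x_prev)) + c_t)
    # Move along the direction of a, which is the shortest correction.
    return [x_j + a_j * residual / norm_sq for a_j, x_j in zip(a, x_prev)]
```

With several outputs the same idea leads to a pseudo-inverse of the slope matrix, and practical controllers add input bounds and weights on top of it.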

Many extensions and alternatives to this 
basic approach have been developed and 
deployed over the past 10 years. These include 
(1) the replacement of EWMA filtering with 


other approaches such as the more general 
Kalman filtering, (2) explicitly modeling drift 
(termed “predictor corrector”), (3) modeling 
updates to first-order terms (in the “A” 
matrix), and (4) leveraging phenomenological 
models that capture process knowledge in 
equation forms, customized and tuned with 
statistical data. Perhaps the most important 
extensions to the basic approach involve 
addressing the practical issues associated 
with control systems application in this 
area. For example, providing capabilities for 
addressing bounding, weighting, and granularity 
(e.g., integer) of input and output settings 
often requires much more programming 
effort than supporting the core algorithm 
(Moyne et al. 2000). 


Current Status and Future Extensions 

Over the past 10 years, R2R control has evolved 
from a value-added capability applied to a few 
processes, to a required component to achieve 
cost and productivity competitiveness in most 
processes in the semiconductor manufacturing 
industries (ITRS 2014). As part of this evolution, 
a number of common trends in the R2R control 
space have emerged: 

Support for fab-wide reusable and reconfigurable
solutions for R2R control: As the benefits of
R2R control were proven across multiple processes
in semiconductor fabrication facilities, the
focus turned to reusable and reconfigurable integrated
“fab-wide” solutions for R2R control. The
event-based capabilities described in Chapter 9 of
Moyne et al. (2000) were leveraged to provide
these solutions as they allow for integration and
configuration of R2R control solutions to the
particular application environment. This event-based
approach has also been used to integrate
R2R control with other capabilities such as fault
detection and classification (FDC), work scheduling,
and “virtual metrology” (see below), to provide
another level of benefits towards improved
product yield and throughput (Khan et al. 2007;
Moyne 2004, 2009).

Movement to more granular control: The
ever more stringent requirements on product
quality are being addressed in large part by a 
movement from batch-level control (often called 
“lot-based control” in this domain), to wafer- 
level control (usually called “wafer-to-wafer” 
(W2W) control), to within-wafer (WIW) control. 
Although the granularity has changed, the basic 
approach to control has not. It is important to 
note that the improvement in quality associated 
with this trend results mostly from the use 
of pre-(process) metrology to reject incoming 
product disturbances, rather than post metrology 
to address the dynamics of the plant model (ITRS 
2014; Moyne et al. 2000). 

Support for control across multiple recipes using
“single-threaded control”: Semiconductor manufacturing
process control systems are characterized
by a number of disturbance types that usually
can be modeled as independent from the base process
model and from each other. Perhaps the most
common type of disturbance that is addressed
is recipe or product change. When there is a
change in product and related product recipe, a
single process model must be adjusted to capture
this disturbance while maintaining knowledge of
process drift and/or shift. Oftentimes this process
disturbance can be modeled as a shift to
the overall process. Thus, the process model of
equation (1) can be adjusted to the following:

$$y = Ax + c_1 + c_2 + c_3 + \dots + c_n \qquad (4)$$

where:

$c_1, c_2, \dots, c_{n-1}$ = constant terms associated
with modeled disturbances such as product,
$c_n$ = constant term associated with process
dynamics (drift and shift),
$c_1 + c_2 + \dots + c_n$ = $c$ in Eq. (1).

Approaches have been devised for the assessment
of $c_i$ associated with a particular disturbance
type (Edgar et al. 2004; Zou 2013); the result is
that a single control model can be used across
multiple product recipes and other disturbance
types.
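
The decomposition of the constant term can be pictured as a table of per-product offsets plus one shared drift term. The sketch below is only one hypothetical way to organize such a model for a single output: it assumes the product offsets are already known and applies an EWMA update to the shared drift term alone. It does not reproduce the assessment methods of Edgar et al. (2004) or Zou (2013), and the function names and data layout are illustrative assumptions.

```python
def predict(a, x, product_offsets, product, c_drift):
    """Single-output prediction: slope part + product offset + shared drift."""
    slope_part = sum(a_j * x_j for a_j, x_j in zip(a, x))
    return slope_part + product_offsets[product] + c_drift


def update_drift(c_drift, y_meas, a, x, product_offsets, product, alpha):
    """EWMA update of the shared drift term; product offsets stay fixed."""
    slope_part = sum(a_j * x_j for a_j, x_j in zip(a, x))
    # What remains after removing the slope part and the known product offset.
    observed = y_meas - slope_part - product_offsets[product]
    return alpha * observed + (1.0 - alpha) * c_drift


# Hypothetical product offsets for two products run on the same tool.
offsets = {"P1": 1.0, "P2": -2.0}
drift = update_drift(0.5, 4.0, [2.0], [1.0], offsets, "P1", 0.5)
```

Because the drift estimate is shared, a measurement taken on one product immediately improves the prediction for the next product run on the same tool.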

Enhancing R2R control with “virtual metrology”:
Ex situ metrology plays a crucial role in semiconductor
manufacturing as it is often the only source
of product quality data before and after a process.
However, given its high capital equipment cost
and cycle time impact on critical processes,
optimizing metrology by minimizing wasteful use
and optimizing measurement value is important.
Virtual metrology (VM) is a new technology
rapidly gaining acceptance in the marketplace as
an efficient and cost-effective way to optimize
and augment metrology value. VM is a modeling
and metrology prediction solution whereby
process and product data, such as in situ fault
detection (FD) information and upstream metrology
information, is correlated to post-process
metrology data. This same data can then be used
to predict metrology information when conventional
metrology data is not available (Cheng
et al. 2011; Khan et al. 2007).

One of the uses of VM that is expected to
become prominent over the next decade is in
support of enhanced R2R control. As shown in
Fig. 2, fault detection (FD) summary information
is used along with adaptive VM modeling to predict
metrology information. The VM predictions
are then used to fill in the measurement gaps in
feed-forward and feedback control, thus enabling
wafer-to-wafer or even within-wafer control. One
of the research challenges is to optimally tune the
control to best utilize both the real and predicted
metrology information. This requires that VM
data contain information on predicted measurement
data quality (Khan et al. 2007).

Run-to-Run Control in Semiconductor Manufacturing, Fig. 2
Virtual metrology. Symbols used in the figure:

$u(n)$: Tunable process inputs
$v(n)$: FD summary information
$y(k)$: Metrology measurement data for measured wafers
$\hat{y}(n)$: Predicted metrology measurements for all wafers
$\alpha_1(n)$: Feedback filter coefficient for feedback of measured data
$\alpha_2(n)$: Feedback filter coefficient for feedback of predicted data

Movement towards interprocess and eventually
fab-wide control: The generally accepted vision
of the future of advanced process control (APC)
is a fabrication-wide, fully integrated
solution that incorporates all of the APC capabilities
(R2R control, FDC, fault prediction,
and statistical process control) as well as predictive
capabilities such as predictive scheduling,
predictive maintenance, virtual metrology, and
predictive yield (ITRS 2014). Opportunities for
research and development exist with the integration
of these technologies, especially as the
powers of the predictive domain are tapped. For
example, it is expected that R2R control will
eventually incorporate predicted yield as a target
with feedback to multiple coordinated process
controllers (Moyne and Schulze 2010). Thus, the
future of research in R2R control, while evolving,
should remain strong in the coming years.

Summary and Future Directions 

R2R control is a form of adaptive model-based
process control that is tailored to environments
where the process is discrete, dynamic, and
highly unobservable; this is characteristic of
processes in the semiconductor manufacturing
industry. R2R control has evolved from a strictly
research effort in the early 1990s to a required
facility-wide capability in all of semiconductor
manufacturing. It generally has, at its roots,
a rather straightforward approach to adaptive
model-based control. Most of the complexity of
R2R control science lies and will continue to
lie in extensions to support practical application
of R2R control in semiconductor manufacturing
facilities of the future.

The science of R2R control will continue to
expand as the academic and industry communities
look to incorporating capabilities that will
allow R2R control to continue to be an integral
part of the fabrication facility of the future. One
key research direction over the next decade is the
development of approaches for incorporating virtual
metrology and yield prediction into control
solutions. Other focus areas will likely include
hybrids of R2R control and continuous process
control, learning mechanisms for single-threaded
control in “high-mix” environments where there
are a large number of disturbances that should be
modeled, phenomenological R2R control models,
and model libraries that combine stochastic
information with process physics and chemistry
knowledge, control solutions that are more directly
optimized to financial parameters such as
yield and throughput, and R2R control solutions
that incorporate other analysis capabilities, such
as FDC, either algorithmically or via event-based
control rule approaches. Each of these topics
provides significant opportunity for research as
well as benefit in application to semiconductor
manufacturing facilities.

Cross-References 

► Adaptive Control, Overview 

► Controllability and Observability 

► Event-Triggered and Self-Triggered Control 

► Experiment Design and Identification for Control

► Fault Detection and Diagnosis 

► Kalman Filters 

► Moving Horizon Estimation 

► Nominal Model-Predictive Control 
► Robust Model-Predictive Control

► Stochastic Model Predictive Control 
Abstract

$\mathcal{H}_\infty$ optimization is central in robust control.
When controllers are implemented by computers,
sampled-data control systems arise. Designing
$\mathcal{H}_\infty$-optimal controllers in purely continuous
time or in purely discrete time is standard
in robust control; in this entry, we discuss
the process of sampled-data optimization,
namely, designing digital controllers based on
a continuous-time $\mathcal{H}_\infty$ performance measure.

Keywords

Computer control; $\mathcal{H}_\infty$ discretization; Robust
control; Sampled-data systems

Introduction 

Robust control deals mainly with controller design
against uncertainties in system modeling
and disturbances. The central tool used is $\mathcal{H}_\infty$
optimization.

In continuous time, consider the standard
setup in Fig. 1, where $G$ is the generalized plant
and $K$ is the controller; $G$ has two inputs ($w$,
the exogenous input, and $u$, the control input)
and two outputs ($z$, the output to be controlled,
and $y$, the measured output); $K$ processes $y$ to
generate $u$. The $\mathcal{H}_\infty$-optimal control problem
is to design $K$ to stabilize $G$ and minimize the
$\mathcal{H}_\infty$ norm of the closed-loop system in Fig. 1
from $w$ to $z$, denoted $T_{zw}$. When both $G$ and $K$
are continuous-time, linear time-invariant (LTI),
the $\mathcal{H}_\infty$ norm, $\|T_{zw}\|$, relates to the frequency
response matrix $T_{zw}(j\omega)$ as follows:

$$\|T_{zw}\| = \sup_{\omega} \bar{\sigma}\left[ T_{zw}(j\omega) \right],$$

where $\bar{\sigma}$ indicates the maximum singular value.
This $\mathcal{H}_\infty$-optimal control problem in the LTI
case is solvable by many techniques, e.g., Riccati
equations and linear matrix inequalities; see
robust control textbooks by Zhou et al. (1996) and
Dullerud and Paganini (2000).
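
For a scalar (SISO) stable transfer function the maximum singular value is just the magnitude of the frequency response, so the norm can be approximated by a frequency sweep. The sketch below assumes a hypothetical closed-loop transfer function $T(s) = 1/(s^2 + 2\zeta s + 1)$ with $\zeta > 0$; the function names and the uniform grid are illustrative choices. A sweep can only return a lower bound on the supremum, which is why dedicated algorithms are used in practice.

```python
import math


def gain(zeta, omega):
    """Magnitude |T(j*omega)| for the assumed T(s) = 1 / (s^2 + 2*zeta*s + 1)."""
    s = complex(0.0, omega)
    return abs(1.0 / (s * s + 2.0 * zeta * s + 1.0))


def hinf_norm_estimate(zeta, omega_max=5.0, n=10001):
    """Approximate sup over omega of |T(j*omega)| on a uniform grid.

    The result never exceeds the true norm, because it is the largest
    magnitude found at finitely many frequencies.
    """
    return max(gain(zeta, omega_max * k / (n - 1)) for k in range(n))


zeta = 0.1
estimate = hinf_norm_estimate(zeta)
# Closed-form peak for this second-order system when zeta < 1/sqrt(2).
exact = 1.0 / (2.0 * zeta * math.sqrt(1.0 - zeta ** 2))
```

For a matrix-valued frequency response, the magnitude would be replaced by the largest singular value at each frequency.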


Sampled-Data Control 

When controllers are implemented by digital
computers, periodic samplers and zero-order
holds are used to model analog-to-digital and
digital-to-analog conversion. Replacing $K$ in
Fig. 1 by sampler $S$ (with period $h$), discrete-time
controller $K_d$, and zero-order hold $H$
(synchronized with $S$), we obtain a sampled-data
control system shown in Fig. 2; here, $S$ converts
$y$ into a discrete-time sequence $\psi$; $K_d$, a real-time
algorithm in the computer, inputs $\psi$ and
computes another sequence $\upsilon$, which is converted
by $H$ into $u$.

Sampled-Data H-Infinity Optimization, Fig. 1
Standard control setup in continuous time

Sampled-Data H-Infinity Optimization, Fig. 2
Sampled-data control setup

There are in general three approaches to design
a digital controller $K_d$: design a continuous-time
controller $K$ and then implement it digitally
via approximation, discretize the plant and then
design $K_d$ in discrete time, and finally, design $K_d$
directly based on continuous-time performance
specifications (Chen and Francis 1995). The last
approach is followed in the $\mathcal{H}_\infty$ optimization
framework.


Sampled-Data $\mathcal{H}_\infty$ Discretization

The sampled-data $\mathcal{H}_\infty$ control problem is to design
$K_d$ directly to stabilize $G$ in Fig. 2 and
minimize $\|T_{zw}\|$. Notice that even if $G$ is LTI in
continuous time and $K_d$ is LTI in discrete time,
the closed-loop system $T_{zw}$ is no longer LTI, due
to the presence of $S$ and $H$ in the control loop;
in this case, the $\mathcal{H}_\infty$ norm is interpreted as the
$\mathcal{L}_2$-induced norm:

$$\|T_{zw}\| = \sup\{\|z\|_2 : \|w\|_2 = 1\};$$

here, $\|\cdot\|_2$ represents the $\mathcal{L}_2$ norm on signals.

Sampled-Data H-Infinity Optimization, Fig. 3 The
equivalent discrete-time system

The sampled-data $\mathcal{H}_\infty$ control problem has
been shown to be equivalent to a purely discrete-time
$\mathcal{H}_\infty$ control problem (Kabamba and Hara
1993; Bamieh and Pearson 1992; Toivonen
1992); the process is known as sampled-data
$\mathcal{H}_\infty$ discretization: for $\gamma > 0$, construct an LTI
discrete-time system $G_{eq,d}$ connected to $K_d$ as
in Fig. 3; the two systems, $T_{zw}$ in Fig. 2 and
$T_{\zeta\omega} : \omega \mapsto \zeta$ in Fig. 3, are equivalent in that
$\|T_{zw}\| < \gamma$ if and only if $\|T_{\zeta\omega}\| < \gamma$, where the latter norm is
$\ell_2$-induced, and since $T_{\zeta\omega}$ is LTI in discrete time,
it equals the $\mathcal{H}_\infty$ norm of the corresponding
transfer function $\hat{T}_{\zeta\omega}(z)$. Thus, pure discrete-time
techniques are immediately applicable.

There are several ways to present this discretization.
However, the computation is quite
involved and hence is not given here; interested
readers can find details in the papers by Kabamba
and Hara (1993), Bamieh and Pearson (1992),
and Toivonen (1992), or the book by Chen and
Francis (1995). Note that the $\mathcal{H}_\infty$ discretization
process is not quite exact in the sense that $G_{eq,d}$
depends on $\gamma$ (Chen and Francis 1995).

Summary and Future Directions 

In sampled-data $\mathcal{H}_\infty$ optimization, the key idea
is to address the hybrid nature of the problem,
considering intersample behavior in formulation;
the main tool is the so-called continuous lifting
(Yamamoto 1994; Bamieh and Pearson 1992),
making use of periodicity of sampled-data
systems.

The ideas and tools developed in sampled-data
control theory are still being used in emerging
areas such as hybrid systems and networked
control systems. For example, in event-triggered
control systems, information exchange and control
updating are not time driven but are done
by certain event-triggering schemes, resulting in
necessarily nonlinear and time-varying closed-loop
dynamics; the analysis and synthesis issues
in such systems are still challenging.

Cross-References 

► H-Infinity Control 

► LMI Approach to Robust Control 

► Optimal Sampled-Data Control 

► Optimization Based Robust Control 

Recommended Reading 

The continuous-time $\mathcal{H}_\infty$ control problem and
its solutions are discussed extensively in several
textbooks, e.g., Zhou et al. (1996) and Dullerud
and Paganini (2000). The discrete-time $\mathcal{H}_\infty$ control
problem was solved via the approach of Riccati
equations in Iglesias and Glover (1991). The
sampled-data $\mathcal{H}_\infty$ control problem was solved
simultaneously with different methods in Kabamba
and Hara (1993), Bamieh and Pearson (1992),
and Toivonen (1992); details of the solution
discussed here can be found in the book by Chen and
Francis (1995).
Abstract

For digital devices to interact with the physical world, an interface is needed that transforms the signals from analog to digital and vice versa. Ideal samplers and zero-order hold devices are incorporated to derive discrete-time models of continuous-time systems. State variable descriptions and transfer functions are used.

control; Discrete-time approximations; Quantization; Reconstruction; Sampled-data systems; Sampling

Introduction 

Sampled-data systems are discrete-time models of continuous-time processes useful in the digital control of continuous-time systems. A digital controller cannot communicate directly with a continuous system and an interface is needed.

Consider a continuous-time system having $u(t)$ as its input and $y(t)$ as its output.

A/D Converter: The continuous-time signal $y(t)$ is converted into a discrete-time signal $\{y(k)\}$, $k \ge 0$, $k \in \mathbb{Z}$, which is a sequence of values $\{y(0), y(1), \dots\}$ determined by the relation

$$y(k) = y(t_k). \tag{1}$$

This is the ideal A/D (analog to digital) converter that samples $y(t)$ at times $t_0, t_1, t_2, \dots$, producing the sequence $\{y(t_0), y(t_1), \dots\}$, also denoted as $\{y(t_k)\}$.

D/A Converter: The D/A (digital to analog) converter receives as its input a sequence $\{u(k)\}$, $k = 0, 1, 2, \dots$ and outputs a (piecewise) continuous-time signal $u(t)$ determined by

$$u(t) = u(k), \quad t_k \le t < t_{k+1}, \quad k = 0, 1, 2, \dots. \tag{2}$$

That is, this D/A converter keeps the value of $u(t)$ constant at the last value of the sequence entered, until a new value comes in. Such a device is called a zero-order hold (ZOH) device.
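The two interface devices can be illustrated with a minimal Python sketch. The function names `sample` and `zoh`, the sampling period of 0.5, and the sine test signal are illustrative assumptions, not part of the entry.

```python
import math

def sample(y, times):
    """Ideal A/D converter, Eq. (1): y(k) = y(t_k)."""
    return [y(t) for t in times]

def zoh(u_seq, times, t):
    """Zero-order hold, Eq. (2): u(t) = u(k) for t_k <= t < t_{k+1}."""
    if t < times[0]:
        raise ValueError("t precedes the first sampling instant")
    # Index of the last sampling instant that does not exceed t.
    k = max(i for i, tk in enumerate(times) if tk <= t)
    return u_seq[k]

T = 0.5                                # assumed sampling period
times = [k * T for k in range(5)]      # t_0, ..., t_4
y_k = sample(math.sin, times)          # the sequence {y(k)}
held = zoh(y_k, times, 0.74)           # value held on [0.5, 1.0)
```

Here `held` equals the sample taken at $t_1 = 0.5$, since the hold keeps that value until the next sampling instant.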


Higher-Order Hold 

The ZOH device described above implements a particular procedure of data reconstruction or extrapolation. The general problem is as follows:

Given a sequence of real numbers $\{f(k)\}$, $k = k_0, k_0 + 1, \dots$, derive $f(t)$, $t \ge t_0$, so that

$$f(t_k) = f(k), \quad k = k_0, k_0 + 1, \dots$$

Clearly, there is a lot of flexibility in assigning values to $f(t)$ in between the samples $f(k)$; in other words there is a lot of flexibility in assigning the intersample behavior in $f(t)$.

A way to approach the problem is to start by writing a power series expansion of $f(t)$ for $t$, $t_k \le t < t_{k+1}$, namely,

$$f(t) = f(t_k) + f^{(1)}(t_k)(t - t_k) + \frac{f^{(2)}(t_k)}{2!}(t - t_k)^2 + \cdots$$

where $f^{(n)}(t_k) = \left.\frac{d^n f(t)}{dt^n}\right|_{t=t_k}$, that is, the nth order derivative of $f(t)$ evaluated at $t = t_k$ (assuming that the derivatives exist).

Now if the function $f(t)$ is approximated in the interval $t_k \le t < t_{k+1}$ by the constant value $f(t_k)$ taken to be equal to $f(k)$, then

$$f(t) = f(t_k) \; (= f(k)), \quad t_k \le t < t_{k+1},$$

which is exactly the relation implemented by a ZOH. Note that here the zero-order derivative of the power series is used, which leads to an approximation by a constant, that is, a zero-degree polynomial.

It is clear that more than the first term in the power series can be taken to approximate $f(t)$. If, for example, the first two terms are taken, then

$$f(t) = f(t_k) + f^{(1)}(t_k)(t - t_k) = f(t_k) + \frac{f(t_k) - f(t_{k-1})}{t_k - t_{k-1}}(t - t_k) = f(k) + \frac{f(k) - f(k-1)}{t_k - t_{k-1}}(t - t_k)$$

for $t_k \le t < t_{k+1}$, where an approximation for the derivative $f^{(1)}(t)$ has been used. The approximation between $t_k$ and $t_{k+1}$ is a ramp with slope determined by $f(t_k) = f(k)$ and the previous value $f(t_{k-1}) = f(k-1)$. Here the first-order derivative of the power series is used, which leads to an approximation by a first-degree polynomial. A device that implements such an approximation is called a first-order hold (FOH). Similarly, we can define a second-order hold. Note that the formula of the above FOH is derived if we decide to use a first-degree polynomial to approximate $f(t)$ on $t_k \le t < t_{k+1}$ and then enforce $f(t_k) = f(k)$ and $f(t_{k-1}) = f(k-1)$. This approach is known as polynomial interpolation.
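The first-order hold formula can be sketched in a few lines of Python. The function name `foh_extrapolate` and the ramp test data are illustrative assumptions.

```python
def foh_extrapolate(f, times, k, t):
    """First-order hold on t_k <= t < t_{k+1}.

    Uses the current sample f(k) and the previous sample f(k-1) to form
    the slope, then extrapolates linearly from t_k.
    """
    if k < 1:
        raise ValueError("a first-order hold needs one previous sample")
    slope = (f[k] - f[k - 1]) / (times[k] - times[k - 1])
    return f[k] + slope * (t - times[k])

times = [0.0, 1.0, 2.0, 3.0]
f = [0.0, 2.0, 4.0, 6.0]                    # samples of the ramp f(t) = 2t
value = foh_extrapolate(f, times, 2, 2.5)   # extrapolate from t_2 = 2.0
```

For this ramp the FOH reproduces the signal between samples, whereas a ZOH would hold the value `f[2]` over the whole interval.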

Obtaining a continuous (or piecewise continuous) function from given discrete values may be seen as a continualization procedure. Contrast this with the discretization procedure introduced by sampling earlier in this section.



The continuous-time system with input $u(t)$ and output $y(t)$ together with the interface A/D and D/A converters can be seen as a system that receives a sequence of values $\{u(k)\}$ as its input and produces a sequence of output values $\{y(k)\}$. A digital controller can receive the system output $\{y(k)\}$ as input and produce a sequence $\{u(k)\}$.

Quantization: The sampled output $y(k) \in \mathbb{R}$ can take on an infinite number of values. In a digital device, however, a variable can take on only a finite number of values; this is because of the finite wordlength, that is, the finite number of bits in the registers. So for $\{y(k)\}$ to be used by a digital controller, an additional step is needed, that is, $y(k)$ needs to be quantized. Under quantization, for example, the values 2.315, 2.308, and 2.3 with a 0.1 quantization step are all represented as 2.3. Quantization is an approximation, and for short wordlengths (fewer quantization levels) it may lead to significant errors. Here we do not consider quantization.
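The numerical example above can be reproduced with a minimal uniform quantizer. Rounding to the nearest level is an assumption made here for illustration; the entry does not specify whether rounding or truncation is meant, and the name `quantize` is hypothetical.

```python
def quantize(value, step):
    """Uniform quantizer (assumed round-to-nearest).

    Returns the integer level index and the reconstructed value,
    so the level can be compared without floating-point noise.
    """
    level = round(value / step)
    return level, level * step

step = 0.1
levels = [quantize(v, step)[0] for v in (2.315, 2.308, 2.3)]
errors = [abs(v - quantize(v, step)[1]) for v in (2.315, 2.308, 2.3)]
```

All three values map to the same level, and each quantization error stays within half a step.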


Discrete-Time Models

Let a linear, continuous-time, time-invariant system be described by

$$\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t) + Du(t). \tag{3}$$

If we consider some initial time $t_k$, its state response for $t \ge t_k$ is

$$x(t) = e^{A(t-t_k)}x(t_k) + \int_{t_k}^{t} e^{A(t-\tau)}Bu(\tau)\,d\tau. \tag{4}$$

In view of (2), in a ZOH the input $u(t)$ will remain constant and equal to $u(t_k)$ $(= u(k))$ for a time period $t_{k+1} - t_k$. So

$$x(t) = e^{A(t-t_k)}x(k) + \left[\int_{t_k}^{t} e^{A(t-\tau)}B\,d\tau\right]u(k), \tag{5}$$

where $x(k) = x(t_k)$, $u(k) = u(t_k)$. For $t = t_{k+1}$, (5) becomes

$$x(k+1) = \bar{A}(k)x(k) + \bar{B}(k)u(k) \tag{6}$$

where $\bar{A}(k) = e^{A(t_{k+1}-t_k)}$ and $\bar{B}(k) = \int_{t_k}^{t_{k+1}} e^{A(t_{k+1}-\tau)}B\,d\tau$.

Consider now the output $y(t)$ and assume that it is sampled at times $t'_k$ that do not necessarily coincide with the instants $t_k$ at which the input is adjusted ($t_k \le t'_k < t_{k+1}$). Then if $y(k) = y(t'_k)$,

$$y(k) = \bar{C}(k)x(k) + \bar{D}(k)u(k), \tag{7}$$

where

$$\bar{C}(k) = Ce^{A(t'_k-t_k)}, \qquad \bar{D}(k) = C\left[\int_{t_k}^{t'_k} e^{A(t'_k-\tau)}\,d\tau\right]B + D.$$

In the case when, for all $k = 0, 1, 2, \dots$, $t'_k = t_k$ and $t_{k+1} - t_k = T$, a constant period called the sampling period, the sampled-data system is given by

$$x(k+1) = \bar{A}x(k) + \bar{B}u(k), \qquad y(k) = \bar{C}x(k) + \bar{D}u(k) \tag{8}$$

where

$$\bar{A} = e^{AT}, \qquad \bar{B} = \left[\int_0^T e^{A\tau}\,d\tau\right]B, \qquad \bar{C} = C, \qquad \bar{D} = D.$$


The intersample behavior of the continuous system can be determined using (5).
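The matrices of the sampled-data model (8) can be computed numerically. The following is a minimal pure-Python sketch that sums truncated power series for $e^{AT}$ and for $\int_0^T e^{A\tau}\,d\tau$; the helper names, the number of series terms, and the double-integrator test data are illustrative assumptions, and a production implementation would normally rely on a numerical library instead.

```python
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_scale(X, c):
    return [[c * a for a in row] for row in X]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def discretize_zoh(A, B, T, terms=30):
    """Return (A_bar, B_bar) for a ZOH input and sampling period T.

    A_bar = sum_j (A T)^j / j!
    B_bar = (sum_j A^j T^(j+1) / (j+1)!) B
    Both series are truncated after `terms` terms.
    """
    n = len(A)
    A_bar = identity(n)
    S = mat_scale(identity(n), T)        # running value of the integral
    term_a = identity(n)                 # (A T)^j / j!
    term_s = mat_scale(identity(n), T)   # A^j T^(j+1) / (j+1)!
    for j in range(1, terms):
        term_a = mat_scale(mat_mul(term_a, A), T / j)
        term_s = mat_scale(mat_mul(term_s, A), T / (j + 1))
        A_bar = mat_add(A_bar, term_a)
        S = mat_add(S, term_s)
    return A_bar, mat_mul(S, B)

# Double integrator with an assumed sampling period T = 0.1.
A_bar, B_bar = discretize_zoh([[0, 1], [0, 0]], [[0], [1]], 0.1)
```

For the double integrator the series terminate after two terms, so the result agrees with the closed-form matrices $[1 \; T; 0 \; 1]$ and $[T^2/2; T]$ up to rounding.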

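Example 1 Let the continuous-time system be given by (3) where

$$A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad C = [1 \;\; 0], \quad D = 0,$$

and let $T$ denote the sampling period. The transfer function of the continuous-time system is $H(s) = C(sI - A)^{-1}B = 1/s^2$, the double integrator. The discrete-time state-space representation of the system, which represents the continuous-time system preceded by a zero-order hold (D/A converter) and followed by a sampler [an (ideal) A/D converter], both sampling synchronously at a rate of $1/T$, is given by $x(k+1) = \bar{A}x(k) + \bar{B}u(k)$, $y(k) = \bar{C}x(k)$, where

$$\bar{A} = e^{AT} = \sum_{j=0}^{\infty} (T^j/j!)A^j = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}T = \begin{bmatrix} 1 & T \\ 0 & 1 \end{bmatrix},$$

$$\bar{B} = \left[\int_0^T e^{A\tau}\,d\tau\right]B = \begin{bmatrix} T & T^2/2 \\ 0 & T \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} T^2/2 \\ T \end{bmatrix},$$

$$\bar{C} = C = [1 \;\; 0].$$

The transfer function (relating $y$ to $u$) is given by

$$H(z) = \bar{C}(zI - \bar{A})^{-1}\bar{B} = [1 \;\; 0]\begin{bmatrix} z-1 & -T \\ 0 & z-1 \end{bmatrix}^{-1}\begin{bmatrix} T^2/2 \\ T \end{bmatrix} = [1 \;\; 0]\begin{bmatrix} 1/(z-1) & T/(z-1)^2 \\ 0 & 1/(z-1) \end{bmatrix}\begin{bmatrix} T^2/2 \\ T \end{bmatrix} = \frac{T^2(z+1)}{2(z-1)^2}.$$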

If we focus on single-input, single-output systems and consider ideal sampler A/D and ZOH D/A, then given the transfer function $G(s)$ of the continuous system, there is a direct formula to determine the transfer function of its discrete approximation $H(z)$, namely,

$$H(z) = (1 - z^{-1})\,Z\{G(s)/s\}. \tag{9}$$

Here $Z\{G(s)/s\}$ means that first the inverse Laplace transform of $G(s)/s$ is taken to obtain $f(t) = \mathcal{L}^{-1}[G(s)/s]$. The function $f(t)$ is then sampled to obtain $f(kT)$, $k = 0, 1, 2, \dots$, and the z-transform of $f(kT)$ is evaluated. To illustrate, in the above example $G(s) = 1/s^2$, $G(s)/s = 1/s^3$, and $f(t) = \mathcal{L}^{-1}(1/s^3) = \frac{1}{2}t^2$, $t \ge 0$. Then

$$H(z) = (1 - z^{-1})\,Z\{\tfrac{1}{2}(kT)^2\} = \frac{T^2}{2}(1 - z^{-1})\,Z\{k^2\} = \frac{T^2}{2}\cdot\frac{z-1}{z}\cdot\frac{z(z+1)}{(z-1)^3} = \frac{T^2}{2}\,\frac{z+1}{(z-1)^2}$$

as before.
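As a numerical cross-check of the two routes to $H(z)$ for the double integrator, the following sketch evaluates the state-space expression, formula (9), and the closed form at one test point. The function names and the test values of $z$ and $T$ are illustrative assumptions; the z-transform pair $Z\{k^2\} = z(z+1)/(z-1)^3$ is the standard table entry.

```python
def h_state_space(z, T):
    """C_bar (zI - A_bar)^{-1} B_bar for the sampled double integrator."""
    # zI - A_bar = [[z - 1, -T], [0, z - 1]] is upper triangular.
    a11, a12, a22 = z - 1, -T, z - 1
    det = a11 * a22
    # First row of the inverse is [a22, -a12] / det; C_bar = [1, 0] picks it.
    return (a22 * (T**2 / 2) - a12 * T) / det

def h_formula(z, T):
    """Eq. (9) with f(kT) = (kT)^2 / 2 and Z{k^2} = z (z + 1) / (z - 1)^3."""
    return (1 - 1 / z) * (T**2 / 2) * z * (z + 1) / (z - 1) ** 3

def h_closed_form(z, T):
    return T**2 * (z + 1) / (2 * (z - 1) ** 2)

z0, T0 = 2 + 1j, 0.1   # arbitrary test point away from the pole at z = 1
values = (h_state_space(z0, T0), h_formula(z0, T0), h_closed_form(z0, T0))
```

The three values agree to rounding error, which is consistent with the derivation in Example 1.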


Summary 

Sampled-data systems arise in the digital control of systems and include both continuous and discrete-time dynamics. Discrete-time approximations of continuous-time systems using ideal samplers and ZOH devices were derived using state variable descriptions. Extensions include quantization and lead to hybrid dynamical systems which include both continuous and discrete variable dynamics.

A variation of the approach described in this entry of deriving sampled-data systems uses the discrete-time delta operator. This approach has the advantage that as the sampling period $T \to 0$, the discrete-time model reverts to the original continuous-time model, which is not the case with the more common approach described above.
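The limiting behavior can be illustrated on a scalar system $\dot{x} = ax + bu$. The sketch below assumes the usual definition of the delta operator, $\delta x(k) = (x(k+1) - x(k))/T$, which the entry does not spell out; the function names and numerical values are illustrative.

```python
import math

def shift_model(a, b, T):
    """ZOH model x(k+1) = a_bar x(k) + b_bar u(k) of dx/dt = a x + b u."""
    a_bar = math.exp(a * T)
    b_bar = math.expm1(a * T) / a * b     # (e^{aT} - 1) b / a, for a != 0
    return a_bar, b_bar

def delta_model(a, b, T):
    """Delta form (x(k+1) - x(k)) / T = a_delta x(k) + b_delta u(k)."""
    a_delta = math.expm1(a * T) / T       # (a_bar - 1) / T
    b_delta = shift_model(a, b, T)[1] / T
    return a_delta, b_delta

a, b = -2.0, 3.0
shift_small_T = shift_model(a, b, 1e-6)   # tends to (1, 0), not (a, b)
delta_small_T = delta_model(a, b, 1e-6)   # tends to (a, b)
```

With a very small sampling period the shift-form coefficients approach 1 and 0, while the delta-form coefficients approach the continuous-time values $a$ and $b$.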


Cross-References 

► Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions

► Linear Systems: Discrete-Time, Time-Invariant 
State Variable Descriptions 
Recommended Reading

State variable and transfer function descriptions are covered in a variety of textbooks including Antsaklis and Michel (2006), Kailath (1980), Chen (1984), and DeCarlo (1989). For additional material on sampled-data systems, refer to Åström and Wittenmark (1990), Franklin et al. (1998), Jury (1958), and Ragazzini and Franklin (1958).
Abstract

Spacecraft control systems are described for single and distributed space systems. The attitude dynamics is formulated including flexible and sloshing phenomena, followed by a description of attitude sensors and actuators. $H_\infty$ and robust controls are formulated as signal-based two degree-of-freedom control architectures. The equations are given for the relative motion dynamics between spacecraft on elliptical orbits with the generic Yamanaka-Ankersen state transition matrix. Formulations are provided for rendezvous and docking scenarios and formation flying control, maneuvers, avionics, and laser metrology systems together with the onboard autonomy needs.

Keywords 

Flexible modes; Formation flying; Fractionated spacecraft; $H_\infty$ control; Multivariable systems; Relative dynamics; Rendezvous and docking; Robust control; Sloshing; Spacecraft attitude control; Spacecraft position control

Introduction 

This entry explains the control needs of spacecraft after they have been separated from the launch vehicle and injected onto their initial orbit.

Actuators and sensors are explained followed by the control objectives. The state-of-the-art control techniques and architectures are addressed.

Spacecraft are classically well-known physical systems that can be described by first principles. The advantage is fairly precise plant models and uncertainty characterization of physical parameters. This is well suited for a model-based control design approach.

Mission Types 

From a control point of view, space missions can be split into two main categories according to which physical states need to be controlled:

Attitude Control: This is needed by any spacecraft irrespective of the mission objectives. Such missions are typically low earth orbit (LEO) missions for astronomy, observations, and, in higher orbits, constellations for navigation and communication. Further, there are interplanetary and planetary exploration science missions. The pointing requirements vary from a few degrees to milliarcseconds.

Relative Position Control: Within distributed space systems, this is relevant for rendezvous and docking (RVD) and formation flying (FF) missions. It leads to a 6 degree-of-freedom (DOF) control problem as the relative attitude is also needed. The former is mostly for missions to space station logistics infrastructures and the latter for scientific missions. Relative position can also be required during the final stages of controlled planetary landings. Another category is missions with ultrahigh control performance requirements, where the spacecraft platform and the science instrument need to be considered as one coupled system.


Attitude Control 


Fundamentally the three attitude angles $\theta$ and angular rates $\omega$ need to be controlled to a certain reference. See Fig. 1 for definition.

The general rigid body dynamics expressed in a rotating frame (*), which is mostly the case when orbiting a central body, can be expressed as

$$N = \frac{d^*(I\omega)}{dt} + \omega \times I\omega \tag{1}$$

where $I$ is the constant inertia matrix, $\omega$ is the inertial angular velocity, and $N$ is the torque acting on the spacecraft (Wie 1998).

The kinematics can be described by one of the 12 sets of Euler angles (can have singularities) or the hypercomplex quaternion vector (no singularities) (Hughes 1986).

The dynamics and kinematics equations need to be linearized and are in the general form of a coupled 12th order system. It is the fundamental model for the rigid body spacecraft control design.

Most modern spacecraft have large flexible appendages in the form of solar panels and large antenna reflectors. Fuel sloshing is a similar lightly damped oscillatory phenomenon, which often needs to be taken into consideration. The incorporation of dynamic elements such as flexible panels, antennae, and sloshing fuel can be modeled by Eqs. (2) and (3) provided the overall rotation rate $\omega$ and linear accelerations $\ddot{x}$ are not too large.

$$M_t \begin{bmatrix} \ddot{x} \\ \dot{\omega} \end{bmatrix} + \sum_k L_k \ddot{\eta}_k = \begin{bmatrix} F \\ N \end{bmatrix} \tag{2}$$

$$\ddot{\eta}_k + 2\zeta_k\Omega_k\dot{\eta}_k + \Omega_k^2\eta_k = -\frac{1}{m_k} L_k^T \begin{bmatrix} \ddot{x} \\ \dot{\omega} \end{bmatrix} \tag{3}$$

Satellite Control, Fig. 1 Spacecraft body (black) and reference (red) frames. The frames coincide for $\theta = 0$
where

$M_t$ : rigid body mass/inertia matrix
$\ddot{x}, \dot{\omega}$ : linear and angular acceleration
$F, N$ : forces and torques on the spacecraft
$\eta_k$ : the kth flexible state
$\zeta_k$ : the kth flexible damping factor
$\Omega_k$ : the kth flexible eigenfrequency
$m_k$ : the kth modal mass (normalized to 1)
$L_k$ : participation matrix of the kth mode


For attitude only the second row of Eq. (2) is 
needed, but translation is included here for the 
sake of completeness and later use. 
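For the rigid-body part, Eq. (1) can be rearranged for the angular acceleration when the inertia matrix is constant. The sketch below assumes a diagonal inertia matrix (body axes aligned with the principal axes) purely for brevity; the function names and numerical values are illustrative.

```python
def cross(a, b):
    """Vector cross product a x b for 3-element lists."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def angular_acceleration(inertia_diag, omega, torque):
    """Solve N = I w_dot + w x (I w) for w_dot, with I diagonal."""
    h = [i * w for i, w in zip(inertia_diag, omega)]   # angular momentum I w
    gyro = cross(omega, h)                             # gyroscopic term w x (I w)
    return [(n - g) / i for n, g, i in zip(torque, gyro, inertia_diag)]

# Torque-free spin about a principal axis: no angular acceleration.
steady = angular_acceleration([10.0, 20.0, 30.0], [0.0, 0.0, 0.1], [0.0, 0.0, 0.0])

# Torque-free motion with rates on two axes: the gyroscopic term couples them.
coupled = angular_acceleration([1.0, 2.0, 3.0], [1.0, 1.0, 0.0], [0.0, 0.0, 0.0])
```

The second case shows the cross-coupling between axes that makes the attitude dynamics multivariable even without external torque.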

The sensors utilized are typically gyroscopes for measuring the inertial angular rate, sun sensors to measure orientation at low accuracy, and star trackers for high-precision angular attitude measurements. All of those sensors are linear in their normal operational range and it suffices to use bias noise models for synthesis. Gyros do need a drift estimation and compensation to function properly over longer time. All sensors utilize redundancy for providing measurements around all three axes as well as providing fault tolerance. Some scientific observatory spacecraft use their telescopes for attitude measurements in order to obtain the required precision beyond the capability of star trackers.

The actuators producing pure torques are magnetic torquers, reaction wheels, and control moment gyros. The last can produce large torques used for rapid slew maneuvers with little power. The last two types exhibit nonlinear behavior around low to zero speed due to friction. They accumulate angular momentum from asymmetric disturbances. This leads to a need for thrusters for angular momentum off-loading. Thrusters are also used to control the attitude directly on many spacecraft. They are mostly of on-off type, though continuous ones exist, and will need to be pulse width modulated (PWM) to obtain quasi-linear behavior. The nonlinear on-off nature needs to be taken into account in the closed-loop control analysis. It is done by use of the negative inverse describing function (Ogata 1970) for stability analysis and nonlinear modeling for verification simulations in the time domain. For larger numbers of thrusters, an optimization-based selection algorithm is applied to the controller output.


Before using the plant model in Eq. (2) for a flexible spacecraft, a simpler multivariable model of a rigid spacecraft is used as in Eq. (4):

$$\dot{x} = \begin{bmatrix} 0 & B_k \\ 0 & A_d \end{bmatrix} x + \begin{bmatrix} 0 \\ B_d \end{bmatrix} N \tag{4}$$

where $x = [\theta_x, \theta_y, \theta_z, \omega_x, \omega_y, \omega_z]^T$, $B_k$ is identity, $B_d = I^{-1}$, and $A_d$ is the general Jacobian for the dynamics having a real right half-plane (RHP) pole. See Ankersen (2011). The model describes the angular deviation from some reference frame, whose orientation can be arbitrary. It uses the Euler (3,2,1) rotation in the kinematics.

The state of the art of attitude control is today mostly based on $\mathcal{H}_\infty$ type of robust controllers with synthesis performed in the frequency domain. Requirements are often specified in the time domain, but formal methods exist to transform them into frequency domain weighting functions (ESA Handbook 2011) enhancing both synthesis and analysis. System uncertainties can be formulated as structured linear fractional transformations (LFT) with a general control configuration as illustrated in Fig. 2.

Commonly the $\mathcal{H}_\infty$ controller $K$ is designed, and the lower loop in Fig. 2 is closed via a lower LFT such that $N = F_l(P, K)$ and robust stability (RS) and robust performance (RP) analysis is performed on the $N$, $\Delta$ system (Skogestad and Postlethwaite 1996).



Satellite Control, Fig. 2 Robust control formulation, where $\Delta$ is the structured uncertainty, $K$ is the controller, $P$ the partitioned formulation of the plant with weights, and $w$ and $z$ are exogenous inputs and outputs, respectively

On high performance pointing spacecraft, active vibration suppression of, e.g., cryocoolers is needed. The implementation of control design and recursive system identification can achieve significantly better attenuation compared to classical passive isolation techniques.

Lately optimization-based codesign of structures and control has been performed successfully. A joint performance function is formulated (mass, stiffness, pointing, fuel, etc.) and an optimization is performed (differential evolution algorithm) iterating on control design and finite element models (FEM). A $\mu$-synthesis controller is synthesized, the pointing performance is fulfilled, and 15-20% mass saving is obtained on the flexible structures. The entire process is fully automated (Falcoz et al. 2013).

Relative Position Control 

For all distributed space systems, relative dynamics is important. Rendezvous and formation flying missions need tracking or maintenance of the desired relative separation, orientation, and position between or among the spacecraft. This is common and independent of the mission type and will be described in general terms ahead of the specific RVD and FF missions.

The general relative position dynamics between centers of mass (COMs) is in Eq. (5), where it is observed that the in-plane motion $(x, z)$ is decoupled from the out-of-plane motion $(y)$.

$$\begin{aligned} \ddot{x} - \omega^2 x - 2\omega\dot{z} - \dot{\omega}z + k\omega^{3/2}x &= \frac{1}{m_c}F_x \\ \ddot{y} + k\omega^{3/2}y &= \frac{1}{m_c}F_y \\ \ddot{z} - \omega^2 z + 2\omega\dot{x} + \dot{\omega}x - 2k\omega^{3/2}z &= \frac{1}{m_c}F_z \end{aligned} \tag{5}$$

where $\omega = \omega(t)$ is the orbital angular rate, $m_c$ is the chaser mass, $F_{x,y,z}$ is the force on the chaser, and $k$ is a constant determined by the orbit. Equation (5) is valid for any Keplerian orbit with eccentricity $e < 1$.
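For intuition, Eq. (5) can be specialized to a circular orbit and solved for the accelerations. The sketch below assumes that on a circular orbit $\omega$ is constant ($\dot{\omega} = 0$) and that $k\omega^{3/2}$ equals $\omega^2$; both are assumptions stated here for illustration rather than results taken from the entry, and the function name and numbers are hypothetical.

```python
def relative_acceleration(state, omega, force=(0.0, 0.0, 0.0), mass=1.0):
    """Accelerations from Eq. (5) under the circular-orbit assumptions.

    Assumes omega is constant and k * omega**1.5 == omega**2.
    state = (x, y, z, vx, vy, vz) in the rotating orbital frame.
    """
    x, y, z, vx, vy, vz = state
    fx, fy, fz = force
    ax = 2.0 * omega * vz + fx / mass
    ay = -omega**2 * y + fy / mass
    az = -2.0 * omega * vx + 3.0 * omega**2 * z + fz / mass
    return ax, ay, az

omega = 0.0011   # rad/s, roughly a low Earth orbit (illustrative value)
hold_on_vbar = relative_acceleration((100.0, 0.0, 0.0, 0.0, 0.0, 0.0), omega)
offset_on_rbar = relative_acceleration((0.0, 0.0, 50.0, 0.0, 0.0, 0.0), omega)
```

A chaser at rest on the x-axis experiences no relative acceleration under these assumptions, whereas an offset along the z-axis produces a drift that must be countered by thrust. The out-of-plane motion is a simple oscillator, decoupled from the in-plane motion as noted above.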

The Yamanaka-Ankersen equations (Yamanaka and Ankersen 2002) provide the generalized homogeneous solution in the form of the transition matrix $\Phi$, where the solution can be written as

$$x(t) = \Lambda^{-1}(\nu)\,\Phi(\nu)\,\Phi^{-1}(\nu_0)\,\Lambda(\nu_0)\,x(t_0) \tag{6}$$

where $\nu$ is the orbital true anomaly and $\Lambda$ are transformation matrices to and from the time domain. The elements of $\Phi$ in Eq. (6) are detailed in Ankersen (2011), where relevant particular solutions are also to be found. Equation (6) reduces to the well-known Clohessy-Wiltshire equations for circular orbits ($e = 0$) (Clohessy and Wiltshire 1960). Equation (6) is used for feedforward control and trajectory propagation in the guidance function. During the final approach (see Fig. 3), a model accounting for the docking port-to-port relative position and the couplings from the relative attitude to the position is utilized and formulated in Eqs. (7) and (8) (Ankersen 2011):


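$$\dot{x} = \begin{bmatrix} A_p & 0 \\ 0 & A_c \end{bmatrix} x + \begin{bmatrix} B_p & 0 \\ 0 & B_c \end{bmatrix} u \tag{7}$$

$$y = \begin{bmatrix} I & 0 & B_{dc1} & 0 \\ 0 & I & 0 & B_{dc2} \\ 0 & 0 & I & 0 \\ 0 & 0 & 0 & I \end{bmatrix} x \tag{8}$$

where $x = [x_p, \dot{x}_p, \theta_c, \omega_c]^T$, $y = [x_{pp}, \dot{x}_{pp}, \theta_c, \omega_c]^T$, index $p$ refers to COM positions, index $c$ to chaser attitude, index $pp$ to port-to-port position, and $B_{dc1}$, $B_{dc2}$ are the coupling matrices of the docking port.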

A relative motion scenario for a typical RVD mission is shown in Fig. 4. During the final approach (<300 m range), the chaser relative attitude and relative position are controlled. During the other phases, the chaser attitude is Earth pointing and the relative position is controlled at the station-keeping (SK) points, $s_0, \dots, s_4$ in Fig. 4. The trajectories are typically open loop feedforward controlled (often with midcourse corrections).

The avionics sensors for the attitude control part are generally similar to those described earlier under attitude control in connection with Fig. 1. Active laser CCD type of sensors are used to measure the relative position (range and line-of-sight (LOS) angles) and at short range (<50 m) the relative attitude. They require a target pattern to provide precise measurements at short range. Accelerometers are used, particularly for pulsed maneuvers. The next generation of RVD GNC systems, test flown, will utilize Lidar, infrared cameras, and visual cameras in combination with advanced image processing providing RVD capabilities with both cooperative and passive target spacecraft.

Satellite Control, Fig. 3 Definition of COM-to-COM and port-to-port positions, $s$ and $s_{pp}$, respectively, between two spacecraft

Satellite Control, Fig. 4 This figure shows the phases of a typical relative motion approach. The shaded area is a keep-out zone (KOZ) defined for safety reasons. V-bar is the x-axis and R-bar is the z-axis

The actuators are mostly thrusters arranged to achieve controllability for all the 6DOF maneuvers needed. Based upon the controller output, the active thrusters are selected by means of some type of fuel optimization algorithm. The selected thrusters are then pulse width modulated (PWM) within the sampling time.

The controllers are frequently of multivariable $\mathcal{H}_\infty$ type. They are similar to what is described in connection with Fig. 2. Flexible modes and in particular sloshing need to be taken into account using Eq. (2). Sloshing pendulum models are used during boost maneuvers and spring mass damper models during other modes. The couplings between relative attitude and relative position in Eq. (8) can be analytically decoupled setting the matrix $C$ to identity and premultiplying with a decoupling matrix $V_d$, such that

$$V_d C = I \;\Leftrightarrow\; V_d = C^{-1} \tag{9}$$
Satellite Control, Fig. 5 Principal structure of the 2 degree-of-freedom controller



and by the inversion theorem for partitioned matrices the upper right partition just changes sign.
The designed controller then needs to be premul¬ 
tiplied by 1 , which facilitates a simpler control 
design maintaining the 6DOF performance after 
2 times 3DOF synthesis. 
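The sign change of the upper right partition can be checked on a small numerical instance of the output matrix in Eq. (8). In this sketch each block is reduced to a scalar, which is an assumption made only to keep the example short; the function names and coupling values are illustrative.

```python
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def coupling_matrix(b1, b2):
    """Output matrix of Eq. (8) with scalar blocks: [[I, B], [0, I]]."""
    return [[1.0, 0.0, b1, 0.0],
            [0.0, 1.0, 0.0, b2],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def decoupling_matrix(b1, b2):
    """Inverse of the coupling matrix: only the coupling terms change sign."""
    return coupling_matrix(-b1, -b2)

C = coupling_matrix(0.3, -1.2)
V_d = decoupling_matrix(0.3, -1.2)
product = mat_mul(V_d, C)   # should be the identity, as in Eq. (9)
```

Because the matrix is block upper triangular with identity blocks on the diagonal, no numerical inversion is needed to obtain the decoupling matrix.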

A 2 degree-of-freedom control architecture as in Fig. 5 is beneficial since much of the performance is achieved by controller Ki. The structure of the synthesis formulation is a signal-based model-reference configuration for the $\mathcal{H}_\infty$ control rather than the more classical mixed sensitivity type. It has proven to have higher robustness and performance for this type of application. As an example, consider a controller that has to follow a sawtooth motion of the docking port of the International Space Station (ISS) with an amplitude of 0.4 m and reversal times of 8 s. The signal-based model-reference controller manages to track such a motion with errors less than 0.01 m compared to the best operational performance of 0.08 m.

Formation flying usually includes more than 
two spacecraft with the need to be controlled 
relative to each other. The objective of FF is to 
form an instrument in space, not possible with 
fixed structures, like a synthetic aperture or an 
interferometer of large size. 

The performance needs are high and require innovative high-precision (<1 µm) metrology sensors. They are based on divergent laser beams for the coarse part to be able to transit from lower to higher accuracy. The fine metrology uses a laser beam and internal interferometers to reach the µm domain. Actuators are in the range of µN thrust, which can be achieved with either cold gas or electrical propulsion thrusters.

The maneuvers realized by entire formations 
are rotation, resizing, and slew while maintaining 
the formation in most cases (Alfriend et al. 2010). 


Formation flying missions with the highest performance requirements have optical payloads, which need to have internal control loops at component level. To reach the performance required for applications such as optical interferometry, the formation and payload must be considered as one system. The synthesis of a multivariable controller then handles all the cross couplings in the system needed to reach performance. Beyond flexible modes, such systems might also have a need for active vibration damping for systems using cryocoolers.

The GNC architecture is often centralized for nominal science operational modes. For the formation deployment and contingency situations, a decentralized control architecture is needed. This leads to a dual architecture GNC system in general for formation flying systems. The onboard autonomy needs to be fairly high in order to cope with the contingencies in the formation without ground intervention.

Finally there is an emerging concept of 
fractionated spacecraft. There, a formation 
consists of a large number of small simple 
vehicles maneuvering relative to each other fully 
autonomously based upon the nearest neighbor 
knowledge and not necessarily information about 
the entire formation (Cornford 2012). 

Summary and Future Directions 

The control of spacecraft has been described for pure attitude control needs and for spacecraft performing relative proximity maneuvers like rendezvous and formation flying. The focus has been on sensors, actuators, dynamics, and the robust control methods applied today.

The further development direction of the field is expected to be increased onboard autonomy with replanning capabilities and fault-tolerant GNC designs. Model predictive control (MPC) will enter in particular on the guidance functions. More integrated GNC system-level designs, of multidisciplinary nature, are expected.


Cross-References 

► Fault-Tolerant Control 

► H-Infinity Control 

► Model-Predictive Control in Practice 

► Nominal Model-Predictive Control 
Abstract

For manufacturers operating batch plants, production scheduling is a critical and challenging problem. A thorough understanding of the problem and the variety of solution approaches is needed to achieve a successful application. This entry will present a brief overview of batch operations and the state of the art of batch plant scheduling for nonexperts in the field.

Keywords 

Dispatching rules; Optimization; Process 
networks; Production sequencing; Product wheel 

Introduction 

Batch plants, manufacturing operations composed of unit operations that operate in batch mode, are the primary manufacturing operations for the production of high margin products such as pharmaceuticals, specialty chemicals, and advanced materials. The scheduling of the sequence of operations over time has a significant impact on the overall performance of a batch plant (White 1989). The economic importance of batch plants, and the importance of scheduling for batch plants, has spawned a large body of research on the topic and a variety of commercial offerings.

The Nature of Batch Plants 

In batch operations, the material transformation takes place in stages and the operation of each stage occurs over a specified time while the material remains in a particular unit operation performing that stage of production. (A familiar batch operation is baking a cake. Ingredients and their amounts, specified by a recipe, are combined and then subjected to a constant temperature over a specified period of time to produce a cake.) A batch plant may have parallel units for some stages. Other stages may be operated in a continuous flow mode with a storage unit feeding the stage and another storage unit receiving the stage output. The path through the unit operations may be product dependent. Batch plants have highly diverse operational characteristics.

There are two broad categories of batch processes: (1) sequential, where a batch moves from one stage to another without losing its identity, and (2) networked, where batches can be combined or split to feed downstream units (Méndez et al. 2006). Sequential processes can be further classified as single stage, multi-stage, or multi-purpose.

The nature of a batch process and the different process structures can be explored by referring to the process depicted in Fig. 1 (Chu et al. 2013). As drawn, this batch plant operates as a multi-stage sequential process where a batch starts in the raw material preparation stage (selected raw materials are loaded and then blended for a specified time), moves to the reaction stage with two parallel units (prepared raw materials plus additives react at a constant temperature for a specified period of time), moves to the finishing stage (intermediate product is subjected to a vacuum for a specified period of time to remove volatile by-products), and finally is processed in the drumming stage (finished product is packaged in drums). If finished product storage tanks were placed between finishing and drumming to allow the drumming stage to be scheduled independently of the first three stages, then the drumming operation would represent a single stage sequential process. If we further assume that for some finished products Reactor 1 produces a batch of precursor for Reactor 2 and that some products produced in the reactors bypass the finishing stage and go directly to drumming, then the underlying plant would be a multi-purpose sequential process. Finally, if intermediate storage tanks exist for storing multiple batches of the precursors produced by Reactor 1 and the contents of the tanks are drawn off to produce multiple, subsequent batches in both reactors, then the underlying plant is a networked process.

Besides the general structure of a batch plant, the specific processing requirements, resource needs, and process constraints have significant impact on the complexity of the scheduling problem. One important aspect is limited resources that are shared between different operations. The availability and capacity of shared resources place a severe constraint on the timing of competing operations. Another significant factor is intermediate storage between stages and the inventory policies that are enforced. Like shared resources, intermediate storage places hard constraints on the timing of upstream and downstream stages, especially when no storage is available. A third important constraint on scheduling is product transition policies that dictate what operations need to be performed to move from one product to another in a given stage. Such operations, sometimes called setups, might involve cleaning, or producing buffer batches to isolate the chemistry of one product from another. These operations involve costs and subtract from the productive use of the equipment, so they have significant impact on the sequencing of products through the plant.

Scheduling of Batch Plants, Fig. 1 Example batch plant (Solid lines represent material flows from limited inventory. Dashed lines represent material flow from unlimited inventory)

Production Scheduling of Batch 
Plants 

Production scheduling in a batch plant involves three fundamental
decisions: (1) determining the size of each batch in each stage,
(2) assigning a batch to a processing unit in each stage, and
(3) determining the sequence and timing of processing on each
unit. These decisions are well illustrated by a graphical planning
board or Gantt chart as shown in Fig. 2 (Chu et al. 2013).
Personnel charged with creating and managing production schedules
often rely on such a graphical tool to construct, analyze, and
report the schedule. Generally, production schedules are
determined using the information listed in Table 1.

The scope of the scheduling decisions is defined by the level of
process detail considered in the scheduling problem. This idea can
be examined by referring to Figs. 1 and 2. Such a Gantt chart
could apply to a batch plant with four stages of production: raw
material preparation, reaction, finishing, and drumming, with two
parallel reactors in the reaction stage. If dedicated finished
product storage exists with large enough capacity to cover the
process lead time, then one schedule could be confined to the
first three stages of production and a different schedule applied
to the drumming stage. The scope of the scheduling problem could
be further reduced if raw material preparation only takes place
just in time to load a reactor rather than executing as soon as
possible. In this situation, the time for raw material preparation
could be added to the reactor batch time and the schedule would
involve only the reactors and the finishing system, with the
schedule of the raw material unit or units implied by the reactor
schedule. At a higher level still, the first three stages of
production could be considered a production train and scheduling
could then be reduced to planning campaigns of batches for each
product over time, with the detailed synchronization of the
individual stages left to operations personnel. With each level of
abstraction, some efficiency in the schedule is lost and, with it,
some of the opportunity to increase the throughput of the plant.

Scheduling of Batch Plants, Fig. 2 Gantt chart of a production schedule

Scheduling of Batch Plants, Table 1 Information generally used to construct a production schedule

Scheduling information        Examples
Detailed production recipes   Batch times, processing rates, unit ratios, sequence dependencies
Equipment data                Capacities, availabilities, product suitability
Facility information          Shared resource availability and capacities, storage capacities
Production costs              Raw materials, utilities, setups, cleanings, manpower
Production targets            Inventory replenishments, customer orders with due dates
Current process status        Current inventories, operations in progress, schedule items fixed in future time

In most batch plants a person with a title such as "production
scheduler" is charged with the scheduling decisions. In general,
the production scheduler is responsible for delivering a
production schedule that meets customer orders on time and
maintains finished product inventory while dealing with rush
orders, late deliveries, equipment breakdowns, and other
contingencies. Generally, schedulers develop and publish a
schedule to manufacturing on a regular basis (e.g., every 2 days,
once a week, etc.) and then monitor ongoing circumstances (e.g.,
actual production vs. plan, new demand, etc.) to determine if
minor adjustments to the schedule are needed or if a complete new
schedule needs to be published. The construction of a schedule can
be an iterative process involving negotiations with manufacturing,
supply chain, sales, maintenance, and logistics. The tools
available to the production scheduler can have a significant
impact on the quality of schedules they produce.

It is evident from the description above that production
scheduling of batch plants is really carried out as an exercise in
rescheduling in response to disturbances identified through
feedback from the process and market. Under these circumstances
production scheduling serves as a form of high level feedback
control of the process. In this regard the manipulated variables
are the production amounts for each product and the controlled
variables are the inventory levels and customer service levels for
each product. A scheduling problem can be converted to a
state-space formulation and compared to model predictive control
(Subramanian et al. 2012).
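
To make the control analogy concrete, a minimal inventory balance can be written in discrete-time state-space form. This is an illustrative sketch only and is not the formulation of Subramanian et al. (2012); the symbols $I_i(k)$, $u_i(k)$, and $d_i(k)$ are assumptions introduced here.

```latex
% I_i(k): inventory of product i at the start of period k (controlled variable)
% u_i(k): amount of product i produced in period k      (manipulated variable)
% d_i(k): demand for product i in period k              (disturbance)
I_i(k+1) = I_i(k) + u_i(k) - d_i(k), \qquad i = 1,\dots,N .
```

Under this reading, publishing a new schedule after observing the current inventories $I_i(k)$ plays the role of recomputing the control moves $u_i(k), u_i(k+1), \dots$ from the measured state.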

Solution Approaches 

The solution approaches applied to scheduling
batch plants cover a wide spectrum of sophistication.
A very simple form is nothing more
than a sequence of batches maintained on a
whiteboard in the plant control room. A level above
this would be the use of custom spreadsheets for 
arranging batches chronologically and computing 
finished product inventory. Another step up is the 
use of a manually manipulated Gantt chart as 
illustrated in Fig. 2, possibly pre-populated by an 
automated planning application that determines 
the volume to be produced across the units of 
production while leaving the detailed sequencing 
and timing decisions to the production scheduler. 
The highest level of sophistication involves an
automatically generated schedule with the application
retrieving all the necessary data from the
appropriate business databases and plant control 
system. 

Regardless of the level of sophistication, all solution approaches
rely on two fundamental components for developing a schedule. One
is the modeling paradigm used to represent the physical system in
a more abstract way. Its primary components are material balances,
expressed in terms of batches or units of measure (e.g., pounds),
and timing information, which is either precedence-based
(describing the order of operations) or time grid-based
(describing the instant at which any operation takes place). Time
can be described either by a continuous representation or by
division into discrete increments. Within these two aspects of the
modeling framework, significant freedom exists to describe the
scheduling problem. The second fundamental component is the
solution method used to generate the schedule. Each method has its
strengths; therefore solutions combining methods are also used.
The essential problem is to produce the information needed to draw
the Gantt chart in Fig. 2 given the information in Table 1.

Product Wheel

While the primary objective of production 
scheduling is to meet customer orders while 
managing finished product inventory, other 
operational issues need to be managed, such as 
minimizing product transition costs, minimizing 
variability in manufacturing operations, keeping 
the scheduling process simple, and balancing 
the tradeoff between production lead times, 
inventory, and transition losses. The product 
wheel is a practical approach widely used in 
industry to address these competing issues. A 
product wheel is a regular repeating sequence 
of products made on a specific unit operation or 
an entire production process. A product wheel 
is typically depicted as a pie chart as shown 
in Fig. 3. Segments of the pie, called spokes 
of the wheel, represent a production campaign 
of a particular product. The size of the spoke 
represents the length of the campaign relative to 
the overall duration, or cycle time, of the wheel. 

A product wheel has specific design parameters
to address various operations objectives. The
sequence is fixed and optimized for minimum
transition costs. The overall cycle time is fixed
and optimized to balance lead time and inventory 
costs. The campaign size or spokes for each 
product are sized to match average demand for 
each product. The fixed pattern of the product 
wheel provides manufacturing with a predictable 
operational rhythm and the production scheduler 
with a very structured decision framework. Refer 
to King and King (2013) for a complete treatment 
of product wheels. 

In practice, the duration of a campaign for a 
given product will vary from cycle to cycle as it 
will be sized to replenish any inventory consumed
in the previous cycle. Low volume products may 
not be made on every cycle, although they will 
have a fixed location in the sequence. This same 
approach applies to make-to-order products that 
are not inventoried but produced to fill specific 
orders. Thus, in some cases a product wheel may 
be composed of several different but repeating 
cycles. 
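
The sizing logic described above can be sketched in a few lines. This is a minimal illustration, not the design procedure of King and King (2013): it assumes each spoke simply covers the average demand accumulated over one wheel cycle, and the names `Product`, `size_wheel`, and all numerical values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    demand_per_day: float    # average demand, units/day (assumed known)
    rate_per_day: float      # production rate while running, units/day
    changeover_days: float   # transition time before the campaign starts


def size_wheel(products, cycle_days):
    """Return (spoke durations in days, idle days) for one turn of the wheel."""
    spokes = {}
    for p in products:
        # Each campaign must cover the demand accumulated over one full cycle.
        spokes[p.name] = p.demand_per_day * cycle_days / p.rate_per_day
    busy = sum(spokes.values()) + sum(p.changeover_days for p in products)
    if busy > cycle_days:
        raise ValueError("cycle time too short for demand plus changeovers")
    return spokes, cycle_days - busy


# Hypothetical three-product wheel with a 10-day cycle.
wheel = [
    Product("A", demand_per_day=40.0, rate_per_day=100.0, changeover_days=0.5),
    Product("B", demand_per_day=20.0, rate_per_day=100.0, changeover_days=0.5),
    Product("C", demand_per_day=10.0, rate_per_day=50.0, changeover_days=0.5),
]
spokes, idle = size_wheel(wheel, cycle_days=10.0)
```

In practice the spoke lengths would be adjusted each cycle to replenish the inventory actually consumed, as the text notes; the sketch only fixes the nominal design point.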

Dispatching Rules Used in Discrete 
Manufacturing 

Batch processes are closely related to discrete 
manufacturing. Batches processed on a unit are 
analogous to jobs processed on a machine. Much 
of the literature on machine scheduling has focused
on the analysis of the specifics encountered
in general classes of problems such as single 
machines, parallel machines, flow shops and job 
shops, and developing constructive scheduling 
rules where a schedule is built up by adding one 
job at a time (Blackstone et al. 1982). Under certain
circumstances these rules used for machine
scheduling can be applied to scheduling batch 
plants. This allows one to take advantage of a 
great body of literature, and at times, very simple 
scheduling rules that have proven optimality or 
worst case performance limits. 

Consider again the batch process referred to in 
Fig. 1 which has two parallel reactors. The two 
reactors can be modeled as a single stage process 
and scheduled like parallel machines using the 
simple shortest processing time first (SPT) rule if 
the following circumstances hold: (1) raw material
preparation can be included in the batch time
of reactors, (2) significant storage exists between 
the reactors and finishing to essentially isolate 
the two stages, (3) product specific batch times 
are identical for both reactors, (4) the number 
of batches of each product is given (perhaps the 
result of an inventory policy for make to stock 
products), and (5) the objective is to minimize 
the total completion time for all batches. The SPT 
rule is simply to select, whenever a reactor is free, 
the batch with the shortest processing time from 
those yet to be processed. This can be proven 
to produce an optimal schedule for the given 
conditions. 
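
The SPT rule on identical parallel units can be sketched as follows. The function name `spt_parallel` and the batch times are hypothetical; the sketch assumes conditions (1) to (5) above hold, so each batch is described only by its processing time.

```python
import heapq


def spt_parallel(batch_times, n_units=2):
    """List-schedule batches in SPT order on identical parallel units.

    batch_times: {batch name: processing time}
    Returns (schedule, total completion time), where schedule is a list of
    (batch, unit, start, finish) tuples.
    """
    # Heap of (time at which the unit becomes free, unit index).
    free_at = [(0.0, u) for u in range(n_units)]
    heapq.heapify(free_at)
    schedule = []
    # Shortest processing time first; each batch goes to the earliest free unit.
    for batch, duration in sorted(batch_times.items(), key=lambda kv: kv[1]):
        start, unit = heapq.heappop(free_at)
        finish = start + duration
        schedule.append((batch, unit, start, finish))
        heapq.heappush(free_at, (finish, unit))
    total_completion = sum(f for *_, f in schedule)
    return schedule, total_completion


# Hypothetical batch times (hours) for the two reactors of Fig. 1.
batches = {"B1": 4.0, "B2": 2.0, "B3": 6.0, "B4": 3.0, "B5": 5.0}
schedule, total = spt_parallel(batches)
```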

Another simple dispatching rule to mention is
the earliest due date first (EDD) rule. This rule 
is designed for single stage processes without 
parallel units where each batch has an associated 
due date. The rule simply orders the batches in 
increasing order of their due dates to minimize 
the maximum lateness of all orders. 
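
A minimal sketch of the EDD rule for a single unit is given below. The function name `edd_sequence` and the order data are hypothetical, and the sketch assumes all batches are available at time zero.

```python
def edd_sequence(batches):
    """Order batches by due date and report the maximum lateness.

    batches: list of (name, processing_time, due_date) tuples.
    """
    order = sorted(batches, key=lambda b: b[2])  # earliest due date first
    clock = 0.0
    max_lateness = float("-inf")
    names = []
    for name, processing_time, due in order:
        clock += processing_time                  # completion time of this batch
        max_lateness = max(max_lateness, clock - due)
        names.append(name)
    return names, max_lateness


# Hypothetical orders: (name, processing time, due date).
orders = [("P1", 3.0, 9.0), ("P2", 2.0, 4.0), ("P3", 4.0, 8.0)]
sequence, lmax = edd_sequence(orders)
```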

The conditions needed for the SPT rule or the EDD rule to produce
an optimal schedule can be quite restrictive when considering
batch processes. However, these rules and others found in the
machine scheduling literature (Baker and Trietsch 2009) can still
produce a good initial schedule even in cases where optimality
conditions are not satisfied. Once generated, the schedule can be
improved by manual manipulation of the Gantt chart or the
application of improvement heuristics.

Improvement Heuristics 

Improvement heuristics try to improve the current 
schedule by searching for alternative solutions 
either in the neighborhood of the current schedule 
or by broadly exploring the solution space. The 
behavior of these algorithms is determined by 
tuning parameters that balance the use of the two 
search techniques and the underlying algorithm 
that performs the search. Improvement heuristics 
generally have the following basic procedure: 
Step 1: Initialize - determine a starting schedule
Step 2: Generate alternatives - build modifications
to the current schedule
Step 3: Check for improvements in modified
schedule - if no improvement is found return
to Step 2, otherwise proceed to Step 4
Step 4: Check for termination - terminate the
algorithm if the iteration limit is exceeded
or only minimal improvement is obtained;
otherwise return to Step 2 with the improved schedule.
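
The four steps can be sketched as a simple neighborhood search. This is one possible instance, chosen here for illustration: the neighborhood is all pairwise swaps on a single unit, the objective is total tardiness, and the names `total_tardiness`, `improve`, and the job data are hypothetical.

```python
from itertools import combinations


def total_tardiness(seq, jobs):
    """jobs: {name: (processing_time, due_date)}; seq: ordering of job names."""
    clock = tardiness = 0.0
    for name in seq:
        processing_time, due = jobs[name]
        clock += processing_time
        tardiness += max(0.0, clock - due)
    return tardiness


def improve(seq, jobs, max_iter=100):
    # Step 1: initialize with the starting schedule.
    best, best_cost = list(seq), total_tardiness(seq, jobs)
    for _ in range(max_iter):                     # Step 4: iteration limit
        improved = False
        # Step 2: generate alternatives by swapping two positions.
        for i, k in combinations(range(len(best)), 2):
            cand = best[:]
            cand[i], cand[k] = cand[k], cand[i]
            cost = total_tardiness(cand, jobs)
            # Step 3: accept the first modification that improves the schedule.
            if cost < best_cost:
                best, best_cost, improved = cand, cost, True
                break
        if not improved:                          # Step 4: no further improvement
            break
    return best, best_cost


jobs = {"A": (4.0, 4.0), "B": (2.0, 3.0), "C": (3.0, 10.0)}
start = ["C", "A", "B"]
best, best_cost = improve(start, jobs)
```

Because only improving moves are accepted, this variant stops at a local optimum; methods such as simulated annealing differ mainly in occasionally accepting worse schedules to explore the solution space more broadly.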

Many improvement heuristics are inspired by processes found in
nature. Two of the more popular heuristics are simulated
annealing, which mimics crystal formation during the cooling of
dense matter (Ryu et al. 2001), and genetic algorithms, which
mimic the evolution of a species over time (Lohl et al. 1988). A
key aspect of improvement heuristics is the representation of the
schedule in the context of the algorithm used. For problems with
complicated constraints this becomes a challenge. Nevertheless,
when tuned properly and used where they fit the problem,
improvement heuristics can produce very good schedules quickly.

Tree Search Methods 

The scheduling solutions considered so far have taken a relatively
simple view of a batch process as a single stage process or a flow
shop. In situations where a batch plant involves shared resources
or complicated transition rules, or is a process network, tree
search methods are better suited because they can deal with a
large number of degrees of freedom and many types of constraints.
Tree search methods rely on representing alternative schedules as
the final nodes in a tree where intermediate nodes represent
partial solutions of the schedule. To be practical, these methods
must be able to effectively search through the tree while pruning
non-promising branches (see Fig. 4). Three of the most popular
techniques are mathematical programming, constraint programming,
and beam search.

Mathematical programming solution techniques for scheduling
generally convert the problem to a mixed integer linear
programming (MILP) formulation where branching at a node of the
tree represents alternative values of the integer or binary
variables. The tree is searched by a branch-and-bound algorithm
which eliminates a node and the branch that emanates from it if
the lower bound on the objective function over the terminal nodes
of the branch is larger than the objective value of the current
best schedule. The MILP formulation can be stated generically as

$$\begin{aligned} \min\ & z = cx + fy \\ \text{s.t.}\ & Ax + By \ge b \\ & x \in \mathbb{R}^{d},\ y \in \{0,1\}^{q} \end{aligned}$$

where c, f, and b are vectors of constants, A and B are matrices
of constants, and the solution is defined by the vector variables
x and y. A key feature of using mathematical programming is to
represent the relationships implied in Table 1 and Fig. 1 in terms
of algebraic descriptions. The advantage of this approach is that
a provably optimal solution can be obtained for a problem stated
this way. This provides the means to assess the quality of the
solution and the impact of implementing the solution. The drawback
of this approach is that, since binary variables are used to
represent the assignment of a batch to a processing unit and the
sequence and timing of processing on each unit, their number grows
rapidly with the number of units and the length of the scheduling
horizon. However, the performance of modern computing hardware and
commercial solvers for MILP problems has allowed industrial size
problems to be tackled.

Scheduling of Batch Plants, Fig. 4 Trimming the solution tree
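
As an illustration of how scheduling decisions map onto the generic MILP form, consider assigning batches $b \in \mathcal{B}$ with known batch times $p_b$ to identical parallel units $k \in \mathcal{K}$ so that the last unit finishes as early as possible. This small model is an assumption made here for illustration, not a formulation taken from the cited literature; the symbols $y_{bk}$, $p_b$, and $C_{\max}$ are introduced for this sketch only.

```latex
\begin{aligned}
\min\quad & C_{\max} \\
\text{s.t.}\quad
  & \sum_{k \in \mathcal{K}} y_{bk} = 1
      && \forall b \in \mathcal{B}
      && \text{(each batch is assigned to exactly one unit)} \\
  & C_{\max} - \sum_{b \in \mathcal{B}} p_b\, y_{bk} \ge 0
      && \forall k \in \mathcal{K}
      && \text{(no unit finishes after } C_{\max}\text{)} \\
  & C_{\max} \ge 0, \quad y_{bk} \in \{0,1\} .
\end{aligned}
```

Here the continuous part of the solution reduces to the single variable $C_{\max}$, the binary vector collects the assignment variables $y_{bk}$, and each equality can be written as a pair of inequalities to match the generic constraint form. The count of binaries, $|\mathcal{B}|\cdot|\mathcal{K}|$, shows how the model grows with the number of batches and units.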

A large variety of modeling paradigms have 
been developed to produce a MILP solution 
(Floudas and Lin 2004; Mendez et al. 2006). 
They address both sequential and networked 
processes using continuous time or discrete 
time representations. For sequential processes, 
time slot approaches have been developed. For 
networked processes, the resource task network 
and the state task network have been investigated 
by many researchers and have been used in 
industrial applications. 

Constraint programming (CP) formulates a problem by writing
constraints; but unlike the MILP method, the CP method stresses
the feasibility of solutions rather than optimality. Another
important difference is that constraints in the CP method do not
have to be formulated as algebraic relationships but can take a
more general form, thus making it easier in CP to represent
complicated constraints. CP processes the constraints sequentially
to reduce the space of possible solutions. At each node in the
tree, CP processes one constraint after another, reducing the
search space at each constraint. Being much newer than
mathematical programming, constraint programming has a smaller
body of literature to review, but excellent performance has been
reported in the literature (Baptiste et al. 2001).
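
The idea of processing constraints one after another to shrink the search space can be sketched with simple bounds propagation on start-time windows. This is a toy illustration under assumed data, not the algorithm of any particular CP solver; the function `propagate`, the stage names, and the 12-hour horizon are hypothetical.

```python
def propagate(domains, durations, precedences):
    """Tighten [earliest, latest] start windows until nothing more can be pruned.

    domains:     {activity: (earliest_start, latest_start)}
    durations:   {activity: duration}
    precedences: list of (before, after) pairs meaning
                 start[after] >= start[before] + durations[before]
    """
    changed = True
    while changed:
        changed = False
        for before, after in precedences:        # one constraint at a time
            lo_b, hi_b = domains[before]
            lo_a, hi_a = domains[after]
            new_lo_a = max(lo_a, lo_b + durations[before])
            new_hi_b = min(hi_b, hi_a - durations[before])
            if new_lo_a > hi_a or new_hi_b < lo_b:
                raise ValueError("no feasible schedule")
            if (new_lo_a, new_hi_b) != (lo_a, hi_b):
                domains[after] = (new_lo_a, hi_a)
                domains[before] = (lo_b, new_hi_b)
                changed = True
    return domains


# One batch passing through three stages within a 12-hour horizon.
durations = {"prep": 2, "react": 5, "finish": 3}
domains = {name: (0, 12 - d) for name, d in durations.items()}
windows = propagate(domains, durations, [("prep", "react"), ("react", "finish")])
```

A full CP search would alternate such propagation with branching decisions (for example, fixing a start time) at each node of the tree.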

In the beam search method, the branch-and-bound algorithm is
modified to only evaluate the most promising nodes at any given
level of the search tree (Ow and Morton 1988). The number of nodes
evaluated is called the beam width and it is a key tuning
parameter of the method. Another important element of the method
is the technique used to retain nodes for complete evaluation. The
technique must balance speed versus thorough evaluation to keep
the method practical without discarding promising nodes. The beam
search method applied to scheduling has been investigated by many
authors (Sabuncuoglu and Bayiz 1999).
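
A minimal beam search for sequencing batches on one unit might look as follows. It is a simplified sketch: nodes are ranked by the tardiness accumulated so far, no bounding step is included, and the function name `beam_search` and the job data are hypothetical.

```python
def beam_search(jobs, beam_width=2):
    """jobs: {name: (processing_time, due_date)}.

    Returns (sequence, total tardiness) of the best schedule found.
    """
    # Each node: (partial sequence, elapsed time, tardiness so far).
    beam = [((), 0.0, 0.0)]
    for _ in range(len(jobs)):                    # one tree level per position
        children = []
        for seq, clock, tardiness in beam:
            for name, (processing_time, due) in jobs.items():
                if name not in seq:
                    finish = clock + processing_time
                    children.append(
                        (seq + (name,), finish, tardiness + max(0.0, finish - due))
                    )
        # Keep only the most promising nodes at this level.
        children.sort(key=lambda node: node[2])
        beam = children[:beam_width]
    seq, _, tardiness = beam[0]
    return list(seq), tardiness


jobs = {"A": (4.0, 4.0), "B": (2.0, 3.0), "C": (3.0, 10.0)}
narrow = beam_search(jobs, beam_width=2)
wide = beam_search(jobs, beam_width=3)
```

On this small instance the wider beam retains a node that the narrower beam discards and ends with a lower total tardiness, which illustrates the trade-off between speed and thoroughness controlled by the beam width.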

Simulation 

The simulation approach to scheduling batch plants relies on
representing the plant and the relationships implied by Table 1 in
a computer program whose algorithms recreate the behavior of the
plant when executed. Generally, the simulators used for batch
operations apply discrete event simulation (DES) where entities
that have attributes like size, due date, priority, etc. are
operated on by activities for a specified duration. Fundamental to
DES is the use of queues to hold entities until conditions in the
simulation allow them to proceed to their next activity. Time in a
DES does not proceed in a continuous manner but rather advances
from one event to the next as activities start and finish.
Simulation has the advantage of being able to describe processes
and operating policies of arbitrary complexity and to model
variability in the process operation. Simulators can be used to
evaluate manually created schedules or can be combined with
optimization and heuristics to produce schedules by
simulation-based optimization (Pegden 2011).
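
The mechanics of a DES (an event list, a queue, and a clock that jumps between events) can be sketched with the standard library alone. This is a toy model, not a substitute for a commercial simulator; the function `simulate`, the event codes, and the batch data are hypothetical.

```python
import heapq
from collections import deque

ARRIVE, FINISH = 1, 0  # FINISH sorts first so a unit is freed before same-time arrivals


def simulate(batches, n_units=1):
    """batches: list of (name, arrival_time, duration).

    Returns {name: (start, finish)} for batches served first-come, first-served.
    """
    events = [(t, ARRIVE, name, dur) for name, t, dur in batches]
    heapq.heapify(events)
    waiting = deque()          # queue holding batches until a unit is free
    free_units = n_units
    log = {}
    while events:
        # The clock jumps directly to the time of the next event.
        clock, kind, name, dur = heapq.heappop(events)
        if kind == ARRIVE:
            waiting.append((name, dur))
        else:
            free_units += 1
            log[name] = (log[name][0], clock)
        # Start as many waiting batches as there are free units.
        while free_units and waiting:
            nxt, d = waiting.popleft()
            free_units -= 1
            log[nxt] = (clock, None)
            heapq.heappush(events, (clock + d, FINISH, nxt, d))
    return log


log = simulate([("B1", 0.0, 5.0), ("B2", 1.0, 3.0), ("B3", 2.0, 4.0)])
```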

An alternative to DES for batch scheduling 
is the use of multi-agent simulators which are 
composed of semiautonomous agents assigned 
to represent the operation of the process and the 
associated decision making. Each agent has a 
local goal and communicates with other agents to 
accomplish it. Like DES, multi-agent simulators 
are capable of describing very complicated 
processes. A production schedule can be built 
through negotiations between agents (Chu et al. 
2013). 

Selecting a Solution Approach 

The selection of the approach for a given batch 
plant should be value-based, balancing improved 
revenue with long term cost of ownership by 
considering such factors as the technical competency
of the production scheduler, the expected
capacity utilization of the plant, the operational
complexity of the plant, and the cost to maintain
the scheduling application. The key is to
obtain the least complicated solution by reducing 
the scheduling problem to the highest level of 
abstraction and by using the simplest solution 
method that provides an effective schedule. See 
Harjunkoski et al. (2013) and Pinedo (2008) for a 
survey of methods and recommendations for their 
practical application. 

Summary and Future Directions 

While there are a great variety of solution 
methods for scheduling, there are still promising 
research areas to be investigated. The recent 
introduction of sophisticated, object oriented 
process control systems with ties to enterprise 
management systems sets the stage for the 
development of automatic, real time scheduling. 
It is here that the principles of feedback control 
can be applied to batch plant scheduling. Pursuit 
of this goal will require continued development 
of fast, adaptive scheduling methods, real time 
assessment techniques of schedule performance, 
and tight integration of scheduling with the 
process control. 

Cross-References 

► Control and Optimization of Batch Processes 

► Models for Discrete Event Systems: An 
Overview 

Abstract

Singular trajectories arise in optimal control as singularities of
the end-point mapping. Their importance has long been recognized,
at first in the Lagrange problem in the calculus of variations
where they are lifted into abnormal extremals. Singular
trajectories are candidates as minimizers for the time-optimal
control problem, and they are parameterized by the maximum
principle via a pseudo-Hamiltonian function. Moreover, besides
their importance in optimal control theory, these trajectories
play an important role in the classification of systems for the
action of the feedback group.

Keywords 

Abnormal extremals; End-point mapping; 
Martinet flat case in sub-Riemannian geometry; 
Pseudo-Hamiltonian 


Introduction 

The concept of singular trajectories in optimal control
corresponds to abnormal extrema in optimization. Suppose that a
point $x^* \in X \simeq \mathbb{R}^n$ is a point of extremum for a
smooth function $\mathcal{L} : \mathbb{R}^n \to \mathbb{R}$ under
the equality constraints $F(x) = 0$, where $F : X \to Y$ is a
smooth mapping into $Y \simeq \mathbb{R}^p$, $p < n$. The Lagrange
multiplier rule (Agrachev et al. 1997) asserts the existence of
nonzero pairs $(\lambda_0, \lambda^*)$ of Lagrange multipliers
such that $\lambda_0 \mathcal{L}'(x^*) + \lambda^* F'(x^*) = 0$.
The normality condition is given by $\lambda_0 \neq 0$, and the
abnormal case corresponds to the situation when the rank of
$F'(x^*)$ is strictly less than $p$.

Abnormal extremals have played an important role in the standard
calculus of variations (Bliss 1946). Indeed, consider a classical
Lagrange problem:

$$\frac{dx}{dt}(t) = F(x(t),u(t)), \qquad \min_{u(\cdot)} \int_0^T L(x(t),u(t))\,dt,$$

$$x(0) = x_0, \quad x(T) = x_1,$$

where $x(t) \in X \simeq \mathbb{R}^n$, $u(t) \in \mathbb{R}^m$,
and $F$ and $L$ are smooth. Using an infinite dimensional
framework, the Lagrange multiplier rule still holds and an
abnormal extremum corresponds to a singularity of the set of
constraints.


Definition 

Consider a system on $\mathbb{R}^n$: $\frac{dx}{dt}(t) = F(x(t),u(t))$,
where $F$ is a smooth mapping from $\mathbb{R}^n \times \mathbb{R}^m$
into $\mathbb{R}^n$. Fix $x_0 \in \mathbb{R}^n$ and $T > 0$. The
end-point mapping is the mapping
$E^{x_0,T} : u(\cdot) \in \mathcal{U} \to x(T,x_0,u)$, where
$\mathcal{U} \subset L^\infty[0,T]$ is the set of admissible
controls such that the corresponding trajectory $x(\cdot,x_0,u)$
is defined on $[0,T]$. A control $u(\cdot)$ and its corresponding
trajectory are called singular on $[0,T]$ if
$u(\cdot) \in \mathcal{U}$ is such that the Fréchet derivative
$E'^{x_0,T}$ of the end-point mapping is not of full rank $n$ at
$u(\cdot)$.


Fréchet Derivative and Linearized
System

Given a reference trajectory $x(\cdot)$, $t \in [0,T]$, associated
to $u(\cdot)$ with $x(0) = x_0$, and solution of
$\frac{dx}{dt}(t) = F(x(t),u(t))$, the system

$$\delta\dot{x}(t) = A(t)\,\delta x(t) + B(t)\,\delta u(t)$$

with

$$A(t) = \frac{\partial F}{\partial x}(x(t),u(t)), \qquad B(t) = \frac{\partial F}{\partial u}(x(t),u(t))$$

is called the linearized system along the control-trajectory pair
$(u(\cdot), x(\cdot))$.

Let $M(t)$, $t \in [0,T]$, be the fundamental matrix solution of

$$\dot{M}(t) = A(t)M(t), \qquad M(0) = I_n.$$

Integrating the linearized system with $\delta x(0) = 0$, one gets
the following proposition.

Proposition 1 The Fréchet derivative of $E^{x_0,T}$ at $u(\cdot)$
is given by

$$E'^{x_0,T}(v) = M(T)\int_0^T M^{-1}(t)B(t)v(t)\,dt.$$
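
The integration step behind Proposition 1 can be made explicit by variation of constants. The following sketch uses only the symbols defined above ($M$, $A$, $B$, $\delta x$, $\delta u$), with $v = \delta u$.

```latex
% Since \dot{M} = A M, the inverse satisfies d/dt M^{-1}(t) = -M^{-1}(t) A(t).
\frac{d}{dt}\Bigl(M^{-1}(t)\,\delta x(t)\Bigr)
  = -M^{-1}(t)A(t)\,\delta x(t)
    + M^{-1}(t)\bigl(A(t)\,\delta x(t) + B(t)\,\delta u(t)\bigr)
  = M^{-1}(t)B(t)\,\delta u(t).

% Integrating on [0,T] with \delta x(0) = 0 and multiplying by M(T):
\delta x(T) = M(T)\int_0^T M^{-1}(t)B(t)\,\delta u(t)\,dt .
```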

Computation of the Singular
Trajectories and Pontryagin
Maximum Principle

According to the previous computations, a control $u(\cdot)$ with
corresponding trajectory $x(\cdot)$ is singular on $[0,T]$ if the
Fréchet derivative $E'^{x_0,T}$ is not of full rank at $u(\cdot)$.
This is equivalent to the condition that the linearized system is
not controllable (Lee and Markus 1967).

Such a condition is difficult to verify directly since the
linearized system is time-dependent, and the computation is
associated to the Maximum Principle (Pontryagin et al. 1962).

Let $p^*$ be a nonzero vector such that $p^*$ is orthogonal to
$\operatorname{Im}(E'^{x_0,T})$ and let
$p(t) = p^* M(T) M^{-1}(t)$; then $p(\cdot)$ is solution of the
adjoint system

$$\dot{p}(t) = -p(t)\,\frac{\partial F}{\partial x}(x(t),u(t))$$

and satisfies almost everywhere the equality

$$p(t)\,\frac{\partial F}{\partial u}(x(t),u(t)) = 0.$$

Introduce the pseudo-Hamiltonian
$H(x,p,u) = \langle p, F(x,u)\rangle$, where
$\langle\cdot,\cdot\rangle$ is the Euclidean inner product; one
gets the following characterization.

Proposition 2 If $(x,u)$ is a singular control-trajectory pair on
$[0,T]$, then there exists a nonzero adjoint vector $p(\cdot)$
defined on $[0,T]$ such that $(x,p,u)$ is solution a.e. of the
following equations:

$$\frac{dx}{dt} = \frac{\partial H}{\partial p}(x,p,u), \qquad \frac{dp}{dt} = -\frac{\partial H}{\partial x}(x,p,u), \qquad \frac{\partial H}{\partial u}(x,p,u) = 0.$$

Application to the Lagrange Problem

Consider the problem

$$\frac{dx}{dt}(t) = F(x(t),u(t)), \qquad \min_{u(\cdot)} \int_0^T L(x(t),u(t))\,dt$$

with $x(0) = x_0$, $x(T) = x_1$.

Introduce the cost-extended pseudo-Hamiltonian:
$\tilde{H}(\tilde{x},\tilde{p},u) = \langle p, F(x,u)\rangle + p_0 L(x,u)$;
it follows that the maximum principle is equivalent to the
Lagrange multiplier rule presented in the introduction:

$$\frac{d\tilde{x}}{dt} = \frac{\partial \tilde{H}}{\partial \tilde{p}}(\tilde{x},\tilde{p},u), \qquad \frac{d\tilde{p}}{dt} = -\frac{\partial \tilde{H}}{\partial \tilde{x}}(\tilde{x},\tilde{p},u), \qquad \frac{\partial \tilde{H}}{\partial u}(\tilde{x},\tilde{p},u) = 0$$

where $\tilde{x} = (x, x^0)$ is the extended state variable
solution of $\frac{dx}{dt} = F(x,u)$, $\frac{dx^0}{dt} = L(x,u)$
and $\tilde{p} = (p, p_0)$ is the extended adjoint vector. One has
the condition $\langle \tilde{p}, \tilde{E}'^{x_0,T}(v)\rangle = 0$
where $\tilde{E}^{x_0,T}$ is the cost-extended end-point mapping.

The Role of Singular Extremals in
Optimal Control

While the traditional treatment in optimization of singular
extremals is to consider them as a pathology, in modern optimal
control they play an important role, which is illustrated by two
examples from geometric optimal control.

Singular Trajectories in Quantum Control

Up to a normalization (Lapert et al. 2010), the time minimization
saturation problem is to steer in minimum time the magnetization
vector $M = (x,y,z)$ from the north pole of the Bloch ball
$N = (0,0,1)$ to its center $O = (0,0,0)$. The evolution of the
system is described by the Bloch equation in nuclear magnetic
resonance (Levitt 2008)

$$\frac{dx}{dt} = -\Gamma x + u_2 z, \qquad \frac{dy}{dt} = -\Gamma y - u_1 z, \qquad \frac{dz}{dt} = \gamma(1-z) + u_1 y - u_2 x$$

where $(\Gamma, \gamma)$ are proportional to the inverse of the
relaxation times and $u = (u_1,u_2)$ is the control radio
frequency magnetic field bounded according to $|u| \le M$. Due to
the z-symmetry of revolution, one can restrict the problem to the
2D single-input case

$$\frac{dy}{dt} = -\Gamma y - uz, \qquad \frac{dz}{dt} = \gamma(1-z) + uy$$

that can be written as $\frac{dq}{dt} = F(q) + uG(q)$.

According to the maximum principle, the time-optimal solutions are
the concatenations of regular extremals for which
$u(t) = M\,\mathrm{sign}\langle p(t), G(q(t))\rangle$ and singular
arcs where $\langle p(t), G(q(t))\rangle = 0$ for all $t$, and
$p(t)$ is solution of the adjoint system. Differentiating with
respect to time and using the Lie bracket notation
$[X,Y](q) = \frac{\partial X}{\partial q}(q)Y(q) - \frac{\partial Y}{\partial q}(q)X(q)$,
we get

$$\langle p, [G,F](q)\rangle = 0,$$

$$\langle p, [[G,F],F](q)\rangle + u\,\langle p, [[G,F],G](q)\rangle = 0.$$

This leads to two singular arcs:

• The vertical line $y = 0$, corresponding to the
z-axis of revolution

• The horizontal line $z = \frac{\gamma}{2(\gamma-\Gamma)}$
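
The two singular lines can be recovered by a direct computation. The sketch below assumes the drift and control fields $F(q) = (-\Gamma y, \gamma(1-z))$ and $G(q) = (-z, y)$ with $q = (y,z)$, read off from the 2D single-input Bloch system, and uses the fact that in dimension two a nonzero $p$ orthogonal to both $G$ and $[G,F]$ exists only where these vectors are collinear.

```latex
[G,F](q) = \frac{\partial G}{\partial q}F - \frac{\partial F}{\partial q}G
 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}
   \begin{pmatrix} -\Gamma y \\ \gamma(1-z) \end{pmatrix}
 - \begin{pmatrix} -\Gamma & 0 \\ 0 & -\gamma \end{pmatrix}
   \begin{pmatrix} -z \\ y \end{pmatrix}
 = \begin{pmatrix} -\gamma + (\gamma-\Gamma)z \\ (\gamma-\Gamma)y \end{pmatrix},

\det\bigl(G,[G,F]\bigr)
 = -(\gamma-\Gamma)yz - y\bigl(-\gamma + (\gamma-\Gamma)z\bigr)
 = y\,\bigl(\gamma - 2(\gamma-\Gamma)z\bigr).
```

The determinant vanishes exactly on $y = 0$ and on $z = \gamma / (2(\gamma-\Gamma))$, which are the vertical and horizontal singular lines listed above.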

The interesting physical case is when $2\Gamma > 3\gamma$, where
the horizontal singular line is such that
$-1 < \frac{\gamma}{2(\gamma-\Gamma)} < 0$. In this case, the time
minimum solution is represented on Fig. 1. On Fig. 2 we draw the
experimental solution in the deoxygenated blood case, compared
with the standard inversion recovery sequence.

Singular Trajectories in Optimal Control, Fig. 1 The computed
optimal solution is the following concatenation: a bang arc, the
horizontal singular arc $\sigma_s^h$, a second bang arc, and
finally the vertical singular arc $\sigma_s^v$

Abnormal Extremals in SR Geometry 

Sub-Riemannian geometry was introduced by R.W. Brockett as a
generalization of Riemannian geometry (Brockett 1982; Montgomery
2002) with many applications in control (for instance, in motion
planning (Bellaiche et al. 1998; Gauthier and Zakalyukin 2006) and
quantum control). Its formulation in the framework of control
theory is

$$\dot{q}(t) = \sum_{i=1}^{m} u_i(t)F_i(q(t)), \qquad \min \int_0^T \Bigl(\sum_{i=1}^{m} u_i^2(t)\Bigr)^{1/2} dt$$

where $q \in U$, an open set in $\mathbb{R}^n$, $m < n$, and
$F_1, \dots, F_m$ are smooth vector fields which form an
orthonormal basis of the distribution they generate.

According to the maximum principle, normal extremals are solutions
of the Hamiltonian vector field $\vec{H}_n$,
$H_n = \frac{1}{2}\sum_{i=1}^{m} H_i^2$,
$H_i = \langle p, F_i(q)\rangle$ for $i = 1, \dots, m$. Again,
abnormal extremals can be computed by differentiating the
constraint $H_i = 0$ along the extremals. Their first occurrence
takes place in the so-called Martinet flat case: $n = 3$, $m = 2$,
and $F_1, F_2$ are given by

$$F_1 = \frac{\partial}{\partial x} + \frac{y^2}{2}\frac{\partial}{\partial z}, \qquad F_2 = \frac{\partial}{\partial y}$$

where $q = (x,y,z) \in U$, a neighborhood of the origin, and the
metric is given by $ds^2 = dx^2 + dy^2$. The singular trajectories
are contained in the Martinet plane $M : y = 0$ and are the lines
$z = z_0$. An easy computation shows that they are optimal for the
problem. We represent below the role of the singular trajectories
when computing the sphere of small radius, from the origin,
intersected with the Martinet plane (Fig. 3).
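
The computation of the abnormal extremals in the Martinet flat case can be sketched as follows. It uses the vector fields $F_1 = \partial/\partial x + (y^2/2)\,\partial/\partial z$ and $F_2 = \partial/\partial y$ of the flat case, with adjoint vector written $p = (p_x, p_y, p_z)$; this component notation is introduced here for illustration.

```latex
H_1 = \langle p, F_1\rangle = p_x + \tfrac{y^2}{2}\,p_z, \qquad
H_2 = \langle p, F_2\rangle = p_y, \qquad
[F_1,F_2] = y\,\frac{\partial}{\partial z}\ \ \text{(up to the sign convention)}.

% Abnormal extremals satisfy H_1 = H_2 = 0; differentiating along
% \dot{q} = u_1 F_1 + u_2 F_2 gives
\dot{H}_1 = u_2\,p_z\,y = 0, \qquad \dot{H}_2 = -u_1\,p_z\,y = 0 .

% If p_z = 0, then H_1 = H_2 = 0 forces p_x = p_y = 0, i.e. p = 0, which is excluded.
% Hence y = 0 along the trajectory, and therefore
\dot{y} = u_2 = 0, \qquad \dot{z} = u_1\,\tfrac{y^2}{2} = 0
\ \Longrightarrow\ z = z_0 .
```

This recovers the statement that the singular trajectories lie in the Martinet plane $y = 0$ and are the lines $z = z_0$.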

Summary and Future Directions 

Singular trajectories play an important role in many optimal
control problems, such as in quantum control and cancer therapy
(Schättler and Ledzewicz 2012). They have to be carefully analyzed
in any application; in particular, in Boscain and Piccoli (2006)
the authors provide, for single-input systems in two dimensions, a
classification of optimal syntheses with singular arcs.

Additionally, from a theoretical point of view, singular
trajectories can be used to compute feedback invariants for
nonlinear systems (Bonnard and Chyba 2003). A related, purely
mathematical problem is the classification of distributions
describing the nonholonomic constraints in sub-Riemannian geometry
(Montgomery 2002).

Singular Trajectories in Optimal Control, Fig. 2 Experimental
result. Usual inversion sequence in green, optimal computed
sequence in blue

Singular Trajectories in Optimal Control, Fig. 3 Projection of the
SR sphere on the xz-plane. The singular line is x = t and the
picture shows the pinching of the SR sphere in the singular
direction

Cross-References 

► Differential Geometric Methods in Nonlinear 
Control 

► Feedback Stabilization of Nonlinear Systems 

► Optimal Control and Pontryagin’s Maximum 
Principle 

► Robustness Issues in Quantum Control 

► Sub-Riemannian Optimization 

Abstract

Small signal rotor angle stability analysis in 
power systems is associated with insufficient 
damping of oscillations under small disturbances. 


Rotor angle oscillations due to insufficient 
damping have been observed in many power 
systems around the world. This entry overviews 
the predominant approach to examine small 
signal rotor angle stability in large power systems 
using eigenvalue analysis. 

Keywords 

Eigenvalues; Eigenvectors; Low-frequency oscillations; Mode shape; Oscillatory modes; Participation factors; Small signal rotor angle stability

Small Signal Rotor Angle Stability 
in Power Systems 

As power system interconnections grew in number and size, automatic controls such as voltage regulators played critical roles in enhancing reliability by increasing the synchronizing capability between the interconnected systems. As technology evolved, the capabilities of voltage regulators to provide synchronizing torque following disturbances were significantly enhanced. It was, however, observed that voltage regulators tended to reduce damping torque, as a result of which the system was susceptible to rotor angle oscillatory instability. An excellent exposition of the mechanism and the underlying analysis is provided in the textbooks (Anderson and Fouad 2003; Sauer and Pai 1998; Kundur 1993), and a number of practical aspects of the analysis are detailed in Eigenanalysis and Frequency Domain Methods for System Dynamic Performance (1989) and Rogers (2000). Two types of rotor angle oscillations are commonly observed. Low-frequency oscillations involving synchronous machines in different operating areas are commonly referred to as inter-area oscillations. These oscillations are typically in the 0.1-2 Hz frequency range. Oscillations between local machines or a group of machines at a power plant are referred to as plant mode oscillations. These oscillations are typically above the 2 Hz frequency range. The modes associated with rotor angle oscillations are also termed inertial modes of oscillation. Other modes of oscillation associated with the various controls also exist. With the integration of significant new wind and photovoltaic generation, which is interconnected to the grid using converters, new modes of oscillation involving the converter controls and conventional synchronous generator states are being observed.

The basis for small signal rotor angle stability analysis is that the disturbances considered are small enough to justify the use of linear analysis to examine stability (Kundur et al. 2004). As a result, Lyapunov's first method (Vidyasagar 1993) provides the analytical underpinning to analyze small signal stability. Eigenvalue analysis is the predominant approach to analyze small signal rotor angle stability in power systems. Commercial software packages exist that use sophisticated algorithms to analyze large-scale power systems and that can handle detailed models of power system components.

The power system representation is described by a set of nonlinear differential algebraic equations shown in (1)

$$\dot{x} = f(x, z), \qquad 0 = g(x, z) \tag{1}$$

where $x$ is the state vector and $z$ is a vector of algebraic variables. Small signal stability analysis involves the linearization of (1) around a system operating point which is typically determined by conducting a power flow analysis:

$$\begin{bmatrix} \Delta\dot{x} \\ 0 \end{bmatrix} = \begin{bmatrix} J_1 & J_2 \\ J_3 & J_4 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta z \end{bmatrix} \tag{2}$$

The power system state matrix can be obtained by eliminating the vector of algebraic variables $\Delta z$ in (2)


$$\Delta\dot{x} = \left(J_1 - J_2 J_4^{-1} J_3\right)\Delta x = A\,\Delta x \tag{3}$$

where $A$ represents the system state matrix. Based on Lyapunov's first method, the eigenvalues of $A$ characterize the small signal stability behavior of the nonlinear system in a neighborhood of the operating point around which the system is linearized. The eigenvectors corresponding to the eigenvalues also provide significant qualitative information. For each eigenvalue $\lambda_i$, there exists a vector $u_i$ known as the right eigenvector of $A$ which satisfies the equation

$$A u_i = \lambda_i u_i \tag{4}$$

There also exists a row vector $v_i$ known as the left eigenvector of $A$ which satisfies

$$v_i A = \lambda_i v_i \tag{5}$$

For a system which has distinct eigenvalues, the right and left eigenvectors form an orthogonal set governed by

$$v_i u_j = k_{ij}, \qquad \text{where } k_{ij} \neq 0 \text{ for } i = j \text{ and } k_{ij} = 0 \text{ for } i \neq j \tag{6}$$

One set (either right or left) of eigenvectors is usually scaled to unity and the other set obtained by solving (6) with $k_{ii} = 1$. The right eigenvectors can be assembled together as columns of a square matrix $U$, and the corresponding left eigenvectors can be assembled as rows of a matrix $V$; then

$$V = U^{-1} \tag{7}$$

and

$$V A U = \Lambda \tag{8}$$

where $\Lambda$ is a diagonal matrix with the distinct eigenvalues as the diagonal entries. The relationship in (8) is a similarity transformation and in the case of distinct eigenvalues provides a pathway to obtain solutions to the linear system of equations (3). Applying the following similarity transformation to (3), with $n$ the number of states,

$$\Delta x = U z \;\rightarrow\; \Delta x_i(t) = \sum_{j=1}^{n} u_{ij}\, z_j(0)\, e^{\lambda_j t} \tag{9}$$

$$U \dot{z} = A U z \tag{10}$$

$$\dot{z} = U^{-1} A U z = V A U z = \Lambda z \tag{11}$$

$$\dot{z}_i(t) = \lambda_i z_i(t) \;\Rightarrow\; z_i(t) = z_i(0)\, e^{\lambda_i t} \tag{12}$$

$$z_i(0) = v_i\, \Delta x(0) \tag{13}$$

$$z_i(t) = v_i\, \Delta x(0)\, e^{\lambda_i t} \tag{14}$$

From (9) and (14), it can be observed that the right eigenvector describes how each mode of the system is distributed throughout the state vector (and is referred to as the mode shape), and the left eigenvector in conjunction with the initial conditions of the system state vector determines the magnitude of the mode. The right eigenvector or the mode shape has often been used to identify dynamic patterns in small signal dynamics. One problem with the mode shape is that it is dependent on the units and scaling of the state variables, as a result of which it is difficult to compare the magnitudes of entries that are disparate and correspond to states that impact the dynamics differently. This resulted in the development of the participation factors (Perez-Arriaga et al. 1982), which are dimensionless and independent of the choice of units. The participation factor is expressed as

$$p_{ik} = v_{ik}\, u_{ik} \tag{15}$$

where $u_{ik}$ and $v_{ik}$ denote the $i$th entries of the right and left eigenvectors associated with the $k$th mode. The magnitude of the participation factor measures the relative participation of the $i$th state variable in the $k$th mode and vice versa.
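The eigenvalue, eigenvector, and participation factor calculations above can be sketched on a very small case. The example below assumes a classical second-order single-machine infinite-bus model; the model, the parameter values, and the helper `eig_2x2` are illustrative assumptions and are not taken from this entry or from any commercial tool. A closed-form 2x2 eigen-solution is used so that only the Python standard library is needed.

```python
import cmath
import math

# Hypothetical single-machine infinite-bus data (illustrative, not from the entry).
OMEGA_0 = 2.0 * math.pi * 60.0  # synchronous speed, rad/s
H = 3.5                         # inertia constant, s
K_S = 0.757                     # synchronizing torque coefficient, pu/rad
K_D = 10.0                      # damping torque coefficient, pu/pu speed

# States: [rotor angle deviation (rad), speed deviation (pu)].
A = [[0.0, OMEGA_0],
     [-K_S / (2.0 * H), -K_D / (2.0 * H)]]


def eig_2x2(a):
    """Return (eigenvalue, right eigenvector, left eigenvector) for each mode.

    Closed-form 2x2 case; assumes nonzero off-diagonal entries and
    distinct eigenvalues.
    """
    (a11, a12), (a21, a22) = a
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    root = cmath.sqrt(trace * trace - 4.0 * det)
    modes = []
    for lam in ((trace + root) / 2.0, (trace - root) / 2.0):
        u = [a12, lam - a11]                 # column vector: A u = lam u
        v = [a21, lam - a11]                 # row vector:    v A = lam v
        scale = v[0] * u[0] + v[1] * u[1]
        v = [entry / scale for entry in v]   # normalize so that v u = 1
        modes.append((lam, u, v))
    return modes


modes = eig_2x2(A)
for lam, u, v in modes:
    freq_hz = abs(lam.imag) / (2.0 * math.pi)
    damping_ratio = -lam.real / abs(lam)
    # Participation of state i in this mode: product of the ith entries.
    participation = [abs(u[i] * v[i]) for i in range(2)]
    print(f"lambda = {lam.real:.3f} {lam.imag:+.3f}j, f = {freq_hz:.2f} Hz, "
          f"zeta = {damping_ratio:.3f}, |p| = {participation}")
```

For each mode the complex participation factors sum to one, which follows from the normalization of the left eigenvector against the right eigenvector. Large systems would instead rely on sparse eigenvalue routines, as discussed in the next section.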


Small Signal Stability Analysis Tools 
for Large Power Systems 

Efficient software tools exist that facilitate the application of the methods in section “Small Signal Rotor Angle Stability in Power Systems” to large power systems (Powertech 2012; Martins 1989). These tools incorporate detailed models of power system components and also leverage the sparsity in power systems. The building of the A matrix is a complex task for large power systems with a multitude of dynamic components. The approach in Powertech (2012) utilizes a technique where state space equations are developed for each dynamic component in the system using a solved power flow solution and the dynamic data description for a given system. These state space equations are then coupled based on the system topology, and the system A matrix is derived as in (3). Martins (1989) takes advantage of the sparsity of the Jacobian matrix in (2) and develops efficient algorithms to determine the eigenvalues and eigenvectors. The software tools also provide the flexibility of a number of different options with regard to eigenvalue computations:

1. Calculation of a specific eigenvalue at a specified frequency or with a specified damping ratio

2. Simultaneous calculation of a group of relevant eigenvalues in a specified frequency range or in a specified damping ratio range

In addition to the features described above, commercial software packages also provide features to evaluate:

1. Frequency response plots

2. Participation factors

3. Transfer functions, residues, controllability, and observability factors

4. Linear time response to step changes

5. Eigenvalue sensitivities to changes in specified parameters

Applications of Small Signal Stability 
Analysis in Power Systems 

Small signal stability analysis tools are used for 
a range of applications in power systems. These 
applications include: 

Analysis of local stability problems - These types of stability problems are primarily associated with the tuning of controls associated with the synchronous generator, converter-interconnected renewable resources, and HVDC link current control. In certain cases analysis of local stability problems could also involve design of supplementary controllers which enhance the stability region. Since the stability problem pertains to a local portion of the power system, there is significant flexibility in modeling the system. In many instances local stability problems facilitate the use of a simple representation of a power system which could include the particular machine or a local group of machines in question together with a highly equivalenced representation of the rest of the system. In cases where controls other than generator controls influence stability, e.g., static VAr compensators or HVDC links, the system representation would need to be extended to include portions of the system where these devices are located. Typical small signal stability problems that are analyzed include:

1. Power system stabilizer design

2. Automatic voltage regulator tuning

3. Governor tuning

4. DC link current control

5. Small signal stability analysis for subsynchronous resonance

6. Load modeling effects on small signal stability

References Eigenanalysis and Frequency Domain Methods for System Dynamic Performance (1989) and Rogers (2000) provide comprehensive examples of the analysis conducted for each of the problems listed above.

Analysis of global stability problems - These types of stability problems are associated with controls that impact generators located in different areas of the power system. The analysis of these inter-area problems requires a more systematic approach and involves representation of the power system in greater detail. The problems that are analyzed under this category include:

1. Power system stabilizer design

2. HVDC link modulation

3. Static VAr compensator controls

References Eigenanalysis and Frequency Domain Methods for System Dynamic Performance (1989) and Rogers (2000) again provide details of the analysis conducted for each of the problems listed under this category.

Abstract

Many biological behaviors require that biochemical species be distributed spatially throughout the cell or across a number of cells. To explain these situations accurately requires a spatial description of the underlying network. At the continuum level, this is usually done using reaction-diffusion equations. Here we demonstrate how this class of models arises. We also show how the framework is used in two popular models proposed to explain spatial patterns during development.

Introduction

Cells are complex environments consisting of spatially segregated entities, including the nucleus and various other organelles. Even within these compartments, the concentrations of various biochemical species are not homogeneous, but can vary significantly. The proper localization of proteins and other biochemical species to their respective sites is important for proper cell function. This can be because the spatial distribution of signaling molecules itself confers information, such as when a cell needs to respond to a spatially graded cue to guide its motion (Iglesias and Devreotes 2008) or growth pattern (Lander 2013). Alternatively, information that is obtained in one part of the cell must be transmitted to another part of the cell, as when receptor-ligand binding at the cell surface leads to transcriptional responses in the nucleus. Frequently, describing the action of a biological network accurately requires not only that one account for the chemical interactions between the different components but that the spatial distribution of the signaling molecules also be considered.


Accounting for Spatial Distribution 
in Models 

Mathematical models of biological networks usually assume that reactions take place in well-stirred vessels in which the concentrations of the interacting species are spatially homogeneous and hence need not be accounted for explicitly. These models also assume that the volume is constant. When the spatial location of molecules in cells is important, the concentration of species changes in both time and space.


Compartmental Models

One way to account for spatial distribution of signaling components is through compartmental models. As the name suggests, in these models the cell is divided into different regions that are segregated by membranes. Within each compartment, the concentration of the network species is assumed to be spatially homogeneous. The membranes in these models can be assumed to be either permeable or impermeable. In permeable membranes, information passes through small openings, such as ion channels or nuclear pores, which allow molecules to move from one side of the membrane to the other. With impermeable membranes, information must be transduced by transmembrane signaling elements, such as cell-surface receptors, that bind to a signaling molecule on one side of the membrane and release a secondary effector on the other side. Note that in this case, the membrane itself acts as a third compartment.

Compartmental models offer simplicity, since the reactions that happen in a single region obey the same reaction kinetics usually assumed in spatially homogeneous models. Even when the reactions involve more than one compartment, as in ligand-receptor binding, this can still be described by the usual reaction dynamics. Care must be taken, however, to account properly for the different effects on the respective concentrations as molecules move from one compartment to another. In models of spatially homogeneous systems, there is little practical difference between writing the ordinary differential equations in terms of molecule numbers or concentrations, since the two are proportional to each other according to the volume, which is constant. In a compartmental model, if the molecule moves from one compartment to another, there is conservation of molecule numbers, but not concentrations. For example, if a species is found in two compartments with volumes $V_1$ and $V_2$ and transfer rates $k_{12}$ and $k_{21}$ (in s$^{-1}$), then the differential equations describing transport between compartments can be expressed in terms of numbers ($n_1$ and $n_2$) as follows:

$$\frac{dn_1}{dt} = -k_{12} n_1 + k_{21} n_2$$

$$\frac{dn_2}{dt} = +k_{12} n_1 - k_{21} n_2.$$

Dividing by the respective volumes ($C_1 = n_1/V_1$ and $C_2 = n_2/V_2$), we obtain equations for the concentrations

$$\frac{dC_1}{dt} = -k_{12} C_1 + k_{21}\left(\frac{V_2}{V_1}\right) C_2$$

$$\frac{dC_2}{dt} = +k_{12}\left(\frac{V_1}{V_2}\right) C_1 - k_{21} C_2.$$

In the former case, the two equations add to zero, indicating that $n_1(t) + n_2(t)$ is constant. In the latter, if $V_1 \neq V_2$, then $C_1(t) + C_2(t)$ varies over time as molecules move from one compartment to the other.
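The distinction between conserved molecule numbers and non-conserved concentrations can be checked numerically. The following is a minimal sketch that integrates the two-compartment number equations, $dn_1/dt = -k_{12} n_1 + k_{21} n_2$ and $dn_2/dt = +k_{12} n_1 - k_{21} n_2$, with a forward-Euler scheme; the volumes, rates, and initial numbers are hypothetical values chosen only for illustration.

```python
# Hypothetical parameters chosen only for illustration.
V1, V2 = 1.0, 4.0          # compartment volumes (arbitrary units)
K12, K21 = 0.5, 0.2        # transfer rates, 1/s
N1_0, N2_0 = 1000.0, 0.0   # initial molecule numbers


def simulate(t_end=40.0, dt=0.001):
    """Forward-Euler integration of the two-compartment number equations."""
    n1, n2 = N1_0, N2_0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Net number of molecules entering compartment 1 during this step.
        flow = (-K12 * n1 + K21 * n2) * dt
        n1, n2 = n1 + flow, n2 - flow
    return n1, n2


n1_end, n2_end = simulate()
print("total number:       ", N1_0 + N2_0, "->", n1_end + n2_end)
print("total concentration:", N1_0 / V1 + N2_0 / V2, "->",
      n1_end / V1 + n2_end / V2)
```

With these assumed values the total number stays at 1000 while the summed concentration changes substantially, because the two compartments have different volumes.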


Diffusion and Advection 

If the distribution of molecules inside any single compartment is spatially heterogeneous, then models must account for this spatial distribution. At the continuum level, this is done using reaction-diffusion equations. The basic assumption is a conservation principle expressed as a continuity equation:

$$\frac{\partial \rho}{\partial t} + \nabla \cdot j = f,$$

which relates the changes in the density ($\rho$) of a conserved quantity (in our case, the concentration of a species: $\rho = C$) to the flux $j$ and any net production $f$. In biological networks, the latter represents the net effect of all the reactions that affect the concentration of the species including binding, unbinding, production, degradation, post-translational modifications, etc.

In biological models, the flux term usually comes from one of two sources: diffusion or advection. According to Fick's law, diffusive flux is proportional to the negative gradient of the concentration of the species as particles move from regions of high concentration to regions of low concentration. The coefficient of proportionality is the diffusion coefficient, $D$:

$$j_{\mathrm{diff}} = -D \nabla C.$$

Fick's law describes thermally driven Brownian motion of molecules at the continuum level. If the species is embedded in a moving field, then the flux is proportional to the velocity of the underlying fluid. In this case, we have advective flow:

$$j_{\mathrm{adv}} = vC.$$

In biological systems, advection can arise because of the movement of the cytoplasm, but it can also represent directed transport of molecules, such as the movement of cargo along filaments by processive motors. In general, molecules exhibit both diffusive and advective motion: $j = j_{\mathrm{diff}} + j_{\mathrm{adv}}$, leading to

$$\frac{\partial C}{\partial t} + \nabla \cdot (-D \nabla C + vC) = f,$$

which, under the assumption that the diffusion coefficient and the transport velocity are independent of spatial location, leads to the reaction-diffusion-advection equation:

$$\frac{\partial C}{\partial t} = D \nabla^2 C - v \cdot \nabla C + f.$$

Because this is a second-order partial differential equation, its solution requires an initial condition and two boundary conditions. Common choices for the latter include periodic (e.g., in models of closed boundaries) or no-flux (to describe the impermeability of membranes) assumptions.
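As an illustration of how an initial condition and no-flux boundary conditions enter a computation, here is a minimal sketch of an explicit finite-difference scheme for the one-dimensional case without advection, with a linear decay term standing in for the reaction ($f = -kC$). The discretization, grid, and parameter values are assumptions made for this example, not a method prescribed by the entry.

```python
def diffuse_step(c, d_coef, k_deg, dx, dt):
    """One explicit step of dC/dt = D d2C/dx2 - k C with no-flux ends."""
    n = len(c)
    new = [0.0] * n
    for i in range(n):
        # Mirrored ghost cells impose a zero gradient (no flux) at both ends.
        left = c[i - 1] if i > 0 else c[i]
        right = c[i + 1] if i < n - 1 else c[i]
        laplacian = (left - 2.0 * c[i] + right) / (dx * dx)
        new[i] = c[i] + dt * (d_coef * laplacian - k_deg * c[i])
    return new


# Hypothetical values; D*dt/dx^2 = 0.2 keeps the explicit scheme stable.
D_COEF = 1.0    # um^2/s
K_DEG = 0.0     # 1/s (no reaction, so total mass should be conserved)
DX = 0.5        # um
DT = 0.05       # s
N_CELLS = 20    # domain length 10 um

conc = [0.0] * N_CELLS
conc[0] = 20.0  # initial condition: all molecules at the left end
initial_mass = sum(conc) * DX
for _ in range(2000):  # 100 s of simulated time
    conc = diffuse_step(conc, D_COEF, K_DEG, DX, DT)
final_mass = sum(conc) * DX

print("mass:", initial_mass, "->", final_mass)
print("spread (max - min):", max(conc) - min(conc))
```

With no reaction term the no-flux boundaries keep the total amount constant while the profile relaxes toward a uniform concentration. Setting `K_DEG` to a positive value adds first-order decay.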

Measuring Diffusion Coefficients 

Invariably, solving the reaction-diffusion equation requires knowledge of the diffusion coefficient of the molecule. Experimentally, this can be measured in a number of ways. In fluorescence recovery after photobleaching (FRAP), a laser is used to photobleach normally fluorescent molecules in a specific area of the cell. As these “dark” molecules are replaced by fluorescent molecules from non-bleached areas, the fluorescence intensity of the bleached area recovers. Higher diffusion leads to faster recovery. The time to half recovery, $\tau_{1/2}$, can be used to estimate $D$. If recovery occurs by lateral diffusion, then

$$D = \frac{r_0^2\, \gamma}{4 \tau_{1/2}},$$

where $r_0$ is the $1/e^2$ radius of the Gaussian profile laser beam and $\gamma$ is a parameter that depends on the extent of photobleaching, which ranges from 1 to 1.2 (Chen et al. 2006).

These days, it is increasingly common to measure lateral diffusion coefficients by observing the trajectory of single molecules. A molecule with diffusion coefficient $D$ undergoing Brownian motion in a two-dimensional environment is expected to have mean-square displacement (MSD) equal to

$$\langle r^2 \rangle = 4Dt.$$

Thus, the coefficient $D$ can be obtained by measuring how the MSD changes as a function of the time interval $t$. This method can also show if the molecule is undergoing advection, in which case

$$\langle r^2 \rangle = 4Dt + v^2 t^2.$$

This super-diffusive behavior can be seen in the upward curvature of the plot of $\langle r^2 \rangle$ against $t$. This plot will also reveal barriers to diffusion. For example, if the molecule is confined to move in a circular region of radius $a$, then, as $t$ increases, $\langle r^2 \rangle$ cannot exceed $a^2$.
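The MSD-based estimate of $D$ can be sketched with simulated tracks. The example below generates two-dimensional Brownian trajectories with a known diffusion coefficient and recovers it from the slope of the mean-square displacement, using the relation $\langle r^2 \rangle = 4Dt$. The diffusion coefficient, frame interval, and track counts are hypothetical, and the least-squares fit through the origin is one simple choice among several.

```python
import math
import random

# Hypothetical values: a slowly diffusing membrane protein imaged at 10 Hz.
D_TRUE = 0.05    # um^2/s
DT = 0.1         # s between frames
N_STEPS = 40
N_TRACKS = 4000


def simulate_msd(d_coef, dt, n_steps, n_tracks, seed=0):
    """Ensemble mean-square displacement of 2D Brownian tracks from the origin."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * d_coef * dt)  # per-axis standard deviation per step
    sums = [0.0] * (n_steps + 1)
    for _ in range(n_tracks):
        x = y = 0.0
        for step in range(1, n_steps + 1):
            x += rng.gauss(0.0, sigma)
            y += rng.gauss(0.0, sigma)
            sums[step] += x * x + y * y
    return [s / n_tracks for s in sums]


msd = simulate_msd(D_TRUE, DT, N_STEPS, N_TRACKS)
times = [i * DT for i in range(N_STEPS + 1)]

# Least-squares slope of a line through the origin, <r^2> = slope * t.
slope = sum(t * m for t, m in zip(times, msd)) / sum(t * t for t in times)
d_est = slope / 4.0
print(f"true D = {D_TRUE} um^2/s, estimated D = {d_est:.4f} um^2/s")
```

Adding a constant drift to each step would introduce the $v^2 t^2$ term and bend the MSD curve upward, while confining the walk to a bounded region would make it saturate.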

Both these methods work best for molecules diffusing on a membrane. For molecules diffusing in the cytoplasm, the three-dimensional imaging required is considerably more difficult, particularly since the diffusion of particles in the cytoplasm ($D \sim 1\text{-}10\ \mu\mathrm{m}^2\,\mathrm{s}^{-1}$) is usually orders of magnitude greater than for membrane-bound proteins ($D \sim 0.01\text{-}0.1\ \mu\mathrm{m}^2\,\mathrm{s}^{-1}$). In this case, an analytical expression can be used to estimate the diffusion coefficient. The diffusion coefficient of a spherical particle of radius $r$ moving in a low Reynolds number liquid with viscosity $\eta$ is given by the Stokes-Einstein equation:

$$D = \frac{k_B T}{6 \pi \eta r}.$$

The exact viscosity of the cell is unknown, but estimates that $\eta$ is approximately five times that of water lead to diffusion coefficients of cytoplasmic proteins that match those measured using FRAP.
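As a rough numerical illustration of the Stokes-Einstein relation $D = k_B T / (6 \pi \eta r)$, the sketch below evaluates it for a cytoplasmic viscosity of five times that of water, as suggested above. The protein radius, the temperature, and the viscosity of water used here are assumed values for illustration only.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def stokes_einstein(radius_m, viscosity_pa_s, temperature_k=298.0):
    """Diffusion coefficient (m^2/s) of a sphere in a viscous liquid."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * radius_m)


ETA_WATER = 0.89e-3          # Pa s, water near 25 C (assumed)
eta_cyto = 5.0 * ETA_WATER   # cytoplasm taken as five times water
radius = 3.0e-9              # hypothetical protein radius, m

d_m2 = stokes_einstein(radius, eta_cyto)
d_um2 = d_m2 * 1e12          # convert m^2/s to um^2/s
print(f"D = {d_um2:.1f} um^2/s")
```

Under these assumptions the result is on the order of 10 um^2/s; because $D$ scales inversely with both radius and viscosity, doubling either one halves the estimate.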

Diffusion-Limited Reaction Rates 

Even in compartments that are considered well stirred, the diffusion of molecules is necessary for reactions to take place. In particular, before two molecules can react, they must come together. To see how diffusion influences this, suppose that spherical molecules of species A and B with radii $r_A$ and $r_B$, respectively, come together to form a complex AB at a rate $k_d$. This rate represents the likelihood that molecules of A and B collide at random and hence will depend on the diffusion properties of the two species. The molecules in this complex can dissociate at rate $k'_d$ or can be converted to species C at rate $k_r$. Thus, the overall reaction involves two steps:

$$A + B \underset{k'_d}{\overset{k_d}{\rightleftharpoons}} AB, \qquad AB \overset{k_r}{\longrightarrow} C.$$

Assuming that the system is at quasi-steady-state, that is, the concentration of AB is constant, the effective rate of production of C is given by

$$k_{\mathrm{eff}} = \frac{k_d k_r}{k'_d + k_r}.$$

There are two regions of operation. If $k'_d \gg k_r$, then $k_{\mathrm{eff}} \approx k_r (k_d / k'_d)$. In this case production is said to be reaction limited. If $k'_d \ll k_r$, then $k_{\mathrm{eff}} \approx k_d$ and production is diffusion limited. In this case, it is possible to find $k_d$ as a function of the species' diffusion coefficients.
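The two regimes can be checked numerically from the quasi-steady-state expression $k_{\mathrm{eff}} = k_d k_r / (k'_d + k_r)$, where $k_d$ is the association rate, $k'_d$ the dissociation rate of the complex, and $k_r$ the conversion rate. The rate values in this sketch are arbitrary and serve only to put the system clearly in one regime or the other.

```python
def k_eff(k_d, k_d_rev, k_r):
    """Effective production rate under the quasi-steady-state assumption."""
    return k_d * k_r / (k_d_rev + k_r)


# Arbitrary illustrative rates (consistent but unspecified units).
K_D_FWD = 1000.0

# Reaction limited: the complex dissociates much faster than it reacts.
reaction_limited = k_eff(K_D_FWD, k_d_rev=1.0e4, k_r=1.0)
print(reaction_limited, "vs k_r * k_d / k'_d =", 1.0 * K_D_FWD / 1.0e4)

# Diffusion limited: the complex reacts much faster than it dissociates.
diffusion_limited = k_eff(K_D_FWD, k_d_rev=1.0, k_r=1.0e6)
print(diffusion_limited, "vs k_d =", K_D_FWD)
```

In the first case the effective rate is set by the chemistry and the equilibrium constant of complex formation; in the second, every encounter leads to product, so the encounter rate alone determines the outcome.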

Assume that species A is stationary, in which case the effective diffusion is the sum of the two diffusion coefficients: $D = D_A + D_B$. The concentration of species B depends on the distance away from molecules of A. Because we assume that the reaction rate is fast, at the point of contact ($r^* = r_A + r_B$) the concentration is zero since any molecules of AB are quickly converted to C. At the other extreme, as $r \to \infty$, the concentration approaches the bulk concentration $B_0$. According to Fick's law, this concentration gradient causes a flux density given by $j = -D(\partial B/\partial r)$. The total flux into a sphere of radius $r$ is then

$$J = 4\pi r^2 j = -4\pi D r^2 \frac{\partial B}{\partial r},$$

which, at steady state, is constant. Solving this equation for $B(r)$ using the two boundary conditions leads to a flux

$$J = -4\pi D B_0 r^*,$$

from which we have that

$$k_d = 4\pi D r^*.$$

A typical value for $k_d$, using the Einstein-Stokes formula and taking both molecules to have radius $r^*/2$, is

$$k_d = 4\pi \left(\frac{2 k_B T}{6\pi \eta (r^*/2)}\right) r^* = \frac{8 k_B T}{3 \eta} \approx 10^3\ \mu\mathrm{M}^{-1}\,\mathrm{s}^{-1}.$$

Spatial Patterns

The effect of spatial heterogeneities has long been of interest to developmental biologists, who study how spatial patterns arise. Two distinct models have been proposed to explain how this patterning can arise. Here we introduce these models and discuss their relative merits. Though usually seen as competing models, there is recent evidence suggesting that both models may play complementary roles during development (Reth et al. 2012).

Morphogen Gradients

A morphogen is a diffusible molecule that is produced or secreted at one end of an organism. Diffusion away from the localized source forms a concentration gradient along the spatial dimension. Morphogens are used to control gene expression of cells lying along this spatial domain. Thus, a morphogen gradient gives rise to spatially dependent expression profiles that can account for spatial developmental patterns (Rogers and Schier 2011).

The mathematics behind the formation of a morphogen gradient are relatively straightforward. The concentration of the morphogen is denoted by $C(x, t)$. There is a constant flux ($j_0$) at one end ($x = 0$) of a finite one-dimensional domain of length $L$, but the morphogen cannot exit at the other end. The species diffuses inside the domain and also decays at a rate proportional to its concentration ($f = -kC$). Thus, the concentration is governed by the reaction-diffusion equation:

$$\frac{\partial C}{\partial t} = D \frac{\partial^2 C}{\partial x^2} - kC,$$

with boundary conditions: $D \frac{\partial C}{\partial x} = -j_0$ at $x = 0$, and $D \frac{\partial C}{\partial x} = 0$ at $x = L$. We focus on the steady state:

$$\frac{d^2 C}{dx^2} = \frac{k}{D}\, C,$$

so that the initial condition is not important. In this case, the distribution of the species is given by

$$C(x) = \frac{\lambda j_0}{D}\, \frac{\cosh\left([L - x]/\lambda\right)}{\sinh\left(L/\lambda\right)}.$$

Thus, the shape of the gradient is roughly exponential with parameter $\lambda = \sqrt{D/k}$, known as the dispersion, which specifies the average distance that molecules diffuse into the domain before they are degraded or inactivated. Equally important in determining the gradient, however, is the spatial dimension ($L$) relative to the dispersion, $\Phi = L/\lambda$, a ratio known as the Thiele modulus. If $\Phi \ll 1$, then the concentration will be approximately homogeneous. Alternatively, $\Phi \gg 1$ leads to a sharp transition close to the boundary where there is flux and a relatively flat concentration thereafter.
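The effect of the Thiele modulus can be illustrated by evaluating the steady-state profile $C(x) = (\lambda j_0 / D) \cosh([L - x]/\lambda) / \sinh(L/\lambda)$, with $\lambda = \sqrt{D/k}$, for a short and a long domain. The parameter values below are hypothetical, and the finite-difference check of the flux boundary condition at $x = 0$ is included only as a sanity test of the formula.

```python
import math


def morphogen_profile(x, length, d_coef, k_deg, j0):
    """Steady-state concentration with influx j0 at x = 0 and no flux at x = L."""
    lam = math.sqrt(d_coef / k_deg)  # dispersion
    return (lam * j0 / d_coef) * math.cosh((length - x) / lam) / math.sinh(length / lam)


# Hypothetical parameters: dispersion lambda = sqrt(D/k) = 10 um.
D_COEF = 1.0    # um^2/s
K_DEG = 0.01    # 1/s
J0 = 1.0        # influx at x = 0 (arbitrary units)

ratios = {}
for length in (2.0, 100.0):  # Thiele modulus 0.2 and 10
    c_source = morphogen_profile(0.0, length, D_COEF, K_DEG, J0)
    c_far = morphogen_profile(length, length, D_COEF, K_DEG, J0)
    ratios[length] = c_far / c_source
    print(f"L = {length:5.1f} um, Phi = {length / 10.0:4.1f}, "
          f"C(L)/C(0) = {ratios[length]:.2e}")

# Sanity check of the boundary condition D dC/dx = -j0 at x = 0.
h = 1.0e-4
flux_in = -D_COEF * (morphogen_profile(h, 2.0, D_COEF, K_DEG, J0)
                     - morphogen_profile(0.0, 2.0, D_COEF, K_DEG, J0)) / h
print("numerical influx at x = 0:", flux_in)
```

For the short domain the far-end concentration stays within a few percent of the source value, while for the long domain it drops by several orders of magnitude, matching the two limits described above.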

Though morphogen gradients are commonly used to describe signaling during development, where the gradient can extend across a number of cells, the mathematics described above are equally suitable for describing concentration gradients of intracellular proteins. In this case, the dimension of the cell has a significant effect on the shape of the gradient (Meyers et al. 2006).

As discussed above, morphogen gradients are established in an open-loop mode. As such, the actual concentration experienced at a point downstream of the source of the morphogen will vary depending on a number of parameters, including the flux $j_0$ and the rate of degradation $k$. Moreover, because the concentration of the morphogen decreases as the distance from the source grows, the relative stochastic fluctuations will increase. How to manage this uncertainty is an active area of research (Rogers and Schier 2011; Lander 2013).

Diffusion-Driven Instabilities 

In 1952, Alan Turing proposed a model of how patterns could arise in biological systems (Turing 1952). His interest was in explaining how an embryo, initially spherical, could give rise to a highly asymmetric organism. He posited that the breaking of symmetry could be a result of the change in the stability of the homogeneous state of the network which would amplify small fluctuations inherent in the initial symmetry. Turing sought to explain how these instabilities could arise using only reaction-diffusion systems.

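To illustrate how diffusion-driven instabilities can arise, we work with a single two-species linear reaction network:

$$\frac{\partial}{\partial t}\begin{bmatrix} C_1 \\ C_2 \end{bmatrix} = A \begin{bmatrix} C_1 \\ C_2 \end{bmatrix} + D\, \frac{\partial^2}{\partial x^2}\begin{bmatrix} C_1 \\ C_2 \end{bmatrix},$$

where $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ specifies the reaction terms and the diagonal matrix $D = \begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix}$ the diffusion coefficients.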

We assume that, in the absence of diffusion, the system is stable, so that $\det(A) > 0$ and $\mathrm{trace}(A) < 0$. When considering diffusion in a one-dimensional environment of length $L$, we must consider the spatial modes, which are of the form $\exp(iqx)$. In this case, stability of the system requires that $\mathrm{trace}(A - q^2 D) < 0$ and $\det(A - q^2 D) > 0$. The former is always true, since $\mathrm{trace}(A - q^2 D) = \mathrm{trace}(A) - q^2 (D_1 + D_2) < \mathrm{trace}(A) < 0$. However, the condition on the determinant can fail since

$$\det(A - q^2 D) = D_1 D_2 q^4 - q^2 (a_{22} D_1 + a_{11} D_2) + \det(A). \tag{1}$$

Since $\det(A) > 0$, diffusion-driven instabilities can only occur if the term $a_{22} D_1 + a_{11} D_2 > 0$, from which it follows that at least one of $a_{11}$ or $a_{22}$ must be positive. Since $\mathrm{trace}(A) < 0$, it follows that the diagonal terms must have opposite sign. Usually, it is assumed that $a_{11} > 0$ and that $a_{22} < 0$. Since $\det(A) > 0$, it follows that $a_{12}$ and $a_{21}$ must also have opposite sign.

These requirements on the sign pattern of the two molecules lead to one of two classes of systems. In the first class, known as activator/inhibitor systems, the activator (assume species 1) is autocatalytic ($a_{11} > 0$) and also stimulates the inhibitor ($a_{21} > 0$), which negatively regulates the activator ($a_{12} < 0$). In the other class, known as substrate-depletion systems, a product (species 1) is autocatalytic ($a_{11} > 0$), but in its production consumes ($a_{21} < 0$) the substrate (species 2) whose presence is needed for formation of the product ($a_{12} > 0$). Note that both systems involve an autocatalytic positive feedback loop ($a_{11} > 0$), as well as a negative feedback loop involving both species ($a_{12} a_{21} < 0$).

The stability condition also imposes a necessary condition on the dispersion of the two species ($\lambda_i = \sqrt{D_i / |a_{ii}|}$), since

$$a_{22} D_1 + a_{11} D_2 > 0 \;\Rightarrow\; -\lambda_1^2 + \lambda_2^2 > 0.$$

Thus, the species providing the negative feedback (inhibitor or substrate) must have higher dispersion ($\lambda_2 > \lambda_1$). This requirement is usually referred to as local activation and long-range inhibition.

These conditions are necessary, but not sufficient. They ensure that the parabola defined by Eq. 1 has real roots. However, when diffusion takes place in finite domains, the parameter $q$ can only take discrete values $q = 2\pi n / L$ for integers $n$. Thus, for a spatial mode to be unstable, it must be that $\det(A - q^2 D) < 0$ at specific values of $q$ corresponding to integers $n$. If the dimension of the domain is changing, as would be expected in a growing domain, the parameter $q^2$ will decrease over time, suggesting that higher modes may lose stability. Thus, the nature of the pattern may evolve over time.
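The role of the discrete modes can be made concrete with a small numerical check. The sketch below evaluates $\det(A - q^2 D) = D_1 D_2 q^4 - q^2 (a_{22} D_1 + a_{11} D_2) + \det(A)$ at the admissible wavenumbers $q = 2\pi n / L$ and lists the integers $n$ for which it is negative. The activator/inhibitor matrix, the diffusion coefficients, and the domain lengths are hypothetical values chosen to satisfy the sign conditions discussed above.

```python
import math


def det_mode(a, d1, d2, q):
    """det(A - q^2 D) for a 2x2 reaction matrix and diagonal diffusion matrix."""
    (a11, a12), (a21, a22) = a
    q2 = q * q
    return d1 * d2 * q2 * q2 - q2 * (a22 * d1 + a11 * d2) + (a11 * a22 - a12 * a21)


def unstable_modes(a, d1, d2, length, n_max=50):
    """Integers n whose wavenumber q = 2 pi n / L gives a negative determinant."""
    return [n for n in range(1, n_max + 1)
            if det_mode(a, d1, d2, 2.0 * math.pi * n / length) < 0.0]


# Hypothetical activator/inhibitor system: trace(A) = -3 < 0, det(A) = 2 > 0.
A = [[1.0, -2.0],
     [3.0, -4.0]]

print("slow inhibitor, L = 20:", unstable_modes(A, d1=1.0, d2=1.0, length=20.0))
print("fast inhibitor, L = 20:", unstable_modes(A, d1=1.0, d2=40.0, length=20.0))
print("fast inhibitor, L = 5: ", unstable_modes(A, d1=1.0, d2=40.0, length=5.0))
```

With equal diffusion coefficients no mode is unstable. With a much faster inhibitor, modes 1 and 2 become unstable on the longer domain, while the shorter domain admits no wavenumber inside the unstable band, illustrating why domain size matters.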

Over the years, Turing's framework has been a popular model among theoretical biologists and has been used to explain countless patterns seen in biological systems. It has not had the same level of acceptance among biologists, likely because of the difficulty of mapping a complex biological system involving numerous interacting species into the simple nature of the theoretical model (Kondo and Miura 2010).

Summary and Future Directions 

Spatial aspects of biochemical signaling are increasingly playing a role in the study of cellular signaling systems. Part of this interest is the desire to explain spatial patterns seen in subcellular localizations observed through live cell imaging using fluorescently tagged proteins. The ever-increasing computational power available for simulations is also facilitating this progress. Specially built spatial simulation software, such as the Virtual Cell, is freely available and tailor-made for biological simulations, enabling simulation of spatially varying reaction networks in cells of varying size and shape (Cowan et al. 2012).

Of course, cell shapes are not static, but evolve 
in large part due to the effect of the underlying 
biochemical system. This requires simulation 
environments that solve reaction-diffusion 
systems in changing morphologies. This has 
received considerable interest in modeling cell 
motility (Holmes and Edelstein-Keshet 2013). 

Another aspect of spatial models that is only now being addressed is the role of mechanics in driving spatially dependent models. For example, it has recently been shown that the interaction between biochemistry and biomechanics can itself drive Turing-like instabilities (Goehring and Grill 2013).

Finally, we note that our discussion of spatially heterogeneous signaling has been based on continuum models. As with spatially invariant systems, this approach is only valid if the number of molecules is sufficiently large that the stochastic nature of the chemical reactions can be ignored. In fact, spatial heterogeneities may lead to localized spots requiring a stochastic approach, even though the molecule numbers are such that a continuum approach would be acceptable if the cell were spatially homogeneous. The analysis of stochastic interactions in these systems is still very much in its infancy and is likely to be an increasingly important area of research (Mahmutovic et al. 2012).


Cross-References 

► Deterministic Description of Biochemical Networks

► Monotone Systems in Biology 

► Robustness Analysis of Biological Models 

► Stochastic Description of Biochemical 
Networks 
Abstract

For more than half a century, spectral factorization
has been encountered in various fields of science
and engineering. It is a useful tool in robust
and optimal control, filtering, and many other
areas. It is also an elegant control-theoretic concept
closely related to the Riccati equation. As a quadratic
equation in polynomials, it is a challenging algebraic
task.
Polynomial Spectral Factorization

As a mathematical tool, spectral factorization
was invented by Wiener in the 1940s to find
a frequency domain solution of optimal filtering
problems. Since then, this technique has found
countless applications in system, network,
and communication theory, robust and optimal
control, filtering, prediction, and state reconstruction.
Spectral factorization of scalar polynomials
is naturally encountered in the area of single-input
single-output systems.

In the context of continuous-time problems,
real polynomials in a single complex variable $s$
are typically used. For such a polynomial $p(s)$,
its adjoint $p^*(s)$ is defined by

$$ p^*(s) = p(-s), \qquad (1) $$

which results in flipping all roots across the imaginary
axis. If the polynomial is symmetric, then
$p^*(s) = p(s)$ and its roots are symmetrically
placed about the imaginary axis.

The symmetric spectral factorization problem
is now formulated as follows: Given a symmetric
polynomial $b(s)$,

$$ b^*(s) = b(s), \qquad (2) $$

that is also positive on the imaginary axis,

$$ b(i\omega) > 0 \quad \text{for all real } \omega, \qquad (3) $$

find a real polynomial $x(s)$, which satisfies

$$ x(s)x^*(s) = b(s) \qquad (4) $$

as well as

$$ x(s) \ne 0, \quad \mathrm{Re}\, s \ge 0. \qquad (5) $$

Such an $x(s)$ is then called a spectral factor of
$b(s)$. By (5), the spectral factor is a stable polynomial
in the continuous-time (Hurwitz) sense.

Obviously, (4) is a quadratic equation in polynomials
and its stable solution is the desired
spectral factor.

Example 1 Given

$$ b(s) = 4 + s^4 = (1 + j + s)(1 - j + s)(1 + j - s)(1 - j - s), $$

(4) results in the spectral factor

$$ x(s) = 2 + 2s + s^2 = (1 + j + s)(1 - j + s). $$
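The factorization in Example 1 can be reproduced numerically by the root-selection idea behind (4) and (5). The following is a minimal sketch, assuming Example 1 reads $b(s) = 4 + s^4$ and that $b(s)$ has no roots on the imaginary axis; the function names `spectral_factor_ct` and `adjoint_ct` are illustrative and do not come from any toolbox.

```python
import numpy as np

def spectral_factor_ct(b_desc):
    """Spectral factor x(s) of a symmetric b(s) that is positive on the imaginary axis.

    b_desc: real coefficients of b(s), highest power first.
    Returns the coefficients of x(s), highest power first.
    """
    b = np.asarray(b_desc, dtype=float)
    roots = np.roots(b)
    # Roots come in mirror pairs (r, -r); keep those in the open left half plane.
    stable = roots[roots.real < 0]
    monic = np.real(np.poly(stable))
    # The leading coefficient of x(s) x(-s) is (-1)^n x_n^2, so |b_2n| = x_n^2.
    return np.sqrt(abs(b[0])) * monic

def adjoint_ct(x_desc):
    """Coefficients of x*(s) = x(-s), highest power first."""
    x = np.asarray(x_desc, dtype=float)
    n = len(x) - 1
    return x * (-1.0) ** np.arange(n, -1, -1)

x = spectral_factor_ct([1, 0, 0, 0, 4])      # b(s) = s^4 + 4
product = np.polymul(x, adjoint_ct(x))       # x(s) x*(s), should reproduce b(s)
print(x)        # approximately [1, 2, 2], i.e. s^2 + 2s + 2
print(product)  # approximately [1, 0, 0, 0, 4]
```

The sign of the returned factor is fixed here by taking a positive leading coefficient; the negated polynomial is an equally valid spectral factor.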

When the right-hand side polynomial $b(s)$
has some imaginary-axis roots, the problem
formulated strictly as above becomes unsolvable
since (3) does not hold and hence (5) cannot be
fulfilled. A more relaxed formulation may then
be used, requiring only $b(i\omega) \ge 0$ instead
of (3) and $x(s) \ne 0$ only for $\mathrm{Re}\, s > 0$ instead
of (5). Clearly, the imaginary-axis roots of $b(s)$
must then appear in $x(s)$ and $x^*(s)$ as well.
In the realm of discrete-time problems, one
usually encounters two-sided polynomials, which
are polynomial-like objects with positive and/or
negative powers of a complex variable $z$, such as,
for example, $p(z) = z^{-1} + 1 + 2z$. (In fact, one can stay
with standard one-sided polynomials, either in
nonnegative or in nonpositive powers only, if
every adjoint $p^*(z)$ is multiplied by a proper power
of $z$ to create a one-sided polynomial $\bar{p}(z) = p^*(z) z^n$.)
Here, the adjoint $p^*(z)$ stands simply for

$$ p^*(z) = p(z^{-1}) \qquad (6) $$

and the operation results in flipping all roots
across the unit circle. If the two-sided polynomial
is symmetric, then $p^*(z) = p(z)$ and its roots are
symmetrically placed about the unit circle.

In its discrete-time version, the spectral factorization
problem is stated as follows: Given a
two-sided polynomial $b(z)$ that meets
the conditions of symmetry

$$ b^*(z) = b(z) \qquad (7) $$

and positiveness (here on the unit circle)

$$ b(e^{i\omega}) > 0, \quad \text{real } \omega, \ -\pi < \omega \le \pi, \qquad (8) $$

find a real polynomial $x(z)$ in nonnegative powers
of $z$ to satisfy

$$ x(z)x^*(z) = b(z) \qquad (9) $$

and

$$ x(z) \ne 0, \quad |z| \ge 1. \qquad (10) $$

By (10), the spectral factor is a stable polynomial
in the discrete-time (Schur) sense.

Example 2 For

$$ b(z) = 2z^{-2} + 6z^{-1} + 9 + 6z + 2z^2 $$
$$ \quad = 2z^{-2}(z + 0.5 + 0.5j)(z + 0.5 - 0.5j)(z + 1 + j)(z + 1 - j) $$
$$ \quad = 4(z + 0.5 + 0.5j)(z + 0.5 - 0.5j)(z^{-1} + 0.5 + 0.5j)(z^{-1} + 0.5 - 0.5j) $$

(9) yields

$$ x(z) = 1 + 2z + 2z^2 = 2(z + 0.5 + 0.5j)(z + 0.5 - 0.5j) $$

as the desired spectral factor.
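The discrete-time case can be checked in the same way. The sketch below assumes Example 2 uses $b(z) = 2z^{-2} + 6z^{-1} + 9 + 6z + 2z^2$ and that $b(z)$ has no roots on the unit circle; the helper name `spectral_factor_dt` is hypothetical. It keeps the roots inside the unit circle and fixes the gain from $b(1) = x(1)^2$.

```python
import numpy as np

def spectral_factor_dt(b_two_sided):
    """Spectral factor of a symmetric two-sided polynomial b(z).

    b_two_sided: coefficients of z^-n, ..., z^0, ..., z^n (a palindromic list).
    Returns the coefficients of x(z) for z^0, z^1, ..., z^n.
    """
    b = np.asarray(b_two_sided, dtype=float)
    # z^n b(z) is an ordinary polynomial of degree 2n with the same nonzero roots.
    roots = np.roots(b)
    inside = roots[np.abs(roots) < 1]
    monic = np.real(np.poly(inside))          # highest power first
    # Since x*(1) = x(1), the value b(1) = x(1)^2 determines the gain.
    gain = np.sqrt(b.sum()) / abs(monic.sum())
    return (gain * monic)[::-1]

x = spectral_factor_dt([2, 6, 9, 6, 2])
# Coefficients of x(z) x(1/z) for z^-n ... z^n: convolve x with its reversal.
b_check = np.convolve(x, x[::-1])
print(x)        # approximately [1, 2, 2], i.e. 1 + 2z + 2z^2
print(b_check)  # approximately [2, 6, 9, 6, 2]
```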

When the right-hand side $b(z)$ possesses some
roots on the unit circle, this problem turns out
to be unsolvable as (8) fails. If necessary, a
less restrictive formulation can then be applied,
replacing (8) by $b(e^{i\omega}) \ge 0$ and requiring $x(z) \ne 0$
only for $|z| > 1$ instead of (10). Clearly, the unit-circle
roots of $b(z)$ must then appear both in $x(z)$
and $x^*(z)$.

When formulated as above, the spectral factorization
problem is always solvable and its
solution is unique up to the change of sign (if $x$ is
a solution, so is $-x$ and no other solutions exist).

Polynomial Matrix Spectral 
Factorization 

The matrix version of the problem has been encountered
since the 1960s. In the world of continuous-time
problems, real polynomial matrices in a
single complex variable $s$ are used. For such a
real polynomial matrix $P(s)$, its adjoint $P^*(s)$ is
defined as

$$ P^*(s) = P^T(-s). \qquad (11) $$

A polynomial matrix $P(s)$ is symmetric or, more
precisely, para-Hermitian, if $P^*(s) = P(s)$.
Needless to say, only square polynomial matrices
can be symmetric.

The matrix spectral factorization problem is
defined as follows: Given a symmetric polynomial
matrix $B(s)$,

$$ B^*(s) = B(s), \qquad (12) $$

that is also positive definite on the imaginary axis,

$$ B(i\omega) > 0 \quad \text{for all real } \omega, \qquad (13) $$

find a square real polynomial matrix $X(s)$, which
satisfies

$$ X(s)X^*(s) = B(s) \qquad (14) $$
and has no zeros in the closed right half plane
$\mathrm{Re}\, s \ge 0$. Such an $X(s)$ is then called a left
spectral factor of $B(s)$. A right spectral factor
$Y(s)$ is defined similarly by replacing (14) with

$$ Y^*(s)Y(s) = B(s). \qquad (15) $$

Example 3 For a symmetric matrix

$$ B(s) = \begin{bmatrix} 2 - s^2 & -2 - s \\ -2 + s & 4 - s^2 \end{bmatrix} $$

we have

$$ X(s) = \begin{bmatrix} 1.4 + s & -0.2 \\ -1.2 & 1.6 + s \end{bmatrix} $$

as a left spectral factor and

$$ Y(s) = \begin{bmatrix} 1 + s & 0 \\ -1 & 2 + s \end{bmatrix} $$

as a right one.

As in the scalar case, less restrictive definitions
are sometimes used where the given right-hand
side matrix $B(s)$ is only nonnegative definite on
the imaginary axis and so the spectral factor is
free of zeros in the open right half plane $\mathrm{Re}\, s > 0$
only.

In the discrete-time setting, two-sided
real polynomial matrices $P(z)$ are used, having
in general entries with both positive and negative
powers of the complex variable $z$. For such a
matrix, its adjoint $P^*(z)$ is defined by

$$ P^*(z) = P^T(z^{-1}). \qquad (16) $$

Clearly, if $P(z)$ has only nonnegative powers of $z$,
then $P^*(z)$ has only nonpositive powers of $z$ and
vice versa. A square two-sided polynomial matrix
$P(z)$ is (para-Hermitian) symmetric if $P^*(z) = P(z)$.

Here is the discrete-time version of the matrix
spectral factorization problem. Given a two-sided
polynomial matrix $B(z)$ that is symmetric

$$ B^*(z) = B(z) \qquad (17) $$

and positive definite on the unit circle

$$ B(e^{i\omega}) > 0, \quad \text{real } \omega, \ -\pi < \omega \le \pi, \qquad (18) $$

find a real polynomial matrix $X(z)$ in nonnegative
powers of $z$ such that

$$ X(z)X^*(z) = B(z) \qquad (19) $$

and $X(z)$ has no zeros on or outside the unit circle.
Such an $X(z)$ is then called a left spectral factor
of $B(z)$. A right spectral factor $Y(z)$ is defined similarly
by replacing (19) with

$$ Y^*(z)Y(z) = B(z). \qquad (20) $$

(The right and the left spectral
factor are sometimes called the factor and the
cofactor, respectively, but the terminology is not
settled.)

Example 4 A symmetric two-sided polynomial
matrix

$$ B(z) = \begin{bmatrix} -2z^{-1} + 5 - 2z & -1 + 2z^{-1} \\ -1 + 2z & 2z^{-1} + 6 + 2z \end{bmatrix} $$

has a left spectral factor (with rounded coefficients)

$$ X(z) = \begin{bmatrix} -1.1 + 1.9z & 0.55 \\ -0.8z & 0.95 + 2.1z \end{bmatrix} $$

and a right spectral factor

$$ Y(z) = \begin{bmatrix} 1 - 2z & -1 \\ 0 & 1 + 2z \end{bmatrix}. $$

As before, less restrictive formulations are sometimes
encountered where the given symmetric
$B(z)$ is only nonnegative definite on the unit
circle and so the spectral factor is only required
to have no zeros outside the unit circle.

When formulated as above, the matrix spectral
factorization problem is always solvable. The
spectral factors are unique up to an orthogonal
matrix multiple. That is, if $X$ and $X'$ are two left
spectral factors of $B$, then

$$ X' = XU \qquad (21) $$

where $U$ is a constant orthogonal matrix, $UU^T = I$,
while if $Y$ and $Y'$ are two right spectral factors
of $B$, then

$$ Y' = VY \qquad (22) $$

where $V$ is a constant orthogonal matrix,
$V^T V = I$.
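The defining identities (14) and (15) are polynomial identities, so they can be spot-checked by evaluating both sides at a few complex points. The sketch below assumes the matrices of Example 3 read as coded in `B`, `X`, and `Y` (the lower-left entry of `Y` is a reconstruction that makes (15) hold); it also illustrates that multiplying a left factor from the right by a constant orthogonal matrix gives another left factor.

```python
import numpy as np

def B(s):
    return np.array([[2 - s**2, -2 - s],
                     [-2 + s, 4 - s**2]])

def X(s):   # assumed left spectral factor: X(s) X*(s) = B(s)
    return np.array([[1.4 + s, -0.2],
                     [-1.2, 1.6 + s]])

def Y(s):   # assumed right spectral factor: Y*(s) Y(s) = B(s)
    return np.array([[1 + s, 0],
                     [-1, 2 + s]])

def adjoint(P, s):
    """P*(s) = P^T(-s) for a real polynomial matrix P."""
    return P(-s).T

theta = 0.7                                   # arbitrary rotation angle
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

points = [0.3j, 2.0j, -1.5 + 0.4j]
left_ok = all(np.allclose(X(s) @ adjoint(X, s), B(s)) for s in points)
right_ok = all(np.allclose(adjoint(Y, s) @ Y(s), B(s)) for s in points)
# (X U)* = U^T X*, so X U is again a left spectral factor.
rotated_ok = all(np.allclose((X(s) @ U) @ (U.T @ adjoint(X, s)), B(s))
                 for s in points)
print(left_ok, right_ok, rotated_ok)
```

A pointwise check like this cannot prove the identity, but a degree-two polynomial identity that holds at several distinct points is a strong sanity check on transcribed coefficients.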


J-Spectral Factorization

In robust control, game theory, and several other
fields, the symmetric right-hand side in the matrix
spectral factorization may have a general
signature. With such a right-hand side, standard
(positive or nonnegative definite) factorization
becomes impossible. Here, a similar yet different
J-spectral factorization takes its place.

In the context of continuous-time problems,
the J-spectral factorization problem is formulated
as follows. Given a symmetric polynomial matrix
$B(s)$,

$$ B^*(s) = B(s), \qquad (23) $$

find a square real polynomial matrix $X(s)$, which
satisfies

$$ X(s)JX^*(s) = B(s), \qquad (24) $$

where $X(s)$ has no zeros in the open right half
plane $\mathrm{Re}\, s > 0$ and $J$ is a signature matrix of the
form

$$ J = \begin{bmatrix} I_1 & 0 & 0 \\ 0 & -I_2 & 0 \\ 0 & 0 & 0 \end{bmatrix} \qquad (25) $$

with $I_1$ and $I_2$ unit matrices of not necessarily
the same dimensions. The bottom right block
of zeros is often missing, yet it is considered
here for generality. Such an $X(s)$ is called a
left J-spectral factor of $B(s)$. A right J-spectral
factor is defined by

$$ Y^*(s)JY(s) = B(s) \qquad (26) $$

instead of (24). For discrete-time problems, the
J-spectral factorization is defined analogously.

The J-spectral factorization problem is quite
general, having standard (either positive or
nonnegative) spectral factorization as a particular
case. No necessary and sufficient existence
conditions appear to be known for J-spectral
factorization. A sufficient condition by Jakubovic
(1970) states that the problem is solvable if the
multiplicity of the zeros on the imaginary axis
of each of the invariant polynomials of the
right-hand side matrix is even. In particular,
this condition is satisfied whenever $\det B(s)$
has no zeros on the imaginary axis. In turn,
the condition is violated if any of the invariant
factors is not factorable by itself. An example of
a nonfactorizable polynomial is $1 + s^2$.

The J-spectral factors are unique up to a J-orthogonal
matrix multiple. That is, if $X$ and $X'$
are two left J-spectral factors of $B$, then

$$ X' = XU, \qquad (27) $$

where $U$ is a J-orthogonal matrix, $UJU^T = J$,
while if $Y$ and $Y'$ are two right J-spectral factors
of $B$, then

$$ Y' = VY, \qquad (28) $$

where $V$ is a J-orthogonal matrix, $V^T JV = J$.

Example 5 For

$$ B(s) = \begin{bmatrix} 0 & 1 - s \\ 1 + s & 2 - s^2 \end{bmatrix} $$

the signature matrix reads

$$ J = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} $$

and the right J-spectral factor is

$$ Y(s) = \begin{bmatrix} 1 + s & \dfrac{3 - s^2}{2} \\ 1 + s & \dfrac{1 - s^2}{2} \end{bmatrix}. $$
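The right J-spectral factor of Example 5 can be verified against definition (26) by direct evaluation. This is a sketch under the assumption that the example reads $B(s) = [0,\ 1 - s;\ 1 + s,\ 2 - s^2]$, $J = \mathrm{diag}(1, -1)$, and $Y(s) = [1 + s,\ (3 - s^2)/2;\ 1 + s,\ (1 - s^2)/2]$; the helper `residual` is illustrative.

```python
import numpy as np

J = np.diag([1.0, -1.0])

def B(s):
    return np.array([[0, 1 - s],
                     [1 + s, 2 - s**2]])

def Y(s):   # assumed right J-spectral factor of B
    return np.array([[1 + s, (3 - s**2) / 2],
                     [1 + s, (1 - s**2) / 2]])

def residual(s):
    """Largest entry of |Y*(s) J Y(s) - B(s)|, with Y*(s) = Y^T(-s)."""
    return np.abs(Y(-s).T @ J @ Y(s) - B(s)).max()

worst = max(residual(s) for s in (0.5j, 3.0j, 1.0 + 2.0j))
# det Y(s) = -(1 + s): the only zero is s = -1, in the left half plane.
det_at_minus_one = np.linalg.det(Y(-1.0))
print(worst, det_at_minus_one)
```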


Nonsymmetric Spectral Factorization

Spectral factorization can also be non-symmetric.
For a scalar polynomial $p$ (either in $s$ or in $z$), this
means to factor it directly as

$$ p = p^+ p^- \qquad (29) $$

where $p^+$ is a stable factor of $p$ (having all its
roots either in the open left half plane or inside
of the unit disc, depending on the variable type)
while $p^-$ is the "remaining", that is, unstable factor.
Any roots of $p$ on the stability boundary
are assigned either to $p^+$ or to $p^-$, depending on the
application problem at hand.

For a matrix polynomial $P$, the non-symmetric
factorization is naturally twofold: either

$$ P = P^+ P^- \qquad (30) $$

or

$$ P = P^- P^+. \qquad (31) $$

For scalar polynomials, symmetric and non-symmetric
spectral factors are closely related.
Given $p$, compute a symmetric factor
$x$ for $pp^*$ as in (4) or (9) to get

$$ x^* x = p^* p. \qquad (32) $$

Then

$$ p^+ = \gcd(p, x) \quad \text{and} \quad p^- = \gcd(p, x^*), \qquad (33) $$

where gcd stands for a greatest common divisor.
In reverse,

$$ x = p^+ (p^-)^* \quad \text{and} \quad x^* = p^- (p^+)^*. \qquad (34) $$

Unfortunately, no such relations exist for the
matrix case.

Example 6 For example,

$$ p(s) = 1 - s^2 $$

factorizes into

$$ p^+(s) = 1 + s, \qquad p^-(s) = 1 - s, $$

while for

$$ P(s) = \begin{bmatrix} 1 + s & 0 \\ 1 + s^2 & 1 - s \end{bmatrix} $$

we have, in the form (31),

$$ P^-(s) = \begin{bmatrix} 1 & 1 \\ s & 1 \end{bmatrix}, \qquad P^+(s) = \begin{bmatrix} s & -1 \\ 1 & 1 \end{bmatrix}. $$


Algorithms and Software

Spectral factorization is a crucial step in the
solution of various control, estimation, filtering,
and other problems. It is no wonder that a variety
of methods has been developed over the
years for the computation of spectral factors. The
most popular ones are briefly mentioned here.
For details on particular algorithms, the reader is
referred to the papers recommended for further
reading.

Factor Extraction Method

If all roots of the right-hand side polynomial
are known, the factorization becomes trivial. Just
write the right-hand side as a product of first- and
second-order factors and then collect the stable
ones to create the stable factor. If the roots are
not known, one can first compute them numerically and
then proceed as above. Somewhat surprisingly, a
similar procedure can be used for the matrix case.
For every zero, a suitable matrix factor must be
extracted. For further details, see Callier (1985)
or Henrion and Sebek (2000).

Bauer's Algorithm

This procedure is an iterative scheme with a linear
rate of convergence. It relies on the equivalence between
polynomial spectral factorization and
the Cholesky factorization of a related infinite-dimensional
Toeplitz matrix. For further details,
see Youla and Kazanjian (1978).

Newton-Raphson Iterations

This is an iterative algorithm with a quadratic convergence
rate based on consecutive solutions of symmetric
linear polynomial Diophantine equations.
It is inspired by the classical Newton's method for
finding a root of a function. To learn more, see
Davis (1963), Jezek and Kucera (1985), and Vostry
(1975).

Factorization via Riccati Equation 

In state-space solutions of various problems, an algebraic
Riccati equation plays the role of spectral
factorization. It is therefore not surprising that the
spectral factor itself can be calculated directly by
solving a Riccati equation. For further information,
see, e.g., Sebek (1992).
Spectral Factorization


FFT Algorithm 

This is the most efficient and accurate procedure
for factorization of scalar polynomials with very
high degrees (on the order of hundreds or thousands).
Such polynomials appear in some special
problems of signal processing in advanced audio
applications involving inversions of dynamics of
loudspeakers or room acoustics. The algorithm
is based on the fact that the logarithm of a product
(such as the spectral factorization equation) turns
into a sum of logarithms of the individual factors. For
details, see Hromcik and Sebek (2007).
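The logarithm idea can be illustrated with the classical cepstral (Kolmogorov-type) construction for discrete-time scalar polynomials. The sketch below is a generic textbook variant, not the specific algorithm of the cited paper; the function name, the grid size `n_fft`, and the coefficient ordering are assumptions. It samples $b$ on the unit circle, takes the logarithm, keeps the one-sided half of the resulting Fourier coefficients, and exponentiates.

```python
import numpy as np

def spectral_factor_cepstral(b_two_sided, n_fft=4096):
    """Approximate spectral factor x(z) of a symmetric two-sided polynomial b(z).

    b_two_sided: coefficients of z^-n ... z^n, with b positive on the unit circle.
    Returns the coefficients of x(z) for z^0 ... z^n (roots inside the unit circle).
    """
    b = np.asarray(b_two_sided, dtype=float)
    n = (len(b) - 1) // 2
    w = 2 * np.pi * np.arange(n_fft) / n_fft
    k = np.arange(1, n + 1)
    # b(e^{iw}) = b_0 + 2 * sum_k b_k cos(k w) is real and positive.
    spectrum = b[n] + 2 * np.cos(np.outer(w, k)) @ b[n + 1:]
    cepstrum = np.fft.ifft(np.log(spectrum)).real
    # Split log b = log h + conj(log h): keep half of the zeroth term and the
    # coefficients with positive index.
    one_sided = np.zeros(n_fft)
    one_sided[0] = cepstrum[0] / 2
    one_sided[1:n_fft // 2] = cepstrum[1:n_fft // 2]
    h = np.fft.ifft(np.exp(np.fft.fft(one_sided))).real[:n + 1]
    # h holds the factor in powers of z^-1; x(z) = z^n h(z) reverses the order.
    return h[::-1]

x = spectral_factor_cepstral([2, 6, 9, 6, 2])
print(x)   # approximately [1, 2, 2], i.e. 1 + 2z + 2z^2
```

The accuracy depends on `n_fft` being large compared with the decay rate of the cepstrum; roots of $b(z)$ close to the unit circle call for a finer grid.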

All the procedures above are either directly
programmed or can be easily composed from
the functions of Polynomial Toolbox for Matlab,
which is a third-party Matlab toolbox for polynomials,
polynomial matrices, and their applications
in systems, signals, and control. For more details
on the toolbox, visit www.polyx.com.

Consequences and Comments 

Polynomial and polynomial matrix spectral factorization
is an important step when frequency
domain (polynomial) methods are used for optimal
and robust control, filtering, estimation, or
prediction. Numerous particular examples can be
found throughout this Encyclopedia as well as
in the textbooks and papers recommended for
further reading below.

Spectral factorization of rational functions and
matrices is an equally important topic but it is
omitted here due to lack of space. Interested
readers are referred to the papers Oara and Varga
(2000) and Zhong (2005).

Cross-References 

► Basic Numerical Methods and Software for 
Computer Aided Control Systems Design 

► Classical Frequency-Domain Design Methods 

► Computer-Aided Control Systems Design: Introduction
and Historical Overview

► Control Applications in Audio Reproduction 

► Discrete Optimal Control 

► Extended Kalman Filters 

► Frequency-Response and Frequency-Domain 
Models 


► H-Infinity Control 

► H2 Optimal Control

► Kalman Filters 

► Optimal Control via Factorization and Model 
Matching 

► Optimal Sampled-Data Control 

► Polynomial/Algebraic Design Methods 

► Quantitative Feedback Theory 

► Robust Synthesis and Robustness Analysis 
Techniques and Tools 

Recommended Reading 

Nice tutorial books on polynomials and polynomial
matrices in control theory and design are
Kucera (1979), Callier and Desoer (1982), and
Kailath (1980).

The concept of spectral factorization was introduced
by Wiener (1949); for further information
see the later original papers Wilson (1972)
or Kwakernaak and Sebek (1994) as well as the
survey papers Kwakernaak (1991), Sayed and
Kailath (2001), or Kucera (2007).

Nice applications of spectral factorization in control
problems can be found, e.g., in Green et al.
(1990), Henrion et al. (2003), or Zhou and
Doyle (1998). For its use in other engineering
problems see, e.g., Sternad and Ahlen
(1993).
Stability and Performance
of Complex Systems Affected
by Parametric Uncertainty

Abstract

Uncertainty is an inherent feature of all real-life
complex systems. It can be described in
different forms; we focus on the parametric description.
The simplest results on stability of linear
systems under parametric uncertainty are the
Kharitonov theorem, edge theorem, and graphical
tests. More advanced results include sufficient
conditions for robust stability with matrix uncertainty,
LMI tools, and randomized methods.
Similar approaches are used for robust control
synthesis, where performance issues are crucial.


Keywords 

Edge theorem; Kharitonov theorem; Linear
systems; Matrix; Parametric uncertainty and
robustness; Quadratic stability; Randomized
methods; Robust and optimal design; Robust
stability; Tsypkin-Polyak plot


Introduction 

Mathematical models for systems and control are
often unsatisfactory due to the incompleteness
of the parameter data. For instance, the ideas
of off-line optimal control can only be applied
to real systems if all the parameters, exogenous
perturbations, state equations, etc. are known precisely.
Moreover, feedback control also requires
detailed information which is not available in
most cases. For example, to drive a car with four-wheel
control, the controller should be aware of
the total weight, location of the center of gravity,
weather conditions, and highway properties as
well as many other data which may not be known.
In that respect, even such a relatively simple real-life
system can be considered a complex one; in
such circumstances, control under uncertainty is
a highly important issue.

The focus in this article is on parametric
uncertainty; other types of uncertainty can be
treated in more general models of robustness.
This topic became particularly popular in the
control community in the mid- to late 1980s;
by and large, the results of
this activity have been summarized in the monographs
(Ackermann 1993; Barmish 1994; Bhattacharyya
et al. 1995).
We start with problems of stability of polynomials
with uncertain parameters and present the
simplest robust stability results for this case together
with the most important machinery. Next,
we consider stability analysis for matrix uncertainty;
most of the results are just sufficient
conditions. We present some useful tools for the
analysis, such as the LMI technique and randomized
methods. Robust control under parametric
uncertainty is the next step; we briefly discuss
several problem formulations for this case.

Stability of Linear Systems Subject to 
Parametric Uncertainty 

Consider the closed-loop linear, time-invariant
continuous-time state space system

$$ \dot{x} = Ax, \qquad x(0) = x_0, \qquad (1) $$

where $x(t) \in \mathbb{R}^n$ is the state vector, $x_0$ is an
arbitrary finite initial condition, and $A \in \mathbb{R}^{n \times n}$
is the state matrix. The system is stable (i.e., no
matter what $x_0$ is, the solutions tend to zero as
$t \to \infty$) if and only if all eigenvalues $\lambda_i$ of the
matrix $A$ have negative real parts:

$$ \mathrm{Re}\, \lambda_i < 0, \qquad i = 1, \dots, n, \qquad (2) $$

in which case $A$ is said to be a Hurwitz matrix.
If $A$ is known precisely, checking condition (2) is
immediate. For instance, one might compute the
characteristic polynomial

$$ p(s) = \det(sI - A) = a_0 + a_1 s + \dots + a_{n-1} s^{n-1} + s^n \qquad (3) $$

of $A$ (here, $I$ is the identity matrix) and use any
of the stability tests (e.g., the Routh algorithm,
Routh-Hurwitz test, and graphical tests such as
the Mikhailov plot or Hermite-Biehler theorem);
see Gantmacher (2000). Alternatively, the eigenvalues
can be directly computed using
currently available software, such as Matlab.
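For a precisely known matrix, both routes mentioned above take a few lines. The sketch below uses NumPy rather than Matlab; the example matrices and the function names are illustrative assumptions.

```python
import numpy as np

def is_hurwitz_matrix(A):
    """True if every eigenvalue of A has a negative real part, condition (2)."""
    eigenvalues = np.linalg.eigvals(np.asarray(A, dtype=float))
    return bool(np.all(eigenvalues.real < 0))

def characteristic_polynomial(A):
    """Coefficients of det(sI - A), highest power first, as in (3)."""
    return np.poly(np.asarray(A, dtype=float))

A_stable = [[0.0, 1.0], [-2.0, -3.0]]     # eigenvalues -1 and -2
A_unstable = [[0.0, 1.0], [2.0, -1.0]]    # eigenvalues 1 and -2

print(is_hurwitz_matrix(A_stable), is_hurwitz_matrix(A_unstable))
print(characteristic_polynomial(A_stable))   # approximately [1, 3, 2]
```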

However, things get complicated if the knowledge
of the matrix $A$ is incomplete; for instance,
it can depend on the (real) parameters $q = (q_1, \dots, q_m)$
which take arbitrary values within
the given intervals:

$$ A = A(q), \qquad \underline{q}_i \le q_i \le \overline{q}_i, \quad i = 1, \dots, m. \qquad (4) $$

In that case, we arrive at the robust stability
problem; i.e., the goal is to check if condition (2)
holds for all matrices in the family (4).

The two main components of any robust stability
setup are the feasible set $Q \subset \mathbb{R}^m$, in which
the uncertain parameters are allowed to take their
values (usually a ball in some norm; e.g., the
box as in (4)), and the uncertainty structure,
which defines the functional dependence of the
coefficients on the uncertain parameters. Of
most interest are the affine and multiaffine dependence;
typically, more general situations are hard
to handle.

Simple Solutions 

In some cases, the robust stability problem admits
a simple solution. Perhaps the most striking
example is the so-called Kharitonov theorem
(Kharitonov 1978); also see Barmish (1994),
where this seminal result is referred to as a spark
because of its transparency and elegance.

Namely, consider the interval polynomial
family

$$ \mathcal{P} = \{ p(s) = q_0 + q_1 s + \dots + q_n s^n, \quad \underline{q}_i \le q_i \le \overline{q}_i, \ i = 0, \dots, n \}, \qquad (5) $$

where the coefficients $q_i$ are allowed to take
values in the respective intervals independently
of each other, and distinguish the following four
elements in this family:

$$ p_1(s) = \underline{q}_0 + \underline{q}_1 s + \overline{q}_2 s^2 + \overline{q}_3 s^3 + \dots $$
$$ p_2(s) = \underline{q}_0 + \overline{q}_1 s + \overline{q}_2 s^2 + \underline{q}_3 s^3 + \dots $$
$$ p_3(s) = \overline{q}_0 + \overline{q}_1 s + \underline{q}_2 s^2 + \underline{q}_3 s^3 + \dots $$
$$ p_4(s) = \overline{q}_0 + \underline{q}_1 s + \underline{q}_2 s^2 + \overline{q}_3 s^3 + \dots $$

By the Kharitonov theorem, the interval family (5)
is robustly stable (i.e., all polynomials
in (5) are Hurwitz, having all roots with negative
real parts) if and only if the four Kharitonov
polynomials, $p_1$, $p_2$, $p_3$, and $p_4$, are Hurwitz.
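The Kharitonov test is easy to automate. The sketch below assumes the four polynomials follow the lower/upper bound patterns (lower, lower, upper, upper), (lower, upper, upper, lower), (upper, upper, lower, lower), and (upper, lower, lower, upper), each repeated with period four over the coefficients $q_0, q_1, \dots$, and that the leading coefficient interval does not contain zero. The function names and the numerical examples are illustrative.

```python
import numpy as np

# Bound selection (0 = lower, 1 = upper) for q0, q1, q2, q3, repeated with period 4.
PATTERNS = {
    "p1": (0, 0, 1, 1),
    "p2": (0, 1, 1, 0),
    "p3": (1, 1, 0, 0),
    "p4": (1, 0, 0, 1),
}

def is_hurwitz(coeffs_ascending):
    """True if all roots of q0 + q1 s + ... + qn s^n have negative real parts."""
    roots = np.roots(np.asarray(coeffs_ascending, dtype=float)[::-1])
    return bool(np.all(roots.real < 0))

def kharitonov_polynomials(lower, upper):
    """Return the four Kharitonov polynomials as ascending coefficient arrays."""
    bounds = np.vstack([lower, upper]).astype(float)
    idx = np.arange(bounds.shape[1])
    return {name: bounds[np.array(pattern)[idx % 4], idx]
            for name, pattern in PATTERNS.items()}

def interval_family_is_stable(lower, upper):
    return all(is_hurwitz(p) for p in kharitonov_polynomials(lower, upper).values())

# q0 in [1, 2], q1 in [3, 4], q2 in [2, 3], q3 = 1: robustly stable.
print(interval_family_is_stable([1, 3, 2, 1], [2, 4, 3, 1]))
# q0 in [1, 10], q1 in [1, 2], q2 in [1, 2], q3 = 1: s^3 + s^2 + s + 10 is unstable.
print(interval_family_is_stable([1, 1, 1, 1], [10, 2, 2, 1]))
```

For a monic cubic the outcome can be cross-checked by hand with the Routh-Hurwitz condition $q_1 q_2 > q_0$ evaluated at the worst-case bounds.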

A simple and transparent proof of this
result can be obtained using the value set
concept (Zadeh and Desoer 1963) and the zero
exclusion principle (Frazer and Duncan 1929),
the two general tools that underlie
many results in the area of robust stability. We
illustrate these concepts via robust stability of
polynomials.

Given the uncertain polynomial family

$$ \mathcal{P}(s, Q) = \{ p(s, q), \ q \in Q \}, $$

the set

$$ V(\omega) = \{ p(j\omega, q): \ q \in Q \}, \qquad \omega \ge 0, $$

is referred to as the value set, which is, by
definition, the set on the complex plane obtained
by fixing the argument $s$ to be $j\omega$ for a certain
value of $\omega$ and letting the uncertain parameter
vector $q$ sweep the feasible domain.

Stability and Performance of Complex Systems
Affected by Parametric Uncertainty, Fig. 1 The
Kharitonov rectangular value set (complex plane with axes Re and Im;
the vertices are $p_1(j\omega)$, $p_2(j\omega)$, $p_3(j\omega)$, and $p_4(j\omega)$)

The zero exclusion principle states that, under
certain regularity requirements, the uncertain
polynomial family is robustly stable if and only
if it contains a stable element and the following
condition holds:

$$ 0 \notin V(\omega) \quad \forall\, \omega \ge 0. \qquad (6) $$

To use this machinery, one has to be able
to compute efficiently the value set and check
condition (6). For the interval family (5), the
value set can be shown to be the rectangle with
edges parallel to the coordinate axes and the vertices being the values of
the four Kharitonov polynomials; see Fig. 1.

For all its appeal, the
Kharitonov theorem is not free of drawbacks.
First of all, it is not capable of determining
the maximal lengths of the uncertainty intervals
that retain robust stability. This relates to the
important notion of robust stability margin; for
simplicity, we define this quantity for the case
of the interval family (5). Namely, introduce the
nominal polynomial $p_0(s)$ with coefficients

$$ q_i^0 = (\underline{q}_i + \overline{q}_i)/2, $$

and the scaling factors

$$ \alpha_i = (\overline{q}_i - \underline{q}_i)/2 $$

for the deviations of the coefficients. Then the
robust stability margin $r_{\max}$ is defined as follows:

$$ r_{\max} = \sup\{ r: \ p(s, q) \text{ in (5) is stable } \forall\, q_i: \ |q_i - q_i^0| \le r\alpha_i, \ i = 0, \dots, n \}. \qquad (7) $$

Another drawback of the Kharitonov result is
its inapplicability to the discrete-time case (Schur
stability of polynomials).

A more flexible graphical test for robust stability
uses the so-called Tsypkin-Polyak plot (Tsypkin
and Polyak 1991), which is defined as the
parametric curve on the complex plane:

$$ z(\omega) = x(\omega) + j y(\omega), \qquad j = \sqrt{-1}, \quad 0 \le \omega < \infty, $$

where

$$ x(\omega) = \frac{q_0^0 - q_2^0 \omega^2 + \dots}{\alpha_0 + \alpha_2 \omega^2 + \dots}, \qquad y(\omega) = \frac{q_1^0 - q_3^0 \omega^2 + \dots}{\alpha_1 + \alpha_3 \omega^2 + \dots}. \qquad (8) $$

Then, by the Tsypkin-Polyak criterion, the polynomial
family (5) is robustly stable if and only
if the following conditions hold: (i) $q_0^0 > \alpha_0$,
$q_n^0 > \alpha_n$, and (ii) as $\omega$ changes from zero to infinity,
the curve $z(\omega)$ goes consecutively through $n$
quadrants in the counterclockwise direction and
does not intersect the unit square with the vertices
$(\pm 1, \pm j)$.

Stability and Performance of Complex Systems
Affected by Parametric Uncertainty, Fig. 2 The
Tsypkin-Polyak plot

Unlike the Kharitonov theorem, with this test,
the robust stability margin of family (5) can be
determined as the size of the maximal square
inscribed in the curve $z(\omega)$; see Fig. 2. Moreover,
with minor modifications, this test applies
to dependent uncertainty structures where the
coefficient vector $q = (q_0, \dots, q_n)^T$ is confined
to a ball in the $l_p$-norm, not to a box as in (5).
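The inscribed-square reading of the margin can be approximated on a frequency grid. The sketch below assumes the curve is $x(\omega) = (q_0^0 - q_2^0\omega^2 + \dots)/(\alpha_0 + \alpha_2\omega^2 + \dots)$ and $y(\omega) = (q_1^0 - q_3^0\omega^2 + \dots)/(\alpha_1 + \alpha_3\omega^2 + \dots)$, that the nominal polynomial is Hurwitz, and that both denominators are positive. Taking the minimum of $\max(|x|, |y|)$ over the grid, together with the endpoint ratios $q_0^0/\alpha_0$ and $q_n^0/\alpha_n$, is an interpretation of the criterion rather than a formula quoted from the source; the function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as npoly

def tsypkin_polyak_curve(q_nom, alpha, omegas):
    """Return x(w), y(w) for nominal coefficients q_nom and scales alpha (index 0..n)."""
    q_nom = np.asarray(q_nom, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    w2 = np.asarray(omegas, dtype=float) ** 2

    def alternating(c):   # c[0] - c[1] w^2 + c[2] w^4 - ...
        return npoly.polyval(w2, c * (-1.0) ** np.arange(len(c)))

    def plain(c):         # c[0] + c[1] w^2 + c[2] w^4 + ...
        return npoly.polyval(w2, c)

    x = alternating(q_nom[0::2]) / plain(alpha[0::2])
    y = alternating(q_nom[1::2]) / plain(alpha[1::2])
    return x, y

def margin_estimate(q_nom, alpha, omegas):
    """Grid estimate of the half-side of the largest square inside the curve."""
    x, y = tsypkin_polyak_curve(q_nom, alpha, omegas)
    inscribed = np.min(np.maximum(np.abs(x), np.abs(y)))
    ends = [q_nom[i] / alpha[i] for i in (0, -1) if alpha[i] > 0]
    return min([inscribed] + ends)

# Nominal polynomial 1.5 + 3.5 s + 2.5 s^2 + s^3, deviations 0.5 on q0, q1, q2.
omegas = np.linspace(0.0, 10.0, 200001)
r = margin_estimate([1.5, 3.5, 2.5, 1.0], [0.5, 0.5, 0.5, 0.0], omegas)
print(r)
```

For this cubic the Routh-Hurwitz condition gives the margin in closed form as $7 - \sqrt{20}$, which the grid estimate should match to within the grid resolution.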

On top of that, the Tsypkin-Polyak plot can be
built for discrete-time systems, which do not admit
any counterparts of the Kharitonov theorem.

It is fair to say that interval polynomial families
are an idealization, since the coefficients of the
characteristic polynomial can hardly be thought
of as the physical parameters of the real-world
system. As a step towards more realistic formulations,
consider the affine polynomial family of
the form

$$ p(s, q) = p_0(s) + \sum_{i=1}^{m} q_i p_i(s), \qquad |q_i| \le 1, \quad i = 1, \dots, m, \qquad (9) $$

where $p_i$ are the given polynomials and the $q_i$
are the uncertain parameters (clearly, they can
be scaled to take values in the segment $[-1, 1]$).
The famous edge theorem (Bartlett et al. 1988)
claims that checking the robust stability of such a
family is equivalent to checking the edges of the
uncertainty box, i.e., the points $q \in \mathbb{R}^m$ with all
but one components being fixed to $\pm 1$, while the
“free” coordinate varies in $[-1, 1]$.
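In practice the edge check can be approximated by sweeping each edge on a grid. The following sketch is only a numerical approximation of the edge theorem under the assumptions that all polynomials in the family have the same degree and that the grid is fine enough; the function names and the two small test families are illustrative.

```python
import itertools
import numpy as np

def is_hurwitz(coeffs_ascending):
    roots = np.roots(np.asarray(coeffs_ascending, dtype=float)[::-1])
    return bool(np.all(roots.real < 0))

def edges_are_stable(p0, p_list, grid_points=201):
    """Sweep every edge of the box |q_i| <= 1 for p0 + sum_i q_i p_i.

    p0 and the rows of p_list are ascending coefficient arrays of equal length.
    """
    p0 = np.asarray(p0, dtype=float)
    P = np.asarray(p_list, dtype=float)
    m = P.shape[0]
    sweep = np.linspace(-1.0, 1.0, grid_points)
    for free in range(m):
        others = np.array([i for i in range(m) if i != free], dtype=int)
        # All but one coordinate sit at a vertex value; the free one is swept.
        for corner in itertools.product((-1.0, 1.0), repeat=m - 1):
            q = np.zeros(m)
            q[others] = corner
            for value in sweep:
                q[free] = value
                if not is_hurwitz(p0 + q @ P):
                    return False
    return True

# p(s, q) = (2 + q2) + (3 + q1) s + s^2: stable on the whole box.
print(edges_are_stable([2, 3, 1], [[0, 1, 0], [1, 0, 0]]))
# p(s, q) = (2 + q2) + (0.5 + q1) s + s^2: unstable when q1 is close to -1.
print(edges_are_stable([2, 0.5, 1], [[0, 1, 0], [1, 0, 0]]))
```

The number of edges is $m\,2^{m-1}$, so this brute-force sweep is only practical for a small number of parameters.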

Complex Solutions 

Obviously, the affine model (9) covers just a
small part of the problems with parametric uncertainty.
Closed-form solutions cannot be obtained
in the general case; however, many important
classes of systems can be analyzed efficiently.

For instance, in engineering practice, a block diagram
description of systems is often more convenient
than differential equations of the form (1).
The blocks are associated with typical elements
such as amplifiers, integrators, lag elements, and
oscillators, which are connected in a certain circuit.
In this case, transfer functions are the most
adequate tool for dealing with such systems. For
instance, the transfer function of the lag element
is given by

$$ W(s) = 1/(Ts + 1), $$

where the scalar $T$ is the time constant of the
element. In terms of differential equations, this
means that the input $u(t)$ of a block and its
output $x(t)$ satisfy the equation $T\dot{x} + x = u$.

Assume now we have a set of $m$ cascade
connected elements with uncertain time constants

$$ \underline{T}_i \le T_i \le \overline{T}_i, \qquad i = 1, \dots, m, \qquad (10) $$

with known lower and upper bounds. The characteristic
polynomial of such a connection closed
by feedback with gain $k$ is known to
have the form

$$ p(s) = k + (1 + T_1 s) \cdots (1 + T_m s). \qquad (11) $$

Hence, the robust stability problem reduces
to checking if all polynomials (11) with
constraints (10) are Hurwitz. Note that the
coefficients of such a polynomial depend
multilinearly on the uncertain parameters $T_i$
(cf. linear dependence in (9)), making the
problem much more complicated.

The solution of the problem above was obtained
in Kiselev et al. (1997) for many important
special cases; the closely related problem of
finding the “critical gain” (the maximal value of $k$
retaining robust stability) was also addressed.

Using a similar technique, closed-form
solutions can be obtained for a number of
related problems such as robust sector stability,
robust stability of distributed systems, and robust
D-decomposition, to name just a few.

Difficult Problems: Possible Approaches 

In spite of the apparent progress obtained in the
area of parametric robustness, the list of unsolved
problems is still quite large. Moreover, some of
the formulations were shown to be NP-hard, making
it hard to believe that any efficient solution
methods will ever be found.

One such fundamental problem is robust
stability of the interval matrix. Specifically, assume
that the entries $a_{ij}$ of the matrix $A$ in (1)
are interval numbers

$$ \underline{a}_{ij} \le a_{ij} \le \overline{a}_{ij}, \qquad i, j = 1, \dots, n; $$

the problem is to check if the interval matrix
is robustly stable, i.e., if the eigenvalues of all
matrices in this family have negative real parts.
Numerous attempts to prove a Kharitonov-like
theorem for matrices have failed, and the results
by Nemirovskii (1994) on NP-hardness showed
that these generalizations are not possible. It was
also shown that the edge theorem for matrix
families is not valid. The other NP-hard problems
in robustness include the analysis of systems with
interval delays, parallel connection of uncertain
blocks, problem (11)-(10) with nested segments
$[\underline{T}_i, \overline{T}_i]$, and others.

However, a change in the statement of the
problem often allows for simple and elegant solutions.
We mention three fruitful reformulations.

In the first approach, the uncertain parameters
are assumed to have random rather than deterministic
nature; for instance, they are assumed to be
uniformly distributed over the respective intervals
of uncertainty. We next specify an acceptable
tolerance $\varepsilon$, say $\varepsilon = 0.01$, and check if the
resulting random family of polynomials is stable
with probability no less than $(1 - \varepsilon)$; see Tempo
et al. (2013) for a comprehensive exposition of
such a randomized approach to robustness.
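A plain Monte Carlo experiment conveys the flavor of the randomized approach for the cascade polynomial $p(s) = k + (1 + T_1 s) \cdots (1 + T_m s)$ with time constants drawn uniformly from their intervals. This is a minimal sketch; the function names, sample size, seed, and interval values are assumptions chosen for illustration, and the sample size needed for a given tolerance and confidence is not addressed here.

```python
import numpy as np

def cascade_polynomial(k, T):
    """Ascending coefficients of k + (1 + T_1 s) ... (1 + T_m s)."""
    coeffs = np.array([1.0])
    for Ti in T:
        coeffs = np.convolve(coeffs, [1.0, Ti])
    coeffs[0] += k
    return coeffs

def stability_probability(k, T_low, T_high, samples=2000, seed=0):
    """Fraction of sampled time-constant vectors that give a Hurwitz polynomial."""
    rng = np.random.default_rng(seed)
    T_low = np.asarray(T_low, dtype=float)
    T_high = np.asarray(T_high, dtype=float)
    stable = 0
    for _ in range(samples):
        T = rng.uniform(T_low, T_high)
        roots = np.roots(cascade_polynomial(k, T)[::-1])
        stable += bool(np.all(roots.real < 0))
    return stable / samples

# Three lags with T_i in [0.5, 1.5]: small gain is always stable, large gain never.
print(stability_probability(1.0, [0.5] * 3, [1.5] * 3, samples=200))
print(stability_probability(100.0, [0.5] * 3, [1.5] * 3, samples=200))
```

For three lags the two printed values can be confirmed by hand: the Routh-Hurwitz condition reduces to $(T_1 + T_2 + T_3)(1/T_1 + 1/T_2 + 1/T_3) > 1 + k$, whose left-hand side lies between 9 and about 11.7 on the chosen box.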

In many of the NP-hard robustness problems,
such a reformulation often leads to exact or approximate
solutions. Moreover, the randomized
approach has several attractive properties even
in the situations where the deterministic solution
is available. Indeed, the deterministic statements
of robustness problems are minimax; hence, the
answer is dictated by the “worst” element in
the family, whereas these critical values of the
uncertain parameters are rather unlikely to occur.
Therefore, by neglecting a small risk of violation
of the stability, the admissible domains of
variation of the parameters may be considerably
extended. This effect is known as the probabilistic
enhancement of robustness margins; it
is particularly tangible for a large number of
parameters. Another attractive property of the
randomized approach is its low computational
complexity, which grows only slowly as the
number of uncertain parameters increases.

To illustrate, let us turn back to problem (11)-(10)
and use the value set approach. In the considered
problem, this set can be efficiently built.

Assume now that the parameters $T_i$ are independent
random variables uniformly distributed
over the respective segments (10) and consider
the random variable

$\eta = \eta(\omega) = \log\bigl(p(j\omega) - k\bigr) = \sum_{i=1}^{m} \log(1 + j\omega T_i).$ (12)

The right-hand side of the last relation is the
sum of independent complex-valued random
variables; for $m$ large, its behavior obeys the
central limit theorem, so that the probability
that $\eta$ belongs to the respective confidence
ellipse $\mathcal{E} = \mathcal{E}(\omega)$ is close to unity. In other
words, we have

$p(j\omega) \in k + e^{\mathcal{E}} = Q(\omega),$

and the set $Q(\omega)$ is referred to as a probabilistic
predictor of the value set $V(\omega)$; it is the shifted
set of points of the form $e^{z}$, $z \in \mathcal{E} \subset \mathbb{C}$. The
predictor $Q(\omega)$ constitutes a small portion of the
deterministic value set $V(\omega)$, yielding the probabilistic
enhancement of the robustness margin.

Note also that the computation of $\mathcal{E}$ and $e^{\mathcal{E}}$ is
nearly trivial and, in contrast to the construction
of the true value set $V$, the complexity does not
grow with increase of $m$.

The second approach to solving “hard” problems
in robust stability relates to the notion of
superstability (Polyak and Shcherbakov 2002). The
matrix $A$ of system (1) (and the system itself) is
said to be superstable if its entries $a_{ij}$, $i, j = 1, \dots, n$,
satisfy the relations

$a_{ii} < 0, \qquad \min_{i} \Bigl(-a_{ii} - \sum_{j \ne i} |a_{ij}|\Bigr) = \sigma > 0.$

The following estimate holds for the solutions
of the superstable system (1):

$\|x(t)\|_\infty \le \|x(0)\|_\infty e^{-\sigma t},$

i.e., it is stable, and the (nonsmooth) function
$\|x\|_\infty$ is a Lyapunov function for the system.
Since the condition of superstability is formulated
in terms of linear inequalities on the entries
of $A$, checking robust superstability of affine (and
in particular, interval) matrix families is immediate.
A similar situation holds for so-called positive
systems.
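To make the "immediate" check concrete, here is a minimal sketch for an interval family given by entrywise lower and upper bounds. The function names and the numerical bounds are hypothetical. The worst case in each row is reached at the largest diagonal entry and at the largest absolute value of each off-diagonal entry, so a single pass over the bounds is enough.

```python
def superstability_margin(A):
    """sigma = min_i(-a_ii - sum_{j != i} |a_ij|); A is superstable iff sigma > 0."""
    n = len(A)
    return min(
        -A[i][i] - sum(abs(A[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )

def robust_superstability_margin(lower, upper):
    """Worst-case margin over all A with lower <= A <= upper (entrywise)."""
    n = len(lower)
    margins = []
    for i in range(n):
        # Largest possible |a_ij| on each off-diagonal interval.
        off_diag = sum(
            max(abs(lower[i][j]), abs(upper[i][j])) for j in range(n) if j != i
        )
        # Largest (least negative) diagonal entry is the worst case.
        margins.append(-upper[i][i] - off_diag)
    return min(margins)

# Hypothetical 2x2 interval family.
lower = [[-5.0, -1.0], [0.5, -4.0]]
upper = [[-3.0, 1.0], [1.5, -3.0]]
sigma = robust_superstability_margin(lower, upper)
print("robustly superstable:", sigma > 0, "worst-case margin:", sigma)
```

A positive worst-case margin certifies stability of every matrix in the family; a non-positive one is inconclusive, because superstability is only a sufficient condition.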

The third approach to robustness analysis relates
to quadratic stability (Leitmann 1979; Boyd
et al. 1994). Namely, a family of systems is said
to be robustly quadratically stable if it possesses
a common quadratic Lyapunov function $V(x) = x^{T} P x$
with positive definite matrix $P$. In other
words, an uncertain family of matrices $A(q)$, $q \in Q$,
has to satisfy the following set of matrix
Lyapunov-type inequalities:

$A(q)P + PA(q)^{T} \prec 0, \quad q \in Q, \qquad P \succ 0,$ (13)

where the symbols $\prec$, $\succ$ stand for the sign-definiteness
of a matrix.

The inequality above is referred to as a linear
matrix inequality (LMI) (Boyd et al. 1994); there
exist both efficient numerical methods for solving
such inequalities (interior point methods) and
various software, e.g., Matlab. This approach
can be directly applied at least in the following 
two cases: (i) the set Q contains a finite number 
of points and (ii) Q is a polyhedron and the 
dependence A(q) is affine. In the general setup 
or in the high-dimensional problems, randomized 
methods can be employed. 
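For case (ii), the inequality is affine in $A$ once $P$ is fixed, so it holds over the whole polyhedral family as soon as it holds at the vertices. The sketch below only verifies a given candidate $P$ at the vertices of a hypothetical 2x2 family, using Sylvester's criterion; it does not solve the LMI for $P$, which would require an LMI or semidefinite programming solver. All names and matrices are assumptions made for the illustration.

```python
def mat_mul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def is_positive_definite(M):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return M[0][0] > 0 and det2(M) > 0

def is_negative_definite(M):
    return M[0][0] < 0 and det2(M) > 0

def certifies_quadratic_stability(vertices, P):
    """True if P > 0 and A P + P A^T < 0 at every vertex matrix A."""
    if not is_positive_definite(P):
        return False
    for A in vertices:
        AP = mat_mul(A, P)
        PAt = mat_mul(P, transpose(A))
        M = [[AP[i][j] + PAt[i][j] for j in range(2)] for i in range(2)]
        if not is_negative_definite(M):
            return False
    return True

# Hypothetical family: convex hull of two vertex matrices; candidate P = I.
vertices = [[[-1.0, 1.0], [0.0, -2.0]], [[-1.0, -1.0], [0.0, -2.0]]]
P = [[1.0, 0.0], [0.0, 1.0]]
print(certifies_quadratic_stability(vertices, P))
```

If the check fails for one candidate $P$, nothing is concluded about the family; another $P$ may still work, which is why the search for $P$ is posed as an LMI feasibility problem.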

Finding the quadratic robust stability margin 
(by analogy with the stability margin, this is the 
maximum span of the feasible set Q that allows 
for the existence of the common Lyapunov func¬ 
tion) in this problem is also possible; it reduces 
to the minimization of a linear function over the 
solutions of a similar LMI. 

Note that the approaches based on superstabil¬ 
ity and quadratic stability provide only sufficient 
conditions for robustness. 

Robust Control 

So far, our primary interest has been in assessing
the robust stability of a closed-loop system with 
synthesized linear feedback. A more important 
problem is to design a controller that makes the 
closed-loop system robustly stable and guaran¬ 
tees certain robust performance of the system. 

Robust Stabilization 

Let the linear system 

$\dot{x} = A(q)x + Bu$

depend on the vector $q \in Q$ of uncertain parameters.
In the simplest form, the problem of robust
stabilization consists in finding the linear static 
state feedback 

u = Kx 

that guarantees the robust stability of the closed- 
loop system. Alternatively, static or dynamic 
output robustly stabilizing controllers can be 
considered in the situations where only the linear 
output y = Cx of the system is available, but not 
the complete state vector x. 

If the number of controller parameters to be
tuned is small (which is the case for PI or PID
controllers), then the design can be accomplished
using the D-decomposition technique.

In the general formulation, the problem of
robust design is complicated; it can, however, 
be addressed with the use of randomized 
methods (Tempo et al. 2013). Other plausible 
approaches include superstability and quadratic 
stability; respectively, the problem reduces 
to solving linear programs or linear matrix 
inequalities in the coefficients of the controller. 

Robust Performance 

Needless to say, the robust stabilization problem 
is not the only one in the area of optimal con¬ 
trol. As a rule, a certain cost function is always 
involved (say, integral quadratic), and its desired 
value should be guaranteed for all admissible 
values of the uncertain parameters. Moreover, 
robust stability is a necessary condition for such 
a guaranteed estimate to exist. This sort of prob¬ 
lems can often be cast in the form of LMIs which 
must be satisfied for all admissible values of 
the parameters. Such robust LMIs can be solved 
either directly or using various randomized tech¬ 
niques presented in Tempo et al. (2013). 

Conclusions 

In spite of the considerable progress attained in 
the parametric robustness of complex systems, 
this topic is still a vivid and active research 
area. To date, randomization, superstability, and 
quadratic stability present the most efficient and 
diverse tools for the analysis and design of sys¬ 
tems affected by parametric uncertainty. 

Cross-References 

► H-Infinity Control 

► LMI Approach to Robust Control 

► Optimization Based Robust Control 

► Randomized Methods for Control of Uncertain
Systems

Abstract

This entry provides a short introduction to modeling
of hybrid dynamical systems and then focuses
on stability theory for these systems. It provides
definitions of asymptotic stability, basin of attraction,
and uniform asymptotic stability for a
compact set. It points out mild assumptions un¬ 
der which different characterizations of asymp¬ 
totic stability are equivalent, as well as when an 
asymptotically stable compact set exists. It also 
summarizes necessary and sufficient conditions 
for asymptotic stability in terms of Lyapunov 
functions. 

Keywords 

Asymptotic stability; Basin of attraction; Hybrid 
system; Lyapunov function 

Introduction 

A hybrid dynamical system combines continuous 
change and instantaneous change. Instantaneous 
change is the only type of change available for 
variables like counters, switches, and logic vari¬ 
ables. Instantaneous change may also be a good 
approximation of what occurs to velocities in 
mechanical systems at the time of an impact with 
a wall, floor, or some other rigid body. At other 
times, velocities evolve continuously. Continu¬ 
ous change is also natural for position variables, 
continuous timers, and voltages and currents. For 
mathematical convenience, it is typical in the 
analysis of hybrid dynamical systems to embed 
all of these variables into a Euclidean space, 
with the understanding that many points in the 
state space will never be reached. For example, a 
logic variable that naturally takes values in the set 
{off, on} is typically embedded in the real number 
line where its two distinct values are associated 
with two distinct numbers, the only numbers that 
this variable will visit during its evolution. 

A finite-dimensional dynamical system that 
exhibits continuous change exclusively is typi¬ 
cally modeled by an ordinary differential equa¬ 
tion, or sometimes a more flexible differential 
inclusion. A system that exhibits purely instan¬ 
taneous change is typically modeled by a dif¬ 
ference equation or inclusion. Consequently, a 
hybrid dynamical system combines a differential 
equation or inclusion with a difference equation 


or inclusion. A big part of the modeling effort for 
hybrid systems is directed at determining which 
type of evolution should be allowed at each point 
in the state space. To this end, subsets of the state 
space are specified where each type of behavior 
is allowed, like in the description of the heating 
system given above. 

Though the behavior of a hybrid dynamical 
system can be quite complex and nonconven- 
tional, it is still reasonable to ask the same sta¬ 
bility questions for them that might be asked 
about classical differential or difference equa¬ 
tions. Moreover, the same stability analysis tools 
that are used for classical systems are also quite 
useful for hybrid dynamical systems. The empha¬ 
sis of this entry is on basic stability theory for hy¬ 
brid dynamical systems, focusing on definitions 
and tools that also apply to classical systems. 

Mathematical Modeling 

System Data 

A hybrid dynamical system with state $x$ belonging
to a Euclidean space $\mathbb{R}^n$ combines a differential
equation or inclusion, written formally as
$\dot{x} = f(x)$ or $\dot{x} \in F(x)$, with a difference
equation or inclusion $x^{+} = g(x)$ or $x^{+} \in G(x)$,
where $\dot{x}$ indicates the time derivative and $x^{+}$
indicates the value after an instantaneous change.
The mapping $f$ or $F$ is called the flow map, while
the mapping $g$ or $G$ is called the jump map. A
complete model also specifies where in the state
space continuous evolution is allowed and where
instantaneous change is allowed. The set where
continuous evolution is allowed is called the flow
set and is denoted $C$, whereas the set where
instantaneous change is allowed is called the jump
set and is denoted $D$. The overall model, using
inclusions for generality, is written formally as

$x \in C \qquad \dot{x} \in F(x)$ (1a)

$x \in D \qquad x^{+} \in G(x).$ (1b)

Solutions 

It is natural for solutions of (1) to be functions of
two different types of time: a variable $t$ that keeps
track of the amount of ordinary time that has
elapsed and a variable $j$ that counts the number
of jumps. There is a special structure to the types
of domains that are allowed. A compact hybrid
time domain is a set $E \subset \mathbb{R}_{\ge 0} \times \mathbb{Z}_{\ge 0}$, that
is, a subset of the product of the nonnegative
real numbers and the nonnegative integers, of the
form

$E = \bigcup_{i=0}^{J} \bigl([t_i, t_{i+1}] \times \{i\}\bigr)$

for some $J \in \mathbb{Z}_{\ge 0}$ and some sequence of nondecreasing
times $0 = t_0 \le t_1 \le \cdots \le t_{J+1}$.
It is possible for several of these times to be the
same, which would correspond to more than one
jump at the given time. A hybrid time domain
is a set $E \subset \mathbb{R}_{\ge 0} \times \mathbb{Z}_{\ge 0}$ such that for each
$(T, J) \in E$, the set $E \cap ([0, T] \times \{0, \dots, J\})$
is a compact hybrid time domain. In contrast to
a compact hybrid time domain, a hybrid time
domain may have an infinite number of intervals,
or it may have a finite number of intervals with
the last one being unbounded or of the form
$[t_J, t_{J+1})$; that is, it may be open on the right. A
hybrid arc is a function $x$, defined on a hybrid
time domain, such that $t \mapsto x(t, j)$ is locally
absolutely continuous for each $j$; in particular,
$t \mapsto x(t, j)$ is differentiable for almost every
$t$ where it is defined, and this mapping is the
integral of its derivative. The notation “dom $x$”
denotes the domain of $x$. Finally, a hybrid arc is a
solution of (1) if the following two properties are
satisfied:

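1. For $\varepsilon > 0$, $(s, j), (s + \varepsilon, j) \in \operatorname{dom} x$ implies
that $x(t, j) \in C$ and $\dot{x}(t, j) \in F(x(t, j))$ for
almost all $t \in [s, s + \varepsilon]$.

2. $(t, j), (t, j + 1) \in \operatorname{dom} x$ implies that
$x(t, j) \in D$ and $x(t, j + 1) \in G(x(t, j))$.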
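The two properties above can be illustrated with a small simulator that records samples of a hybrid arc on its hybrid time domain. This is only a sketch under assumptions that are not part of the entry: the flow map is single-valued and integrated with forward Euler steps, jumps are given priority on the overlap of the flow and jump sets, and the example data (a timer that flows at unit rate on $[0, 1]$ and resets to zero when it reaches one) are hypothetical.

```python
def simulate(x0, in_C, in_D, f, g, dt, t_max, j_max):
    """Return samples (t, j, x) of one solution of a scalar hybrid system.

    Assumptions of this sketch: single-valued f and g, forward-Euler flow
    steps of size dt, and jumps taking priority when x is in both C and D.
    """
    t, j, x = 0.0, 0, x0
    arc = [(t, j, x)]
    while t < t_max and j < j_max:
        if in_D(x):
            x = g(x)            # jump: j advances, ordinary time t does not
            j += 1
        elif in_C(x):
            x = x + dt * f(x)   # flow: t advances, j does not
            t += dt
        else:
            break               # neither flowing nor jumping is possible
        arc.append((t, j, x))
    return arc

# Hypothetical timer: C = [0, 1], f(x) = 1, D = [1, inf), g(x) = 0.
arc = simulate(
    x0=0.0,
    in_C=lambda x: 0.0 <= x <= 1.0,
    in_D=lambda x: x >= 1.0,
    f=lambda x: 1.0,
    g=lambda x: 0.0,
    dt=0.25,
    t_max=3.0,
    j_max=10,
)
for t, j, x in arc:
    print(t, j, x)
```

At each reset the output contains two samples with the same $t$ and consecutive values of $j$, which is exactly the situation covered by the second property. A forward-Euler step may leave the flow set slightly in general, so a careful implementation would detect the boundary crossing instead of relying on the step size.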

For a hybrid system with no flow dynamics, each
solution has a time domain of the form $\{0\} \times \{0, \dots, J\}$
for some $J \in \mathbb{Z}_{\ge 0}$ or $\{0\} \times \mathbb{Z}_{\ge 0}$.
For a hybrid system with no jump dynamics,
each solution has a time domain of the form
$[0, \infty) \times \{0\}$, $[0, T] \times \{0\}$, or $[0, T) \times \{0\}$ for some
$T > 0$. No assumptions are made in this entry to
guarantee existence of nontrivial solutions since 
stability theory does not hinge on existence of so¬ 
lutions; rather, it simply makes statements about 
the behavior of solutions when they exist. To 


ensure robustness of various stability properties, 
the following basic regularity assumptions are 
usually imposed. 

Assumption 1 The data $(C, F, D, G)$ satisfy the
following conditions:

1. The sets $C$ and $D$ are closed.

2. The set-valued mapping $F$ is outer semicontinuous,
locally bounded, and $F(x)$ is
nonempty and convex for each $x \in C$.

3. The set-valued mapping $G$ is outer semicontinuous,
locally bounded, and $G(x)$ is
nonempty for each $x \in D$.

To elaborate further, a set-valued mapping, like
$F$, is said to be outer semicontinuous if for each
convergent sequence $\{(x_i, y_i)\}_{i=0}^{\infty}$ that satisfies
$y_i \in F(x_i)$ for all $i \in \mathbb{Z}_{\ge 0}$, its limit, denoted
$(x, y)$, satisfies $y \in F(x)$. It is said to be locally
bounded if for each bounded set $K_1 \subset \mathbb{R}^n$
there exists a bounded set $K_2 \subset \mathbb{R}^n$ such that,
for every $x \in K_1$, every $y \in F(x)$ belongs
to $K_2$; the latter condition is sometimes written
$F(K_1) \subset K_2$. If $C$ is closed, $f$ is a function
$f : C \to \mathbb{R}^n$ that is continuous, and $F$ is a set-valued
mapping that has the single value $f(x)$ for
each $x \in C$ and is empty for $x \notin C$, then $F$ is
outer semicontinuous, locally bounded, and $F(x)$
is nonempty and convex for each $x \in C$.

Stability Theory 

Definitions and Relationships 

Given a dynamical system, predicting or control¬ 
ling the system’s long-term behavior is of pri¬ 
mary importance. A system’s long-term behavior 
may be more complicated than just converging to 
an equilibrium point. This fact motivates studying 
stability of and convergence to a set of points. 
For simplicity, this entry focuses on stability of 
sets that are compact , that is, they are closed 
and bounded. A variety of stability concepts are 
defined below. Each of these concepts applies 
to continuous-time or discrete-time systems as 
readily as to hybrid systems. 

A compact set $\mathcal{A} \subset \mathbb{R}^n$ is said to be Lyapunov
stable for (1) if for each $\varepsilon > 0$ there exists $\delta > 0$
such that for every solution of (1), $x(0, 0) \in \mathcal{A} + \delta\mathbb{B}$
implies $x(t, j) \in \mathcal{A} + \varepsilon\mathbb{B}$ for all $(t, j) \in \operatorname{dom} x$,
where $\mathcal{A} + \delta\mathbb{B}$ indicates the set of points
whose distance to the set $\mathcal{A}$ is less than or equal
to $\delta$. In order for a compact set to be Lyapunov
stable for (1), it must be forward invariant for
(1), that is, each solution of (1) with $x(0, 0) \in \mathcal{A}$
satisfies $x(t, j) \in \mathcal{A}$ for all $(t, j) \in \operatorname{dom} x$.
However, forward invariance does not necessarily
imply Lyapunov stability.

For a compact set $\mathcal{A} \subset \mathbb{R}^n$, its basin of
attraction for (1), denoted $\mathcal{B}_{\mathcal{A}}$, is the set of points
from which each solution to (1) is bounded and
each solution to (1) having an unbounded time
domain converges to $\mathcal{A}$, the latter being written
mathematically as $\lim_{t+j \to \infty} |x(t, j)|_{\mathcal{A}} = 0$,
where $|x(t, j)|_{\mathcal{A}}$ denotes the distance of $x(t, j)$
to the set $\mathcal{A}$. Each point that does not belong to
$C \cup D$ belongs to $\mathcal{B}_{\mathcal{A}}$ since there are no solutions
from such points. A compact set $\mathcal{A}$ is said to be
attractive for (1) if its basin of attraction contains
a neighborhood of itself, that is, there exists $\varepsilon > 0$
such that $\mathcal{A} + \varepsilon\mathbb{B} \subset \mathcal{B}_{\mathcal{A}}$. A compact set $\mathcal{A}$ is said
to be globally attractive if $\mathcal{B}_{\mathcal{A}} = \mathbb{R}^n$.

A compact set is said to be asymptotically
stable for (1) if it is Lyapunov stable and attractive
for (1). A compact set is said to be globally
asymptotically stable for (1) if it is asymptotically
stable for (1) and $\mathcal{B}_{\mathcal{A}} = \mathbb{R}^n$. It is useful to know
that the basin of attraction for an asymptotically
stable set is always open.

Theorem 1 Under Assumption 1, if a compact 
set is asymptotically stable for (1), then its basin 
of attraction is an open set. 

A compact set $\mathcal{A} \subset \mathbb{R}^n$ is said to be uniformly
attractive for (1) if it is attractive for (1) and for
each compact set $K \subset \mathcal{B}_{\mathcal{A}}$ and each $\delta > 0$
there exists $T > 0$ such that for every solution
$x$ of (1), $x(0, 0) \in K$ and $t + j \ge T$ imply
$x(t, j) \in \mathcal{A} + \delta\mathbb{B}$. A compact set is said to
be uniformly globally attractive for (1) if it is
globally attractive and uniformly attractive for
(1). Uniform attractivity goes beyond attractivity
by asking that the amount of time it takes each
solution to get close to $\mathcal{A}$ is uniformly bounded
over initial conditions in compact subsets of the
basin of attraction.

A compact set $\mathcal{A} \subset \mathbb{R}^n$ is said to be Lagrange
stable relative to an open set $\mathcal{O} \supset \mathcal{A}$ for (1) if for
each compact set $K_1 \subset \mathcal{O}$ there exists a compact
set $K_2 \subset \mathcal{O}$ such that for every solution of (1),
$x(0, 0) \in K_1$ implies $x(t, j) \in K_2$ for all $(t, j) \in \operatorname{dom} x$.
In Lagrange stability for the case $\mathcal{O} = \mathbb{R}^n$,
a bound on the initial conditions is given and
a bound on the ensuing solutions must be found;
this is in contrast to Lyapunov stability where a
bound on the solutions is given and a bound on
the initial conditions must be found.

A compact set is said to be uniformly asymptotically
stable for (1) if it is Lyapunov stable,
attractive, Lagrange stable relative to its basin
of attraction, and uniformly attractive for (1).
A compact set is said to be uniformly globally
asymptotically stable for (1) if it is uniformly
asymptotically stable for (1) and $\mathcal{B}_{\mathcal{A}} = \mathbb{R}^n$.
There is no difference between asymptotic
stability and uniform asymptotic stability under
Assumption 1.

Theorem 2 Under Assumption 1, a compact set
is uniformly asymptotically stable for (1) if and
only if it is locally asymptotically stable for (1).

As noted earlier, forward invariance does not 
imply Lyapunov stability. However, when cou¬ 
pled with uniform attractivity, Lyapunov stability 
ensues. 

Theorem 3 Under Assumption 1, a compact set 
is uniformly asymptotically stable for (1) if and 
only if it is forward invariant and uniformly 
attractive for (1). 

Asymptotic stability can be converted to
global asymptotic stability by shrinking the
flow and jump sets to be compact subsets of the
basin of attraction. However, global asymptotic
stability of a compact set $\mathcal{A}$ for $x \in C$, $\dot{x} = f(x)$
for each compact set $C$ does not necessarily
imply global asymptotic stability of $\mathcal{A}$ for
$\dot{x} = f(x)$.

In some situations it is easier to assert the
existence of a compact asymptotically stable set
than it is to find one explicitly. In this direction,
given a set $X \subset \mathbb{R}^n$, consider the set of points
$z$ with the property that there exist a sequence
of solutions $\{x_i\}_{i=0}^{\infty}$ to (1) with initial conditions
in $X$ and a sequence of times $\{(t_i, j_i)\}_{i=0}^{\infty}$ with
$(t_i, j_i) \in \operatorname{dom} x_i$ for each $i \in \mathbb{Z}_{\ge 0}$ such that
$z = \lim_{i \to \infty} x_i(t_i, j_i)$. This set of points is called
the $\Omega$-limit set of $X$ for (1) and is denoted $\Omega(X)$.

Theorem 4 Let Assumption 1 hold. For the system
(1), if $X$ is compact and $\Omega(X)$ is nonempty
and contained in the interior of $X$ (i.e., there
exists $\varepsilon > 0$ such that $\Omega(X) + \varepsilon\mathbb{B} \subset X$),
then the set $\Omega(X)$ is compact and uniformly
asymptotically stable with basin of attraction
containing $X$ and equal to the basin of attraction
for $X$.

Robustness 

A given model $(C, F, D, G)$ may have some
mismatch with a physical process that it aims
to describe. One way to capture some of this
mismatch is to consider the behavior of solutions
to a system with inflated data $(C_\delta, F_\delta, D_\delta, G_\delta)$,
$\delta > 0$, defined as follows:

$C_\delta := \{x \in \mathbb{R}^n : (x + \delta\mathbb{B}) \cap C \ne \emptyset\}$ (2a)

$F_\delta(x) := \overline{\mathrm{co}}\, F((x + \delta\mathbb{B}) \cap C) + \delta\mathbb{B}$ (2b)

$D_\delta := \{x \in \mathbb{R}^n : (x + \delta\mathbb{B}) \cap D \ne \emptyset\}$ (2c)

$G_\delta(x) := G((x + \delta\mathbb{B}) \cap D) + \delta\mathbb{B}.$ (2d)

The notation $x + \delta\mathbb{B}$ indicates a closed ball
of radius $\delta$ centered at the point $x$. Evaluating
a set-valued mapping at a set of points means
to collect all vectors that belong to the set-valued
mapping at any point in the set that
serves as the argument of the set-valued
mapping. The notation “$\overline{\mathrm{co}}\, F((x + \delta\mathbb{B}) \cap C)$”
indicates the closed, convex hull of the set
$\{f \in \mathbb{R}^n : f \in F(z),\ z \in (x + \delta\mathbb{B}) \cap C\}$. Note
that $(C_0, F_0, D_0, G_0) = (C, F, D, G)$. More
generally, the components of $(C, F, D, G)$ are
contained in $(C_\delta, F_\delta, D_\delta, G_\delta)$. The inflated
data in (2) satisfy the regularity properties of
Assumption 1 when $(C, F, D, G)$ do.

Proposition 1 If the data $(C, F, D, G)$ satisfy
Assumption 1 then, for each $\delta > 0$, the inflated
data $(C_\delta, F_\delta, D_\delta, G_\delta)$ satisfy Assumption 1.

From the point of view of asymptotic stability,
the behavior of solutions to $(C_\delta, F_\delta, D_\delta, G_\delta)$ for
$\delta > 0$ small is not too different from that of
solutions to $(C, F, D, G)$.

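Theorem 1 Under Assumption 1, if $\mathcal{A}$ is asymptotically
stable with basin of attraction $\mathcal{B}_{\mathcal{A}}$ for the
hybrid system with data $(C, F, D, G)$, then for
each $\varepsilon > 0$ and each compact set $K$ satisfying
$K \subset \mathcal{B}_{\mathcal{A}}$, there exist $\delta > 0$ and a compact set
$\mathcal{A}_\delta \subset \mathcal{A} + \varepsilon\mathbb{B}$ that is asymptotically stable with
$K \subset \mathcal{B}_{\mathcal{A}_\delta}$ for $(C_\delta, F_\delta, D_\delta, G_\delta)$.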

The robustness result of Theorem 1 has sev¬ 
eral consequences beyond the observations in the 
preceding examples. One of the consequences is 
the following reduction principle. 

Theorem 2 Under Assumption 1, if $\mathcal{A}_1$ is
asymptotically stable with basin of attraction
$\mathcal{B}_{\mathcal{A}_1}$ for the hybrid system with data $(C, F, D, G)$
and the compact set $\mathcal{A}_2 \subset \mathcal{A}_1$ is globally
asymptotically stable for the hybrid system with
data $(C \cap \mathcal{A}_1, F, D \cap \mathcal{A}_1, G)$, then the compact
set $\mathcal{A}_2$ is asymptotically stable with basin of
attraction $\mathcal{B}_{\mathcal{A}_1}$ for the hybrid system with data
$(C, F, D, G)$.

Lyapunov Functions 

Arguably the most common method for establishing
asymptotic stability is known as Lyapunov’s
method and uses a Lyapunov function. A function
$V : \mathbb{R}^n \to \mathbb{R}_{\ge 0}$ is a Lyapunov function candidate
for (1) if it is continuously differentiable on an
open neighborhood of the flow set $C$, it is defined
for all $x \in C \cup D \cup G(D)$ (dom $V$ denotes the set
of points where it is defined), and it is continuous
on its domain. Some of these conditions can be
relaxed but are imposed in this entry to keep the
discussion simple. Given a compact set $\mathcal{A}$ and an
open set $\mathcal{O}$ satisfying $\mathcal{A} \subset \mathcal{O} \subset \mathbb{R}^n$, a Lyapunov
function candidate for (1) is called a Lyapunov
function for $(\mathcal{A}, \mathcal{O})$ if:

(L1) For $x \in (C \cup D \cup G(D)) \cap \mathcal{O}$, $V(x) = 0$
if and only if $x \in \mathcal{A}$.

(L2) For each $x \in C \cap \mathcal{O}$ and $f \in F(x)$,
$\langle \nabla V(x), f \rangle \le 0$.

(L3) For each $x \in D \cap \mathcal{O}$ and $g \in G(x)$,
$V(g) - V(x) \le 0$.

A Lyapunov function for $(\mathcal{A}, \mathcal{O})$ is called a
proper Lyapunov function for $(\mathcal{A}, \mathcal{O})$ if, in
addition,

(L4) $\lim_{i \to \infty} V(x_i) = \infty$ when the sequence
$\{x_i\}_{i=0}^{\infty}$, satisfying $x_i \in (C \cup D \cup G(D)) \cap \mathcal{O}$
for all $i \in \mathbb{Z}_{\ge 0}$, is unbounded or approaches
the boundary of $\mathcal{O}$.

The next result does not use Assumption 1,
though the rest of the results in this entry do.

Theorem 3 Let $\mathcal{A} \subset \mathcal{O} \subset \mathbb{R}^n$ with $\mathcal{A}$ compact
and $\mathcal{O}$ open. If there exists a Lyapunov function
for $(\mathcal{A}, \mathcal{O})$, then $\mathcal{A}$ is Lyapunov stable for (1).
If there exists a proper Lyapunov function for
$(\mathcal{A}, \mathcal{O})$ then $\mathcal{A}$ is also Lagrange stable with
respect to $\mathcal{O}$ for (1).

We can also conclude asymptotic stability 
from a Lyapunov function when it is known 
that there are no complete solutions along which 
the Lyapunov function is equal to a positive 
constant. 

Theorem 4 Let $\mathcal{A} \subset \mathcal{O} \subset \mathbb{R}^n$ with $\mathcal{A}$ compact
and $\mathcal{O}$ open. Under Assumption 1, if there exists
a Lyapunov function for $(\mathcal{A}, \mathcal{O})$ and there is no
solution $x$ of (1) starting in $\mathcal{O} \setminus \mathcal{A}$ that has an unbounded
time domain and satisfies $V(x(t, j)) = V(x(0, 0))$
for all $(t, j) \in \operatorname{dom} x$, then $\mathcal{A}$ is
uniformly asymptotically stable for (1). If the
Lyapunov function is a proper Lyapunov function
for $(\mathcal{A}, \mathcal{O})$, then the basin of attraction for $\mathcal{A}$
contains $\mathcal{O}$.

The simplest way to rule out solutions
that keep a Lyapunov function equal to a
positive constant is by finding a (proper) strict
Lyapunov function for $(\mathcal{A}, \mathcal{O})$, which is a
(proper) Lyapunov function for $(\mathcal{A}, \mathcal{O})$ that also
satisfies:

(L2') For each $x \in (C \cap \mathcal{O}) \setminus \mathcal{A}$ and $f \in F(x)$,
$\langle \nabla V(x), f \rangle < 0$.

(L3') For each $x \in (D \cap \mathcal{O}) \setminus \mathcal{A}$ and $g \in G(x)$,
$V(g) - V(x) < 0$.

Theorem 5 Let $\mathcal{A} \subset \mathcal{O} \subset \mathbb{R}^n$ with $\mathcal{A}$ compact
and $\mathcal{O}$ open. Under Assumption 1, if there exists
a strict Lyapunov function for $(\mathcal{A}, \mathcal{O})$, then
$\mathcal{A}$ is uniformly asymptotically stable for (1). If
there exists a proper strict Lyapunov function
for $(\mathcal{A}, \mathcal{O})$, then $\mathcal{A}$ is uniformly asymptotically
stable for (1) with basin of attraction containing
$\mathcal{O}$.
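A strict Lyapunov function can be spot-checked numerically before attempting a proof. The sketch below uses a hypothetical scalar hybrid system that is not taken from the entry: flow $\dot{x} = -x$ on $C = \{|x| \le 1\}$, jump $x^{+} = x/2$ on $D = \{|x| \ge 1\}$, target set $\{0\}$, open set equal to the whole real line, and candidate $V(x) = x^2$. It tests the strict decrease conditions along flows and across jumps on a grid of sample points.

```python
def V(x):
    return x * x

def grad_V(x):
    return 2.0 * x

def f(x):          # flow map (hypothetical example)
    return -x

def g(x):          # jump map (hypothetical example)
    return 0.5 * x

def in_C(x):       # flow set
    return abs(x) <= 1.0

def in_D(x):       # jump set
    return abs(x) >= 1.0

def strict_conditions_hold(samples):
    """Check strict decrease of V off the set {0} at the given sample points."""
    for x in samples:
        if x == 0.0:
            continue  # the strict conditions are only required away from the set
        if in_C(x) and not grad_V(x) * f(x) < 0.0:
            return False
        if in_D(x) and not V(g(x)) - V(x) < 0.0:
            return False
    return True

samples = [k / 10.0 for k in range(-30, 31)]
print(strict_conditions_hold(samples))
```

Passing on a grid is evidence, not a proof. For this example the conditions can be confirmed by hand, since $\langle \nabla V(x), f(x) \rangle = -2x^2$ and $V(g(x)) - V(x) = -\tfrac{3}{4}x^2$.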

While a strict Lyapunov function can be dif¬ 
ficult to find, and this fact has motivated other 
more sophisticated stability analysis tools that 
have appeared in the literature, it is reassuring to 
know that whenever $\mathcal{A}$ is compact and asymptotically
stable, there exists a proper strict Lyapunov
function for $(\mathcal{A}, \mathcal{B}_{\mathcal{A}})$.


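Theorem 6 Under Assumption 1, if the compact
set $\mathcal{A}$ is asymptotically stable for (1), then
there exists a proper strict Lyapunov function
for $(\mathcal{A}, \mathcal{B}_{\mathcal{A}})$. More specifically, for each $\lambda > 0$
there exists a smooth function $V$ with dom $V = \mathcal{B}_{\mathcal{A}}$
such that $V(x) = 0$ if and only if $x \in \mathcal{A}$,
$\lim_{i \to \infty} V(x_i) = \infty$ when the sequence $\{x_i\}_{i=0}^{\infty}$,
satisfying $x_i \in \mathcal{B}_{\mathcal{A}}$ for all $i \in \mathbb{Z}_{\ge 0}$, is unbounded
or tends to the boundary of $\mathcal{B}_{\mathcal{A}}$, and
such that:

1. For all $x \in C \cap \mathcal{B}_{\mathcal{A}}$ and $f \in F(x)$,
$\langle \nabla V(x), f \rangle \le -\lambda V(x)$.

2. For all $x \in D \cap \mathcal{B}_{\mathcal{A}}$ and $g \in G(x)$,
$V(g) \le \exp(-\lambda) V(x)$.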
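One consequence of the two inequalities is worth spelling out, as a short derivation rather than a statement from the entry. Along any solution $x$ starting in the basin of attraction, with jump times $t_1 \le t_2 \le \cdots$ and $t_0 = 0$ (notation introduced here for the sketch), the function $V$ decays exponentially in the combined hybrid time $t + j$:

```latex
% Flow on [t_j, t_{j+1}]: d/dt V(x(t,j)) <= -lambda V(x(t,j)) for almost all t,
% so by the comparison (Gronwall) argument
V(x(t, j)) \le e^{-\lambda (t - t_j)} \, V(x(t_j, j)), \qquad t \in [t_j, t_{j+1}],
% Jump at t_{j+1}:
V(x(t_{j+1}, j+1)) \le e^{-\lambda} \, V(x(t_{j+1}, j)),
% Chaining the flow and jump estimates from (0,0) to (t,j):
V(x(t, j)) \le e^{-\lambda (t + j)} \, V(x(0, 0)) \qquad \text{for all } (t, j) \in \operatorname{dom} x .
```

The total flow time contributes the factor $e^{-\lambda t}$ and each of the $j$ jumps contributes a factor $e^{-\lambda}$, which is why the same rate $\lambda$ appears in both conditions.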


Summary and Future Directions 

Under Assumption 1, stability theory for hybrid 
dynamical systems is very similar to stability 
theory for differential equations or difference 
equations with continuous right-hand sides. In 
particular, Lyapunov functions are a very com¬ 
mon analysis tool for hybrid dynamical systems, 
though a Lyapunov function can be difficult to 
find in the same way that they are challenging to 
find for classical systems. With stability theory 
for hybrid dynamical systems firmly in place, 
future research is expected to exploit this theory 
more fully for the development of control algorithms
with new capabilities.

Abstract

The notion of stability allows one to study the qualitative
behavior of dynamical systems. In particular
it allows one to study the behavior of trajectories
close to an equilibrium point or to a motion.
The notion of stability that we discuss was
introduced in 1892 by the Russian mathematician
A.M. Lyapunov, in his doctoral thesis; hence,
it is often referred to as Lyapunov stability. In
this entry we discuss and characterize Lyapunov
stability for linear systems.


Keywords 

Eigenvalues; Equilibrium points; Linear systems; 
Motions; Stability 


Introduction 


Consider a linear, time-invariant, finite-dimensional
system, i.e., a system described by
equations of the form

$\sigma x = Ax + Bu,$
$y = Cx + Du,$ (1)

with $x(t) \in \mathbb{R}^n$, $u(t) \in \mathbb{R}^m$, $y(t) \in \mathbb{R}^p$ and
$A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $C \in \mathbb{R}^{p \times n}$, and
$D \in \mathbb{R}^{p \times m}$ constant matrices. In Eq. (1) $\sigma x(t)$ stands
for $\dot{x}(t)$ if the system is continuous-time and for
$x(t + 1)$ if the system is discrete-time. Since the
system is time-invariant, it is assumed, without
loss of generality, that all signals are defined for
$t \ge 0$, that is, if the system is continuous-time,
then $t \in \mathbb{R}_{+}$, i.e., the set of non-negative real
numbers, whereas if the system is discrete-time,
then $t \in \mathbb{Z}_{+}$, i.e., the set of non-negative integers.
For ease of notation, the argument $t$ is dropped
whenever this does not cause confusion, and we
use the notation $t \ge 0$ to denote either $\mathbb{R}_{+}$ or
$\mathbb{Z}_{+}$. Finally, we use either $x(t, x(0), u)$ or $x(t)$
to denote the solution of the first of equations (1)
at a given time $t \ge 0$, with the initial condition
$x(0)$ and the input signal $u$. The former is used
when it is important to keep track of the initial
state and external input $u$, whereas the latter is
used whenever there is no such need.

Definition 1 (Equilibrium) Consider the
system (1). Assume the input $u$ is constant, i.e.,
$u(t) = u_0$ for all $t \ge 0$ and for some constant
$u_0$. A state $x_e$ is an equilibrium of the system
associated to the input $u_0$ if $x_e = x(t, x_e, u_0)$, for
all $t \ge 0$.

Proposition 1 (Equilibria of linear systems)
Consider the system (1) and assume $u(t) = u_0$,
for all $t$, where $u_0$ is a constant vector. Then the
following hold.

• If $u_0 = 0$ then the origin is an equilibrium.

• For continuous-time systems, if $A$ is invertible,
for any $u_0$ there is a unique equilibrium
$x_e = -A^{-1} B u_0$. If $A$ is not invertible, the system
has either infinitely many equilibria or it has
no equilibria.

• For discrete-time systems, if $I - A$ is invertible,
for any $u_0$ there is a unique equilibrium
$x_e = (I - A)^{-1} B u_0$. If $I - A$ is not invertible, the
system has either infinitely many equilibria or
it has no equilibria.
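The two equilibrium formulas can be checked on a small example. The sketch below is restricted to two states and a single input so that the inverse can be written out by hand; the function names and the numerical matrices are hypothetical. The discrete-time case reuses the continuous-time routine because $(I - A)^{-1} = -(A - I)^{-1}$.

```python
def equilibrium_ct(A, B, u0):
    """x_e = -A^{-1} B u0 for a 2x2 matrix A, a 2-vector B and a scalar u0."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("A is singular: no equilibrium or infinitely many")
    inv = [[A[1][1] / det, -A[0][1] / det],
           [-A[1][0] / det, A[0][0] / det]]
    Bu = [B[0] * u0, B[1] * u0]
    return [-(inv[0][0] * Bu[0] + inv[0][1] * Bu[1]),
            -(inv[1][0] * Bu[0] + inv[1][1] * Bu[1])]

def equilibrium_dt(A, B, u0):
    """x_e = (I - A)^{-1} B u0, computed as -(A - I)^{-1} B u0."""
    A_minus_I = [[A[0][0] - 1.0, A[0][1]], [A[1][0], A[1][1] - 1.0]]
    return equilibrium_ct(A_minus_I, B, u0)

# Hypothetical continuous-time example.
A = [[0.0, 1.0], [-2.0, -3.0]]
B = [0.0, 1.0]
x_e = equilibrium_ct(A, B, 4.0)
# Residual A x_e + B u0 should vanish at an equilibrium.
residual = [A[i][0] * x_e[0] + A[i][1] * x_e[1] + B[i] * 4.0 for i in range(2)]
print(x_e, residual)

# Hypothetical discrete-time example.
x_e_dt = equilibrium_dt([[0.5, 0.0], [0.0, 0.5]], [1.0, 1.0], 1.0)
print(x_e_dt)
```

For larger systems one would solve the linear equation $A x_e = -B u_0$ with a numerical linear algebra routine instead of forming the inverse explicitly.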

Proposition 2 Consider the continuous-time,
time-invariant, linear system

$\dot{x} = Ax + Bu,$
$y = Cx + Du,$

and the initial condition $x(0) = x_0$. Then, for all
$t \ge 0$,

$x(t) = e^{At} x_0 + \int_0^t e^{A(t-\tau)} B u(\tau)\, d\tau$ (2)

and

$y(t) = C e^{At} x_0 + \int_0^t C e^{A(t-\tau)} B u(\tau)\, d\tau + D u(t).$ (3)

Proposition 3 Consider the discrete-time, time-invariant,
linear system (to simplify the notation
we use $x^{+}(t)$ to denote $x(t + 1)$ and we drop the
argument $t$)

$x^{+} = Ax + Bu,$
$y = Cx + Du,$

and the initial condition $x(0) = x_0$. Then, for all
$t \ge 0$,

$x(t) = A^{t} x_0 + \sum_{i=0}^{t-1} A^{t-1-i} B u(i)$ (4)

and

$y(t) = C A^{t} x_0 + \sum_{i=0}^{t-1} C A^{t-1-i} B u(i) + D u(t).$ (5)
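The closed-form expression for the discrete-time state can be checked against direct iteration of the recursion. The sketch below does this for a hypothetical single-input example with integer data, so the two results can be compared exactly; the helper names are illustrative only.

```python
def mat_vec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_pow(A, k):
    """A**k by repeated multiplication (k >= 0), starting from the identity."""
    n = len(A)
    R = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        R = mat_mul(R, A)
    return R

def state_by_formula(A, B, x0, u, t):
    """x(t) = A^t x0 + sum_{i=0}^{t-1} A^{t-1-i} B u(i), single input."""
    x = mat_vec(mat_pow(A, t), x0)
    for i in range(t):
        Bu = [b * u(i) for b in B]
        term = mat_vec(mat_pow(A, t - 1 - i), Bu)
        x = [a + b for a, b in zip(x, term)]
    return x

def state_by_iteration(A, B, x0, u, t):
    """Iterate x(k+1) = A x(k) + B u(k) for k = 0, ..., t-1."""
    x = list(x0)
    for i in range(t):
        Ax = mat_vec(A, x)
        x = [a + b * u(i) for a, b in zip(Ax, B)]
    return x

# Hypothetical data: a discrete-time double integrator driven by a unit step.
A = [[1, 1], [0, 1]]
B = [0, 1]
x0 = [1, 0]
u = lambda i: 1
for t in range(6):
    print(t, state_by_formula(A, B, x0, u, t), state_by_iteration(A, B, x0, u, t))
```

The formula recomputes matrix powers for each term and is meant for verification, not efficiency; the iteration is what one would normally run.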


Definitions 

In this section we provide some notions and defi¬ 
nitions which are applicable to general dynamical 
systems. 

Definition 2 (Lyapunov stability) Consider the
system (1) with $u(t) = u_0$, for all $t \ge 0$ and
for some constant $u_0$. Let $x_e$ be an equilibrium
point. The equilibrium is stable (in the sense of
Lyapunov) if for every $\epsilon > 0$ there exists a
$\delta = \delta(\epsilon) > 0$ such that $\|x(0) - x_e\| < \delta$ implies
$\|x(t) - x_e\| < \epsilon$, for all $t \ge 0$, where the notation
$\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^n$.

In stability theory the quantity $x(0) - x_e$ is
called initial perturbation, and $x(t)$ is called perturbed
evolution. Therefore, the definition of stability
can be interpreted as follows. An equilibrium
point $x_e$ is stable if, however we select
a tolerable deviation $\epsilon$, there exists a (possibly
small) neighborhood of the equilibrium $x_e$ such
that all initial conditions in this neighborhood
yield trajectories which are within the tolerable
deviation.

The property of stability dictates a condition
on the evolution of the system for all $t \ge 0$. Note,
however, that in the definition of stability, we
have not requested that the perturbed evolution
converge asymptotically, that is, for $t \to \infty$, to
$x_e$. This convergence property is very important
in applications, as it allows one to characterize the
situation in which not only the perturbed evolution
remains close to the unperturbed evolution,
but it also converges to the initial (unperturbed)
evolution. To capture this property we introduce
a new definition.

Definition 3 (Asymptotic stability) Consider
the system (1) with $u(t) = u_0$, for all $t \ge 0$ and
for some constant $u_0$. Let $x_e$ be an equilibrium
point. The equilibrium is asymptotically stable
if it is stable and if there exists a constant
$\delta_a > 0$ such that $\|x(0) - x_e\| < \delta_a$ implies
$\lim_{t \to \infty} \|x(t) - x_e\| = 0$.

In summary, an equilibrium point is asymptotically
stable if it is stable, and whenever the
initial perturbation is inside a certain neighborhood
of $x_e$, the perturbed evolution converges,
asymptotically, to the equilibrium point, which is
thus said to be attractive. From a physical point
of view, this means that all sufficiently small
initial perturbations give rise to effects which can
be a priori bounded (stability) and which vanish
asymptotically (attractivity).

It is important to highlight that, in general,
attractivity does not imply stability: it is possible
to have an equilibrium of a system which is not
stable (i.e., it is unstable), yet for all initial perturbations,
the perturbed evolution converges to
the equilibrium. This however is not the case for
linear systems, as discussed in section “Stability
of Linear Systems”. We conclude the section with
two simple examples illustrating the notions that
have been introduced.

Example 1 Consider the discrete-time system
$x^+ = -x$, with $x(t) \in \mathbb{R}$. This system has a
unique equilibrium at $x_e = 0$. Note that for any
initial condition $x_0 \in \mathbb{R}$, one has

$$x(2t-1) = -x_0, \qquad x(2t) = x_0,$$

for all integers $t \ge 1$. This implies that the
equilibrium is stable, but not attractive.

Example 2 Consider the continuous-time system

$$\dot{x}_1 = \omega x_2, \qquad \dot{x}_2 = -\omega x_1,$$

with $\omega$ a positive constant. The system has a
unique equilibrium at $x_e = 0$. This equilibrium
is stable, but not attractive. To see this
note that, along the trajectories of the system,
$x_1 \dot{x}_1 + x_2 \dot{x}_2 = 0$, and this implies that, along
the trajectories of the system, $x_1^2(t) + x_2^2(t)$ is
constant, i.e., $x_1^2(t) + x_2^2(t) = x_1^2(0) + x_2^2(0)$.
Therefore, the state of the system remains on
the circle centered at the origin and with radius
$\sqrt{x_1^2(0) + x_2^2(0)}$, for all $t \ge 0$: the condition for
stability holds with $\delta(\epsilon) = \epsilon$.
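The two examples can be reproduced numerically. The sketch below iterates the map of Example 1 and evaluates the solution of Example 2, which is a rotation of the initial state; the function names and the numerical values are illustrative assumptions.

```python
import math

def example1(x0, steps):
    """Iterate x^+ = -x and return the whole trajectory."""
    trajectory = [x0]
    for _ in range(steps):
        trajectory.append(-trajectory[-1])
    return trajectory

def example2(x1_0, x2_0, omega, t):
    """Solution of x1' = omega*x2, x2' = -omega*x1 at time t (a rotation)."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return (c * x1_0 + s * x2_0, -s * x1_0 + c * x2_0)

# Example 1: the state keeps alternating between x0 and -x0.
alternating = example1(0.2, 5)

# Example 2: the distance from the origin never changes (here it stays 0.5).
radii = [math.hypot(*example2(0.3, -0.4, 2.0, 0.1 * k)) for k in range(50)]
```

In both cases the perturbed evolution stays as close to the equilibrium as the initial perturbation was, but it never approaches it.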

Definition 4 (Global asymptotic stability)
Consider the system (1) with $u(t) = u_0$, for
all $t \ge 0$ and for some constant $u_0$. Let $x_e$ be
an equilibrium point. The equilibrium is globally
asymptotically stable if it is stable and if, for all
$x(0)$, $\lim_{t \to \infty} \|x(t) - x_e\| = 0$.

The property of (global) asymptotic stability
can be strengthened by imposing conditions on the
convergence speed of $\|x(t) - x_e\|$.

Definition 5 (Exponential stability) Consider
the system (1) with $u(t) = u_0$, for all $t \ge 0$ and
for some constant $u_0$. Let $x_e$ be an equilibrium
point. The equilibrium is exponentially stable if
there exists $\lambda > 0$, in the case of continuous-time
systems, and $0 < \lambda < 1$ in the case of discrete-time
systems, such that for all $\epsilon > 0$, there exists
a $\delta = \delta(\epsilon) > 0$ such that $\|x(0) - x_e\| < \delta$ implies
$\|x(t) - x_e\| < \epsilon e^{-\lambda t}$, in the case of continuous-time
systems, and $\|x(t) - x_e\| < \epsilon \lambda^t$, in the case
of discrete-time systems, for all $t \ge 0$.

Definition 6 (Stability of motion) Consider the
system (1). Let

$$\mathcal{M} = \{(t, x(t)) \in T \times \mathbb{R}^n\},$$

with $x(t) = x(t, x_0, u)$, for given $x_0$ and $u$,
and $T = \mathbb{R}^+$, in the case of continuous-time
systems, and $T = \mathbb{Z}^+$, in the case of discrete-time
systems, be a motion. The motion is stable
if for every $\epsilon > 0$ there exists a $\delta = \delta(\epsilon) > 0$
such that $\|x(0) - x_0\| < \delta$ implies

$$\|x(t, x(0), u) - x(t, x_0, u)\| < \epsilon, \qquad (6)$$

for all $t \ge 0$.

The notion of stability of a motion is substantially
the same as the notion of stability of
an equilibrium. The key point is that the
time parametrization matters, i.e., a motion
is stable if, for small initial perturbations, the
perturbed evolution is close, for any fixed $t \ge 0$,
to the non-perturbed evolution. Closeness of the
perturbed and unperturbed trajectories does not
by itself imply that the motion is stable: in fact
the trajectories may be close but may be followed
with different timing, which means that for some
$t \ge 0$ condition (6) may be violated.

Stability of Linear Systems 

The notion of stability relies on the knowledge 
of the trajectories of the system. As a result, 
even if this notion is very elegant and useful 
in applications, it is in general hard to assess 
stability of an equilibrium or of a motion. There 
are, however, classes of systems for which it 
is possible to give stability conditions without 
relying upon the knowledge of the trajectories. 
Linear systems belong to one such class. In this 
section we study the stability properties of linear 
systems, and we show that, because of the linear 
structure, it is possible to assess the properties 
of stability and attractivity in a simple way. To 
begin with, we recall some properties of linear 
systems. 

Proposition 4 Consider a linear, time-invariant
system. (Asymptotic) stability of one motion implies
(asymptotic) stability of all motions. In
particular, (asymptotic) stability of any motion
implies and is implied by (asymptotic) stability of
the equilibrium $x_e = 0$.

The above statement, together with the result 
in Proposition 1, implies the following important 
properties. 

Proposition 5 If the origin of a linear system 
is asymptotically stable, then, necessarily, the 
origin is the only equilibrium of the system for 
u = 0. Moreover, asymptotic stability of the zero 
equilibrium is always global. Finally, asymptotic 
stability implies exponential stability. 

The above discussion shows that the stability
properties of a motion (e.g., an equilibrium) of a
linear system are inherited by all motions of the
system. Moreover, for linear systems, local properties
are always global properties. This means
that, with some abuse of terminology, we can
attribute the stability properties to the linear system
itself: for example, we say that a linear system is stable
to mean that all its motions are stable. Stability
properties of a linear, time-invariant system are
therefore properties of the free evolution of its
state: for this class of systems, it is possible to
obtain simple stability tests.

Proposition 6 A linear, time-invariant system is
stable if and only if $\|e^{At}\| \le k$, for continuous-time
systems, or $\|A^t\| \le k$, for discrete-time
systems, for all $t \ge 0$ and for some $k > 0$. It is
asymptotically stable if and only if $\lim_{t \to \infty} e^{At} = 0$,
for continuous-time systems, or $\lim_{t \to \infty} A^t = 0$, for
discrete-time systems.

To state the next result we
need to define the geometric multiplicity of an
eigenvalue. To this end we recall a few facts. Consider
a matrix $A \in \mathbb{R}^{n \times n}$ and a polynomial $p(\lambda)$.
The polynomial $p(\lambda)$ is a zeroing polynomial for
$A$ if $p(A) = 0$. Note that, by the Cayley-Hamilton
Theorem, the characteristic polynomial of $A$ is
a zeroing polynomial for $A$. Among all zeroing
polynomials there is a unique monic polynomial
$p_M(\lambda)$ with smallest degree. This polynomial is
called the minimal polynomial of $A$. Note that
the minimal polynomial of $A$ is a divisor of the
characteristic polynomial of $A$. If $A$ has $r \le n$
distinct eigenvalues $\lambda_1, \ldots, \lambda_r$, then

$$p_M(\lambda) = (\lambda - \lambda_1)^{m_1} (\lambda - \lambda_2)^{m_2} \cdots (\lambda - \lambda_r)^{m_r},$$

where the number $m_i$ denotes, by definition, the
geometric multiplicity of $\lambda_i$, for $i = 1, \ldots, r$.
This means that the geometric multiplicity of $\lambda_i$
equals the multiplicity of $\lambda_i$ as a root of $p_M(\lambda)$.
(Elsewhere this number is often called the index of
$\lambda_i$, with the term geometric multiplicity reserved
for the dimension of the eigenspace; the definition
given here is the one used in what follows.)
Recall, finally, that the multiplicity of $\lambda_i$ as a
root of the characteristic polynomial is called
algebraic multiplicity.

Proposition 7 The equilibrium $x_e = 0$ of a
linear, time-invariant system is stable if and only
if the following conditions hold.

• In the case of continuous-time systems, the
eigenvalues of $A$ with geometric multiplicity
equal to one have non-positive real part, and
the eigenvalues of $A$ with geometric multiplicity
larger than one have negative real part.





• In the case of discrete-time systems, the eigenvalues
of $A$ with geometric multiplicity equal
to one have modulus not larger than one, and
the eigenvalues of $A$ with geometric multiplicity
larger than one have modulus smaller than
one.


Proof Let $\lambda_1, \lambda_2, \ldots, \lambda_r$, with $r \ge 1$, be the
distinct eigenvalues of $A$, i.e., the distinct roots
of the characteristic polynomial of $A$. Then

$$e^{At} = \sum_{i=1}^{r} \sum_{k=1}^{m_i} R_{ik} \frac{t^{k-1}}{(k-1)!} e^{\lambda_i t}$$

for some matrices $R_{ik}$, where $m_i$ is the geometric
multiplicity of the eigenvalue $\lambda_i$. This matrix
is bounded if and only if the conditions in the
statement hold. Similarly,

$$A^t = \sum_{i=1}^{r} \sum_{k=1}^{m_i} R_{ik} \frac{t^{k-1}}{(k-1)!} \lambda_i^{t-k+1}$$

for some matrices $R_{ik}$, and this is bounded if and
only if the conditions in the statement hold. ◁
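The role of the multiplicity in the minimal polynomial can be seen on two discrete-time matrices that share the eigenvalue 1. The following sketch (plain Python; the matrices are illustrative assumptions) computes the largest entry of $A^t$: for the identity matrix the minimal polynomial is $\lambda - 1$ and the powers stay bounded, whereas for a Jordan block the minimal polynomial is $(\lambda - 1)^2$ and the powers grow linearly in $t$.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def largest_entry_of_power(A, t):
    """Return the largest absolute entry of A^t (repeated multiplication)."""
    n = len(A)
    P = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(t):
        P = mat_mul(P, A)
    return max(abs(v) for row in P for v in row)

A_identity = [[1.0, 0.0], [0.0, 1.0]]   # minimal polynomial (l - 1)
A_jordan = [[1.0, 1.0], [0.0, 1.0]]     # minimal polynomial (l - 1)^2
A_decaying = [[0.5, 1.0], [0.0, 0.5]]   # repeated eigenvalue inside the unit disk

bounded = largest_entry_of_power(A_identity, 200)
growing = largest_entry_of_power(A_jordan, 200)
vanishing = largest_entry_of_power(A_decaying, 200)
```

The third matrix shows that a repeated eigenvalue is harmless when its modulus is smaller than one, in agreement with the second condition of the proposition.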


Proposition 8 The equilibrium $x_e = 0$ of a
linear, time-invariant system is asymptotically
stable if and only if the following conditions
hold.

• In the case of continuous-time systems, the
eigenvalues of $A$ have negative real part.

• In the case of discrete-time systems, the eigenvalues
of $A$ have modulus smaller than one.

Proof The proof is similar to the one of the
previous proposition, once it is noted that, for
the considered class of systems and as stated in
Proposition 6, asymptotic stability implies and is
implied by boundedness and convergence of $e^{At}$
or $A^t$. ◁
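For a $2 \times 2$ matrix the eigenvalue test can be carried out in closed form from the trace and the determinant. The sketch below is an illustration restricted to this special case; the function names and test matrices are assumptions.

```python
import cmath

def eigenvalues_2x2(A):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return ((trace + disc) / 2.0, (trace - disc) / 2.0)

def is_asymptotically_stable(A, continuous=True):
    """Negative real parts (continuous time) or moduli below one (discrete time)."""
    eigs = eigenvalues_2x2(A)
    if continuous:
        return all(l.real < 0.0 for l in eigs)
    return all(abs(l) < 1.0 for l in eigs)

damped = [[0.0, 1.0], [-2.0, -3.0]]      # eigenvalues -1 and -2
oscillator = [[0.0, 1.0], [-1.0, 0.0]]   # eigenvalues +j and -j (Example 2 with omega = 1)
contraction = [[0.5, 1.0], [0.0, 0.8]]   # eigenvalues 0.5 and 0.8
jordan_one = [[1.0, 1.0], [0.0, 1.0]]    # eigenvalue 1 (repeated)
```

The oscillator fails the test because its eigenvalues lie on the imaginary axis: it is stable but not asymptotically stable.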


Remark 1 For linear, time-varying systems,
i.e., systems described by equations of the form

$$\sigma x = A(t)x + B(t)u,$$
$$y = C(t)x + D(t)u,$$

(where $\sigma x$ stands for $\dot{x}$ in continuous time and for
$x^+$ in discrete time) it is possible to provide stability conditions in
the spirit of the boundedness and convergence
conditions in Proposition 6. These require the
definition of a matrix, the so-called monodromy
matrix, which describes the free evolution of the
state of the system. It is, however, not possible
to provide conditions in terms of eigenvalues
of the matrix $A(t)$ similar to the conditions in
Propositions 7 and 8.

We conclude this discussion with an alternative
characterization of asymptotic stability in
terms of linear matrix inequalities.

Proposition 9 The equilibrium $x_e = 0$ of a
linear, time-invariant system is asymptotically
stable if and only if the following conditions
hold.

• In the case of continuous-time systems, there
exists a symmetric positive definite matrix
$P = P'$ such that $A'P + PA < 0$.

• In the case of discrete-time systems, there exists
a symmetric positive definite matrix $P = P'$
such that $A'PA - P < 0$.
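Given a candidate matrix $P$, the two inequalities are easy to verify. The sketch below does so for $2 \times 2$ matrices, using the leading-minor test for definiteness; the matrices `A_c`, `P_c`, `A_d`, `P_d` are illustrative assumptions (`P_c` was chosen so that $A'P + PA = -I$). Note that a failed check only means that this particular $P$ is not a certificate.

```python
def transpose(M):
    return [list(row) for row in zip(*M)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_positive_definite_2x2(M):
    """Leading-minor test; M is assumed to be symmetric."""
    return M[0][0] > 0.0 and M[0][0] * M[1][1] - M[0][1] * M[1][0] > 0.0

def lyapunov_residual(A, P, continuous=True):
    """Return A'P + PA (continuous time) or A'PA - P (discrete time)."""
    At = transpose(A)
    if continuous:
        left, right = mat_mul(At, P), mat_mul(P, A)
        return [[left[i][j] + right[i][j] for j in range(2)] for i in range(2)]
    prod = mat_mul(mat_mul(At, P), A)
    return [[prod[i][j] - P[i][j] for j in range(2)] for i in range(2)]

def certifies_stability(A, P, continuous=True):
    """True if P > 0 and the residual is negative definite."""
    Q = lyapunov_residual(A, P, continuous)
    minus_Q = [[-q for q in row] for row in Q]
    return is_positive_definite_2x2(P) and is_positive_definite_2x2(minus_Q)

A_c = [[0.0, 1.0], [-2.0, -3.0]]
P_c = [[1.25, 0.25], [0.25, 0.25]]
A_d = [[0.5, 0.1], [0.0, 0.3]]
P_d = [[1.0, 0.0], [0.0, 1.0]]
oscillator = [[0.0, 1.0], [-1.0, 0.0]]   # only stable: P = I gives a zero residual
```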

To complete our discussion we stress that 
stability properties are invariant with respect to 
changes in coordinates in the state space. 

Corollary 1 Consider a linear, time-invariant
system and assume it is (asymptotically)
stable. Then any representation obtained by
means of a change of coordinates of the form
$x(t) = L\hat{x}(t)$, with $L$ constant and invertible, is
(asymptotically) stable.

Proof The proof is based on the observation
that the change of coordinates transforms the
matrix $A$ into $\hat{A} = L^{-1}AL$ and that the matrices
$A$ and $\hat{A}$ are similar, that is, they have the same
characteristic and minimal polynomials. ◁


Summary and Future Directions 

The property of Lyapunov stability is instrumental
in characterizing the qualitative behavior of
dynamical systems. For linear, time-invariant systems,
this property can be studied on the basis of
the location, and multiplicity, of the eigenvalues
of the matrix $A$. The property of Lyapunov stability
can be studied for more general classes of
systems, including nonlinear systems, distributed
parameter systems, and hybrid systems, to which
the basic definitions given in this article apply.


Cross-References 

► Feedback Stabilization of Nonlinear Systems 

► Linear Systems: Continuous-Time, Time-Invariant
State Variable Descriptions

► Linear Systems: Continuous-Time, Time-Varying
State Variable Descriptions

► Linear Systems: Discrete-Time, Time-Invariant 
State Variable Descriptions 

► Linear Systems: Discrete-Time, Time-Varying, 
State Variable Descriptions 

► Lyapunov’s Stability Theory 

► Lyapunov Methods in Power System Stability 

► Power System Voltage Stability 

► Small Signal Stability in Electric Power Systems

► Stability and Performance of Complex Systems 
Affected by Parametric Uncertainty 


Recommended Reading 

Classical references on Lyapunov stability theory 
and on stability theory for linear systems are 
given below.

Abstract

Information about certain safety or quality
parameters during a batch process is valuable
for a variety of reasons. If a direct measurement
is too expensive, too slow, or nonexistent,
a state estimator that estimates the desired
quantities based on a model and various other
measurements may be a good alternative. The
most prominent method is calorimetry, where the
heat of reaction is measured. This entry gives an
overview of different alternatives that support a
safe and successful batch operation.

Keywords 

Calorimetry; Observer; Soft sensor; State 
estimator 

Introduction 

Continuous processes are used to produce a product
at a constant rate. They are designed to
operate at constant conditions, i.e., the state of
the process (conversion, temperatures, pressures,
concentrations, etc.) does not vary. In contrast,
(semi-)batch processes execute a recipe, which
means that they are typically operated within a
wide range of states. The state of the (semi-)batch
process should constantly be monitored.
This information is useful for several purposes:

• Process safety: abnormal process states such
as the accumulation of hazardous substances
or reactive materials may lead to dangerous
situations such as runaway reactions. The earlier
an abnormal state is detected, the better
it can be corrected, and the higher the
probability that losses can be avoided.

• Quality: if the batch is not operated along
the standard trajectory, off-spec product may
result, which in turn leads to extra effort
and/or second-grade product if this is discovered
in time, and to a customer complaint if it is not
discovered before delivery.

• Profit: the better the state is known, the less
conservative the underlying control scheme
needs to be and the more the process can
be pushed to its limits. This may lead to
a higher throughput, fewer by-products, or
less energy consumption. Advanced control
schemes which are typically applied for this
purpose require knowledge of the state of the
process.

The literature offers a wide range of ways
to monitor a batch process. In some processes,
the observation of simple measurements like
temperatures, pressures, and the time that a
process step takes for execution is sufficient
to guarantee a safe, standard product in
minimum time. Examples include some melt
polymerizations.

However, as soon as the process is more complex,
more information than just temperatures
and pressures is required to monitor the process
to meet the goals mentioned above. It may be
sufficient to measure other easy-to-measure properties
like conductivities, flow rates, pH values,
sound velocities, attenuations, etc. However, in
many cases these measurements do not give the
complete state of the system. Properties like complex
gas phase compositions cannot be measured
this way. This might require the installation of
more sophisticated measurements such as NIR
spectroscopy, online gas chromatography, Raman
spectroscopy, or ion mobility spectroscopy. These
measurements require significant effort in terms
of installation cost and maintenance. In other
situations, no online measurement may be available
at all. These cases include the measurement
of the distribution of the molecular weight in a
polymer melt.

In these cases, where direct online measurements
are either too expensive or not available at
all, several methods are available to obtain information
on the status of the batch (►Estimation,
Survey on).


• Statistical Methods

Experiences from historical batches are used
in a statistical way to predict whether a batch
runs normally. This can, e.g., be accomplished
by defining a golden batch and a corresponding
corridor around its trajectories. More
sophisticated methods use principal component
analysis (PCA) or partial least squares
(PLS) to detect abnormal situations.
These methods are even capable of pointing
at the origin of a possible problem. They are
restricted to problem detection and typically
cannot be used for control purposes.

• Model-Based State Estimation

The state of the system (temperatures,
pressures, concentrations, etc.) is estimated
online, which allows for problem detection
as well as control applications. This method
is described in more detail in the next
section.
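The golden-batch idea can be sketched in a few lines. In the following illustration the corridor is taken as the mean of historical batches plus or minus a multiple of their standard deviation at each sampling instant; this particular construction, the function names, and all temperature values are assumptions made for the example.

```python
from statistics import mean, stdev

def golden_batch_corridor(batches, width=3.0):
    """Per-sample corridor: mean +/- width * standard deviation of past batches."""
    lower, upper = [], []
    for samples in zip(*batches):
        m, s = mean(samples), stdev(samples)
        lower.append(m - width * s)
        upper.append(m + width * s)
    return lower, upper

def violations(batch, lower, upper):
    """Indices of the samples at which the batch leaves the corridor."""
    return [k for k, v in enumerate(batch) if not lower[k] <= v <= upper[k]]

# Hypothetical reactor temperatures of three historical batches (four samples each).
historical = [
    [20.0, 35.0, 50.0, 60.0],
    [21.0, 36.0, 49.0, 61.0],
    [19.0, 34.0, 51.0, 59.0],
]
lower, upper = golden_batch_corridor(historical)

normal_batch = [20.5, 35.5, 50.2, 60.1]
abnormal_batch = [20.5, 35.5, 55.0, 66.0]
```

Such a check flags when a batch deviates but, as noted above, it does not provide a state estimate that could be used for control.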

General reviews of state estimation techniques
can be found in Besancon (2007) and Schei (2008);
a review of industrial applications is given, e.g.,
in Fortuna et al. (2007).

Model-Based State Estimation 

The basic idea of a state estimator (which is
frequently also called observer or soft sensor) is
to run a mathematical model of the process in parallel
to the process itself, to compare the available
measurements to the values which are predicted
by the model, and to correct the estimated state by
a suitable function of the observed error, usually
an additive correction term that depends on the
error. For a state estimator to converge to the
true state, the considered system needs to be
observable. For details, see ► Controllability and
Observability. The scheme of a state estimator
is sketched in Fig. 1. The real system processes
the input $u$ to give the system state $x$ which
is affected by the system noise $\xi$. The measurements
$y$ are perturbed by the measurement
noise $\varphi$. The model predicts a system state $\hat{x}$
and a measurement $\hat{y}$. The difference between the
measured value $y$ and predicted value $\hat{y}$ is then
fed back to correct the estimated state.

State Estimation for Batch Processes, Fig. 1 Principle of a state estimator



For linear systems, the most commonly used
state estimators are the Luenberger observer and
the Kalman filter (►Kalman Filters). Both multiply
the prediction error $(y - \hat{y})$ by a weighting
matrix $K$ to update the estimated state $\hat{x}$:

$$\dot{\hat{x}} = A\hat{x} + Bu + K(y - \hat{y})$$

The two techniques use different approaches for
determining the matrix $K$:

Luenberger Observer The basic assumption is
that the deviation $e(t)$ between $x$ and $\hat{x}$ is due
to wrong initial values $\hat{x}_0$. $K$ is computed by
choosing the desired speed of convergence of
the error

$$\dot{e}(t) = \dot{x}(t) - \dot{\hat{x}}(t) = (A - KC)\, e(t)$$

to zero. This is done by placing the eigenvalues
of the matrix $(A - KC)$ in the left half
plane.

Kalman Filter The basic assumption is that the
error $e(t)$ is caused by white noise in the
system $\xi$ as well as in the measurement $\varphi$.
The idea is to minimize the expectation of the
quadratic error

$$\min_{\hat{x}} E\left( (x(t) - \hat{x}(t))^T (x(t) - \hat{x}(t)) \right).$$

$K$ is computed from the noise covariance matrices
and the system dynamics and varies with
time.
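The Luenberger approach can be illustrated with a small simulation. The sketch below assumes a second-order system with one measured state and integrates both the system and the observer with the forward Euler method; the matrices, the gain, and the step size are illustrative assumptions. The gain $K = (8, 4)'$ was chosen by hand so that $A - KC$ has eigenvalues $-5$ and $-6$.

```python
import math

def simulate_observer(K, dt=1e-3, steps=3000):
    """Return the norm of the estimation error x - x_hat after each Euler step."""
    A = ((0.0, 1.0), (-2.0, -3.0))   # assumed system matrix (eigenvalues -1, -2)
    B = (0.0, 1.0)
    C = (1.0, 0.0)                   # only the first state is measured
    x = [1.0, 0.0]                   # true initial state
    x_hat = [0.0, 0.0]               # wrong initial estimate
    errors = []
    for _ in range(steps):
        u = 1.0                                   # constant input
        y = C[0] * x[0] + C[1] * x[1]             # measurement
        y_hat = C[0] * x_hat[0] + C[1] * x_hat[1] # predicted measurement
        dx = [A[i][0] * x[0] + A[i][1] * x[1] + B[i] * u for i in range(2)]
        dx_hat = [A[i][0] * x_hat[0] + A[i][1] * x_hat[1] + B[i] * u
                  + K[i] * (y - y_hat) for i in range(2)]
        x = [x[i] + dt * dx[i] for i in range(2)]
        x_hat = [x_hat[i] + dt * dx_hat[i] for i in range(2)]
        errors.append(math.hypot(x[0] - x_hat[0], x[1] - x_hat[1]))
    return errors

errors_with_feedback = simulate_observer(K=(8.0, 4.0))
errors_model_only = simulate_observer(K=(0.0, 0.0))
```

With $K = 0$ the estimate still converges here because the assumed system is itself asymptotically stable, but only at the speed of the system dynamics; the feedback term makes the error decay much faster.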


The tuning of the state estimators is not trivial.
The larger the absolute value of the eigenvalues
in the Luenberger approach, the faster the error
will converge to zero but the more prone the state
estimator will be to measurement noise. A similar
trade-off exists for the Kalman filter where the
covariance matrices of the noise terms $\xi$ and $\varphi$
and the covariance of the initial state error $\xi_0$ need to
be defined.
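The influence of the covariances on the Kalman gain can be made concrete with a scalar example. The sketch below uses a discrete-time scalar model, which is an assumption made for brevity (the text above is written in continuous time); `q`, `r`, and `p0` denote the assumed variances of the system noise, the measurement noise, and the initial state error.

```python
def kalman_gains(a, q, r, p0, steps):
    """Gain sequence of a scalar Kalman filter for x[k+1] = a*x[k] + w, y[k] = x[k] + v."""
    gains = []
    p = p0
    for _ in range(steps):
        p_pred = a * a * p + q        # predicted error variance
        k = p_pred / (p_pred + r)     # gain applied to the prediction error
        p = (1.0 - k) * p_pred        # error variance after the update
        gains.append(k)
    return gains

# Small system noise, noisy sensor: the filter ends up trusting the model.
trusting_model = kalman_gains(a=1.0, q=0.01, r=1.0, p0=1.0, steps=200)

# Large system noise, accurate sensor: the filter follows the measurements.
trusting_sensor = kalman_gains(a=1.0, q=1.0, r=0.01, p0=1.0, steps=200)
```

The gain starts from a value determined by `p0` and then settles to a steady-state value fixed by the ratio of `q` to `r`, which is where the tuning trade-off enters.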

For nonlinear systems, a variety of
approaches is available. The most frequently used
estimators use the nonlinear model
for the prediction of the state and linearizations
of the system dynamics to update the
matrix $K$. The extended Kalman filter (EKF)
(►Extended Kalman Filters) and the extended
Luenberger observer (ELO) are representatives
of this class of approaches. The EKF is the most
widely used. Extensions are the constrained EKF
and the unscented EKF.

As examples are known where the EKF fails
due to the nonlinearity of the system, methods based
on ideas other than the linearization of system
dynamics have been developed. These methods
include the moving horizon estimator (MHE)
(►Moving Horizon Estimation) and the particle
filter. Because of the increasing capabilities
of modern computers and significant improvements
in dynamic optimization algorithms, the
MHE is a very promising alternative. The idea
of the method is to minimize the sum of the
squared errors of the system noise $\xi_i$, the measurement
noise $\varphi_i$, and the error of the initial state
$\xi_{k-N}$, which are weighted by weighting matrices
$P_k$, $Q$, and $R$, over a predefined horizon of past
sampling steps $k-N, \ldots, k$:

$$\min_{\xi_i} \; \xi_{k-N}^T P_k^{-1} \xi_{k-N} + \sum_{i=k-N}^{k-1} \xi_i^T Q^{-1} \xi_i + \sum_{i=k-N+1}^{k} \varphi_i^T R^{-1} \varphi_i$$

s.t. the system model and the measurement
equations are satisfied and further
inequality constraints
(e.g., physical limits of variables) hold.

The possibility to define constraints on the estimated
states, e.g., that concentrations must be
nonnegative, is an important advantage of the
MHE approach. If the horizon is reduced to one
single measurement, the constrained extended
Kalman filter results, which combines the simplicity
of the EKF with the possibility to include
constraints on the estimated states. Efficient implementations
of the MHE have led to the method
being capable of estimating the state of rather
large systems in real time (Diehl et al. 2006;
Küpper and Engell 2007).


Calorimetry 

Temperature measurements are probably the
cheapest available measurements in chemical
processes, and most plants are typically
well equipped with temperature sensors. To
exploit temperature measurements, e.g., for
the observation of exothermic or endothermic
reactions, heat balances are set up and solved
for the heat of reaction, which then enables
the computation of the reaction rate. This is
typically referred to as calorimetry. Reviews
are given, e.g., in Hergeth (2006), McKenna
et al. (2000), and Landau (1996). For a jacketed
semi-batch reactor, the heat balance around the
reactor typically reads (see also Fig. 2)

$$C_{p,R} \frac{dT_R}{dt} = \dot{Q}_R + kA\,(T_J - T_R) + \sum_i \dot{m}_{F,i}\, c_{p,F,i}\, (T_{F,i} - T_R), \qquad (1)$$

where $\dot{Q}_R$ represents the heat of reaction, $kA$
the overall heat transfer coefficient between the
reactor content and the jacket, $T_R$ the reactor
temperature, $T_J$ the jacket temperature, $T_F$ the
feed temperature, $C_{p,R}$ the overall heat capacity
of the reactor, and the last term on the right side
is the enthalpy added by the feed to the reactor. If
$kA$ is known, $\dot{Q}_R$ can directly be computed as all
other quantities in Eq. (1) are known or measured.
This is referred to as heat flow calorimetry.
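Solving Eq. (1) for the heat of reaction is a one-line computation once $kA$ is known. The sketch below shows this rearrangement; the function name, the argument names, and all numerical values (SI units, heat of reaction as a heat flow in W) are illustrative assumptions.

```python
def heat_of_reaction(C_pR, dT_R_dt, kA, T_J, T_R, feeds):
    """Rearrange Eq. (1) for the heat of reaction.

    feeds is an iterable of (mass flow, specific heat capacity, temperature)
    tuples, one per feed stream.
    """
    feed_term = sum(m_dot * c_p * (T_F - T_R) for m_dot, c_p, T_F in feeds)
    return C_pR * dT_R_dt - kA * (T_J - T_R) - feed_term

# Hypothetical operating point: 10 kg of water-like content, one cold feed.
Q_R = heat_of_reaction(
    C_pR=42000.0,                    # J/K
    dT_R_dt=0.01,                    # K/s
    kA=100.0,                        # W/K
    T_J=50.0, T_R=60.0,              # deg C
    feeds=[(0.002, 4200.0, 20.0)],   # kg/s, J/(kg K), deg C
)
```

For these values the accumulation term contributes 420 W, the heat removed through the jacket 1000 W, and the heating of the cold feed 336 W, so the estimated heat of reaction is about 1756 W.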

In industrial practice, $kA$ usually is not known
and varies over time due to changes of the filling
level, changes of the viscosity of the reaction
mixture, and fouling. Then other heat balances
and measurements can be added to enable a direct
computation or estimation of $kA$. Typically, the
jacket heat balance is chosen:

$$C_{p,J} \frac{dT_J}{dt} = kA\,(T_R - T_J) + kA_{\mathrm{jack}}\,(T_{\mathrm{env}} - T_J) + \dot{m}_J\, c_{p,J}\, (T_{J,\mathrm{in}} - T_J). \qquad (2)$$

If necessary, other phenomena like direct
heat losses from the reactor content to the environment
or the influence of the reactor lid can also
be taken into account by adding additional terms
or additional heat balances. This method is called
heat balance calorimetry.

State Estimation for Batch Processes, Fig. 2 The reactor and its jacket as considered for calorimetry

In order to compute $\dot{Q}_R$ and $kA$ from Eqs. (1)
and (2), two different approaches can be used:

1. Equations (1) and (2) are solved to give

$$kA = \frac{C_{p,J} \frac{dT_J}{dt} - kA_{\mathrm{jack}}\,(T_{\mathrm{env}} - T_J) - \dot{m}_J\, c_{p,J}\, (T_{J,\mathrm{in}} - T_J)}{T_R - T_J} \qquad (3a)$$

$$\dot{Q}_R = C_{p,R} \frac{dT_R}{dt} - kA\,(T_J - T_R) - \sum_i \dot{m}_{F,i}\, c_{p,F,i}\, (T_{F,i} - T_R). \qquad (3b)$$

In this approach, the derivatives need to be
computed from the measurements, which introduces
noise in the evaluation and requires
a filtering either of the derivatives or of the
estimates.

2. Equations (1) and (2) are implemented in
a nonlinear state estimator. To estimate the
unknown quantities $kA$ and $\dot{Q}_R$ by this approach,
additional assumptions about their dynamics
must be made. A common approach is
to add the so-called dummy derivatives

$$\frac{d\dot{Q}_R}{dt} = 0, \qquad \frac{d\,kA}{dt} = 0.$$

The tuning of calorimetric estimation schemes
has been discussed in the literature, but for
each case, tests in simulation runs using
recorded batch data should be performed.
Experimental results of the application of the
direct solution equations (3) and an EKF for the
estimation of $\dot{Q}_R$ and $kA$ are shown in Fig. 3. A
laboratory-scale 10 l metal reactor was filled with
water. Cold water was injected into the reactor
to simulate the feed of reactants. The reactor was
equipped with a heating rod by which different
values of $\dot{Q}_R$ could be simulated. Figure 3a
shows the measured temperatures and the feed
stream; Fig. 3b shows the estimates. The dotted
line displays the measured power uptake, the
thin, black line represents the estimates from the
evaluation of Eqs. (3), and the gray line shows the
results obtained with an EKF. The EKF was tuned
slightly more aggressively than the PT1 filter that
was used to filter the values of $\dot{Q}_R$ and $kA$ that
were obtained from Eqs. (3).

It can be seen that the quality of both evaluation
methods is comparable. A difference in
performance can be seen in the estimation of $kA$
at the points in time where $T_R \approx T_J$. This is due
to the denominator in Eq. (3a) which becomes
$\approx 0$. At this point, $kA$ is unobservable. The EKF
estimates of $kA$ are smoother. This does not
have an impact on the estimation of $\dot{Q}_R$ because
the heat transfer from the jacket to the reactor is
zero at this point. This behavior is of importance
if $kA$ is used in other algorithms, e.g., for control
purposes.

State Estimation for Batch Processes, Fig. 3 Illustration of the results of direct estimation (Eqs. 3) and the use of an
EKF. (a) Measured data, (b) Estimates from inverted equations (3) and EKF as well as measured $\dot{Q}_R$

A practical problem is the determination of the
parameters of the system model. Especially the
heat capacity of the reactor $C_{p,R}$ is difficult to
determine as it is not clear how much impact the
reactor material has. Also the heat capacities of
intermediate products and mixtures with the raw
materials and final products may not be known.
That is why typically $C_{p,R}$ is considered a “free”
parameter which is used to fit the estimates to
measured data. If the adjustment of the available
parameters is not sufficient to yield a satisfactory
performance of the estimator, further extensions
can be considered:

• If pressurized vessels are considered, the wall 
thickness may be considerable, and the heat 
accumulation may influence the results. In this 
case, the extension of the set of equations by 
an equation for the heat transfer through the 
wall may be considered (Saenz de Buruaga
et al. 1997).

• If large-scale vessels are considered, the 
cooling fluid in the jacket may not be perfectly 
mixed, and a temperature gradient will be 
present. In many cases, cooling coils are 
welded on the outside surface of the reactor. 
In this case, the equation for the perfectly 
mixed jacket (Eq. (2)) should be replaced by 
a model for a plug flow reactor (Kramer and 
Gesthuisen 2005). 

• For large industrial reactors, the perfect
mixing assumption of the reactor contents
does not necessarily hold true. Especially if
polymerization reactions are considered, the
reactor content may become rather viscous.
A straightforward method to cope with this
problem is a detailed computational fluid
dynamics (CFD) simulation. However, due
to the numerical complexity, this appears
infeasible for online applications. A practical
alternative is the placement of several temperature
sensors and using a weighted average
over their readings. A different approach is
the usage of a multi-zonal model, the idea of
which resembles the idea of a CFD model;
however, the number of zones (elements) is
much smaller (Bezzo et al. 2004).

Heat balance calorimetry becomes inaccurate
if the mass flow through the jacket is so large that
the temperature difference between the cooling
stream entering the jacket and leaving the jacket
$(T_{J,\mathrm{in}} - T_J)$ is in the order of magnitude of
the measurement error. This mode of operation
is typically used in laboratory-scale reactors to
avoid temperature gradients in the jacket. To
estimate the states in such setups, a technique
called temperature oscillation calorimetry (TOC)
can be used. The idea is to add a small but
well-measurable sinusoidal signal to the typically
constant set point of the reactor temperature $T_R$
(see Fig. 4 for an example). The reaction of the
jacket temperature to the oscillating reactor temperature
can be used to compute $kA$, e.g., by estimating
its amplitude $\delta T_J$ (Tietze et al. 1996) or
by adding an additional equation which describes
the second derivative of the reactor temperature
to the set of heat balances (Mauntz et al.
2007).
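The amplitude of the jacket temperature oscillation can be extracted from sampled data by projecting the signal onto a sine and a cosine of the known excitation period. The sketch below shows only this amplitude estimation step; how the amplitude is converted into $kA$ is described in the cited references and is not reproduced here. The function name, the sampling parameters, and the synthetic signal are assumptions.

```python
import math

def oscillation_amplitude(samples, dt, period):
    """Amplitude of the sinusoidal component with the given period.

    Assumes uniformly sampled data covering an integer number of periods.
    """
    n = len(samples)
    mean = sum(samples) / n
    w = 2.0 * math.pi / period
    a = 2.0 / n * sum((s - mean) * math.cos(w * k * dt) for k, s in enumerate(samples))
    b = 2.0 / n * sum((s - mean) * math.sin(w * k * dt) for k, s in enumerate(samples))
    return math.hypot(a, b)

# Synthetic jacket temperature: 50 deg C with a 0.8 K oscillation, period 100 s,
# sampled every second over five periods.
dt, period, n = 1.0, 100.0, 500
T_J = [50.0 + 0.8 * math.sin(2.0 * math.pi * k * dt / period + 0.3) for k in range(n)]

delta_T_J = oscillation_amplitude(T_J, dt, period)
```

Because the projection averages over several periods, it is much less sensitive to measurement noise than reading the amplitude from individual peaks.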

State Estimation for Batch Processes, Fig. 4 Example experiment where TOC is applied. (a) Complete example,
(b) Zoom of rectangle (A)

Calorimetry estimates the total heat of the
reactions in the reactor. It can be used to estimate
the overall chemical conversion of a process.
Due to its integral character, the heat of reaction
of parallel and consecutive reactions cannot be
estimated separately (Hergeth 2006). However,
if models of the chemical kinetics are known
and reliable, it is possible to couple this kinetic
model with calorimetry and to observe the complete
state of the reaction based on calorimetric
estimates. This solution may however not
be robust as slight errors in the kinetic model
may lead to significant errors in the estimates
of all concentrations. In order to build a more
robust state estimator, additional measurements
should be installed and integrated into the state
estimator. For example, for reactions including
a phase change from the gas phase to the liquid
phase, a pressure measurement may be suitable.
For some polymerization reactions, sound velocity
and sound attenuation measurements can be
valuable (Brandt et al. 2012). The additional measurement
can be incorporated into the observation
scheme by augmenting the measurement model
$g$ (see Fig. 1) by the corresponding measurement
equation.


Summary 

In this contribution, different methods that can 
be used to determine the states of (semi-)batch 
reactions have been described. State estimation 
is useful to reconcile measurement errors and 
whenever direct online measurements are either 
too expensive or not available at all. 


Linear state estimation is a mature topic.
However, as chemical batch reactors in most cases
have nonlinear dynamics, nonlinear methods
should be applied. Extensions of linear state
estimators based on linearizations of the system
(e.g., the EKF) are the most widely used
nonlinear state estimators. However, examples are
known where these estimators fail. Thus, other
approaches, e.g., based on online optimization
(MHE), have been developed. They deliver
promising results in terms of observation quality
and computational speed even for large-scale
systems.

The most widespread application of state
estimation techniques in batch processes is
calorimetry, which is suitable for significantly
exothermic or endothermic reactions. The
heat balances around the reactor contents
and the jacket are set up and solved. The
estimated heat of reaction is used to estimate the
chemical conversion of the process. The method
makes use of commonly installed temperature
measurements in the reactor. Extensions to
include other measurements have been discussed.
Problems that typically occur in laboratory-scale
reactors can be overcome with the help
of temperature oscillation calorimetry.

Cross-References 
Abstract

Statistical process control has been successfully
utilized for process monitoring and variation
reduction in manufacturing applications. This entry
aims to review some of the important monitoring
methods. Topics discussed include Shewhart’s
model, $\bar{X}$ and $R$ control charts, EWMA
and CUSUM charts for monitoring small process
shifts, process monitoring for autocorrelated
data, and integration of statistical and engineering
(or automatic) control techniques. The goal is
to provide readers from control theory, mechanical
engineering, and electrical engineering an
expository overview of the key topics in statistical
process control.

Keywords 

CUSUM; EWMA; Feedback control; Shewhart 
control chart; Time-series analysis 

Introduction 

Variation control is an important goal in manufacturing.
The main set of tools for variation control
used in discrete-part manufacturing industries up
to the 1960s was developed by W. Shewhart in the
1920s and is known today as statistical process
control, or SPC (Shewhart 1939). Shewhart’s
SPC model assumes that the process varies about
a fixed mean and that consecutive observations
from a process are independent, as follows:

$$Y_t = \mu_0 + \varepsilon_t \qquad (1)$$

in which $\mu_0$ is the in-control process mean and $\varepsilon_t$
is iid (independent identically distributed) white
noise, $\varepsilon_t \sim N(0, \sigma^2)$. The Shewhart model can be
used in distinguishing assignable cause variation
from common cause variation. For example, a
mean change from $\mu_0$ to $\mu_1 = \mu_0 + \delta$ (where $\delta$ is
the unknown magnitude of change) or a variance
increase from $\sigma_0^2$ to $\sigma_1^2$ at an unknown point in
time can be detected as assignable causes.

The objective of this entry is to highlight some
of the important references in the SPC literature
and to discuss the similarities between SPC and
automatic process control and their joint applications. The
literature on statistical process control and its
applications to engineering problems is vast; therefore,
no attempt is made at an exhaustive review. More
complete reviews of the literature on statistical
process control and adjustment methods can be
found in texts including Montgomery (2013),
Ryan (2011), and Del Castillo (2002).

Shewhart Control Charts 

Shewhart’s $\bar{X}$ and $R$ control charts are used
to distinguish between common cause and
assignable causes of variation (Shewhart 1939)
by monitoring, respectively, the process mean
and process variance. “Common cause” variation
is the natural variability of the process due to
uncontrollable factors in the environment that
is not avoidable without substantial changes
to the process. “Assignable cause” variation
is due to unwanted disturbances or upsets to
the process that can be detected and removed
to produce acceptable quality products. When
only common cause variation exists, the process
is said to be operating “in statistical control.”
Assignable causes of variation include operator
changes, machine calibration errors, or raw
material variation between suppliers.

Another concept that is closely related to
Shewhart’s model is process capability. Process
capability indices are used to assess whether the
process is operating in a satisfactory manner with
respect to the engineering specifications. It is
crucial to attain a stable process (eliminating all
problematic causes) before undertaking such a
capability analysis because only when the samples
come from a stable probability distribution
can the future behavior of the process be predicted
“within probability limits determined by
the common cause system” (Box and Kramer
1992).

Figure 1 illustrates the two main phases,
referred to as Phase I and Phase II, in constructing
Shewhart charts (Sullivan 2002), using
semiconductor lithography process data given
in Montgomery (2013). It is desired to establish
statistical control of the width of the resist
using $\bar{X}$ and $R$ charts. Twenty-five preliminary
subgroups, each of five wafers, were taken
at one-hour intervals, and the resist width was
measured. In Phase I, “retrospective analysis,”
the historical data from the process are analyzed
to bring an initially out-of-control process into
statistical control. Subgroups $y_1, \ldots, y_n$ of size
$n$ are taken; the subgroup average $\bar{y}$ is used
to monitor the process mean $\mu_0$, and the subgroup
range is used to monitor the process standard
deviation. The standard deviation of the subgroup
average is $\sigma_{\bar{y}} = \sigma/\sqrt{n}$. The upper and
lower control limits for the $\bar{X}$ chart are
$\{UCL, LCL\} = \mu_0 \pm L\sigma_{\bar{y}}$, where $L$ is a
constant representing the width of the control
limits. Commonly chosen three-sigma limits
(i.e., $L = 3$) provide a probability $p = 0.0027$
that a single point falls outside the limits when
the process is in control (“false alarm probability”).
Points that fall outside the control limits are
investigated, and if an assignable cause is
identified, the point is omitted and the control
limits are recalculated. This is repeated until no
further points plot outside the limits. In Phase
II these charts are used to detect shifts in the
process mean and variability.

Statistical Process Control in Manufacturing, Fig. 1 Shewhart $\bar{X}$ and $R$ charts from (a) Phase I analysis and
(b) Phase II analysis
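
As a minimal sketch of the Phase I calculation described above, the following Python fragment computes the control limits $\mu_0 \pm L\sigma/\sqrt{n}$, the normal-theory false alarm probability, and the indices of subgroups whose averages plot outside the limits. The function names and the numerical values in the usage note are illustrative assumptions, not taken from the lithography data.

```python
import math


def xbar_limits(mu0, sigma, n, L=3.0):
    """Return (LCL, UCL) = mu0 -/+ L * sigma / sqrt(n) for subgroups of size n."""
    sigma_ybar = sigma / math.sqrt(n)
    return mu0 - L * sigma_ybar, mu0 + L * sigma_ybar


def false_alarm_probability(L=3.0):
    """P(|Z| > L) for a standard normal Z, i.e., the in-control signal probability."""
    return math.erfc(L / math.sqrt(2.0))


def out_of_control(subgroups, mu0, sigma, L=3.0):
    """1-based indices of subgroups whose average falls outside the limits."""
    n = len(subgroups[0])
    lcl, ucl = xbar_limits(mu0, sigma, n, L)
    return [t for t, g in enumerate(subgroups, start=1)
            if not lcl <= sum(g) / len(g) <= ucl]
```

With hypothetical values `mu0 = 1.5`, `sigma = 0.1`, and `n = 4`, the limits are approximately 1.35 and 1.65, and `false_alarm_probability(3.0)` is approximately 0.0027, in agreement with the three-sigma figure quoted above.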

The $\bar{X}$ and $R$ charts from Phase I data in
Fig. 1a indicate statistical control; hence the computed
control limits can be used for Phase II
monitoring. Twenty additional subgroups (also of
size 5) are taken in Phase II while the control
charts are in use. The Phase II charts shown in
Fig. 1b indicate that process variability is stable
but the process mean has shifted at subgroup 18.
The general trend in the $\bar{X}$ chart indicates that
the process mean probably shifted earlier, around
subgroup 13.

EWMA, CUSUM, and Changepoint 
Estimation 

Shewhart charts can detect large magnitude process
upsets reasonably well; however, they are
relatively slow to detect small shifts. In order
to reduce the reaction time for smaller shifts, a
set of “runs” rules (e.g., two out of three consecutive
points beyond the $2\sigma$ limits or four out of five
consecutive points beyond the $1\sigma$ limits) has been
proposed (Western Electric 1956). A more systematic
method is to accumulate information over successive
observations using CUSUM and EWMA statistics rather than
basing the detection on a single sample. In the
cumulative sum (CUSUM) chart, a running total
$\sum_{i=1}^{t}(\bar{Y}_i - \mu_0)$ is plotted against subgroup
number $t$, and a shift from the in-control mean $\mu_0$ is
signaled by an upward or downward linear trend
in the plot. A two-sided CUSUM is defined as
(Woodall and Adams 1993):

$$S_t^{\pm} = \max\{\pm Z_t - k + S_{t-1}^{\pm},\ 0\} \quad \text{for } t = 1, 2, \ldots \qquad (2)$$

where $S_t^{+}$ and $S_t^{-}$ are the one-sided upper and
lower cusums, respectively, $Z_t = (\bar{Y}_t - \mu_0)/\sigma_{\bar{Y}}$
is the standardized subgroup average, $k = |\mu_1 - \mu_0|/(2\sigma)$
is the reference value, and $\mu_1$ is the
level of process mean to be detected. An out-of-control
signal is given at the first $t$ for which $S_t^{\pm} > h$,
where $h$ is a suitably chosen threshold, usually
selected based on the desired average number of
samples to signal an alarm, also called the average
run length (ARL). The recommended value
for the threshold $h$ is 4 or 5 (corresponding to
four or five times the process standard deviation
$\sigma$), and the value for the reference $k$ is almost
always taken as 0.5 (corresponding to shift size
$|\mu_1 - \mu_0| = \sigma$) (Montgomery 2013).
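
The recursion in Eq. (2) can be sketched in a few lines of Python. This is an illustrative implementation under the assumption that both cusums start at zero and that the input has already been standardized; the function name and return format are hypothetical.

```python
def two_sided_cusum(z, k=0.5, h=4.0):
    """Tabular two-sided CUSUM on standardized subgroup averages z.

    Returns (s_plus, s_minus, first_signal), where first_signal is the
    1-based index of the first t with S_t^+ > h or S_t^- > h, or None.
    """
    s_plus, s_minus = [], []
    sp = sm = 0.0          # assumed starting values S_0^+ = S_0^- = 0
    first_signal = None
    for t, zt in enumerate(z, start=1):
        sp = max(zt - k + sp, 0.0)    # upper cusum accumulates upward deviations
        sm = max(-zt - k + sm, 0.0)   # lower cusum accumulates downward deviations
        s_plus.append(sp)
        s_minus.append(sm)
        if first_signal is None and (sp > h or sm > h):
            first_signal = t
    return s_plus, s_minus, first_signal
```

For example, five in-control values followed by a sustained standardized shift of 1.5 (which a three-sigma Shewhart chart would not flag on any single point) raises the upper cusum by 1.0 per sample, so with $h = 4$ the signal occurs at the fifth shifted sample.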

Another chart that accumulates deviations
over several samples is the exponentially
weighted moving average (EWMA), which is
based on the statistic (Lucas and Saccucci 1990)

$$Z_t = \lambda \bar{Y}_t + (1 - \lambda) Z_{t-1} \qquad (3)$$

where $0 < \lambda < 1$ is a smoothing constant.
A smaller $\lambda$ provides more smoothing (similar
to a large subgroup size $n$ in the Shewhart
charts). The starting value is the in-control
mean, $Z_0 = \mu_0$. It can be shown that $Z_t$
is a weighted average of all previous sample
means, where the weights decrease geometrically
with the age of the subgroup mean. The
EWMA statistic is plotted against the control
limits $\mu_0 \pm L\sigma_{\bar{Y}}\sqrt{(\lambda/(2 - \lambda))[1 - (1 - \lambda)^{2t}]}$.
Shewhart charts, which are effective for large shifts,
are more useful for Phase I, and CUSUM or
EWMA charts, which are effective for small shifts,
are more appropriate for Phase II.
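
The EWMA recursion of Eq. (3) and its time-varying limits can be sketched as follows. The function name, the default $L = 3$, and the return format are assumptions made for illustration.

```python
import math


def ewma_chart(y, mu0, sigma_ybar, lam=0.2, L=3.0):
    """Return a list of (Z_t, LCL_t, UCL_t) for subgroup averages y."""
    out = []
    z = mu0  # starting value Z_0 = mu_0
    for t, yt in enumerate(y, start=1):
        z = lam * yt + (1.0 - lam) * z
        # Half-width L * sigma * sqrt(lam/(2-lam) * (1 - (1-lam)^(2t)))
        half = L * sigma_ybar * math.sqrt(
            lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t)))
        out.append((z, mu0 - half, mu0 + half))
    return out
```

At $t = 1$ the half-width reduces to $L\lambda\sigma_{\bar{Y}}$, and as $t$ grows it approaches the steady-state value $L\sigma_{\bar{Y}}\sqrt{\lambda/(2 - \lambda)}$, so the limits widen over the first few samples and then level off.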

We illustrate in Fig. 2 how to monitor with
CUSUM and EWMA charts using the lithography
data. The in-control process mean and standard
deviation, $\mu_0$ and $\sigma$, are found from the Phase
I data. The CUSUM upper and lower statistics $S_t^{\pm}$
computed with Phase II data are plotted in Fig. 2a
(reference value $k = 0.5$ and threshold $h = 4$
are used). The upper cusum statistic $S_t^{+}$ crosses
the upper control limit, indicating an upward shift,
at subgroup 15. The EWMA statistic applied
with $\lambda = 0.2$ to the Phase II data, shown in Fig. 2b,
crosses the upper control limit at subgroup 16.
Both charts improve on the reaction time of
the Shewhart chart.

Statistical Process Control in Manufacturing, Fig. 2 Phase II charts for lithography data: (a) CUSUM chart and
(b) EWMA chart

When a control chart signals an assignable
cause, it does not indicate when the process
change actually occurred. Estimating the instant
of the change, or changepoint estimation, is
especially useful in Phase I analysis where little
is known about the process, and it is important
to identify and remove the out-of-control samples
from consideration (Hawkins et al. 2003;
Basseville and Nikiforov 1993; Pignatiello and
Samuel 2001). The process is modeled as

$$Y_i \sim N(\mu_1, \sigma^2) \quad \text{for } i = 1, 2, \ldots, \tau$$
$$Y_i \sim N(\mu_2, \sigma^2) \quad \text{for } i = \tau + 1, \ldots, n \qquad (4)$$

where $\tau$ is the unknown changepoint, at which the
in-control mean $\mu_1$ is assumed to shift to a new
value $\mu_2$; $\mu_1$ and $\sigma$ are assumed known, but $\mu_2$ is
unknown. A generalized likelihood ratio (GLR) test
statistic $\Lambda_\tau = \sum_{i=\tau+1}^{n} \log[f_2(y_i)/f_1(y_i)]$ is used
to test the hypothesis of a changepoint against
the null hypothesis that there is no change.
Assuming normality, $f(y) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp[-(y - \mu)^2/(2\sigma^2)]$
is the probability density function of
the quality characteristic, with $\mu = \mu_1$ for $f_1$ and
$\mu = \mu_2$ for $f_2$. The changepoint model
is equivalent to the CUSUM chart when all
parameters $\mu_1$, $\mu_2$, and $\sigma$ are known a priori. For
the lithography Phase II data in Fig. 1b, it can be
shown that the changepoint can be estimated as
subgroup 13.
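
A minimal sketch of changepoint estimation under model (4) is given below. It assumes that, for each candidate $\tau$, the unknown $\mu_2$ is replaced by its maximum likelihood estimate (the average of $y_{\tau+1}, \ldots, y_n$), in which case the log-likelihood ratio simplifies to $(n - \tau)(\bar{y}_{\tau+1:n} - \mu_1)^2/(2\sigma^2)$, and the changepoint estimate is the maximizing $\tau$. The function name is hypothetical, and this is not necessarily the exact procedure used in the cited references.

```python
def glr_changepoint(y, mu1, sigma):
    """Estimate the changepoint tau (number of in-control observations).

    Returns (tau_hat, stat), where stat is the maximized log-likelihood
    ratio with mu2 replaced by the mean of the observations after tau.
    """
    n = len(y)
    best_tau, best_stat = None, float("-inf")
    tail_sum = 0.0
    for tau in range(n - 1, -1, -1):
        tail_sum += y[tau] - mu1      # adds observation number tau + 1 (1-based)
        m = n - tau                   # number of post-change observations
        stat = tail_sum ** 2 / (2.0 * sigma ** 2 * m)
        if stat > best_stat:
            best_tau, best_stat = tau, stat
    return best_tau, best_stat
```

For ten observations at the in-control mean followed by five observations shifted by two standard deviations, the statistic peaks at $\tau = 10$, the last in-control sample.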


SPC on Controlled and 
Autocorrelated Processes 

It is well known that automatic control performance
relies heavily on the accuracy of the process
models. An active field of research in recent
years is the monitoring of controlled systems
using SPC charts (Box and Kramer 1992) in
order to reduce the effect of model inaccuracy.
Shewhart charts can be used to monitor the output
of a feedback-controlled process; however, as the
controller effectively corrects the shift, only a
short window of opportunity is provided to detect
the shift (Vander Wiel et al. 1992). Tsung and
Tsui (2008) showed that monitoring the control
actions gives better run-length performance than
monitoring the output for small- and medium-size
shifts, and monitoring the output gives better
performance for large shifts. In monitoring
controlled processes, measurements taken at short
intervals with positive autocorrelation usually
inflate the rate of false alarms (Harris and Ross
1991). Widening the control limits and monitoring
the residuals of a time-series model fitted
to the observations are some of the strategies
employed to reduce the number of false alarms
(Alwan and Roberts 1988).

Statistical Process Control in Manufacturing, Fig. 3 (a) Shewhart chart for autocorrelated process (I chart of ARMA(1,1) process), (b) Shewhart
chart for residuals (I chart of residuals)

Statistical Process Control in Manufacturing, Fig. 4 (a) Shewhart chart for controlled process $Y_t$, (b) Shewhart
chart for input $X_t$

To illustrate the effects of autocorrelation, we
consider simulated data from an autoregressive
moving average ARMA(1,1) time-series disturbance
process $D_t = 0.8D_{t-1} + \varepsilon_t - 0.3\varepsilon_{t-1}$
(Box et al. 1994) defined with the white noise
process $\varepsilon_t \sim N(0, 1^2)$ (with in-control mean
$\mu_0 = 0$ and variance $\sigma_D^2 = 1.694$). Figure 3a
shows a realization of the process monitored with
a Shewhart chart (control limits at $\mu_0 \pm 3\sigma_D$). Due
to autocorrelation, false alarms are signaled at
samples 81-83. Figure 3b shows the control chart
monitoring of the residuals of an ARMA(1,1)
model. Residuals (standard normal with mean 0
and variance 1) are not autocorrelated, so the
Shewhart chart for residuals does not signal any
false alarms.
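
The quoted variance and the residual idea can be checked with a short sketch. For an ARMA(1,1) process with autoregressive coefficient $\phi$ and moving average coefficient $\theta$, the stationary variance is $\sigma_\varepsilon^2(1 + \theta^2 - 2\phi\theta)/(1 - \phi^2)$, which gives approximately 1.694 for $\phi = 0.8$ and $\theta = 0.3$. The code below assumes zero initial conditions (so the first few simulated samples are a start-up transient) and a residual filter that uses the true coefficients rather than fitted ones; all names are illustrative.

```python
import random

PHI, THETA = 0.8, 0.3  # coefficients of the ARMA(1,1) disturbance in the text


def arma_variance(phi=PHI, theta=THETA, sigma2=1.0):
    """Stationary variance of D_t = phi*D_{t-1} + e_t - theta*e_{t-1}."""
    return sigma2 * (1.0 + theta ** 2 - 2.0 * phi * theta) / (1.0 - phi ** 2)


def simulate_arma(n, phi=PHI, theta=THETA, seed=0):
    """Simulate n samples from zero initial conditions; returns (d, e)."""
    rng = random.Random(seed)
    d_prev = e_prev = 0.0
    d, e = [], []
    for _ in range(n):
        et = rng.gauss(0.0, 1.0)
        dt = phi * d_prev + et - theta * e_prev
        d.append(dt)
        e.append(et)
        d_prev, e_prev = dt, et
    return d, e


def residuals(d, phi=PHI, theta=THETA):
    """Invert the ARMA filter: r_t = D_t - phi*D_{t-1} + theta*r_{t-1}."""
    r, d_prev, r_prev = [], 0.0, 0.0
    for dt in d:
        rt = dt - phi * d_prev + theta * r_prev
        r.append(rt)
        d_prev, r_prev = dt, rt
    return r
```

Because the residual filter inverts the generating recursion, the residuals reproduce the white noise sequence, which is why a Shewhart chart applied to them behaves as it would for independent data.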


We illustrate monitoring of controlled processes
with simulated data from a transfer function
model $Y_t = 2X_{t-1} + D_t$, where $X_t$ are the
adjustments made on the process. A proportional
integral control rule $X_t = -0.1Y_t - 0.15\sum_{i=1}^{t} Y_i$
is employed, and the disturbance $D_t$ is assumed
to follow the ARMA model considered earlier.
As an assignable cause, the disturbance mean has
shifted at sample 100 by a magnitude of $3\sigma_D$.
Figure 4 shows the Shewhart charts monitoring
the output $Y_t$ and the input $X_t$. The effect of
the assignable cause (at sample 100) on the output
is quickly removed by the controller; however,
a sustained shift remains in the control input.
The control chart for the input (Fig. 4b) signals the
first alarm at sample 101, much sooner than the
control chart for the output (Fig. 4a), which signals
at sample 110.
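
The mechanism behind this result can be seen in a noise-free sketch of the same loop. The code below assumes a deterministic step disturbance in place of the ARMA disturbance (a simplification introduced here to isolate the effect of the shift) and zero initial conditions; the function name is hypothetical.

```python
def simulate_pi(disturbance, kp=0.1, ki=0.15, gain=2.0):
    """Simulate Y_t = gain*X_{t-1} + D_t with X_t = -kp*Y_t - ki*sum(Y_1..Y_t)."""
    y, x = [], []
    x_prev = 0.0
    integral = 0.0
    for d in disturbance:
        yt = gain * x_prev + d
        integral += yt                      # running sum of the output
        xt = -kp * yt - ki * integral       # proportional-integral adjustment
        y.append(yt)
        x.append(xt)
        x_prev = xt
    return y, x
```

With a step of size 3 in the disturbance, the output jumps to 3 at the step and then decays back to zero, while the input settles at $-1.5$, that is, minus the step size divided by the process gain of 2. The shift therefore persists in the control input long after it has disappeared from the output.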

Summary and Future Directions 

In this entry we reviewed some of the commonly
used statistical process monitoring methods for
manufacturing systems. Due to space limitations,
only a few important topics, including Phase I
and Phase II monitoring with Shewhart, EWMA,
and CUSUM charts, were discussed, highlighting
the main applications with numerical examples.
Other current research areas include multivariate
methods for monitoring processes with multiple
quality characteristics, taking advantage of
relationships among them (Lowry and Montgomery
1992); profile monitoring for processes that
generate functional data (Woodall et al. 2004);
multistage monitoring for processes with multiple
processing steps and variation transmission (Tsung
et al. 2008); and run-to-run EWMA control for
semiconductor manufacturing processes that
require handling of multiple types of products,
operators, and machine tools (Butler and Stefani
1994).

Cross-References 

► Controller Performance Monitoring

► Multiscale Multivariate Statistical Process Control

► Run-to-Run Control in Semiconductor Manufacturing
Abstract

Stochastic adaptive control denotes the control
of partially known stochastic control systems.
The stochastic control systems can be described
by discrete- or continuous-time Markov chains
or Markov processes, linear and nonlinear
difference equations, and linear and nonlinear
stochastic differential equations. The solution of
a stochastic adaptive control problem typically
requires the identification of the partially known
stochastic system and the simultaneous control of
the partially known system using the information
from the concurrent identification scheme. Two
desirable goals for the solution of a stochastic
adaptive control problem are called self-tuning
and self-optimality. Self-tuning denotes the
convergence of the family of adaptive controls
indexed by time to the optimal control for the true
system. Self-optimality denotes the convergence
of the long-run average costs to the optimal
long-run average cost for the true system. Typically,
to achieve self-optimality, it is important
that the family of parameter estimators from the
identification scheme be strongly consistent, that
is, that this family converges (almost surely) to the
true parameter values. Thus, with self-optimality,
a partially known system can asymptotically be
controlled as well as the corresponding known
system.

Keywords 

Bayesian estimation; Brownian motion; Markov 
processes; Self-tuning regulators 

Motivation and Background 

In almost every formulation of a stochastic control
problem from a physical system, the physical
system is incompletely known, so the stochastic
system model is only partially known. This lack
of knowledge can often be described by some
unknown parameters for a mathematical model,
and the noise inputs for the model can describe
unmodeled dynamics or perturbations to the system.
The lack of knowledge of some parameters
of the model can be modeled either by random
variables with known prior distributions or as
fixed unknown values. The former description
requires Bayesian estimation, and the latter
description requires parameter estimation such as
least squares or maximum likelihood.


Stochastic adaptive control arose as a natural
evolution from the results in stochastic control,
and in particular it developed for some well-known
control problems. The optimal control
of Markov chains had been developed for some
time, so it was natural to investigate the adaptive
control of Markov chains. Mandl (1973) was
probably the first to consider this adaptive control
problem in generality. His conditions for strong
consistency of a family of estimators were fairly
restrictive. Borkar and Varaiya (1982) simplified
the conditions for the estimation part of the
problem by only requiring convergence of the
estimators of the parameters so that the resulting
transition probabilities of the Markov chain are
identical to the transition probabilities for the true
optimal solution.

A second major direction for stochastic
adaptive control is described by ARMAX
(autoregressive-moving average with exogenous
inputs) models. These are discrete-time models
that can be described in terms of polynomials
in a time shift operator. A closely related and
often equivalent model is multidimensional linear
difference equations in a state-space form. Since
the solution of the infinite time horizon stochastic
control problem was available in the late 1950s, it
was natural to consider the adaptive control problem.
Methods such as least squares, weighted
least squares, maximum likelihood, and stochastic
approximation were used for parameter identification,
together with a certainty equivalence adaptive
control for the system (that is, a control that uses
the current estimate of the parameters as if it were
the true parameters), to verify self-optimality.
An important development
in stochastic adaptive control is a result called
the self-tuning regulator, where the convergence
of estimators of unknown parameters implies the
convergence of the output tracking error (Åström
and Wittenmark 1973; Goodwin et al. 1981; Guo
1995, 1996; Guo and Chen 1991; Kumar 1990).

A number of monographs treat various aspects
of stochastic adaptive control problems, e.g.,
Åström and Wittenmark (1989), Chen and Guo
(1991), Kumar and Varaiya (1986), and Ljung
and Söderström (1983). An extensive survey
article on the early years of stochastic adaptive
control is given by Kumar (1985).
Structures and Approaches

Various requirements can be made for the adaptive
control of a stochastic system. It can be
required only that the family of adaptive controls
stabilizes the unknown system, or that the family
of adaptive controls converges to the optimal
control for the true system, or that the family of
adaptive controls has a long-run average cost that
is equal to the optimal average cost for the true
system. The identification part of the adaptive
control problem can be Bayesian estimation (Kumar
1990) if the parameters are assumed to be
random variables or parameter estimation (Bercu
1995; Lai and Wei 1982) if the parameters are
assumed to be unknown constants. The identification
scheme may also incorporate information
about the running cost.

For linear systems with white noise inputs, it is
standard to use least squares (or equivalently
maximum likelihood) estimation to estimate
parameters. However, for stochastic adaptive control
problems, the sufficient conditions for the
family of estimators to be strongly consistent are
fairly restrictive (e.g., Lai and Wei 1982), and in
fact the family of estimators may not even converge
in general. A weighted least squares estimation
scheme can guarantee convergence of the
family of estimators (Bercu 1995) and can often
be strongly consistent (Guo 1996). Some other
estimation methods are stochastic approximation
(Guo and Chen 1991) and an ordinary differential
equation approach (Ljung and Söderström 1983).
For discrete-time nonlinear systems, a family of
strongly consistent estimators may not converge
sufficiently rapidly even to stabilize the nonlinear
system (Guo 1997).

The study of stochastic adaptive control of 
continuous-time linear stochastic systems with 
long-run average quadratic costs developed 
somewhat after the corresponding discrete-time 
study (e.g., Duncan and Pasik-Duncan 1990). A 
solution with basically the natural assumptions 
from the solution of the known system problem 
using a weighted least squares identification 
scheme is given in Duncan et al. (1999). 

Another family of stochastic adaptive control
problems is described by linear stochastic
equations in an infinite dimensional Hilbert
space. These models can describe stochastic
partial differential equations and stochastic
hereditary differential equations. Some
linear-quadratic-Gaussian control problems have been
solved, and these solutions have been used to
solve some corresponding stochastic adaptive
control problems (e.g., Duncan et al. 1994a).

Optimal control methods such as
Hamilton-Jacobi-Bellman equations and a stochastic maximum
principle have been used to solve stochastic
control problems described by nonlinear stochastic
differential equations (Fleming and Rishel
1975). Thus, it was natural to consider stochastic
adaptive control problems for these systems.
The results are more limited than the results
for linear stochastic systems (e.g., Duncan et al.
1994b).

Other stochastic adaptive control problems
have recently emerged that are modeled as
multi-agent systems, such as mean field stochastic
adaptive control problems (e.g., Nourian et al.
2012).


A Detailed Example: Adaptive 
Linear-Quadratic-Gaussian Control 

This example is the best-known continuous-time
stochastic adaptive control problem. Likewise,
for a known continuous-time system, the
corresponding stochastic control problem is
the most basic and well known. The controlled
system is described by the following stochastic
differential equation:

$$dX(t) = AX(t)\,dt + BU(t)\,dt + C\,dW(t), \qquad X(0) = X_0$$

where $X(t) \in \mathbb{R}^n$, $U(t) \in \mathbb{R}^m$, $(W(t), t \ge 0)$
is an $\mathbb{R}^l$-valued standard Brownian motion, and
$(A, B, C)$ are appropriate linear transformations.
$X(t)$ is the state of the system at time $t$ and $U(t)$
is the control at time $t$. It is assumed that $A, B, C$
are unknown linear transformations. The cost
functional, $J(\cdot)$, is a long-run average (ergodic)
quadratic cost functional that is given by

$$J(U) = \limsup_{T \to \infty} \frac{1}{T} \int_0^T \left( \langle QX(t), X(t) \rangle + \langle RU(t), U(t) \rangle \right) dt$$

where $R > 0$ and $Q \ge 0$ are symmetric linear
transformations and $\langle \cdot, \cdot \rangle$ is the canonical
inner product in the appropriate Euclidean space.
The standard assumptions for the control of the
known system are made also for the adaptive
control problem, that is, the pair $(A, B)$ is
controllable and $(A, Q^{1/2})$ is observable. An optimal
control for the known system is

$$U^0(t) = -R^{-1}B^{T}SX(t)$$

where $S$ is the unique positive, symmetric solution
of the following algebraic Riccati equation:

$$A^{T}S + SA - SBR^{-1}B^{T}S + Q = 0$$

The optimal cost is

$$J(U^0) = \mathrm{tr}(C^{T}SC)$$

The unknown quantity $C^{T}C$ can be identified,
given $(X(t), t \in [a, b])$ for arbitrary $a < b$, from
the quadratic variation of Brownian motion, so
the identification of $C$ is not considered here.
Since it is assumed that the pair $(A, B)$ is
unknown, the system equation is rewritten in the
following form:

$$dX(t) = \theta^{T}\varphi(t)\,dt + C\,dW(t)$$

where $\theta^{T} = [A\ B]$ and $\varphi^{T}(t) = [X^{T}(t)\ U^{T}(t)]$.
A family of continuous-time weighted least
squares recursive estimators $(\hat{\theta}(t), t \ge 0)$ of
$\theta$ is given by the following stochastic equations:

$$d\hat{\theta}(t) = a(t)P(t)\varphi(t)\,[dX^{T}(t) - \varphi^{T}(t)\hat{\theta}(t)\,dt]$$
$$dP(t) = -a(t)P(t)\varphi(t)\varphi^{T}(t)P(t)\,dt$$

where $(a(t), t \ge 0)$ is a suitable family of
positive stochastic weights (Duncan et al.
1999). A family of estimates $(\bar{\theta}(t), t \ge 0)$ is
obtained from $(\hat{\theta}(t), t \ge 0)$ and is expressed
as $\bar{\theta}^{T}(t) = [A(t)\ B(t)]$ (Duncan et al. 1999).


A process $(S(t), t \ge 0)$ is obtained using
$(A(t), B(t))$ by solving the following stochastic
algebraic Riccati equation for each $t \ge 0$:

$$A^{T}(t)S(t) + S(t)A(t) - S(t)B(t)R^{-1}B^{T}(t)S(t) + Q = 0$$

A certainty equivalence method is used to determine
the control, that is, it is assumed that
the pair $(A(t), B(t))$ is the correct pair for the
true system, so a certainty equivalence adaptive
control $U(t)$ is given by

$$U(t) = -R^{-1}B^{T}(t)S(t)X(t)$$

It can be shown (Duncan et al. 1999) that the
family of estimators $((A(t), B(t)), t \ge 0)$ is
strongly consistent and that the family of adaptive
controls given by the previous equality is
self-optimizing, that is, the long-run average cost
$J(U) = J(U^0) = \mathrm{tr}(C^{T}SC)$, where $S$ is the
solution of the algebraic Riccati equation for the
true system.
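
The certainty equivalence step can be illustrated in the scalar case, where the algebraic Riccati equation $2as - (b^2/r)s^2 + q = 0$ has the closed-form positive root $s = r\,(a + \sqrt{a^2 + b^2 q/r})/b^2$. The sketch below is a scalar illustration only, with hypothetical function names; it assumes $b \ne 0$, $q > 0$, and $r > 0$, and it takes the parameter estimates as given rather than computing them from the weighted least squares equations.

```python
import math


def scalar_lq(a, b, c, q, r):
    """Solve the scalar Riccati equation 2*a*s - (b*b/r)*s*s + q = 0.

    Returns (s, gain, cost): the positive root s, the feedback gain in
    U(t) = -gain * X(t), and the optimal long-run average cost c*c*s.
    """
    s = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    gain = b * s / r
    cost = c * c * s
    return s, gain, cost


def certainty_equivalence_gain(a_hat, b_hat, q, r):
    """Gain obtained by treating the current estimates as the true parameters."""
    _, gain, _ = scalar_lq(a_hat, b_hat, 1.0, q, r)
    return gain
```

For $a = 1$, $b = 1$, $q = 3$, and $r = 1$, the Riccati solution is $s = 3$, the gain is 3, and the closed-loop coefficient $a - b \cdot \text{gain} = -2$ is stable. As the estimates $(a(t), b(t))$ approach the true pair, the certainty equivalence gain approaches the optimal gain.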

Future Directions 

A number of important directions for stochastic
adaptive control are easily identified. Only three
of them are described briefly here. The adaptive
control of the partially observed
linear-quadratic-Gaussian control problem (Fleming and Rishel
1975) is a major problem to be solved using the
same assumptions of controllability and observability
as for the known system. This problem
is a generalization of the example given above
where the output (linear transformation) of the
system is observed with additive noise and the
family of controls is restricted to depend only on
these observations. Another major direction is to
modify the detailed example above by replacing
the Brownian motion in the stochastic equation
for the state by an arbitrary fractional Brownian
motion or by an arbitrary square-integrable
stochastic process with continuous sample paths.
For this latter problem it is necessary to use
recent results for optimal controls for the true
system and to have strongly consistent families of
estimators. A third major direction is the adaptive
control of nonlinear stochastic systems.

Cross-References 

► Stochastic Linear-Quadratic Control 

► System Identification: An Overview 
Abstract

Conventional deterministic chemical kinetics often
breaks down in the small volume of a living
cell, where cellular species (e.g., genes, mRNAs,
etc.) exist in discrete, low copy numbers and
react through reaction channels whose timing
and order are random. In such an environment,
a stochastic chemical kinetics framework that
models species abundances as discrete random
variables is more suitable. The resulting models
consist of continuous-time discrete-state Markov
chains. Here we describe how such models can
be formulated and numerically simulated, and we
present some of the key analysis techniques for
studying such reactions.
Introduction

The time evolution of a spatially homogeneous
mixture of chemically reacting molecules is often
modeled using a stochastic formulation, which
takes into account the inherent randomness of
thermal molecular motion. This formulation is
important when modeling complex reactions inside
living cells, where small populations of key
reactants can set the stage for significant stochastic
effects. In this entry, we review the basic
stochastic model of chemical reactions and discuss
the most common techniques used to simulate
and analyze this model.

Stochastic Models of Chemical 
Reactions 

We start by considering a set of $N$ molecular
species (reactants) $\mathcal{S}_1, \ldots, \mathcal{S}_N$ that are confined
to a fixed volume $\Omega$. These species react through
$M$ possible reactions $R_1, \ldots, R_M$. In this
formulation of chemical kinetics, we shall assume
that the system is in thermal equilibrium and is
well mixed. Thus, the reacting molecules move
due to their thermal energy. The population of
the different reactants is described by a random
process $X(t) = (X_1(t) \ldots X_N(t))^{T}$, where $X_i(t)$
is a random variable that models the abundance
(in terms of the number of copies) of molecules of
species $\mathcal{S}_i$ in the system at time $t$. For the
allowable reactions, we shall only consider elementary
reactions. These could either be monomolecular,
$\mathcal{S}_i \to$ products, or bimolecular, $\mathcal{S}_i + \mathcal{S}_j \to$
products. Upon the firing of reaction $R_k$, a transition
occurs from some state $X = x_i$ right before
the reaction fires to some other state $X = x_i + s_k$,
which reflects the change in the population
immediately after the reaction has fired. $s_k$ is
referred to as the stoichiometric vector. The set
of $M$ allowable reactions defines the so-called
stoichiometry matrix:

$$S = [s_1 \ \cdots \ s_M].$$

To each reaction $R_k$, we associate a propensity
function, $w_k(x)$, that describes the rate of that
reaction. More precisely, $w_k(x)h$ is, to first order in $h$,
the probability that, given the system is in state $x$ at time $t$,
$R_k$ fires once in the time interval $[t, t + h)$. The
propensity functions for elementary reactions are
given in Table 1.

Stochastic Description of Biochemical
Networks, Table 1 Propensity functions for elementary
reactions. The constants $c$, $c'$, and $c''$ are related to $k$,
$k'$, and $k''$, the reaction rate constants from deterministic
mass-action kinetics. Indeed it can be shown that $c = k$,
$c' = k'/\Omega$, and $c'' = 2k''/\Omega$

Reaction type                                               Propensity function
$\mathcal{S}_i \to$ Products                                $c\,x_i$
$\mathcal{S}_i + \mathcal{S}_j \to$ Products ($i \ne j$)    $c'\,x_i x_j$
$\mathcal{S}_i + \mathcal{S}_i \to$ Products                $c''\,x_i(x_i - 1)/2$


Limiting to the Deterministic Regime 

There is an important connection between the
stochastic process $X(t)$, as represented by the
continuous-time discrete-state Markov chain
described above, and the solution of the related
deterministic reaction rate equations obtained from
mass-action kinetics. To see this, let $\Phi(t) =
[\Phi_1(t), \ldots, \Phi_N(t)]^{T}$ be the vector of concentrations
of species $\mathcal{S}_1, \ldots, \mathcal{S}_N$. According to mass-action
kinetics, $\Phi(\cdot)$ satisfies the ordinary differential
equation:

$$\dot{\Phi} = Sf(\Phi(t)), \qquad \Phi(0) = \Phi_0,$$

where $f(\cdot)$ collects the mass-action rates of the $M$ reactions.
In order to compare $\Phi(t)$ with $X(t)$, which
represents molecular counts, we divide $X(t)$ by
the reaction volume to get $X^{\Omega}(t) = X(t)/\Omega$. It
turns out that $X^{\Omega}(t)$ converges to $\Phi(t)$: according to
Kurtz (Ethier and Kurtz 1986), for every $t \ge 0$:

$$\lim_{\Omega \to \infty} \sup_{s \le t} |X^{\Omega}(s) - \Phi(s)| = 0, \quad \text{almost surely.}$$

Hence, over any finite time interval, the
stochastic model converges to the deterministic
mass-action one in the thermodynamic limit.
Note that this is only a large volume limit result.
In practice, for a fixed volume, a stochastic
description may differ considerably from the
deterministic description.
Stochastic Simulations 

Gillespie’s stochastic simulation algorithm (SSA)
constructs sample paths for the random process
$X(t) = (X_1(t), \ldots, X_N(t))^T$ that are consistent
with the stochastic model described above (Gillespie
1976). It consists of the following basic
steps:

1. Initialize the state $X(0)$ and set $t = 0$.

2. Draw a random number $\tau \in (0, \infty)$ with
exponential distribution and mean equal to
$1/\sum_k w_k(X(t))$.

3. Draw a random number $k \in \{1, 2, \ldots, M\}$
such that the probability of $k = i \in
\{1, 2, \ldots, M\}$ is proportional to $w_i(X(t))$.

4. Set $X(t + \tau) = X(t) + s_k$ and $t = t + \tau$.

5. Repeat from (2) until $t$ reaches the desired
simulation time.

By running this algorithm multiple times with
independent random draws, one can estimate the
distribution and statistical moments of the random
process $X(t)$.
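The five steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `ssa`, its argument layout (a list of stoichiometric vectors and a matching list of propensity callables), and the early exit when all propensities are zero are choices made for this sketch rather than part of the algorithm as stated.

```python
import random

def ssa(x0, stoich, propensities, t_end, rng=None):
    """Generate one sample path of X(t) on [0, t_end].

    x0           : initial copy numbers (sequence of ints)
    stoich       : stoich[k] is the stoichiometric vector s_k of reaction R_k
    propensities : propensities[k](x) returns w_k(x)
    """
    rng = rng or random.Random()
    t, x = 0.0, list(x0)                      # step 1
    times, states = [t], [tuple(x)]
    while True:
        w = [f(x) for f in propensities]
        w_total = sum(w)
        if w_total <= 0.0:
            break                             # no reaction can fire any more
        tau = rng.expovariate(w_total)        # step 2: exponential, mean 1/w_total
        if t + tau > t_end:
            break                             # step 5: simulation time reached
        k = rng.choices(range(len(w)), weights=w)[0]   # step 3
        x = [xi + si for xi, si in zip(x, stoich[k])]  # step 4
        t += tau
        times.append(t)
        states.append(tuple(x))
    return times, states
```

For instance, a single decay reaction $S_1 \to$ products with $c = 1$ and 20 initial molecules would be simulated with `ssa((20,), [(-1,)], [lambda x: 1.0 * x[0]], t_end=5.0)`; repeating the call with independent generators gives the ensemble from which moments can be estimated.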

The Chemical Master Equation (CME) 

The chemical master equation (CME), also
known as the forward Kolmogorov equation,
describes the time evolution of the probability
that the system is in a given state $x$. The CME
can be derived based on the Markov property of
chemical reactions. Suppose the system is in state
$x$ at time $t$. Within an error of order $O(h^2)$, the
following statements apply:

• The probability that an $R_k$ reaction fires exactly
once in the time interval $[t, t + h)$ is given
by $w_k(x)h$.

• The probability that no reactions fire in
the time interval $[t, t + h)$ is given by
$1 - \sum_k w_k(x)h$.

• The probability that more than one reaction
fires in the time interval $[t, t + h)$ is zero.

Let $P(x, t)$ denote the probability that the
system is in state $x$ at time $t$. We can express
$P(x, t + h)$ as follows:

$$P(x, t + h) = P(x, t)\Big(1 - \sum_k w_k(x)h\Big) + \sum_k P(x - s_k, t)\,w_k(x - s_k)h + O(h^2).$$

The first term on the right-hand side is the probability
that the system is already in state $x$ at
time $t$, and no reactions occur in the next $h$. In
the second term on the right-hand side, the $k$th
term in the summation is the probability that the
system at time $t$ is an $R_k$ reaction away from
being at state $x$ and that an $R_k$ reaction takes
place in the next $h$.

Moving $P(x, t)$ to the left-hand side, dividing
by $h$, and taking the limit as $h$ goes to zero yields
the chemical master equation (CME):

$$\frac{dP(x, t)}{dt} = \sum_{k=1}^{M} \big( w_k(x - s_k)P(x - s_k, t) - w_k(x)P(x, t) \big). \qquad (1)$$

The CME defines a linear dynamical system in
the probabilities of the different states (each state
is defined by a specific number of molecules of
each of the species). However, there are generally
an infinite number of states, and the resulting
infinite linear system is not directly solvable.
One approach to overcome this difficulty is to
approximate the solution of the CME by truncating
the states. A particular truncation procedure
that gives error bounds is called the finite-state
projection (FSP) (Munsky and Khammash 2006).
The key idea behind the FSP approach is to keep
those states that support the bulk of the probability
distribution while projecting the remaining
infinite states onto a single “absorbing” state.
See Fig. 1.

The left panel in the figure shows the infinite
states of a system with two species. The
arrows indicate transitions among states caused
by allowable chemical reactions. The underlying
stochastic process is a continuous-time discrete-state
Markov process. The right panel shows the
projected (finite-state) system for a specific projection
region (box). The projection is obtained
as follows: transitions within the retained states
are kept, while transitions that emanate from
these states and end at states outside the box
are channeled to a single new absorbing state.
Transitions into the box are deleted. The resulting
projected system is a finite-state Markov process.
The probability of each of its finite states can be
computed exactly. It can be shown that the truncation,
as defined here, gives a lower bound for
the probability for the original full system. The
FSP algorithm provides a way for constructing
an approximation of the CME that satisfies any
prespecified accuracy requirement.

Stochastic Description of Biochemical Networks, Fig. 1 The finite-state projection
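To make the projection concrete, the sketch below applies it to a one-species example and integrates the truncated CME numerically. Everything specific here is an assumption for illustration: the example network (a constant-rate production reaction with propensity `k_prod`, which is not one of the elementary types of Table 1, plus decay with propensity `gamma * x`), the retained set $\{0, \ldots, n-1\}$, and the use of a fixed-step fourth-order Runge-Kutta integrator in place of an exact matrix exponential.

```python
def fsp_birth_death(k_prod, gamma, n_states, t_end, dt=1e-3):
    """Truncated CME on the retained states {0, ..., n_states - 1}.

    Returns (p, err): p[x] approximates P(x, t_end) from below, and
    err = 1 - sum(p) is the probability mass sent to the absorbing state.
    """
    def rhs(p):
        dp = [0.0] * n_states
        for x in range(n_states):
            dp[x] -= (k_prod + gamma * x) * p[x]    # outflow from state x
            if x + 1 < n_states:
                dp[x + 1] += k_prod * p[x]          # production stays inside the box
            # otherwise the production flow goes to the absorbing state
            if x >= 1:
                dp[x - 1] += gamma * x * p[x]       # decay
        return dp

    p = [0.0] * n_states
    p[0] = 1.0                                       # start with zero molecules
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(p)
        k2 = rhs([pi + 0.5 * dt * ki for pi, ki in zip(p, k1)])
        k3 = rhs([pi + 0.5 * dt * ki for pi, ki in zip(p, k2)])
        k4 = rhs([pi + dt * ki for pi, ki in zip(p, k3)])
        p = [pi + dt / 6.0 * (a + 2 * b + 2 * c + d)
             for pi, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p, 1.0 - sum(p)
```

If the returned `err` exceeds the accuracy requirement, one enlarges the retained set and repeats the computation, which is the basic loop of the FSP algorithm.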


Moment Dynamics 

While the probability distribution $P(x, t)$ provides
great detail on the state $x$ at time $t$, often
statistical moments of the molecule copy numbers
already provide important information about
their variability, which motivates the construction
of mathematical models for the evolution of such
moments over time.

Given a vector of integers $m := (m_1, m_2, \ldots,
m_N)$, we use the notation $\mu^{(m)}$ to denote the
following uncentered moment of $X$:

$$\mu^{(m)} := E[X_1^{m_1} X_2^{m_2} \cdots X_N^{m_N}].$$

Such a moment is said to be of order $\sum_i m_i$. With
$N$ species, there are exactly $N$ first-order moments
$E[X_i]$, $\forall i \in \{1, 2, \ldots, N\}$, which are just
the means; $N(N + 1)/2$ second-order moments
$E[X_i^2]$, $\forall i$, and $E[X_i X_j]$, $\forall i \neq j$, which can
be used to compute variances and covariances;
$N(N + 1)(N + 2)/6$ third-order moments; and
so on.

Using the CME (1), one can show that

$$\frac{d\mu^{(m)}}{dt} = E\Big[\sum_k w_k(X)\Big( (X_1 + s_{1,k})^{m_1} (X_2 + s_{2,k})^{m_2} \cdots (X_N + s_{N,k})^{m_N} - X_1^{m_1} X_2^{m_2} \cdots X_N^{m_N} \Big)\Big]$$

and, because the propensity functions are all
polynomials in $x$ (cf. Table 1), the expected
value in the right-hand side can actually be written
as a linear combination of other uncentered
moments of $X$. This means that if we construct a
vector $\mu$ containing all the uncentered moments
of $X$ up to some order $k$, the evolution of $\mu$ is
determined by a differential equation of the form

$$\dot{\mu} = A\mu + B\bar{\mu}, \quad \mu \in \mathbb{R}^K, \ \bar{\mu} \in \mathbb{R}^{\bar{K}}, \qquad (2)$$

where $A$ and $B$ are appropriately defined matrices
and $\bar{\mu}$ is a vector containing moments of order
larger than $k$. The equation (2) is exact, and we
call it the (exact) $k$-order moment dynamics, and
the integer $k$ is called the order of truncation.
Note that the dimension $K$ of (2) is always larger
than $k$ since there are many moments of each
order. In fact, in general, $K$ is of order $N^k$.

When all chemical reactions have only one
reactant, the term $B\bar{\mu}$ does not appear in (2),
and we say that the exact moment dynamics
are closed. However, when at least one chemical
reaction has two or more reactants, then the term
$B\bar{\mu}$ appears, and we say that the moment dynamics
are open since (2) depends on the moments
in $\bar{\mu}$, which are not part of the state $\mu$. When
all chemical reactions are elementary (i.e., with
at most two reactants), then all moments in $\bar{\mu}$ are
exactly of order $k + 1$.
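A short worked example may help to see the difference between closed and open moment dynamics. It applies the moment equation stated after "Using the CME (1), one can show that" to two single-species networks chosen here purely for illustration, using the propensities of Table 1: a decay reaction (one reactant) and a dimerization reaction (two reactants).

```latex
\begin{align*}
&\text{Decay } S \to \text{products},\quad w(x) = c\,x,\quad s = -1: \\
&\qquad \frac{d}{dt}E[X] = E\big[cX\,(-1)\big] = -c\,E[X], \\
&\qquad \frac{d}{dt}E[X^2] = E\big[cX\big((X-1)^2 - X^2\big)\big]
      = -2c\,E[X^2] + c\,E[X]. \\[4pt]
&\text{Dimerization } S + S \to \text{products},\quad
  w(x) = c''\,x(x-1)/2,\quad s = -2: \\
&\qquad \frac{d}{dt}E[X] = E\Big[\tfrac{c''}{2}X(X-1)\,(-2)\Big]
      = -c''\big(E[X^2] - E[X]\big).
\end{align*}
```

In the decay case the equations for the first two moments involve only those moments, so the dynamics are closed. In the dimerization case the equation for the mean already involves the second moment, so truncating at order 1 leaves a term of order 2 outside the state, in line with the statement that the missing moments are of order $k + 1$.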

Moment closure is a procedure by which one
approximates the exact (but open) moment dynamics
(2) by an approximate (but now closed)
equation of the form

$$\dot{\nu} = A\nu + B\varphi(\nu), \quad \nu \in \mathbb{R}^K, \qquad (3)$$

where $\varphi(\nu)$ is a column vector that approximates
the moments in $\bar{\mu}$. The function $\varphi(\nu)$ is called the
moment closure function, and (3) is called the approximate
$k$th-order moment dynamics. The goal
of any moment closure method is to construct
$\varphi(\nu)$ so that the solution $\nu$ to (3) is close to the
solution $\mu$ to (2).

There are three main approaches to construct
the moment closure function $\varphi(\cdot)$:

1. Matching-based methods directly attempt to
match the solutions to (2) and (3) (e.g., Singh
and Hespanha 2011).

2. Distribution-based methods construct $\varphi(\cdot)$ by
making reasonable assumptions on the statistical
distribution of the molecule counts vector
$x$ (e.g., Gomez-Uribe and Verghese 2007).

3. Large volume methods construct $\varphi(\cdot)$ by assuming
that reactions take place on a large
volume (e.g., Van Kampen 2001).

It is important to emphasize that this classification
is about methods to construct moment
closure. It turns out that sometimes different
methods lead to the same moment closure
function $\varphi(\cdot)$.

Conclusion and Outlook 

We have introduced complementary approaches 
to study the evolution of biochemical networks 
that exhibit important stochastic effects. 

Stochastic simulations permit the construction 
of sample paths for the molecule counts, which 
can be averaged to study the ensemble behavior 
of the system. This type of approach scales well 
with the number of molecular species, but can be 
computationally very intensive when the number 
of reactions is very large. This challenge has 
led to the development of approximate stochastic 
simulation algorithms that attempt to simulate 
multiple reactions in the same simulation step 
(e.g., Rathinam et al. 2003). 

Solving the CME provides the most detailed
and accurate approach to characterize the ensemble
properties of the molecular counts, but for
most biochemical systems such a solution cannot
be found in closed form, and numerical methods
scale exponentially with the number of species.
This challenge has led to the development of
algorithms that compute approximate solutions
to the CME, e.g., by aggregating states with
low probability, while keeping track of the error
(e.g., Munsky and Khammash 2006).

Moment dynamics is attractive in that the
number of $k$th-order moments only scales polynomially
with the number of chemical species,
but one only obtains closed dynamics for very
simple biochemical networks. This limitation has
led to the development of moment closure techniques
to approximate the open moment dynamics
by a closed system of ordinary differential
equations.
Abstract

This article is concerned with one of the traditional
approaches for stochastic control problems:
stochastic dynamic programming. Brief
descriptions of stochastic dynamic programming
methods and related terminology are provided.
Two asset-selling examples are presented to illustrate
the basic ideas. A list of topics and
references is also provided for further reading.


Keywords 

Asset-selling rule; Bellman equation; Hamilton- 
Jacobi-Bellman equation; Markov decision 
problem; Optimality principle; Stochastic 
control; Viscosity solution 


Introduction 

The term dynamic programming was introduced
by Richard Bellman in the 1940s. It refers to a
method for solving dynamic optimization problems
by breaking them down into smaller and
simpler subproblems.

To solve a given problem, one often needs
to solve each part of the problem (subproblems)
and then put together their solutions to obtain an
overall solution. Some of these subproblems are
of the same type. The idea behind the dynamic
programming approach is to solve each subproblem
only once in order to reduce the overall
computation.

The cornerstone of dynamic programming 
(DP) is the so-called principle of optimality 
which is described by Bellman in his 1957 book 
(Bellman 1957): 

Principle of Optimality: An optimal policy has 
the property that whatever the initial state and 
initial decision are, the remaining decisions must 
constitute an optimal policy with regard to the state 
resulting from the first decision. 

This principle of optimality gives rise to DP 
(or optimality) equations, which are referred to as 
Bellman equations in discrete-time optimization 
problems or Hamilton-Jacobi-Bellman (HJB) 
equations in continuous-time ones. Such 
equations provide a necessary condition for 
optimality in terms of the value of the underlying 
decision problem. By and large, an optimal 
control policy in most cases can be obtained by 
solving the associated Bellman (HJB) equation. 
In view of this, dynamic programming is a
powerful tool for a broad range of control and
decision-making problems. When the underlying
system is driven by a certain type of random
disturbance, the corresponding DP approach is
referred to as stochastic dynamic programming.


Terminology

The following concepts are often used in stochastic
dynamic programming.

An objective function describes the objective
of a given optimization problem (e.g., maximizing
profits, minimizing cost, etc.) in terms of the
states of the underlying system, decision (control)
variables, and possible random disturbance.

State variables represent the information
about the current system under consideration. For
example, in a manufacturing system, one needs
to know the current product inventory in order to
decide how much to produce at the moment. In
this case, the inventory level would be one of the
state variables.

The variables chosen at any time are called the
decision or control variables. For instance, the
rate of production over time in the manufacturing
system is a control variable. Typically, control
variables are functions of state variables. They
affect the future states of the system and the
objective function.

In stochastic control problems, the system is
also affected by random events (noise). Such
noise is referred to as system disturbance. The
noise is often not available a priori. Only its
probability distribution is known.

The goal of the optimization problem is to
choose control variables over time so as to either
maximize or minimize the corresponding objective
function. For example, in order to maximize
the overall profits, a manufacturing firm has to
decide how much to produce over time so as to
maximize the revenue by meeting the product
demand and minimize the costs associated with
inventory. The best possible value of the objective
is called the value function, which is given in terms
of the state variables.

In the next two sections, we give two examples
to illustrate how stochastic DP methods are used
in discrete and continuous time.


An Asset-Selling Example
(Discrete Time)

Consider a person who wants to sell an asset (e.g.,
a car or a house). She is offered an amount
of money every period (say, a day). Let
$v_0, v_1, \ldots, v_{N-1}$ denote the amounts of these
random offers. Assume they are independent and
identically distributed. At the end of each period,
the person has to decide whether to accept the
offer or reject it. If she accepts the offer, she can
put the money in a bank account and receive a
fixed interest rate $r > 0$; if she rejects the offer,
she waits till the next period. Rejected offers
cannot be recycled. In addition, she has to sell
her asset by the end of the $N$th period and accept
the last offer $v_{N-1}$ if all previous offers have been
rejected. The goal is to decide when to accept an
offer to maximize the overall return at the $N$th
period.

In this example, for each $k$, $v_k$ is the random
disturbance. The control variables $u_k$ take values
in {sell, hold}. The state variables $x_k$ are given by
the equations

$$x_0 = 0; \quad x_{k+1} = \begin{cases} \text{sold} & \text{if } u_k = \text{sell}, \\ v_k & \text{otherwise.} \end{cases}$$

Let

$$h_N(x_N) = \begin{cases} x_N & \text{if } x_N \neq \text{sold}, \\ 0 & \text{otherwise,} \end{cases}$$

$$h_k(x_k, u_k, v_k) = \begin{cases} (1 + r)^{N-k} x_k & \text{if } x_k \neq \text{sold and } u_k = \text{sell}, \\ 0 & \text{otherwise,} \end{cases}$$

for $k = 0, 1, \ldots, N - 1$.

Then, the payoff function is given by

$$E_{\{v_k\}} \Big\{ h_N(x_N) + \sum_{k=0}^{N-1} h_k(x_k, u_k, v_k) \Big\}.$$

Here, $E_{\{v_k\}}$ represents the expected value over
$\{v_k\}$. The corresponding value functions $V_k(x_k)$
satisfy the following Bellman equations:

$$V_N(x_N) = \begin{cases} x_N & \text{if } x_N \neq \text{sold}, \\ 0 & \text{otherwise,} \end{cases}$$

$$V_k(x_k) = \begin{cases} \max\big((1 + r)^{N-k} x_k, \ E V_{k+1}(v_k)\big) & \text{if } x_k \neq \text{sold}, \\ 0 & \text{otherwise,} \end{cases}$$

for $k = 0, 1, \ldots, N - 1$.

The optimal selling rule can be given as (assuming
$x_k \neq$ sold) (see Bertsekas 1987):

accept the offer $v_{k-1} = x_k$ if $(1 + r)^{N-k} x_k > E V_{k+1}(v_k)$,

reject the offer $v_{k-1} = x_k$ if $(1 + r)^{N-k} x_k < E V_{k+1}(v_k)$.

Given the distribution for $v_k$, one can compute
$V_k$ backwards and solve the Bellman equations,
which in turn leads to the above optimal selling
rule.
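The backward computation can be sketched as follows. The sketch assumes, for illustration only, that the offers take finitely many distinct values with known probabilities; the function names `asset_selling` and `decide` and the list `cont` of continuation values $E V_{k+1}(v_k)$ are names introduced here, not notation from the text.

```python
def asset_selling(offers, probs, r, N):
    """Backward induction for the finite-horizon asset-selling problem.

    offers, probs : distinct offer values and their probabilities (i.i.d. offers)
    Returns cont, where cont[k] = E[V_{k+1}(v_k)] for k = 0, ..., N - 1.
    """
    V_next = {x: x for x in offers}            # V_N(x) = x when the asset is unsold
    cont = [0.0] * N
    for k in range(N - 1, -1, -1):
        # Expected value of holding at stage k and receiving the next offer.
        cont[k] = sum(p * V_next[v] for v, p in zip(offers, probs))
        # Bellman equation: V_k(x) = max((1 + r)^(N - k) x, E V_{k+1}(v_k)).
        V_next = {x: max((1 + r) ** (N - k) * x, cont[k]) for x in offers}
    return cont

def decide(x, k, cont, r, N):
    """Selling rule at stage k for the current (unsold) offer x."""
    return "sell" if (1 + r) ** (N - k) * x > cont[k] else "hold"
```

With offers uniformly distributed on {1, 2, 3}, $r = 0.1$, and $N = 3$, the last continuation value is the mean offer, 2, and an offer of 2 is accepted at stage 2 (since $1.1 \times 2 > 2$) but rejected at stage 1 (since $1.21 \times 2 < 2.5$).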

Note that such backward iteration only works
with finite horizon dynamic programming. When
working with an infinite horizon (discounted or
long-run average) payoff function, commonly used
methods are value iteration (successive approximation)
and policy iteration. The idea is to construct
a sequence of functions recursively so that
they converge pointwise to the value function.
For a description of these iteration methods, their
convergence properties, and error bound analysis,
we refer the reader to Bertsekas (1987).

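Next, we consider a continuous-time asset-selling
problem.


An Asset-Selling Example
(Continuous Time)

Suppose a person wants to sell her asset. The
price $x_t$ at time $t \in [0, \infty)$ of her asset is given
by a stochastic differential equation

$$\frac{dx_t}{x_t} = \mu\,dt + \sigma\,dw_t,$$

where $\mu$ and $\sigma$ are known constants and $w_t$ is
the standard Brownian motion representing the
disturbance. Suppose the transaction cost is $K$
and the discount rate is $r$. She has to decide when to
sell her asset to maximize an expected return. In
this example, the state variable is the price $x_t$, the control
variable is a function of the selling time $\tau$, and the
payoff function is given by

$$J(x, \tau) = E\,e^{-r\tau}(x_\tau - K).$$

Let $V(x)$ denote the value function, i.e., $V(x) =
\sup_\tau J(x, \tau)$. Then the associated HJB equation is
given by

$$\min\Big\{ rV(x) - x\mu\frac{dV(x)}{dx} - \frac{x^2\sigma^2}{2}\frac{d^2V(x)}{dx^2}, \ V(x) - (x - K) \Big\} = 0. \qquad (1)$$

Let

$$x^* = \frac{K\beta}{\beta - 1},$$

where $\beta > 1$ is the positive root of the quadratic
$\frac{1}{2}\sigma^2\beta(\beta - 1) + \mu\beta - r = 0$ obtained by
substituting $V(x) = Ax^\beta$ into the first expression
in (1) (this requires $r > \mu$), that is,

$$\beta = \frac{1}{\sigma^2}\Big[\frac{\sigma^2}{2} - \mu + \sqrt{\Big(\mu - \frac{\sigma^2}{2}\Big)^2 + 2r\sigma^2}\,\Big].$$

Then the optimal selling rule can be given as (see
Øksendal 2007):

sell if $x_t \geq x^*$,

hold if $x_t < x^*$.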

In general, to solve an optimal control problem
via the DP approach, one first needs to solve the
associated Bellman (HJB) equations. Then, these
solutions can be used to come up with an optimal
control policy. For example, in the above case,
given the value function $V(x)$, one should hold if

$$rV(x) - x\mu\frac{dV(x)}{dx} - \frac{x^2\sigma^2}{2}\frac{d^2V(x)}{dx^2} = 0$$

and sell when $V(x) - (x - K) = 0$. The threshold level
$x^*$ is the exact dividing point between the region where the first
expression equals zero and the region where the second vanishes.
In addition, one can also provide a theoretical
justification in terms of a verification theorem to
show that the solution obtained this way is indeed
optimal (see Fleming and Rishel (1975), Fleming
and Soner (2006), or Yong and Zhou (1999)).
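As a numerical illustration, the threshold can be computed in closed form once the exponent $\beta$ is known. The sketch below rests on two assumptions that should be checked against Øksendal (2007): that the value function in the hold region has the form $V(x) = Ax^\beta$, so that $\beta$ is the root greater than 1 of $\frac{1}{2}\sigma^2\beta(\beta - 1) + \mu\beta - r = 0$, and that $r > \mu$. The function name `selling_threshold` is introduced here for illustration.

```python
import math

def selling_threshold(mu, sigma, r, K):
    """Return (beta, x_star) with x_star = K * beta / (beta - 1).

    beta is the root larger than 1 of 0.5*sigma^2*b*(b - 1) + mu*b - r = 0,
    which exists when r > mu (an assumption of this sketch).
    """
    if r <= mu:
        raise ValueError("this sketch assumes r > mu, so that beta > 1")
    half_var = 0.5 * sigma ** 2
    disc = math.sqrt((mu - half_var) ** 2 + 2.0 * r * sigma ** 2)
    beta = (half_var - mu + disc) / sigma ** 2
    return beta, K * beta / (beta - 1.0)
```

For example, with $\mu = 0.05$, $\sigma = 0.2$, $r = 0.1$, and $K = 1$, the exponent is about 1.61 and the threshold is about 2.64, so under these assumptions the asset is held until its price is well above the transaction cost.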

HJB Equation Characterization and 
Computational Methods 

In continuous-time optimal control problems, one
major difficulty that arises in solving the associated
HJB equations (e.g., (1)) is the characterization
of the solutions. In most cases, there
is no guarantee that the derivatives or partial
derivatives exist. In this connection, the concept
of viscosity solutions developed by Crandall and
Lions in the 1980s can often be used to characterize
the solutions and their uniqueness. We
refer the reader to Fleming and Soner (2006) for
related literature and applications. In addition, we
would like to point out that closed-form solutions
are rare in stochastic control theory and difficult
to obtain in most cases. In many applications,
one needs to resort to computational methods.
One typical way to solve an HJB equation is
the finite difference method. An alternative is
Kushner’s Markov chain approximation method;
see Kushner and Dupuis (1992).

Summary and Future Directions 

In this article, we have briefly stated stochastic
DP methods, showed how they work in two
simple examples, and discussed related issues.
One serious limitation of the DP approach is
the so-called curse of dimensionality. In other
words, the computational effort of DP grows so
quickly with the dimension of the state that the
approach becomes impractical for problems with
high dimensionality. Various efforts have been
devoted to the search for approximate solutions.
One approach developed in recent years is the
multi-time-scale approach. The idea is to classify
random events according to the frequency of
their occurrence. Frequently occurring events are
grouped together and treated as a single “state”
to achieve the reduction of dimensionality. We
refer the reader to Yin and Zhang (2005, 2013)
for related literature and theoretical development.
Finally, we would like to mention that stochastic
DP has been used in many applications in economics,
engineering, management science, and
finance. Some applications can be found in Sethi
and Thompson (2000). Additional references are
also provided at the end for further reading.

Cross-References 

► Backward Stochastic Differential Equations 
and Related Control Problems 

► Numerical Methods for Continuous-Time 
Stochastic Control Problems 

► Risk-Sensitive Stochastic Control 

► Stochastic Adaptive Control 

► Stochastic Linear-Quadratic Control 

► Stochastic Maximum Principle 
Abstract

A stochastic game was introduced by Lloyd 
Shapley in the early 1950s. It is a dynamic 
game with probabilistic transitions played by 
one or more players. The game is played in a 
sequence of stages. At the beginning of each 
stage, the game is in a certain state. The players 
select actions, and each player receives a payoff 
that depends on the current state and the chosen 
actions. The game then moves to a new random 
state whose distribution depends on the previous 
state and the actions chosen by the players. The 
procedure is repeated at the new state, and the 
play continues for a finite or infinite number of 
stages. The total payoff to a player is often taken 
to be the discounted sum of the stage payoffs 


or the limit inferior of the averages of the stage 
payoffs. 

A learning problem arises when the agent does
not know the reward function or the state transition
probabilities. If an agent directly learns about
its optimal policy without knowing either the
reward function or the state transition function,
such an approach is called model-free reinforcement
learning. Q-learning is an example of such
a model.

Q-learning has been extended to a noncooperative
multi-agent context, using the framework of
general-sum stochastic games. A learning agent
maintains Q-functions over joint actions and performs
updates based on assuming Nash equilibrium
behavior over the current Q-values. The
challenge is convergence of the learning protocol.

Keywords 

Asynchronous dynamic programming; Dynamic
programming; Equilibrium; Markov decision
process; Q-learning; Reinforcement learning;
Repeated game

Introduction 

A Stochastic Game 

Definition 1 (Stochastic games) A stochastic 
game is a dynamic game with probabilistic 
transitions played by one or more players. The 
game is played in a sequence of stages. At the 
beginning of each stage, the game is in a certain 
state. The players select actions , and each player 
receives a payoff that depends on the current state 
and the chosen actions. The game then moves to 
a new random state whose distribution depends 
on the previous state and the actions chosen by 
the players. The process is repeated at the new 
state, and the play continues for a finite or infinite 
number of stages. 

The total payoff to a player can be defined in 
various ways. It depends on the payoffs at each 
stage and strategies chosen by players. The aim 
of the players is to control their total payoffs in 
the game by appropriate actions. 
The notion of a stochastic game was
introduced by Lloyd Shapley (1953) in the
early 1950s. Stochastic games generalize both
Markov decision processes (see also MDP) and
repeated games. A repeated game is equivalent
to a stochastic game with a single state. The
stochastic game is played in discrete time with
past history as common knowledge for all the
players. An individual strategy for a player is a
map which associates with each given history
a probability distribution on the set of actions
available to the player. The players’ actions at
stage $n$ determine the players’ payoffs at this
stage and the state $s \in \Theta$ at stage $n + 1$.

Learning 

Learning is acquiring new, or modifying and reinforcing
existing, knowledge, behaviors, skills,
values, or preferences, and may involve synthesizing
different types of information. The ability
to learn is possessed by humans, animals, and
some machines, which will later be called agents.
In the context of this entry, learning refers to
a particular class of stochastic game theoretical
models.

Definition 2 (Learning in stochastic games) A
learning problem arises when an agent does not
know the reward function or the state transition
probabilities. If the agent directly learns about its
optimal policy without knowing either the reward
function or the state transition function, such
an approach is called model-free reinforcement
learning. Q-learning is an example of such a
model.
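For orientation, the standard single-agent tabular Q-learning update can be sketched as below. This is the textbook rule for a Markov decision process, not the multi-agent variants discussed in this entry; the function name `q_update`, the learning rate `alpha`, the discount factor `gamma`, and the dictionary representation of the Q-function are all choices made for this sketch.

```python
from collections import defaultdict

def q_update(Q, s, a, reward, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step after observing (s, a, reward, s_next).

    Q       : mapping (state, action) -> current estimate, e.g. defaultdict(float)
    actions : actions available in s_next
    The update uses only the observed sample, never the reward function or the
    transition probabilities themselves, which is what makes it model-free.
    """
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]
```

A learning agent would call `q_update` once per observed transition, starting from `Q = defaultdict(float)`, while choosing its actions by some exploration rule that is not shown here.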

Learning models constitute a branch of a larger
literature. Players follow a form of behavioral
rule, such as imitation, regret minimization, or reinforcement.
Learning models are most appropriate
in settings where players have a good understanding
of their strategic environment and where
the stakes are high enough to make forecasting
and optimization worthwhile. The known approaches
are formulated as minimax-Q (Littman
1994), Nash-Q (Hu and Wellman 1998), tinkering
with learning rates (“Win or Learn Fast”,
WoLF, Bowling and Veloso 2001), and multiple
timescale Q-learning (Leslie and Collins 2005).


Model of Stochastic Game 

Let us assume that the environment is modeled
by the probability space $(\Omega, \mathcal{F}, P)$. An $N$-person
stochastic game is described by the objects
$(\mathcal{N}, \Theta, X_k, A_k, r_k, q)$ with the interpretation
that:

1. $\mathcal{N}$ is a set of players, with $|\mathcal{N}| = N \in \mathbb{N}$.

2. $\Theta$ is the set of states of the game, and it is
finite.

3. $X = X_1 \times X_2 \times \cdots \times X_N$ is the space of
actions, where $X_k$ is a nonempty, finite space
of actions for player $k$.

4. The $A_k$ are correspondences from $\Theta$ into
nonempty subsets of $X_k$. For each $s \in \Theta$,
$A_k(s)$ represents the set of actions available
to player $k$ in state $s$. For $s \in \Theta$, denote
$A(s) = A_1(s) \times A_2(s) \times \cdots \times A_N(s)$.

5. $r_k : \Theta \times X \to \mathbb{R}$ is a payoff function for
player $k$.

6. $q$ is a transition probability from $\Theta \times X$ to $\Theta$,
called the law of motion among states. If $s$ is
a state at a certain stage of the game and the
players select $x \in A(s)$, then $q(\cdot \mid s, x)$ is
the probability distribution of the next state of
the game.

The stochastic game generates two processes:

1. $\{\sigma_n\}_{n \geq 1}$ with values in $\Theta$

2. $\{\alpha_n\}_{n \geq 1}$ with values in $X$

Strategies 

Let $\mathcal{H} = \Theta_1 \times X_1 \times \Theta_2 \times \cdots$ be the space of
all infinite histories of the game and $\mathcal{H}_n = \Theta_1 \times
X_1 \times \Theta_2 \times X_2 \times \cdots \times \Theta_n$ the histories up to stage $n$.

Definition 3 A player’s strategy $\pi = \{\alpha_n\}_{n \geq 1}$
consists of random maps $\alpha_n : \Omega \times \mathcal{H}_n \to X$.
In other words, the strategy associates with each
given history a probability distribution on
the set of actions available to the player. If
$\alpha_n$ depends on the history only, it is called
deterministic.

The mathematical description of the strategies
can be made as follows:

1. For player $i \in \mathcal{N}$, a deterministic strategy
specifies a choice of actions for the player at
every stage of every possible history.

2. A mixed strategy is a probability distribution
over deterministic strategies.

3. Restricted classes of strategies:

   1. A behavioral strategy - a mixed strategy
   in which the mixing takes place at each
   history independently.

   2. A Markov strategy - a behavioral strategy
   such that for each time $t$, the distribution
   over actions depends only on the current
   state, but the distribution may be different
   at time $t$ than at time $t' \neq t$.

   3. A stationary strategy - a Markov strategy in
   which the distribution over actions depends
   only on the current state (not on the time $t$).

The Total Payoff Types 

For any profile of strategies $\pi = (\pi_1, \ldots, \pi_N)$ of
the players and every initial state $s_1 = s \in \Theta$, a
probability measure $P_s^\pi$ and a stochastic process
$\{\sigma_n, \alpha_n\}$ are defined on $\mathcal{H}$ in a canonical way,
where the random variables $\sigma_n$ and $\alpha_n$ describe
the state and the actions chosen by the players,
respectively, at the $n$th stage of the game. Let
$E_s^\pi$ denote the expectation operator with respect
to the probability measure $P_s^\pi$. For each profile
of strategies $\pi = (\pi_1, \ldots, \pi_N)$ and every initial
state $s \in \Theta$, the following are considered:

1. The expected $T$-stage payoff to player $k$, for
any finite horizon $T$, defined as

$$\Phi_k^T(\pi)(s) = E_s^\pi \Big( \sum_{n=1}^{T} r_k(\sigma_n, \alpha_n) \Big)$$

2. The $\beta$-discounted expected payoff to player $k$,
where $\beta \in (0, 1)$ is called the discount factor,
defined as

$$\Phi_k^\beta(\pi)(s) = E_s^\pi \Big( \sum_{n=1}^{\infty} \beta^{n-1} r_k(\sigma_n, \alpha_n) \Big)$$

3. The average payoff per unit time for player $k$
defined as

$$\Phi_k(\pi)(s) = \limsup_{T \to \infty} \frac{1}{T} \Phi_k^T(\pi)(s)$$


Equilibria 

Let $\pi^* = (\pi_1^*, \ldots, \pi_N^*) \in \Pi$ be a fixed profile
of the players’ strategies. For any strategy $\pi_k \in
\Pi_k$ of player $k$, we write $(\pi_{-k}^*, \pi_k)$ to denote the
strategy profile obtained from $\pi^*$ by replacing $\pi_k^*$
with $\pi_k$.

Definition 4 (A Nash equilibrium) A strategy
profile $\pi^* = (\pi_1^*, \ldots, \pi_N^*) \in \Pi$ is called a
Nash equilibrium (in $\Pi$) for the average payoff
stochastic game if no unilateral deviations from it
are profitable, that is, for each $s \in \Theta$,

$$\Phi_k(\pi^*)(s) \geq \Phi_k(\pi_{-k}^*, \pi_k)(s)$$

for every player $k$ and any strategy $\pi_k$.

Definition 5 (An $\varepsilon$-Nash equilibrium) A
strategy profile $\pi^* = (\pi_1^*, \ldots, \pi_N^*)$ is called
an $\varepsilon$-(Nash) equilibrium of the average payoff
stochastic game if for every $k \in \mathcal{N}$, we have

$$\Phi_k(\pi^*)(s) \geq \Phi_k(\pi_{-k}^*, \pi_k)(s) - \varepsilon,$$

for the given $\varepsilon > 0$ and all $\pi_k$.

Nash equilibria and $\varepsilon$-Nash equilibria are analogously
defined for the $T$-stage stochastic games,
$\beta$-discounted stochastic games, and the average
payoff per unit time stochastic games.

Construction of an Equilibrium 

For stochastic games with a finite state space 
and finite action spaces, the existence of a sta¬ 
tionary equilibrium has been shown (cf. Herings 
and Peeters 2004). The stationary strategies at 
time t do not depend on the entire history of 
the game up to that time. This allows reduction 
of the problem of finding discounted stationary 
equilibria in a general n -person stochastic game 
to that of finding a global minimum in a non¬ 
linear program with linear constraints. Solving 
this nonlinear program is equivalent to solving 
a certain nonlinear system for which it is known 
that the objective value in the global minimum 
is zero (cf. Filar et al. 1991). However, as is 
noted by Breton (1991), the convergence of an 
optimization algorithm to the global optimum is 
not guaranteed. 
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The solution of the finite horizon finite 
stochastic game can be construct by dynamic 
programming (see, e.g., Nowak and Szajowski 
1998; Tijms 2012). For discounted games, the 
solution construction is based on an equivalence 
(the two-person case is presented here for 
simplicity): 

1. (jr* ,t r*) is an equilibrium point in the 

discounted stochastic game with equilibrium 
payoffs , <f >2 ("tt *))• 

2. For each s e 0, the pair (tt*(s), tt* (s)) 

constitutes an equilibrium point in the static 
bimatrix game ( B\ (s),B 2 (s)) with equilibrium 
payoffs ^ where 

for players k = 1,2, and pure actions 

(a \, $ 2 ) £ A\(s)x A 2 (s), an admissible action 
space at state s , the elements of Bk(s) related 
to (a \, $ 2 ) 

b k (s,a u a 2 ) := (1 - P)r k (s, a u a 2 ) 

+PEl auai) <S>l (if*) (1) 

An algorithm for recursive computation of stationary equilibria in
stochastic games can be derived from (1). It starts with bimatrix
games with $\beta = 0$, and then a careful equilibrium selection
process guarantees its convergence under mild assumptions on the
model (see, e.g., Herings and Peeters 2004).
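
As a rough illustration of relation (1), the sketch below assembles the auxiliary bimatrix game at a single state from rewards, transition probabilities, and a guess of the continuation payoffs. The array layout (`r[k, s, a1, a2]`, `P[s, a1, a2, s_next]`, `phi[k, s_next]`), the function name `stage_bimatrix`, and the random test data are assumptions made for this example only; they are not taken from the cited references. No equilibrium selection is performed here.

```python
import numpy as np

def stage_bimatrix(r, P, phi, beta, s):
    """Return (B_1(s), B_2(s)) with entries
    b_k(s, a1, a2) = (1 - beta) * r_k(s, a1, a2) + beta * E[phi_k(next state)]."""
    # P[s] has shape (A1, A2, S) and phi has shape (2, S); contracting the
    # next-state axis gives the expected continuation payoff, shape (A1, A2, 2).
    continuation = np.tensordot(P[s], phi, axes=([2], [1]))
    B1 = (1.0 - beta) * r[0, s] + beta * continuation[..., 0]
    B2 = (1.0 - beta) * r[1, s] + beta * continuation[..., 1]
    return B1, B2

# Hypothetical data: 2 states, 2 actions per player.
rng = np.random.default_rng(0)
S, A1, A2 = 2, 2, 2
r = rng.uniform(size=(2, S, A1, A2))
P = rng.uniform(size=(S, A1, A2, S))
P /= P.sum(axis=-1, keepdims=True)      # normalise to transition probabilities
phi = np.zeros((2, S))                  # initial guess of continuation payoffs

B1, B2 = stage_bimatrix(r, P, phi, beta=0.0, s=0)
```

With `beta = 0` the matrices reduce to the one-shot rewards, which is the starting point of the recursive scheme described above.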

A Brief History of the Research on 
Stochastic Games 

The notion of a stochastic game was introduced 
by Shapley (1953) in the early 1950s. It is a 
dynamic game with probabilistic transitions 
played by one or more players. The game is 
played in a sequence of stages. At the beginning 
of each stage, the game is in a certain state. The 
players select actions, and each player receives 
a payoff that depends on the current state and 
the chosen actions. The game then moves to a 
new random state whose distribution depends on 
the previous state and the actions chosen by the 
players. The process is repeated at the new state, 
and the play continues for a finite or an infinite 
number of stages. The total payoff to a player is
often taken to be the discounted sum of the stage
payoffs or the limit inferior of the averages of the
stage payoffs.

The theory of nonzero-sum stochastic games with the average payoffs
per unit time for the players started with the papers by Rogers (1969)
and Sobel (1971). They considered finite state spaces only and assumed
that the transition probability matrices induced by any stationary
strategies of the players are irreducible. Until now, only special
classes of nonzero-sum average payoff stochastic games have been shown
to possess Nash equilibria (or $\varepsilon$-equilibria). A review of
various cases and results for generalization to infinite state spaces
can be found in the survey paper by Nowak and Szajowski (1998).


Learning in Stochastic Game 

The problem of an agent learning to act in an unknown world is both
challenging and interesting. Reinforcement learning has been
successful at finding optimal control policies for a single agent
operating in a stationary environment, specifically a Markov decision
process. Learning to act in multi-agent systems offers additional
challenges (see the following surveys: Shoham and Leyton-Brown 2009,
Chap. 7; Weiß and Sen 1996; Buşoniu et al. 2010). We provide here an
overview of the general idea of learning for single- and multi-agent
systems:

1. The goals of single-agent reinforcement learning are to determine
the optimal value and a control policy which maximizes the payoff. The
model of such a system can be built on the framework of Markov
decision processes with discounted payoff. Suppose the policy is
stationary and defined by a function $h : S \to A$ from states to
actions. Such a policy defines what action should be taken in each
state: $a_n(\cdot) := h(\cdot)$. There are various ways to learn the
optimal policy. The most straightforward way is based on the Q-values:
$Q^h(s, a) = \sum_{j=0}^{\infty} \beta^j r_{j+1}$. The greedy action is
$a = \arg\max_{a' \in A(s)} Q^h(s, a')$ (see the article on Q-learning
in Reinforcement learning).
2. Multi-agent reinforcement learning can be
employed to solve a single task, or an agent 
may be required to perform a task in an 
environment with other agents, either human, 
robot, or software ones. In either case, from an 
agent’s perspective, the world is not stationary. 
In particular, the behavior of the other agents 
may change as they also learn to better 
perform their tasks. This type of a multi-agent 
nonstationary world creates a difficult problem 
for learning to act in these environments. Such 
a nonstationary scenario can be viewed as a 
game with multiple players. In game theory, in 
the study of such problems, there is generally 
an underlying assumption that the players 
have similar adaptation and learning abilities. 
Therefore, the actions of each agent affect the task achievement of
the other agents. This makes it possible to build the value of the
game and an equilibrium strategy profile in successive steps.

Stochastic games can be seen as an extension of the single-agent
Markov decision process framework to include multiple agents whose
actions all impact the resulting rewards and the next state. They can
also be viewed as an extension of the framework of matrix games. Such
a view emphasizes the difficulty of finding the optimal behavior in
stochastic games, since the optimal behavior of any one agent depends
on the behavior of the other agents. A comprehensive study of the
multi-agent learning techniques for stochastic games does not yet
exist. The interested reader may consult the monographs by Fudenberg
and Levine (1998) and Shoham and Leyton-Brown (2009) and the special
issue of the journal Artificial Intelligence (Vohra and Wellman 2007).

Despite its interesting properties, Q-learning is a very slow method
that requires a long period of training to learn an acceptable policy.
In practice, parallel computing implementations of Q-learning are used
to mitigate this problem.
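
The single-agent case can be illustrated with tabular Q-learning on a toy problem. The two-state Markov decision process below, its rewards, the step size `alpha`, and the purely random exploration are all assumptions invented for this sketch; `beta` plays the role of the discount factor. The code learns the optimal Q-values rather than those of a fixed policy $h$.

```python
import numpy as np

def q_learning(n_steps=5000, beta=0.9, alpha=0.5, seed=0):
    """Tabular Q-learning on a hypothetical 2-state, 2-action MDP."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((2, 2))                       # Q[state, action]
    s = 0
    for _ in range(n_steps):
        a = int(rng.integers(2))               # uniformly random exploration
        s_next = s if a == 0 else 1 - s        # action 0 stays, action 1 switches state
        reward = 1.0 if (s == 0 and a == 1) else 0.0
        # Temporal-difference update towards reward + discounted best next value.
        target = reward + beta * Q[s_next].max()
        Q[s, a] += alpha * (target - Q[s, a])
        s = s_next
    return Q

Q = q_learning()
greedy = Q.argmax(axis=1)                      # greedy action in each state
```

Because transitions and rewards are deterministic here, the iteration settles quickly; in realistic stochastic problems many more samples are needed, which is the slowness referred to above.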

Summary and Future Directions 

Details concerning solution concepts for stochastic games can be found
in Filar and Vrieze (1997). Refinements of the Nash equilibrium
concept are known for dynamic games in economics (see Myerson 1978).
The Nash equilibrium concept may be extended gradually when the rules
of the game are interpreted in a broader sense, so as to allow preplay
or even intraplay communication. A well-known extension of the Nash
equilibrium is Aumann’s correlated equilibrium (see Aumann 1987),
which depends only on the normal form of the game. Two other solution
concepts for multistage games have been proposed by Forges (1986): the
extensive form correlated equilibrium, where the players can observe
private exogenous signals at every stage, and the communication
equilibrium, where the players are furthermore allowed to transmit
inputs to an appropriate device at every stage. An application of the
notion of correlated equilibria for stochastic games can be found in
Nowak and Szajowski (1998).

In economics, in the context of economic growth problems, Ramsey
(1928) introduced the overtaking optimality criterion, which was
introduced independently for repeated games by Rubinstein (1979). The
criterion has been investigated for some stochastic games by Carlson
and Haurie (1995), Nowak (2008), and others. The existence of
overtaking optimal strategies is a subtle issue, and there are
counterexamples showing that one has to be careful when making
statements on overtaking optimality.

Regarding learning in stochastic games, let us mention that the first
ideas can be found in the papers by Brown (1951) and Robinson (1951).
Some convergence results for fictitious play are given by Shoham and
Leyton-Brown (2009) in Theorem 7.2.5. An important example showing
non-convergence was given by Shapley (1964). In multi-person
stochastic games and learning, convergence to equilibria is a basic
stability requirement (see, e.g., Greenwald and Hall 2003; Hu and
Wellman 2003). This means that the agents’ strategies should
eventually converge to a coordinated equilibrium. Nash equilibria are
most frequently used, but their usefulness has been questioned. For
instance, Shoham and Leyton-Brown (2009) argue that the link between
stage-wise convergence to Nash equilibria and the performance in
stochastic games is unclear.
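
Fictitious play, the learning rule going back to Brown (1951) and Robinson (1951), can be sketched for a one-shot bimatrix game as follows. The function name, the simultaneous-update variant, the tie-breaking rule (lowest index), and the matching-pennies payoffs are assumptions of this illustration, not a reproduction of any cited algorithm.

```python
import numpy as np

def fictitious_play(A, B, n_rounds=10000):
    """A[i, j] and B[i, j] are the payoffs of players 1 and 2 for actions (i, j).
    Each player best-responds to the opponent's empirical action counts."""
    counts1 = np.zeros(A.shape[0])
    counts2 = np.zeros(A.shape[1])
    for _ in range(n_rounds):
        i = int(np.argmax(A @ counts2))     # best reply of player 1
        j = int(np.argmax(counts1 @ B))     # best reply of player 2
        counts1[i] += 1
        counts2[j] += 1
    return counts1 / n_rounds, counts2 / n_rounds

# Matching pennies (zero-sum): player 1 wants to match, player 2 to mismatch.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
freq1, freq2 = fictitious_play(A, -A)
```

In this zero-sum example the empirical frequencies approach the mixed equilibrium (1/2, 1/2), although the actions played in each round keep cycling; in Shapley's (1964) example even the frequencies fail to converge.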

Cross-References 

► Dynamic Noncooperative games 

► Evolutionary Games 

► Iterative Learning Control 

► Learning in Games 

► Stochastic Adaptive control 
Abstract

In this short article, we briefly review some 
major historical studies and recent progress 
on continuous-time stochastic linear-quadratic 
(SLQ) control and related mean-variance (MV) 
hedging. 
Introduction

A stochastic linear-quadratic (SLQ) control problem is the optimal
control of a linear stochastic dynamic equation subject to an expected
quadratic cost functional of the system state and control. As shown in
Athans (1971), it is a typical case of optimal stochastic control both
in theory and application. Due to the linearity of the system dynamics
and the quadratic form of the cost functional, the optimal control law
is usually synthesized into a feedback (also called closed-loop) form
of the optimal state, and the corresponding proportional coefficients
are specified by the associated Riccati equation. In what follows, we
restrict our exposition to the continuous-time SLQ problem and,
further, mainly to the finite-horizon case.

The initial study on the continuous-time SLQ problem seems to be due
to Florentin (1961). However, his linear stochastic control system is
assumed to be Gaussian. That is, the system noise is additive and is
multiplied by neither the state nor the control. Such a case is
usually termed the linear-quadratic Gaussian (LQG) problem, and in the
case of complete observation, the optimal feedback law remains
invariant when the white noise vanishes. The continuous-time partially
observable case was first discussed by Potter (1964), and a more
general formulation was later given by Wonham (1968a). It is proved
that the optimal control can be obtained by the following two separate
steps: (1) generate the conditional mean estimate of the current state
using a Kalman filter and (2) optimally feed back as if the
conditional mean state estimate were the true state of the system.
This result is referred to as the certainty equivalence principle or
the strict separation theorem. Different assumptions were discussed by
Tse (1971) for the separation of control and state estimation.

Wonham (1967, 1968b, 1970) investigated the SLQ problem in a fairly
general systematic framework. In the first two papers, his stochastic
system admits state-dependent noise. Finally, Wonham (1970) considered
the following very general (admitting both state- and
control-dependent noise) linear stochastic differential system driven
by a $d$-dimensional Brownian motion $W = (W^1, W^2, \ldots, W^d)'$:

$$X_t = x + \int_0^t (A_s X_s + B_s u_s)\, ds + \int_0^t \sum_{i=1}^d (C_s^i X_s + D_s^i u_s)\, dW_s^i, \quad t \in [0, T],$$

and the following cost functional:

$$J(u) = E\langle M X_T, X_T \rangle + E \int_0^T \left[ \langle Q_t X_t, X_t \rangle + \langle N_t u_t, u_t \rangle \right] dt.$$

Here, $T > 0$, $X_t \in \mathbb{R}^n$ is the state at time $t$, and
$u_t \in \mathbb{R}^m$ is the control at time $t$. Assume that all the
coefficients $A, B; C^i, D^i, i = 1, 2, \ldots, d; Q, N$ are piecewise
continuous matrix-valued (of suitable dimensions) functions of time,
that $M$ and $Q_t$ are nonnegative matrices, and that $N_t$ is
uniformly positive. Wonham (1970) gave the following Riccati equation:

$$\begin{cases} -\dot{K}_t = A_t^* K_t + K_t A_t + C_t^{i*} K_t C_t^i + Q_t - \Gamma_t(K_t)^* (N_t + D_t^{i*} K_t D_t^i)\, \Gamma_t(K_t), & t \in [0, T); \\ K_T = M. \end{cases} \quad (1)$$


Here, the asterisk stands for transpose, the repeated superscripts
imply summation from 1 to $d$, and the function $\Gamma$ is defined by

$$\Gamma_t(K) := -(N_t + D_t^{i*} K D_t^i)^{-1} (K B_t + C_t^{i*} K D_t^i)^*$$

for time $t \in [0, T]$ and any
$K \in \mathcal{S}_+^n := \{\text{all nonnegative } n \times n \text{ matrices}\}$.
This Riccati equation is a nonlinear ordinary differential equation
(ODE). Since the nonlinear term
$\Gamma_t(K)^* (N_t + D_t^{i*} K D_t^i)\, \Gamma_t(K)$ on the right-hand
side is not uniformly Lipschitz in $K$ in general, the standard
existence and uniqueness theorem of ODEs does not directly tell
whether this Riccati equation has a unique continuous solution in
$\mathcal{S}_+^n$. To resolve this issue, Wonham (1970) used Bellman’s
principle of quasilinearization and constructed the following sequence
of successive linear approximating matrix-valued ODEs.

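Define for $(t, K, \Gamma) \in [0, T] \times \mathbb{R}^{n \times n} \times \mathbb{R}^{m \times n}$,

$$F_t(K, \Gamma) := [A_t + B_t \Gamma]^* K + K [A_t + B_t \Gamma] + [C_t^i + D_t^i \Gamma]^* K [C_t^i + D_t^i \Gamma] + Q_t + \Gamma^* N_t \Gamma. \quad (2)$$

For a nonnegative matrix $K \in \mathcal{S}_+^n$, the matrix
$F_t(K, \Gamma) - F_t(K, \Gamma_t(K))$ is nonnegative, that is,

$$F_t(K, \Gamma) \ge F_t(K, \Gamma_t(K)), \quad \forall\, \Gamma \in \mathbb{R}^{m \times n}. \quad (3)$$

Riccati equation (1) can then be written in the following form:

$$\begin{cases} -\dot{K}_t = F_t(K_t, \Gamma_t(K_t)), & t \in [0, T); \\ K_T = M. \end{cases} \quad (4)$$

The iterating linear approximations are therefore structured as
follows: Set $K^0 = M$ and for $l = 1, 2, \ldots$,

$$\begin{cases} -\dot{K}_t^l = F_t(K_t^l, \Gamma_t(K_t^{l-1})), & t \in [0, T); \\ K_T^l = M. \end{cases} \quad (5)$$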

Using the above minimal property (3) of $F_t(K, \cdot)$ at
$\Gamma_t(K)$, Wonham showed that the unique nonnegative solution
$K^l$ of ODE (5) is monotonically decreasing in the sequential number
$l = 1, 2, \ldots$. Using the method of monotone convergence, the
sequence of solutions $\{K^l\}$ is shown to converge to some
$K \in \mathcal{S}_+^n$, which turns out to solve Riccati
equation (1).
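
The quasilinearization scheme (5) can be tried numerically in the scalar case $n = m = d = 1$. The sketch below is only an illustration under assumed data: the constant coefficients, the grid size, and the use of Heun's method for the backward integration are choices made here, not part of Wonham's argument. Each iterate solves a linear ODE with the feedback gain frozen at the previous iterate, and the result is compared with a direct integration of the Riccati equation in the form (4).

```python
import numpy as np

# Hypothetical scalar data (n = m = d = 1), chosen only for illustration.
A, B, C, D = 0.5, 1.0, 0.3, 0.2
Q, N, M, T = 1.0, 1.0, 1.0, 1.0
steps = 1000
h = T / steps

def gamma(K):
    """Feedback coefficient Gamma(K) in the scalar case."""
    return -(K * B + C * K * D) / (N + D * K * D)

def F(K, G):
    """Scalar version of F(K, Gamma) in (2)."""
    return 2.0 * (A + B * G) * K + (C + D * G) ** 2 * K + Q + N * G * G

def solve_backward(rhs):
    """Integrate -dK/dt = rhs(i, K), K(T) = M, backward in time (Heun's method)."""
    K = np.empty(steps + 1)
    K[-1] = M
    for i in range(steps, 0, -1):
        k1 = rhs(i, K[i])
        k2 = rhs(i - 1, K[i] + h * k1)
        K[i - 1] = K[i] + 0.5 * h * (k1 + k2)
    return K

# Reference: direct integration of the Riccati equation in the form (4).
K_riccati = solve_backward(lambda i, K: F(K, gamma(K)))

# Successive linear approximations (5), starting from K^0 = M.
iterates = []
K_prev = np.full(steps + 1, M)
for _ in range(8):
    G = gamma(K_prev)                      # gain frozen at the previous iterate
    K_prev = solve_backward(lambda i, K, G=G: F(K, G[i]))
    iterates.append(K_prev)

gap = np.max(np.abs(iterates[-1] - K_riccati))
```

In this example the iterates `iterates[0], iterates[1], ...` (that is, $K^1, K^2, \ldots$) decrease pointwise and `gap` becomes small after a few steps, in line with the monotone convergence described above.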


The Case of Random Coefficients 
and Backward Stochastic Riccati 
Equation 

Bismut (1976, 1978) are the first studies on the SLQ problem with
random coefficients. Let $\{\mathcal{F}_t, t \in [0, T]\}$ be the
completed natural filtration of $W$. When the coefficients
$A, B; C^i, D^i, i = 1, 2, \ldots, d; Q, N$ and $M$ may be random, with
$A, B; C^i, D^i, i = 1, 2, \ldots, d; Q, N$ being
$\mathcal{F}_t$-adapted and essentially bounded and $M$ being
$\mathcal{F}_T$-measurable and essentially bounded, Bismut (1976,
1978) used the stochastic maximum principle for optimal control and
derived the following Riccati equation:

$$\begin{cases} -dK_t = \left[ A_t^* K_t + K_t A_t + C_t^{i*} K_t C_t^i + Q_t + C_t^{i*} L_t^i + L_t^i C_t^i - \widehat{\Gamma}_t(K_t, L_t)^* (N_t + D_t^{i*} K_t D_t^i)\, \widehat{\Gamma}_t(K_t, L_t) \right] dt - L_t^i\, dW_t^i, & t \in [0, T); \\ K_T = M, \end{cases} \quad (6)$$

where the function $\widehat{\Gamma}_t$ for $t \in [0, T]$ is defined
as follows:

$$\widehat{\Gamma}_t(K, L) := -(N_t + D_t^{i*} K D_t^i)^{-1} (K B_t + C_t^{i*} K D_t^i + L^i D_t^i)^*, \quad \forall\, K \in \mathcal{S}_+^n, \ \forall\, L := (L^1, \ldots, L^d) \in (\mathbb{R}^{n \times n})^d.$$

Peng (1992b) applied his stochastic Hamilton-Jacobi-Bellman equation
to the SLQ problem and also derived the above equation. Both authors
established the existence and uniqueness of an adapted solution of
backward stochastic Riccati equation (6) when the function
$\widehat{\Gamma}_t(K, L)$ does not contain $L$. However, Bismut used
the fixed-point method, whereas Peng (1992b) used Bellman’s principle
of quasilinearization and the method of monotone convergence. Neither
methodology works for the general case of quadratic growth in the
second unknown variable $L$ in the drift of the stochastic equation.
Bismut (1976, 1978) and Peng (1999) stated the general case as an open
problem. By considering the stochastic equation for the inverse of
$K_t$, Kohlmann and Tang (2003a) solved some particular cases where
the function $\widehat{\Gamma}_t(K, L)$ can depend on $L$. Tang (2003)
finally solved the general case, using the method of stochastic flows.

In the general case, the optimal feedback coefficient
$\widehat{\Gamma}_t(K_t, L_t)$ at time $t$ depends on $L_t$ in a
linear manner, which is in general not essentially bounded with
respect to $(t, \omega)$. Kohlmann and Tang (2003b) observed that the
stochastic integral process $\int_0^{\cdot} L_t^i\, dW_t^i$ is a
BMO-martingale.


Indefinite SLQ Problem 

Chen (1985) contains a theory of singular (the control weighting
matrix vanishing in the quadratic cost functional) LQG control, which
is a particular type of indefinite SLQ problem. In the deterministic
linear-quadratic (LQ) control theory, the well-posedness (i.e., the
value function is finite on $[0, T] \times \mathbb{R}^n$) of the
problem suggests that the control weighting matrix $N$ in the
quadratic cost functional be positive definite. In the stochastic
case, when $N_t$ is slightly negative, the SLQ problem may still be
well posed if the control also increases the intensity of the system
noise. Peng (1992a) used an indefinite but well-posed SLQ problem to
illustrate his new second-order stochastic maximum principle. Chen et
al. (1998) gave a deeper study of this feature of the SLQ problem.
Yong and Zhou (1999) gave a systematic account of the progress on the
indefinite SLQ problem.
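
A one-dimensional computation, given here only as a sketch with hypothetical data, shows how control-dependent noise can compensate for a negative control weight. Take $n = m = d = 1$, $A = B = C = 0$, $D = 1$, $Q = 0$, $M = 1$, and a constant $N \in (-1, 0)$:

$$dX_t = u_t\, dW_t, \quad X_0 = x, \qquad J(u) = E\big[X_T^2\big] + E \int_0^T N u_t^2\, dt.$$

For square-integrable adapted controls, the Itô isometry gives $E[X_T^2] = x^2 + E \int_0^T u_t^2\, dt$, hence

$$J(u) = x^2 + (1 + N)\, E \int_0^T u_t^2\, dt \ \ge\ x^2,$$

with equality for $u \equiv 0$. The problem is therefore well posed although $N < 0$, and $K \equiv 1$ solves the Riccati equation with $N + D^* K D = N + 1 > 0$. For $N < -1$ the same identity shows that the cost is unbounded from below.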


Mean-Variance Hedging 

In the theory of finance, Duffie and Richardson 
(1991) introduced the SLQ control model to 
hedge a contingent claim in an incomplete 
market. Schweizer (1992) developed a first 
framework for MV hedging, and then it was 
extended to a very general setting in Gourieroux 
et al. (1998). Before 2000, the martingale 
method was used to solve the MV hedging 
problem. Kohlmann and Zhou (2000) began 
to use the standard SLQ theory to derive the 
optimal hedging strategy for a general contingent 
claim in a financial market of deterministic 
coefficients, and such an SLQ methodology was
subsequently extended to very general settings 
for financial markets by Kohlmann and Tang 
(2002, 2003b), Bobrovnytska and Schweizer 
(2004), and Jeanblanc et al. (2012). See more 
detailed surveys on the literature by Pham (2000), 
Schweizer (2010), and Jeanblanc et al. (2012). 


Summary and Future Directions 

In comparison to the continuous-time deterministic LQ theory, the
continuous-time SLQ theory has the following two striking features: An
indefinite SLQ problem may be well posed, and the optimal feedback
coefficient may be unbounded due to its linear dependence on the
martingale part $L$ of the stochastic solution of the Riccati
equation. Due to the second feature, the convergence of the sequence
of successive approximations constructed via Bellman’s
quasilinearization still remains to be solved in the general case.
This problem partially motivates Delbaen and Tang (2010) to study the
regularity of unbounded stochastic differential equations and also may
help to explain the necessity of rich studies on mean-variance hedging
and closedness of stochastic integrals with respect to semimartingales
(as in Delbaen et al. 1994, 1997) in various general settings.

Cross-References 

► Stochastic Maximum Principle 


Recommended Reading 

The theory of SLQ control in various contexts is available in
textbooks, monographs, or papers. Anderson and Moore (1971, 1989),
Bensoussan (1992), and Chen (1985) include good accounts of the LQG
control theory. Wonham (1970) includes a full introduction to the SLQ
problem with deterministic coefficients that are piecewise continuous
in time. Bismut (1978) gives a systematic and readable introduction,
in French, to the SLQ problem with random coefficients. Yong and Zhou
(1999) include an extensive discussion on the well-posed indefinite
SLQ problem. Tang (2003) gives a complete solution of a general
backward stochastic Riccati equation.
Abstract

The stochastic maximum principle (SMP) gives some necessary conditions
for optimality for a stochastic optimal control problem. We give a
summary of well-known results concerning the stochastic maximum
principle in finite-dimensional state space as well as some recent
developments in infinite-dimensional state space.


Keywords 

Adjoint process; Backward stochastic differential 
equations; Brownian motion; Hilbert-Schmidt 
operators 


Introduction 

The problem of finding necessary conditions for optimality for a
stochastic optimal control problem with finite-dimensional state
equation has been well studied since the pioneering work of Bismut
(1976, 1978). In particular, Bismut introduced linear backward
stochastic differential equations (BSDEs), which have become an active
domain of research since the seminal paper of Pardoux and Peng (1990)
on nonlinear BSDEs.

The first results on SMP concerned only stochastic systems where the
control domain is convex or the diffusion coefficient does not contain
the control variable. In this case, only the first-order expansion is
needed. This kind of SMP was developed by Bismut (1976, 1978), Kushner
(1972), and Haussmann (1986). It is important to note that Bismut
(1978) introduced linear BSDEs to represent the first-order adjoint
process.

Peng made a breakthrough by establishing the SMP for the general
stochastic optimal control problem where the control domain need not
be convex and the diffusion coefficient can contain the control
variable. He solved this general case by introducing the second-order
expansion and second-order BSDE. We refer to the book by Yong and Zhou
(1999) for an account of the theory of SMP in finite-dimensional
spaces and describe Peng’s SMP in the next section.

Despite the fact that the problem was solved in complete generality
more than 20 years ago, the infinite-dimensional case still has
important open issues both on the side of the generality of the
abstract model and on the side of its applicability to systems modeled
by stochastic partial differential equations (SPDEs). The last section
is devoted to the recent development of SMP in infinite-dimensional
space.
Statement of SMP

Formulation of Problem 

Let $(\Omega, \mathcal{F}, P)$ be a complete probability space, on
which an $m$-dimensional Brownian motion $W$ is given. Let
$\{\mathcal{F}_t\}_{t \ge 0}$ be the natural completed filtration of
$W$.

We consider the following stochastic controlled system:

$$dx(t) = b(x(t), u(t))\, dt + \sigma(x(t), u(t))\, dW(t), \quad x(0) = x_0, \quad (1)$$

with the cost functional

$$J(u(\cdot)) = E\left\{ \int_0^T f(x(t), u(t))\, dt + h(x(T)) \right\}. \quad (2)$$

In the above, $b, \sigma, f, h$ are given functions with appropriate
dimensions, and $(U, d)$ is a separable metric space.

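We define

$$\mathcal{U} = \{ u : [0, T] \times \Omega \to U \mid u \text{ is } \{\mathcal{F}_t\}_{t \ge 0}\text{-adapted} \}. \quad (3)$$

The optimal control problem is: Minimize $J(u(\cdot))$ over
$\mathcal{U}$.

Any $\bar{u} \in \mathcal{U}$ satisfying

$$J(\bar{u}) = \inf_{u \in \mathcal{U}} J(u) \quad (4)$$

is called an optimal control. The corresponding $\bar{x}$ and
$(\bar{x}, \bar{u})$ are called an optimal state process/trajectory
and optimal pair, respectively.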

In this section, we assume the following standard hypothesis:

Hypothesis 1
1. The functions $b : \mathbb{R}^n \times U \to \mathbb{R}^n$,
$\sigma = (\sigma^1, \ldots, \sigma^m) : \mathbb{R}^n \times U \to \mathbb{R}^{n \times m}$,
$f : \mathbb{R}^n \times U \to \mathbb{R}$, and
$h : \mathbb{R}^n \to \mathbb{R}$ are measurable functions.

2. For $\varphi = b, \sigma^j, j = 1, \ldots, m, f$, the functions
$x \mapsto \varphi(x, u)$ and $x \mapsto h(x)$ are $C^2$, with
derivatives denoted $\varphi_x$ and $\varphi_{xx}$ (respectively,
$h_x$ and $h_{xx}$), which are also continuous functions of $(x, u)$.

3. There exists a constant $K > 0$ such that

$$|\varphi_x| + |\varphi_{xx}| + |h_x| + |h_{xx}| \le K,$$

and

$$|\varphi| + |h| \le K(1 + |x| + |u|).$$


Adjoint Equations 

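Let us first introduce the following backward stochastic differential
equations (BSDEs), written along an optimal pair $(\bar{x}, \bar{u})$.

$$\begin{cases} dp(t) = -\Big\{ b_x(\bar{x}(t), \bar{u}(t))^\top p(t) + \sum_{j=1}^m \sigma_x^j(\bar{x}(t), \bar{u}(t))^\top q_j(t) - f_x(\bar{x}(t), \bar{u}(t)) \Big\}\, dt + q(t)\, dW(t), \\ p(T) = -h_x(\bar{x}(T)). \end{cases} \quad (5)$$

The solution $(p, q)$ to the above BSDE (first-order BSDE) is called
the first-order adjoint process.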


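$$\begin{cases} dP(t) = -\Big\{ b_x(\bar{x}(t), \bar{u}(t))^\top P(t) + P(t)\, b_x(\bar{x}(t), \bar{u}(t)) + \sum_{j=1}^m \sigma_x^j(\bar{x}(t), \bar{u}(t))^\top P(t)\, \sigma_x^j(\bar{x}(t), \bar{u}(t)) \\ \qquad\qquad + \sum_{j=1}^m \big\{ \sigma_x^j(\bar{x}(t), \bar{u}(t))^\top Q_j(t) + Q_j(t)\, \sigma_x^j(\bar{x}(t), \bar{u}(t)) \big\} + H_{xx}(\bar{x}(t), \bar{u}(t), p(t), q(t)) \Big\}\, dt + \sum_{j=1}^m Q_j(t)\, dW^j(t), \\ P(T) = -h_{xx}(\bar{x}(T)), \end{cases} \quad (6)$$

where the Hamiltonian $H$ is defined by

$$H(x, u, p, q) = \langle p, b(x, u) \rangle + \operatorname{tr}\big[ q^\top \sigma(x, u) \big] - f(x, u). \quad (7)$$

The solution $(P, Q)$ to the above BSDE (second-order BSDE) is called
the second-order adjoint process.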


Stochastic Maximum Principle 

Let us now state the stochastic maximum principle.

Theorem 1 Let $(\bar{x}, \bar{u})$ be an optimal pair of the problem.
Then there exist a unique couple $(p, q)$ satisfying (5) and a unique
couple $(P, Q)$ satisfying (6), and the following maximum condition
holds for all $u \in U$:

$$H(\bar{x}(t), \bar{u}(t), p(t), q(t)) - H(\bar{x}(t), u, p(t), q(t)) - \frac{1}{2} \operatorname{tr}\Big( \{\sigma(\bar{x}(t), \bar{u}(t)) - \sigma(\bar{x}(t), u)\}^\top P(t)\, \{\sigma(\bar{x}(t), \bar{u}(t)) - \sigma(\bar{x}(t), u)\} \Big) \ge 0. \quad (8)$$
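
To see what condition (8) says in a simple case, consider the following scalar linear-quadratic sketch. The constants $\alpha, \beta, \gamma, \delta, \kappa, \rho$ and the choice $U = \mathbb{R}$ are assumptions introduced for this illustration only. Let

$$b(x, u) = \alpha x + \beta u, \quad \sigma(x, u) = \gamma x + \delta u, \quad f(x, u) = \tfrac{1}{2}(\kappa x^2 + \rho u^2), \quad \rho > 0,$$

so that, by (7),

$$H(x, u, p, q) = p(\alpha x + \beta u) + q(\gamma x + \delta u) - \tfrac{1}{2}(\kappa x^2 + \rho u^2).$$

Condition (8) states that the optimal value $\bar{u}(t)$ maximizes over $u \in \mathbb{R}$ the function

$$\mathcal{H}(u) := H(\bar{x}(t), u, p(t), q(t)) + \tfrac{1}{2} P(t)\, \delta^2 (u - \bar{u}(t))^2.$$

The first-order condition $\mathcal{H}'(\bar{u}(t)) = 0$ gives $\beta p(t) + \delta q(t) - \rho\, \bar{u}(t) = 0$, that is,

$$\bar{u}(t) = \frac{\beta p(t) + \delta q(t)}{\rho},$$

while the second-order condition $\mathcal{H}'' \le 0$ reads $\rho - \delta^2 P(t) \ge 0$. The second-order adjoint process $P$ thus enters only through the control-dependent part $\delta$ of the diffusion coefficient.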


SMP in Infinite-Dimensional Space 

The problem of finding necessary conditions for optimality for a
stochastic optimal control problem with infinite-dimensional state
equation, along the lines of the Pontryagin maximum principle, was
already addressed in the early 1980s in the pioneering paper by
Bensoussan (1983).

Whereas the Pontryagin maximum principle for infinite-dimensional
stochastic control problems is a well-known result as far as the
control domain is convex (or the diffusion does not depend on the
control; see Bensoussan 1983; Hu and Peng 1990), for the general case
(that is, when the control domain need not be convex and the diffusion
coefficient can contain a control variable), existing results are
limited to abstract evolution equations under assumptions that are not
satisfied by the large majority of concrete SPDEs.

The technical obstruction is related to the fact that (as was pointed
out in Peng 1990) if the control domain is not convex, the optimal
control has to be perturbed by the so-called spike variation. Then, if
the control enters the diffusion, the irregularity in time of the
Brownian trajectories forces one to take into account a second
variation process. Thus, the stochastic maximum principle has to
involve an adjoint process for the second variation. In the
finite-dimensional case, such a process can be characterized as the
solution of a matrix-valued backward stochastic differential equation
(BSDE), while in the infinite-dimensional case, the process naturally
lives in a non-Hilbertian space of operators and its characterization
is much more difficult. Moreover, the applicability of the abstract
results to concrete controlled SPDEs is another delicate step due to
the specific difficulties that they involve, such as the lack of
regularity of Nemytskii-type coefficients in $L^p$ spaces.

Concerning results on the infinite-dimensional stochastic Pontryagin
maximum principle, as we already mentioned, in Bensoussan (1983) and
Hu and Peng (1990), the case of diffusion independent of the control
is treated (with the difference that in Hu and Peng (1990) a complete
characterization of the adjoint to the first variation as the unique
mild solution to a suitable BSDE is achieved).

The paper by Tang and Li (1994) is the first one in which the general
case is addressed with, in addition, a general class of noises
possibly with jumps. The adjoint process of the second variation
$(P_t)_{t \in [0, T]}$ is characterized as the solution of a BSDE in
the (Hilbertian) space of Hilbert-Schmidt operators. This forces one
to assume a very strong regularity on the abstract state equation and
control functional, which prevents application of the results in Tang
and Li (1994) to SPDEs.

Then, in the papers by Fuhrman et al. (2012, 2013), the state equation
is formulated only in a semiabstract way in order, on one side, to
cope with all the difficulties carried by the concrete nonlinearities
and, on the other, to take advantage of the regularizing properties of
the leading elliptic operator.

Recently, in Lü and Zhang (2012), $P_t$ was characterized as a
“transposition solution” of a backward stochastic evolution equation
in $\mathcal{L}(L^2(\mathcal{O}))$. Coefficients are required to be
twice Fréchet differentiable as operators in $L^2(\mathcal{O})$.
Finally, even more recently, in a couple of preprints (Du and Meng
2012, 2013), the process $P_t$ is characterized in a similar way as it
is in Fuhrman et al. (2012, 2013). Roughly speaking, it is
characterized as a suitable stochastic bilinear form. As is the case
in Lü and Zhang (2012), in Du and Meng (2012, 2013) as well, the
regularity assumptions on the coefficients are too restrictive to
apply directly the results in Lü and Zhang (2012) and Du and Meng
(2012, 2013) to controlled SPDEs.
Abstract

Model predictive control (MPC) is a control 
strategy that has been used successfully in 
numerous and diverse application areas. The 
aim of the present entry is to discuss how 
the basic ideas of MPC can be extended to 
problems involving random model uncertainty 
with known probability distribution. We discuss 
cost indices, constraints, closed-loop properties, 
and implementation issues. 
Introduction

Stochastic model predictive control (SMPC) 
refers to a family of numerical optimization 
strategies for controlling stochastic systems 
subject to constraints on the states and inputs 
of the controlled system. In this approach, 
future performance is quantified using a cost 
function evaluated along predicted state and input 
trajectories. This leads to a stochastic optimal 
control problem, which is solved numerically to 
determine an optimal open-loop control sequence 
or alternatively a sequence of feedback control 
laws. In MPC, only the first element of this 
optimal sequence is applied to the controlled 
system, and the optimal control problem is 
solved again at the next sampling instant on the 
basis of updated information on the system state. 
The numerical nature of the approach makes it 
applicable to systems with nonlinear dynamics 
and constraints on states and inputs, while 
the repeated computation of optimal predicted 
trajectories introduces feedback to compensate 
for the effects of uncertainty in the model. 

Robust MPC (RMPC) tackles problems with hard state and input
constraints, which are to be satisfied for all realizations of model
uncertainty. However, RMPC is too conservative in many applications,
and stochastic MPC (SMPC) provides less conservative solutions by
handling a wider class of constraints which are to be satisfied in
mean or with a specified probability. This is achieved by taking
explicit account of the probability distribution of the stochastic
model uncertainty in the optimization of predicted performance.
Constraints limit performance, and an advantage of MPC is that it
allows systems to operate close to constraint boundaries. Stochastic
MPC is similarly advantageous when model uncertainty is stochastic
with known probability distribution and the constraints are
probabilistic in nature.

Applications of SMPC have been reported in diverse fields, including
finance and portfolio management, risk management, sustainable
development policy assessment, chemical and process industries,
electricity generation and distribution, building climate control, and
telecommunications network traffic control. This entry aims to
summarize the theoretical framework underlying SMPC algorithms.

Stochastic MPC 

Consider a system with discrete time model

$$x^+ = f(x, u, w) \quad (1)$$

$$z = g(x, u, v) \quad (2)$$

where $x \in \mathbb{R}^{n_x}$ and $u \in \mathbb{R}^{n_u}$ are the
system state and control input and $x^+$ is the successor state (i.e.,
if $x_i$ is the state at time $i$, then $x^+ = x_{i+1}$ is the state
at time $i + 1$). Inputs $w \in \mathbb{R}^{n_w}$ and
$v \in \mathbb{R}^{n_v}$ are exogenous disturbances with unknown
current and future values but known probability distributions, and
$z \in \mathbb{R}^{n_z}$ is a vector of output variables that are
subject to constraints.

The optimal control problem that is solved online at each time step in
SMPC is defined in terms of a performance index
$J_N(x, \mathbf{u}, \mathbf{w})$ evaluated over a future horizon of
$N$ time steps. Typically in SMPC, $J_N(x, \mathbf{u}, \mathbf{w})$ is
a quadratic function of the following form (in which
$\|x\|_Q^2 := x^T Q x$)

$$J_N(x, \mathbf{u}, \mathbf{w}) = \sum_{i=0}^{N-1} \left( \|x_i\|_Q^2 + \|u_i\|_R^2 \right) + V_f(x_N) \quad (3)$$

for positive definite matrices $Q$ and $R$, and a terminal cost
$V_f(x)$ defined as discussed in section “Stability and Convergence.”
Here $\mathbf{u} := \{u_0, \ldots, u_{N-1}\}$ is a postulated sequence
of control inputs and
$\mathbf{x}(x, \mathbf{u}, \mathbf{w}) := \{x_0, \ldots, x_N\}$ is the
corresponding sequence of states such that $x_i$ is the solution of
(1) at time $i$ with initial state $x_0 = x$, for a given sequence of
disturbance inputs $\mathbf{w} := \{w_0, \ldots, w_{N-1}\}$. Since
$\mathbf{w}$ is a random sequence, $J_N(x, \mathbf{u}, \mathbf{w})$ is
a random variable, and the optimal control problem is therefore
formulated as the minimization of a cost $V_N(x, \mathbf{u})$ derived
from $J_N(x, \mathbf{u}, \mathbf{w})$ under specific assumptions on
$\mathbf{w}$. Common definitions of $V_N(x, \mathbf{u})$ are as
follows.
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(a) Expected value cost: 

F/v(x,u) := E x (/(v,u, w)) 

where E x (•) denotes the conditional expecta¬ 
tion of a random variable (•) given the model 
state x. 

(b) Worst-case cost, assuming w/ e W for all 
i with probability 1, for some compact set 
W C R nw : 

Viv(v,u) := max J(x, u, w). 

\veW N 

(c) Nominal cost, assuming w/ is equal to some 
nominal value, e.g., if w* = 0 for all /, then 

Kjv(x,u) := /(x,u,0), 

where 0 = {0,..., 0}. 
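
The three cost definitions (a)-(c) can be compared numerically on a small example. The following is a minimal sketch, not part of the original entry: it assumes a scalar linear instance of (1) with a disturbance uniformly distributed on a bounded interval, and all parameter values and function names are illustrative. The expected value is approximated by a sample average, and the worst case is found by enumerating the vertices of the disturbance box, which is valid here only because the quadratic cost is convex in the disturbance sequence.

```python
import itertools
import random

# Assumed scalar example of model (1): x+ = A*x + B*u + w,
# with w uniformly distributed on [-w_bound, w_bound].
A, B = 0.9, 1.0
Q, R, QF = 1.0, 0.1, 1.0   # stage weights and terminal weight, V_f(x) = QF * x**2


def horizon_cost(x0, u_seq, w_seq):
    """J_N(x, u, w) from (3) for one disturbance sequence."""
    x, cost = x0, 0.0
    for u, w in zip(u_seq, w_seq):
        cost += Q * x * x + R * u * u
        x = A * x + B * u + w
    return cost + QF * x * x


def expected_cost(x0, u_seq, w_bound, n_samples=2000, seed=0):
    """Cost (a), approximated by a sample average over random sequences."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        w_seq = [rng.uniform(-w_bound, w_bound) for _ in u_seq]
        total += horizon_cost(x0, u_seq, w_seq)
    return total / n_samples


def worst_case_cost(x0, u_seq, w_bound):
    """Cost (b). J_N is convex in w here, so the maximum is at a box vertex."""
    vertices = itertools.product((-w_bound, w_bound), repeat=len(u_seq))
    return max(horizon_cost(x0, u_seq, w_seq) for w_seq in vertices)


def nominal_cost(x0, u_seq):
    """Cost (c) with nominal disturbance w_i = 0."""
    return horizon_cost(x0, u_seq, [0.0] * len(u_seq))
```

For a fixed input sequence in this example, the nominal cost is the smallest of the three and the worst-case cost the largest, with the expected cost in between.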

The minimization of $V_N(x, \mathbf{u})$ is performed
subject to constraints on the sequence of outputs
$z_i := g(x_i, u_i, v_i)$, $i \ge 0$. These constraints
may be formulated in various ways, summarized
as follows, where for simplicity we assume
$n_z = 1$.

(A) Expected value constraints: for all $i$,

$\mathbb{E}_x(z_i) \le 1$.

(B) Probabilistic constraints pointwise in time:

$\Pr_x(z_i \le 1) \ge p$,

for all $i$ and for a given probability $p$.

(C) Probabilistic constraints over a future horizon:

$\Pr_x(z_i \le 1,\ i = 0, 1, \ldots, N) \ge p$

for a given probability $p$.

In (B) and (C), $\Pr_x(A)$ represents the conditional
probability of an event $A$ that depends on the
sequence $\mathbf{x}(x, \mathbf{u}, \mathbf{w})$, given that the initial model
state is $x_0 = x$; for example, the probability
$\Pr_x(z_i \le 1)$ depends on the distribution of
$\{w_0, \ldots, w_{i-1}, v_i\}$.
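
As an illustration of how the pointwise probabilities in (B) depend on the distributions of the disturbances, the sketch below estimates $\Pr_x(z_i \le 1)$ by random sampling. It is a hedged example under stated assumptions: the scalar model, the output map $z_i = c x_i + v_i$, the uniform and Gaussian disturbance distributions, and all names and parameter values are hypothetical choices, not taken from the entry.

```python
import random


def constraint_probabilities(x0, u_seq, n_samples=5000, seed=1,
                             a=0.9, b=1.0, c=1.0, w_bound=0.2, v_std=0.1):
    """Estimate Pr_x(z_i <= 1) for i = 0, ..., N-1 by Monte Carlo sampling.

    Assumed model: x+ = a*x + b*u + w with w uniform on [-w_bound, w_bound],
    and z = c*x + v with v zero-mean Gaussian with standard deviation v_std.
    """
    rng = random.Random(seed)
    counts = [0] * len(u_seq)
    for _ in range(n_samples):
        x = x0
        for i, u in enumerate(u_seq):
            z = c * x + rng.gauss(0.0, v_std)        # z_i depends on w_0..w_{i-1}, v_i
            if z <= 1.0:
                counts[i] += 1
            x = a * x + b * u + rng.uniform(-w_bound, w_bound)
    return [n / n_samples for n in counts]


def satisfies_pointwise(probabilities, p):
    """Check the sampled version of constraint (B) for a given probability p."""
    return all(prob >= p for prob in probabilities)
```

The estimates are only as accurate as the sample size allows, so a check such as `satisfies_pointwise(constraint_probabilities(0.5, [0.0, 0.0, 0.0]), 0.9)` should be read as an approximation of (B) rather than a guarantee.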


The important special case of state constraints
can also be handled by (A)-(C) through
appropriate choice of the function $g(x, u, v)$. For
example, the constraint $\Pr_x(h(x) \le 1) \ge p$,
for a given function $h : \mathbb{R}^{n_x} \to \mathbb{R}$, can be
expressed in the form (B) with $z = g(x, u, v) := h(f(x, u, w))$
and $v := w$ in (2).

In common with other receding horizon control
strategies, SMPC is implemented via the following
algorithm. At each discrete time step:

(i) Minimize the cost index $V_N(x, \mathbf{u})$ over $\mathbf{u}$
subject to the constraints on $z_i$, $i \ge 0$, given
the current system state $x$.

(ii) Apply the control input $u = u_0^*(x)$ to the system,
where $\mathbf{u}^*(x) = \{u_0^*(x), \ldots, u_{N-1}^*(x)\}$
is the minimizing sequence given $x$.

If the system dynamics (1) are unstable, then
performing the optimization in step (i) directly
over future control sequences can result in a small
set of feasible states $x$. To avoid this difficulty
the elements of the control sequence $\mathbf{u}$ are usually
expressed in the form $u_i = u_T(x_i) + s_i$,
where $u_T(x)$ is a locally stabilizing feedback law,
and $\{s_0, \ldots, s_{N-1}\}$ are optimization variables in
step (i).
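
Steps (i) and (ii), together with the parameterization $u_i = u_T(x_i) + s_i$, can be sketched as a brute-force receding-horizon loop. Everything specific in this example is an assumption made for illustration: the open-loop unstable scalar model, the linear feedback $u_T(x) = -K_{FB}\,x$, the coarse grid of candidate perturbations, the fixed set of sampled disturbance scenarios used to approximate both the expected cost (a) and the pointwise constraint (B) with $z = x$, and all names. A practical SMPC implementation would use a proper optimizer and would add the conditions needed for recursive feasibility, which this sketch does not attempt.

```python
import itertools
import random

A, B = 1.2, 1.0        # assumed open-loop unstable scalar model x+ = A*x + B*u + w
K_FB = 0.7             # u_T(x) = -K_FB*x, closed-loop pole A - B*K_FB = 0.5
N = 2                  # prediction horizon
P_MIN = 0.9            # probability p required in constraint (B), with z = x
S_GRID = [-0.5, -0.25, 0.0, 0.25, 0.5]   # coarse grid of candidate values for s_i


def evaluate(x0, s_seq, scenarios):
    """Return the sample-average cost and the smallest sampled Pr(x_i <= 1), i = 1..N."""
    total = 0.0
    ok = [0] * N
    for w_seq in scenarios:
        x, cost = x0, 0.0
        for i in range(N):
            u = -K_FB * x + s_seq[i]          # u_i = u_T(x_i) + s_i
            cost += x * x + 0.1 * u * u
            x = A * x + B * u + w_seq[i]
            ok[i] += (x <= 1.0)
        total += cost + x * x                 # terminal cost V_f(x) = x**2 (assumed)
    return total / len(scenarios), min(ok) / len(scenarios)


def smpc_input(x, scenarios):
    """Step (i): search the grid for the best feasible s; return the first input."""
    feasible = []
    for s_seq in itertools.product(S_GRID, repeat=N):
        cost, fraction = evaluate(x, s_seq, scenarios)
        if fraction >= P_MIN:
            feasible.append((cost, s_seq))
    if not feasible:
        raise RuntimeError("no feasible perturbation sequence on the grid")
    _, best = min(feasible)
    return -K_FB * x + best[0]


def run_closed_loop(x0, n_steps, seed=0, w_bound=0.1, n_scenarios=50):
    """Repeat steps (i) and (ii) along one realization of the disturbance."""
    rng = random.Random(seed)
    scenarios = [[rng.uniform(-w_bound, w_bound) for _ in range(N)]
                 for _ in range(n_scenarios)]
    x, trajectory = x0, [x0]
    for _ in range(n_steps):
        u = smpc_input(x, scenarios)                         # step (i)
        x = A * x + B * u + rng.uniform(-w_bound, w_bound)   # step (ii)
        trajectory.append(x)
    return trajectory
```

Because the perturbations are added to a stabilizing feedback law, the predicted states stay bounded for every candidate on the grid, which is the motivation given above for this parameterization.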


Constraints and Recursive Feasibility 

The constraints in (B) and (C) include hard constraints
($p = 1$) as a special case, but in general
the conditions (A)-(C) represent soft constraints
that are not required to hold for all realizations
of model uncertainty. However, these constraints
can only be satisfied if the state belongs to a
subset of state space, and the requirement (common
in MPC) that the optimization in step (i)
of the SMPC algorithm should remain feasible
if it is initially feasible therefore implies additional
constraints. For example, the condition
$\Pr_x(z_0 \le 1) \ge p$ can be satisfied only if $x$ belongs
to the set for which there exists $u_0$ such
that $\Pr_x(g(x, u_0, v_0) \le 1) \ge p$. Hence, soft constraints
implicitly impose hard constraints on the
model state.

SMPC algorithms typically handle the conditions
relating to feasibility of constraint sets in
one of two ways. Either the SMPC optimization
is allowed to become infeasible (often with penalties
on constraint violations included in the cost
index), or conditions ensuring robust feasibility
of the SMPC optimization at all future times
are imposed as extra constraints in the SMPC
optimization.

The first of these approaches has been used 
in the context of constraints (C) imposed over 
a horizon, for which conditions ensuring future 
feasibility are generally harder to characterize in 
terms of algebraic conditions on the model state 
than (A) or (B). A disadvantage of this approach 
is that the closed-loop system may not satisfy the 
required soft constraints, even if these constraints 
are feasible when applied to system trajectories 
predicted at initial time. 

The second approach treats conditions for feasibility
as hard constraints and hence requires a
guarantee of recursive feasibility, namely, that the
SMPC optimization must remain feasible for the
closed-loop system if it is feasible initially. This
can be achieved by requiring, similarly to RMPC,
that the conditions for feasibility of the SMPC
optimization problem should be satisfied for all
realizations of the sequence $\mathbf{w}$. For example, for
given $x_0 = x$, there exists $\mathbf{u}$ satisfying the
conditions of (B) if

$\Pr_{x_i}(g(x_i, u_i, v_i) \le 1) \ge p, \quad i = 0, 1, \ldots$ (4a)

$x_i \in X \quad \forall \{w_0, \ldots, w_{i-1}\} \in \mathcal{W}^i, \quad i = 1, 2, \ldots$ (4b)

where $X$ is the set

$X = \{x : \exists u \text{ such that } \Pr_x(g(x, u, v) \le 1) \ge p\}$.

Furthermore, an SMPC optimization that
includes the constraints of (4) must remain
feasible at subsequent times (since (4) ensures
the existence of $\mathbf{u}^+$ such that each element of
$\mathbf{x}(f(x, u_0, w_0), \mathbf{u}^+, \mathbf{w}^+)$ lies in $X$ for all $w_0 \in \mathcal{W}$
and all $\mathbf{w}^+ \in \mathcal{W}^N$).

Satisfaction of (4) at each time step $i$ on the
infinite horizon $i \ge N$ can be ensured through
a finite number of constraints by introducing
constraints on the $N$-step-ahead state $x_N$. This
approach uses a fixed feedback law, $u_T(x)$, to
define a postulated input sequence after the initial
$N$-step horizon via $u_i = u_T(x_i)$ for all $i \ge N$.
The constraints of (4) are necessarily satisfied for
all $i \ge N$ if a constraint

$x_N \in X_T$

is imposed, where $X_T$ is robustly positively invariant
with probability 1 under $u_T(x)$, i.e.,

$f(x, u_T(x), w) \in X_T, \quad \forall x \in X_T, \ \forall w \in \mathcal{W}$, (5)

and furthermore the constraint $\Pr_x(z \le 1) \ge p$ is
satisfied at each point in $X_T$ under $u_T(x)$, i.e.,

$\Pr_x(g(x, u_T(x), v) \le 1) \ge p, \quad \forall x \in X_T$.

Although the recursively feasible constraints 
(4) account robustly for the future realizations 
of the unknown parameter w in (1), the key 
difference between SMPC and RMPC is that 
the conditions in (4) depend on the probability 
distribution of the parameter v in (2). It also 
follows from the necessity of hard constraints 
for feasibility that the distribution of w must in 
general have finite support in order that feasibility 
can be guaranteed recursively. On the other hand 
the support of v in the definition of z may be 
unbounded (an important exception being the 
case of state constraints in which v = w). 

Stability and Convergence 

This section outlines the stability properties of
SMPC strategies based on cost indices (a)-(c) of
section “Stochastic MPC” and related variants.
We use $V_N^*(x) = V_N(x, \mathbf{u}^*(x))$ to denote the
optimal value of the SMPC cost index, and $X_T$
denotes a subset of state space satisfying the
robust invariance condition (5). We also denote
the solution at time $i$ of the system (1) with
initial state $x_0 = x$ and under a given feedback
control law $u = \kappa(x)$ and disturbance sequence
$\mathbf{w} = \{w_0, w_1, \ldots\}$ as $x_i(x, \kappa, \mathbf{w})$.

The expected value cost index in (a) results in
mean-square stability of the closed-loop system
provided the terminal term $V_f(x)$ in (3) satisfies

$\mathbb{E}_x V_f(f(x, u_T(x), w)) \le V_f(x) - \|x\|_Q^2 - \|u_T(x)\|_R^2$

for all $x$ in the terminal set $X_T$. The optimal cost
is then a stochastic Lyapunov function satisfying

$\mathbb{E}_x V_N^*(f(x, u_0^*(x), w)) \le V_N^*(x) - \|x\|_Q^2 - \|u_0^*(x)\|_R^2$.

For positive definite $Q$ this implies the closed-loop
system under the SMPC law is mean-square
stable, so that $x_i(x, u_0^*, \mathbf{w}) \to 0$ as $i \to \infty$ with
probability 1 for any feasible initial condition $x$.
For the case of systems (1) subject to additive
disturbances, the modified cost

$V_N(x, \mathbf{u}) := \mathbb{E}_x \left[ \sum_{i=0}^{N-1} \left( \|x_i\|_Q^2 + \|u_i\|_R^2 - l_{ss} \right) + V_f(x_N) \right]$

where $l_{ss} := \lim_{i \to \infty} \mathbb{E}_x\left( \|x_i(x, u_T, \mathbf{w})\|_Q^2 + \|u_i\|_R^2 \right)$
under $u_i = u_T(x_i)$ results in the asymptotic
bound

$\lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \mathbb{E}_x\left( \|x_i(x, u_0^*, \mathbf{w})\|_Q^2 + \|u_i\|_R^2 \right) \le l_{ss}$

along the closed-loop trajectories of (1) under the
SMPC law $u_i = u_0^*(x_i)$, for any feasible initial
condition $x$.

For the worst-case cost (b), if $V_f(x)$ is designed
as a control Lyapunov function for (1),
with

$V_f(f(x, u_T(x), w)) \le V_f(x) - \|x\|_Q^2 - \|u_T(x)\|_R^2$

for all $w \in \mathcal{W}$ and all $x \in X_T$, then $V_N^*(x)$ is a
Lyapunov function satisfying

$V_N^*(f(x, u_0^*(x), w)) \le V_N^*(x) - \|x\|_Q^2 - \|u_0^*(x)\|_R^2$

for all $w \in \mathcal{W}$, implying $x = 0$ is an asymptotically
stable equilibrium of (1) under the SMPC
law $u = u_0^*(x)$. Clearly the system model (1) cannot
be subject to unknown additive disturbances
in this case. However, for the case in which the
system (1) is subject to additive disturbances,
a variant of this approach uses a modified cost
which is equal to zero inside some set of states,
leading to asymptotic stability of this set rather
than an equilibrium point. Also in the context
of additive disturbances, an alternative approach
uses an $H_\infty$-type cost,

$V_N(x, \mathbf{u}) := \max_{\mathbf{w} \in \mathcal{W}^N} \left[ \sum_{i=0}^{N-1} \left( \|x_i\|_Q^2 + \|u_i\|_R^2 - \gamma^2 \|w_i\|^2 \right) + V_f(x_N) \right]$

for which the closed-loop trajectories of (1) under
the associated SMPC law $u_i = u_0^*(x_i)$ satisfy

$\sum_{i=0}^{\infty} \left( \|x_i(x, u_0^*, \mathbf{w})\|_Q^2 + \|u_i\|_R^2 \right) \le \gamma^2 \sum_{i=0}^{\infty} \|w_i\|^2 + V_N^*(x)$

provided $V_f(f(x, u_T(x), w)) \le V_f(x) - \left( \|x\|_Q^2 + \|u_T(x)\|_R^2 - \gamma^2 \|w\|^2 \right)$
for all $w \in \mathcal{W}$ and $x \in X_T$.

Algorithms employing the nominal cost (c)
typically rely on the existence of a feedback law
$u_T(x)$ such that the system (1) satisfies, in the
absence of constraints and under $u_i = u_T(x_i)$,
an input-to-state stability (ISS) condition of the
form

$\sum_{i=0}^{\infty} \left( \|x_i(x, u_T, \mathbf{w})\|_Q^2 + \|u_i\|_R^2 \right) \le \gamma^2 \sum_{i=0}^{\infty} \|w_i\|^2 + \beta$ (6)

for some $\gamma$ and $\beta > 0$. If $V_f(x)$ satisfies

$V_f(f(x, u_T(x), 0)) \le V_f(x) - \left( \|x\|_Q^2 + \|u_T(x)\|_R^2 \right)$

for all $x \in X_T$, then the closed-loop system
under SMPC with the nominal cost (c) satisfies
an ISS condition with the same gain $\gamma$ as the
unconstrained case (6) but a different constant $\beta$.


Implementation Issues 

In general, stochastic MPC algorithms require
more computation than their robust counterparts
because of the need to determine the probability
distributions of future states. An important exception
is the case of linear dynamics and purely
additive disturbances, for which the model (1)-(2)
becomes

$x^+ = Ax + Bu + w$ (7)

$z = Cx + Du + v$ (8)

where $A, B, C, D$ are known matrices. In this
case the expected value constraints (A) and probabilistic
constraints (B), as well as hard constraints
that ensure future feasibility of the SMPC
optimization in each case, can be invoked nonconservatively
through tightened constraints on
the expectations of future states. Furthermore, the
required degree of tightening can be computed
off-line using numerical integration of probability
distributions or using random sampling
techniques, and the online computational load is
similar to MPC with no model uncertainty.
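
One way to picture the off-line tightening for the linear additive case is the following sketch, which uses random sampling. It is an illustration under assumptions rather than the method of any cited reference: the model is a scalar instance of (7)-(8), the predicted state is split into a nominal part and an error $e_i$ driven only by the disturbances ($e_{i+1} = a e_i + w_i$, $e_0 = 0$), and the tightening for step $i$ is taken as the empirical $p$-quantile $q_i$ of $c e_i + v_i$, so that constraint (B) is replaced by the deterministic condition $c \bar{x}_i + d u_i \le 1 - q_i$ on the nominal prediction $\bar{x}_i$. The distributions, parameter values, and names are hypothetical.

```python
import math
import random


def tightening_offsets(horizon, p, n_samples=20000, seed=2,
                       a=0.9, c=1.0, w_bound=0.2, v_std=0.1):
    """Empirical p-quantiles q_i of c*e_i + v_i for i = 0, ..., horizon-1.

    Assumed disturbances: w uniform on [-w_bound, w_bound], v zero-mean
    Gaussian with standard deviation v_std. Computed once, off-line.
    """
    rng = random.Random(seed)
    samples = [[] for _ in range(horizon)]
    for _ in range(n_samples):
        e = 0.0
        for i in range(horizon):
            samples[i].append(c * e + rng.gauss(0.0, v_std))
            e = a * e + rng.uniform(-w_bound, w_bound)
    offsets = []
    for values in samples:
        values.sort()
        k = min(len(values) - 1, int(math.ceil(p * len(values))) - 1)
        offsets.append(values[k])
    return offsets


def tightened_constraint_ok(x_nominal, u, q_i, c=1.0, d=0.0):
    """Deterministic on-line check c*x_nominal + d*u <= 1 - q_i."""
    return c * x_nominal + d * u <= 1.0 - q_i
```

The offsets depend only on the disturbance distributions and the model, not on the measured state, which is why they can be precomputed and why the on-line problem keeps the size of a deterministic MPC problem.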

The case in which the matrices $A, B, C, D$ in
the model (7)-(8) depend on unknown stochastic
parameters is more difficult because the predicted
states then involve products of random variables.
An effective approach to this problem uses a
sequence of sets (known as a tube) to recursively
bound the sequence of predicted states via
one-step-ahead set inclusion conditions. By using
polytopic bounding sets that are defined as the
intersection of a fixed number of half-spaces,
the complexity of these tubes can be controlled
by the designer, albeit at the expense of conservative
inclusion conditions. Furthermore, an
application of Farkas’ Lemma allows these sets
to be computed online through linear conditions
on optimization variables.

Random sampling techniques developed 
for general stochastic programming problems 
provide effective means of handling the soft 
constraints arising in SMPC. These techniques 
use finite sets of discrete samples to represent 
the probability distributions of model states and 
parameters. Furthermore bounds are available 
on the number of samples that are needed 
in order to meet specified confidence levels 
on the satisfaction of constraints. Probabilistic 
and expected value constraints can be imposed 
using random sampling, and this approach has 
also been applied to the case of probabilistic 
constraints over a horizon (C) through a scenario- 
based optimization approach. 

Summary and Future Directions 

This entry describes how the ideas of MPC and
RMPC can be extended to the case of stochastic
model uncertainty. Crucial in this development
is the assumption that the uncertainty has
bounded support, which allows the assertion of
recursive feasibility of the SMPC optimization
problem. For simplicity of presentation we have
considered the case of full-state feedback. However,
stochastic MPC can also be applied to the
output feedback case using a state estimator if
the probability distributions of measurement and
estimation noise are known.

An area of future development is optimization
over sequences of feedback policies. Although
an observer at initial time cannot know the
future realizations of random uncertainty,
information on $x_i$ will be available to the
controller $i$ steps ahead, and, as mentioned in
section “Stochastic MPC” in the context of
feasible initial condition sets, $u_i$ must therefore
depend on $x_i$. In general the optimal control
decision is of the form $u_i = \mu_i(x_i)$ where $\mu_i(\cdot)$ is
a feedback policy. This implies optimization over
arbitrary feedback policies, which is generally
considered to be intractable since the required
online computation grows exponentially with the
horizon $N$. However, approximate approaches to
this problem have been suggested which optimize
over restricted classes of feedback laws, and
further developments in this respect are expected
in the future.

Cross-References 

► Distributed Model Predictive Control 

► Economic Model Predictive Control 

► Nominal Model-Predictive Control 

► Robust Model-Predictive Control 

► Tracking Model Predictive Control 


Recommended Reading 

A historical perspective on SMPC is provided
by Astrom and Wittenmark (1973), Charnes
and Cooper (1963), and Schwarm and Nikolaou
(1999). A treatment of constraints stated in terms
of expected values can be found, for example, in
Primbs and Sung (2009). Probabilistic constraints
and the conditions for recursive feasibility can
be found in Kouvaritakis et al. (2010) for the
additive case, whereas the general case of multiplicative
and additive uncertainty is described in
Evans et al. (2012), which uses random sampling
techniques. Random sampling techniques were
developed for random convex programming
(Calafiore and Campi 2005) and were used in
a scenario-based approach to predictive control
in Calafiore and Fagiano (2013). An output
feedback SMPC strategy incorporating state
estimation is described in Cannon et al. (2012).

The use of the expectation of a quadratic cost
and associated mean-square stability results are
discussed in Lee and Cooley (1998). Robust stability
results for MPC based on worst-case costs
are given by Lee and Yu (1997) and Mayne et al.
(2005). Input-to-state stability of MPC based on
a nominal cost is discussed in Marruedo et al.
(2002).

Descriptions of SMPC based on closed-loop
optimization can be found in Lee and Yu (1997)
and Stoorvogel et al. (2007). These algorithms
are computationally intensive and approximate
solutions can be found by restricting the class of
closed-loop predictions as discussed, for example,
in van Hessem and Bosgra (2002) and Primbs
and Sung (2009).

Abstract

This article covers stock trading from a feedback
control point of view. To this end, the mechanics
and practical considerations associated with the
use of feedback-based algorithms are explained
for both real-world trading and scenarios involving
numerical simulation.

Keywords and Phrases 

Feedback Control; Finance; Model-Free; Stock 
Trading 

Introduction 

Stock trading involves the purchase and sale 
of shares of ownership in public companies by 
an individual or entity such as a pension fund, 
mutual fund, hedge fund, or endowment. These 
shares are typically traded in markets, such as the 
New York Stock Exchange and the NASDAQ, 
with the trader’s goal generally being to increase 
wealth. The words feedback control in the title of 
this article broadly refer to the use of information 
such as prices, profits and losses which becomes 
available to the trader over time and is used to 
make purchase and sales decisions according 
to some set of rules. That is, the size of the 
stock position being held varies with time. The 
mapping from information to the investment 
level is called the feedback law and is typically
described with a closed-loop configuration and
classical algorithms which come from the body
of research called control theory; e.g., see Astrom
and Murray (2008).

For simplicity, in this article, we restrict attention
to trading a single stock while noting that the
concepts described herein are readily modified
to address the multi-stock case, i.e., a portfolio.
To our knowledge, the basic idea of viewing
portfolios in a control-theoretic setting goes back
to Merton (1969) where optimal control concepts
are explicitly used; see also Samuelson (1969)
where a less general formulation is considered.
Whereas the theoretical foundations in their work
rely on idealized assumptions such as “frictionless
markets” and “continuous trading,” the main
objective in this article is to describe the practical
considerations and complexities which arise in
real-world stock trading via feedback control and
associated simulations. That is, the exposition
to follow includes no significant idealizing assumptions
and emphasizes implementation issues
and constraints which are encountered by the
practitioner; i.e., the purpose of this article is to
describe trading mechanics in a feedback context.
Hence, when we define a trading strategy in the
sequel, we include no significant discussion of
performance metrics related to risk and return;
the reader is referred to the book by Luenberger
(1998) for coverage of these topics.

Feedback Versus Open-Loop Control 

We first elaborate on the definitions above by
pointing out the distinction between trading a
stock via feedback control and its alternative,
“open-loop control.” This is done via simple examples:
Suppose an investor buys $1,000 of stock
at time $t = 0$ with the a priori plan to make no
changes in this position until some prespecified
future time $t = T$. Then, this buy-and-hold trading
strategy falls within the realm of open-loop
control. If instead this same investor adds $1,000
to the position every month, then this type of
dollar-cost averaging strategy would still fall into
the open-loop category. That is, in both scenarios,
no information is being used to modify the stock
position over time. Finally, suppose this same
investor makes a $1,000 purchase only at the end
of those months over which the account value
has decreased. Then this type of buy-low investor
is now using a simple feedback control strategy
because gain-loss information is being used to
modify the stock position over time. The ability
of feedback to cope with the uncertainty of future
price movements is an important advantage of its
use in trading.

Closed-Loop Feedback Configuration 

To describe stock trading via feedback control in
a more formal manner, the first step involves the
creation of a closed-loop feedback configuration
involving the trader and the broker; see Fig. 1. In
the figure, the feedback controller resides inside
the block labeled “trader.” There is a wide diversity
of possible algorithms which the trader can
use to modify the investment level over time. In
some cases, a fixed model for future stock prices
is central to the trading algorithm. Oftentimes, no
stock price model is used at all, and trading signals
are generated based on “price patterns.” This
falls under the umbrella “technical analysis” in
its purest form; e.g., see the books by Kirkpatrick
and Dahlquist (2007) and Lo and Hasanhodzic
(2010) for further details. In any event, regardless
of the trading method used, the time-varying
control signal is the investment level $I(t)$.


Discrete Time and Short Selling 

Since this article aims to describe real-world
stock-trading mechanics as opposed to theoretical
results, we work in discrete time. That is, the
initial investment at time $t = 0$ is denoted
by $I_0 = I(0)$, and assuming trade updates can
be performed every $\Delta t$ units of time, $I(t)$ is
replaced by $I(k) = I(k\Delta t)$. We also allow for
the possibility that $I(k) < 0$. In this case, the
trader is called a short seller and the following
is meant: Shares valued at $|I(k)|$ are borrowed
from the broker and immediately sold in the
market in the hope that the price will decline.
If such a decline occurs, the short seller can
“cover” the position and realize a profit by buying
back the stock and returning the borrowed shares
to the broker. Alternatively, if the stock price
increases, the short seller can continue to hold
the position with a “paper loss” or buy back the
borrowed stock at a loss. For the more classical
case when $I(k) > 0$, the trade is said to be
long. Finally, to conclude this section, analogous
to what was done for the investment, we use the
notation $p(k)$, $g(k)$, and $V(k)$ to represent the
stock price, trading gains or losses, and account
value at time $t = k\Delta t$.


First Ingredient: Price Data 

A trading system, be it a simulation or real-money
implementation, involves sequential price
data $p(k)$. This can be obtained either in real time
or can be historical stock market data. As far as
historical data is concerned, there are various recognized
sources that provide end-of-day “closing
prices,” adjusted for splits and dividends. These
can be downloaded for free from Yahoo! Finance.
Another possibility, available from the Wharton
Research Data Services for a subscription fee, is
the comprehensive database of historical prices at
time scales from monthly to tick by tick.

Stock Trading via Feedback Control, Fig. 1 Feedback loop involving
trader and broker (diagram labels: “Broker gathers information
such as price, volume, news”; “Order Routing”; “Investment”;
“Broker passes information and financial accounting reports to
the stock trader”)

It is also possible to conduct stock-trading
simulations using synthetic data. For example,
one of the most common ways that synthetic
prices are generated is via a geometric Brownian
motion process. That is, a process drift $\mu$ and
a volatility $\sigma > 0$, say on an annualized basis,
are provided to the simulator, and prices are
generated sequentially in time via a recursion
such as the Euler scheme with iterates

$p(k + 1) = \left( 1 + \mu \Delta t + \sigma \epsilon(k) \sqrt{\Delta t} \right) p(k)$

where $\Delta t$ is measured in years and $\epsilon(k)$ is a
zero-mean normally distributed random variable
with unit standard deviation. A code used for
simulation of stock trading should also include a
check that $p(k) > 0$. The reader is referred to
the textbook by Oksendal (1998) for a detailed
description of this celebrated stochastic price
model.
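
The Euler recursion above, including the positivity check, can be sketched in a few lines. This is a minimal illustration; the function name, argument names, and the choice to raise an error when a step produces a non-positive price are assumptions made for the example.

```python
import math
import random


def simulate_prices(p0, mu, sigma, dt, n_steps, seed=0):
    """Generate synthetic prices with the Euler scheme
    p(k+1) = (1 + mu*dt + sigma*eps(k)*sqrt(dt)) * p(k),
    where mu and sigma are annualized and dt is measured in years.
    """
    rng = random.Random(seed)
    prices = [p0]
    for _ in range(n_steps):
        eps = rng.gauss(0.0, 1.0)          # zero mean, unit standard deviation
        p_next = (1.0 + mu * dt + sigma * eps * math.sqrt(dt)) * prices[-1]
        if p_next <= 0.0:                  # the check p(k) > 0 mentioned above
            raise ValueError("Euler step produced a non-positive price")
        prices.append(p_next)
    return prices
```

For daily steps one might call `simulate_prices(100.0, 0.05, 0.2, 1.0 / 252, 252)`, taking 252 trading days per year as a common convention. With a small time step a non-positive price is extremely unlikely, but the check guards against large steps or extreme volatility settings.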

Second Ingredient: The Feedback Law 

The second ingredient for trading is the
previously mentioned mapping taking the
information available to the trader to the amount
invested $I(k)$. This feedback law is the “heart” of
the controller and allows it to adapt to uncertain
and changing market conditions. Perhaps the
simplest example of a stock-trading feedback law
is obtained using a classical linear time-invariant
controller. In this case, the trader modulates the
level of investment $I(k)$ in proportion to the
cumulative gains or losses from trading according
to the formula

$I(k) = I_0 + K g(k)$.

This is an example of technical analysis with no
stock price model being used; see Fig. 2.

Using the feedback law above, the trader initially
invests $I(0) = I_0$ in the stock and then
begins to monitor the cumulative gain or loss
$g(k)$ associated with this investment. One begins
with states $g(0) = 0$ and $I(0)$ and subsequently
changes $I(k)$ if the position begins to either make
or lose money depending on the movement of the
stock. The constant of proportionality $K$ above,
the so-called feedback gain, is used to scale
the investment level. When $I_0$ and $K$ are positive,
$I(k)$ is initially positive and the trade is long.
Alternatively, when $I_0$ and $K$ are negative, $I(k)$
is initially negative; hence, the trader is a short
seller. This type of classical linear feedback is an
example of a strategy which falls within the well-known
class of “trend followers.”

As a second example, we consider a long trade
with $I_0, K > 0$ and an investor who wishes to limit
the trade to some level $I_{\max} > I_0$. In this case,
the feedback loop includes a nonlinear saturation
block, see Fig. 3, and the update equation for
investment is

$I(k) = \min\{I_0 + K g(k),\ I_{\max}\}$.

A short-trade version of the above can similarly
be defined and there are also variations of this
scheme, involving the notion of “reset,” which
assures that excessive time is not spent in the
saturation regime when the stock price is falling
after a long period of increase or decrease.

In the formula above and in the sequel, for
simplicity, we allow $I(k)$ to represent a fractional
number of shares. In practice, this type of fractional
holding is only allowed in some restricted
situations such as reinvestment of dividends or
dollar allocations to buy shares of a mutual fund.
However, in cases where a significant number
of shares are being bought or sold, the use of
fractional shares is a good approximation which
can be used for all practical purposes. Finally,
to conclude this section, we mention a subtlety
which is easily overlooked in a simulation: If the
intention of the trader is to be “long,” then $I(k) < 0$
should be ruled out by including the condition
$I(k) = \max\{I(k), 0\}$ as part of the control logic.
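
The linear feedback law, its saturated variant, and the long-only condition can be combined in one small function. This is a sketch for illustration only; the function name and the optional arguments `i_max` and `long_only` are assumptions, not notation from the article.

```python
def investment_level(g, i0, gain, i_max=None, long_only=False):
    """Investment I(k) computed from the cumulative gain or loss g(k).

    Linear law:       I(k) = I0 + K*g(k)
    With saturation:  I(k) = min(I0 + K*g(k), I_max)
    Long-only logic:  I(k) = max(I(k), 0)
    """
    level = i0 + gain * g
    if i_max is not None:
        level = min(level, i_max)     # saturation block for a long trade
    if long_only:
        level = max(level, 0.0)       # rule out I(k) < 0 for a long trader
    return level
```

For instance, with $I_0 = 1000$ and $K = 2$, a cumulative gain of 100 raises the investment to 1200 under the linear law, while a limit of 1100 caps it there and a cumulative loss of 600 drives a long-only investment to zero.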


Stock Trading via Feedback Control, Fig. 2 Stock trading via linear feedback

Stock Trading via Feedback Control, Fig. 3 Feedback loop with saturation

Order-Filling Mechanics

At time $t = k\Delta t$, the trader specifies the desired
investment update to the broker who is
responsible for providing a “fill” via interaction
with the stock exchange. The way this step is
carried out depends on a number of factors: If 
the stock being purchased is not heavily traded, 
there may be “liquidity” issues which manifest 
themselves as “bid-ask spread.” In general, there 
will always exist an ask price and a bid price for 
any stock in the market. To see how a liquidity 
issue can arise, imagine a trader who wishes to 
purchase 100 shares at the ask price of $100 
per share. If there are only 75 shares available 
at $100, the trader will need to pay more for the 
second portion of the purchase. For example, if 
there are 500 shares available with an ask price 
of $102 and transaction costs charged by the 
broker are 5 cents per share, the following will 
occur: The trader will obtain 100 shares with 
two “partial fills” and end up with an average 
acquisition cost of $100.55. This type of bid-ask 
gap scenario may arise for a large trader such as 
a hedge fund. For example, if millions of shares 
are being purchased at time $t = k\Delta t$, the price
of the final shares acquired may be significantly
higher than the initial shares.
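
The arithmetic of the 100-share example can be checked with a short calculation. The function below is a hypothetical helper written for this illustration: it takes the partial fills as (shares, price) pairs and a per-share transaction cost, and returns the average acquisition cost per share.

```python
def average_fill_price(fills, fee_per_share=0.0):
    """Average acquisition cost per share over several partial fills.

    fills: list of (number_of_shares, price_per_share) pairs.
    fee_per_share: transaction cost charged by the broker per share.
    """
    shares = sum(n for n, _ in fills)
    cost = sum(n * price for n, price in fills) + fee_per_share * shares
    return cost / shares


# 75 shares at $100, the remaining 25 at $102, 5 cents per share in costs:
# (75*100 + 25*102 + 0.05*100) / 100 = 100.55
example_price = average_fill_price([(75, 100.0), (25, 102.0)], fee_per_share=0.05)
```

The result, $100.55 per share, matches the average acquisition cost quoted in the text.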

In the case when a stock trades with large daily 
volume, if large “market movers” such as hedge 
funds are not transacting, it can often be assumed 
in simulations that the trader is a price taker. That 
is, one assumes bid-ask spread is zero and trading 
is said to be “highly liquid.” The final point to 
mention is that there are different order types 
which can be specified by the trader. The three 
most common order types are called market,
limit, and stop.

The bottom line on order filling is as follows:
When stock trading is carried out or simulated,
all of the complications above can be handled via
appropriate interpretation of the stock price $p(k)$
at time $t = k\Delta t$. This is accomplished as
follows: When a trade is executed, be it with multiple
transactions or as a special order type, we
take $p(k)$ to be the weighted average price. For
example, to illustrate for a long trade involving
two transactions, suppose a trader arrives at investment
level $I(k)$ via two trades: the first is an
investment $I_a(k)$ to purchase shares at price $p_a(k)$
and the second is an investment $I_b(k)$ to purchase
shares at price $p_b(k)$. Then, the average cost to
acquire these shares is readily calculated to be

$p(k) = \frac{p_a(k)\, p_b(k)}{p_a(k) I_b(k) + p_b(k) I_a(k)}\, \Delta I(k)$

where $\Delta I(k)$ is the amount of the stock transaction
at time $t = k\Delta t$. This quantity is given by

$\Delta I(k) = I(k) - (1 + \rho(k - 1))\, I(k - 1)$

where

$\rho(k - 1) = \frac{p(k) - p(k - 1)}{p(k - 1)}$

is the percentage change in the stock price
from $k - 1$ to $k$. Subsequently, transactions at
later times $t > k\Delta t$ can be carried out as if all
shares were acquired at price $p(k)$.

When this multiple-transaction issue arises in
real trading, it may not be possible to predict in
advance what price $p(k)$ will result. For example,
in the 100-share scenario above, the outcome
depended on the bid-ask queue. Notice that this
did not present a problem as far as gain-loss
accounting is concerned; i.e., the average price
per share $100.55 was readily calculated. However,
when it comes to simulation, a model for
“share acquisition” would need to be assumed.
For example, for the case of geometric Brownian
motion described earlier, a common model is
that the trader is a price taker and that liquidity
is sufficiently high so that an order involving
investment $\Delta I(k)$ is filled at the sample-path
price $p(k)$; i.e., no averaging over multiple transactions
is required.

Gain-Loss Accounting 

A broker generally provides frequent updates on
gains and losses $g(k)$ attributable to stock price
changes. That is,

$g(k + 1) = g(k) + \rho(k) I(k) - T(k)$

where $\rho(k) = \Delta p(k)/p(k)$ is the return on the stock
and $T(k)$ denotes the transaction costs,
most of which consist of the broker’s commission.
These costs are charged for each trade and
are much lower nowadays than they were decades ago.
For example, using a discount broker, one can
easily obtain commission rates of less than $5 per
trade, even when a large number of shares are being
transacted. Modulo the transaction costs, the
equation above simply states that the change in
the cumulative gain or loss $\Delta g(k)$ over a time increment
$\Delta t$ is equal to the investment $I(k)$ multiplied
by the return on the stock $\Delta p(k)/p(k)$.
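
The gain-loss recursion can be combined with the linear feedback law $I(k) = I_0 + K g(k)$ to trace a trade along a price path. The sketch below assumes a price-taker model in which each update is filled at the listed price, and it charges a fixed cost at every step; both are simplifying assumptions, and the function and argument names are illustrative.

```python
def simulate_linear_feedback(prices, i0, gain, cost_per_step=0.0):
    """Trace g(k) and I(k) along a price path under I(k) = I0 + K*g(k).

    Uses g(k+1) = g(k) + rho(k)*I(k) - T(k) with
    rho(k) = (p(k+1) - p(k)) / p(k) and T(k) = cost_per_step (assumed constant).
    """
    g = 0.0
    gains, investments = [g], [i0]
    for p_now, p_next in zip(prices, prices[1:]):
        rho = (p_next - p_now) / p_now
        g = g + rho * investments[-1] - cost_per_step
        gains.append(g)
        investments.append(i0 + gain * g)
    return gains, investments
```

For the price path 100, 110, 121 with $I_0 = 1000$, $K = 1$, and no transaction costs, the cumulative gains are 0, 100, 210 and the investment levels are 1000, 1100, 1210: the position grows as the trade makes money, which is the trend-following behavior of the linear law.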

Interest Accumulation and Margin 
Charges 

In many brokerage accounts, it is possible to borrow
funds or shares from the broker to purchase
or short sell a stock. This is referred to as trading
on margin and the broker will charge an interest
rate on the borrowed funds known as the margin
rate. While in practice there is a limit on how
much money can be borrowed, it can be quite
large; e.g., hedge funds can easily obtain access
to many multiples of their account value. Another
possibility is that the trader is not fully invested
and the account contains “idle cash” on which
interest, paid by the broker, accrues.

To cover both the interest and margin accrual,
we work with the account cash, surplus or shortfall,
to determine whether interest is accrued or
margin charges need to be paid. For a long trade
with $I(k) > 0$ for the period $\Delta t$, we work with
the broker interest rate, often called the risk-free
return, $r_f > 0$, or the broker margin rate $m$ to
obtain the interest accrual

$A(k) = r_f \max\{V(k) - I(k), 0\} + m \min\{V(k) - I(k), 0\}$.

For the case of a short trade with I(k) < 0, 
the formula above will only hold for traders with 
very large accounts who have sufficient leverage 
with the broker so as to be allowed to capitalize 
on the proceeds of a short sale. For the typical 
small- to medium-size trading account, the short-sale
proceeds are generally “held aside” and the
account is “marked to market” on a daily basis.
As a result, the \(A(k)\) equation above needs to be
revised to account for “cash in reserve” and turns
out to provide smaller interest accruals to the
trader.

Finally, the broker’s report generally includes 
the entire value of the account V(k). This number 
is made up of the stock positions, either idle or
borrowed cash, and “dividends” \(D(k)\), which may
be paid periodically to the trader by the company
whose shares are being held. Thus, the broker
performs the calculation

\[V(k + 1) = V(0) + g(k) + A(k) + D(k)\]

and a trader can typically see these updates in real 
time. 
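
The bookkeeping of the last three sections can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a broker's actual procedure: the names `Account`, `interest_accrual`, and `step_account` are hypothetical, annual rates are converted to per-period rates by dividing by an assumed 252 trading days, and each period's gain, accrual, and dividend are simply added to the running account value.

```python
from dataclasses import dataclass

TRADING_DAYS = 252  # assumed number of trading periods per year


@dataclass
class Account:
    value: float        # account value V(k)
    gain: float = 0.0   # cumulative trading gain or loss g(k)


def interest_accrual(value, investment, rf_annual, m_annual):
    """A(k) = rf*max{V - I, 0} + m*min{V - I, 0}, with per-period rates."""
    rf = rf_annual / TRADING_DAYS
    m = m_annual / TRADING_DAYS
    cash = value - investment  # surplus (> 0) or shortfall (< 0)
    return rf * max(cash, 0.0) + m * min(cash, 0.0)


def step_account(acct, investment, price, next_price,
                 rf_annual=0.015, m_annual=0.03, cost=0.0, dividend=0.0):
    """Advance the account by one period for a long position I(k) > 0."""
    ret = (next_price - price) / price      # return on the stock
    delta_gain = ret * investment - cost    # change in g(k), net of T(k)
    accrual = interest_accrual(acct.value, investment, rf_annual, m_annual)
    return Account(value=acct.value + delta_gain + accrual + dividend,
                   gain=acct.gain + delta_gain)


# Usage: a $10,000 account holding $20,000 of stock on margin; price rises 1 %.
acct = step_account(Account(value=10_000.0), investment=20_000.0,
                    price=100.0, next_price=101.0, cost=3.0)
```

In the usage line the account is short $10,000 of cash, so the accrual term is a small margin charge rather than interest income.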

Collateral Requirements and 
Margin Calls 

When formulating the simulation model for trad¬ 
ing, it is important to take account of the fact 
that the size of the trader’s investment I(k) is 
limited by the collateral requirements of the bro¬ 
ker. For example, when a long stock position 
falls dramatically, a trader on margin may find 
that I(k) exceeds the account value V(k) by too 
large an amount to meet the broker’s collateral 
requirements. In this case, new transactions are
“stopped” and a so-called margin call results; i.e.,
to avoid forced liquidation of positions to bring
the account back into compliance, the trader
must deposit new assets or cash into the account
within a short prespecified time period. In simulations,
for a brokerage account with total market
value \(V(k)\), a constraint of the sort

\[|I(k)| \le \gamma V(k)\]

can be imposed with \(\gamma = 2\) being rather typical.

Simulation Example 

We provide a simulation example illustrating the
use of control in stock trading and its ability to
adapt to the inherent uncertainty in stock price
movements. Figure 4 shows the daily closing
prices from January 1, 2008 to June 1, 2012
of Google (GOOG), traded on the NASDAQ
stock exchange. The figure also includes the 50-day
simple moving average \(p_{av}(k)\), which will
be used with a control law whose investment
level depends on sign changes in \(p(k) - p_{av}(k)\);
see Brock et al. (1992) where moving average
crossing strategies are studied. There is no trading
during the first 50 days while the moving average
is being initialized. Subsequently, the trading begins
at the first instant \(k = k^*\) when the moving
average has been crossed. For \(k \ge k^*\), the control
law for the investment level is given by

\[I(k) = I_0\, \mathrm{sign}\{p(k) - p_{av}(k)\}\]

where \(I_0\) = $20,000 is used in the simulation. To
make the example more interesting, we assume
initial account value \(V(0)\) = $10,000. Hence,
the issue of margin is immediately in play. In
the simulation, we use risk-free rate \(r_f = 0.015\)
corresponding to 1.5 % per annum and a margin
rate \(m = 0.03\) corresponding to 3 % per annum.
It is assumed that interest may be obtained on
the proceeds of short sales at the risk-free rate.
Google does not pay a dividend, so no adjustment
of closing prices is required. A transaction cost of
$3 per trade is charged. This charge occurs every
day of trading because the position is adjusted
daily to target \(I(k)\) = ±$20,000. We assume the
broker imposes a collateral constraint of \(|I(k)| \le 2V(k)\)
to limit \(I(k)\) when sufficient funds are
not available. Furthermore, we assume that it is
possible to hold a fractional number of shares and
that a “market-on-close” order each day is filled
at the closing price. Finally, Fig. 4 also shows the
evolution of the account value \(V(k)\) over time.
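
A minimal sketch of such a simulation is given below. It is not the code behind Fig. 4: the price series is a synthetic random walk (hypothetical data, not Google closing prices), the function names are illustrative, the annual rates are divided by an assumed 252 trading days, and "crossing" is taken to mean the first sign change of the difference between the price and its moving average.

```python
import math
import random


def moving_average(prices, k, window):
    """Simple moving average of the `window` prices ending at index k."""
    return sum(prices[k - window + 1:k + 1]) / window


def sign(x):
    return (x > 0) - (x < 0)


def simulate(prices, window=50, i0=20_000.0, v0=10_000.0, gamma=2.0,
             rf=0.015, m=0.03, cost=3.0, periods_per_year=252):
    """Return the account value path V(k) under the crossing rule."""
    value, path = v0, [v0]
    started, prev_sign = False, 0
    for k in range(len(prices) - 1):
        invest = 0.0
        if k >= window - 1:
            s = sign(prices[k] - moving_average(prices, k, window))
            # Trading starts at the first crossing k* (first sign change).
            if not started and prev_sign != 0 and s != 0 and s != prev_sign:
                started = True
            if s != 0:
                prev_sign = s
            if started:
                # Collateral constraint |I(k)| <= gamma * V(k).
                limit = gamma * max(value, 0.0)
                invest = max(-limit, min(limit, i0 * s))
        ret = (prices[k + 1] - prices[k]) / prices[k]
        cash = value - invest  # short-sale proceeds earn rf, as assumed above
        accrual = (rf * max(cash, 0.0) + m * min(cash, 0.0)) / periods_per_year
        trade_cost = cost if invest != 0.0 else 0.0
        value += ret * invest + accrual - trade_cost
        path.append(value)
    return path


# Usage on a synthetic price path (illustrative only).
random.seed(0)
prices = [100.0]
for _ in range(300):
    prices.append(prices[-1] * math.exp(random.gauss(0.0, 0.02)))
path = simulate(prices)
```

Replacing `prices` with actual daily closing prices would give an account value path comparable in spirit to the one plotted in Fig. 4, although the details of order handling here are simplified.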

Summary and Future Directions 

This article concentrated entirely on trading me¬ 
chanics and simulation using strategies based 
on control-theoretic considerations. In a future 
version of the encyclopedia, it would be desirable 
to include a “companion” article which covers 
the topic of performance metrics. That is, once 
trading or simulation is complete, it is natural to 
ask whether the algorithm used was successful or
not. To this end, there is a large body of literature
covering measures for risk and return which are
important for evaluating the performance of a
strategy. One highlight of this literature is the
paper by Artzner et al. (1999) on coherent risk
measures, a topic pursued in current research.

Stock Trading via Feedback Control, Fig. 4 Feedback trading of Google

Recommended Reading

In addition to the basic references cited in the 
previous sections, there is a growing body of 
literature on stock trading and financial markets 
with a control-theoretic flavor. In contrast to this 
article, the focal point in this literature is largely 
performance-related issues rather than the “nuts 
and bolts” of stock-trading mechanics which are 
described here. For the uninitiated reader, one 
starting reference for an overview of the liter¬ 
ature would be the tutorial paper by Barmish 
et al. (2013). To provide a capsule summary, 
it is convenient to subdivide the literature into 
two categories: The first category, called model- 
based approaches, involves an underlying param¬ 
eterized model structure which may or may not 
be completely specified. The second category of 
papers, called model-free approaches, falls under 
the previously mentioned umbrella of technical 
analysis. That is, the stock price is viewed as an 
external input with no predictive model for its 
evolution. In addition, no parameter estimation is 
involved and feedback trade signals are generated 
based on some observed “patterns” of prices or 
trading gains. Thus, this line of research high¬ 
lights the ability of feedback to cope with the 
uncertainty of an unmodelled price process. 
Abstract

This chapter introduces strategic form games, 
which provide a framework for the analysis of 
strategic interactions in multi-agent environ¬ 
ments. We present the main solution concept 
in strategic form games, Nash equilibrium , and 
provide tools for its systematic study. We present 
fundamental results for existence and uniqueness 
of Nash equilibria and discuss their efficiency 
properties. We conclude with current research 
directions in this area. 
Introduction

Many problems in communication, decision, and
technological networks as well as in social and
economic situations depend on human choices,
which are made in anticipation of the behavior
of the others in the system. Examples include 
how to map your drive over a road network, 
how to use the communication medium, and how 
to choose strategies for resource use and more 
conventional economic, financial, and social de¬ 
cisions such as which products to buy, which 
technologies to invest in, or who to trust. The 
defining feature of all of these interactions is 
the dependence of an agent’s objective (payoff, 
utility, or survival) on others’ actions. Game the¬ 
ory focuses on formal analysis of such strategic 
interactions. Here, we will review strategic form 
games, which focus on static game-theoretic in¬ 
teractions and present the relevant solution con¬ 
cept. 

Strategic Form Games 

A strategic form game is a model for a static game
in which all players act simultaneously without
knowledge of other players’ actions.

Definition 1 (Strategic Form Game) A strategic form game is a triplet
\(\langle \mathcal{I}, (S_i)_{i \in \mathcal{I}}, (u_i)_{i \in \mathcal{I}} \rangle\) where:

1. \(\mathcal{I}\) is a finite set of players, \(\mathcal{I} = \{1, \ldots, I\}\).

2. \(S_i\) is a nonempty set of available actions for
player \(i\).

3. \(u_i : S \to \mathbb{R}\) is the utility (payoff) function of
player \(i\), where \(S = \prod_{i \in \mathcal{I}} S_i\).

We will use the terms action and (pure) strategy
interchangeably. (We will later use the term
“mixed strategy” to refer to randomizations over
actions.) We denote by \(s_i \in S_i\) an action for
player \(i\), and by \(s_{-i} = [s_j]_{j \ne i}\) a vector of actions
for all players except \(i\). We refer to the tuple
\((s_i, s_{-i}) \in S\) as an action (strategy) profile or outcome.
We also denote by \(S_{-i} = \prod_{j \ne i} S_j\) the set
of actions (strategies) of all players except \(i\). Our
convention throughout will be that each player \(i\)
is interested in action profiles that “maximize” his
utility function \(u_i\).

The next two examples illustrate strategic
form games with finite and infinite strategy sets.

Example 1 (Finite Strategy Sets) We consider a
two-player game with finite strategy sets. Such a
game can be represented in matrix form, where
the rows correspond to the actions of player 1 and
columns represent the actions of player 2. The
cell indexed by row \(x\) and column \(y\) contains a
pair \((a, b)\), where \(a\) is the payoff to player 1 and
\(b\) is the payoff to player 2, i.e., \(a = u_1(x, y)\) and
\(b = u_2(x, y)\). This class of games is sometimes
referred to as bimatrix games. For example, consider
the following game of “Matching Pennies.”

          Heads    Tails
Heads     -1, 1    1, -1
Tails     1, -1    -1, 1

Matching Pennies

This game represents “pure conflict” in the
sense that one player’s utility is the negative of
the utility of the other player, i.e., the sum of
the utilities for both players at each outcome is
“zero.” This class of games is referred to as zero-sum
games (or constant-sum games) and has been
studied extensively in the game theory literature
(Basar and Olsder 1995).

Example 2 (Infinite Strategy Sets) We next
present a game with infinite strategy sets. We
consider a simple network game where two
players send data or information flows over a
communication network represented by a single
link. Each player \(i\) derives a value for sending \(s_i\)
units of flow over the link given by

\[v_i(s_i) = a_i s_i - \tfrac{1}{2} s_i^2,\]

where \(a_i \in [0, 1]\) is a player-specific scalar. Each
player also incurs a per-flow delay or latency cost,
due to congestion on the link, represented by the
function \(l(s) = s\), where \(s\) is the total flow on the
link, i.e., \(s = s_1 + s_2\) (see Fig. 1). The resulting
interactions can be represented by the strategic
form game \(\langle \mathcal{I}, (S_i), (u_i) \rangle\), which consists of:

1. A set of two players, \(\mathcal{I} = \{1, 2\}\)

2. A strategy set \(S_i = [0, 1]\) for each player \(i\),
where \(s_i \in S_i\) represents the amount of flow
player \(i\) sends over the link

3. A utility function \(u_i\) for each player \(i\) given
by value derived from sending \(s_i\) units of flow
minus the total latency cost, i.e.,

\[u_i(s_1, s_2) = v_i(s_i) - s_i\, l(s_1 + s_2).\]

Strategic Form Games and Nash Equilibrium, Fig. 1
A network game with two players

Nash Equilibrium

We next introduce the fundamental solution concept
for strategic form games, Nash equilibrium.
A Nash equilibrium captures a steady state of
the play in a strategic form game such that each
player acts optimally given their “correct” conjectures
about the behavior of the other players.

Definition 2 (Nash Equilibrium) A (pure strategy)
Nash equilibrium of a strategic form game
\(\langle \mathcal{I}, (S_i), (u_i)_{i \in \mathcal{I}} \rangle\) is a strategy profile \(s^* \in S\) such
that for all \(i \in \mathcal{I}\), we have

\[u_i(s_i^*, s_{-i}^*) \ge u_i(s_i, s_{-i}^*) \quad \text{for all } s_i \in S_i.\]

Hence, a Nash equilibrium is a strategy profile
\(s^*\) such that no player \(i\) can profit by unilaterally
deviating from his strategy \(s_i^*\), assuming every
other player \(j\) follows his strategy \(s_j^*\). The definition
of a Nash equilibrium can be restated in
terms of best-response correspondences.

Definition 3 (Nash Equilibrium - Restated)
Let \(\langle \mathcal{I}, (S_i), (u_i)_{i \in \mathcal{I}} \rangle\) be a strategic form game.
For any \(s_{-i} \in S_{-i}\), consider the best-response
correspondence of player \(i\), \(B_i(s_{-i})\), given by

\[B_i(s_{-i}) = \{ s_i \in S_i \mid u_i(s_i, s_{-i}) \ge u_i(s_i', s_{-i}) \ \text{for all } s_i' \in S_i \}.\]

We say that an action profile \(s^*\) is a Nash
equilibrium if

\[s_i^* \in B_i(s_{-i}^*) \quad \text{for all } i \in \mathcal{I}.\]

Thus, if we define the best-response correspondence
\(B(s) = [B_i(s_{-i})]_{i \in \mathcal{I}}\), the set of Nash
equilibria is given by the set of fixed points of
\(B(s)\). Below, we give two examples of games
with pure strategy Nash equilibria.

Example 3 (Battle of the Sexes) Consider a two-player
game with the following payoff structure:

          Ballet   Soccer
Ballet    2, 1     0, 0
Soccer    0, 0     1, 2

Battle of the Sexes

This game, referred to as the Battle of the
Sexes game, represents a scenario in which the
two players wish to coordinate their actions but
have different preferences over their actions.
This game has two pure strategy Nash equilibria,
i.e., the strategy profiles (Ballet, Ballet) and
(Soccer, Soccer).

Example 4 Recall the network game given in
Example 2. To simplify the computations, let us
assume without loss of generality that \(a_1 \ge a_2 \ge \tfrac{1}{3}\).
It can be seen that the best-response functions
(single-valued in this case) of the players are
given by

\[B_i(s_{-i}) = \max\left\{0, \frac{a_i - s_{-i}}{3}\right\} \quad \text{for } i = 1, 2.\]

The unique pure strategy Nash equilibrium of this
game is the fixed point of these functions given by

\[s^* = \left( \frac{3a_1 - a_2}{8}, \frac{3a_2 - a_1}{8} \right).\]

Mixed Strategy Nash Equilibrium

Consider the two-player “penalty kick” game
between a penalty taker and a goalkeeper that
has the same payoff structure as the matching
pennies:

          Left     Right
Left      1, -1    -1, 1
Right     -1, 1    1, -1

Penalty kick game

This game does not have a pure strategy Nash
equilibrium. It can be verified that if the penalty
taker (column player) commits to a pure strategy,
e.g., chooses left, then the best response of
the goalkeeper (row player) would be to choose
the same side leading to a payoff of -1 for the
penalty taker. In fact, the penalty taker would be
better off following a strategy which randomizes
between left and right, ensuring that the goalkeeper
cannot perfectly match his action. This
is the idea of “randomized” or mixed strategies
which we will discuss next.
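
The claims about pure strategy equilibria in small bimatrix games can be checked by brute force. The sketch below is an illustration only; the function name `pure_nash_equilibria` and the list-of-pairs encoding of the payoff matrices are choices made here, not notation from the article. It tests every cell against all unilateral deviations, exactly as in the definition of a pure strategy Nash equilibrium.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """List pure-strategy Nash equilibria of a two-player game.

    payoffs[r][c] is the pair (row player's payoff, column player's payoff).
    """
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        u_row, u_col = payoffs[r][c]
        # No profitable unilateral deviation for the row player ...
        row_ok = all(payoffs[r2][c][0] <= u_row for r2 in range(n_rows))
        # ... and none for the column player.
        col_ok = all(payoffs[r][c2][1] <= u_col for c2 in range(n_cols))
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


penalty_kick = [[(1, -1), (-1, 1)],
                [(-1, 1), (1, -1)]]
battle_of_sexes = [[(2, 1), (0, 0)],
                   [(0, 0), (1, 2)]]

no_pure = pure_nash_equilibria(penalty_kick)       # no cell survives
two_pure = pure_nash_equilibria(battle_of_sexes)   # the two coordinated cells
```

The enumeration costs one pass over the cells per player, which is adequate for small games but does not address mixed strategies.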

We first introduce some notation. Let \(\Sigma_i\) denote
the set of probability measures over the pure
strategy (action) set \(S_i\). We use \(\sigma_i \in \Sigma_i\) to denote
the mixed strategy of player \(i\). When \(S_i\) is a
finite set, a mixed strategy is a finite-dimensional
probability vector, i.e., a vector whose elements
denote the probability with which a particular
action will be played. For example, if \(S_i\) has two
elements, the set of mixed strategies \(\Sigma_i\) is the
one-dimensional probability simplex, i.e.,
\(\Sigma_i = \{(x_1, x_2) \mid x_i \ge 0,\ x_1 + x_2 = 1\}\). We use
\(\sigma \in \Sigma = \prod_{i \in \mathcal{I}} \Sigma_i\) to denote a mixed strategy
profile. Note that this implicitly assumes that
players randomize independently. We similarly
denote \(\sigma_{-i} \in \Sigma_{-i} = \prod_{j \ne i} \Sigma_j\).

Following von Neumann-Morgenstern
expected utility theory, we extend the payoff
functions \(u_i\) from \(S\) to \(\Sigma\) by

\[u_i(\sigma) = \int_S u_i(s)\, d\sigma(s),\]

i.e., the payoff of a mixed strategy \(\sigma\) is given by
the expected value of pure strategy payoffs under
the distribution \(\sigma\).

We are now ready to define the mixed strategy 
Nash equilibrium. 

Definition 4 (Mixed Strategy Nash Equilibrium)
A mixed strategy profile \(\sigma^*\) is a mixed
strategy Nash equilibrium if for each player \(i\),

\[u_i(\sigma_i^*, \sigma_{-i}^*) \ge u_i(\sigma_i, \sigma_{-i}^*) \quad \text{for all } \sigma_i \in \Sigma_i.\]

Note that since \(u_i(\sigma_i, \sigma_{-i}^*) = \int_{S_i} u_i(s_i, \sigma_{-i}^*)\, d\sigma_i(s_i)\),
it is sufficient to check only pure strategy
“deviations” when determining whether a given
profile is a Nash equilibrium. This leads to the
following characterization of a mixed strategy
Nash equilibrium.

Proposition 1 A mixed strategy profile \(\sigma^*\) is a
mixed strategy Nash equilibrium if and only if for
each player \(i\),

\[u_i(\sigma_i^*, \sigma_{-i}^*) \ge u_i(s_i, \sigma_{-i}^*) \quad \text{for all } s_i \in S_i.\]
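
Proposition 1 translates directly into a finite test for two-player games with finite strategy sets. The following sketch is illustrative (the names `expected_payoffs` and `is_mixed_nash` and the numerical tolerance are assumptions made here): it computes each player's expected payoff under a candidate profile and compares it with the payoff of every pure strategy deviation.

```python
def expected_payoffs(payoffs, p, q):
    """Expected payoffs (row, column) when rows follow p and columns follow q."""
    u_row = sum(p[r] * q[c] * payoffs[r][c][0]
                for r in range(len(p)) for c in range(len(q)))
    u_col = sum(p[r] * q[c] * payoffs[r][c][1]
                for r in range(len(p)) for c in range(len(q)))
    return u_row, u_col


def is_mixed_nash(payoffs, p, q, tol=1e-9):
    """True if no pure-strategy deviation beats the profile (p, q)."""
    u_row, u_col = expected_payoffs(payoffs, p, q)
    for r in range(len(p)):
        deviation = sum(q[c] * payoffs[r][c][0] for c in range(len(q)))
        if deviation > u_row + tol:
            return False
    for c in range(len(q)):
        deviation = sum(p[r] * payoffs[r][c][1] for r in range(len(p)))
        if deviation > u_col + tol:
            return False
    return True


penalty_kick = [[(1, -1), (-1, 1)],
                [(-1, 1), (1, -1)]]
uniform_is_ne = is_mixed_nash(penalty_kick, [0.5, 0.5], [0.5, 0.5])
pure_is_ne = is_mixed_nash(penalty_kick, [1.0, 0.0], [1.0, 0.0])
```

For the penalty kick game the uniform profile passes the test, while the pure profile (Left, Left) fails because the column player gains by switching sides.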


We also have the following useful character¬ 
ization of a mixed strategy Nash equilibrium in 
finite strategy set games. 

Proposition 2 Let \(G = \langle \mathcal{I}, (S_i)_{i \in \mathcal{I}}, (u_i)_{i \in \mathcal{I}} \rangle\) be
a strategic form game with finite strategy sets.
Then, \(\sigma^* \in \Sigma\) is a Nash equilibrium if and only if
for each player \(i \in \mathcal{I}\), every pure strategy in the
support of \(\sigma_i^*\) is a best response to \(\sigma_{-i}^*\).

Proof Let \(\sigma^*\) be a mixed strategy Nash equilibrium,
and let \(E_i^* = u_i(\sigma_i^*, \sigma_{-i}^*)\) denote the
expected utility for player \(i\). By Proposition 1, we
have

\[E_i^* \ge u_i(s_i, \sigma_{-i}^*) \quad \text{for all } s_i \in S_i.\]

We first show that \(E_i^* = u_i(s_i, \sigma_{-i}^*)\) for all \(s_i\) in
the support of \(\sigma_i^*\) (combined with the preceding
relation, this proves one implication). Assume to
arrive at a contradiction that this is not the case,
i.e., there exists an action \(s_i'\) in the support of \(\sigma_i^*\)
such that \(u_i(s_i', \sigma_{-i}^*) < E_i^*\). Since \(u_i(s_i, \sigma_{-i}^*) \le E_i^*\)
for all \(s_i \in S_i\), this implies that

\[\sum_{s_i \in S_i} \sigma_i^*(s_i)\, u_i(s_i, \sigma_{-i}^*) < E_i^*,\]

which is a contradiction, since the left-hand side
equals \(E_i^*\) by definition. The proof of the other
implication is similar and is therefore omitted.

It follows from this characterization that every 
action in the support of any player’s equilib¬ 
rium mixed strategy yields the same payoff. This 
characterization extends to games with infinite
strategy sets: \(\sigma^* \in \Sigma\) is a Nash equilibrium if and
only if for each player \(i \in \mathcal{I}\), given \(\sigma_{-i}^*\), no action
in \(S_i\) yields a payoff that exceeds his equilibrium
payoff, and the set of actions that yields a payoff
less than his equilibrium payoff has \(\sigma_i^*\)-measure
zero.

Example 5 Let us return to the Battle of the
Sexes game.

          Ballet   Soccer
Ballet    2, 1     0, 0
Soccer    0, 0     1, 2

Battle of the Sexes

Recall that this game has 2 pure strategy Nash
equilibria. Using the characterization result in
Proposition 2, we show that it has a unique
mixed strategy Nash equilibrium (which is not a
pure strategy Nash equilibrium). First, by using
Proposition 2 (and inspecting the payoffs), it
can be seen that there are no Nash equilibria
where only one of the players randomizes over
its actions. Now, assume instead that player 1
chooses the action Ballet with probability \(p \in (0, 1)\)
and Soccer with probability \(1 - p\) and
that player 2 chooses Ballet with probability
\(q \in (0, 1)\) and Soccer with probability \(1 - q\).
Using Proposition 2 on player 1’s payoffs, we
have the following relation:

\[2 \times q + 0 \times (1 - q) = 0 \times q + 1 \times (1 - q).\]

Similarly, we have

\[1 \times p + 0 \times (1 - p) = 0 \times p + 2 \times (1 - p).\]

We conclude that the only possible mixed strategy
Nash equilibrium is given by \(q = \tfrac{1}{3}\) and
\(p = \tfrac{2}{3}\).
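
The two indifference relations of Example 5 can be solved mechanically for any 2x2 game. The sketch below is a minimal illustration under the assumption that a fully mixed equilibrium exists; the function name `indifference_mix` is hypothetical, and exact rational arithmetic is used so that the result can be compared with the values derived by hand.

```python
from fractions import Fraction


def indifference_mix(payoffs):
    """Candidate fully mixed equilibrium (p, q) of a 2x2 game.

    p = probability that the row player picks row 0,
    q = probability that the column player picks column 0.
    Raises ZeroDivisionError if an indifference equation is degenerate;
    the caller should also confirm that p and q lie strictly in (0, 1).
    """
    a = [[Fraction(payoffs[r][c][0]) for c in range(2)] for r in range(2)]
    b = [[Fraction(payoffs[r][c][1]) for c in range(2)] for r in range(2)]
    # Row player indifferent: a00*q + a01*(1-q) = a10*q + a11*(1-q).
    q = (a[1][1] - a[0][1]) / (a[0][0] - a[0][1] - a[1][0] + a[1][1])
    # Column player indifferent: b00*p + b10*(1-p) = b01*p + b11*(1-p).
    p = (b[1][1] - b[1][0]) / (b[0][0] - b[1][0] - b[0][1] + b[1][1])
    return p, q


battle_of_sexes = [[(2, 1), (0, 0)],
                   [(0, 0), (1, 2)]]
p, q = indifference_mix(battle_of_sexes)
```

Each player's mixing probability is pinned down by the other player's payoffs, which is why `q` is computed from the row player's matrix and `p` from the column player's.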

Existence of Nash Equilibrium 

The first question that one contemplates in ana¬ 
lyzing a strategic form game is whether it has a 
pure or mixed strategy Nash equilibrium. While 
it may be possible to explicitly construct a Nash 
equilibrium (using either computational means or 
characterization results), this may be a tedious 
task for large finite strategy
set games or infinite strategy set games with
complicated utility functions. One is therefore often
interested in establishing existence of an equilibrium,
using conditions on the utility functions
and constraint sets, before trying to understand 
its properties. In the sequel, we present results 
on existence of an equilibrium for games with 
finite and infinite strategy sets. The proofs of 
such existence results typically use fixed point 
arguments on the best-response correspondences 
of the players. They are omitted here and can be 
found in graduate-level game theory textbooks
(see Fudenberg and Tirole 1991 and Myerson 
1991). 
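
The fixed-point view of equilibrium can be made concrete on the two-player network game of Examples 2 and 4. The sketch below iterates the best-response map and compares the limit with the closed-form equilibrium \(((3a_1 - a_2)/8, (3a_2 - a_1)/8)\). It assumes the value function \(v_i(s_i) = a_i s_i - s_i^2/2\), for which the best response is \(\max\{0, (a_i - s_{-i})/3\}\) clipped to \([0, 1]\); the function names and the parameter values \(a = (0.9, 0.6)\) are illustrative choices.

```python
def best_response(a_i, s_other):
    """Maximizer of a_i*s - s**2/2 - s*(s + s_other) over s in [0, 1]."""
    return min(1.0, max(0.0, (a_i - s_other) / 3.0))


def iterate_best_responses(a1, a2, steps=100):
    """Repeatedly apply both players' best responses, starting from (0, 0)."""
    s1, s2 = 0.0, 0.0
    for _ in range(steps):
        # Simultaneous update: each player responds to the other's last flow.
        s1, s2 = best_response(a1, s2), best_response(a2, s1)
    return s1, s2


a1, a2 = 0.9, 0.6
s1, s2 = iterate_best_responses(a1, a2)
closed_form = ((3 * a1 - a2) / 8, (3 * a2 - a1) / 8)
```

The iteration converges here because each best response changes by only one third of the change in the opponent's flow; best-response iteration is not guaranteed to converge in general games, which is one reason existence proofs rely on fixed-point theorems instead.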

Finite Strategy Set Games 

We have seen that while the matching pennies 
game (and the penalty kick game with the same 
payoff structure) does not have a pure strategy 
Nash equilibrium, it has a mixed strategy Nash 
equilibrium. The next theorem states that this
existence result extends to all finite strategy set 
games. 

Theorem 1 (Nash) Every strategic form game 
with finite strategy sets has a mixed strategy Nash 
equilibrium. 

Infinite Strategy Set Games 

A stronger result on existence of a pure strategy 
Nash equilibrium can be established in infinite 
strategy set games under some topological con¬ 
ditions on the utility functions and constraint 
sets (see Debreu 1952, Fan 1952, and Glicksberg 
1952). 

Theorem 2 (Debreu, Fan, Glicksberg) Consider
a strategic form game \(\langle \mathcal{I}, (S_i)_{i \in \mathcal{I}}, (u_i)_{i \in \mathcal{I}} \rangle\)
with infinite strategy sets such that for each
\(i \in \mathcal{I}\):

1. \(S_i\) is convex and compact.

2. \(u_i(s_i, s_{-i})\) is continuous in \(s_{-i}\).

3. \(u_i(s_i, s_{-i})\) is continuous and quasiconcave in
\(s_i\). (Let \(X\) be a convex set. A function \(f : X \to \mathbb{R}\)
is quasiconcave if every upper level set of
the function, i.e., \(\{x \in X \mid f(x) \ge \alpha\}\) for
every scalar \(\alpha\), is a convex set (see Bertsekas
et al. 2003).)

The game has a pure strategy Nash equilibrium.


Note that Theorem 1 is a special case of this
result. For games with finite strategy sets, mixed 
strategy sets are simplices and hence are convex 
and compact, and utilities are linear in (mixed) 
strategies; hence, they are concave functions of 
(mixed) strategies (and continuous functions of 
mixed strategy profiles). 

The next example shows that quasiconcavity
cannot be dispensed with in the previous existence
result.

Example 6 Consider the game where two players
pick a location \(s_1, s_2 \in \mathbb{R}^2\) on the circle. The
payoffs are

\[u_1(s_1, s_2) = -u_2(s_1, s_2) = d(s_1, s_2),\]

where \(d(s_1, s_2)\) denotes the Euclidean distance
between \(s_1\) and \(s_2 \in \mathbb{R}^2\). It can be verified that
this game does not have a pure strategy Nash
equilibrium. However, the strategy profile where
both players mix uniformly on the circle is a
mixed strategy Nash equilibrium.

Without quasiconcavity, one can establish the
following existence result (see Glicksberg 1952).

Theorem 3 (Glicksberg) Consider a strategic
form game \(\langle \mathcal{I}, (S_i)_{i \in \mathcal{I}}, (u_i)_{i \in \mathcal{I}} \rangle\), where the \(S_i\)
are nonempty compact metric spaces and the
\(u_i : S \to \mathbb{R}\) are continuous functions. The game
has a mixed strategy Nash equilibrium.

Uniqueness of Nash Equilibrium

Another important question that arises in the
analysis of strategic form games is whether the
Nash equilibrium is unique. This is important for
the predictive power of Nash equilibrium since
with multiple equilibria, the outcome of the game
cannot be uniquely pinned down. The following
result by Rosen provides sufficient conditions
for uniqueness of an equilibrium in games with
infinite strategy sets (see Rosen 1965). (Except
for games that are strictly dominant solvable,
there are no general uniqueness results for finite
strategic form games.)

We first introduce some notation to state this
result. Given a scalar-valued function \(f : \mathbb{R}^n \to \mathbb{R}\),
we use the notation \(\nabla f(x)\) to denote the
gradient vector of \(f\) at point \(x\), i.e.,

\[\nabla f(x) = \left[ \frac{\partial f(x)}{\partial x_1}, \ldots, \frac{\partial f(x)}{\partial x_n} \right]^T.\]

Given a scalar-valued function \(F : \prod_{i=1}^{I} \mathbb{R}^{m_i} \to \mathbb{R}\),
we use the notation \(\nabla_i F(x)\) to denote the
gradient vector of \(F\) with respect to \(x_i\) at point
\(x\), i.e.,

\[\nabla_i F(x) = \left[ \frac{\partial F(x)}{\partial x_i^1}, \ldots, \frac{\partial F(x)}{\partial x_i^{m_i}} \right]^T.\]

We use the notation \(\nabla u(x)\) to denote

\[\nabla u(x) = \left[ \nabla_1 u_1(x), \ldots, \nabla_I u_I(x) \right]^T. \tag{1}\]

We assume that the strategy set \(S_i\) of each
player \(i\) is given by

\[S_i = \{ x_i \in \mathbb{R}^{m_i} \mid h_i(x_i) \ge 0 \}, \tag{2}\]

where \(h_i : \mathbb{R}^{m_i} \to \mathbb{R}\) is a concave function.
(Since \(h_i\) is concave, it follows that the set \(S_i\) is
a convex set.) The next definition introduces the
key condition used in establishing the uniqueness
of a pure strategy Nash equilibrium.

Definition 5 We say that the utility functions
\((u_1, \ldots, u_I)\) are diagonally strictly concave for
\(x \in S\), if for every pair of distinct points \(x^*, \bar{x} \in S\), we have

\[(\bar{x} - x^*)^T \nabla u(x^*) + (x^* - \bar{x})^T \nabla u(\bar{x}) > 0.\]

We can now state the result on uniqueness of
pure strategy Nash equilibrium in strategic form
games.

Theorem 4 (Rosen) Consider a strategic form
game \(\langle \mathcal{I}, (S_i), (u_i) \rangle\). For all \(i \in \mathcal{I}\), assume that
the strategy sets \(S_i\) are given by Eq. (2), where
\(h_i\) is a concave function, and there exists some
\(\tilde{x}_i \in \mathbb{R}^{m_i}\) such that \(h_i(\tilde{x}_i) > 0\). Assume also that
the utility functions \((u_1, \ldots, u_I)\) are diagonally
strictly concave for \(x \in S\). Then, the game has
a unique pure strategy Nash equilibrium.

We next provide a tractable sufficient condition
for the utility functions to be diagonally
strictly concave. Let \(U(x)\) denote the Jacobian of
\(\nabla u(x)\) [see Eq. (1)]. Specifically, if the \(x_i\) are all
1-dimensional, then \(U(x)\) is given by

\[U(x) = \begin{pmatrix} \dfrac{\partial^2 u_1(x)}{\partial x_1^2} & \dfrac{\partial^2 u_1(x)}{\partial x_1 \partial x_2} & \cdots \\ \dfrac{\partial^2 u_2(x)}{\partial x_2 \partial x_1} & \ddots & \\ \vdots & & \end{pmatrix}.\]

Proposition 3 (Rosen) For all \(i \in \mathcal{I}\), assume
that the strategy sets \(S_i\) are given by Eq. (2),
where \(h_i\) is a concave function. Assume that the
symmetric matrix \((U(x) + U^T(x))\) is negative
definite for all \(x \in S\), i.e., for all \(x \in S\), we
have

\[y^T (U(x) + U^T(x))\, y < 0 \quad \text{for all } y \ne 0.\]

Then, the payoff functions \((u_1, \ldots, u_I)\) are diagonally
strictly concave for \(x \in S\).

Rosen’s sufficient conditions for uniqueness
are quite strong. Recent work has extended such
uniqueness results to hold under weaker conditions
using differential topology tools. The main
idea is to provide sufficient conditions so that
the indices of all stationary points can be shown
to be positive, which from a generalization of
the Poincaré-Hopf theorem (Simsek et al. 2007,
2008) implies that there exists a unique equilibrium
(see Simsek et al. 2005 for applications of
this methodology to several network games).

Efficiency of Nash Equilibria

Because the Nash equilibrium corresponds to the
fixed point of the best-response correspondences
of the players, there is no presumption that it is
efficient or maximizes any well-defined weighted
sum of utility functions of the players. This fact
is clearly illustrated by the well-known Prisoner’s
Dilemma game. For some \(a > 0\), \(b > 0\), and
\(c > 0\) with \(a > b\), the payoff matrix is given by:

                 Don’t Confess    Confess
Don’t Confess    a, a             b - c, a + c
Confess          a + c, b - c     b, b

Prisoner’s Dilemma

This game, generally used for capturing the
dilemma of cooperation among selfish agents,
has a unique (pure strategy) Nash equilibrium,
(Confess, Confess). (In fact, Confess is a dominant
strategy for each player; see
Fudenberg and Tirole 1991.) This clearly illustrates two aspects
of the inefficiencies that arise in Nash equilibria.
First, the unique Nash equilibrium is Pareto
inferior, meaning that if both players cooperated
and chose Don’t Confess, they would both
obtain the higher payoff of \(a\). Second, the extent
of inefficiency can be arbitrarily large based on
the values of \(a\) and \(b\). We can capture this by the
efficiency loss (or Price of Anarchy as known in
the literature) defined as

\[\text{efficiency loss} = \inf_{\text{parameters}} \frac{\sum_i u_i(\text{equilibrium})}{\sum_i u_i(\text{social optimum})},\]

where the social optimum is the strategy profile
that maximizes the sum of utility functions. In the
preceding example, this is clearly

\[\inf_{a, b} \frac{b}{a} = 0,\]

showing that efficiency loss can be arbitrarily
large. In problems that have more structure, the
efficiency loss can be bounded away from zero. A
well-known example is due to Pigou, who showed
that in a network routing game where the congestion
penalty can be described by linear latency
functions (see Example 2), the efficiency loss
is 3/4 (Pigou 1920). Roughgarden and Tardos
in an important contribution (Roughgarden and
Tardos 2000) showed that this is a lower bound
for such routing games over all possible network
topologies.
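
Both efficiency-loss figures quoted above can be reproduced with a short calculation. The sketch below is illustrative: the function names are hypothetical, and the routing instance is the standard textbook form of Pigou's example (one unit of traffic, two parallel links with latencies \(l_1(x) = 1\) and \(l_2(x) = x\)), which is an assumption since the article does not spell out the network. Because the routing game is stated in terms of costs, the ratio there is taken as optimal cost over equilibrium cost.

```python
from fractions import Fraction


def prisoners_dilemma_ratio(a, b):
    """Sum of utilities at (Confess, Confess) over the sum at the social optimum.

    Requires a > b > 0: the equilibrium yields 2b, the optimum 2a.
    """
    return (2 * Fraction(b)) / (2 * Fraction(a))


def pigou_costs(grid=1000):
    """Total travel cost at equilibrium and at the optimum in Pigou's example."""
    def total_cost(x2):
        # x2 is the flow on the link with latency l2(x) = x;
        # the remaining flow 1 - x2 uses the link with constant latency 1.
        return (1 - x2) * 1 + x2 * x2

    # At equilibrium all traffic uses link 2, which is never slower than link 1.
    equilibrium = total_cost(Fraction(1))
    # The optimum is found by searching a grid of splits between the links.
    optimum = min(total_cost(Fraction(k, grid)) for k in range(grid + 1))
    return equilibrium, optimum


pd_ratio = prisoners_dilemma_ratio(a=10, b=1)
eq_cost, opt_cost = pigou_costs()
pigou_ratio = opt_cost / eq_cost
```

Letting `b` shrink relative to `a` drives the Prisoner's Dilemma ratio toward zero, whereas the routing ratio stays at 3/4 for this instance.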

Summary and Future Directions 

This article has provided an introduction to the 
basics of strategic form games. After defining 
the concept of Nash equilibrium, which is the 
basis of much of recent game theory, we have 
presented fundamental results on its existence
and uniqueness. We also briefly discussed issues 
of efficiency of Nash equilibria. 

Though game theory is a mature field, there 
are still several important areas for inquiry. The 
first is a more systematic analysis and catego¬ 
rization of classes of games by their equilibrium 
and efficiency properties. Recent work by Candogan
et al. (2010, 2011, 2013) provides tools
for systematically analyzing equivalence classes 
of games that may be useful for such an investi¬ 
gation. The second area that is very much active 
concerns computational issues, which we have 
not considered here. Recent literature showed 
that computation of Nash equilibria in finite strat¬ 
egy set games is potentially hard and focused 
on developing algorithms for computing approx¬ 
imate Nash equilibria (see Daskalakis et al. 2006 
and Lipton et al. 2003). Ongoing research in 
this area focuses on infinite strategy set games 
and exploits special structure to develop algo¬ 
rithms for computing (exact and approximate) 
Nash equilibria (Parrilo 2006; Stein et al. 2008). 
A third area is to develop a better application 
of tools of strategic form games and understand 
the resulting efficiency losses in networks and 
large-scale systems. Work in this area uses game- 
theoretic models to investigate resource alloca¬ 
tion, pricing, and investment problems in net¬ 
works (Johari and Tsitsiklis 2004; Acemoglu and 
Ozdaglar 2007; Acemoglu et al. 2009; Njoroge 
et al. 2013). A fourth area of research is to 
develop and apply alternative solution concepts 
for strategic form games. While some of the 
research in game theory has focused on subsets of 
Nash Equilibria (see Fudenberg and Tirole 1991), 
from a computational point of view, the set of 
correlated equilibria, which is a superset of the 
set of Nash Equilibria, is also attractive since it 
can be represented as the optimal solution set 
of a linear program. Correlated equilibrium can 
be implemented using a correlation scheme (a 
trusted party) or cryptographic tools as shown 
in Izmalkov et al. (2007). Recent work investi¬ 
gates alternative solution concepts for symmetric 
games intermediate between Nash and correlated 
equilibria (Stein et al. 2013), which can be imple¬ 
mented using specific correlation schemes. 


Cross-References 

► Dynamic Noncooperative Games 

► Game Theory: Historical Overview 

► Linear Quadratic Zero-Sum Two-Person Dif¬ 
ferential Games 
Abstract

Stream of variation (SoV) theory is a unified, 
model-based method for modeling, analyzing, 
and controlling variation in multistage manufac¬ 
turing systems. A SoV model represents variation 
and its propagation in a multistage system using 
the recursive structure of state space models; such 
models can be derived from physical knowledge 
and/or estimated empirically using system opera¬ 
tional data. Immediately, the SoV model enables 
integrated design and optimization for product 
and process tolerancing, allocation of distributed 


sensors in production lines, and evaluation of 
multistage system designs. With the help of these 
functions, the SoV method fulfills the objectives 
of system monitoring, diagnosis, and control and, 
ultimately, reduces a system’s variation during 
its operation. The SoV method can be further 
extended to model the interactions among prod¬ 
uct quality and tooling reliability, known as the 
quality and reliability chain effects, which is the 
crucial element in carrying out quality-ensured 
maintenance, as well as system reliability evalu¬ 
ation and optimization. The SoV theory has been 
successfully implemented in assembly, machin¬ 
ing, and semiconductor manufacturing processes. 
More research and development are needed to 
extend the SoV theory to manufacturing systems 
with complex configurations. 

Keywords 

Data fusion; Engineering-driven statistics; Multistage manufacturing system;
Quality improvement; Variation reduction

Introduction 

A multistage system refers to a system consisting of multiple units, stations, or
operations to finish a final product or a service. Multistage systems are ubiquitous in
modern manufacturing processes and service systems. In most cases, the final product or
service quality of a multistage system is determined by complex interactions among
multiple stages: the quality characteristics of one stage are influenced not only by the
local variations at that stage but also by the variations propagated from upstream
stages. Multistage systems present significant challenges for quality engineering
research and system improvement.

The stream of variation (SoV) theory has been developed to understand and represent the
complex production stream and data stream involved in the modeling and analysis of
variation and its propagation in a multistage manufacturing system (Fig. 1).

Stream of Variations Analysis, Fig. 1 Variation propagation in a multistage manufacturing process (MMP) and
notations in SoV modeling (Reproduced from Shi 2006)


Stream of Variation Model 

The foundation of the SoV theory is a mathematical model that links the key product
quality characteristics with key process control characteristics (e.g., fixture error,
machine error, etc.) in a multistage system. This model has a state space representation
that describes the deviation and its propagation in an $N$-stage process (as shown in
Fig. 1) and takes the form of

$$\mathbf{x}_k = \mathbf{A}_{k-1}\mathbf{x}_{k-1} + \mathbf{B}_k\mathbf{u}_k + \mathbf{w}_k, \quad k = 1, 2, \ldots, N, \qquad (1)$$

$$\mathbf{y}_k = \mathbf{C}_k\mathbf{x}_k + \mathbf{v}_k, \quad \{k\} \subset \{1, 2, \ldots, N\}, \qquad (2)$$

where $k$ is the stage index, $\mathbf{x}_k$ is the state vector representing the key
quality characteristics of the product (or intermediate work piece) after stage $k$,
$\mathbf{u}_k$ is the control vector representing the tooling deviations at stage $k$
(e.g., no fault occurs if all tooling deviations are within their tolerances; a fault
occurs when excessive tooling deviations are beyond their tolerances; active adjustments
of tooling deviations can be made to achieve error compensation objectives), and
$\mathbf{y}_k$ is the measurement vector representing product quality measurements at
stage $k$. Vectors $\mathbf{w}_k$ and $\mathbf{v}_k$ represent modeling error and sensing
error, respectively. The coefficient matrices $\mathbf{A}_{k-1}$, $\mathbf{B}_k$, and
$\mathbf{C}_k$ are determined by product and process design information:
$\mathbf{A}_{k-1}$ represents the impact of the deviation transition from stage $k-1$ to
stage $k$, $\mathbf{B}_k$ represents the impact of the local tooling deviation on the
product quality at stage $k$, and $\mathbf{C}_k$ is the measurement matrix, which can be
obtained from the defined quality features of the product at stage $k$.

If we repeat the modeling efforts for each stage from $k = 1$ to $N$, we will get the
deviation and its propagation throughout the multistage manufacturing system. By taking
variances on both sides of (1) and (2) and by assuming independence among certain
variables, we will obtain the variation and its propagation model for the
multistage manufacturing system.
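
As a hedged illustration of the variance step just described, the sketch below propagates state covariances through model (1). It assumes that the incoming state, the tooling deviation, and the modeling error at each stage are mutually independent, so that $\Sigma_{x_k} = A_{k-1}\Sigma_{x_{k-1}}A_{k-1}^T + B_k\Sigma_{u_k}B_k^T + \Sigma_{w_k}$. The function names, the dictionary keys, and all numbers are hypothetical and are not taken from Shi (2006).

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matadd(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def propagate_covariance(stages, sigma_x0):
    """Return the state covariance after each stage.

    Each stage is a dict with keys "A" (transition applied to the incoming
    state, i.e. A_{k-1}), "B", "Sigma_u", and "Sigma_w" (hypothetical names).
    Independence of x_{k-1}, u_k, and w_k is assumed.
    """
    sigma_x = sigma_x0
    history = []
    for st in stages:
        carried = matmul(matmul(st["A"], sigma_x), transpose(st["A"]))
        local = matmul(matmul(st["B"], st["Sigma_u"]), transpose(st["B"]))
        sigma_x = matadd(matadd(carried, local), st["Sigma_w"])
        history.append(sigma_x)
    return history


# Two-stage scalar example with made-up numbers.
stages = [
    {"A": [[1.0]], "B": [[2.0]], "Sigma_u": [[0.25]], "Sigma_w": [[0.01]]},
    {"A": [[0.5]], "B": [[1.0]], "Sigma_u": [[0.04]], "Sigma_w": [[0.01]]},
]
history = propagate_covariance(stages, [[0.0]])
print(history)
```

In this toy run the first stage dominates the variance budget, and the second stage attenuates the inherited variation because its transition gain is below one.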

The SoV models (1) and (2) can be obtained from product and process design information
and/or from the system operational data. In Shi (2006), two basic modeling methods, a
physics-driven method and a data-driven method, were investigated. In the physics-driven
modeling, the kinematic relationships between key control characteristics (KCC) and key
product characteristics (KPC) are identified through a detailed physical analysis of the
product and manufacturing process. A set of coordinate systems is carefully defined to
represent the whole system, including the quality features in the part coordinates, part
orientation to fixture/machine coordinates, and tooling to fixture/machine coordinates.
Based on these coordinate systems, SoV models (1) and (2) are obtained using the state
space model framework. In the data-driven modeling approach, system operational data are
measured for those selected KPC and KCC variables. System identification and estimation
methods are adopted to construct the SoV model. In some cases, data mining and
clustering techniques are used to identify inherent relationships of the system in
pre-processing. The SoV model may have different formulations, such as the state space
model, input-output model, and piecewise linear regression tree model. In most cases,
engineering-driven statistical analysis is used in the data analysis and modeling
efforts.

With models (1) and (2), variation reduction can be achieved in both design and
manufacturing phases by using mathematical optimization to make optimal decisions.
However, significant challenges exist in both the model development for specific
processes and model utilization to realize the benefits of the analytical capability of
this model. These challenges are addressed in the SoV methodological research (Shi
2006). In more detail, the SoV methodology addresses the following important questions
for variation reduction in a multistage manufacturing process.

SoV-Enabled Monitoring and Diagnosis

In multistage manufacturing systems, it is challenging to systematically find the root
causes of severe variability, which requires isolating both the manufacturing station
and the underlying cause in that station. During continuous production, excessive
product variation may occur at any stage of a multistage manufacturing system due to
worn tooling, tooling breakage, and/or abnormal incoming part variation. The SoV theory
presents systematic approaches for root cause identification. In this approach, a new
concept of “statistical methods driven by engineering models” is proposed to integrate
the product and process design knowledge with the on-line statistics. By solving the
difference equation of models (1) and (2) and with some mathematical simplifications,
the SoV model can be transformed into an input-output format as

$$\mathbf{y} = \boldsymbol{\Gamma}\,\mathbf{u} + \boldsymbol{\varepsilon}, \qquad (3)$$

where $\mathbf{y}$ is an $n \times 1$ vector of product quality measurements,
$\boldsymbol{\Gamma}$ is an $n \times p$ constant system matrix determined by
product/process designs, $\mathbf{u}$ is a $p \times 1$ random vector representing the
process faults, and $\boldsymbol{\varepsilon}$ is an $n \times 1$ random vector
representing measurement noises, un-modeled faults, and high-order nonlinear terms.
During production, the product quality features ($\mathbf{y}$) are measured, and the
data are used to conduct statistical analysis based on model (3) to identify root
causes. Two basic methods are developed for root cause diagnosis:

(i) Variation pattern matching: In this method, all potential variation patterns can be
obtained from the matrix $\boldsymbol{\Gamma}$ resulting from the off-line system
design. During the system operation, observed variation patterns can be obtained from
the covariance matrix of $\mathbf{y}$. A pattern matching can be performed to identify
the root causes.

(ii) Estimation-based diagnosis: With the SoV model and availability of on-line
measurement of quality features ($\mathbf{y}$), the deviation value of $\mathbf{u}$ can
be estimated on-line. A hypothesis test on $\mathbf{u}$ and its variance reveals the
significant changes that occurred to $\mathbf{u}$, corresponding to the root causes of
the system.

Various estimators and their performances are evaluated in the diagnosis study (Chapter
11 of Shi 2006).
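
To make the estimation-based route concrete, here is a minimal sketch that recovers the fault vector from the input-output model $\mathbf{y} = \boldsymbol{\Gamma}\mathbf{u} + \boldsymbol{\varepsilon}$ by ordinary least squares. Ordinary least squares is only one possible estimator and is chosen here as an assumption for illustration; the function name, the restriction to two fault variables, and the numbers are hypothetical.

```python
def estimate_faults(gamma, y):
    """Least-squares estimate of a 2-vector u from y = Gamma u + eps.

    gamma is an n-by-2 matrix given as a list of rows and y a list of n
    measurements. The 2x2 normal equations (Gamma^T Gamma) u = Gamma^T y
    are solved in closed form.
    """
    g11 = sum(row[0] * row[0] for row in gamma)
    g12 = sum(row[0] * row[1] for row in gamma)
    g22 = sum(row[1] * row[1] for row in gamma)
    b1 = sum(row[0] * yi for row, yi in zip(gamma, y))
    b2 = sum(row[1] * yi for row, yi in zip(gamma, y))
    det = g11 * g22 - g12 * g12
    if abs(det) < 1e-12:
        # Columns of Gamma are (numerically) dependent: the two faults
        # cannot be separated from these measurements.
        raise ValueError("Gamma^T Gamma is singular; u is not uniquely estimable")
    return [(g22 * b1 - g12 * b2) / det, (g11 * b2 - g12 * b1) / det]


# Noise-free check with made-up numbers: the estimate should recover u_true.
gamma = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
u_true = [0.5, -0.2]
y = [sum(g * u for g, u in zip(row, u_true)) for row in gamma]
u_hat = estimate_faults(gamma, y)
print(u_hat)
```

The singularity check also hints at the diagnosability question: when the columns of the system matrix are dependent, no estimator can attribute the observed deviation to a single fault.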

SoV-Enabled Sensor Allocation and Diagnosability

The issue of diagnosability refers to the problem of whether the product measurements
contain sufficient information for the diagnosis of critical process faults, i.e.,
whether root causes of process faults can be diagnosed. The diagnosability analysis is
based on model (3), which links potential process faults ($\mathbf{u}$) and product
quality measurements ($\mathbf{y}$). In the SoV theory, a set of criteria is developed
to evaluate the mean diagnosability and variance diagnosability for a system. Similar to
observability in control theory, diagnosability is determined by the $\mathbf{A}_k$,
$\mathbf{B}_k$, and $\mathbf{C}_k$ matrices ($k = 1, \ldots, N$) in the SoV models (1)
and (2) (or the $\boldsymbol{\Gamma}$ matrix in model (3)). In some cases, only a subset
of variables (vs. specific root cause variables) can be identified as potential root
causes of the process faults; such subsets are referred to as minimum diagnosable
classes.

One emphasis in the SoV-enabled diagnosability study is to promote the concept of the
“process-oriented measurement” strategy. In current industrial practice, most of the
existing measurement strategies focus on the product coherence inspection (i.e.,
product-oriented measurements), which is effective for detecting product imperfection
but may not be effective for identifying the root causes of product quality failures.
The SoV theory proposes a “process-oriented measurement” concept with a distributed
sensing strategy. In this strategy, selected key control characteristics, as well as
selected key product characteristics, will be measured in the selected stages for both
detecting product defects and identifying their root causes.

SoV-Enabled Design and Optimization

Variation analysis and design evaluations are conducted in the product and process
design stage to identify critical components, features, and manufacturing operations.
With the SoV model defined in (3) and certain assumptions, we can represent the
KPC-to-KCC relationship as

$$\boldsymbol{\Sigma}_{\mathbf{y}} = \sum_{k=1}^{N} \boldsymbol{\Gamma}_k \boldsymbol{\Sigma}_{\mathbf{u}_k} \boldsymbol{\Gamma}_k^{T}, \qquad (4)$$

where $\boldsymbol{\Sigma}_{\mathbf{y}}$ is the variance-covariance matrix of product
quality features resulting from the variance-covariance matrix
($\boldsymbol{\Sigma}_{\mathbf{u}_k}$) of tooling errors. Based on (3) and (4), the
following four tasks can be performed:

(i) tolerance analysis by allocating the tooling tolerance ($\mathbf{u}_k$) and then
predicting the final product tolerance ($\mathbf{y}_N$);
(ii) tolerance synthesis by fixing the final product tolerance ($\mathbf{y}_N$) and then
assigning the tolerance for individual tooling components ($\mathbf{u}_k$) with certain
cost objectives minimized;
(iii) sensitivity study by identifying the critical tooling components ($\mathbf{u}_k$)
that have significant impacts on the final product variation through evaluation of the
defined sensitivity indices; and
(iv) process planning by optimizing parameters in the $\mathbf{A}_k$ and $\mathbf{B}_k$
matrices to minimize the final product variation.
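
A scalar worked instance of the variance relationship (4) may help fix ideas. The numbers are hypothetical. Take $N = 2$ stages with scalar stage gains $\Gamma_1 = 2$ and $\Gamma_2 = 1$ (the stage-$k$ coefficients multiplying the tooling errors) and tooling variances $\sigma_{u_1}^2 = 0.25$ and $\sigma_{u_2}^2 = 0.04$. Then

$$\sigma_y^2 = \Gamma_1^2\,\sigma_{u_1}^2 + \Gamma_2^2\,\sigma_{u_2}^2 = (2)^2(0.25) + (1)^2(0.04) = 1.04 .$$

Stage 1 contributes $1.00/1.04 \approx 96\%$ of the final variance, so in a sensitivity study of this toy system the stage-1 tooling would be flagged as critical, and tightening its tolerance would be the most effective lever in a tolerance synthesis.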

One unique feature of SoV-enabled design and optimization is to provide a unified method
for simultaneous optimization of product and process tooling tolerance, as well as
process planning. This is because the SoV models (1) and (2) represent the product
quality features ($\mathbf{x}_k$ and $\mathbf{y}_k$), tooling features
($\mathbf{u}_k$), and the process planning information ($\mathbf{A}_k$ and
$\mathbf{B}_k$) within one mathematical model. As a result, a math-based optimization is
feasible to achieve the best quality through process-oriented tolerance synthesis for
product and process, as well as optimized process planning.

SoV-Enabled Process Control and Quality Compensation

The SoV model provides the opportunity to apply active control for dimensional variation
reduction in a multistage manufacturing system. The basic idea is to implement a
system-level control strategy during production to minimize the end-of-line product
variance, which is propagated from upstream manufacturing stages. An optimal control
scheme was devised to use the state space structure of the SoV model by treating the
control as a stochastic discrete-time predictive control problem. The optimization index
for determining the optimal control action is formulated as

$$J_k^{*} = \min_{\mathbf{u}_k} J_k = \min_{\mathbf{u}_k} E\left[\hat{\mathbf{y}}_{N|k}^{T}\,\mathbf{Q}_N\,\hat{\mathbf{y}}_{N|k} + \mathbf{u}_k^{T}\,\mathbf{R}_k\,\mathbf{u}_k\right],$$

$$\text{s.t. } \underline{C}_{kc} \le u_{kc} \le \overline{C}_{kc}, \quad k = 1, \ldots, N, \quad c = 1, \ldots, n_{u_k}, \qquad (5)$$

where $\hat{\mathbf{y}}_{N|k}$ denotes the product quality at the final stage $N$ that
is predicted at stage $k$ and $n_{u_k}$ is the dimension of the control action
$\mathbf{u}_k$. The constraints $[\underline{C}_{kc}, \overline{C}_{kc}]$ define the
lower and upper actuator limits that can be applied on each part/substage.
$\mathbf{Q}_N \in \mathbb{R}^{m \times m}$ is a positive semi-definite matrix, and
$\mathbf{R}_k \in \mathbb{R}^{n \times n}$ is a positive definite matrix.
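
The following sketch shows the shape of this optimization in the simplest possible setting. It assumes a scalar control action and a predicted final deviation that is linear in that action, `y_pred + gain * u`, with any zero-mean noise independent of the action (so the expectation only adds a constant). Under those assumptions the unconstrained minimizer is available in closed form, and clipping it to the actuator limits is optimal because the scalar cost is convex. The function name, parameter names, and numbers are hypothetical.

```python
def optimal_adjustment(y_pred, gain, q, r, lower, upper):
    """Minimize q*(y_pred + gain*u)**2 + r*u**2 subject to lower <= u <= upper.

    y_pred : predicted final-stage deviation if no adjustment is made
    gain   : assumed sensitivity of the final deviation to the adjustment u
    q, r   : scalar quality weight (q >= 0) and control weight (r > 0)
    """
    if q < 0 or r <= 0:
        raise ValueError("q must be nonnegative and r must be positive")
    if lower > upper:
        raise ValueError("lower limit exceeds upper limit")
    # Stationary point of the quadratic cost.
    u_free = -q * gain * y_pred / (q * gain * gain + r)
    # For a convex scalar cost, projecting onto the interval is optimal.
    return min(max(u_free, lower), upper)


u_wide = optimal_adjustment(1.0, 2.0, 1.0, 1.0, -1.0, 1.0)
u_tight = optimal_adjustment(1.0, 2.0, 1.0, 1.0, -0.1, 0.1)
print(u_wide, u_tight)
```

With wide limits the adjustment only partially cancels the predicted deviation because the control weight penalizes large moves; with tight limits the actuator saturates at its bound.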

This optimization index takes the form of the widely accepted cost function of a
linear-quadratic regulator under the predictive control framework and thus satisfies the
common requirements in control theory. Various research topics have been investigated
under this framework, including feed-forward control for multistage processes, cautious
control considering model uncertainties, and actuator layout optimization in control
system designs.


SoV-Enabled Product Quality and Reliability Chain Modeling and Analysis

There is a complex, intricate relationship between product quality and tooling
reliability in a multistage manufacturing system. A degraded (or failed) production tool
leads to a large variability in product quality and/or an excessive number of defects;
on the other hand, excessive variability of product quality features accelerates the
degradation and failure rates of production tooling at the subsequent station. For a
multistage manufacturing system, these interactions are more complex as variations
propagate from one stage to the next. Thus, a “chain effect” between the product quality
(Q) and tooling reliability (R) can be observed, referred to as the “QR chain” effect.
Modeling of the QR chain integrates the SoV model and the semi-Markov process model. The
QR chain model plays an essential role in system reliability modeling and maintenance
decisions and has led to new concepts of quality-ensured maintenance strategy and of
tolerance synthesis considering tool degradation and system down time.


Summary and Future Directions 

The concept of stream of variation for multistage systems can be applied to a very broad
range of systems, although the existing work mostly focuses on the quality control of
multistage discrete manufacturing processes. A comprehensive discussion of the stream of
variation theory for a multistage manufacturing system is given in a monograph (Shi
2006). In addition, Shi and Zhou (2009) provide a survey of emerging methodologies for
tackling various issues in multistage systems including modeling, analysis, monitoring,
diagnosis, control, inspection, and design optimization.


The success of the multistage system framework in manufacturing processes is likely to
stimulate the application of this framework to other systems. For example, monitoring
and diagnosis of the abnormalities in throughput, cycle time, and lead time of a
multistage production system are very promising application areas under the multistage
system framework. Supply chain and logistics management, which involves multiple
suppliers/vendors in an interconnected fashion, can be treated as another multistage
system with network structures. Most service systems such as health-care clinics,
hospitals, and transportation systems are inherently multistage as well. It will be
interesting to expand the stream of variation theory to these broadly defined multistage
systems for their quality control, variation reduction, and other system-level
performance improvement.

Cross-References 

► Fault Detection and Diagnosis 

► Multiscale Multivariate Statistical Process Control

► Statistical Process Control in Manufacturing 

Recommended Reading 

The monograph (Shi 2006) provides detailed results of the stream of variation theory
discussed in this entry. In addition, the first five chapters of Shi (2006) provide
views of basic statistical and system analysis tools needed for the SoV research and
development. Some recent developments related to the SoV theory and applications are
summarized in a review paper (Shi and Zhou 2009).

Abstract

This entry presents the most commonly used formulations of robust stability and robust
$H_\infty$ performance for linear systems with highly structured, linear, time-invariant
uncertainty. The structured singular value function ($\mu$) is specifically defined for
this purpose, involving a problem-specific set, called the uncertainty set. With the
uncertainty set chosen, $\mu$ is a real-valued function defined on complex matrices of a
fixed dimension. A few key properties are easily derived from the definition and then
applied to solve the robustness analysis problem. Computation of $\mu$, which is
required to implement the analysis tests, is difficult, so computable and refinable
upper and lower bounds are derived.


Keywords 

Robustness analysis; Robust control; Structured 
uncertainty 


Notation, Definition, and Properties 

$\mathbb{R}$ and $\mathbb{C}$ are the real and complex numbers;
$\mathbb{C}_+ = \{\gamma \in \mathbb{C} : \mathrm{Re}(\gamma) \ge 0\}$; $\mathbb{C}^n$
is the set of $n \times 1$ vectors and $\mathbb{C}^{n \times m}$ the set of $n \times m$
matrices with elements in $\mathbb{C}$. $\bar{\sigma}(\cdot)$ refers to the maximum
singular value of a matrix; for $A \in \mathbb{C}^{n \times n}$, $\rho(A)$ is the
spectral radius (largest, in magnitude, eigenvalue of $A$), and $\rho_R(A)$ is the real
spectral radius (largest, in magnitude, real eigenvalue of $A$); $\mathcal{R}$ is the
ring of proper rational functions,
$\mathcal{S} = \{g \in \mathcal{R} : g \text{ has no poles in } \mathbb{C}_+\}$;
$\mathcal{S}^{\bullet \times \bullet}$ denotes matrices with elements in $\mathcal{S}$,
where the exact dimensions are unspecified, but clear from context; finally, no
notational distinction is made between a linear system, its transfer function, and/or
its frequency response function.

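Let $R$, $S$, and $F$ be nonnegative integers and $r_1, \ldots, r_R$,
$s_1, \ldots, s_S$, and $f_1, \ldots, f_F$ be positive integers. Define sets

$$\boldsymbol{\Delta}_R := \{\mathrm{diag}[\delta_1 I_{r_1}, \ldots, \delta_R I_{r_R}] : \delta_i \in \mathbb{R}\},$$

$$\boldsymbol{\Delta}_C := \{\mathrm{diag}[\delta_1 I_{s_1}, \ldots, \delta_S I_{s_S}, \Delta_1, \ldots, \Delta_F] : \delta_i \in \mathbb{C},\ \Delta_k \in \mathbb{C}^{f_k \times f_k}\}$$

and their diagonal augmentation,
$\boldsymbol{\Delta} := \{\mathrm{diag}[\Delta_R, \Delta_C] : \Delta_R \in \boldsymbol{\Delta}_R,\ \Delta_C \in \boldsymbol{\Delta}_C\} \subset \mathbb{C}^{n \times n}$.
The set $\boldsymbol{\Delta}$ is called the block structure. The block structure can be
generalized to handle nonsquare blocks in $\boldsymbol{\Delta}_C$ at the expense of
additional notation. If $R = 0$, then $\boldsymbol{\Delta}$ is called a complex block
structure. If $S = F = 0$, then $\boldsymbol{\Delta}$ is called a real block structure.
For $M \in \mathbb{C}^{n \times n}$, $\mu_{\boldsymbol{\Delta}}(M)$ is defined as

$$\mu_{\boldsymbol{\Delta}}(M) := \frac{1}{\min\{\bar{\sigma}(\Delta) : \Delta \in \boldsymbol{\Delta},\ \det(I - M\Delta) = 0\}}$$

unless no $\Delta \in \boldsymbol{\Delta}$ makes $I - M\Delta$ singular, in which case
$\mu_{\boldsymbol{\Delta}}(M) := 0$ (Doyle 1982; Safonov 1982). The function
$\mu_{\boldsymbol{\Delta}}(\cdot) : \mathbb{C}^{n \times n} \to \mathbb{R}$ is upper
semicontinuous. Following Fan et al. (1991), the constraint set in the definition can be
written as
$\{\bar{\sigma}(\Delta) : \exists\, w, z \in \mathbb{C}^n,\ w = Mz,\ z = \Delta w,\ w \ne 0_n\}$,
so that without loss of generality, at the minimum, the elements
$\Delta_1, \ldots, \Delta_F$ each have rank equal to 1. For specific block structures,
simplifications occur: if $R = S = 0$ and $F = 1$, then
$\mu_{\boldsymbol{\Delta}}(M) = \bar{\sigma}(M)$; if $R = F = 0$ and $S = 1$, then
$\mu_{\boldsymbol{\Delta}}(M) = \rho(M)$; and if $S = F = 0$ and $R = 1$, then
$\mu_{\boldsymbol{\Delta}}(M) = \rho_R(M)$. In general
$\rho_R(M) \le \mu_{\boldsymbol{\Delta}}(M) \le \bar{\sigma}(M)$. Associated with
$\boldsymbol{\Delta}$ define
$\mathbf{B}_{\boldsymbol{\Delta}} := \{\Delta \in \boldsymbol{\Delta} : \bar{\sigma}(\Delta) \le 1\}$.
Since $I - M\Delta$ is singular if and only if $M\Delta$ has an eigenvalue exactly equal
to 1, it follows that
$\mu_{\boldsymbol{\Delta}}(M) = \max_{\Delta \in \mathbf{B}_{\boldsymbol{\Delta}}} \rho_R(M\Delta)$.
If $\boldsymbol{\Delta}$ is a complex block structure, then $\rho_R(\cdot)$ can be
replaced with $\rho(\cdot)$, and in that case
$\mu_{\boldsymbol{\Delta}}(\cdot) : \mathbb{C}^{n \times n} \to \mathbb{R}$ is
continuous.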

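A common application is to quantify the effect (in structured singular value terms) that
an uncertain matrix $\Delta$ has on the expression
$F_L(M, \Delta) := M_{11} + M_{12}\Delta(I - M_{22}\Delta)^{-1}M_{21}$, a linear
fractional transformation (LFT) of $\Delta$ by $M$. This is conceptually straightforward
(informally called the main loop theorem) using the Schur formula for determinants.
Specifically, let $\boldsymbol{\Delta}_1 \subset \mathbb{C}^{n_1 \times n_1}$ and
$\boldsymbol{\Delta}_2 \subset \mathbb{C}^{n_2 \times n_2}$ be block structures and
$\boldsymbol{\Delta} \subset \mathbb{C}^{(n_1+n_2) \times (n_1+n_2)}$ be their
block-diagonal augmentation. For $M \in \mathbb{C}^{(n_1+n_2) \times (n_1+n_2)}$,
$\mu_{\boldsymbol{\Delta}}(M) < 1$ if and only if
$\mu_{\boldsymbol{\Delta}_2}(M_{22}) < 1$ and

$$\max_{\Delta_2 \in \mathbf{B}_{\boldsymbol{\Delta}_2}} \mu_{\boldsymbol{\Delta}_1}\big(F_L(M, \Delta_2)\big) < 1.$$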

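Finally (Packard and Pandey 1993), if $\boldsymbol{\Delta}_1$ is a block structure,
$\boldsymbol{\Delta}_2$ is a complex block structure, and $M$ satisfies
$\mu_{\boldsymbol{\Delta}_1}(M_{11}) < \mu_{\boldsymbol{\Delta}}(M)$, then
$\mu_{\boldsymbol{\Delta}}(\cdot)$ is continuous on an open ball around $M$. Loosely
speaking, “if there are any complex blocks, and $M$ is such that they matter, then $\mu$
is continuous at $M$.” This means that at points of discontinuity, only
$\Delta_R \in \boldsymbol{\Delta}_R$ needs to be nonzero. For any polynomial
$p : \mathbb{C}^n \to \mathbb{C}$, there is a minimum-norm root (using
$\|\cdot\|_\infty$ on $\mathbb{C}^n$) whose components all have equal modulus (Doyle
1982). Defining

$$\mathbf{Q}_{\boldsymbol{\Delta}} := \{\mathrm{diag}[\Delta_R, \Delta_C] : \bar{\sigma}(\Delta_R) \le 1,\ \Delta_C^{*}\Delta_C = I\}$$

and employing this result, Young and Doyle (1997) derive that
$\mu_{\boldsymbol{\Delta}}(M) = \max_{Q \in \mathbf{Q}_{\boldsymbol{\Delta}}} \rho_R(MQ)$.
This gives a generalized maximum-modulus-like theorem for LFTs (Packard and Pandey
1993). Revisiting the setup for the main loop theorem, assume further that
$\boldsymbol{\Delta}_2$ is a complex block structure. If
$\mu_{\boldsymbol{\Delta}_2}(M_{22}) < 1$, then

$$\max_{\Delta_2 \in \mathbf{B}_{\boldsymbol{\Delta}_2}} \mu_{\boldsymbol{\Delta}_1}\big(F_L(M, \Delta_2)\big) = \max_{Q_2 \in \mathbf{Q}_{\boldsymbol{\Delta}_2}} \mu_{\boldsymbol{\Delta}_1}\big(F_L(M, Q_2)\big).$$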


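This leads to specialized results per Boyd and Desoer (1985), Packard and Pandey (1993),
and Tits and Fan (1995) for stable transfer function matrices. For any block structure
$\boldsymbol{\Delta} \subset \mathbb{C}^{n \times n}$ and
$M \in \mathcal{S}^{n \times n}$,

$$\max\left\{\sup_{\omega \in \mathbb{R}} \mu_{\boldsymbol{\Delta}}(M(j\omega)),\ \mu_{\boldsymbol{\Delta}}(M(\infty))\right\} = \max\left\{\sup_{s \in \mathbb{C}_+} \mu_{\boldsymbol{\Delta}}(M(s)),\ \mu_{\boldsymbol{\Delta}}(M(\infty))\right\}.$$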


Robustness of Stability and Performance

There are several uncertain system formulations that all result in the same
$\mu$-analysis test to assess the robustness of stability and/or performance (Wall et
al. 1982; Foo and Postlethwaite 1988). In this article, we present the simplest and most
common interpretation. Consider an interconnection of known systems,
$\{G_i\}_{i=1}^{M}$, and unknown systems $\{\Gamma_k\}_{k=1}^{N}$, as described by

$$\begin{bmatrix} q_1 \\ e \\ q_2 \end{bmatrix} = H \begin{bmatrix} z_1 \\ d \\ z_2 \end{bmatrix},$$

where $z_1 = \mathrm{diag}[G_1, \ldots, G_M](q_1 + w_1)$,
$z_2 = \mathrm{diag}[\Gamma_1, \ldots, \Gamma_N](q_2 + w_2)$, and
$H \in \mathbb{R}^{(n_1 + n_e + n_2) \times (n_1 + n_d + n_2)}$ (naturally partitioned
as a block 3-by-3 array). In what follows, $G := \mathrm{diag}[G_1, \ldots, G_M]$ and
$\Gamma := \mathrm{diag}[\Gamma_1, \ldots, \Gamma_N]$. This is depicted in Fig. 1. Each
$G_i$ and $\Gamma_k$ is assumed to be a finite-dimensional, time-invariant linear
system, with proper transfer function, and a stabilizable and detectable internal
state-space description.

Structured Singular Value and Applications: Analyzing the Effect of Linear Time-Invariant Uncertainty in Linear Systems, Fig. 1 Interconnection of $G_1, \ldots, G_M$, $\Gamma_1, \ldots, \Gamma_N$

The interconnection is well posed if for any initial conditions and any (say) piecewise
continuous inputs $w_1$, $w_2$, and $d$, there exist unique solutions to the
interconnection equations. By manipulating the state-space or transfer function
descriptions of a well-posed interconnection, a state-space model or proper transfer
function description for the map from $(d, w)$ to $(e, z)$ can be derived. A well-posed
interconnection is stable if the resultant state-space model is internally stable, that
is, the eigenvalues of its “A” matrix are in the open, left-half plane. Given some
restrictions on the values of the elements of $\Gamma$, robustness analysis poses the
question: is the interconnection well posed and stable for all possible values of
$\Gamma$? And if so, then is the $\|\cdot\|_\infty$ gain from $d$ to $e$ less than 1 for
all possible values of $\Gamma$? The goal of the analysis is to confirm “yes” or supply
a particular $\Gamma$ which proves that the answer is “no” (by rendering the
interconnection ill-posed, unstable, or with $d$-to-$e$ gain $\ge 1$). Standard linear
systems theory gives that the interconnection is well posed if and only if

$$\det\left(I - \begin{bmatrix} H_{11} & H_{13} \\ H_{31} & H_{33} \end{bmatrix} \begin{bmatrix} G(\infty) & 0 \\ 0 & \Gamma(\infty) \end{bmatrix}\right) \ne 0,$$

and that the interconnection is stable if and only if the transfer function matrix
$T_{w,z}$, mapping $[w_1; w_2]$ to $[z_1; z_2]$, is an element of
$\mathcal{S}^{\bullet \times \bullet}$.

The assumptions on each $\Gamma_k$ are of three kinds: (i) $\Gamma_k$ is a stable linear
system, known only to satisfy $\|\Gamma_k\|_\infty \le 1$; (ii) $\Gamma_k$ is a stable
linear system of the form $\gamma_k I$, where the scalar linear system $\gamma_k$ is
known to satisfy $\|\gamma_k\|_\infty \le 1$; (iii) $\Gamma_k$ is a constant gain, of
the form $\gamma_k I$, where the scalar $\gamma_k \in \mathbb{R}$ is known to satisfy
$-1 \le \gamma_k \le 1$. Note the similarity between this and the block structure
$\boldsymbol{\Delta}$ (via $\boldsymbol{\Delta}_R$ and $\boldsymbol{\Delta}_C$)
introduced earlier. After rearrangement, this block-diagonal augmentation of uncertain
systems is a norm-bounded (by 1) element of the set

$$\boldsymbol{\Gamma} := \{\mathrm{diag}[\Gamma_R, \Gamma_U] : \Gamma_R \in \boldsymbol{\Delta}_R,\ \Gamma_U \in \mathcal{S}^{\bullet \times \bullet},\ \Gamma_U(s_0) \in \boldsymbol{\Delta}_C\ \ \forall s_0 \in \mathbb{C}_+\}.$$

Since 0 is a possible value of $\Gamma$, two necessary conditions (denoted c.1 and c.2,
respectively) for robust well-posedness and stability are at $\Gamma = 0$, specifically
$\det(I - G(\infty)H_{11}) \ne 0$ and
$V := G(s)(I - H_{11}G(s))^{-1} \in \mathcal{S}^{\bullet \times \bullet}$. Assuming
$\det(I - G(\infty)H_{11}) \ne 0$ (i.e., c.1), the Schur formula for block determinants
reduces the well-posedness condition to

$$\det\left(I - \Gamma(\infty)\left[H_{33} + H_{31}(I - G(\infty)H_{11})^{-1}G(\infty)H_{13}\right]\right) \ne 0.$$

Define $M := H_{33} + H_{31}G(I - H_{11}G)^{-1}H_{13} \in \mathcal{S}^{\bullet \times \bullet}$
and $X := I - \Gamma M$. Then

$$T_{w,z} = \begin{bmatrix} V + VH_{13}X^{-1}\Gamma H_{31}V & VH_{13}X^{-1}\Gamma \\ X^{-1}\Gamma H_{31}V & X^{-1}\Gamma \end{bmatrix}.$$

Assuming c.2, namely, $V \in \mathcal{S}^{\bullet \times \bullet}$, then
$X^{-1} \in \mathcal{S}^{\bullet \times \bullet}$ implies that
$T_{w,z} \in \mathcal{S}^{\bullet \times \bullet}$; moreover,
$T_{w,z} \in \mathcal{S}^{\bullet \times \bullet}$ implies that
$X^{-1} = I + T_{w,z}^{22}M \in \mathcal{S}^{\bullet \times \bullet}$. Finally, since
both $M$ and $\Gamma$ are stable, it follows that
$X^{-1} \in \mathcal{S}^{\bullet \times \bullet}$ if and only if
$\det(I - M(s_0)\Gamma(s_0)) \ne 0\ \ \forall s_0 \in \mathbb{C}_+$. The maximum-modulus
property gives the robustness theorem. With the definition of $M$ and conditions c.1 and
c.2, the uncertain system is robustly stable and well posed if and only if

$$\max\left\{\sup_{\omega \in \mathbb{R}} \mu_{\boldsymbol{\Delta}}(M(j\omega)),\ \mu_{\boldsymbol{\Delta}}(M(\infty))\right\} < 1.$$

Indeed, if the condition holds, then by the maximum-modulus theorem and the definition
of $\mu$, it follows that $\det(I - M(s)\Gamma(s)) \ne 0$ for all $s \in \mathbb{C}_+$
as well as $s = \infty$, since $\Gamma(s) \in \boldsymbol{\Delta}$ and
$\bar{\sigma}(\Gamma(s)) \le 1$. This gives well-posedness and stability for all such
$\Gamma$, as desired (an alternate proof, using the Nyquist criterion, is also common).
Conversely, if the condition is violated, then at some frequency (0, nonzero, or
$\infty$), $\mu$ is larger than 1, as evidenced by a (constant matrix)
$\Delta \in \boldsymbol{\Delta} \subset \mathbb{C}^{n \times n}$,
$\bar{\sigma}(\Delta) < 1$, which causes singularity. If the frequency is nonzero (and
finite), the interpolation lemmas in the appendix enable replacing the complex blocks
with stable, real-rational entries. Otherwise (0 or $\infty$), either the matrix is such
that $\mu$ is continuous, and hence a finite, nonzero frequency also has $\mu > 1$, or
only the real blocks are necessary to cause singularity. In all cases, a
$\Gamma \in \boldsymbol{\Gamma}$ with $\|\Gamma\|_\infty < 1$ exists to cause
ill-posedness or instability (Tits and Fan 1995).

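Robustness of performance, measured as $\|T_{e,d}\|_\infty$, can be addressed using the
main loop theorem and an additional complex full block (recall
$\bar{\sigma}(\cdot) = \mu_{\boldsymbol{\Delta}}(\cdot)$ when $F = 1$, $S = R = 0$).
Define

$$M_P := \begin{bmatrix} H_{22} & H_{23} \\ H_{32} & H_{33} \end{bmatrix} + \begin{bmatrix} H_{21} \\ H_{31} \end{bmatrix} G(I - H_{11}G)^{-1} \begin{bmatrix} H_{12} & H_{13} \end{bmatrix}$$

and
$\boldsymbol{\Delta}_P := \{\mathrm{diag}[\Delta_P, \Delta] : \Delta_P \in \mathbb{C}^{n_d \times n_e},\ \Delta \in \boldsymbol{\Delta}\}$.
With conditions c.1 and c.2, the uncertain system is robustly stable and well posed and
satisfies $\|T_{e,d}\|_\infty < 1$ if and only if

$$\max\left\{\sup_{\omega \in \mathbb{R}} \mu_{\boldsymbol{\Delta}_P}(M_P(j\omega)),\ \mu_{\boldsymbol{\Delta}_P}(M_P(\infty))\right\} < 1.$$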

Computations 

The robust stability and robust performance theorems require computing $\mu$ on the
frequency response function $M(j\omega)$. Computing $\mu$ is known to be a
computationally difficult problem (Toker and Özbay 1998), so exact computational methods
are generally not pursued. Reliable algorithms have been developed which yield upper and
lower bounds that are sufficiently close for many engineering problems.

Lower Bounds 

Recall that
$\mu_{\boldsymbol{\Delta}}(M) = \max_{\Delta \in \mathbf{B}_{\boldsymbol{\Delta}}} \rho_R(M\Delta) = \max_{Q \in \mathbf{Q}_{\boldsymbol{\Delta}}} \rho_R(MQ)$.
Practically speaking, these maximizations yield lower bounds for
$\mu_{\boldsymbol{\Delta}}(M)$, since the global maximum may not be attained. In
addition to gradient-based ascent methods, the optimality conditions for
$Q \in \mathbf{Q}_{\boldsymbol{\Delta}}$ to be a local maximum of the function
$\rho_R(M\Delta)$ on the set $\mathbf{B}_{\boldsymbol{\Delta}}$ can be derived (Young
and Doyle 1997). A solution approach, similar to a Jacobi iteration, leads to an
iteration that resembles combinations of the familiar power methods for spectral radius
and maximum singular value. If the iteration converges (which is not guaranteed), a
lower bound for $\mu_{\boldsymbol{\Delta}}(M)$ (along with a corresponding
$\Delta \in \boldsymbol{\Delta}$) is produced. Studies with matrices constructed to have
$\mu_{\boldsymbol{\Delta}}(M) = 1$ suggest that the iteration is very reliable for
complex block structures, though usually quite poor for purely real block structures.
There are several more computationally demanding algorithms available for purely real
block structures (de Gaston and Safonov 1988; Sideris and Sanchez Pena 1989). For the
common situation, with both real and complex blocks, where continuity is assured, the
power algorithm generally has adequate performance.
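
As a rough illustration of the lower-bound idea (and not the power iteration cited above), the sketch below brute-forces the maximization of the spectral radius of $MQ$ over unitary diagonal $Q$ for a $2 \times 2$ matrix. It assumes a complex block structure made of two $1 \times 1$ complex blocks, so that the real spectral radius can be replaced by the ordinary spectral radius. Only the relative phase between the two diagonal entries matters, so a one-dimensional phase grid suffices. The function names, the grid size, and the test matrix are hypothetical.

```python
import cmath
import math


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 (possibly complex) matrix via the quadratic formula."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def spectral_radius(m):
    """Largest eigenvalue magnitude of a 2x2 matrix."""
    return max(abs(lam) for lam in eigenvalues_2x2(m))


def mu_lower_bound(m, n_phase=360):
    """Grid-search lower bound on mu for two 1x1 complex blocks.

    Evaluates rho(M Q) for Q = diag(1, q) with |q| = 1 on a phase grid.
    Any value found is a valid lower bound; the grid may miss the maximum.
    """
    best = 0.0
    for i in range(n_phase):
        q = cmath.exp(2j * math.pi * i / n_phase)
        mq = [[m[0][0], m[0][1] * q], [m[1][0], m[1][1] * q]]
        best = max(best, spectral_radius(mq))
    return best


M = [[1.0, 2.0], [0.5, 1.0]]
lower = mu_lower_bound(M)
print(lower)
```

A grid search like this scales poorly with the number of blocks, which is one reason iterative schemes are preferred in practice.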

Upper Bounds 

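Define
$\mathbf{G}_{\boldsymbol{\Delta}} := \{G = -G^{*} : G\Delta = -\Delta^{*}G^{*}\ \ \forall \Delta \in \boldsymbol{\Delta}\}$
and
$\mathbf{D}_{\boldsymbol{\Delta}} := \{D = D^{*} \succ 0 : D\Delta = \Delta D\ \ \forall \Delta \in \boldsymbol{\Delta}\}$,
subsets of $\mathbb{C}^{n \times n}$. Elements of $\mathbf{D}_{\boldsymbol{\Delta}}$ are
of the form
$\mathrm{diag}[D_{r_1}, \ldots, D_{r_R}, D_{s_1}, \ldots, D_{s_S}, d_1 I_{f_1}, \ldots, d_F I_{f_F}]$,
and therefore $D \in \mathbf{D}_{\boldsymbol{\Delta}}$ implies that
$D^{1/2} \in \mathbf{D}_{\boldsymbol{\Delta}}$ too. Likewise,

$$\mathbf{G}_{\boldsymbol{\Delta}} = \{\mathrm{diag}[G_R, 0] : G_R = -G_R^{*} \in \mathbb{C}^{\bullet \times \bullet},\ G_R\Delta_R = \Delta_R G_R\ \ \forall \Delta_R \in \boldsymbol{\Delta}_R\}.$$

A concise derivation (Helmersson 1995) verifies the upper bound formula (Fan et al.
1991). If $\beta > 0$, $G \in \mathbf{G}_{\boldsymbol{\Delta}}$, and
$D \in \mathbf{D}_{\boldsymbol{\Delta}}$ satisfy
$M^{*}DM - \beta^2 D + GM + M^{*}G^{*} \preceq 0$, then
$\mu_{\boldsymbol{\Delta}}(M) \le \beta$. Indeed, if $\Delta \in \boldsymbol{\Delta}$
has $\det(I - M\Delta) = 0$, there exist nonzero $w, z \in \mathbb{C}^n$ with $w = Mz$,
$z = \Delta w$. Certainly $z^{*}(M^{*}DM - \beta^2 D + GM + M^{*}G^{*})z \le 0$. Making
substitutions gives

$$\begin{aligned} 0 &\ge w^{*}Dw - \beta^2 w^{*}\Delta^{*}D\Delta w + w^{*}\Delta^{*}Gw + w^{*}G^{*}\Delta w \\ &= w^{*}Dw - \beta^2 w^{*}D^{1/2}\Delta^{*}\Delta D^{1/2}w + w^{*}\Delta^{*}Gw - w^{*}\Delta^{*}Gw \\ &= w^{*}D^{1/2}\left(I - \beta^2\Delta^{*}\Delta\right)D^{1/2}w. \end{aligned}$$

Since $D$ is invertible and $w \ne 0_n$, it must be that
$\bar{\sigma}(\Delta) \ge \beta^{-1}$, as desired. The constraint
$M^{*}DM - \beta^2 D + GM + M^{*}G^{*} \preceq 0$ is a linear matrix inequality (LMI) in
the variables $D$ and $G$. Minimizing $\beta$ over
$G \in \mathbf{G}_{\boldsymbol{\Delta}}$ and $D \in \mathbf{D}_{\boldsymbol{\Delta}}$
subject to the LMI constraint (using Boyd and El Ghaoui 1993, for instance) yields the
best upper bound that this inequality can produce.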
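
For a purely complex block structure the $G$ term vanishes, and the LMI reduces to the familiar scaled singular value condition: the largest singular value of $D^{1/2} M D^{-1/2}$ is at most $\beta$. The sketch below evaluates that bound for a $2 \times 2$ matrix with two $1 \times 1$ complex blocks, taking $D^{1/2} = \mathrm{diag}(1, d)$ and scanning $d$ over a grid instead of solving an LMI. The grid search, function names, and test matrix are assumptions made for illustration only.

```python
import cmath
import math


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 (possibly complex) matrix via the quadratic formula."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def max_singular_value(m):
    """Largest singular value of a 2x2 matrix from the eigenvalues of M* M."""
    (a, b), (c, d) = [[complex(x) for x in row] for row in m]
    g11 = a.conjugate() * a + c.conjugate() * c
    g12 = a.conjugate() * b + c.conjugate() * d
    g22 = b.conjugate() * b + d.conjugate() * d
    lams = eigenvalues_2x2([[g11, g12], [g12.conjugate(), g22]])
    return math.sqrt(max(0.0, max(lam.real for lam in lams)))


def mu_upper_bound(m, d_grid):
    """Smallest scaled singular value over the grid of scalings d > 0.

    Uses D^(1/2) = diag(1, d), so D^(1/2) M D^(-1/2) scales the off-diagonal
    entries by 1/d and d. Every grid value is a valid upper bound on mu.
    """
    return min(
        max_singular_value([[m[0][0], m[0][1] / d], [m[1][0] * d, m[1][1]]])
        for d in d_grid
    )


M = [[1.0, 2.0], [0.5, 1.0]]
d_grid = [0.05 * k for k in range(1, 201)]  # d from 0.05 to 10
unscaled = max_singular_value(M)
upper = mu_upper_bound(M, d_grid)
print(unscaled, upper)
```

For this matrix the unscaled singular value overestimates the structured singular value, while the scaled bound closes the gap to the spectral radius of $M$, which is a lower bound for this block structure.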


Further Perspectives 

The robustness tests involve bounding $\mu_{\boldsymbol{\Delta}}(M(j\omega))$ over the
entire real axis. A common approach is to use a dense frequency gridding and upper/lower
bound calculations at each gridded point. The advantages, simplicity and trivial
parallelization, are offset by a disadvantage: the peak value (over $\mathbb{R}$) may
not be reflected accurately by the peak across the finite grid. In fact, such a
grid-based test determines the smallest $\Delta \in \boldsymbol{\Delta}$ which can cause
a pole to migrate from the left-half plane into the right-half plane at exactly one of
the frequency grid points (as opposed to any location). Nevertheless, with some
continuity assurances in place and a dense grid, this is often adequate knowledge for
most engineering decisions. However, the brute-force grid approach can be avoided by
treating the frequency variable ($\omega$) as an additional real parameter (since
$M(j\omega)$ is an LFT of $1/\omega$) (Ferreres et al. 2003). This is a generalization
of the Hamiltonian methods to compute the $H_\infty$ norm of a linear system without a
frequency grid, coupled with an alternative form of the upper bound (Young et al. 1995).
Moreover, if only the peak value (upper bound, say) across frequency is desired, this
approach can be fast, as some calculations rule out large frequency ranges as not
containing the peak.
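
The gridding pitfall is easy to reproduce in the scalar case, where a single $1 \times 1$ complex block gives $\mu = |M(j\omega)|$. The sketch below uses a hypothetical lightly damped second-order transfer function (an assumption chosen only because it has a narrow resonance) and compares the peak found on a coarse and on a dense logarithmic grid. All names and numbers are illustrative.

```python
import math


def mu_of_frequency(omega, zeta=0.01, omega_n=1.0):
    """|M(j*omega)| for M(s) = wn^2 / (s^2 + 2*zeta*wn*s + wn^2).

    For a scalar M and one 1x1 complex uncertainty block, mu equals |M|.
    """
    s = 1j * omega
    m = omega_n ** 2 / (s * s + 2 * zeta * omega_n * s + omega_n ** 2)
    return abs(m)


def log_grid(low, high, n):
    """n logarithmically spaced frequencies between low and high (inclusive)."""
    lo, hi = math.log10(low), math.log10(high)
    return [10.0 ** (lo + (hi - lo) * i / (n - 1)) for i in range(n)]


def grid_peak(grid):
    """Largest mu value observed on the given frequency grid."""
    return max(mu_of_frequency(w) for w in grid)


coarse_peak = grid_peak(log_grid(0.1, 10.0, 10))
dense_peak = grid_peak(log_grid(0.1, 10.0, 2001))
print(coarse_peak, dense_peak)
```

The coarse grid steps over the resonance and reports a peak more than an order of magnitude too small, which would wrongly certify robustness; the dense grid lands near the true peak.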

Improved upper bounds can be derived using higher-order arguments, changing the LMI
constraint into a sum-of-squares constraint (which ultimately is just a larger LMI).
Alternatively, branch-and-bound techniques are especially useful at reducing the
conservativeness of the ($D$, $G$) upper bound when there are several real parameters
($R > 0$) (Newlin and Young 1997).


Appendix: Interpolation Lemmas 


Two interpolation lemmas make the connection between robustness to constant-gain,
complex-valued uncertainties ($\Delta$) and stable, finite-dimensional, time-invariant
linear systems described by ODEs with real coefficients ($\Gamma$). Lemma 1 is used
(block by block and element by element on the relevant vector directions within each
block) to interpolate complex blocks causing singularity into real-rational blocks which
cause singularity at a particular frequency.


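Lemma 1 Given a positive $\bar{\omega} > 0$ and a complex
number $\delta$, with $\mathrm{Imag}(\delta) \neq 0$, there is
a $\beta > 0$ such that by proper choice of sign

$$\pm |\delta| \left. \frac{s - \beta}{s + \beta} \right|_{s = j\bar{\omega}} = \delta .$$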


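Lemma 2 Suppose $M \in \mathbb{C}^{n \times n}$ and $\bar{\omega} > 0$. If
$\Delta \in \boldsymbol{\Delta}$ satisfies $\det(I_n - M\Delta) = 0$, then there
is a $T \in \mathbf{T}$ with $\|T\|_\infty \leq \bar{\sigma}(\Delta)$ and
$\det(I_n - M T(j\bar{\omega})) = 0$.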
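
As an illustration of the scalar interpolation in Lemma 1, the sketch below assumes the interpolating function is the first-order all-pass $\pm|\delta|(s-\beta)/(s+\beta)$ evaluated at $s = j\bar{\omega}$; this form is an assumption made for the example. At $s = j\bar{\omega}$ the all-pass factor has unit magnitude and phase $\pi - 2\arctan(\bar{\omega}/\beta) \in (0, \pi)$, so the sign selects the half-plane of $\delta$ and $\beta$ matches the phase. The function name is hypothetical.

```python
import cmath
import math

def allpass_interpolation(delta, omega):
    """Return (sign, beta), beta > 0, such that
    sign * |delta| * (j*omega - beta) / (j*omega + beta) equals delta."""
    if delta.imag == 0:
        raise ValueError("delta must have a nonzero imaginary part")
    theta = cmath.phase(delta)                 # phase of delta in (-pi, pi)
    sign = 1.0 if theta > 0 else -1.0
    # Phase the all-pass factor must supply, always in (0, pi).
    target = theta if theta > 0 else theta + math.pi
    # pi - 2*atan(omega/beta) = target  =>  beta = omega * tan(target / 2)
    beta = omega * math.tan(target / 2.0)
    return sign, beta
```

Evaluating `sign * abs(delta) * (1j*omega - beta) / (1j*omega + beta)` with the returned values reproduces `delta` up to rounding.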


Summary and Future Directions 

The structured singular value, $\mu$, is a linear algebra
construct, defined to exactly deal with linear,
time-invariant uncertainty in linear systems. The
main issues are computational, focused on efficient
manners to compute reasonably tight upper
and lower bounds at each frequency and, more
specifically, ascertain the peak value across frequency.
Alternatives to the worst-case approach
to robustness analysis are gaining favor and may
be applicable in analysis and design situations
where the abstraction of a worst-case view is too
conservative (Calafiore et al. 2000).


Cross-References 

► Fundamental Limitation of Feedback Control

► KYP Lemma and Generalizations/Applications

► Linear Systems: Continuous-Time, Time-Invariant
State Variable Descriptions

► LMI Approach to Robust Control

► Optimization Based Robust Control

► Robust Control in Gap Metric

► Robust Fault Diagnosis and Control

► Robust $H_2$ Performance in Feedback
Control


Recommended Reading 

A comprehensive list of references, including 
theory, computations, and diverse applications 
would require many pages. The list below 
is minimal and does not do justice to the 
many researchers who have made significant 
contributions to this subject. In addition to the 
cited work, connections to Kharitonov’s theorem 
can be found in Chen et al. (1994). Textbooks, 
such as Dullerud and Paganini (2000) and Zhou 
et al. (1996), include derivations and additional 
citations. 
Abstract

Optimization problems arising in the control of
some important types of physical systems lead
naturally to problems in sub-Riemannian optimization.
Here we provide context and background
material on the relevant mathematics and
discuss some specific problem areas where these
ideas play a role.
Introduction

After a start in the early 1970s, over the last two
decades, sub-Riemannian geometry and the related
theory of subelliptic operators have become
popular topics in the control literature. Their
study is sometimes linked to questions involving
the dynamics and control of mechanical systems
with nonholonomic (nonintegrable) constraints
and the use of what has classically been called
quasi-coordinates because both subjects depend
on Lie algebraic techniques. However, here we
limit ourselves to problems in sub-Riemannian
optimization per se, describing how they arise in
various areas of physics and engineering. Most
famously, the second law of thermodynamics, as
recast by Carathéodory in differential geometric
form, provides an example of the reach of sub-
Riemannian geometry into the engineering world.

The statement of control theoretic problems
often begins with a description of the system of
interest in differential equation form:

$$\dot{x} = f(x) + \sum_i u_i g_i(x); \quad x \in X, \; u \in \mathbb{R}^m$$

with X an n-dimensional manifold. In well-
motivated control problems, n is almost always
larger than m; the dimension of the space of
controls is less than the dimension of the state
space. In the case of mechanical systems, the
phrase underactuated is sometimes used to
characterize this, but the situation is ubiquitous.
The analysis is complicated by the presence of the
immutable drift term f. When it is desired to use
an optimization principle to find a good choice
for u, one introduces a performance measure,
often of the form

$$\eta = \int_0^{t_1} L(x, u) \, dt$$


and attempts to minimize $\eta$ subject to whatever
constraints there may be on u and x. If there is
no drift term and if the Lie algebra generated by
$\{g_1, g_2, \ldots, g_m\}$ defines a distribution that spans
the tangent space of X at every point, the problem
falls under the purview of sub-Riemannian geometry.
In this case, one can describe the situation
as $\dot{x} = G(x)u$ with G being an x-dependent
rectangular matrix of rank m everywhere.
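
The spanning condition on the Lie algebra generated by the control vector fields can be checked numerically. The sketch below is illustrative only: it assumes the two vector fields $g_1 = (1, 0, -x_2)$ and $g_2 = (0, 1, x_1)$ of a three-state, two-input driftless system of the kind considered in the next section, and it uses a finite-difference Jacobian; the helper names are hypothetical.

```python
import numpy as np

def g1(x):
    return np.array([1.0, 0.0, -x[1]])

def g2(x):
    return np.array([0.0, 1.0, x[0]])

def jacobian(f, x, h=1e-6):
    """Central-difference Jacobian of the vector field f at x."""
    x = np.asarray(x, dtype=float)
    J = np.zeros((f(x).size, x.size))
    for k in range(x.size):
        e = np.zeros(x.size)
        e[k] = h
        J[:, k] = (f(x + e) - f(x - e)) / (2.0 * h)
    return J

def lie_bracket(f, g, x):
    """[f, g](x) = Dg(x) f(x) - Df(x) g(x)."""
    return jacobian(g, x) @ f(x) - jacobian(f, x) @ g(x)

def spans_tangent_space(x):
    """True if g1, g2 and [g1, g2] span R^3 at x."""
    x = np.asarray(x, dtype=float)
    M = np.column_stack([g1(x), g2(x), lie_bracket(g1, g2, x)])
    return np.linalg.matrix_rank(M) == 3
```

For these fields the bracket is the constant vector $(0, 0, 2)$, so two controls suffice to reach all three directions even though $G(x)$ has rank two.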

This entry is written from a control theory 
point of view. The problems discussed here pro¬ 
vided the impetus for some later mathematical 
work, often not discussing the motivation. The 
purely mathematical work is de-emphasized here, 
much as the mathematical work often gives little 
or no attention to the control theoretic work that 
preceded it. 

The Distance Function 

A prototype control problem leading to sub-Riemannian
geometry is that of steering the system
$\dot{x}_1 = u_1$, $\dot{x}_2 = u_2$, $\dot{x}_3 = x_1 u_2 - x_2 u_1$ from one
state to another while minimizing 



It might seem that this is just a minor change from
a standard shortest path problem in Riemannian
geometry, e.g., it might be thought as a limiting
case of a standard Riemannian geodesic problem
in which the infinitesimal length is given by

$$(ds)^2 = \begin{bmatrix} dx_1 & dx_2 & dx_3 \end{bmatrix} \begin{bmatrix} 1 & 0 & -y \\ 0 & 1 & x \\ -y & x & \epsilon + x^2 + y^2 \end{bmatrix} \begin{bmatrix} dx_1 \\ dx_2 \\ dx_3 \end{bmatrix}$$

and $\epsilon$ is allowed to go to zero. However, because
when $\epsilon$ equals zero this matrix is singular, it cannot
be used to define the equations for geodesics.
The most direct attack seems to be to use a
Lagrange multiplier to enforce the condition on
$x_3$, which leads to the minimization of
$$\eta = \int_0^1 \left[ \dot{x}_1^2 + \dot{x}_2^2 + \lambda (x_1 \dot{x}_2 - x_2 \dot{x}_1) \right] dt$$

This yields a set of $\lambda$-dependent linear equations
for $x_1$ and $x_2$. Solving these shows that the
projections of the minimum length trajectories
onto the $(x_1, x_2)$-plane are circular arcs.
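
A small numerical check makes the role of circular arcs concrete. The sketch below assumes the system $\dot{x}_1 = u_1$, $\dot{x}_2 = u_2$, $\dot{x}_3 = x_1 u_2 - x_2 u_1$ started at the origin and driven once around a circle of radius r in the $(x_1, x_2)$-plane over the unit time interval. It evaluates the resulting $x_3$ and the planar path length by the trapezoidal rule; the function name and discretization are illustrative.

```python
import numpy as np

def circle_loop(r, n=20001):
    """Drive (x1, x2) once around a circle of radius r through the origin.

    Returns the final value of x3 and the length of the planar path.
    """
    t = np.linspace(0.0, 1.0, n)
    dt = t[1] - t[0]
    x1 = r * (np.cos(2 * np.pi * t) - 1.0)
    x2 = r * np.sin(2 * np.pi * t)
    u1 = -2 * np.pi * r * np.sin(2 * np.pi * t)   # u1 = dx1/dt
    u2 = 2 * np.pi * r * np.cos(2 * np.pi * t)    # u2 = dx2/dt

    def trapezoid(f):
        return float(np.sum(0.5 * (f[1:] + f[:-1])) * dt)

    x3 = trapezoid(x1 * u2 - x2 * u1)             # twice the enclosed area
    length = trapezoid(np.sqrt(u1 ** 2 + u2 ** 2))
    return x3, length
```

The loop returns $(x_1, x_2)$ to zero while $x_3$ becomes twice the enclosed area, $2\pi r^2$, and the path length $2\pi r$ equals $\sqrt{2\pi |x_3|}$. This is consistent with a square-root growth of distance along the $x_3$ axis.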

In Riemannian geometry, the set of points 
which are of distance r from a given point will, 
for r sufficiently small, form a co-dimension 
one manifold diffeomorphic to a sphere. In this 
qualitative sense, Riemannian spaces are locally 
isotropic. In sub-Riemannian geometry, the set of
points of distance r > 0 from a distinguished
point $x_0$ does not have such a simple structure.
For example, for the problem just discussed, we
have the approximations

d = y x\ + x\ + |v 3 |/Oi + x\)
for |*31 (x\+ x\)

d = 2tt I jc 3 I — J%n(x\ + v|)|v 3 |
for y x\ + x\ |* 3 1

That is, for points bounded by paraboloids, defining
a region near the $(x_1, x_2)$-plane, the distance
is close to the Riemannian distance, whereas
in a cone containing the $x_3$ axis, the distance
is close to the square root of the Riemannian
distance. These approximations make it clear that
$d(x_1, x_2, x_3)$ is not differentiable at points on the
$x_3$ axis. There is much more that can be said here.
One interesting topic concerns the number of
trajectories that satisfy the first-order necessary
conditions and join a point to the origin.

More Examples

Consider the kinematic equations of the unicycle.
If $(x, y)$ are the coordinates of the center of the
wheel and $\theta$ is the heading angle, then these are

$$\dot{x} = \cos\theta \, u_2; \quad \dot{y} = \sin\theta \, u_2; \quad \dot{\theta} = u_1$$

It is of interest to generate a “shortest path” between
two points in $(x, y, \theta)$-space where shortest
is defined as the integral of some function
of $x, y, \theta, u_1, u_2$. This is typical of the kind of
path planning problems in which nonholonomic
constraints lead to sub-Riemannian problems. A
variety of such problems arise in robotics with
optimal steering programs for cars being one
example.

As an example involving a compact manifold,
let X be the space of 3-by-3 orthogonal matrices
and consider the system described by

$$\dot{x} = \begin{bmatrix} 0 & u_1 & u_2 \\ -u_1 & 0 & 0 \\ -u_2 & 0 & 0 \end{bmatrix} x$$

In this case, the manifold X is three dimensional
and the control space is two dimensional. If we
wish to minimize the integral of $u_1^2 + u_2^2$ subject
to $x(0) = x_0$ and $x(1) = x_1$, we have a typical
sub-Riemannian geodesic problem.

If the controls contain random effects, efforts
to analyze the situation lead to related problems
in stochastic processes. The most widely studied of
these are described by an Itô equation of the form

$$dx = f(x) \, dt + \sum_i g_i(x) \, dw_i$$

The corresponding equation for the evolution of
the probability density $\rho(t, x)$ can be put in the
form

$$\frac{\partial \rho(t,x)}{\partial t} = -\sum_i \frac{\partial}{\partial x_i} \big( f_i(x) \rho(t,x) \big) + \frac{1}{2} \sum_{i,j} \frac{\partial^2}{\partial x_i \partial x_j} \big( b_{ij}(x) \rho(t,x) \big)$$

However, rather than the right-hand side being
a fully elliptic operator, as it would be in a
typical heat equation (e.g., the Laplace-Beltrami
operator), the symmetric matrix $B(x) = (b_{ij}(x))$
is singular. If the $g_i$ satisfy the bracket-generating
condition, the density equation is said to be subelliptic.
The system described by the Itô equation

$$\begin{bmatrix} dx_1 \\ dx_2 \\ dx_3 \end{bmatrix} = \begin{bmatrix} -dt & dw_1 & dw_2 \\ -dw_1 & -dt/2 & 0 \\ -dw_2 & 0 & -dt/2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$


evolves on the two-sphere and the spectrum of the 
subelliptic operator is discrete. The diffusion time 
constants, i.e., the eigenvalues of the subelliptic 
operator, can be computed explicitly and com¬ 
pared with those of the fully elliptic operator, i.e., 
the standard Laplacian on the spherical shell. 
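
The drift terms $-dt$, $-dt/2$, $-dt/2$ in an Itô equation of this kind are what keep the state on a sphere. The following sketch assumes the equation has the form $dx = (D\,dt + \Omega_1\,dw_1 + \Omega_2\,dw_2)x$ with skew-symmetric $\Omega_1$, $\Omega_2$ and independent standard Wiener processes. It checks that the drift $D = \tfrac{1}{2}(\Omega_1^2 + \Omega_2^2)$ cancels the second-order Itô term in $d(x^T x)$. The matrix and function names are illustrative.

```python
import numpy as np

# Skew-symmetric generators multiplying dw1 and dw2 (assumed form).
OMEGA_1 = np.array([[0.0, 1.0, 0.0],
                    [-1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0]])
OMEGA_2 = np.array([[0.0, 0.0, 1.0],
                    [0.0, 0.0, 0.0],
                    [-1.0, 0.0, 0.0]])

def ito_drift():
    """Drift matrix D = (Omega_1^2 + Omega_2^2) / 2 of the Ito equation."""
    return 0.5 * (OMEGA_1 @ OMEGA_1 + OMEGA_2 @ OMEGA_2)

def norm_drift_matrix():
    """Matrix Q with d(x^T x) = x^T Q x dt (the dw terms vanish by skew symmetry)."""
    D = ito_drift()
    return D + D.T + OMEGA_1.T @ OMEGA_1 + OMEGA_2.T @ OMEGA_2
```

`ito_drift()` evaluates to the diagonal matrix with entries $-1, -1/2, -1/2$, and `norm_drift_matrix()` is the zero matrix, so $x^T x$ is conserved along sample paths.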

Much has been written on the ways in which 
subelliptic diffusion does, and does not, share the 
properties of the ordinary diffusion equation. 

A Special Structure 

A rich, and especially tractable, class of sub-
Riemannian problems comes from the following
situation. Suppose that $\mathcal{G}$ is a Lie group with Lie
algebra $\mathfrak{g}$ and that $\mathcal{H}$ is a closed subgroup with
Lie algebra $\mathfrak{h}$. According to one definition, the
pair $\mathcal{H} \subset \mathcal{G}$ is said to define a symmetric space
if the Lie algebra $\mathfrak{g}$, viewed as a vector space, is
the direct sum of $\mathfrak{h}$ and $\mathfrak{k}$ with $[\mathfrak{h}, \mathfrak{k}] \subset \mathfrak{k}$ and
$[\mathfrak{k}, \mathfrak{k}] \subset \mathfrak{h}$. Let x evolve in $\mathcal{G}$ as

$$\dot{x} = ux; \quad x \in \mathcal{G}, \; u \in \mathfrak{k}$$

For the sake of exposition, suppose that $\mathcal{G}$ is a
matrix Lie group. We look for paths joining $x_0$
and $x_1$ that are shortest in the sense that

$$\eta = \int_0^1 \|u\| \, dt$$

is minimized, where $\|u\|^2 = \mathrm{tr}(u^T u)$. (This leads
to the same trajectories as those which minimize
the integral of $\|u\|^2$.) To find the first-order necessary
conditions using the maximum principle,
define a Hamiltonian as $h(x, p, u) = \mathrm{tr}(p^T u x + u^T u)$.
Thus, $\dot{p} = -u^T p$ and minimizing over u
implies $2u = -\pi_1(x p^T)$ where $\pi_1$ is the projection
onto $\mathfrak{k}$. The product $m = x p^T$ satisfies $\dot{m} = [m, \pi_1(m)]$.
Using the structural properties of the
Lie algebra, we see that $(d/dt)\pi_0(m) = 0$ and
that $(d/dt)\pi_1(m) = [\pi_1(m), m_0]$. Working out
the implications, we see that trajectories of the
form $x(t) = e^{at} e^{(b-a)t}$ with $a \in \mathfrak{h}$ and $b \in \mathfrak{k}$
satisfy the first-order optimality conditions.

To illustrate, we consider the generalization of
an earlier example. Let X be the space of n-by-n
orthogonal matrices and consider the system
described by

$$\dot{x} = \begin{bmatrix} 0 & u_1 & u_2 & \cdots & u_{n-1} \\ -u_1 & 0 & 0 & \cdots & 0 \\ -u_2 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ -u_{n-1} & 0 & 0 & \cdots & 0 \end{bmatrix} x$$

Here the role of $\mathfrak{h}$ is played by the sub-algebra
of the set of real n-by-n skew-symmetric matrices
consisting of those whose first row and column
vanish and $\mathfrak{k}$ consists of the subset whose
lower-right (n − 1)-by-(n − 1) sub-matrix vanishes.
In this case, the paths satisfying the first-
order necessary conditions take the form
$x(t) = e^{ht} e^{(k-h)t} x(0)$.
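
The bracket relations that define the symmetric-space structure can be verified numerically for this example. The sketch below is illustrative: it draws random skew-symmetric matrices from the two subspaces (those whose first row and column vanish, and those whose lower-right block vanishes) and checks the two inclusions for the matrix commutator. All helper names are hypothetical.

```python
import numpy as np

def random_h(n, rng):
    """Random skew-symmetric matrix whose first row and column vanish."""
    S = rng.standard_normal((n, n))
    S = S - S.T
    S[0, :] = 0.0
    S[:, 0] = 0.0
    return S

def random_k(n, rng):
    """Random skew-symmetric matrix whose lower-right (n-1)x(n-1) block vanishes."""
    v = rng.standard_normal(n - 1)
    K = np.zeros((n, n))
    K[0, 1:] = v
    K[1:, 0] = -v
    return K

def bracket(a, b):
    return a @ b - b @ a

def in_h(M):
    return np.allclose(M, -M.T) and np.allclose(M[0, :], 0) and np.allclose(M[:, 0], 0)

def in_k(M):
    return np.allclose(M, -M.T) and np.allclose(M[1:, 1:], 0)
```

With `h = random_h(n, rng)` and `k1, k2 = random_k(n, rng), random_k(n, rng)`, the commutator `bracket(h, k1)` lies in the second subspace and `bracket(k1, k2)` lies in the first, which are the two inclusions required of a symmetric pair.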

Nonintegrability and Cyclic Processes 

Of course nonintegrable stands in opposition to
the word integrable, as it is used in the consideration
of integration performed along paths, e.g.,

$$I = \int_\gamma g_1(x) \, dx_1 + g_2(x) \, dx_2 + \cdots + g_n(x) \, dx_n$$

If the path $\gamma$ starts at $x_0$ and ends at $x_1$, then the
equality of mixed partials $\partial g_i / \partial x_j = \partial g_j / \partial x_i$
implies that along any two paths with these end
points, the integral has the same value, provided
that one of the paths can be continuously deformed
into the other with the $g_i$ being well
defined along the deformation. In particular, if
$\gamma$ is a closed curve so $x_0 = x_1$, then under these
assumptions, the integral is zero.

On the other hand, there is a large list of
important processes in biology and engineering,
such as those involving the thermodynamic cycles
of internal combustion engines or air conditioners,
that depend critically on nonintegrable
effects. These include cyclic phenomena such as
walking and breathing and widely used mechanisms
for efficient voltage conversion in electrical
engineering. Thus, both nature and technology
provide examples of processes in which the pistons,
valves, etc. move along a smooth path and at
the end of a cycle return to their initial configuration,
while a related integral is not zero. Perhaps
the best-known path problem of this type is the
Carnot cycle.

Questions about sub-Riemannian optimization 
enter here both as the optimization of the path 
defining the cycle and in the optimal regulation 
of the output of such cyclic processes. In general, 
the output can adjust both the amplitude and 
frequency of the cycle (volume of air per cycle 
and respiration rate), although in some cases one 
or the other of these might be fixed. For exam¬ 
ple, cruise control for automobiles regulates the 
frequency (rpm) of the engine but cannot adjust 
the stroke length of the pistons, whereas speed 
control of a running animal ordinarily involves 
adjusting both the length of the stride and the 
“steps” per minute. The primary considerations 
for these control processes are stability and re¬ 
sponse time, with the shape of the cycles being 
determined by some measure of efficiency. It 
seems that the optimization of such regulatory 
processes deserves more attention. 

Cross-References 

► Learning Theory 

► Markov Chains and Ranking Problems in Web 
Search 

► Modeling, Analysis, and Control with Petri 
Nets 

► Nonlinear Adaptive Control 

► Redundant Robots 


Recommended Reading 

Material on sub-Riemannian geometry can be 
found in the very readable survey (Strichartz 
1986) and in more depth in Gromov (1996). 
The examples discussed here have mostly come
from Brockett (1973a,b), Baillieul (1975), and
Brockett (1999); these papers
contain motivational material as well. Symmetric
spaces are discussed in the sub-Riemannian context
in Strichartz (1986), but for the optimization
aspect, see Brockett (1999). Brockett
(2003) studies the regulation of sub-Riemannian
cycles.
Abstract

An overview is given of the class of subspace
techniques (STs) for identifying linear, time-
invariant state-space models from input-output
data. STs do not require a parametrization of the
system matrices and as a consequence do not
suffer from problems related to local minima
that often hamper successful application of
parametric optimization-based identification
methods.

The overview follows the historic line of development.
It starts from Kronecker’s result on
the representation of an infinite power series by a
rational function and then addresses, respectively,
the deterministic realization problem, its stochastic
variant, and finally the identification of a state-
space model given in innovation form.

The overview summarizes the fundamental 
principles of the algorithms to solve the problems 
and summarizes the results about the statistical 
properties of the estimates as well as the practi¬ 
cal issues like choice of weighting matrices and 
the selection of dimension parameters in using 
these STs in practice. The overview concludes 
with probing some future challenges and makes 
suggestions for further reading. 

Keywords 

Extended observability matrix; Hankel matrix; 
Innovation model; State-space model; Singular 
value decomposition (SVD) 

Introduction 

Subspace techniques (STs) for system identifi¬ 
cation address the problem of identifying state- 
space models of MIMO dynamical systems. The
roots of ST were laid by the German mathematician
Leopold Kronecker (1823–1891).
Kronecker (1890) established that a
power series could be represented by a rational
function when the rank of the Hankel operator
with that power series as its symbol was finite. In
the early 1990s, new generalizations
of the idea of Kronecker were presented
for identifying linear, time-invariant (LTI)
state-space models from input-output data or output
data only. These new generalizations were
formulated from different perspectives, namely, 
within the context of canonical variate analysis 
(Larimore 1990), within a linear algebra context 
(Van Overschee and De Moor 1994; Verhae- 
gen 1994), and subspace splitting (Jansson and 
Wahlberg 1996). Despite their different origin, 
the close relationship between these methods was 
quickly established by a unifying theorem that 


interpreted these methods as a singular value 
decomposition (SVD) of a weighted matrix from 
which an estimate of the column space of the 
observability matrix or the row space of the state 
sequence of the given system or Kalman filter 
for observing the state of that system is derived 
(Van Overschee and De Moor 1995). This subspace
calculation is the key feature that gives
these methods the name ST for system identification or
subspace identification methods (SIM).

The STs are attractive complementary tech¬ 
niques to the maximum likelihood or prediction 
error framework. They do not require the user to 
specify a parametrization of the system matrices 
of the state-space model, and the user is not 
confronted with the problems due to possible 
local minima of a nonlinear parameter optimiza¬ 
tion method that is often necessary in estimating 
the parameters of a state-space model via, e.g., 
prediction error methods. Though the statisti¬ 
cal properties such as consistency and efficiency 
have been investigated, such as in Bauer and 
Ljung (2002), the estimates obtained via ST are 
in general not optimal in the statistical minimum 
variance sense. However, practical evidence with 
the use of ST in a wide variety of problems has 
indicated that ST provides accurate estimates. 
As such they are often used as an initialization 
to the maximum likelihood or prediction error 
parametric identification methods. 

In this chapter we make a distinction between 
output only or stochastic identification problems 
and input-output or combined deterministic- 
stochastic identification problems. The first 
occurs when identifying, e.g., the eigenmodes 
of a bridge from ambient acceleration responses 
of the bridge. The second occurs when, in 
addition to ambient excitations that cannot be 
directly measured, controlled excitations through 
actuators integrated in the system are used during 
the collection of the input-output data. 

The outline of this chapter is as follows. In the 
next section, we formulate the LTI state-space 
model identification problems and outline the 
general strategy of ST. The presentation of ST 
is given according to the historical development 
of ST. It starts with a summary of the solution 
to the deterministic realization problem, which 
considers the noise-free “impulse” response
of the system. Subsequently we present the 
stochastic realization problem which considers 
the output-only identification problem where 
the output is assumed to be a filtered zero- 
mean, white-noise sequence. The ST solution is 
discussed assuming samples of the covariance 
function of the output to be given. The 
deterministic-stochastic identification problem 
is considered in section “Combined Determin¬ 
istic-Stochastic ST.” In this section we first 
consider open-loop identification experiments. 
For this case, the basic linear regression problem 
is formulated that is at the heart of many 
ST. Second, reference is made to a framework
for analyzing and understanding the statistical
properties of ST, the selection of the order, as
well as to a number of open problems in the
understanding of important choices the user has to
make. Closed-loop identification experiments are
considered in the third part of section “Combined 
Deterministic-Stochastic ST,” while the fourth 
part makes a brief reference to ST papers that go 
beyond the LTI case. 

Finally we provide a brief overview on future 
research directions and conclude with some rec¬ 
ommended literature for further exploration. 

ST in Identification: Problems and 
Strategy 

The LTI system to be analyzed in this chapter is 
given by the following state-space model: 


$$x(k+1) = Ax(k) + Bu(k) + Ke(k)$$
$$y(k) = Cx(k) + Du(k) + e(k) \qquad (1)$$

with $u(k) \in \mathbb{R}^m$ the (measurable) input,
$e(k)$ a zero-mean, white-noise sequence with
$E[e(k)e(k)^T] = R$, $y(k) \in \mathbb{R}^\ell$ the (measurable)
output, and $x(k) \in \mathbb{R}^n$ the state vector. This
model is in the so-called innovation form since
the sequence $e(k)$ is the innovation signal in a
Kalman filtering context.

The historical sequence of ST developments 
considers the following open-loop problem for¬ 
mulations. In the deterministic realization prob¬ 
lem, the innovation sequence e(k ) is zero, and the 
input u(k) is an impulse. The stochastic realiza¬ 
tion problem considers the case where the input 
u(k) is zero and the given data is assumed to be 
samples of the covariance function of the output. 
The combined deterministic-stochastic identifica¬ 
tion problem considers the model (1) for generic 
input u(k). 
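
The three problem formulations differ only in which signals drive the model (1). A minimal simulation sketch, assuming NumPy arrays for the system matrices and signals stored row-wise in time, makes this explicit; the function name and argument layout are illustrative choices.

```python
import numpy as np

def simulate_innovation_model(A, B, C, D, K, u, e, x0=None):
    """Simulate x(k+1) = A x(k) + B u(k) + K e(k), y(k) = C x(k) + D u(k) + e(k).

    u has shape (N, m) and e has shape (N, l); returns y with shape (N, l).
    """
    A, B, C, D, K = (np.atleast_2d(np.asarray(M, dtype=float)) for M in (A, B, C, D, K))
    u = np.asarray(u, dtype=float)
    e = np.asarray(e, dtype=float)
    x = np.zeros(A.shape[0]) if x0 is None else np.asarray(x0, dtype=float)
    y = np.zeros((u.shape[0], C.shape[0]))
    for k in range(u.shape[0]):
        y[k] = C @ x + D @ u[k] + e[k]
        x = A @ x + B @ u[k] + K @ e[k]
    return y
```

Setting `e` to zero and `u` to a unit impulse reproduces the deterministic realization setting: the output is then D followed by the Markov parameters CB, CAB, and so on. Setting `u` to zero and `e` to white noise gives the output-only setting.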

The general strategy of ST is to formulate an 
intermediate step in deriving the parameters of 
the system matrices of interest from the given 
data; see Fig. 1. This intermediate step makes the 
ST different from the parametric model identifi¬ 
cation framework that aims for a direct estimation 
of the parameters of the system matrices by (in 
general) nonlinear parameter optimization tech¬ 
niques. The intermediate step in ST aims to deter¬ 
mine a matrix from the given data that reveals an 
(approximation of an) essential subspace of the 
unknown system. This essential subspace can be
the extended observability matrix of (1) as given
by the matrix $O_s$:

$$O_s = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{s-1} \end{bmatrix} \quad \text{for } s > n,$$

or the state sequence of a Kalman filter designed
for (1). Essential for ST is that both the intermediate
step to reveal the subspace of interest and
the subsequent derivation of the system matrices
from that subspace and the given data are done
via convex optimization methods and/or linear
algebra methods.

Subspace Techniques in System Identification, Fig. 1
Schematic representation of the intermediate step of
ST to derive from the given data (input-output data
$\{u(k), y(k)\}$, Markov parameters $\{CA^{j-1}B\}$, etc.) a subspace
revealing matrix, from which the subspace of interest
is computed via, e.g., singular value decomposition
and that enables the computation of the state-space model
realization by solving a (convex) linear least-squares problem.
The commonly used approach to directly go from
the given data to a state-space realization via in general
nonlinear parameter optimization methods is indicated by
the arrow directly connecting the given data box to the
state-space realization box. (Labels in the figure: Intermediate ST strategy;
Direct (Non-Linear) Parameter Optimization Strategy)


Realization Theory: The Progenitor
of ST

The Deterministic Realization Problem

In the 1960s, the cited result of Kronecker independently
inspired Ho and Kalman, Silverman,
and Youla and Tissi to present an algorithm
to construct a state-space model from a Hankel
matrix of impulse response coefficients (Schutter
2000). This breakthrough gave rise to the field
of realization theory. One key problem in realization
theory that paved the way for subspace
identification is the determination of a minimal
realization from a finite number of samples of
the impulse response of a deterministic system,
assumed to have a minimal representation as
in (1) for e(k) = 0. The samples of the impulse
response are called the Markov parameters.
The minimal realization sought for is the
LTI model with quadruple of system matrices
$[A_T, B_T, C_T, D]$, with $A_T \in \mathbb{R}^{n \times n}$ and n minimal
such that the pair $(A_T, C_T)$ is observable, the
pair $(A_T, B_T)$ is controllable, and the transfer
function $D + C_T(zI - A_T)^{-1}B_T$ equals
$D + C(zI - A)^{-1}B$ with z the complex variable of the
z-transform. When A is stable, the latter transfer
function can be written into the matrix power
series:

$$D + C(zI - A)^{-1}B = D + \sum_{j=1}^{\infty} CA^{j-1}B z^{-j} \qquad (2)$$

Following the cited result of Kronecker, the solution
to the minimum realization problem is based
on the construction of the (block-)Hankel matrix
$H_{s,N}$ constructed from the Markov parameters
$\{CA^{j-1}B\}_{j=1}^{N}$ as

$$H_{s,N} = \begin{bmatrix} CB & CAB & \cdots & CA^{N-s}B \\ \vdots & \vdots & & \vdots \\ CA^{s-1}B & CA^{s}B & \cdots & CA^{N-1}B \end{bmatrix} \qquad (3)$$

For the deterministic realization problem, the 
intermediate ST step simply is the storage of the 
impulse response data into a Hankel matrix. The 
subsequent step is to derive from this matrix a 
subspace from which the system matrices can 
be either read-off or computed via linear least 
squares. How this is done is outlined next. 

When the order n of the minimal realization is
known and the Hankel matrix dimension parameters
s, N are chosen such that

$$s > n, \quad N > 2n - 1 \qquad (4)$$

the Hankel matrix $H_{s,N}$ has rank n. A
numerically reliable way to compute that rank
is via the SVD of $H_{s,N}$. Under the assumption
that the rank of $H_{s,N}$ is n, we can denote that
SVD as $U_n \Sigma_n V_n^T$, with $\Sigma_n \in \mathbb{R}^{n \times n}$ positive
definite and with the columns of the matrices
$U_n$ and $V_n$ orthonormal. By the minimality
of (1) (for e(k) = 0), $H_{s,N}$ can be factored
as $O_s [B \;\; AB \;\; \cdots \;\; A^{N-s}B] = O_s C_{N-s+1}$ or as
$(U_n \Sigma_n^{1/2})(\Sigma_n^{1/2} V_n^T)$, and these factors are related
as

$$U_n \Sigma_n^{1/2} = O_s T^{-1} = O_{s,T}, \quad \Sigma_n^{1/2} V_n^T = T C_{N-s+1} = C_{N-s+1,T}$$

for $T \in \mathbb{R}^{n \times n}$ a nonsingular transformation.
Therefore $O_{s,T}$ resp. $C_{N-s+1,T}$ act as the
extended observability resp. controllability
matrix of a similarly equivalent triplet of system
matrices $(A_T, B_T, C_T)$. This correspondence
allows one to read off the system matrices $C_T$ and
$B_T$ as the first $\ell$ rows of the matrix $O_{s,T}$ and
the first m columns of $C_{N-s+1,T}$, resp. Further,
the shift-invariance property of the extended
observability resp. controllability matrices allows
one to find the system matrix $A_T$ of the minimal
realization. For example, consider the extended
observability matrix $O_{s,T}$; then the shift-invariance
property states that:

$$O_{s,T}(1 : (s-1)\ell, :) \, A_T = O_{s,T}(\ell + 1 : s\ell, :) \qquad (5)$$

where the notation M(u : v, :) indicates the
submatrix of M from rows u to rows v. The
shift-invariance property delivers a set of linear
equations from which the system matrix $A_T$ can
be computed via the solution of a linear least-
squares problem when s > n.
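
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the Markov parameters are assumed exact and stored as a list `[CB, CAB, ...]` of $\ell \times m$ arrays, the order n is assumed known, and the SVD factors are split symmetrically as $U_n\Sigma_n^{1/2}$ and $\Sigma_n^{1/2}V_n^T$ (one common choice). All function names are hypothetical.

```python
import numpy as np

def markov_parameters(A, B, C, N):
    """Return the list [CB, CAB, ..., CA^{N-1}B]."""
    out, X = [], B.copy()
    for _ in range(N):
        out.append(C @ X)
        X = A @ X
    return out

def block_hankel(markov, s):
    """Block-Hankel matrix with s block rows built from N Markov parameters."""
    N = len(markov)
    return np.block([[markov[i + j] for j in range(N - s + 1)] for i in range(s)])

def realize(markov, s, n):
    """Recover (A_T, B_T, C_T) of order n from the Markov parameters."""
    l, m = markov[0].shape
    H = block_hankel(markov, s)
    U, sv, Vt = np.linalg.svd(H, full_matrices=False)
    root = np.sqrt(sv[:n])
    O = U[:, :n] * root                    # extended observability matrix O_{s,T}
    Ctrl = root[:, None] * Vt[:n, :]       # extended controllability matrix
    C_T = O[:l, :]                         # first l rows
    B_T = Ctrl[:, :m]                      # first m columns
    # Shift invariance: O(1:(s-1)l, :) A_T = O(l+1:sl, :), solved by least squares.
    A_T, *_ = np.linalg.lstsq(O[:-l, :], O[l:, :], rcond=None)
    return A_T, B_T, C_T
```

The recovered triplet generally differs from (A, B, C) by a similarity transformation, so a sensible check is to compare invariants such as the eigenvalues of $A_T$ or the Markov parameters regenerated from $(A_T, B_T, C_T)$.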

Finding the dimension parameters s (and N)
of the Hankel matrix $H_{s,N}$ is a nontrivial problem
in general. When only the Markov parameters are
given and the knowledge that they stem from a
finite-order state-space model, a possible sequential
strategy is to select s and N equal to the
bounds in (4) for presumed orders n and
n + 1, respectively. When the rank of the Hankel
matrices for these two selections of s (and N) is
identical, the right dimensioning of the Hankel
matrix $H_{s,N}$ is found. Otherwise the presumed
order is increased by one.
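
The sequential strategy can be sketched as follows. The example assumes exact Markov parameters stored as a list of $\ell \times m$ arrays and takes the smallest integers satisfying (4), namely s = n + 1 and N = 2n, for each presumed order n. The numerical rank is decided with a relative singular-value threshold, which is an illustrative choice, and the function names are hypothetical.

```python
import numpy as np

def hankel_rank(markov, s, N, tol=1e-8):
    """Numerical rank of the block-Hankel matrix H_{s,N}."""
    H = np.block([[markov[i + j] for j in range(N - s + 1)] for i in range(s)])
    sv = np.linalg.svd(H, compute_uv=False)
    return int(np.sum(sv > tol * sv[0]))

def find_order(markov, max_order):
    """Increase the presumed order n until the Hankel rank stops growing.

    Requires at least 2 * (max_order + 1) Markov parameters.
    """
    for n in range(1, max_order + 1):
        rank_n = hankel_rank(markov, n + 1, 2 * n)
        rank_next = hankel_rank(markov, n + 2, 2 * (n + 1))
        if rank_n == rank_next:
            return rank_n
    return None
```

With noisy or estimated Markov parameters the rank decision is no longer clear-cut, and the threshold would have to be chosen from the observed gap in the singular values.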

The Stochastic Realization Problem 

The output-only identification problem aims at
determining a mathematical model from a measured
multivariate time series $\{y(k)\}_{k=1}^{N}$ with
$y(k) \in \mathbb{R}^\ell$. Such a model can then be used for
predicting future values of the (output) data from
past values.

In the vein of the revival of the work of 
Kronecker on realizing dynamical systems from 
its impulse response, Faure and a number of 
contemporaries like Akaike and Aoki made pi¬ 
oneering contributions to extend this methodol¬ 
ogy to stochastic processes (Van Overschee and 
De Moor 1993). These extensions are known as 
solutions to the stochastic realization problem. 

This problem is formulated for y(k) to be a
Markovian stochastic process. Reusing the notation
in (1), y(k) is assumed to be generated by
(1) with the input u(k) = 0. The A matrix in (1)
is again assumed to be stable. The given data in
the early formulations of the stochastic realization
problem were the samples of the covariance
function

$$R_y(j) = E[y(k) y(k-j)^T]$$

These samples define the strictly positive real
spectral density function of y(k):

$$\Phi_y(z) = \sum_{j=-\infty}^{\infty} R_y(j) z^{-j} > 0 \qquad (6)$$

Given the samples of the covariance function
$R_y(j)$, the stochastic realization problem was to
find an innovation model representation of the
form

$$x(k+1) = A_T x(k) + K_T e'(k)$$
$$y(k) = C_T x(k) + e'(k) \qquad (7)$$

with $e'(k)$ a zero-mean, white-noise input with
covariance matrix $R_e$, the pair $(A_T, C_T)$ observable,
and $A_T$ stable, such that the spectral density
function of the output of (7) equals $\Phi_y(z)$.

The partial similarity between this problem
and the minimal realization problem becomes
clear when expressing the covariance function
samples $R_y(j)$ in terms of the system matrices
in (1) for u(k) = 0 as

$$R_y(j) = CA^{j-1}G \quad \text{for } j \neq 0 \qquad (8)$$

with the matrices G and $R_y(0)$ derived from the
following covariance expressions:

$$E[x(k)x(k)^T] = \Sigma_x : \quad \Sigma_x = A \Sigma_x A^T + KRK^T \qquad (9)$$
$$E[x(k+1)y(k)^T] = G : \quad G = A \Sigma_x C^T + KR \qquad (10)$$
$$E[y(k)y(k)^T] = R_y(0) : \quad R_y(0) = C \Sigma_x C^T + R \qquad (11)$$

Since the spectral density has a two-sided series
expansion, there is a so-called forward stochastic
realization problem (considering $R_y(j)$ for
j > 0 only) and a backward version. Here we
only treat the forward one. Drawing the parallel
between the samples of the covariance function
$R_y(j)$, as given in (6)–(8), and the Markov
parameters in (2), we can use the deterministic
tools from realization theory to find a minimal
realization $(A_T, C_T, G_T)$.

The intermediate ST step in the stochastic
realization problem is the construction of a Hankel
matrix similar to the matrix $H_{s,N}$ as in the
deterministic realization problem but now from
the samples of the covariance function $R_y(j)$
in (8).

With the triplet $(A_T, C_T, G_T)$ determined, the
innovation model (7) is classically completed via
the solution of a Riccati equation in the unknown
$\Sigma_x$. This Riccati equation results by noting that
R > 0, and therefore, $KRK^T$ can be written as
$KR(R)^{-1}R^TK^T$. This reduces the expression for
$\Sigma_x$ in (9) with the help of (10) and (11) as

$$\Sigma_x = A \Sigma_x A^T + (G - A \Sigma_x C^T)(R_y(0) - C \Sigma_x C^T)^{-1}(G - A \Sigma_x C^T)^T \qquad (12)$$

By replacing the triplet (A, C, G) with the found
minimal realization $(A_T, C_T, G_T)$ in this Riccati
equation, its solution $\Sigma_{x,T}$ enables in the end to
define the missing quantities as

$$R_e = R_y(0) - C_T \Sigma_{x,T} C_T^T$$
$$K_T = (G_T - A_T \Sigma_{x,T} C_T^T) R_e^{-1} \qquad (13)$$

By the positive realness of $\Phi_y(z)$ and the similar
equivalence between the triplets $(A_T, C_T, G_T)$
and (A, C, G), the solution $\Sigma_{x,T}$ is positive definite.
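
The completion of the innovation model can be sketched numerically. The example below is a simplified illustration under stated assumptions: the covariance data G and $R_y(0)$ are exact, the Riccati equation (12) is solved by plain fixed-point iteration started from zero (a simple scheme chosen for brevity, not necessarily what would be used in practice), and the gain and innovation covariance follow (13). The helper for the Lyapunov equation (9) uses a Kronecker-product linear solve. All function names are hypothetical.

```python
import numpy as np

def solve_discrete_lyapunov(A, Q):
    """Solve X = A X A^T + Q via vectorization (small problems only)."""
    n = A.shape[0]
    vec = np.linalg.solve(np.eye(n * n) - np.kron(A, A), Q.reshape(-1))
    return vec.reshape(n, n)

def covariance_data(A, C, K, R):
    """G and R_y(0) from equations (9)-(11) for a known innovation model."""
    Sigma = solve_discrete_lyapunov(A, K @ R @ K.T)
    G = A @ Sigma @ C.T + K @ R
    Ry0 = C @ Sigma @ C.T + R
    return G, Ry0, Sigma

def complete_innovation_model(A, C, G, Ry0, iters=500):
    """Iterate the Riccati equation (12) from zero, then apply (13)."""
    Sigma = np.zeros_like(A)
    for _ in range(iters):
        M = G - A @ Sigma @ C.T
        Sigma = A @ Sigma @ A.T + M @ np.linalg.solve(Ry0 - C @ Sigma @ C.T, M.T)
    Re = Ry0 - C @ Sigma @ C.T
    K = (G - A @ Sigma @ C.T) @ np.linalg.inv(Re)
    return Sigma, K, Re
```

Starting from a known model with A − KC stable, generating (G, $R_y(0)$) with `covariance_data` and passing them to `complete_innovation_model` returns the original K and R up to the iteration tolerance.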

A persistent problem in solving the stochastic
realization problem has existed for a long
time when using approximate values of the samples
$R_y(j)$. This problem is that the estimated
power spectrum based on estimates of the triplet
$(A_T, C_T, G_T)$ is no longer positive real.

An approximate solution overcoming the 
problem of the loss of positive realness of the 


estimated power spectrum was provided in the 
vein of the ST developed in the early 1990s as 
discussed in the next section. 


Combined 

Deterministic-Stochastic ST 

Identification of LTI MIMO Systems in 
Open Loop 

Since the golden 1960s and 1970s, many attempts
have been made to make the insights from
deterministic and stochastic realization theory
useful for system identification. To mention a few,
there are attempts to use the solutions to the
deterministic realization problem with measured or
estimated impulse response data. One such method is
known as the eigensystem realization algorithm
(ERA) (Juang and Pappa 1985) and has been used for
modal analysis of flexible structures, like bridges,
space structures, etc. Although these methods tend
to work well in practice for resonant structures
that vibrate (strongly), they did not work well for
other types of systems or for inputs other than an
impulse. Extensions to the stochastic realization
problem considered the use of finite sample average
estimates of the covariance function as an attempt
to make the method work with finite-length data
sequences. As indicated in section “The Stochastic
Realization Problem,” these approximations of the
covariance function tended to violate the positive
realness property of the underlying power spectrum.

In the early 1990s, new breakthroughs were made by
working directly with the input-output data of an
assumed LTI system, without the need to first
compute the Markov parameters or estimate the
samples of covariance functions. Pioneers who
contributed to these breakthroughs were Van
Overschee and De Moor, introducing the N4SID
approach (Van Overschee and De Moor 1994);
Verhaegen, introducing the MOESP approach
(Verhaegen 1994); and Larimore, presenting ST in
the framework of canonical variate analysis (CVA)
(Larimore 1990).

These three pioneering contributions considered
the identification of the state-space model (1)
from the input-output data $\{u(k), y(k)\}_{k=1}^{N}$
recorded in open loop. The pair $(A, C)$ was
assumed to be observable, and the pair $(A, KR)$
controllable. The innovation noise covariance
matrix $R$ was assumed to be positive definite.

The formulation of the intermediate ST step from
which these three pioneering contributions can be
derived (by weighting the result of Theorem 1), and
which is at the heart of many more variants, is
summarized in Theorem 1. This theorem requires two
preparations: first, storing the input and output
sequences in (block-)Hankel matrices and relating
these Hankel matrices via the model parameters and,
second, making three observations about the model
(1) when presented in the prediction form. This
form is obtained by replacing $e(k)$ in the state
equation by $y(k) - Cx(k) - Du(k)$ and is given by

$$x(k+1) = (A - KC)\,x(k) + (B - KD)\,u(k) + K\,y(k)$$

$$y(k) = C\,x(k) + D\,u(k) + e(k) \tag{14}$$

To compact the notation we make the following
substitutions: $\bar{A} = (A - KC)$ and
$\bar{B} = [(B - KD) \;\; K]$.
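The equivalence between the innovation form and the prediction form (14) can be checked numerically. The sketch below uses a scalar system with hypothetical parameter values and zero initial state in both recursions; the function names are illustrative only.

```python
import random


def simulate_innovation_form(a, b, c, d, k, u, e):
    """x(k+1) = a x + b u + k e,  y = c x + d u + e (scalar innovation model)."""
    x, xs, ys = 0.0, [], []
    for uk, ek in zip(u, e):
        xs.append(x)
        ys.append(c * x + d * uk + ek)
        x = a * x + b * uk + k * ek
    return xs, ys


def run_predictor(a, b, c, d, k, u, y):
    """x(k+1) = (a - k c) x + (b - k d) u + k y, the prediction form (14)."""
    x, xs = 0.0, []
    for uk, yk in zip(u, y):
        xs.append(x)
        x = (a - k * c) * x + (b - k * d) * uk + k * yk
    return xs


random.seed(0)
n = 50
u = [random.gauss(0, 1) for _ in range(n)]
e = [random.gauss(0, 0.1) for _ in range(n)]
a, b, c, d, k = 0.9, 1.0, 0.5, 0.2, 0.4      # hypothetical values
xs, ys = simulate_innovation_form(a, b, c, d, k, u, e)
xs_pred = run_predictor(a, b, c, d, k, u, ys)
max_gap = max(abs(p - q) for p, q in zip(xs, xs_pred))
print(max_gap)   # differs from zero only by floating-point rounding
```

The predictor is driven by measured quantities only (`u` and `y`), which is what makes this form useful for identification.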

Let the Hankel matrix with the “future” part
$\{y(k)\}_{k=p+1}^{N}$ be defined as

$$Y_f = \begin{bmatrix} y(p+1) & y(p+2) & \cdots & y(N-f+1) \\ y(p+2) & y(p+3) & \cdots & y(N-f+2) \\ \vdots & \vdots & & \vdots \\ y(p+f) & y(p+f+1) & \cdots & y(N) \end{bmatrix} \tag{15}$$

for the dimensioning parameters $p$ and $f$ selected
such that

$$p > f > n$$

In a similar way we define the Hankel matrices
$U_f$ and $E_f$ from the input $u(k)$ and the
innovation $e(k)$, respectively. Then with the
definition of the (block-)Toeplitz matrix $T_u$ from
the quadruple of system matrices $(A, B, C, D)$ as

$$T_u = \begin{bmatrix} D & 0 & \cdots & 0 \\ CB & D & \cdots & 0 \\ CAB & CB & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ CA^{f-2}B & CA^{f-3}B & \cdots & D \end{bmatrix}$$

and similarly the definition of the Toeplitz matrix
$T_e$ from the quadruple of system matrices
$(A, K, C, I)$, we can relate the data Hankel
matrices $Y_f$ and $U_f$ as

$$Y_f = O_f \begin{bmatrix} x(p+1) & \cdots & x(N-f+1) \end{bmatrix} + T_u U_f + T_e E_f = O_f X_f + T_u U_f + T_e E_f \tag{16}$$

Based on the prediction form (14), three key
observations are made to support the rationale of
the intermediate step summarized in Theorem 1:

O1: The standard assumption that the transfer
function from $e(k)$ to $y(k)$ is minimum
phase leads to the fact that the matrix $\bar{A}$
is stable. Therefore, there exists a finite integer
$p$ such that

$$\bar{A}^p \approx 0$$

O2: The state-space model (14) has inputs
$u(k)$ and $y(k)$. Grouping both together into
the new vector $z(k) = \begin{bmatrix} u(k) \\ y(k) \end{bmatrix}$
enables us to express the state $x(k+p)$ as

$$x(k+p) = \bar{A}^p x(k) + \sum_{j=1}^{p} \bar{A}^{j-1} \bar{B}\, z(k+p-j)$$

for $k \geq 1$. With the assumption that
$\bar{A}^p \approx 0$ and the definition of the
input-output data vector sequence
$Z(k) = [z(k)^T \cdots z(k+p-1)^T]^T$, we have the
following approximation of the state:

$$x(k+p) \approx [\bar{A}^{p-1}\bar{B} \;\cdots\; \bar{B}]\, Z(k) = C_z Z(k)$$

As such the state sequence $X_f$ in (16) can
be approximated by

$$C_z Z_p = C_z \begin{bmatrix} Z(1) & Z(2) & \cdots & Z(N-f-p+1) \end{bmatrix}.$$
O3: The (approximate) knowledge of the
row space of the state sequence in $X_f$
makes the unknown system matrices
$(A, B, C, D, K)$ appear (approximately)
linearly in the model (14).
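The second observation can be illustrated numerically. The sketch below assumes a scalar state with hypothetical values for $\bar{A}$ and $\bar{B}$ and drives the predictor recursion with random data, since only the algebraic identity is being checked; indices are zero-based, and `state_from_past` is an illustrative name.

```python
import random


def state_from_past(a_bar, b_bar, z_window):
    """Evaluate C_z Z(k) = sum_j a_bar^(j-1) * b_bar . z(k+p-j) for a scalar state."""
    p = len(z_window)
    total = 0.0
    for j in range(1, p + 1):
        zu, zy = z_window[p - j]              # z(k+p-j) = (u, y) at that instant
        total += a_bar ** (j - 1) * (b_bar[0] * zu + b_bar[1] * zy)
    return total


random.seed(1)
a_bar, b_bar = 0.6, (0.8, 0.4)                # hypothetical A-bar and B-bar = [B - KD, K]
n, p = 60, 12
z = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]

# Exact predictor recursion x(k+1) = a_bar x(k) + b_bar z(k), started away from zero.
x = [1.0]
for zu, zy in z:
    x.append(a_bar * x[-1] + b_bar[0] * zu + b_bar[1] * zy)

errors = [abs(x[k + p] - state_from_past(a_bar, b_bar, z[k:k + p]))
          for k in range(n - p)]
bound = a_bar ** p * max(abs(v) for v in x)   # size of the neglected term a_bar^p x(k)
print(max(errors), bound)
```

The approximation error equals the neglected term $\bar{A}^p x(k)$, so it shrinks geometrically as the past window $p$ grows.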

The intermediate ST step to retrieve a matrix
with relevant subspaces is summarized in the
following theorem taken from Peternell et al.
(1996).

Theorem 1 (Peternell et al. 1996) Consider the
model (1) with all stochastic processes assumed
to be ergodic and with the input $u(k)$
statistically uncorrelated with the innovation
$e(\ell)$ for all $k$, $\ell$. Consider the following
least-squares problem:

$$\begin{bmatrix} \hat{L}_N^u & \hat{L}_N^z \end{bmatrix} = \arg\min_{L^u,\, L^z} \left\| Y_f - \begin{bmatrix} L^u & L^z \end{bmatrix} \begin{bmatrix} U_f \\ Z_p \end{bmatrix} \right\|_F^2 \tag{17}$$

with $\|\cdot\|_F$ denoting the Frobenius norm of a
matrix, then

$$\lim_{N \to \infty} \hat{L}_N^z = O_f C_z + O_f \bar{A}^p \Delta_z$$

with $\Delta_z$ a bounded matrix.

The theorem delivers the matrix $\hat{L}_N^z$ via the
solution of a convex linear least-squares problem.
Asymptotically in the number of measurements $N$,
this matrix shares its column space with the
extended observability matrix $O_f$; asymptotically
in both the number of measurements and the
dimension parameter $p$, it shares its row space
with the matrix $C_z$. Based on the expression of the
state sequence $X_f$ given in observation O2 above,
the estimate of the row space of $C_z$ delivers an
estimate of the row space of the state sequence
$X_f$. Observation O3 then shows that this
intermediate step allows an estimate of the system
matrices $[A, B, C, D, K]$ (up to a similarity
transformation) to be derived via a linear
least-squares problem.
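One common way to carry out this last step is sketched below. It is an illustration under stated assumptions rather than the specific procedure of any of the cited papers: the rank-$n$ SVD factors $U_n$, $S_n$, $V_n$ and the estimates $\hat{x}(k)$, $\hat{e}(k)$ are notation introduced here only for the sketch.

$$\hat{L}_N^z Z_p \approx U_n S_n V_n^T, \qquad \hat{X}_f = S_n^{1/2} V_n^T = \begin{bmatrix} \hat{x}(p+1) & \cdots & \hat{x}(N-f+1) \end{bmatrix}$$

$$(\hat{C}, \hat{D}) = \arg\min_{C,\,D} \sum_k \| y(k) - C\hat{x}(k) - Du(k) \|_2^2, \qquad \hat{e}(k) = y(k) - \hat{C}\hat{x}(k) - \hat{D}u(k)$$

$$(\hat{A}, \hat{B}, \hat{K}) = \arg\min_{A,\,B,\,K} \sum_k \| \hat{x}(k+1) - A\hat{x}(k) - Bu(k) - K\hat{e}(k) \|_2^2$$

Both minimizations are ordinary linear least-squares problems once $\hat{X}_f$ is fixed, which is the sense in which the unknown matrices appear linearly in observation O3.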


Towards Understanding the Statistical 
Properties 

Many ST variants for system identification using
data recorded in open loop have been developed
since the early 1990s. These variants mainly differ
in the use of weighting matrices $W_l$ and $W_r$ in
the product $W_l \hat{L}_N^z W_r$ prior to computing
the subspaces of interest. The effect of these
weighting matrices on the accuracy and the
statistical properties of the estimated model is
not yet fully understood, nor is that of the
dimensioning parameters $p$ and $f$ in the definition
of the data Hankel matrices $Y_f$, $U_f$, $Z_p$.
Results have been achieved only under very specific
restrictions. For example, Bauer and Ljung (2002)
show that the weighting matrices selected to
represent the CVA approach (Larimore 1990) yield an
optimal minimum variance estimate when the input
$u(k)$ in (1) is either absent or zero-mean white
noise, the order $n$ of the underlying system is
known, and the dimension parameter $f$ goes to
infinity in addition to the dimension parameter $p$
and the number of data points $N$. A framework for
analyzing statistical properties such as
consistency and asymptotic distribution of the
estimates determined by the class of STs discovered
in the 1990s is given in Bauer (2005).

The minimum variance property of the estimates
by the CVA approach (Larimore 1990) has not yet
been proven theoretically for more generic and
practically relevant experimental conditions. In
these cases, the choice of the weighting matrices
and of the dimensioning parameters $f$ and $p$, as
well as the selection of the system order, is often
left to the user. Despite this fact, practical
evidence has shown that STs are able to accurately
identify state-space models for LTI MIMO systems
under industrially realistic circumstances. As such
they are by now accepted and widely used as a
common engineering tool in various areas, such as
model-based control, fault diagnostics, etc.
Further, they generally provide excellent initial
estimates for the nonlinear parametric optimization
in prediction error or maximum likelihood
estimation methods.

Identification of LTI MIMO Systems in 
Closed Loop 

The least-squares problem (17) in Theorem 1 leads
to biased estimates when using input-output data
recorded in a closed-loop identification
experiment. This is because of the correlation
between the measurable input and the innovation
sequence. A number of solutions have been developed
to overcome this problem. We refer to van der Veen
et al. (2013) for an overview of a number of these
remedies. A simple and effective remedy, based on
the work in Chiuso (2010), is described here. To
avoid biased estimates, the intermediate ST step
estimates a high-order vector autoregressive model
with exogenous inputs, a so-called VARX model:

$$\min_{\Theta,\, D} \sum_{k=1}^{N-p} \| y(k+p) - \Theta Z(k) - D u(k+p) \|_2^2 \tag{18}$$

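Using the result on the approximation of the state
vector $x(k+p)$ in observation O2, it can be
shown that the solution $\hat{\Theta}$ of (18) is an
approximation of the parameter vector:

$$\Theta = [C\bar{A}^{p-1}\bar{B} \;\cdots\; C\bar{B}]$$

Then using this solution $\hat{\Theta}$ and O1 above leads
to the following “subspace revealing matrix” (cf.
Fig. 1):

$$\begin{bmatrix} C\bar{A}^{p-1}\bar{B} & C\bar{A}^{p-2}\bar{B} & \cdots & C\bar{A}^{p-f}\bar{B} & \cdots & C\bar{B} \\ 0 & C\bar{A}^{p-1}\bar{B} & \cdots & C\bar{A}^{p-f+1}\bar{B} & \cdots & C\bar{A}\bar{B} \\ \vdots & & \ddots & \vdots & & \vdots \\ 0 & 0 & \cdots & C\bar{A}^{p-1}\bar{B} & \cdots & C\bar{A}^{f-1}\bar{B} \end{bmatrix} \tag{19}$$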
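The shift structure of (19) is easy to see in code. The sketch below treats each block $C\bar{A}^{i}\bar{B}$ as a scalar (a simplifying assumption; in general each entry is a matrix block) and stacks $f$ right-shifted copies of the estimated parameter row; the function name and the numerical values are hypothetical.

```python
def subspace_revealing_matrix(theta, f):
    """Stack f shifted copies of theta = [m_{p-1}, ..., m_1, m_0] as in (19).

    m_i stands for the (here scalar) block C * Abar**i * Bbar. Row i is theta
    shifted right by i positions, with zeros filled in from the left, which
    uses observation O1 to neglect all blocks with exponent p or higher.
    """
    p = len(theta)
    if f > p:
        raise ValueError("need f <= p")
    return [[0.0] * i + list(theta[:p - i]) for i in range(f)]


# Hypothetical scalar example: C = 1, Abar = 0.5, Bbar = 2, p = 4, f = 3.
c, a_bar, b_bar, p, f = 1.0, 0.5, 2.0, 4, 3
theta = [c * a_bar ** i * b_bar for i in range(p - 1, -1, -1)]
matrix = subspace_revealing_matrix(theta, f)
for row in matrix:
    print(row)
```

In practice `theta` would come from the least-squares solution of (18) rather than from known system matrices.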


As in the open-loop case of section
“Identification of LTI MIMO Systems in Open Loop,”
column and row weighting matrices as well as
changing the size of the subspace revealing matrix
(19) can be used to influence the accuracy of the
estimates (Chiuso 2010). The subspace of interest
of this weighted subspace revealing matrix is its
row space, which is an approximation of that of the
state sequence $X_f$ as in (16), now extended to
make the size compatible with the weighted version
of (19). As in the open-loop case, knowledge of
this subspace turns the estimation of the system
matrices $[A, B, C, D, K]$ (up to a similarity
transformation) into a linear least-squares
problem. The asymptotic statistical properties of
this closed-loop ST and the treatment of the
dimensioning parameters have also been studied in
Chiuso (2010). There it is proven that the
asymptotic variance of any system invariant of the
model estimated via the above closed-loop ST is a
nonincreasing function of the dimensioning
parameter $f$ when the input $u(k)$ to the plant is
generated by an LQG controller with a white-noise
reference input.

Beyond LTI Systems 

The summarized discrete-time ST methodology
has been extended in various ways. Important
extensions, with representative papers, include
continuous-time systems (van der Veen et al. 2013),
the use of frequency-domain data (Cauberghe 2006),
and different classes of nonlinear systems, like
block-oriented Wiener and/or Hammerstein and linear
parameter-varying systems (van Wingerden and
Verhaegen 2009). ST for linear time-varying
systems with changing dimension of the state
vector is treated in Verhaegen and Yu (1995), and
finally we mention the developments to make ST
recursive (van der Veen et al. 2013).
Summary and Future Directions

Subspace techniques aim at simplifying the
system identification cycle and making it more
user-friendly. A number of challenges persist
in pursuing this general goal. A critical
one is the “optimal” selection of the weighting
matrices and the dimensioning parameters $p$ and
$f$ of the subspace revealing matrix. Optimality
here can be expressed, e.g., by the minimality
of the variance of the estimates but could
also be viewed more generally in relationship
with the use of the model, e.g., in terms of
the performance of a model-based closed-loop
design. A profound theoretical framework is
necessary to fully automate the selection of the
weighting matrices and dimensioning and order
indices. This would substantially contribute to
fully automated identification procedures (for
linear systems).

A second challenge is to better integrate
ST with robust controller design. This requires
the assessment of the model quality and the
selection of an optimal input. Particular to
the integration of ST with control design is the
striking similarity of data equations used in ST
and model predictive control. The challenge is
to further exploit this similarity to develop
data-driven model predictive control methodologies
that are robust w.r.t. the identified model
uncertainty.

One interesting development in ST is the use
of regularization via the nuclear norm in order to
improve the model order selection with respect
to, e.g., SVD-based ST in Liu and Vandenberghe
(2010).

A final challenge is to extend ST for LTI systems
to other classes of dynamic systems, such as
nonlinear, hybrid, and large-scale systems.

Cross-References 

► Linear Systems: Discrete-Time, Time-Invariant 
State Variable Descriptions 

► Realizations in Linear Systems Theory 


► Sampled-Data Systems 

► System Identification: An Overview 


Recommended Reading 

The recommended readings for further study are
the books that appeared on the topic of subspace
identification. In the books Verhaegen and Verdult
(2007) and Katayama (2005), the topic of
subspace identification is treated in a wider
context for classroom teaching at the MSc level,
since more elaborate topics relevant to the
understanding of ST are treated, such as key
results from linear algebra, linear least squares,
and Kalman filtering. The book Van Overschee and
De Moor (1996) is focused on subspace
identification only and also emphasizes the success
of ST in various applications. All these books
provide access to numerical implementations for
getting hands-on experience with the methods. The
integration of subspace methods with other
identification approaches is done in the toolbox
(Ljung 2007).

There also exist a number of overview articles.
An overview of the early developments of ST since
the 1990s is given in Viberg (1995). Here also the
link between ST for identifying dynamical systems
and the signal processing application of
direction-of-arrival problems was clearly made. A
more recent overview article is van der Veen et al.
(2013). In this article reference is also made to
the statistical analysis and closed-loop
application of ST.

Many papers have appeared reporting successful
application of subspace methods in practice. We
refer to the book Van Overschee and De Moor (1996)
and the overview paper van der Veen et al. (2013).
Supervisory Control of Discrete-Event Systems

Abstract

We introduce background and base model for 
supervisory control of discrete-event systems, 
followed by discussion of optimal controller 
existence, a small example, and summary of 
control under partial observations. Control 
architecture and symbolic computation are noted 
as approaches to manage state space explosion. 
Introduction

Discrete-event (dynamic) systems (DES or 
DEDS) constitute a relatively new area of 
control science and engineering, which has 
taken its place in the mainstream of control 
research. Recently, DES have been combined 
with continuous systems in an area called hybrid 
systems. 
Problems and methods for DES have been
investigated for some time, although not
necessarily with a “control” flavor. The parent
domains can be identified as operations research
and software engineering.

Operations research deals with systems of
interconnected stores and servers which operate
on processed items. For instance, manufacturing
systems employ queues, buffers, and bins (which
store workpieces). These are served by machines,
robots, and automatic guided vehicles (AGVs),
which process workpieces. The main problems
are to measure quantitative performance and
establish trade-offs, for instance flow vs. cost,
and to optimize design parameters such as buffer
size and maintenance frequency.

The relevant areas of software engineering
include operating systems control, concurrent
computing, and real-time (embedded or reactive)
systems, with focus on synchronization algorithms
that enforce mutual exclusion and resource
sharing in the presence of concurrency, as in the
classical problems of Readers & Writers and
Dining Philosophers. The main objectives are
(i) to guarantee safety (“Nothing bad will ever
happen”), as in mutual exclusion and deadlock
prevention, and (ii) to guarantee liveness
(“Something good will happen eventually”), for
instance, successful computational termination
and eventual access to a desired resource.


DES from a Control Viewpoint 

With these domains in mind, we consider DES
from a control viewpoint. In general, control
deals with dynamic systems, defined as entities
consisting of an internal state space, together
with a state-evolution or transition structure, and
equipped (for control purposes) with both an
input mechanism for actuation and an output
channel for observation and feedback. The
objective of control is to bring together
information and dynamics in some purposeful
combination: the interplay between observation and
control or decision-making is fundamental.

In this framework, a DES is a dynamic system
that is discrete, in time and usually in state
space; is asynchronous or event driven, that is
driven by events or instantaneous happenings in
time (which may or may not include the tick
of a clock); and is nondeterministic, namely,
embodies internal chance or other unmodeled
mechanisms of choice which govern its state
transitions. With a manufacturing system, for
example, the dynamic state might include the
status of machines (idle, working, down, under
maintenance or repair), the contents of queues
and buffers, and the locations and loads of robots
and AGVs, while transitions (discrete events)
occur when queues and buffers are incremented
or decremented, robots load or unload, and
machines start work, finish work, or break down
(the “choice” between finishing work successfully
and breaking down being thus nondeterministic).
In this example and many others, the
objectives of design and analysis include logical
correctness in the presence of concurrency
and timing constraints, and quantitative
performance such as rates of production, all of
which depend crucially on feedback control
synthesis and optimization. To this end the models
will tend to be DES or hybrid systems. Nevertheless
one finds the continuing relevance of standard
control-theoretic concepts like feedback,
stability, controllability, and observability,
along with their roles in large-system
architectures embodying hierarchical,
decentralized, and distributed functional
organization.

Here we focus on models and problems from
which explicit constraints of timing are absent
and which can be considered in a framework of
finite-state machines and the corresponding
regular languages. While the theory has been
generalized to more flexible and technically
advanced settings, our restricted framework is
already rich enough to support numerous
applications and remains challenging for large
systems of industrial size.


Base Model for Control of DES 

The formal structure of a DES to be controlled
will resemble the simple “machine” called
MACH shown in Fig. 1. The state set of MACH
is $Q = \{I, W, D\}$, interpreted as Idle, Working,
or Broken Down. MACH is initialized at state
$q_0 = I$, denoted by an entering arrow without
source. The transition structure is displayed in
Fig. 1 as a transition diagram, whose nodes are
the states $q \in Q$ and edges are the transitions,
each labeled with a symbol $\sigma$ in the alphabet
$\Sigma$, here $\{w, c, b, r\}$. If a transition (labeled)
$\sigma$ is an edge from $q$ to $q'$, then “the event
$\sigma$ can occur at state $q$.” Transitions (or events)
are interpreted as instantaneous in time, while
states are thought of as locations where MACH is
able to reside for some indeterminate time interval.
The occurrence of $w$ means “MACH enters the
Working state from Idle” and similarly for $c$, $b$, $r$.
These transitions determine the state-transition
function of MACH, denoted by $\delta : Q \times \Sigma \to Q$.
Thus $\delta(I, w) = W$, $\delta(W, b) = D$, and so on.
Notice that $\delta$ is a partial function, defined at
each state $q \in Q$ for only a subset of event
(labels) in $\Sigma$. To denote that $\delta(q, \sigma)$ is
defined at state $q \in Q$ for the event $\sigma \in \Sigma$,
we write $\delta(q, \sigma)!$. The function $\delta$ can be
extended in a standard way to
$\delta : Q \times \Sigma^* \to Q$, where $\Sigma^*$ is the set of
all finite strings of elements of $\Sigma$, including the
empty string $\epsilon$. Thus $\delta(q, \epsilon) := q$ and
inductively if $q' := \delta(q, s)!$, then

$$\delta(q, s.\sigma) := \delta(\delta(q, s), \sigma) := \delta(q', \sigma)$$

whenever $\delta(q', \sigma)!$. Graphically the strings
$s = \sigma_1 \ldots \sigma_k \in \Sigma^*$ for which $\delta(q, s)!$
are precisely those for which there exists a path in
the transition diagram starting from $q$ and having
successive edges labeled $\sigma_1, \ldots, \sigma_k$.
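A minimal sketch of MACH and its extended transition function follows. The transitions for `c` and `r` are inferred from the production cycle and the work-breakdown-repair cycle described in the text, both of which return to Idle; the dictionary encoding and the name `delta_ext` are illustrative choices.

```python
# Partial transition function of MACH:
# w: Idle -> Working, c: Working -> Idle, b: Working -> Down, r: Down -> Idle.
DELTA = {("I", "w"): "W", ("W", "c"): "I", ("W", "b"): "D", ("D", "r"): "I"}


def delta_ext(q, s):
    """Extended transition function; returns None where delta(q, s) is undefined."""
    for sigma in s:
        q = DELTA.get((q, sigma))
        if q is None:
            return None
    return q


print(delta_ext("I", ""))      # the empty string leaves the state unchanged
print(delta_ext("I", "wbr"))   # work, break down, repair
print(delta_ext("I", "c"))     # undefined: c cannot occur at Idle
```

Returning `None` plays the role of "not defined", so a string $s$ satisfies $\delta(q, s)!$ exactly when `delta_ext(q, s)` is not `None`.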

We call any subset of $\Sigma^*$ (i.e., any set of
strings of elements from $\Sigma$) a language over
$\Sigma$ and accordingly speak of sublanguages of a
language over $\Sigma$.

For MACH, the execution of a production cycle,
namely the event sequence (or string) $w.c$, or
a work-breakdown-repair cycle, the string $w.b.r$,
can be considered successful, and the corresponding
string is said to be marked. States which are
entered by marked strings are marked states and
identified in a transition diagram by an outgoing
arrow with no target. In Fig. 1, the only marked
state happens to be the initial state, which is thus
shown with a double arrow; in general there could
be several marked states, which may or may
not include the initial state. The marked states
comprise a subset $Q_m \subseteq Q$, which may be empty
(at one extreme) or equal to $Q$ (at the other).
The case $Q_m = Q$ (all states marked) would
imply that every string of events is considered
as significant or successful as any other, while
the case $Q_m = \emptyset$ (no state marked, so there are
no successful strings) plays a technical role in
computation.

In general a generator is a tuple
$G = (Q, \Sigma, \delta, q_0, Q_m)$ usually interpreted
physically as for MACH above, but mathematically
consisting merely of the finite-state set $Q$, finite
alphabet $\Sigma$, marked subset $Q_m \subseteq Q$, with
initial state $q_0 \in Q$, and (partial) transition
function $\delta : Q \times \Sigma \to Q$. Additionally we
bring in the closed behavior $L(G)$ of $G$, defined
as all the strings of $\Sigma^*$ which $G$ can generate
starting from the initial state, in the sense

$$L(G) := \{s \in \Sigma^* \mid \delta(q_0, s)!\}.$$

Of central importance also is the marked behavior
of $G$, namely, the sublanguage of $L(G)$ given by

$$L_m(G) := \{s \in L(G) \mid \delta(q_0, s) \in Q_m\}.$$

We need several definitions. A string $s'$ is a
prefix of a string $s \in \Sigma^*$, written $s' \leq s$,
if $s'$ can be extended to $s$, namely, there exists a
string $w$ in $\Sigma^*$ such that $s'.w = s$. The closure
of a language $M \subseteq \Sigma^*$ is the language
$\overline{M}$ consisting of all prefixes of strings in $M$:

$$\overline{M} := \{s' \in \Sigma^* \mid s' \leq s \text{ for some } s \text{ in } M\}$$
A language $N$ over $\Sigma$ is (prefix-)closed if it
contains all its prefixes, namely, $N = \overline{N}$.
In this notation $G$ is said to be nonblocking if
$L(G) = \overline{L_m(G)}$, namely, any (generated)
string in $L(G)$ is a prefix of, and so can be
extended to, a marked string of $G$.

The semantics of $G$ (its mathematical meaning)
is simply the pair of languages $L_m(G)$, $L(G)$.
In general the latter may be infinite subsets
of $\Sigma^*$, while $G$ itself is a finite object,
considered to represent an algorithm for the
generation of its behaviors. Unless $G$ is trivial
(has empty state set), it is always true that
$\epsilon \in L(G)$.
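The definitions of closed behavior, marked behavior, and nonblocking can be sketched for MACH as follows. Because both languages are infinite, the enumeration is truncated at a chosen length; the nonblocking test is done on states instead (every reachable state must be able to reach a marked state), which is equivalent for a finite generator. All function names are illustrative.

```python
from itertools import product

DELTA = {("I", "w"): "W", ("W", "c"): "I", ("W", "b"): "D", ("D", "r"): "I"}
SIGMA, Q0, MARKED = "wcbr", "I", {"I"}


def run(s, q=Q0):
    """State reached by string s from q, or None if s cannot be generated."""
    for sigma in s:
        q = DELTA.get((q, sigma))
        if q is None:
            return None
    return q


def behaviors(max_len):
    """Strings of L(G) and L_m(G) of length at most max_len."""
    closed, marked = set(), set()
    for n in range(max_len + 1):
        for s in map("".join, product(SIGMA, repeat=n)):
            q = run(s)
            if q is not None:
                closed.add(s)
                if q in MARKED:
                    marked.add(s)
    return closed, marked


def reachable(start, edges):
    """States reachable from the set start along the given (src, dst) edges."""
    seen, frontier = set(start), list(start)
    while frontier:
        q = frontier.pop()
        for src, dst in edges:
            if src == q and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return seen


def is_nonblocking():
    forward = [(src, dst) for (src, _), dst in DELTA.items()]
    backward = [(dst, src) for src, dst in forward]
    return reachable({Q0}, forward) <= reachable(MARKED, backward)


closed, marked = behaviors(2)
print(sorted(closed), sorted(marked), is_nonblocking())
```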

Transition labeling of $G$ is deterministic: at
every $q$, at most one transition is defined for each
given event $\sigma$, namely,

$$\delta(q, \sigma) = q' \ \&\ \delta(q, \sigma) = q'' \text{ implies } q' = q''.$$

It is quite acceptable, however, that at distinct
states $q$ and $r$, both $\delta(q, \sigma)!$ and
$\delta(r, \sigma)!$ (where these evaluations are usually
not equal).

To formulate a control problem for $G$, we
first adjoin a control technology or mechanism
by which $G$ may be actuated to affect its temporal
behavior, namely, determine the strings it
is permitted to generate. To this end we assume
that a subset of events $\Sigma_c \subseteq \Sigma$, called
the controllable events, are capable of being
enabled or disabled by an external controller.
Think of a traffic light being turned green or red
to allow or prohibit passage (vehicle transition)
through an intersection. The complementary event
subset $\Sigma_u := \Sigma - \Sigma_c$ is uncontrollable;
events $\sigma \in \Sigma_u$ cannot be externally disabled
but may be considered permanently enabled. For
$G$ = MACH one might reasonably assume
$\Sigma_c = \{w, r\}$, $\Sigma_u = \{c, b\}$. At a given state
$q$ of $G$, it will be true in general that
$\delta(q, \sigma)!$ both for some (controllable) events
$\sigma \in \Sigma_c$ and for some (uncontrollable) events
$\sigma \in \Sigma_u$. Among the $\sigma \in \Sigma_c$, at a given
time, some may be externally enabled and others
disabled. So, $G$ will nondeterministically choose
its next generated event from the subset

$$\{\sigma \in \Sigma_u \mid \delta(q, \sigma)!\} \cup \{\sigma \in \Sigma_c \mid \delta(q, \sigma)! \ \&\ \sigma \text{ is externally enabled}\} \tag{1}$$


We formalize external enablement by a supervisory
control function $V : L(G) \to \mathrm{Pwr}(\Sigma)$,
where $\mathrm{Pwr}(\cdot)$ stands for power set. For
$s \in L(G)$, the evaluation $V(s)$ is defined to be
the event subset

$$V(s) := \Sigma_u \cup \{\sigma \in \Sigma_c \mid \sigma \text{ is externally enabled following } s\} \tag{2}$$

In other words, the set (1) is expressible as

$$V(s) \cap \{\sigma \in \Sigma \mid s.\sigma \in L(G)\} \tag{3}$$

namely, the subset of events that, immediately
following the generation of $s$ by $G$, are either
enabled by default (executable events in $\Sigma_u$) or
else by the external controller’s decision (a subset
of executable events in $\Sigma_c$).

It is now easy to visualize how the generating
action of $G$ is restricted by the action of $V(\cdot)$.
Initially (having generated the empty string) $G$
chooses $\sigma_1 \in V(\epsilon)$. Proceeding inductively,
after $G$ has generated
$s = \sigma_1 \cdots \sigma_k \in L(G)$, $s$ is
fed back to the controller, which evaluates $V(s)$
according to (2), announcing the result to $G$,
which then chooses $\sigma_{k+1}$ in (3), and the process
repeats. Of course the process would terminate
any time the set (3) happened to become empty
(although it need not). In any case, we denote the
subset of $L(G)$ so determined as $L(V/G)$, called
the closed behavior of $V/G$, where the latter
symbol (formally undefined) stands for $G$ under
the supervision of $V$. It is clear that supervision
is a feedback process (Fig. 2), inasmuch as the
choice of $\sigma_{k+1}$ in (3) is not, in general, known
in advance, hence must be executed before the
succeeding evaluation $V(s.\sigma_{k+1})$ can allow the
generating process to continue. With the closed
behavior of $V/G$ now determined, we define the
marked behavior

$$L_m(V/G) := L(V/G) \cap L_m(G) \tag{4}$$

namely, those marked strings of $G$ that survive
under supervision by $V$. Thus supervisory control
is nonblocking if $L(V/G) = \overline{L_m(V/G)}$.
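The feedback process just described can be simulated directly. In the sketch below the supervisor `repair_at_most_once` is a hypothetical example chosen for illustration, not one taken from the text, and the enumeration of $L(V/G)$ is truncated at a maximum string length.

```python
DELTA = {("I", "w"): "W", ("W", "c"): "I", ("W", "b"): "D", ("D", "r"): "I"}
SIGMA_C, SIGMA_U = {"w", "r"}, {"c", "b"}


def state_after(s, q="I"):
    """State of MACH after generating s (s is assumed to lie in L(G))."""
    for sigma in s:
        q = DELTA[(q, sigma)]
    return q


def closed_behavior_under(v, max_len):
    """Enumerate L(V/G) up to max_len by repeatedly forming the set (3)."""
    language, frontier = {""}, [""]
    while frontier:
        s = frontier.pop()
        if len(s) == max_len:
            continue
        q = state_after(s)
        eligible = {sigma for (src, sigma) in DELTA if src == q}
        # Uncontrollable events are always enabled, as in definition (2).
        for sigma in (v(s) | SIGMA_U) & eligible:
            language.add(s + sigma)
            frontier.append(s + sigma)
    return language


def repair_at_most_once(s):
    """Hypothetical supervisor: always enable w; enable r only before the first repair."""
    return {"w"} | ({"r"} if "r" not in s else set())


print(sorted(closed_behavior_under(repair_at_most_once, 4)))
```

This particular supervisor is blocking: after the string `wbrwb` the machine is down and repair is disabled, so no marked string can be reached. Avoiding such outcomes is what the nonblocking condition is for.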
Fig. 2 Feedback loop V/G


Existence of Controls for DES: 
Controllability 

Of fundamental interest is the question: what
sublanguages of $L(G)$ qualify as a language
$L(V/G)$ for some choice of supervisory control
function $V$? In other words, what is the scope of
controlled behavior(s) for a given $G$? So far we
know that $L(V/G)$ is a sublanguage of $L(G)$, but it
is not usually the case that an arbitrary
sublanguage would qualify. For instance, the empty
string language $\{\epsilon\} \neq L(V/G)$ for any $V$ as
in (2) above, in case $\delta(q_0, \sigma)!$ for some
$\sigma$ in $\Sigma_u$, for such $\sigma$ cannot be disabled.

Assume $G$ is equipped with the technology of
controllable events, hence uncontrollable events
$\Sigma_u \subseteq \Sigma$. We make the basic definition: the
language $K \subseteq \Sigma^*$ is controllable (with respect
to $G$) provided

For all $s \in \overline{K}$ and for all $\sigma \in \Sigma_u$,
whenever $s.\sigma \in L(G)$ then $s.\sigma \in \overline{K}$. (5)

Informally, a string $s$ can never exit from
$\overline{K}$ as the result of the execution by $G$
of an uncontrollable event: $\overline{K}$ is invariant
under the uncontrollable flow. In terms of
$G$ = MACH, above, the languages $\{\epsilon\}$,
$\{wb, wc\}$ are controllable, but $\{w\}$, $\{w, wcw\}$
are not. For instance, $H := \{w, wcw\}$ has closure
$\overline{H} = \{\epsilon, w, wc, wcw\}$, which contains the
string $s := w$, but $sb = wb$ can be executed in
MACH, $b$ is uncontrollable, and $sb$ has exited
from $\overline{H}$. It is logically trivial from (5) that
the empty language $\emptyset$ (with no strings whatever)
is controllable.
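For finite languages, condition (5) can be checked by brute force on the prefix closure. The sketch below reproduces the four MACH examples above; it assumes the languages are given as finite Python sets of strings, which is a simplification of the general (regular-language) setting, and the function names are illustrative.

```python
DELTA = {("I", "w"): "W", ("W", "c"): "I", ("W", "b"): "D", ("D", "r"): "I"}
SIGMA_U = {"c", "b"}


def in_closed_behavior(s, q="I"):
    """True if the string s belongs to L(MACH)."""
    for sigma in s:
        q = DELTA.get((q, sigma))
        if q is None:
            return False
    return True


def closure(language):
    """All prefixes of all strings in a finite language."""
    return {s[:i] for s in language for i in range(len(s) + 1)}


def is_controllable(language):
    """Condition (5), checked on the prefix closure of a finite language."""
    k_bar = closure(language)
    return all(s + sigma in k_bar
               for s in k_bar for sigma in SIGMA_U
               if in_closed_behavior(s + sigma))


for k in [{""}, {"wb", "wc"}, {"w"}, {"w", "wcw"}, set()]:
    print(sorted(k), is_controllable(k))
```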

We can now answer the fundamental question
posed above.

Given a nonempty sublanguage $K \subseteq L(G)$,
there exists a supervisory control function $V$
such that $\overline{K} = L(V/G)$, if and only if
$K$ is controllable. (6)

This result exhibits the $L(V/G)$ property in a
structured way; furthermore, both the containment
$K \subseteq L(G)$ and the controllability property
(5) (or its absence) can be effectively
(algorithmically) decided in case $K$ itself is the
closed or marked behavior of some given DES over
$\Sigma$.

A key fact easily provable from (5) is that the
family of all controllable languages (with respect
to a fixed $G$) is algebraically closed under union,
namely,

If $K_1$ and $K_2$ are controllable languages,
then so is $K_1 \cup K_2$. (7)

In fact (7) can be extended to an arbitrary finite or
infinite union of controllable languages.

Given $G$ as above, considered as the plant to
be controlled, suppose a new (regular) language
$E$ is specified, as the maximal set of strings
that we are prepared to tolerate for generation
by $G$; for instance, $E$ could be considered the
legal language for $G$ (irrespective of what $G$ is
potentially capable of generating, namely, $L(G)$).
Let us confine attention to the sublanguage of $E$
that contains only marked strings of $G$, namely,
$E \cap L_m(G)$. We now bring in the family
$\mathcal{C}(E \cap L_m(G))$ of all controllable
sublanguages of $E \cap L_m(G)$ (including the empty
language). From (7) and its infinite extension,
there follows the existence of the controllable
language

$$K_{\sup} := \bigcup \{K \mid K \in \mathcal{C}(E \cap L_m(G))\} \tag{8}$$

We have $K_{\sup} \subseteq E \cap L_m(G)$, and clearly if
$K'$ is controllable and $K' \subseteq E \cap L_m(G)$, then
$K' \subseteq K_{\sup}$. $K_{\sup}$ is therefore the supremal
(largest) controllable sublanguage of $E \cap L_m(G)$.
Furthermore, if $K_{\sup}$ is nonempty, then by (6)
there exists a supervisory control $V$ such that
$\overline{K}_{\sup} = L(V/G)$; in this sense $V$ is optimal
(maximally permissive), allowing the generation
by $G$ of the largest possible set of marked strings
that the designer considers legal. We have thus
established abstractly the existence and uniqueness
of an optimal control for given $G$ and $E$. This
simple conceptual picture is displayed (Fig. 3)
as a Hasse diagram, in which nodes represent
sublanguages of $\Sigma^*$ and rising lines (edges) the
relation of sublanguage containment.

Fig. 3 Hasse diagram

In a Hasse diagram it could be that $K_{\sup}$
collapses to the empty language $\emptyset$. This means that
there is no supervisory control for the problem
considered, either because the specifications are
too severe and the problem is over-constrained
or because the control technology is inadequate
(more events need to be controllable).
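For a finite set of legal marked strings, the supremal controllable sublanguage can be found by fixed-point pruning: repeatedly discard every string that passes through a prefix from which an uncontrollable event leaves the current closure. The sketch below is an illustration restricted to finite languages over MACH; it is not the automaton-based algorithm whose complexity is quoted for the general regular case, and the function names are illustrative.

```python
DELTA = {("I", "w"): "W", ("W", "c"): "I", ("W", "b"): "D", ("D", "r"): "I"}
SIGMA_U = {"c", "b"}


def in_closed_behavior(s, q="I"):
    for sigma in s:
        q = DELTA.get((q, sigma))
        if q is None:
            return False
    return True


def closure(language):
    return {s[:i] for s in language for i in range(len(s) + 1)}


def supremal_controllable(language):
    """Largest controllable subset of a finite language, by fixed-point pruning."""
    k = set(language)
    while True:
        k_bar = closure(k)
        # Prefixes from which an uncontrollable event escapes the closure.
        bad = {s for s in k_bar for sigma in SIGMA_U
               if in_closed_behavior(s + sigma) and s + sigma not in k_bar}
        pruned = {s for s in k if not any(s.startswith(b) for b in bad)}
        if pruned == k:
            return k
        k = pruned


# Forbidding breakdown is too severe: the machine may never be started.
print(sorted(supremal_controllable({"", "wc"})))
# Allowing the repair cycle as well makes the specification controllable as given.
print(sorted(supremal_controllable({"", "wc", "wbr"})))
```

The first call returns only the empty string, so the supervisor must disable `w` altogether; dropping the empty string from that specification makes the result collapse to the empty language.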

Supervisory Control of Discrete-Event Systems, Fig. 4
Implementation of V/G (KSUP synchronized with G)

Under the finite-state assumption, $K_{\sup}$ is
effectively representable by a DES KSUP, which
may serve as the optimal feedback controller,
as displayed in Fig. 4. Here a string s generated
by G drives KSUP; at each state of KSUP,
the events defined in its transition structure are
exactly those available to G for nondeterministic
execution (in its corresponding state) at the next
step of the process. In this way the feedback
control process is inductively well defined. The
computational complexity of this design (cf.
(8)) is $O(|\mathbf{E}|^2 \cdot |G|^2)$, where $\mathbf{E}$ is a DES with
$L_m(\mathbf{E}) = E$ and $|\cdot|$ denotes state size. The
controller state size is $|\mathrm{KSUP}| \le |\mathbf{E}| \cdot |G|$, the
product bound being of typical order.
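The fixpoint behind (8) can be sketched in a few lines. The following is a minimal illustration, not the algorithm of any particular tool: it assumes each DES is given as a hypothetical triple `(initial_state, marked_states, delta)` with `delta` a dict from `(state, event)` to the next state, forms the product of plant and specification, and then alternately removes states where an uncontrollable plant event would leave the product and states that are not reachable or not coreachable. The function names `supcon` and `_closure` are illustrative only.

```python
from collections import defaultdict, deque


def _closure(starts, edges, allowed):
    """States of `allowed` reachable from `starts` along `edges`, staying inside `allowed`."""
    found = {s for s in starts if s in allowed}
    stack = list(found)
    while stack:
        s = stack.pop()
        for t in edges[s]:
            if t in allowed and t not in found:
                found.add(t)
                stack.append(t)
    return found


def supcon(plant, spec, uncontrollable):
    """Return (states, transitions) of a trim automaton for K_sup (possibly empty)."""
    (g0, g_marked, g_delta), (e0, e_marked, e_delta) = plant, spec
    init = (g0, e0)

    # Reachable product: an event occurs only if plant and specification both allow it.
    delta, seen, queue = {}, {init}, deque([init])
    while queue:
        x, y = queue.popleft()
        for (src, ev), x_next in g_delta.items():
            if src == x and (y, ev) in e_delta:
                tgt = (x_next, e_delta[(y, ev)])
                delta[((x, y), ev)] = tgt
                if tgt not in seen:
                    seen.add(tgt)
                    queue.append(tgt)

    fwd, bwd = defaultdict(set), defaultdict(set)
    for (src, _ev), tgt in delta.items():
        fwd[src].add(tgt)
        bwd[tgt].add(src)
    marked = {(x, y) for (x, y) in seen if x in g_marked and y in e_marked}

    good = set(seen)
    while True:
        # Controllability: every uncontrollable plant event must stay inside `good`.
        bad = {(x, y) for (x, y) in good for (src, ev) in g_delta
               if src == x and ev in uncontrollable
               and delta.get(((x, y), ev)) not in good}
        kept = good - bad
        # Nonblocking: keep only reachable and coreachable states.
        kept = _closure({init}, fwd, kept) & _closure(marked, bwd, kept)
        if kept == good:
            break
        good = kept
    trans = {k: t for k, t in delta.items() if k[0] in good and t in good}
    return good, trans


# Toy plant: 'u' is uncontrollable and leads to a state the specification forbids.
plant = (0, {0}, {(0, "a"): 1, (1, "b"): 0, (1, "u"): 2, (2, "c"): 0})
spec = ("ok", {"ok"}, {("ok", "a"): "ok", ("ok", "b"): "ok"})
states, trans = supcon(plant, spec, uncontrollable={"u"})
```

In this toy case the supervisor must already disable `a` at the initial state, because once `a` has occurred the uncontrollable event `u` can no longer be prevented.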


Supervisory Control Design: Small Factory

The following example, Small Factory (SF), is an
illustration of supervisor design. As in Fig. 5, SF
consists of two machines MACH1 and MACH2,
each similar to MACH above, connected by a
buffer BUF of capacity 2. In case of breakdown
the machines can be repaired by a SERVICE
facility as shown. Transition structures of the
machines and design specifications are also
displayed in Fig. 5. Events in $\Sigma_c$ ($\Sigma_u$) are odd (even)
numbered. When self-looped with all irrelevant
events to form BUFSPEC, the latter specifies
that the machines must be controlled in such a
way that BUF is not overflowed (an attempt by
MACH1 to deposit a workpiece in BUF when
it is full) or subject to underflow (an attempt by
MACH2 to take a workpiece from BUF when it
is empty). In addition, SERVICE must enforce
priority of repair for MACH2: when the latter
is down, repair of MACH1 (if in progress) must
be interrupted and only resumed after MACH2
has been repaired; this logic is expressed by
BRSPEC (appropriately self-looped). To form
the plant model G for the DES to be controlled,
we compute the synchronous product of MACH1
and MACH2. The result, say G = FACT, is a
DES of which the components MACHi are free
to execute their events independently except for
synchronization on events that are shared (here,
none). Similarly we form the synchronous
product of BUFSPEC and BRSPEC to obtain the full
specification DES SPEC. We now execute the
optimization step in the Hasse diagram (Fig. 3);
this yields the SF controller KSUP(21,47) with
21 states and 47 transitions. Online
synchronization of KSUP with FACT will result in
generation of the optimal controlled behavior $K_{\sup}$ by
the feedback loop. Since $K_{\sup} \subseteq L_m(G)$ by (8),
our marking conventions ensure that KSUP is
nonblocking.
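The synchronous product used to build FACT can be illustrated with a short sketch. The three-state machine model below (idle, working, down) and its event numbering are assumptions made for illustration, since the transition structure of MACH is given only in Fig. 5; the function names are likewise hypothetical. A DES is represented as `(initial_state, alphabet, delta)`.

```python
def sync_product(a, b):
    """Reachable synchronous product of two DES given as (initial, alphabet, delta)."""
    (a0, a_sigma, a_delta), (b0, b_sigma, b_delta) = a, b
    sigma = a_sigma | b_sigma
    init = (a0, b0)
    delta, seen, stack = {}, {init}, [init]
    while stack:
        x, y = stack.pop()
        for ev in sigma:
            # A component moves only on events of its own alphabet;
            # a shared event requires both components to be ready for it.
            x_next = a_delta.get((x, ev)) if ev in a_sigma else x
            y_next = b_delta.get((y, ev)) if ev in b_sigma else y
            if x_next is None or y_next is None:
                continue
            tgt = (x_next, y_next)
            delta[((x, y), ev)] = tgt
            if tgt not in seen:
                seen.add(tgt)
                stack.append(tgt)
    return init, sigma, delta


def machine(i):
    """Hypothetical machine i: odd events controllable, even events uncontrollable."""
    start, finish, breakdown, repair = 10 * i + 1, 10 * i, 10 * i + 2, 10 * i + 3
    delta = {("I", start): "W", ("W", finish): "I",
             ("W", breakdown): "D", ("D", repair): "I"}
    return "I", {start, finish, breakdown, repair}, delta


fact = sync_product(machine(1), machine(2))
fact_states = {fact[0]} | {src for (src, _ev) in fact[2]} | set(fact[2].values())
```

Because the two alphabets are disjoint, the machines interleave freely and the state count is the full product of the component state counts, which is the mechanism behind the state explosion discussed later in the entry.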

In general the language $K_{\sup}$ will include in its
structure not only the constraints required by
control but also the physical constraints enforced by
the plant structure itself (here, FACT). The latter
are thus redundant in the online synchronization
of the plant with the controller KSUP. A more
economical controller is obtained if the plant
constraints are projected out of KSUP to obtain
a reduced controller, say KSIM. Mathematically,
projection amounts to constructing a control
congruence or dynamically (and control) consistent
partition on the state set of KSUP and taking
the cells of this partition, abstractly, as the new
states for KSIM. In SF, KSUP(21,47) is reduced
to KSIM(5,18), which when synchronized with
FACT yields exactly KSUP but is less than
one-quarter the state size. In practice a state size
reduction factor of ten or more is not uncommon.


Supervisor Architecture and Computation

As noted earlier, the state size |KSUP| of
controller KSUP is on the order of the product
of state sizes of the plant, say |PLANT|, and
specification, say |SPEC|. As these in turn are the
synchronous products of individual plant
components or partial specifications, |KSUP| tends to
increase exponentially with the numbers of plant
components and specifications, the phenomenon
of exponential state space explosion. The result
is that centralized or monolithic controllers such
as KSUP can easily reach astronomical state
sizes in realistic industrial models, thereby
becoming infeasible in terms of computer storage
for practical design. This issue can be addressed
in two basic ways: by decentralized and
hierarchical architectures, possibly in heterarchical
combination, and by symbolic DES
representation and computation, where what is stored are
not DES and their controller transition
structures in extensional (explicit) form, but instead
intensional or algorithmic recipes from which the
required state and control variable evaluations are
computed online when actually needed.


Supervisory Control Under Partial Observations

Hierarchical control is one example of control
under partial observations, a high-level manager
(say) observing not full low-level operation but
rather an abstraction. Partial observation has been
studied mainly for abstractions given by natural
projections. For a DES G over alphabet $\Sigma$, let
$\Sigma_o \subseteq \Sigma$ be a subalphabet interpreted as the
events that can be recorded by some external
observer. A mapping $P : \Sigma^* \to \Sigma_o^*$ is called
a natural projection if its action is simply to erase
from a string s in $\Sigma^*$ all the events in s (if
any) that do not belong to $\Sigma_o$, while preserving
the order of events in $\Sigma_o$. P extends naturally
to a mapping of languages over $\Sigma$. One can
then implement an induced operator on DES, say
Project(G) = PG, with semantics

$L_m(PG) = P L_m(G), \quad L(PG) = P L(G).$

While in worst cases |PG| can be exponentially
larger than |G|, such blowup seems to be rare,
and typically $|PG| \le |G|$, namely, P results
in simplification of the model G. By use of P
it is possible to carry over to DES the
control-theoretic concept of observability. Two strings
$s, s' \in \Sigma^*$ are look-alikes with respect to P if
$Ps = Ps'$, namely, are indistinguishable to an
observer (or channel) modeled by P. Thus, given
G and P as above, a sublanguage $K \subseteq L(G)$
is observable if, roughly, look-alike strings in K
have the same one-step extensions in K that are
compatible with membership in $L(G)$ and also
satisfy a consistency condition with respect to
membership in $L_m(G)$. For control under
observations through P, one defines a supervisory
control function $V : L(G) \to \mathrm{Pwr}(\Sigma)$ to be
feasible if it assumes the same value on look-alike
strings, in other words respects the observation
constraint enforced by P. It then turns out that
a language $K \subseteq L_m(G)$ can be synthesized in
a feedback loop including G and the feedback
channel P if and only if K is both controllable
and observable.
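The action of a natural projection on strings, and the look-alike relation it induces, can be made concrete with a small sketch. Strings are modeled here as tuples of event labels; the function names and the example events are illustrative assumptions, not notation from the entry.

```python
def project(string, observable):
    """Erase every event not in `observable`, preserving the order of the rest."""
    return tuple(ev for ev in string if ev in observable)


def project_language(language, observable):
    """Apply the projection to every string of a (finite) language."""
    return {project(s, observable) for s in language}


def look_alike(s, t, observable):
    """Two strings are look-alikes if their projections coincide."""
    return project(s, observable) == project(t, observable)


observable = {"a", "b"}          # plays the role of the observable subalphabet
s = ("a", "u", "b", "u")         # 'u' is an unobservable event
t = ("u", "a", "b")
projected = project(s, observable)
same_to_observer = look_alike(s, t, observable)
```

A feasible supervisory control in the sense above would have to issue the same control decision after `s` and after `t`, since an observer behind the projection cannot tell them apart.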

Although this result is conceptually satisfying,
it is computationally inconvenient because,
by contrast with controllability, the property of
sublanguage observability is not in general closed
under union. A substitute for observability is
sublanguage normality, a stronger property than
observability but one that is indeed closed
under union. Since the family of controllable and
normal sublanguages of a given specification
language is nonempty (the empty language belongs)
and is closed under union, a (unique) supremal
(or optimal) element exists and can be computed;
it therefore solves the problem of supervisory
control under partial observations, albeit under
the normality restriction. The latter has the
feature that the resulting supervisor can only disable
a controllable event if the latter is observable,
i.e., belongs to $\Sigma_o$. In some applications this
restriction might preclude the existence of a
solution altogether; in others it could be harmless,
or even desirable as a safety property, in that if
the intended disablement of a controllable event
happened to fail, and the event occurred after
all, the fault would necessarily be observable and
thus optimistically remediable in good time.

An intermediate property is known that 
is weaker than normality but stronger than 
observability, called relative observability. The 
family of relatively observable sublanguages of 
a given specification language is closed under 
union and thus does possess a supremal element, 
which in the regular case can be effectively 
computed. When combined with controllability, 
relative observability yields a solution to the 
problem of supervisory control under partial 
observations which places no limitation on the 
disablement of unobservable controllable events. 
Examples show that a nontrivial solution of this 
type may exist in cases where the normality 
solution is empty.


Summary and Future Directions

Supervisory control of discrete-event systems, 
while relatively new, has reached a first level 
of maturity in that it is soundly based in 
a standard framework of (especially) finite- 
state machines and regular languages. It has 
effectively incorporated its own versions of 
control-theoretic concepts like stability (in 
the sense of nonblocking), controllability, 
observability, and optimality (in the sense of 
maximal permissiveness). Modular architectures 
and, on the computational side, symbolic 
approaches enable design of both monolithic 
and heterarchical/distributed controllers for DES 
models of industrial size. Major challenges 
remain, especially to develop criteria by which 
competing architectures can be meaningfully 
compared and to organize control functionality 
in ways that are not only tractable but also 
transparent to the human user and designer. 


Cross-References 

► Applications of Discrete-Event Systems 

► Models for Discrete Event Systems: An Overview


Abstract

Switching adaptive control is one of the advanced
approaches to adaptive control. By employing an
array of simple candidate controllers, a properly
designed monitoring function, and a switching law,
this approach is able to search in real time
for a correct candidate controller to achieve the
given control objective, such as stabilization or
set-point regulation. This approach can deal with
large parameter uncertainties and offers good
robustness against unmodelled dynamics. This
article offers a brief introduction to switching
adaptive control, including some historical
background, basic concepts, key design components,
and technical issues.

Keywords 

Adaptive control; Hybrid systems; Multiple models;
Supervisory control; Switching logic; Uncertain
systems

Introduction 

Switching adaptive control, also known as
switched adaptive control or multiple model
adaptive control, refers to an adaptive control
technique which deploys a set of controllers
and a switching law to achieve a given control
objective. The concept of switching adaptive
control is generalized from the traditional gain
scheduling technique (Leith and Leithead 2000).
As in the standard adaptive control setting, the
model for the controlled plant is assumed to
contain uncertain parameters, and the control
objective is to stabilize the system and, in many
cases, to deliver certain performance using
real-time information in the measured output.
What differentiates switching adaptive control
from gain scheduling is that the uncertain
parameters are not directly measured and the
switching is determined by the system response.
This seemingly minor difference is very
important because parameter estimation may not be
possible due to the lack of persistent excitation;
moreover, the sensitivity of the measured output
is often suppressed by the feedback control, which
makes closed-loop identification of the uncertain
parameters difficult. Compared with classical
adaptive control, switching adaptive control
has better inherent robustness against parameter
uncertainties and unmodelled dynamics.

By the early 1980s, the classical adaptive control
theory for linear systems had been well
established under a set of so-called classical
assumptions, which include:

• Known order of the plant (or known maximum
order of the plant)

• Known relative degree of the plant

• Minimum phase dynamics

• Known sign of the high-frequency gain (which
is the gain of the plant when the input is a
high-frequency sinusoidal signal)

At the same time, it was recognized that the
classical adaptive control approach has inherent
robustness problems against even miniature
unmodelled dynamics (Rohrs et al. 1985). While
this generated a wave of research aiming at
robustification of the classical adaptive control theory
(see, e.g., Ioannou and Sun 1996), a new line
of research took place aiming at relaxing the
classical assumptions. Nussbaum (1983) paved
the way by showing that knowledge of the sign
of the high-frequency gain can be avoided for a
first order linear system. Morse (1985) developed
a “universal controller” which can adaptively
stabilize any strictly proper, minimum-phase system
with relative degree not exceeding two.
Martensson (1985) gave a very surprising result by
showing that asymptotic stabilization can be achieved
adaptively by simply assuming that there exists
a finite order stabilizer. But Martensson’s
controller is impractical due to the need for an
exhaustive online search for the stabilizer and the
resulting excessively high overshoots. Switching adaptive
control was then introduced in Fu and Barmish
(1986), aiming at achieving adaptive stabilization
with minimal assumptions and a guarantee of
exponential convergence rate for the state. In
contrast to the work of Martensson, a
compactness requirement is made on the set of possible
plants and an upper bound on the order of the
plant is assumed. These assumptions allow a set
of possible plants to be partitioned into a finite
number of subsets, with each stabilizable by a
single controller. A monitoring function and a
switching law are then designed to sequentially
eliminate incorrect candidate controllers until an
appropriate controller is found. Because the
number of candidate controllers may be
large, many follow-up works on switching
adaptive control focused on speeding up the switching
process by eliminating incorrect candidate
controllers without trying them (Zhivoglyadov et al.
2000, 2001). These results can also deal with
slowly time-varying parameters and infrequent
parameter jumps.

Another major breakthrough came from the
works of Morse (1996, 1997) under the term
of supervisory control. His work considers
set-point regulation for uncertain linear systems. A
different compactness requirement is used to
allow unmodelled dynamics in the system. More
specifically, the given uncertain linear system is
assumed to belong to a union of sub-families of
systems, with each sub-family having a linear
controller capable of achieving set-point regulation.
Suitably defined output-squared estimation errors
are used as monitoring functions, and the
candidate controller whose corresponding
performance signal is the smallest is selected. The major
advantage of this switching law is that the
“correct” controller can usually be quickly
identified without cycling through all possible
candidate controllers, leading to good closed-loop
performance.

More recent research on switching adaptive
control focuses on more systematic and
alternative approaches to the design of candidate
controllers and switching laws; see, e.g., Anderson
et al. (2000), Hespanha et al. (2001), and Morse
(2004). Generalizations to nonlinear systems are
also found in Battistelli et al. (2012).


Design of Switching Adaptive Control

A switching adaptive controller consists of the 
following key ingredients: 

• Design of control covering 

• Design of monitoring function 

• Selection of dwell time 

For illustrative purposes, we consider an adaptive 
stabilization problem where the system has the 
following model: 

$\dot{x}(t) = A x(t) + B u(t)$
$y(t) = C x(t)$

with state $x(t) \in \mathbb{R}^n$ for some $1 \le n \le n_{\max}$ and
the measured output $y(t) \in \mathbb{R}^r$. The given set
of uncertain plants $\Sigma$ consists of triplets (A, B, C),
and we use the notation $\Sigma_n$ to denote the subset
of $\Sigma$ consisting of those plants having order n. It
is assumed that every possible plant $(A, B, C) \in \Sigma$
is a minimal realization (i.e., both controllable
and observable) and that every $\Sigma_n$ is a compact
set (i.e., it is closed and bounded). The control
objective is to design an adaptive controller to
drive the state to zero asymptotically, i.e.,
$x(t) \to 0$ as $t \to \infty$. It is clear that each possible
plant in $\Sigma$ admits a linear dynamic stabilizer.
An alternative description of the uncertain plant
is introduced in Morse (1996, 1997) where its
transfer function is a member of a continuously
parameterized set of admissible transfer functions
of the form

$\Sigma \subset \bigcup_{p \in \mathcal{P}} \{ \nu_p + \delta : \|\delta\| \le \epsilon_p \}$

In the above, $\mathcal{P}$ is a compact set in a
finite-dimensional space, $\nu_p$ is a nominal transfer function
with its coefficients depending continuously on
p, $\delta$ is the transfer function of some unmodelled
dynamics, $\|\delta\|$ represents a shifted $H_\infty$ norm
(obtained by first shifting the poles of $\delta$ slightly to
the right and then computing its $H_\infty$ norm), and
$\epsilon_p$ is sufficiently small so that each set of plants
$\{ \nu_p + \delta : \|\delta\| \le \epsilon_p \}$ is stabilizable by a single
controller for all $p \in \mathcal{P}$.


Control covering: The purpose is to decompose
the given set of plants into a union of subsets
such that each subset $\Sigma_i$ admits a single controller
$K_i$ (called a candidate controller) that achieves the
given control objective. This is typically done
using two properties: inherent robustness of linear
controllers and the existence of a finite cover for
any compact set. More specifically, if a candidate
controller achieves a desired control objective for a
given plant, then the same objective is maintained
when the plant is perturbed slightly. For example,
Fu and Barmish (1986) use the fact that if a given
plant is stabilized by a controller then the same
controller stabilizes all the plants with sufficiently
small parameter perturbations. Similarly, Morse
(1996, 1997) uses the fact that the same controller
achieves set-point regulation for a small
neighborhood of plants. Combining this property with
the finite covering property yields

$\Sigma = \bigcup_{i=1}^{N} \Sigma_i$

such that each subset $\Sigma_i$ admits a single
controller $K_i$.

Monitoring Function: The generation of the
adaptive switching controller is accomplished
using a switching law or switching logic whose
task is to determine, at each time instant, which
candidate controller is to be applied. The core of
the switching law is a monitoring function. Its
very basic role is to be able to detect whether
the applied candidate controller is consistent with
the corresponding plant subset so that wrong
candidate controllers can be eliminated one by
one until an appropriate controller is found. A
major difficulty for switching adaptive control
design is that persistent excitation is not assumed.
Consequently, it is not always possible to detect
the correct plant subset using the measured
output. The key idea is to check which plant subsets
are consistent with the generated output.

One simple monitoring function uses a
finite-time $L_2$ norm of the measured output:

$V(t, \tau) = \int_{t-\tau}^{t} \|y(s)\|^2 \, ds$

where $\tau$ is the so-called dwell time. It turns
out that for some properly chosen dwell time, a
correctly applied candidate controller is able to
guarantee some decay property for the
monitoring function, i.e., $V(t, \tau) \le e^{-\lambda\tau} V(t - \tau, \tau)$ for
some $\lambda > 0$. This property is sufficient to allow
a wrong candidate controller to be eliminated.
However, much smarter monitoring functions can
be designed so that infeasible candidate
controllers (those not corresponding to the true plant)
can be eliminated without even being applied.
This can be done using the falsification approach
in parameter estimation where the basic idea
is to eliminate all plant subsets $\Sigma_i$ inconsistent
with the measured output signal. For example,
consider the following discrete-time model:

$y(t) = -a_1 y(t-1) - a_2 y(t-2) + b_1 u(t-1) + b_2 u(t-2) + w(t)$

where $a_i$ and $b_i$ are uncertain parameters and $w(t)$
is a bounded disturbance, i.e., $|w(t)| \le \delta$ for
some $\delta$. For this example, we may eliminate all
the uncertain parameter subsets which violate the
following constraint (Zhivoglyadov et al. 2000):

$|y(t) + a_1 y(t-1) + a_2 y(t-2) - b_1 u(t-1) - b_2 u(t-2)| \le \delta$
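The falsification test for the second-order example can be sketched as follows. This is a simplified illustration under stated assumptions: each plant subset is represented by a single nominal parameter vector `(a1, a2, b1, b2)` rather than by a whole set (for a genuine subset one would ask whether any member satisfies the bound), and the helper names `simulate`, `consistent`, and `falsify` are hypothetical.

```python
def simulate(theta, u, w=None):
    """Generate y(t) from the second-order model, starting from y(0) = y(1) = 0."""
    a1, a2, b1, b2 = theta
    y = [0.0, 0.0]
    for t in range(2, len(u)):
        noise = w[t] if w is not None else 0.0
        y.append(-a1 * y[t - 1] - a2 * y[t - 2]
                 + b1 * u[t - 1] + b2 * u[t - 2] + noise)
    return y


def consistent(theta, y, u, t, delta):
    """Check the disturbance-bound constraint for one candidate at time t."""
    a1, a2, b1, b2 = theta
    residual = (y[t] + a1 * y[t - 1] + a2 * y[t - 2]
                - b1 * u[t - 1] - b2 * u[t - 2])
    return abs(residual) <= delta


def falsify(candidates, y, u, delta):
    """Discard every candidate that violates the constraint at some time step."""
    surviving = list(candidates)
    for t in range(2, len(y)):
        surviving = [th for th in surviving if consistent(th, y, u, t, delta)]
    return surviving


true_theta = (-0.5, 0.06, 1.0, 0.5)
wrong_theta = (0.5, 0.06, 1.0, 0.5)   # wrong sign on a1
u = [1.0] + [0.0] * 7                 # a single input pulse
y = simulate(true_theta, u)
remaining = falsify([true_theta, wrong_theta], y, u, delta=0.01)
```

Note that the wrong candidate is discarded from recorded data alone; its controller never has to be applied, which is the point of the falsification approach.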

More generally, one can use the so-called
multi-estimator (Morse 1996, 1997) which involves an
array of estimators, one for each plant subset $\Sigma_i$
using its nominal model. The output estimation
error $e_i(t)$ for each such estimator is then used to
construct a monitoring function, e.g.,

$V_i(t, \tau) = \int_{t-\tau}^{t} e^{-2\lambda(t-s)} \|e_i(s)\|^2 \, ds$

where $\tau$ is the dwell time as before and $\lambda > 0$ is
an exponential weighting parameter used to
guarantee the decay rate of the monitoring function as
before. Instead of using the monitoring functions
to eliminate infeasible candidate controllers, the
candidate controller corresponding to the least
estimation error, as measured by the least
monitoring function, is selected. The main advantage
of the multi-estimator based monitoring functions
is that falsification of candidate controllers is
done implicitly and a “correct” controller can be
quickly reached, leading to good performance.

Dwell Time: The dwell time $\tau$ as defined above
is a critical component in switching adaptive
control. Serving in the monitoring function, this is the
minimum nonzero amount of time for a candidate
controller to be applied before switching. That
is, this provides a sufficient time lag to build the
monitoring function so that its exponential decay
property is detected when a correct candidate
controller is applied. This will allow detection
of infeasible plant subsets and selection of a
“correct” controller. The use of a dwell time also
avoids arbitrarily fast switching, thus
guaranteeing the solvability of the system dynamics.

The dwell time can be selected a priori by
using the fact that if a matrix A is stable, then
there exist some positive values $\lambda$ and $\tau$ such that
$\|e^{At}\| \le e^{-\lambda\tau}$ for all $t \ge \tau$. This leads to the
desired exponential decaying property

$V(t, \tau) \le e^{-\lambda\tau} V(t - \tau, \tau)$

for the aforementioned monitoring function for
adaptive stabilization.
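As a hedged illustration of why such values exist, suppose (an assumption that is standard for a stable matrix but is not stated in the entry) that $\|e^{At}\| \le M e^{-\alpha t}$ for some constants $M \ge 1$ and $\alpha > 0$. Then any $\lambda \in (0, \alpha)$ together with any $\tau > 0$ satisfying $\tau \ge \ln M / (\alpha - \lambda)$ will do:

```latex
\|e^{At}\| \;\le\; M e^{-\alpha t}
          \;=\; \bigl( M e^{-(\alpha - \lambda) t} \bigr)\, e^{-\lambda t}
          \;\le\; e^{-\lambda t}
          \;\le\; e^{-\lambda \tau}
\qquad \text{for all } t \ge \tau \ge \frac{\ln M}{\alpha - \lambda}.
```

The middle inequality uses $M e^{-(\alpha-\lambda)t} \le 1$ for $t \ge \ln M/(\alpha-\lambda)$. The trade-off is visible in the bound: asking for a decay rate $\lambda$ closer to $\alpha$ forces a longer dwell time.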

Alternatively, the dwell time can be chosen
implicitly. Hespanha et al. (2001) suggest a
hysteresis switching logic method. This method
employs a hysteresis parameter $h > 0$. Suppose
the candidate controller $K_j$ is applied at time $t_i$;
then $K_j$ is kept until the next switching time $t_{i+1}$,
which is the minimum $t > t_i$ such that

$(1 + h) \min_{1 \le k \le N} V_k(t, t - t_i) \le V_j(t, t - t_i)$

Because $h > 0$, the time difference $t_{i+1} - t_i > 0$
is lower bounded, which implies the existence of
a dwell time.
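A minimal sketch of one hysteresis switching decision is given below. It assumes sampled estimation errors and a Riemann-sum approximation of an exponentially weighted monitoring function; the function names, the sampling step `dt`, and the numerical values are illustrative assumptions and are not taken from Hespanha et al. (2001).

```python
import math


def monitor(errors, lam, dt):
    """Riemann-sum approximation of the exponentially weighted squared error.

    `errors` holds scalar samples e(s) over the window, oldest first.
    """
    n = len(errors)
    return sum(math.exp(-2.0 * lam * (n - 1 - k) * dt) * e * e * dt
               for k, e in enumerate(errors))


def hysteresis_switch(current, monitors, h):
    """Return the index of the controller to apply next.

    A switch happens only if the best monitoring value, inflated by (1 + h),
    still does not exceed the value of the controller currently applied.
    """
    best = min(range(len(monitors)), key=monitors.__getitem__)
    if (1.0 + h) * monitors[best] <= monitors[current]:
        return best
    return current


# Two candidates; controller 0 is currently applied.
stay = hysteresis_switch(0, [1.0, 0.95], h=0.1)    # improvement too small
switch = hysteresis_switch(0, [1.0, 0.5], h=0.1)   # clear improvement
```

The margin `h` is what prevents chattering: a competitor that is only marginally better than the controller in use does not trigger a switch.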


Summary and Future Directions 

Switching adaptive control is a conceptually
simple control technique capable of dealing with large
parameter uncertainties. The use of simple
candidate controllers (typically linear) implies good
closed-loop behavior and good robustness against
unmodelled dynamics. Although the discussion
above assumes that the number of plant subsets
is finite, this assumption is not essential; see
Anderson et al. (2000).

Switching adaptive control renders the
closed-loop system a switched system or hybrid system,
for which a wide range of analysis tools is
available; see, e.g., Liberzon (2003). However,
unique features of such
a system arise from the fact that the switching
mechanism is chosen by the designer, rather than
being a part of the given plant. How to best design
the switching mechanism is an interesting issue.

Future work on switching adaptive control
includes:

1. How to simplify the design of candidate
controllers. Finite covering based design often
yields a large number of plant subsets, hence
a large number of candidate controllers. Since
most of the candidate controllers need not be
applied (which is the case when falsification
based switching logic is used, for example),
smarter ways are needed for the design of
candidate controllers.

2. Wider applications. Most of the research so far
focuses on stabilization and set-point
regulation (which is essentially a stabilization
problem). How to incorporate general performance
criteria is an essential and yet challenging
issue.

3. Better design of monitoring functions and the
corresponding switching logic. Most
existing monitoring functions use a finite-time $L_2$
norm of the output (or regulation error), with
the key feature that some exponential decay
property is guaranteed when the candidate
controller is “correct.” Note that the key
purpose of the monitoring function and the
corresponding switching logic is to allow fast
falsification of infeasible candidate controllers.
Thus, a much wider range of monitoring
functions can possibly be used. In particular, how
to incorporate set membership identification
techniques (Milanese and Taragna 2005) may
be of particular interest.


Cross-References 

► Adaptive Control, Overview 

► Hybrid Dynamical Systems, Feedback Control of

► Robust Model-Predictive Control

► Stability and Performance of Complex Systems
Affected by Parametric Uncertainty


Abstract

In this entry we review the theory of optimal
synthesis. We describe the steps necessary to
solve an optimal control problem and the
sufficient conditions for optimality given by the
theory. We describe some relevant examples that
have important applications in mechanics, in the
theory of hypo-elliptic operators and for the study
of models of geometry of vision. Finally, we
discuss the problem of optimal stabilization and
the difficulties encountered if one tries to give the
solution to the problem in feedback form.

Keywords

Affine control systems; Extremals; Pontryagin
Maximum Principle; Sub-Riemannian geometry;
Time-optimal synthesis


Optimal Control

An optimal control problem with fixed initial and
terminal conditions can be seen as a problem
of calculus of variations under nonholonomic
constraints:

$\dot{q}(t) = f(q(t), u(t)),$ (1)

$\int_0^T L(q(t), u(t)) \, dt \to \min$ (T fixed or free), (2)

$q(0) = q_0, \quad q(T) = q_1.$ (3)

Here we make the following set of assumptions:

(H) q belongs to a finite-dimensional smooth
manifold M of dimension n. As a function of
time $q(\cdot)$ is assumed to be Lipschitz continuous.
The control $u(\cdot)$ is an $L^\infty$ function taking values
in a set $U \subset \mathbb{R}^m$. For simplicity, we assume that
the functions f and L, defined on $M \times \mathbb{R}^m$, are
smooth.

The dynamics $\dot{q}(t) = f(q(t), u(t))$ play the
role of the nonholonomic constraint
(nonholonomic means that it is a constraint on the velocity
but not necessarily on the position).

Solving an optimal control problem in general
is a very difficult task. Usually, to attack such a
problem, the steps are the following:

• STEP 0: EXISTENCE. First, one has to
guarantee the existence of a solution to (1)-(3).
The most important sufficient condition for
the existence of minimizers is the famous
Filippov theorem (see for instance Agrachev
and Sachkov (2004) for a proof) saying the
following: introduce a new variable (the
so-called augmented state) $\hat{q} := (q^0, q) \in \mathbb{R} \times M$
satisfying the following dynamics:

$\dot{\hat{q}}(t) = \begin{pmatrix} \dot{q}^0(t) \\ \dot{q}(t) \end{pmatrix} = \begin{pmatrix} L(q(t), u(t)) \\ f(q(t), u(t)) \end{pmatrix} =: \hat{f}(\hat{q}(t), u(t))$ (4)

then if (i) U is compact; (ii) the set of
velocities $\hat{F}(q) := \{ \hat{f}(q, u) \mid u \in U \}$ is
convex for every q; (iii) for every $T > 0$ and
$\hat{q}_0 \in \mathbb{R} \times M$, there exists a compact set
$K \subset \mathbb{R} \times M$ such that all solutions of (4) starting
from $\hat{q}_0$ stay in K for $t \in [0, T]$; then there
exist Lipschitz minimizers. Other theorems
that can be applied in more general functional
classes or under less restrictive hypotheses can
be found in the literature. See for instance
Bressan and Piccoli (2007), Cesari (1983), and
Vinter (2010).

• STEP 1: FIRST ORDER NECESSARY
CONDITIONS. In optimal control, the first
order necessary conditions for optimality
are given by the celebrated Pontryagin
Maximum Principle (Pontryagin et al. 1961)
(see also Agrachev and Sachkov (2004) for
a more recent viewpoint). The Pontryagin
Maximum Principle (PMP for short) extends
the (Hamiltonian version of the)
Euler-Lagrange equations of calculus of variations
to problems with nonholonomic constraints.
For a discussion about the relation between
variational problems under nonholonomic
constraints and variational principles in
nonholonomic mechanics, see Bloch (2003).

The PMP restricts the set of candidate optimal
trajectories starting from $q_0$ to a family of
trajectories, called extremals, parameterized by a
covector $p(0) \in T^*_{q_0} M$. In addition, there are
two kinds of special extremals: (i) the singular
extremals for which the maximization condition
given by the PMP does not permit directly
obtaining the control and (ii) the abnormal extremals
which are candidate optimal trajectories for any
cost function. For certain classes of problems,
abnormal extremals and singular trajectories
coincide.
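For orientation, the conditions imposed by the PMP for problem (1)-(3) can be written out in coordinates. The block below is a sketch of the standard textbook formulation under one common sign convention (cost multiplier $p_0 \le 0$); the symbols $H$, $p$, and $p_0$ are introduced here for illustration, and the precise statement should be taken from the references cited above.

```latex
% Hamiltonian with constant cost multiplier p_0 <= 0
H(q, p, u, p_0) = \langle p, f(q, u) \rangle + p_0 \, L(q, u),
\qquad (p(t), p_0) \neq (0, 0),

% Hamiltonian system satisfied along an extremal
\dot{q}(t) = \frac{\partial H}{\partial p}\bigl(q(t), p(t), u(t), p_0\bigr),
\qquad
\dot{p}(t) = -\frac{\partial H}{\partial q}\bigl(q(t), p(t), u(t), p_0\bigr),

% maximization condition, for almost every t in [0, T]
H\bigl(q(t), p(t), u(t), p_0\bigr) = \max_{v \in U} H\bigl(q(t), p(t), v, p_0\bigr).
```

In this formulation abnormal extremals are those with $p_0 = 0$: the running cost $L$ then drops out of $H$, which is why they are candidates for any cost function. When the final time $T$ is free, the Hamiltonian additionally vanishes along the extremal.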

The set of all trajectories satisfying the PMP
(in general having intersections and not being
all optimal forever) is called an extremal
synthesis. The requirement that the trajectories
starting from $q_0$ reach the final point $q_1$ (at time
T, fixed or free) is usually not very useful at
this step. This requirement is rather made at
STEP 4.

• STEP 2. HIGHER ORDER CONDITIONS.
Higher order conditions are used to restrict
further the set of candidate optimal
trajectories. The most important conditions are
those used to eliminate singular extremals
(which usually are very hard to treat), such as the
Goh condition and the generalized
Legendre-Clebsch conditions (see for instance Agrachev
and Sachkov 2004). Other theories that
provide higher order conditions (which apply
also to extremals that are not singular) are for
instance: higher order maximum principles
(Bressan 1985; Krener 1977), generalized
Morse-Maslov index theories (Agrachev
and Sachkov 2004), and envelope theory
(Sussmann 1986, 1989, see also Boscain and
Piccoli 2004, Chap. 1.3.2).

• STEP 3. SELECTION OF THE OPTIMAL
TRAJECTORIES. This step is the most
difficult one. Indeed, one should check that
each extremal of the extremal synthesis
does not intersect another extremal having
a smaller cost at the intersection point. This
comparison should be done not only among
extremals which are close to one another
but among all of them. The problem is indeed
global.

One of the techniques to address this
problem in a very elegant way takes the name of
optimal synthesis theory, and was developed
almost together with the birth of the
Pontryagin Maximum Principle. This theory dates
back to the paper of Boltyanskii (1966) and
was further developed by Brunovsky (1980,
1978), Sussmann (1980, 1979), and Piccoli
and Sussmann (2000).

Roughly speaking, the theory of optimal
synthesis allows one to conclude that if one has
an extremal synthesis having certain
regularity properties, then this extremal synthesis is
indeed an optimal synthesis.

An optimal synthesis is a collection of
optimal trajectories starting from $q_0$ and reaching
the various points of the space:

$\mathcal{S}_{q_0} = \{ \gamma_q(\cdot) : [0, T_q] \to M \mid q \in M,\ \gamma_q$ is
a trajectory of (1) minimizing the cost
$\int_0^{T_q} L(q(t), u(t)) \, dt$ with $\gamma_q(0) = q_0$, $\gamma_q(T_q) = q \}$

An optimal synthesis should also verify the
following condition: if $\gamma_q$ defined on $[0, T]$
and $\gamma_{q'}$ defined on $[0, T']$ (with $T' \in \, ]0, T[$)
belong to $\mathcal{S}_{q_0}$ and we have $q' = \gamma_q(T')$, then
$\gamma_{q'} = \gamma_q|_{[0, T']}$. More details are given in the
next section.

• STEP 4. SELECTION OF THE TRAJECTORY REACHING THE FINAL POINT. Once an optimal synthesis is computed, one selects the optimal trajectory reaching the desired final point by solving the equation $\gamma(T) = q_1$ in the set of all trajectories belonging to the optimal synthesis.

Remark 1 Notice that one could require that the final point be reached already at STEP 1. This would considerably reduce the set of candidate optimal trajectories at STEP 1, but it would not allow one to apply the powerful (global) theorems of STEP 3. As a consequence, one would be obliged to compare by hand all extremals going from $q_0$ to $q_1$.


Sufficient Conditions for Optimality: The Theory of Optimal Synthesis

There exists a general principle according to which every synthesis formed by extremals is optimal under very mild regularity conditions. We illustrate the classical case of a feedback that is smooth on a stratification, due to Boltyanskii and Brunovsky; see Boltyanskii (1966) and Brunovsky (1980, 1978). More general results can be found in Piccoli and Sussmann (2000). This principle is very strong, and it is valid only because the synthesis is a global object: for a single trajectory satisfying the PMP, there is no regularity condition that ensures optimality.

For simplicity, from now on we assume that $M = \mathbb{R}^n$ is a Euclidean space and $q_0 = 0$, and we denote by $S$ a candidate optimal synthesis from 0; the general case follows easily. A set $P \subset M$ is called a curvilinear open polytope of dimension $p$ if there exist a polytope (i.e., a bounded closed region that is the intersection of a finite number of half-spaces) $P' \subset \mathbb{R}^p$ and a smooth injective map $\phi : \mathbb{R}^p \to \mathbb{R}^n$, whose Jacobian has maximal rank at every point, such that $\phi(P' \setminus \partial P') = P$.

Let $\Omega$ be an open subset of $M$ (for the induced topology) containing the origin in its interior. We say that $S$ is a Boltyanskii-Brunovsky regular synthesis, briefly a BB synthesis, if the following holds.

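There exists a 6-tuple $\Xi = (\mathcal{P}, \mathcal{P}_1, \mathcal{P}_2, \Pi, \Sigma, u)$ such that

(BB1) $\mathcal{P}$ is a collection of curvilinear open polyhedra and $\Omega$ is the disjoint union of the elements of $\mathcal{P}$. If $P_j \neq P_k \in \mathcal{P}$ and $P_k \cap \overline{P_j} \neq \emptyset$, then $P_k \subset \partial P_j$ and $\dim(P_k) < \dim(P_j)$. $\{0\} \in \mathcal{P}$, and the elements of $\mathcal{P}$ are called “cells”.

(BB2) $\mathcal{P}$ is the disjoint union of $\mathcal{P}_1$ (the set of “type I cells”) and $\mathcal{P}_2$ (the set of “type II cells”),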

(BB3) the feedback $u : \{q : \exists P_1 \in \mathcal{P}_1,\ q \in P_1\} \to U$ and $\Pi : \mathcal{P}_1 \to \mathcal{P}$ are maps, and $\Sigma : \mathcal{P}_2 \to \mathcal{P}_1$ is a multifunction with nonempty values, such that the following properties are satisfied:

(i) The function $u$ is of class $C^1$ on each cell.

(ii) If $P_1 \in \mathcal{P}_1$, then $f(q, u(q)) \in T_q P_1$ (the tangent space to $P_1$ at $q$) for every $q \in P_1$. In addition, for each $q \in P_1$, if we let $\xi_q$ be the maximally defined solution to the initial value problem

$$\dot\xi = f(\xi, u(\xi)), \qquad \xi(0) = q, \qquad \xi \in P_1, \quad (5)$$

and define $t_q = \sup \mathrm{Dom}(\xi_q)$, then the limit $\xi_q(t_q-) := \lim_{t \uparrow t_q} \xi_q(t)$ exists and belongs to $\Pi(P_1)$.

(iii) If $P_2 \in \mathcal{P}_2$, then for each $q \in P_2$ and $P \in \Sigma(P_2)$ there exists a unique curve $\xi_q^P : [0, t_q^P[\, \to \Omega$ such that the restriction of $\xi_q^P$ to $]0, t_q^P[$ is a maximally defined integral curve of the vector field $f(\cdot, u(\cdot))$ on $P$, and $\xi_q^P(0) = q$.

(iv) On every cell $P_1 \in \mathcal{P}_1$, $q \mapsto t_q$ is a continuously differentiable function, and $(t, q) \mapsto \xi_q(t)$, $(t, q) \mapsto u_q(t) := u(\xi_q(t))$ are continuously differentiable maps on the set

$$E(P_1) := \{(t, q) : q \in P_1,\ t \in [0, t_q]\}.$$

If $P_2 \in \mathcal{P}_2$, the same holds for every $t_q^P$, $\xi_q^P$, $u_q^P$, with $P \in \Sigma(P_2)$.

(v) For every $q \in \Omega \setminus \{0\}$, the trajectory $\gamma_q : [0, T_q] \to M$, $\gamma_q \in S$, is obtained by piecing together the trajectories on every single cell. Moreover, $\gamma_q$ changes cell a finite number of times.

Theorem 1 (Sufficiency theorem for BB synthesis) Let $S$ be a BB synthesis on $M$ formed by extremal trajectories; then $S$ is optimal.

Remark 2 Theorem 1 can also be proved for a synthesis on an open subset $\Omega$ of $M$, under suitable conditions; see Piccoli and Sussmann (2000).

Some Relevant Examples 

Even though the sufficient conditions for optimality given by the theory of optimal synthesis are very powerful, explicitly computing an optimal synthesis is in general very hard, and the complexity grows quickly with the dimension of the space. The main difficulties are:

• The integration of the Hamiltonian equations given by the PMP (which in general are not integrable, unless there are many symmetries);

• The characterisation of singular and abnormal extremals;

• The verification of the hypotheses of the sufficient conditions for optimality given by synthesis theory.

For these reasons, the computation of an optimal synthesis is already challenging in dimension 2, and few examples have been solved in dimension 3. In higher dimensions, only very symmetric problems have been completely solved. In the following, we list some of the most relevant optimal syntheses that have been computed so far.

Time-Optimal Synthesis for Affine Control 
Systems on 2-D Manifolds 

Let $M$ be a 2-D manifold and consider the problem of finding the time-optimal synthesis starting from a point $q_0$ for a system of the type

$$\dot q = F(q) + u\,G(q), \qquad |u| \le 1, \qquad F(q_0) = 0. \quad (6)$$

Here we assume that $F$ and $G$ are Lie-bracket generating. The condition $F(q_0) = 0$ guarantees local controllability around $q_0$ for a generic pair $(F, G)$. A complete theory for this kind of system was developed in Bressan and Piccoli (1998), Piccoli (1996), and Boscain and Piccoli (2004), under generic conditions on the vector fields $F$ and $G$. More precisely, Boscain and Piccoli (2004) provide: (i) an algorithm that explicitly builds the time-optimal synthesis; (ii) a classification of syntheses in terms of graphs; (iii) a classification of synthesis singularities; (iv) an analysis of the properties of the minimum time function.

Here we just recall that optimal trajectories are finite concatenations of bang arcs (trajectories corresponding to constant control +1 or −1) and singular arcs (for which the control may take values different from +1 and −1).

Under generic conditions, the optimal synthesis provides a stratification of $M$. In the regions of dimension 2, the control is either +1 or −1. The regions of dimension 1, called frame curves, can be: (i) arcs of optimal trajectories (that may be bang or singular); (ii) switching curves (i.e., curves made of points at which the control switches from +1 to −1, or vice versa); (iii) overlap curves (i.e., curves made of points where the extremals lose their optimality). The regions of dimension 0, called frame points, are points where frame curves intersect. Generically, they can be of 23 types. See Boscain and Piccoli (2004, p. 60).
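
As a concrete illustration of bang arcs and a switching curve, the following sketch treats the double integrator $\dot x = v$, $\dot v = u$, $|u| \le 1$, which has the form (6) with $F(q) = (v, 0)$, $G(q) = (0, 1)$ and $F(0) = 0$. This system is not discussed in the text; it is chosen here only because its synthesis is known in closed form. The code uses the stabilizing version of the problem (steering to the origin), which corresponds to the synthesis from $q_0 = 0$ after time reversal. The function names are illustrative.

```python
import math


def switching_function(x: float, v: float) -> float:
    """Vanishes exactly on the switching curve x = -v|v|/2."""
    return x + 0.5 * v * abs(v)


def feedback(x: float, v: float) -> float:
    """Time-optimal control steering (x, v) to the origin for x' = v, v' = u."""
    s = switching_function(x, v)
    if s > 0.0:
        return -1.0  # 2-D region on one side of the switching curve
    if s < 0.0:
        return 1.0   # 2-D region on the other side
    # On the switching curve: follow the bang arc that ends at the origin.
    if v > 0.0:
        return -1.0
    if v < 0.0:
        return 1.0
    return 0.0       # already at the target


def min_time(x: float, v: float) -> float:
    """Minimum time to reach the origin (value of the minimum time function)."""
    if switching_function(x, v) >= 0.0:
        return v + 2.0 * math.sqrt(max(x + 0.5 * v * v, 0.0))
    return -v + 2.0 * math.sqrt(max(-x + 0.5 * v * v, 0.0))
```

In this example the two regions of dimension 2 carry the controls +1 and −1, and the curve where `switching_function` vanishes plays the role of a frame curve of dimension 1. Along a bang arc, `min_time` decreases at unit rate, as expected for a minimum time function.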

Some Relevant Time-Optimal Synthesis 
for 3D Problems 

As we saw in the previous section, for minimum time problems in dimension 2, many results can be obtained, and in most cases a time-optimal synthesis can be constructed. The situation is different for time-optimal problems in dimension 3. Indeed, besides trivial cases, the time-optimal synthesis has been computed in full detail for only a few examples. One is the Reeds and Shepp car,

$$\begin{pmatrix} \dot x \\ \dot y \\ \dot\theta \end{pmatrix} = u_1 \begin{pmatrix} \cos\theta \\ \sin\theta \\ 0 \end{pmatrix} + u_2 \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \qquad |u_1| \le 1, \quad |u_2| \le 1. \quad (7)$$

The time-optimal synthesis for this problem was computed in Souères and Laumond (1996). The extreme complexity of the optimal synthesis obtained for this simple example had the effect that no other time-optimal synthesis in dimension 3 or larger, with one or two bounded controls, was computed until the last two years.

Very recently, time-optimal synthesis for systems of the type

$$\dot q = \sum_{i=1}^{m} u_i F_i(q), \qquad |u_i| \le 1 \quad (i = 1, \dots, m), \quad (8)$$

where $q$ belongs to an $n$-dimensional manifold and $2 \le m \le n$, has attracted new attention.

This is indeed a problem of nonstrictly convex sub-Finsler geometry that appears in the study of asymptotic cones of nilpotent groups in geometric group theory (Gromov 1981; Breuillard and Le Donne 2012).

Sub-Riemannian Geometry 

A very important class of optimal control problems is the one called sub-Riemannian. Let $M$ be an $n$-dimensional manifold ($n \ge 2$) and consider the problem of finding the time-optimal synthesis starting from a point $q_0$ for the problem

$$\dot q = \sum_{i=1}^{m} u_i F_i(q), \qquad \int_0^T \sqrt{\sum_{i=1}^{m} u_i^2}\; dt \to \min, \qquad (2 \le m \le n). \quad (9)$$

Here we assume that the family of vector fields $\{F_i\}_{i=1,\dots,m}$ is Lie-bracket generating. This kind of optimal control problem includes Riemannian geometry and many of its generalizations, which usually go under the name of sub-Riemannian geometry (see Bellaïche (1996), Montgomery (2002), and the pioneering work by Brockett (1982)). The complete time-optimal synthesis has been computed in a few relevant cases:

• The Heisenberg group (Gaveau 1977; Gershkovich and Vershik 1988).

• The local 3-dimensional contact case, under generic conditions (Agrachev 1996; El-Alaoui et al. 1996).

• Some relevant left-invariant problems on simple Lie groups, i.e., SO(3), SU(2), SL(2); see Boscain and Rossi (2008).

• The left-invariant problem on the group of rototranslations SE(2), which has important applications in models of the geometry of vision (Boscain et al. 2012; Sachkov 2011; Petitot 2008).

• In dimension bigger than 3, only the quasi-Heisenberg case (Charlot 2002) and certain multidimensional generalizations of the Heisenberg case have been computed (Beals et al. 1996).

• In dimension 2, problems of type (8) are called problems of almost-Riemannian geometry. The basic example (the so-called Grushin case) was studied in Bellaïche (1996), and the study of the synthesis in the generic case made it possible to obtain some generalizations of the Gauss-Bonnet theorem (Agrachev et al. 2008).

Some of the syntheses mentioned above led to important results in the theory of hypoelliptic operators (Hörmander 1967). Moreover, they helped to clarify the relation between small-time heat kernel asymptotics and the properties of the value function for problem (9). See for instance Barilari et al. (2012) and references therein.


Connections with the Stabilization 
Problem 

Consider now the control system $\dot q(t) = f(q(t), u(t))$ under hypothesis (H). Fix $q_0 \in M$ and assume that there exists $u_0 \in U$ such that $f(q_0, u_0) = 0$. A stabilization problem can be stated as follows:

(P): For every $q \in M$, find a trajectory of the control system $\dot q(t) = f(q(t), u(t))$ (under hypothesis (H)) with boundary conditions $q(0) = q$, $q(T) = q_0$. (Here $T$ could be required to be finite or not, depending on the problem.)

An elegant way of giving a solution to problem (P) is to give a stabilizing feedback, namely a function $K(q)$ such that for every $q \in M$ the solution of

$$\dot q(t) = f(q(t), K(q(t))) \quad (10)$$

with initial condition $q(0) = q$ steers $q$ to $q_0$.

It is well known that in general it is not possible to give the solution to (P) in feedback form. Indeed, there may be topological constraints (in the sense of Brockett, see for instance Brockett (1983)) that prevent such a feedback from being continuous. Hence, in general, one cannot guarantee existence and uniqueness of classical or Carathéodory solutions to the ODE (10). This problem has attracted a lot of attention since the pioneering work of Brockett, and several approaches have been proposed, e.g., via generalized concepts of solution, patchy feedback, time-varying feedback, etc. (see for instance Clarke et al. 1997; Ancona and Bressan 1999; Coron 1992).

Sometimes one considers an “optimal control” variant of problem (P):

(Po): For every $q \in M$, find the trajectory of the control system $\dot q(t) = f(q(t), u(t))$ (under hypothesis (H)) minimizing the cost $\int_0^T L(q(t), u(t))\,dt$ (here $T$ can be fixed or free), with boundary conditions $q(0) = q$, $q(T) = q_0$.

The cost can be an additional constraint given by the problem, or it can be added artificially in order to have a method and a good concept of solution for problem (P). Indeed, one way of solving problem (Po) (and hence (P)) is to find the optimal synthesis starting from $q_0$ for the problem

(-Po): for every $q \in M$, solve

$$\dot q = -f(q, u), \quad u \in U, \qquad \int_0^T L(q(t), u(t))\,dt \to \min, \qquad q(0) = q_0, \quad q(T) = q,$$

and then to reverse time. In other words, if $\gamma : [0, T] \to M$ is the solution of (-Po) steering $q_0$ to $q$, then $\gamma(T - t)$ is the solution to (Po) steering $q$ to $q_0$. This type of solution to problem (Po) is called an “optimal stabilizing synthesis”.
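
The time-reversal step can be checked directly. The short derivation below is a sketch that uses only the definitions of (Po) and (-Po); the symbols $\tilde\gamma$ and $\tilde u$ are introduced here for convenience and do not appear elsewhere in the text.

```latex
% Let (\gamma, u) be admissible for (-Po) on [0, T]:
%   \dot\gamma(t) = -f(\gamma(t), u(t)), \quad \gamma(0) = q_0, \quad \gamma(T) = q.
% Define the reversed pair
\tilde\gamma(t) := \gamma(T - t), \qquad \tilde u(t) := u(T - t), \qquad t \in [0, T].
% Dynamics: the sign produced by the chain rule cancels the minus sign in (-Po),
\frac{d}{dt}\tilde\gamma(t) = -\dot\gamma(T - t)
    = f\bigl(\gamma(T - t), u(T - t)\bigr)
    = f\bigl(\tilde\gamma(t), \tilde u(t)\bigr),
% and the boundary conditions are exchanged:
\tilde\gamma(0) = q, \qquad \tilde\gamma(T) = q_0 .
% Cost: with the substitution s = T - t,
\int_0^T L\bigl(\tilde\gamma(t), \tilde u(t)\bigr)\,dt
    = \int_0^T L\bigl(\gamma(s), u(s)\bigr)\,ds .
```

Time reversal is therefore a cost-preserving bijection between the admissible pairs of (-Po) and those of (Po), so minimizers of one problem correspond to minimizers of the other.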


Extracting a Feedback from 
an Optimal Synthesis 

It is interesting to see what happens if one tries to extract a feedback from an optimal stabilizing synthesis.

If each optimal trajectory of the optimal synthesis corresponds to a regular enough control (e.g., smooth or piecewise smooth), the feedback corresponding to the optimal synthesis can be defined easily in the following way: if $(\gamma(\cdot), u(\cdot))$ defined on $[0, T]$ is a trajectory-control pair of the optimal synthesis, then $K(\gamma(t)) = u(t)$ for every $t \in [0, T]$.

However, as already mentioned, in most situations $K(q)$ is not continuous. (Notice that even when all trajectories of the optimal synthesis are smooth, it may happen that $K(q)$ is not continuous.) Hence, in general, one cannot guarantee existence and uniqueness of classical or Carathéodory solutions to the ODE (10).

One could think of enlarging the concept of solution of (10) by using Filippov, Krasovskii, or CLSS (Clarke et al. 1997) solutions (see for instance Marigo and Piccoli 2002, Piccoli and Sussmann 2000, and references therein). However, none of these types of solution is suited to giving the solution of an optimal stabilization problem in feedback form. To fix ideas, let us consider the case of Filippov solutions. In Piccoli and Sussmann (2000) the authors build examples of optimal syntheses for which the corresponding feedbacks generate solutions that are either Filippov but nonoptimal, or optimal but not Filippov. The same can be done with the other types of solution mentioned above. It is also possible to build an example of an optimal stabilizing synthesis for which the corresponding feedback generates nonoptimal trajectories even in the classical sense. This is presented in the next section.

Hence, at the moment an optimal stabilizing synthesis remains the only possible concept of solution for an optimal stabilization problem.

Synthesis Theory in Optimal Control, Fig. 1 An optimal stabilizing synthesis for which the corresponding feedback generates nonoptimal trajectories



An Example of a Time-Optimal Synthesis 
Whose Feedback Generates Nonoptimal 
Trajectories 

We present an example exhibiting the phenomenon of nonuniqueness of trajectories for the closed-loop equation arising from the feedback extracted from an optimal synthesis. In particular, the optimal feedback admits nonoptimal (classical) solutions. This illustrates well the importance of using the synthesis as the concept of solution for an optimal stabilization problem.

Consider the planar system:

$$\dot q = F(q) + u\,G(q), \qquad |u| \le 1,$$

where $q = (x, y)$ and:

^(<?) = ( Vh 2 ) > G(q) = ^ x _ft j , 

and the target is the origin. 

The trajectories corresponding to the constant control equal to −1 are straight horizontal lines going from left to right, while those corresponding to +1 are circles centered at the point (−1, 1), running counterclockwise. The optimal synthesis is described in Fig. 1. For a proof of optimality see Piccoli and Sussmann (2000).


Starting from the point (−1, 0), we have an infinite number of classical solutions to the discontinuous optimal feedback. Indeed, at that point we have $F + G = F - G$, so given any natural number $n$, the trajectory running $n$ times around the circle centered at (−1, 1) and then going to the origin with control −1 is a classical solution to the discontinuous optimal feedback. However, only the one corresponding to $n = 0$ is optimal.
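
The nonuniqueness can be reproduced numerically with one concrete pair of vector fields. The pair below is an assumption: it is chosen only so that $F - G$ is the unit horizontal field and $F + G$ is the unit-speed counterclockwise rotation about (−1, 1), which matches the description above, and it is not claimed to be the pair used by the authors. Under this assumption, a trajectory that makes $n$ loops before leaving for the origin takes time $1 + 2\pi n$.

```python
import math


def F(x, y):
    # Hypothetical drift, chosen so that F - G = (1, 0) and F + G = (1 - y, 1 + x).
    return (1.0 - 0.5 * y, 0.5 * (1.0 + x))


def G(x, y):
    # Hypothetical controlled field; it vanishes at the point (-1, 0).
    return (-0.5 * y, 0.5 * (1.0 + x))


def field(x, y, u):
    """Right-hand side F(q) + u G(q) for a constant control u."""
    fx, fy = F(x, y)
    gx, gy = G(x, y)
    return (fx + u * gx, fy + u * gy)


def flow(x, y, u, duration, steps=2000):
    """Integrate the system with constant control u using classical RK4."""
    h = duration / steps
    for _ in range(steps):
        k1 = field(x, y, u)
        k2 = field(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1], u)
        k3 = field(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1], u)
        k4 = field(x + h * k3[0], y + h * k3[1], u)
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, y


def time_with_loops(n):
    """Time from (-1, 0) to the origin after n full loops (assumed unit speeds)."""
    return 2.0 * math.pi * n + 1.0
```

With these fields, `flow(-1.0, 0.0, +1.0, 2 * math.pi)` returns to the starting point, while `flow(-1.0, 0.0, -1.0, 1.0)` reaches the origin, so every $n$ gives an admissible closed-loop trajectory and only $n = 0$ attains the minimum time.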

Concerning other concepts of solution starting from (−1, 0), one can prove the following. Krasovskii and CLSS solutions include the classical solutions (and hence produce many nonoptimal trajectories). There is only one Filippov solution, namely the one that rotates indefinitely on the circle and never goes to the origin. This trajectory is not a solution to the stabilization problem, since it does not reach the target.


Cross-References 

► Optimal Control and Mechanics 

► Sub-Riemannian Optimization 
Abstract

The past decade has seen tremendous advances in DNA recombination and measurement techniques. These advances have reached a point at which de novo creation of biomolecular circuits that accomplish new functions is now possible, leading to the birth of a new field called synthetic biology. Sophisticated functions that are highly sought in synthetic biology range from recognizing and killing cancer cells, to neutralizing radioactive waste, to efficiently transforming feedstock into fuel, to controlling the differentiation of tissue cells. To reach these objectives, however, there are a number of open problems that the field has to overcome. Many of these problems require a system-level understanding of the dynamical and robustness properties of interacting systems, and hence the field of control and dynamical systems theory can contribute substantially. In this entry, we review the basic technology employed in synthetic biology and a number of simple modules and complex systems created using this technology, and we discuss key system-level problems along with challenging research questions for the field of control theory.

Keywords 

Biomolecular systems; Gene expression; Robustness; Modularity

Introduction to Synthetic Biology 

Synthetic biology is an emerging engineering discipline in which the biochemical and biophysical principles present in living organisms are used to engineer new systems (Baker et al. 2006). These systems will be able to accomplish a number of remarkable tasks, such as turning waste into energy sources, neutralizing radioactive waste, detecting environmental pathogens, or recognizing cancer cells with the aim of targeting them for deletion. While synthetic biology can be employed to create new functionalities, it can also enable the understanding of fundamental design principles of living systems. In fact, implementing a circuit with a prescribed behavior provides a powerful means to test hypotheses regarding the underlying biological mechanisms.

The functions of living organisms are controlled by biomolecular circuits, in which proteins and genes interact with each other through activation and repression interactions, forming complex networks. A common signal carrier is the concentration of the active form of a protein, which can be controlled through a number of mechanisms, including gene expression regulation and post-translational modification. Through the process of gene expression, proteins are produced by their corresponding genes, whose production rates can be activated or repressed by other proteins (transcription factors). Once the proteins are produced, they can be activated or inhibited, by other proteins or smaller molecules, through post-translational modification processes including covalent modification, such as phosphorylation, and allosteric modification (Alon 2007). We next describe some salient aspects of gene expression, focusing, for simplicity, on prokaryotic systems.

A gene is a piece of DNA whose expression rate can often be controlled by a DNA sequence upstream of the gene itself, called the promoter. The promoter contains the binding regions for the RNA polymerase, an enzyme that transcribes the gene into a messenger RNA molecule, which is then translated into protein by the ribosomes. The promoter also contains operator sites, which are binding regions where other proteins, called transcription factors, can bind. If these proteins are activators, they will help the RNA polymerase bind the promoter to start transcription. By contrast, if these proteins are repressors, they will prevent the RNA polymerase from binding the promoter. These activation and repression interactions are highly nonlinear and often stochastic; therefore, the most commonly used modeling frameworks include systems of nonlinear ordinary differential equations, stochastic differential equations, or the chemical master equation (Gillespie 1977, 2000).

The basic technique for constructing synthetic circuits is that of assembling, through the process of cloning, DNA sequences with prescribed combinations of promoters and genes such that a desired network of activation and repression interactions is created. For example, if we would like to create an inverter where protein A represses protein B, we can simply place the gene of B under the control of a promoter repressed by protein A. Currently, there is a library of parts that one can use to assemble a desired circuit this way. The set of parts includes promoters, gene coding sequences, terminators, and ribosome binding sites. Terminators are DNA sequences placed at the end of a gene to make the RNA polymerase terminate transcription, while ribosome binding sites are DNA sequences placed at the beginning of a gene, which establish the rate at which ribosomes will bind to the mRNA, determining the overall translation rate (Endy 2005). An area of intense research is the expansion of the library by creating mutations of existing parts or by assembling new ones.

Once a DNA sequence is created that encodes the desired circuit, it is inserted in a living cell either on the chromosome itself or on DNA plasmids. When the circuit is inserted in the chromosome, it will be in one copy, while when it is inserted in DNA plasmids, it will be in as many copies as the plasmid copy number. Plasmid copy number can vary from low copy (5-10 copies), to medium copy (20 copies), to high copy (about 100 copies). Once in the cell, the circuit will have the resources it requires to function, including RNA polymerase, ribosomes, amino acids, and ATP (the cell energy currency). In this sense, the cell can be viewed as a chassis for the synthetic circuits. The operation of the circuit can then be observed by monitoring the concentration of reporters, that is, of proteins that are easy to detect and quantify. These include fluorescent proteins, that is, proteins that exhibit bright fluorescence when exposed to light of a specific wavelength. Examples include the green, red, blue, and yellow fluorescent proteins. These fluorescent proteins are mainly employed in two different ways to measure the amount of a protein of interest. One can fuse the gene of the fluorescent protein with the gene expressing the protein of interest. Alternatively, one can use the protein of interest as a transcription factor of the fluorescent protein. In both cases, the concentration of the fluorescent protein will provide an indirect measurement of the concentration of the protein of interest.

It is also possible to apply external inputs to a circuit to control the activity of transcription factors. This is accomplished through the use of inducers, which are small signaling molecules that can be injected into the cell culture and can cross the cell wall. These inducers bind specific transcription factors and either activate them, allowing the transcription factor to bind the promoter operator sites, or inhibit them, reducing the transcription factor’s ability to bind the promoter operator sites.


Examples of Synthetic Biology 
Modules 

A number of modules comprising two or three genes were fabricated in the early days of synthetic biology (Atkinson et al. 2003; Becskei and Serrano 2000; Elowitz and Leibler 2000; Gardner et al. 2000; Stricker et al. 2008). We can group them into oscillators (Atkinson et al. 2003; Elowitz and Leibler 2000; Stricker et al. 2008), mono-stable systems (Becskei and Serrano 2000), and bistable systems called toggle switches (Gardner et al. 2000). More recently, feedforward loops have also been fabricated (Bleris et al. 2011).

Oscillators. The creation of circuits whose protein concentrations oscillate periodically in time has been a major focus. In fact, the ability to create an oscillator has the potential to shed light on the mechanisms at the basis of natural clocks, such as circadian rhythms and the cell cycle. Oscillator designs can be divided into two types: loop oscillators (Elowitz and Leibler 2000), in which repression/activation interactions occur in a loop topology, and oscillators based on the interplay between an autocatalytic loop and negative feedback (Atkinson et al. 2003; Stricker et al. 2008) (see Fig. 1).

Synthetic Biology, Fig. 1 Early gene circuits that have been fabricated in the bacterium E. coli: the negatively autoregulated gene (Becskei and Serrano 2000), the toggle switch (Gardner et al. 2000), the activator-repressor clock (Atkinson et al. 2003), and the repressilator (Elowitz and Leibler 2000)

The design requirements of synthetic circuits are usually explored through models of varying detail, starting with the use of low-dimensional “toy models,” which are composed of a set of nonlinear ordinary differential equations describing the rate of change of the circuit’s proteins. These models allow application of a number of tools from dynamical systems theory to infer parameter or structural requirements for a desired behavior. After toy models are analyzed, larger-scale mechanistic models are constructed, which include all the intermediate species taking part in the biochemical reactions. These models can be either deterministic or stochastic. Simulation is usually required for the study of these more complicated models, and the Gillespie algorithm is often employed for stochastic simulations (Gillespie 1977).
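
To make the stochastic simulation step concrete, here is a minimal sketch of the Gillespie algorithm for the simplest possible case: a single protein produced at a constant rate and degraded at a rate proportional to its copy number. This birth-death model, the function name, and the parameter names are illustrative assumptions and do not correspond to any specific circuit discussed in this entry.

```python
import random


def gillespie_birth_death(k, gamma, t_end, n0=0, seed=0):
    """Simulate production (rate k) and degradation (rate gamma * n) of one species.

    Returns the time-averaged copy number over [0, t_end] and the final copy number.
    Assumes k > 0 so that the total propensity never vanishes.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    area = 0.0  # integral of n(t) dt, used for the time average
    while True:
        birth = k
        death = gamma * n
        total = birth + death
        # Waiting time to the next reaction is exponential with rate `total`.
        dt = rng.expovariate(total)
        if t + dt >= t_end:
            area += n * (t_end - t)
            break
        area += n * dt
        t += dt
        # Pick which reaction fires, with probability proportional to its propensity.
        if rng.random() * total < birth:
            n += 1
        else:
            n -= 1
    return area / t_end, n
```

For this model the stationary mean copy number is `k / gamma`, so a call such as `gillespie_birth_death(10.0, 1.0, 500.0)` should return a time average close to 10, with fluctuations that reflect intrinsic noise.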

As an example of a toy model and related analysis, consider the activator-repressor clock of Atkinson et al. (2003) shown in Fig. 1. This oscillator is composed of an activator A activating itself and a repressor B, which, in turn, represses the activator A. Both activation and repression occur through transcription regulation. Denoting in italics the concentration of species, a toy model of this clock can be written as

$$\dot A = \frac{\beta_A (A/K_a)^n + \beta_{0,A}}{1 + (A/K_a)^n + (B/K_b)^m} - \gamma_A A, \qquad \dot B = \frac{\beta_B (A/K_a)^n + \beta_{0,B}}{1 + (A/K_a)^n} - \gamma_B B, \quad (1)$$

in which $\gamma_A$ and $\gamma_B$ represent protein decay (due to dilution and/or degradation). The functions $(\beta_A (A/K_a)^n + \beta_{0,A})/(1 + (A/K_a)^n + (B/K_b)^m)$ and $(\beta_B (A/K_a)^n + \beta_{0,B})/(1 + (A/K_a)^n)$ are called Hill functions and are the most commonly used models for transcription regulation (Alon 2007). The first Hill function in system (1) increases with $A$ and decreases with $B$, while the second one increases with $A$, as expected since A is an activator and B is a repressor. The key mechanism by which this system displays sustained oscillations is a supercritical Hopf bifurcation whose bifurcation parameter is the relative timescale of the activator dynamics with respect to the repressor dynamics (Del Vecchio 2007). Specifically, as the activator dynamics become faster than the repressor dynamics, the system goes through a supercritical Hopf bifurcation and a stable periodic orbit appears (Fig. 2).

Synthetic Biology, Fig. 2 Activator-repressor clock time trajectory
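
The toy model (1) can be explored numerically with a few lines of code. The sketch below integrates it with a classical Runge-Kutta scheme. All parameter values are hypothetical, chosen only so that the code runs, with the repressor decaying more slowly than the activator; whether sustained oscillations appear depends on the parameters, and no claim is made that these values reproduce Fig. 2.

```python
# Hypothetical parameter values for illustration only.
PARAMS = {
    "beta_A": 20.0, "beta_0A": 0.1, "gamma_A": 1.0, "Ka": 1.0, "n": 2,
    "beta_B": 5.0, "beta_0B": 0.01, "gamma_B": 0.2, "Kb": 1.0, "m": 4,
}


def clock_rhs(A, B, p=PARAMS):
    """Right-hand side of the activator-repressor toy model (1)."""
    xa = (A / p["Ka"]) ** p["n"]
    xb = (B / p["Kb"]) ** p["m"]
    dA = (p["beta_A"] * xa + p["beta_0A"]) / (1.0 + xa + xb) - p["gamma_A"] * A
    dB = (p["beta_B"] * xa + p["beta_0B"]) / (1.0 + xa) - p["gamma_B"] * B
    return dA, dB


def simulate_clock(A0, B0, t_end, dt=0.005, p=PARAMS):
    """Integrate model (1) with RK4; returns lists of A and B samples."""
    A, B = A0, B0
    As, Bs = [A], [B]
    for _ in range(int(round(t_end / dt))):
        k1 = clock_rhs(A, B, p)
        k2 = clock_rhs(A + 0.5 * dt * k1[0], B + 0.5 * dt * k1[1], p)
        k3 = clock_rhs(A + 0.5 * dt * k2[0], B + 0.5 * dt * k2[1], p)
        k4 = clock_rhs(A + dt * k3[0], B + dt * k3[1], p)
        A += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        B += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        As.append(A)
        Bs.append(B)
    return As, Bs
```

Because each Hill function is bounded by the larger of its two production coefficients, the concentrations remain bounded by that coefficient divided by the decay rate, which provides a quick sanity check on any simulated trajectory. Varying `gamma_A` relative to `gamma_B` changes the timescale separation that the text identifies as the bifurcation parameter.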

Mono-stable systems. The mono-stable system engineered through negative autoregulation was fabricated with the aim of understanding the role of negative feedback in attenuating biological noise. The results of Becskei and Serrano (2000) clearly showed that negative autoregulation can reduce intrinsic noise. Furthermore, the results of Austin et al. (2005) demonstrated that while low frequency noise is attenuated, noise at high frequency can be amplified by negative autoregulation, in accordance with Bode’s integral formula (Åström and Murray 2008).

Bistable systems. The toggle switch of Gardner et al. (2000) was the first bistable system constructed. It constitutes the simplest circuit with memory, in which the state of the system can be switched from one equilibrium (low, high) to the other (high, low) by external inputs. Once the system state is switched to one of these two equilibria, it will stay there unless another external perturbation is applied.
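
Bistability of this kind can be illustrated with the dimensionless two-repressor model commonly associated with the toggle switch, $\dot u = \alpha_1/(1 + v^{\beta}) - u$, $\dot v = \alpha_2/(1 + u^{\gamma}) - v$. The model is not written out in this entry, and the parameter values below are hypothetical; the sketch only shows that two different initial conditions settle on two different equilibria.

```python
def toggle_rhs(u, v, alpha1=10.0, alpha2=10.0, beta=2.0, gamma=2.0):
    """Mutual-repression model: each protein represses production of the other."""
    du = alpha1 / (1.0 + v ** beta) - u
    dv = alpha2 / (1.0 + u ** gamma) - v
    return du, dv


def settle(u0, v0, t_end=40.0, dt=0.01):
    """Integrate with forward Euler until t_end and return the final state."""
    u, v = u0, v0
    for _ in range(int(round(t_end / dt))):
        du, dv = toggle_rhs(u, v)
        u += dt * du
        v += dt * dv
    return u, v
```

Starting from `settle(5.0, 1.0)` the state converges to a (high, low) equilibrium, while `settle(1.0, 5.0)` converges to the mirrored (low, high) one. A transient input that pushes the state across the diagonal `u = v` would flip the switch, which is the memory property described above.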


Feedforward loops. While the early circuits described so far were fabricated mainly to investigate design principles for limit cycles and for robustness, many more circuits have since been fabricated with the aim of solving concrete engineering problems. As an example, the incoherent feedforward circuit of Bleris et al. (2011) was fabricated in bacteria E. coli with the aim of making protein production independent of DNA plasmid copy number. In fact, DNA copy number fluctuates stochastically with possibly large deviations from the nominal value. As a consequence, the concentration of proteins expressed from genes residing on a plasmid also fluctuates stochastically. In order to make protein concentration independent of an unknown DNA copy number, one could leverage principles for disturbance rejection such as integral control. While an explicit integral control action is particularly hard to implement through biological parts, incoherent feedforward loops are easier to implement and can accomplish the same disturbance rejection task. In these loops, the disturbance input affects the output through two branches, one in which the disturbance activates the output and a longer one in which the disturbance represses the output (Alon 2007). If these two branches are appropriately balanced, the steady-state value of the output will be practically independent of the disturbance input, leading to rejection of constant or slowly changing disturbances.
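
The balancing of the two branches can be seen in a deliberately simplified steady-state calculation. The model below is a hypothetical sketch, not the model of any cited circuit: the disturbance `d` (for instance a copy number) drives production of the output directly and also drives an intermediate repressor whose steady state is proportional to `d`. All parameter names and values are assumptions.

```python
def iffl_steady_state(d, beta=1.0, gamma=1.0, k=1.0, delta=1.0, K=1.0):
    """Steady-state output of a simple incoherent feedforward loop.

    Repressor:  dx/dt = k * d - delta * x
    Output:     dy/dt = beta * d / (1 + x / K) - gamma * y
    """
    x_ss = k * d / delta                      # repressing branch scales with d
    return (beta * d / gamma) / (1.0 + x_ss / K)


def unregulated_steady_state(d, beta=1.0, gamma=1.0):
    """Steady-state output when the repressing branch is absent."""
    return beta * d / gamma
```

With these values, doubling `d` from 50 to 100 changes `iffl_steady_state` by about one percent, whereas `unregulated_steady_state` doubles. The compensation is only approximate, and in this sketch it holds when `k * d / (delta * K)` is much larger than 1, which is one way to read the phrase "appropriately balanced".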
From Modules to Systems

One approach to creating systems that can accomplish sophisticated tasks is to assemble simpler modules, such as those described in the previous section (Purnick and Weiss 2009). For example, the artificial tissue homeostasis circuit proposed by Miller et al. (2012) is composed of several interconnected modules, including an activator-repressor clock, a toggle switch, a couple of inverters, and an “and” gate. Control of tissue homeostasis refers to the ability to regulate a cell type to a constant level in a multicellular community. This ability is central in several diseases such as cancer and diabetes, in which tissue homeostasis is misregulated. The design proposed by Miller et al. (2012) illustrates how a synthetic biological circuit can be modularly created to accomplish this complicated regulation function.

Layered logic gates are often necessary in order to integrate multiple signals. Moon
et al. (2012) have constructed an “and” gate that integrates more than two signals by
cascading pairs of “and” gates. Of course, problems of latency become more relevant as
the number of layers increases, and methods to mitigate these effects are being developed.

An application that requires the integration of multiple signals is the cell-type
classifier of Xie et al. (2011). Here, a synthetic gene circuit is created that integrates
sensory information from a number of molecular markers to determine whether a cell is in
a specific state, that is, cancer, and, in such a case, produces a protein output
triggering cell death. The design of this circuit is based on the composition of three
key modules. Specifically, a double inversion module senses high levels of a molecular
marker, a single inversion module senses low levels of a molecular marker, and a logical
“and” module finally integrates the outputs of the other two modules to produce the
output protein.

Finally, biofuels are another high-impact application of synthetic biology (Peralta-Yahya
et al. 2012). Metabolic engineering has been employed for a long time in order to engineer
microbes to produce advanced biofuels with similar properties to petroleum-based fuels.
One challenge in using microbes (or other living organisms) to convert feedstock into
biofuel is that of overcoming the endogenous cell regulation to achieve sufficiently high
yields such that advanced biofuels are economically advantageous. Specifically, engineered
pathways are optimized on the basis of nominal operating conditions, but these conditions
often change when microbes are in bioreactors. To mitigate this problem, synthetic gene
circuits have been designed to sense the metabolic status of the host and regulate key
points in the metabolic pathway to optimize yield (Zhang et al. 2012).

Main System-Level Challenges 
to Design 

One major challenge in synthetic biology is the ability to go from simple modules to
larger sophisticated systems (Purnick and Weiss 2009). Problems in advancing in this
direction can be divided into two categories: “hardware” problems and system-level
problems. Hardware problems include issues such as the availability of enough orthogonal
parts to allow scaling up the size of synthetic circuits. We do not expand on this here
and instead focus on system-level problems. These include issues such as context
dependence (Cardinale and Arkin 2012), that is, the fact that modules behave in a poorly
predictable way once they interact in the cell environment. This is a major obstacle to
creating larger circuits that behave predictably.

Problems of context dependence can be further divided into three qualitatively different
types: (a) inter-modular interactions, (b) interactions of synthetic circuits with the
cell machinery, and (c) perturbations in the external environment. We analyze each of
them separately.

(a) When modules are connected to each other to create larger systems, a protein in an
upstream module is used as an “input” to a downstream module. This creates a “loading”
on the upstream system because the output protein cannot take part in the upstream
module reactions whenever it is taking part in the downstream module reactions. As a
consequence, the behavior of the upstream system changes compared to when the system
functions in isolation (Del Vecchio et al. 2008; Saez-Rodriguez et al. 2004). These
loading effects have been called retroactivity to extend the notion of loading and
impedance to biomolecular systems. Solutions to mitigate this problem are being
investigated (Franco et al. 2011; Jayanthi and Del Vecchio 2011; Mishra et al. 2013).

(b) Ideally, the cell should function as a “chassis” for synthetic biology circuits. In
practice, this is not the case because the endogenous circuitry interacts with synthetic
circuits even when parts that are orthogonal to the endogenous systems are employed. A
major example of this interaction is the depletion of cellular resources, such as ATP,
RNA polymerase, and ribosomes, which are required for the operation of synthetic
circuits. This depletion reduces cell fitness, with deleterious consequences also for
synthetic circuits, a phenomenon called “metabolic burden” (Bentley et al. 1990). A more
subtle phenomenon than purely reducing cell fitness is that synthetic circuits compete
with each other for the same resources. This fact creates implicit and unwanted coupling
among circuits with unpredictable consequences. Approaches to mitigate these problems
are under investigation. One direction is the use of orthogonal RNA polymerase and
ribosomes (Wenlin and Chin 2009; Rackham and Chin 2005). A completely different, but
complementary, direction is that of establishing implementable design principles that
allow circuits to function robustly despite fluctuations in the resources they use.

(c) The external environment where a cell operates has a number of physical attributes,
which may also be subject to perturbations. These physical attributes include
temperature, acidity, nutrient levels, etc. Perturbations in these attributes often lead
to poor cell fitness or to nonstandard growth conditions, ultimately leading to
malfunctions of synthetic circuits.


Summary and Future Directions 

The future of synthetic biology depends heavily on the ability to scale up the
complexity of design to create more sophisticated functions. While a number of issues,
such as the availability of enough orthogonal parts, can be successfully addressed by
(nontrivial) fabrication of new parts, issues such as context dependence require a
system-level dynamic understanding of circuits and their interactions. Here is where
control and dynamical systems theory could greatly contribute. Control theory has proven
critical to reason about and engineer robustness in a number of concrete applications
including aerospace and automotive systems, robotics and intelligent machines,
manufacturing chains, and electrical, power, and information networks. Similarly, control
theory could enable the understanding of principles that ensure robust behavior of
synthetic circuits once they interact with each other in the cell environment, leading
to the ultimate progress of synthetic biology.

A number of challenges need to be addressed for the successful application of control
and dynamical systems theory to synthetic biology. The behavior of synthetic circuits is
highly nonlinear and, as a consequence, control theoretic tools designed for
understanding robustness in linear systems are not directly applicable. Understanding
how to exploit the rich structure of biomolecular circuits to quantitatively reason about
robustness to interconnections, competition for shared resources, and fluctuations of
temperature and nutrients is likely to have a major impact. Even with this understanding,
however, the question of how to implement robust designs with the currently available
biomolecular mechanisms must be addressed. Stochasticity is another major problem since
the behavior of synthetic circuits is intrinsically noisy. Unfortunately, the
availability of analytical tools that allow quantification of how perturbations and
uncertainty propagate through a nonlinear stochastic system is still limited, and
designers often resort to stochastic simulation. Finally, the values of the salient
parameters of the available parts are poorly known. Physical attributes such as binding
affinities, ribosome binding site strengths, promoter strengths, etc. are only known
within very coarse bounds. These bounds are also usually determined based on a specific
organism and in specific growth conditions, which may be different from the ones in
which the circuit is ultimately running. Hence, a central question is how to design and
implement a system such that the prescribed behavior is robust to all sources of
perturbations described above within a large range of possible parameter values.


Cross-References 

► Deterministic Description of Biochemical Networks

► Identification and Control of Cell Populations 

► Robustness Analysis of Biological Models 

► Stochastic Description of Biochemical 
Networks 
Abstract

This contribution discusses various aspects important to software for system
identification. Essential functionality for existing practice and the algorithmic
fundamentals this relies on are considered together with a brief discussion of
additional commonly useful support tools. Since software is intimately tied to the
hardware that it runs on, a discussion on this topic follows with an emphasis on
considering how future system identification software developments might best align
with clear current and future trends in computer architecture developments.


Keywords 

System identification; Computer-aided design; 
Introduction

Fundamental to the practice of system identification is the employment of appropriate
software to compute system estimates and evaluate their properties. One option is for
the user to code the necessary routines themselves in their computer language of choice.
For simple situations, such as least-squares estimation with a linearly parametrized
model, this approach is feasible.

However, it quickly becomes onerous and time consuming as one moves even slightly beyond
this simple example. In response to this, researchers have developed a number of software
packages designed to accommodate classes of data formats, model structures, and
estimation methods.

The purpose of this contribution is to profile 
the support that available system identification 
software provides, the underlying foundations 
on which this software depends, and the future 
capabilities that may be expected due to trends in 
desktop and portable computer capacity. 

The material to follow depends on explanations, definitions, and background presented in
► System Identification: An Overview, by Ljung, which should be read in conjunction with
this contribution.


Essential Functionality 

The essence of system identification software 
packages is that they implement an identification 
method X as defined in ► System Identification: 
An Overview. 

Typically, this involves taking a model structure specification $\mathcal{M}(\theta)$
together with $N$ observed data points $Z_N$ and translating that to a cost function
$V_N(\theta)$ for which a minimizer

$$\hat{\theta} = \arg\min_{\theta \in D_{\mathcal{M}}} V_N(\theta) \tag{1}$$

is then computed in order to deliver a system estimate $\mathcal{M}(\hat{\theta})$.

While the details of these fundamental operations vary according to the chosen model
structure and method, there are some shared aspects. To pick a starting point,
subspace-based estimation methods (► Subspace Techniques in System Identification) have
been one of the most significant developments in the recent history of system
identification, and they fundamentally involve a first stage of setting up and solving
the optimization problem

$$\hat{\beta} = \arg\min_{\beta} \|Y - \Phi\beta\|_F^2, \tag{2}$$

where $Y$, $\Phi$ are data-dependent matrices, $\beta$ is a $\theta$-dependent matrix,
and $\|\cdot\|_F$ is the Frobenius norm, which, for an $m \times n$ matrix $A$, is
defined as

$$\|A\|_F = \left( \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2 \right)^{1/2}. \tag{3}$$

This is a classic least-squares optimization problem, which also arises in other system
identification contexts, particularly when the prediction $\hat{y}(t \mid \theta)$ is a
linear function of $\theta$.

As is well known (Golub and Loan 1989), the minimizer $\hat{\beta}$ satisfies the
“normal equations”

$$\Phi^T \Phi \hat{\beta} = \Phi^T Y, \tag{4}$$

and if $\Phi^T \Phi$ is invertible, this allows for a closed-form solution

$$\hat{\beta} = (\Phi^T \Phi)^{-1} \Phi^T Y. \tag{5}$$

While formally correct, no system identification software package would compute
$\hat{\beta}$ in this manner since it is computationally inefficient and sensitive to
numerical rounding errors.

Drawing on decades of study on this topic in the numerical computations literature
(Golub and Loan 1989), system identification software packages rely on the QR
factorization

$$\Phi = QR = [\,Q_1 \mid Q_2\,] \begin{bmatrix} R_1 \\ 0 \end{bmatrix}, \tag{6}$$

where $Q$ is square and satisfies $Q^T Q = I$ (the identity matrix) and $R$ contains the
upper triangular square and invertible block $R_1$. This decomposition of $\Phi$ allows
the normal Eq. (4) to be re-expressed as

$$R_1 \hat{\beta} = Q_1^T Y. \tag{7}$$

Since $R_1$ is upper triangular, the solution $\hat{\beta}$ may then be found by
elementary and numerically robust backward substitution (Golub and Loan 1989).

The importance of efficient and accurate solution of normal equations to any system
identification software is not limited to these subspace or linearly parametrized cases.
For instance, the very general class of prediction error methods encompassed by the
formulation (1) involves a cost $V_N(\theta)$ that depends on the vector

$$E(\theta) = [\varepsilon(t_1, \theta), \ldots, \varepsilon(t_N, \theta)]^T \tag{8}$$

of differences between the observed data and the response of a model parametrized by
$\theta$. In the case of time-domain data, the elements of (8) are defined by

$$\varepsilon(t, \theta) = y(t) - \hat{y}(t \mid \theta). \tag{9}$$


In this general situation, it is most commonly 
the case that no closed-form solution for the 
optimization problem (1) exists. 

The strategy then taken by most system identification software packages is to employ a
gradient-based search for a minimizer. These methods are motivated by the use of a linear
approximation of $E(\theta)$ about a current putative minimizer $\theta_k$ according to

$$E(\theta) \approx E(\theta_k) + J(\theta_k)(\theta - \theta_k), \tag{10}$$

where $J(\theta_k)$ denotes the Jacobian matrix

$$J(\theta_k) = \left. \frac{\partial E(\theta)}{\partial \theta} \right|_{\theta = \theta_k}. \tag{11}$$

In the very common situation where $V_N(\theta)$ is a quadratic function of
$E(\theta)$, this implies the associated approximation

$$V_N(\theta) = \operatorname{Trace}\{E^T(\theta) E(\theta)\} = \|E(\theta)\|_F^2 \approx \|E(\theta_k) + J(\theta_k)(\theta - \theta_k)\|_F^2. \tag{12}$$

Via this reasoning, computation of an appropriate “search direction”
$p = \theta - \theta_k$ again involves the efficient solution of a linear least-squares
problem of the form (2), namely,

$$p = \arg\min_{p} \|E(\theta_k) + J(\theta_k) p\|_F^2. \tag{13}$$

More generally, system identification software packages extend this rationale and solve
(1) by generating a sequence of iterates $\{\theta_k\}$, which are refined according to

$$\theta_{k+1} = \theta_k + \mu p, \tag{14}$$

where $\mu$ is a step length that at each iteration $k$ may be altered until a cost
decrease

$$V_N(\theta_{k+1}) < V_N(\theta_k) \tag{15}$$

is achieved and the search direction $p$ again involves the solution of normal equations

$$\left[ J(\theta_k)^T J(\theta_k) + \lambda I \right] p = -J(\theta_k)^T E(\theta_k). \tag{16}$$

The choice $\lambda > 0$ implies what is called a Levenberg-Marquardt method, while
$\lambda = 0$ leads to a so-called Gauss-Newton update strategy, and there are further
variants such as “trust region” methods that are typically offered as options.

Via (16) we see that again system identification software comes to fundamentally depend
on underpinning numerical linear algebra, in this case, again via the QR decomposition.
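
The iteration (14)-(16) can be sketched for a deliberately tiny case. The example below is an illustration under stated assumptions, not the algorithm of any particular package: the model $\hat{y}(t \mid \theta) = e^{-\theta t}$ with a scalar $\theta$, the function name `fit_decay_rate`, and the step-halving rule are all hypothetical choices. With a scalar parameter the normal equations (16) reduce to a single division.

```python
import math


def fit_decay_rate(t, y, theta0, lam=0.0, max_iter=50, tol=1e-12):
    """Damped Gauss-Newton (lam = 0) or Levenberg-Marquardt (lam > 0) search
    for the scalar model y_hat(t | theta) = exp(-theta * t)."""

    def residuals(th):
        # E(theta): observed data minus model response, as in (8)-(9)
        return [yi - math.exp(-th * ti) for ti, yi in zip(t, y)]

    def cost(th):
        return sum(e * e for e in residuals(th))

    theta = theta0
    for _ in range(max_iter):
        E = residuals(theta)
        # Jacobian of E with respect to theta: d/dtheta (y - exp(-theta t))
        J = [ti * math.exp(-theta * ti) for ti in t]
        JtJ = sum(j * j for j in J)
        JtE = sum(j * e for j, e in zip(J, E))
        p = -JtE / (JtJ + lam)          # scalar version of (16)

        mu = 1.0                        # step length, halved until (15) holds
        while cost(theta + mu * p) >= cost(theta) and mu > 1e-10:
            mu *= 0.5
        if mu <= 1e-10:                 # no cost decrease found: stop
            break
        theta += mu * p                 # update (14)
        if abs(mu * p) < tol:
            break
    return theta


if __name__ == "__main__":
    t = [0.0, 0.5, 1.0, 1.5, 2.0]
    y = [math.exp(-0.7 * ti) for ti in t]   # noise-free data, true theta = 0.7
    print(fit_decay_rate(t, y, theta0=0.2))
```

For a vector-valued $\theta$ the division becomes a linear solve, which is where the QR machinery described above would enter.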

Another decomposition, the singular value decomposition (SVD), also has a significant
role to play, particularly with respect to subspace-based methods where it is essential
to the extraction of an estimated system parametrization $\hat{\theta}$ from
$\hat{\beta}$ referred to in (2).

In addition to matrix decompositions, other system identification methods depend on many
other even more fundamental linear algebra tools such as basic matrix/vector operations,
matrix inversion, and eigen-decomposition. Because of this dependence, most (Ljung 2012;
Kollar et al. 2006; Young and Taylor 2012; Garnier et al. 2012; Ninness et al. 2013) but
not all (Hjalmarsson and Sjoberg 2012) currently available system identification software
packages are built upon the MathWorks MATLAB (originally short for “matrix laboratory”)
package, which provides an efficient interface to the widely accepted standard numerical
linear algebra libraries LAPACK and EISPACK. For example, solving (2) efficiently and
robustly via QR decomposition and back-substitution of (7) is achieved transparently
using the MATLAB backslash operator with the simple command: beta = Phi\Y.
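
To make the QR route concrete outside MATLAB, the following is a minimal pure-Python sketch for a vector-valued Y. It is an illustration only: it uses modified Gram-Schmidt for the thin factorization, whereas production libraries such as LAPACK typically use Householder reflections, and the function name `qr_least_squares` is hypothetical.

```python
import math


def qr_least_squares(Phi, Y):
    """Solve min_beta ||Y - Phi beta||_2 via a thin QR factorization
    (modified Gram-Schmidt) followed by backward substitution.

    Phi is a list of rows (full column rank assumed); Y is a list of floats.
    """
    n_rows, n_cols = len(Phi), len(Phi[0])
    columns = [[Phi[i][j] for i in range(n_rows)] for j in range(n_cols)]

    Q = []                                         # orthonormal columns of Q_1
    R = [[0.0] * n_cols for _ in range(n_cols)]    # upper triangular R_1
    for j in range(n_cols):
        v = columns[j][:]
        for k in range(j):
            R[k][j] = sum(q * x for q, x in zip(Q[k], v))
            v = [x - R[k][j] * q for x, q in zip(v, Q[k])]
        R[j][j] = math.sqrt(sum(x * x for x in v))
        if R[j][j] == 0.0:
            raise ValueError("Phi does not have full column rank")
        Q.append([x / R[j][j] for x in v])

    # Right-hand side Q_1^T Y of the triangular system R_1 beta = Q_1^T Y
    rhs = [sum(q * y for q, y in zip(Q[j], Y)) for j in range(n_cols)]

    # Backward substitution, starting from the last unknown
    beta = [0.0] * n_cols
    for j in reversed(range(n_cols)):
        s = rhs[j] - sum(R[j][k] * beta[k] for k in range(j + 1, n_cols))
        beta[j] = s / R[j][j]
    return beta


if __name__ == "__main__":
    # Fit y = b0 + b1 * t to exact data generated with b0 = 1, b1 = 2
    Phi = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
    Y = [1.0, 3.0, 5.0, 7.0]
    print(qr_least_squares(Phi, Y))
```

The point of the sketch is that $\Phi^T \Phi$ is never formed, which is what avoids the loss of accuracy associated with (5).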


Additional Functionality and the 
Decision-Making Process 

As emphasized in ► System Identification: An Overview, the provision of an estimated
model is typically an iterative process (illustrated diagrammatically in Fig. 4 of
► System Identification: An Overview) of which just one component is the implementation
of an identification method X to deliver a system estimate
$\mathcal{M}(\hat{\theta})$.

In addition to this “essential functionality,” system identification software must also
provide tools and logistical support for the decision-making process of assessing
$\mathcal{M}(\hat{\theta})$ and, based on this, perhaps altering aspects such as the
choice of model structure $\mathcal{M}$, the experiment design X, or indeed the
identification method X.

To support this, system identification software 
packages may offer further capabilities such as: 

1. Nonparametric estimation methods that deliver estimates of linear system frequency
response without involving a parametrized model structure $\mathcal{M}(\theta)$ and
hence not involving (1)

2. Data preprocessing tools, such as to remove trends and to frequency selectively
prefilter data before use

3. Visualization tools to display and compare the time- and frequency-domain response of
estimated models

4. Model validation tools to determine if estimated models can be falsified by observed
data

5. Model accuracy measures that deliver statistical confidence bounds on estimated
parameters

6. Additional data processing tools such as Kalman filtering and smoothing routines and
sequential Monte Carlo (particle filter) routines that are used to compute
$V_N(\theta)$ but have many other applications

7. Graphical user interface (GUI) support in 
order to aid organization of the various 
aspects of data preprocessing, model structure 
selection, algorithm selection, estimate 
computation, model validation, and model 
visualization 

8. The employment of symbolic computation 
capabilities to aid complex model structure 
specification and preprocessing for efficient 
numerical implementation (Hjalmarsson and 
Sjoberg 2012) 

Note that with the exception of this last point (8), the computations associated with
this additional functionality again depend fundamentally on efficient numerical linear
algebra software.

Computing Platforms 

Currently available system identification software packages are designed for standard
desktop computing environments, and as such their capabilities are intimately tied to
those of the central processing unit (CPU), memory, and other architectural features of
this hardware.

For instance, the linear algebra underpinnings just discussed are typically implemented
in serially coded form, and hence bus bandwidth, together with memory and CPU speed, will
be the fundamental factors affecting software performance. Taking CPU speed as an
example, the evolution of clock speed for the very commonly used Intel architecture CPUs
is shown as the red curve in Fig. 1 and, as can be seen, has largely plateaued over the
last decade after two orders of magnitude growth in the decade preceding it.

As a result, and roughly speaking, system 
estimates that took a minute to compute in the 
early 1990s took under a second to compute in 
the early part of this century, but are essentially 
no faster to compute now, a further decade later. 

As a result, while system identification software has continued to grow in
sophistication, in areas that involve high computational burdens, such as estimation of
complex and high-dimensional model structures, or the implementation of compute-intensive
algorithms, the capability of system identification software has been hardware limited
for some time.

At the same time, as the blue line in Fig. 1 illustrates, Moore’s law continues to hold,
and transistor densities continue to increase. While this is delivering no greater serial
CPU speed, it is delivering multiple CPU core availability. Future advances in system
identification software capability will therefore need to exploit the potential for
parallel computation.

System Identification Software, Fig. 1 Trends in desktop CPU capacity taking Intel as an
example. Serial throughput speeds have long plateaued, but transistor density continues
to grow, which delivers growing multiple cores

Indeed, in current MATLAB, the fundamental numerical linear algebra routines previously
mentioned such as QR-based solution of normal equations, eigenvalue, and SVD
decompositions will all automatically execute on multiple computational threads on
multicore-enabled machines. Expanding this to take advantage of even higher levels of
parallelism is the subject of current research.

While these developments will deliver performance enhancements for existing system
identification methods, they will also open up the possibility for new tools to be added
to system identification software suites.

For example, in addition to the existing subspace, prediction error, and maximum
likelihood methods just mentioned, there is another important estimation approach that
does not involve the solution of an optimization problem such as (1) or (2) and for
which there is always a closed-form expression for the parameter estimate. It is the
conditional mean estimate

$$\hat{\theta} = \mathrm{E}\{\theta \mid Y\}, \tag{17}$$

which is a Bayesian approach that depends on the calculation of the posterior density of
the parameters $\theta$ given the data $Y$ according to

$$p(\theta \mid Y) = \frac{p(Y \mid \theta)\, p(\theta)}{p(Y)}, \tag{18}$$

where $p(\theta)$ is a prior that allows for incorporation of user knowledge (before
observing the data) and $p(Y \mid \theta)$ is the usual data likelihood.

Not only does this estimate have an explicit formulation; it is also the minimum mean
square error estimate in that for any other estimate $\beta = f(Y)$ computed as any
other measurable function $f$ of the data $Y$, it holds that

$$\mathrm{E}\{\|\theta - \hat{\theta}\|^2\} \leq \mathrm{E}\{\|\theta - \beta\|^2\}. \tag{19}$$

In this sense, the conditional mean (17) is the most accurate estimate. Furthermore,
quantifications of estimation accuracy may be directly obtained via the marginal
densities $p(\theta_i \mid Y)$ of individual parameter vector values $\theta_i$.

Nevertheless, it is currently not widely used. There are no doubt philosophical reasons
for this stemming from the well-known debate between frequentist and Bayesian
perspectives on inference (Efron 2013).

Another key reason is that it is difficult to compute. It requires the evaluation of a
multidimensional integral,

$$\mathrm{E}\{\theta \mid Y\} = \int \int \cdots \int \theta\, p(\theta \mid Y_N)\, d\theta_1 \cdots d\theta_{n_\theta}, \tag{20}$$

as does the computation of the marginal densities

$$p(\theta_i \mid Y_N) = \int \cdots \int p(\theta \mid Y_N)\, d\theta_1 \cdots d\theta_{i-1}\, d\theta_{i+1} \cdots d\theta_{n_\theta}. \tag{21}$$

Evaluating these quantities requires adding fundamentally new capability beyond
efficient linear algebra support to system identification software. It involves adding
capability for numerical integration.

Integration in one dimension is straightforward. The well-known and widely used
Simpson’s rule is remarkably efficient in that the relationship between the
computational error and the number of grid points $m$ obeys

$$\text{Error} = O(m^{-4}) \tag{22}$$

so that every order of magnitude increase in $m$ delivers four extra digits of
precision. However, (20) is an $n_\theta = \dim\{\theta\}$ dimensional integral, and $m$
grid points on each of $n_\theta$ axes imply

$$M = m^{n_\theta} \tag{23}$$

function evaluations. This can blow up quite quickly, as illustrated in Fig. 2 for the
case of only modest $m = 30$ grid points and with respect to the very simple problem of
estimating a straightforward linear output-error model of increasing order.

System Identification Software, Fig. 2 Increase in number of function evaluations $M$
required for Simpson’s rule integration with $m = 30$ grid points on each parameter axis
associated with linear output-error models of increasing order. Note that accounting for
both numerator and denominator parameters, $n_\theta = 2 \times \text{model order} + 1$

On a serial CPU platform, there is an upper limit of time available to wait for a result
and hence an upper limit $M$ of function evaluations that are tolerable. Viewed as a
function of this, the accuracy of simple Simpson’s rule methods is

$$\text{Error} = O(M^{-4/n_\theta}), \tag{24}$$

which is not attractive as model complexity and hence $n_\theta$ grows.
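
Both effects can be checked numerically with a short sketch. The function names are hypothetical, and the test integrand (sin on $[0, \pi]$, whose integral is exactly 2) is chosen only for illustration. The first function exhibits the fourth-order convergence in (22); the second evaluates the grid count (23) using the parameter count quoted in the caption of Fig. 2.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson's rule with n (even) subintervals, i.e. n + 1 grid points."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def grid_evaluations(m, model_order):
    """Function evaluations M = m ** n_theta with n_theta = 2 * model_order + 1."""
    n_theta = 2 * model_order + 1
    return m ** n_theta


if __name__ == "__main__":
    err_coarse = abs(simpson(math.sin, 0.0, math.pi, 10) - 2.0)
    err_fine = abs(simpson(math.sin, 0.0, math.pi, 20) - 2.0)
    # Doubling the grid should reduce the error by roughly 2**4 = 16
    print("error ratio:", err_coarse / err_fine)
    for order in range(1, 6):
        print("order", order, "needs", grid_evaluations(30, order), "evaluations")
```

Even a fifth-order model needs $30^{11}$ evaluations on this grid, which is the blow-up that motivates the randomized methods discussed next.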

A further and vitally important problem is that it will generally not be clear where to
allocate the $m$ grid points on each axis since the support of the posterior
$p(\theta \mid Y)$ is not readily known. Indeed, a main point of computing the
multidimensional integrals associated with the marginals (21) is to determine this
support.

A strategy to address these difficulties is based on the strong law of large numbers
(SLLN). Namely, if random draws $x^i \sim p(x)$ from a density $p(x)$ can be obtained,
then sample averages of functions of them converge with probability one to the ensemble
average expectation, which is an integral:

$$\lim_{M \to \infty} \frac{1}{M} \sum_{i=1}^{M} f(x^i) = \mathrm{E}\{f(x)\} = \int f(x)\, p(x)\, dx. \tag{25}$$

This principle may then be used as a “randomized” method to compute an estimate $I_M$ of
an integral $I$; viz.,

$$I = \int f(x)\, p(x)\, dx \approx I_M = \frac{1}{M} \sum_{i=1}^{M} f(x^i). \tag{26}$$

Furthermore, if the $x^i$ are independent draws, then

$$\mathrm{Var}\{I_M\} = \frac{1}{M^2} \sum_{i=1}^{M} \mathrm{Var}\{f(x^i)\} = \frac{1}{M} \mathrm{Var}\{f(x)\}, \tag{27}$$

and hence the absolute error in integral evaluation is

$$|I - I_M| \approx O(M^{-1/2}). \tag{28}$$

The vital point is that as opposed to (24), this error rate is independent of the
dimension of $x$ and hence independent of the dimension of the integral $I$.
Furthermore, the grid points are the realizations $\{x^i\}$, which naturally will lie
within the support of the integrand $f(x)p(x)$ and do not need to be otherwise designed.
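
A minimal sketch of (26) follows, under the assumption that $p(x)$ is a standard normal density in `dim` dimensions and $f(x) = \|x\|^2$, so that the exact integral equals `dim`. These choices and the function name are illustrative only. Note that while the $M^{-1/2}$ rate does not depend on the dimension, the constant $\mathrm{Var}\{f(x)\}$ in (27) generally does.

```python
import random


def mc_mean_sq_norm(dim, n_samples, seed=0):
    """Monte Carlo estimate of E{||x||^2} for x ~ N(0, I) in `dim` dimensions.

    The exact value is `dim`; the estimator is the sample average in (26).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim))
    return total / n_samples


if __name__ == "__main__":
    for dim in (1, 10):
        print(dim, mc_mean_sq_norm(dim, 20000))
```

The same number of draws is used in one and in ten dimensions, in contrast with the $m^{n_\theta}$ growth of a Simpson grid.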

Of course, this depends on a means to draw samples from an arbitrary density $p(\cdot)$
of interest, but simple methods such as the Metropolis-Hastings method and the “slice
sampler” exist to achieve this (Mackay 2003).
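
As an illustration of how such a sampler delivers the conditional mean (17), here is a minimal random-walk Metropolis sketch for a scalar parameter. It is a simplified special case of Metropolis-Hastings (symmetric Gaussian proposal); the function name, step size, burn-in length, and the example posterior are assumptions made for illustration.

```python
import math
import random


def metropolis_mean(log_density, x0, n_samples, step=1.0, burn_in=1000, seed=0):
    """Estimate the mean of a density known only up to a constant.

    log_density: log of the unnormalized target, e.g. log p(Y | theta) + log p(theta)
    """
    rng = random.Random(seed)
    x, logp = x0, log_density(x0)
    total = 0.0
    for k in range(burn_in + n_samples):
        candidate = x + rng.gauss(0.0, step)        # symmetric random-walk proposal
        logp_candidate = log_density(candidate)
        # Accept with probability min(1, p(candidate) / p(x))
        if rng.random() < math.exp(min(0.0, logp_candidate - logp)):
            x, logp = candidate, logp_candidate
        if k >= burn_in:
            total += x
    return total / n_samples


if __name__ == "__main__":
    # Hypothetical posterior: Gaussian with mean 2 and unit variance (unnormalized)
    print(metropolis_mean(lambda th: -0.5 * (th - 2.0) ** 2, x0=0.0, n_samples=50000))
```

The normalizing constant $p(Y)$ in (18) never needs to be computed, since only density ratios enter the acceptance test. Successive draws are correlated, so the variance expression (27), which assumes independence, is only a guide here.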

Importantly too, these randomized methods are ideally suited to exploiting the growing
availability of desktop multicore computing platforms. Generating $M$ realizations to
form the integral approximation $I_M$ in (26) may be achieved in one-tenth the time
simply by running ten independently initialized random number generators in parallel,
each generating one realization of length $M/10$. The method (26) is thus (in principle)
trivial to parallelize.
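
The chunking idea can be sketched as follows. This is an illustration only: the integrand ($x^2$ under a standard normal, exact value 1), the function names, and the use of distinct integer seeds as a stand-in for independently initialized generators are assumptions. A thread pool is used to keep the example self-contained; in CPython, threads will not give a real speedup for this pure-Python workload, and a process pool or GPU implementation would be needed for that.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def chunk_sum(job):
    """Sum of f(x^i) = (x^i)^2 over one chunk of draws from N(0, 1)."""
    seed, n = job
    rng = random.Random(seed)          # separately seeded generator per worker
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))


def parallel_mc(n_total, n_workers=10):
    """Estimate E{x^2} by splitting the M draws into n_workers equal chunks."""
    n_chunk = n_total // n_workers
    jobs = [(seed, n_chunk) for seed in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        sums = list(pool.map(chunk_sum, jobs))
    # Combining the chunk sums gives the same average as one long run of draws
    return sum(sums) / (n_chunk * n_workers)


if __name__ == "__main__":
    print(parallel_mc(100000))
```

No communication between workers is needed until the final averaging step, which is what makes the method easy to distribute.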

Furthermore, much greater parallelization and hence also speedup may be achieved by
employing the “graphics processing units” (GPUs) in desktop computers. These GPUs are
inexpensive because they service a high-volume consumer demand for interactive gaming,
which requires high-speed numerical computation for 3D-projected graphics. As such these
GPUs have evolved to provide hundreds of parallel processing cores, each clocked in the
gigahertz range.

To give an impression of the computational capability of GPU-based platforms, the
single-precision giga-FLOPS (floating-point operations per second) performance history
for NVIDIA brand GPUs and Intel architecture processors designed for desktop
applications is profiled in Fig. 3.

This shows theoretical performance, assuming all cores may be fully utilized constantly.
In reality, this is never possible due to communication and architecture restrictions.
For example, GPU architectures are based on an SIMD (single instruction, multiple data)
design, so at any one time many cores must execute the identical instruction, but may do
so on different data. Analysis of these and other aspects relevant for system
identification software implementation requires detailed study (Lee et al. 2010).

The fact that desktop hardware architectures have offered and will continue to offer
more but not faster processing cores may be exploited in system identification software
beyond this Bayesian setting. For example, the last decade has seen great interest in
delivering estimation methods for an increasingly broad range of nonlinear model
structures, a quite general version of which can be expressed in the nonlinear
state-space form

$$x(t+1) \sim p(x(t+1) \mid x(t), \theta) \tag{29}$$

$$y(t) \sim p(y(t) \mid x(t), \theta). \tag{30}$$

In principle, there is no reason why this cannot be straightforwardly addressed by the
usual maximum likelihood approach of forming the likelihood

$$p(Y_N \mid \theta) = \prod_{t=1}^{N} p(y(t) \mid Y_{t-1}, \theta), \qquad Y_t = \{y(1), \ldots, y(t)\} \tag{31}$$

and then using this as the cost function $V_N(\theta)$ in (1) and then proceeding with
the usual gradient-based search. Indeed, there exist explicit formulae for computing the
predictive densities $p(y(t) \mid Y_{t-1}, \theta)$ required in (31). Namely, the coupled
measurement update

$$p(x(t) \mid Y_t, \theta) = \frac{p(y(t) \mid x(t), \theta)\, p(x(t) \mid Y_{t-1}, \theta)}{p(y(t) \mid Y_{t-1}, \theta)},$$

$$p(y(t) \mid Y_{t-1}, \theta) = \int p(y(t) \mid x(t), \theta)\, p(x(t) \mid Y_{t-1}, \theta)\, dx(t)$$

and time update

$$p(x(t+1) \mid Y_t, \theta) = \int p(x(t+1) \mid x(t), \theta)\, p(x(t) \mid Y_t, \theta)\, dx(t) \tag{32}$$

equations.

However, again we are faced with the problem of numerically evaluating multidimensional
integrals. The integral dimension this time is that of the state vector $x(t)$, which
may be less than that of the parameter vector $\theta$ just discussed, but $2N$ of these
integrals need to be evaluated in order to compute the likelihood (31), and this needs
to be redone for each step of any associated gradient-based search.

Again, a randomized algorithm approach 
based on the SLLN could be considered as a 
way forward in system identification software 
development. Indeed, sequential Monte Carlo 
(SMC) algorithms (aka particle filtering) (Doucet 
and Johansen 2011) have been specifically 
developed to compute the above integrals 
involved in the time and measurement update, 
and there has been recent work (Schon et al. 
2011; Andrieu et al. 2010) on employing this to 
develop software for the estimation of the general 
nonlinear model (29) and (30). 
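
To indicate what such an algorithm looks like, the following is a minimal bootstrap particle filter sketch that approximates the log of the likelihood (31). The scalar linear Gaussian model used (state $x(t+1) = a\,x(t) + w(t)$, measurement $y(t) = x(t) + v(t)$), the initial particle distribution, and all names are assumptions chosen only so that the example is short. For this particular model a Kalman filter would give the exact answer.

```python
import math
import random


def gauss_pdf(z, var):
    """Density of a zero-mean Gaussian with variance `var`, evaluated at z."""
    return math.exp(-0.5 * z * z / var) / math.sqrt(2.0 * math.pi * var)


def particle_loglik(y, a, q=1.0, r=1.0, n_particles=500, seed=0):
    """Bootstrap particle filter estimate of log p(Y_N | theta), with theta = a."""
    rng = random.Random(seed)
    # Particles approximating p(x(1)); a standard normal prior is assumed here
    x = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    loglik = 0.0
    for yt in y:
        # Measurement update: weight each particle by p(y(t) | x(t), theta)
        w = [gauss_pdf(yt - xi, r) for xi in x]
        # The mean weight approximates the predictive density p(y(t) | Y_{t-1}, theta)
        loglik += math.log(sum(w) / n_particles)
        # Resample in proportion to the weights, giving draws from p(x(t) | Y_t, theta)
        x = rng.choices(x, weights=w, k=n_particles)
        # Time update: propagate each particle through the state equation
        x = [a * xi + rng.gauss(0.0, math.sqrt(q)) for xi in x]
    return loglik


if __name__ == "__main__":
    # Simulate N = 100 observations from the assumed model with a = 0.8
    rng = random.Random(1)
    state, data = 0.0, []
    for _ in range(100):
        data.append(state + rng.gauss(0.0, 1.0))
        state = 0.8 * state + rng.gauss(0.0, 1.0)
    print(particle_loglik(data, 0.8), particle_loglik(data, -0.8))
```

Each of the two integrals in (32) is replaced by an average over particles, and the work per particle is independent of the others, which is why these methods suit parallel hardware.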

The resulting algorithms are computationally intensive, to the point where
implementation on serial CPU architectures means they are limited to deployment on
nonlinear model structures of very low state dimension. However, again because the SLLN
is at the heart of the methods, and averaging over multiple shorter runs computed in
parallel is numerically equivalent to (but potentially much faster than) averaging over
one long run on a serial machine, there is scope for future system identification
software to employ these approaches.

System Identification Software, Fig. 3 Historical trend of theoretical single-precision
giga-FLOPS performance of commodity NVIDIA brand GPUs versus Intel architecture CPUs
designed for desktop computing (axis label: theoretical GFLOP/second)


Examples of Available System 
Identification Software 

With the features of current and perhaps future system identification software packages
profiled, it may be useful to make specific mention of particular system identification
software packages that have been under active development for a substantial period of
time. These include the following commercially available packages:

1. The MathWorks System Identification Toolbox
(Ljung 2012), which is arguably the most
mature and comprehensive system identification
software available

2. The GAMAX Frequency Domain System Identification
Toolbox (Kollar et al. 2006), which
specializes in estimation of models based on
measurements in the frequency domain

3. The Adaptx software (Larimore 2000) specializing
in the estimation of state-space models
using subspace-based methods

Noncommercial and freely available system
identification software packages that are relevant
include:

1. The “computer-aided program for time-series
analysis and identification of noisy
systems” (CAPTAIN) toolbox (Young and
Taylor 2012), which provides a platform
supporting the “refined instrumental variable”
(RIV) algorithm for linear system
estimation

2. The “continuous-time system identification”
(CONTSID) toolbox (Garnier et al. 2012),
which specializes in the estimation of
continuous-time models

3. The “interactive software tool for system
identification education” (ITSIE) toolbox
(Guzman et al. 2012), which has an
emphasis on education and training in system
identification principles

4. The “University of Newcastle identification
toolbox” (UNIT) software (Ninness et al.
2013) that is designed as an open platform
for researchers to evaluate the performance
of new methods relative to established
ones

Summary and Future Directions

A case can be mounted that at its heart, system 
identification is about the design of software and 
the understanding of the results provided by it. 
Certainly, the field has been built on decades 
of deep theoretical contributions, but this has 
been very practically focused either on delivering 
new algorithms that may be directly implemented 
or on better understanding the performance of 
existing algorithms. 

Efficient numerical linear algebra routines 
have traditionally been the foundation of 
the resulting proven and effective system 
identification methods and software to date, and 
these have scaled in effectiveness as desktop 
computing clock speeds have scaled. 

However, in the recent past and for the foreseeable
future, CPU clock speeds are static while the
number of available processor cores increases.
Delivering greater system identification capacity
will require the development of methods whose
software implementations can harness this growing
availability of multiple processor cores.

Cross-References 

► Frequency Domain System Identification 

► Nonlinear System Identification Using Particle 
Filters 

► System Identification: An Overview 

► System Identification Techniques: Convexification,
Regularization, and Relaxation


Recommended Reading 

For readers wishing to gain a deeper understanding
of the numerical linear algebra aspects
discussed here, the classic text (Golub and Van Loan
1989) is recommended. Those wishing further
background on the calculation of multidimensional
integrals via randomized algorithms such
as Metropolis-Hastings and slice sampling will
find (Mackay 2003) useful. The particle filtering
methods mentioned here for nonlinear estimation
problems are clearly explained in Doucet
and Johansen (2011). Readers interested in further
detail on numerical computations on GPU-based
platforms supporting these computations
will find (Lee et al. 2010) useful.

Abstract

System identification has been developed, by
and large, following the classical parametric approach.
In this entry we discuss how regularization
theory can be employed to tackle the system
identification problem from a nonparametric (or
semi-parametric) point of view. Both regularization
for smoothness and regularization for sparseness
are discussed, as flexible means to face
the bias/variance dilemma and to perform model
selection. These techniques also have advantages
from the computational point of view, sometimes
leading to convex optimization problems.


Keywords 

Kernel methods; Nonparametric methods; Optimization;
Sparse Bayesian learning; Sparsity


Introduction 

System identification is concerned with automatic
model building from measured data. Under
this unifying umbrella, this field spans a rather
broad spectrum of topics, considering different
model classes (linear, hybrid, nonlinear, continuous,
and discrete time) as well as a variety
of methodologies and algorithms, bringing together
in a nontrivial way concepts from classical
statistics, machine learning, and dynamical systems.

Even though considerable effort has been devoted
to specific areas, such as parametric methods
for linear system identification which are by
now well developed (see the introductory article
► System Identification: An Overview), it is fair
to say that modeling still is, by far, the most
time-consuming and costly step in advanced process
control applications. As such, the demand for
fast and reliable automated procedures for system
identification makes this exciting field still a very
active and lively one.

Suffice it here to recall that, following
the classic parametric maximum likelihood
(ML)/prediction error (PE) framework, the
candidate models are described using a finite
number of parameters $\theta \in \mathbb{R}^n$. After the model
classes have been specified, the following two
steps have to be undertaken:

(i) Estimate the model complexity $\hat n$.

(ii) Find the estimator $\hat\theta \in \mathbb{R}^{\hat n}$ minimizing a cost
function $J(\theta)$, e.g., the prediction error or
(minus) the log-likelihood.

Both of these steps are critical, yet for different
reasons: step (ii) boils down to an optimization
problem which, in general, is non-convex and as
such it is very hard to guarantee that a global
minimum is achieved. The regularization techniques
discussed in this entry sometimes allow
the identification problem to be reformulated as a
convex program, thus solving the issue of local
minima.

In addition, fixing the system complexity equal
to the “true” one is a rather unrealistic assumption,
and in practice the complexity $n$ has to be
estimated as per step (i). In practice there is never
a “true” model, certainly not in the model class
considered. The problem of statistical modeling
is first of all an approximation problem; one
seeks an approximate description of “reality”
which is at the same time simple enough
to be learned with the available data and also
accurate enough for the purpose at hand. On this
issue see also the section “Trade-off Between
Bias and Variance” in ► System Identification:
An Overview. This has nontrivial implications,
chiefly the facts that classical order selection
criteria are based on asymptotic arguments and
that the statistical properties of estimators $\hat\theta$ after
model selection, called post-model-selection estimators
(PMSEs), are in general difficult to study
(Leeb and Potscher 2005) and may lead to undesirable
behavior. Experimental evidence shows
that this is not only a theoretical problem but also
a practical one (Pillonetto et al. 2011; Chen et al.
2012). On top of this statistical aspect, there is
also a computational one. In fact the model selection
step, which includes as special cases also
variable selection and structure selection, may
lead to computationally intractable combinatorial
problems. Two simple examples which reveal the
combinatorial explosion of candidate models are
the following: (a) variable selection: consider
a high-dimensional time series (MIMO) where
not all inputs/outputs are relevant and one would
like to select $k$ out of $m$ available input signals,
where $k$ is not known and needs to be inferred
from data (see, e.g., Banbura et al. (2010) and
Chiuso and Pillonetto (2012)), and (b) structure
selection: consider all autoregressive models of
maximal lag $p$ with only $p_0 < p$ nonzero coefficients,
where one would like to estimate how many
($p_0$) and which coefficients are nonzero. The
same combinatorial problem arises in hybrid system
identification (e.g., switching ARX models).
Given that enumeration of all possible models
is essentially impossible due to the combinatorial
explosion of candidates, selection could be performed
using greedy approaches from multivariate
statistics, such as stepwise methods (Hocking
1976).

The system identification community, inspired
by work in statistics (Tibshirani 1996; Mackay
1994), machine learning (Rasmussen and
Williams 2006; Tipping 2001; Bach et al.
2004), and signal processing (Donoho 2006;
Wipf et al. 2011), has recently developed and
adapted methods based on regularization to
jointly perform model selection and estimation
in a computationally efficient and statistically
robust manner. Different regularization strategies
have been employed which can be classified
in two main classes: regularization induced
by so-called smoothness priors (aka Tikhonov
regularization; see Kitagawa and Gersh (1984)
and Doan et al. (1984) for early references in the
field of dynamical systems) and regularization
for selection. The latter is usually achieved by
convex relaxation of the $\ell_0$ quasinorm (such
as the $\ell_1$ norm and variations thereof such as sum
of norms, nuclear norm, etc.) or other non-convex
sparsity-inducing penalties which can be
conveniently derived in a Bayesian framework,
aka sparse Bayesian learning (SBL) (Mackay
1994; Tipping 2001; Wipf et al. 2011).

The purpose of this entry is to guide the
reader through the most interesting and promising
results on this topic as well as areas of
active research; this subjective view
only reflects the author’s opinion, and
different authors could have offered a different
perspective.

While, as mentioned above, system identification
studies various classes of models (ranging
from linear to general “nonlinear” models), in
this entry, we shall restrict our attention to specific
ones, namely, linear and hybrid dynamical
systems. The field of nonlinear system identification
is so vast (a quote sometimes attributed
to S. Ulam has it that the study of nonlinear
systems is a sort of “non-elephant zoology”)
that even though it has largely benefitted from
the use of regularization, it cannot be addressed
within the limited space of this contribution. The
reader is referred to the Encyclopedia chapters
► Nonlinear System Identification: An Overview
of Common Approaches and ► Nonlinear System
Identification Using Particle Filters for more details
on nonlinear model identification.


System Identification 

Let $u_t \in \mathbb{R}^m$, $y_t \in \mathbb{R}^p$ be, respectively, the
measured input and output signals in a dynamical
system; the purpose of system identification is
to find, from a finite collection of input-output
data $\{u_t, y_t\}_{t \in [1,N]}$, a “good” dynamical model
which describes the phenomenon under observation.
The candidate model will be searched for
within a so-called “model set” denoted by $\mathcal{M}$.
This set can be described in parametric form
(see, e.g., Eq. (3) in ► System Identification: An
Overview) or in a nonparametric form. In this entry
we shall use the symbol $\mathcal{M}_n(\theta)$ for parametric
model classes where the subscript $n$ denotes the
model complexity, i.e., the number of free parameters.

Linear Models

The first part of the entry will address identification
of linear models, i.e., models described by a
convolution

$$y_t = \sum_{k=1}^{\infty} g_k u_{t-k} + \sum_{k=0}^{\infty} h_k e_{t-k}, \qquad t \in \mathbb{Z} \tag{1}$$

where $g$ and $h$ are the so-called impulse responses
of the system and $\{e_t\}_{t \in \mathbb{Z}}$ is a zero-mean
white noise process which under suitable
assumptions is the one-step-ahead prediction error;
a convenient description of the linear system
(1) is given in terms of the transfer functions

$$G(q) := \sum_{k=1}^{\infty} g_k q^{-k}, \qquad H(q) := \sum_{k=0}^{\infty} h_k q^{-k}$$

The linear model (1) naturally yields an “optimal”
(in the mean square sense) output predictor
which shall be denoted later on by $\hat y_{t|t-1}$. As
mentioned above, under suitable assumptions, the
noise $e_t$ in (1) is the so-called innovation process
$e_t = y_t - \hat y_{t|t-1}$. See also Eq. (8) in ► System
Identification: An Overview.

When $g$ and $h$ are described in a parametric
form, we shall use the notation $g_k(\theta)$, $h_k(\theta)$, and,
likewise, $G(q,\theta)$, $H(q,\theta)$, and $\hat y_{t|t-1}(\theta)$.

Example 7 Consider the so-called “output-error”
model, i.e., assume $H(q) = 1$. An example
of parametric model class is obtained restricting
$G(q,\theta)$ to be a rational function

$$G(q,\theta) = K \prod_{i=1}^{n} \frac{q - z_i}{q - p_i}$$

where $\theta := [K, p_1, z_1, \ldots, p_n, z_n]$ is the parameter
vector. Note that the parameter vector $\theta$ may
be subjected to constraints $\theta \in \Theta$, e.g., enforcing
that the system be bounded input, bounded output
(BIBO) stable ($|p_i| < 1$) or that the impulse
response be real ($K \in \mathbb{R}$ and poles $p_i$ and zeros
$z_i$ appear in complex conjugate pairs).

An example of nonparametric model is obtained,
e.g., postulating that $g_k$ is a realization
of a Gaussian process (Rasmussen and Williams
2006) with zero mean and a certain covariance
function $R(t,s) = \mathrm{cov}(g_t, g_s)$. For instance, the
choice $R(t,s) = \lambda^t \delta_{t-s}$, where $|\lambda| < 1$ and
$\delta_k$ is the Kronecker symbol, postulates that
$g_t$ and $g_s$ are uncorrelated for $t \neq s$ and that
the variance of $g_t$ decays exponentially in $t$; this
latter condition ensures that each realization $g_k$,
$k > 0$, is BIBO stable with probability one. The
exponential decay of $g_t$ guarantees that, for any
practical purpose, it can be considered zero for
$t > T$ for a suitably large $T$. This allows the
OE model to be approximated with a “long” finite
impulse response (FIR) model

$$G(q) = \sum_{k=1}^{T} g_k q^{-k} \tag{2}$$

where $g_k$, $k = 1,\ldots,T$, is modeled as a zero-mean
Gaussian vector with covariance $\Sigma$, with
elements $[\Sigma]_{ts} = R(t,s)$.

Remark 1 Note that the model (2), which has
been obtained from truncation of a nonparametric
model, could in principle be thought of as a parametric
model in which the parameter vector $\theta$
contains all the entries of $g_k$, $k = 1,\ldots,T$. Yet
the truncation index $T$ may have to be large even
for relatively “simple” impulse responses; for
instance, $\{g_k(\theta)\}_{k \in \mathbb{Z}^+}$ may be a simple decaying
exponential, $g_k(\theta) = \alpha\rho^k$, which is described by
two parameters (amplitude and decay rate), yet
if $|\rho| \simeq 1$, the truncation index $T$ needs to be
large (ideally $T \to \infty$) to obtain sensible results
(e.g., with low bias). Therefore, the number of
parameters $T(m \times p)$ may be larger (and in fact
much larger) than the available number of data
points $N$. Under these conditions, the parameter
$\theta$ cannot be estimated from any finite data segment
unless further constraints are imposed.
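
To give a feel for how quickly the truncation index grows as the decay rate approaches one, the following sketch computes the smallest T for which every neglected coefficient of a decaying exponential falls below a tolerance. The function name `truncation_length` and the tolerance value are assumptions made for illustration.

```python
import math

def truncation_length(alpha, rho, tol):
    """Smallest T >= 0 such that |alpha * rho**k| < tol for every k > T,
    for a decaying exponential g_k = alpha * rho**k with 0 < |rho| < 1."""
    x = math.log(tol / abs(alpha)) / math.log(abs(rho))
    return max(0, math.floor(x))

for rho in (0.5, 0.9, 0.99):
    print(f"rho = {rho}: T = {truncation_length(1.0, rho, 1e-3)}")
```

With unit amplitude and a tolerance of 1e-3, a decay rate of 0.5 needs only a handful of coefficients, whereas a decay rate of 0.99 needs several hundred, even though both responses are described by the same two parameters.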


The Role of Regularization in Linear 
System Identification 

In order to simplify the presentation, we shall
refer to the linear model (1) and assume that
$H(q) = 1$, i.e., we consider the so-called linear
output-error (OE) models. The extension to more
general model classes can be found in Pillonetto
et al. (2011), Chen et al. (2012), Chiuso and
Pillonetto (2012), and references therein.

The main purpose of regularization is to control
the model complexity in a flexible manner,
moving from families of rigid, finite dimensional
parametric model classes $\mathcal{M}_n(\theta)$ to flexible, possibly
infinite dimensional, models. To this purpose
one starts with a “suitably large” model
class which is constrained through the use of so-called
regularization functionals. To simplify the
presentation, we consider the FIR (2). The estimator
$\hat\theta$ is found as the solution of the following
optimization problem

$$\hat\theta = \arg\min_{\theta \in \mathbb{R}^n} J_F(\theta) + J_R(\theta;\lambda) \tag{3}$$

where $J_F(\theta)$ is the “fit” term often measured in
terms of average squared prediction errors:

$$J_F(\theta) := \frac{1}{N}\sum_{t=1}^{N} \| y_t - \hat y_{t|t-1}(\theta) \|^2 \tag{4}$$

while $J_R(\theta;\lambda)$ is a regularization term which
penalizes certain parameter vectors $\theta$ associated
to “unlikely” systems. Equation (3) can be seen
as a way to deal with the bias-variance trade-off.
The regularization term $J_R(\theta;\lambda)$ may depend
upon some regularization parameters $\lambda$ which
need to be tuned using measured data. In its
simplest instance,

$$J_R(\theta;\lambda) = \lambda J_R(\theta)$$

where $\lambda$ is a scale factor that controls “how
much” regularization is needed. We now discuss
different forms of regularization $J_R(\theta;\lambda)$ which
have been studied in the literature.

Example 8 Let us consider the FIR model in
Eq. (2) and let $\theta$ be a vector containing all the
unknown coefficients of the impulse response
$\{g_k\}_{k=1,\ldots,T}$. The linear least squares estimator

$$\hat\theta_{LS} := \arg\min_{\theta} \frac{1}{N}\sum_{t=1}^{N} \| y_t - \hat y_{t|t-1}(\theta) \|^2 \tag{5}$$

is ill-posed unless the number of data $N$ is larger
(and in fact much larger) than the number of
parameters $T$. From the statistical point of view,
the estimator (5) would result for large $T$ in
small bias and large variance. The purpose of
regularization is to render the inverse problem of
finding $\theta$ from the data $\{y_t\}_{t=1,\ldots,N}$ well posed,
thus better trading bias versus variance. The simplest
form of regularization is indeed the so-called
ridge regression or its weighted version
(aka generalized Tikhonov regularization), where
the 2-norm of the vector $\theta$ is weighted w.r.t. a
positive semidefinite matrix $Q$,

$$\hat\theta_{Reg} := \arg\min_{\theta} \frac{1}{N}\sum_{t=1}^{N} \| y_t - \hat y_{t|t-1}(\theta) \|^2 + \lambda\,\theta^\top Q\,\theta \tag{6}$$

which results in so-called regularization for
smoothness; see section “Regularization for
Smoothness.” The choice of the weighting $Q$
is highly nontrivial in the system identification
context, and the performance of the regularized
estimator $\hat\theta_{Reg}$ heavily depends on this.
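
The ill-posedness of (5) and the effect of the penalty in (6) can be illustrated numerically. The sketch below builds an FIR regression with more coefficients than data points, so that the least squares normal equations are singular, and then solves the ridge problem with the simplest assumed weighting Q = I. The data, the value of λ, and the choice of Q are assumptions for illustration only and do not reflect a recommended tuning.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 50, 30                                # more FIR coefficients than data points
g_true = 0.8 ** np.arange(1, T + 1)          # decaying impulse response (assumed)
u = rng.standard_normal(N + T)
# Row t of Phi holds the T most recent past inputs, newest first.
Phi = np.array([u[t:t + T][::-1] for t in range(N)])
y = Phi @ g_true + 0.1 * rng.standard_normal(N)

# Least squares (5): rank(Phi) <= N < T, so Phi'Phi (T x T) cannot be inverted.
rank_phi = np.linalg.matrix_rank(Phi)

# Ridge regression (6) with Q = I: minimize (1/N)||y - Phi theta||^2 + lam * theta'Q theta.
# Setting the gradient to zero gives (Phi'Phi/N + lam*Q) theta = Phi'y/N.
lam, Q = 0.1, np.eye(T)
A = Phi.T @ Phi / N + lam * Q
theta_reg = np.linalg.solve(A, Phi.T @ y / N)
print(f"rank(Phi) = {rank_phi} < T = {T}; ||theta_reg|| = {np.linalg.norm(theta_reg):.3f}")
```

Any positive λ with a positive definite Q makes the regularized normal matrix invertible; how well the result approximates the true impulse response depends strongly on Q.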

Remark 2 In order to formalize these ideas for
nonparametric models or, equivalently, when the
parameter $\theta$ is infinite dimensional, one has to
bring in functional analytic tools, such as reproducing
kernel Hilbert spaces (RKHS). This
is rather standard in the literature on ill-posed
inverse problems and has been recently introduced
also in the system identification setting
(Pillonetto et al. 2011). We shall not discuss these
issues here because, we believe, the formalism
would render the content less accessible.

Note that this regularization approach admits
a completely equivalent Bayesian formulation
simply setting

$$p(y|\theta) \propto e^{-J_F(\theta)}, \qquad p(\theta|\lambda) \propto e^{-J_R(\theta;\lambda)} \tag{7}$$

The densities $p(y|\theta)$ and $p(\theta|\lambda)$ are, respectively,
the likelihood function and the prior, which
in turn may depend on the unknown regularization
parameters $\lambda$, aka hyperparameters in this
Bayesian formulation. This is straightforward in
the finite dimensional setting, while it requires
some care when $\theta$ is infinite dimensional. With
reference to Example 7, and assuming $\theta$ contains
the impulse response coefficients $g_k$ in (2),
$p(\theta|\lambda)$ is a Gaussian density with zero mean and
covariance $\Sigma$ which may depend upon some
regularization parameters $\lambda$. From the definitions
(7), it follows that

$$p(\theta|y,\lambda) \propto p(y|\theta)\,p(\theta|\lambda) \tag{8}$$

from which point estimators of $\theta$ can be obtained
(e.g., as posterior mean, MAP, etc.). As such, with
some abuse of terminology, we shall indifferently
refer to $J_R(\theta;\lambda)$ as the “regularization term” or
the “prior.” The unknown parameter $\lambda$ is used
to introduce some flexibility in the regularization
term $J_R(\theta;\lambda)$ or equivalently in the prior $p(\theta|\lambda)$
and is tuned based on measured data as discussed
later on.

The regularization term $J_R(\theta;\lambda)$ can be
roughly classified into regularization for smoothness,
which attempts to control complexity in a
smooth fashion, and regularization for sparseness
which, on top of estimation, also aims at selecting
among a finite (yet possibly very large) number
of candidate model classes.


Regularization for Smoothness 

Let us consider a single-input, single-output FIR
model of length $T$ (arbitrarily large) and let
$\theta := [g_1\ g_2\ \cdots\ g_T]^\top \in \mathbb{R}^T$ be the (finite)
impulse response; let also $y \in \mathbb{R}^N$ be the
vector of output observations, $\Phi$ the regressor
matrix with past input samples, and $e$ the vector
with innovations (zero mean, variance $\sigma^2 I$). With
this notation the convolution input-output equation
(1) takes the form

$$y = \Phi\theta + e$$

Following the prescriptions of ridge regression,
a regularized estimator $\hat\theta$ can be found
setting

$$J_R(\theta;\lambda) = \theta^\top K^{-1}(\lambda)\,\theta \tag{9}$$

where the matrix $K(\lambda)$, aka kernel, is tailored to
capture specific properties of impulse responses
(exponential decay, BIBO stability, smoothness,
etc.). Early references include Doan et al. (1984)
and Kitagawa and Gersh (1984), while more
recent work can be found in Pillonetto and De
Nicolao (2010), Pillonetto et al. (2011) and Chen
et al. (2012) where several choices of kernels are
discussed.

Example 9 The simplest example of kernel is the
so-called “exponentially decaying” kernel

$$K(\lambda) := \gamma D(\rho), \qquad D(\rho) := \mathrm{diag}\{\rho, \ldots, \rho^T\} \tag{10}$$

where $\lambda := (\gamma, \rho)$ with $0 < \rho < 1$ and $\gamma \geq 0$.

For fixed $\lambda$, the estimator $\hat\theta(\lambda)$ is the solution
of a quadratic problem and can be written in
closed form (aka ridge regression):

$$\hat\theta(\lambda) = K(\lambda)\Phi^\top \left(\Phi K(\lambda)\Phi^\top + \sigma^2 I\right)^{-1} y \tag{11}$$

Two common strategies adopted to estimate the
parameters $\lambda$ are cross validation (Ljung 1999)
and marginal likelihood maximization. This latter
approach is based on the Bayesian interpretation
given in Eqs. (7) from which one can compute
the so-called “empirical Bayes” estimator
$\hat\theta_{EB} := \hat\theta(\hat\lambda_{ML})$ of $\theta$ by plugging in (11) the estimator of $\lambda$
which maximizes the marginal likelihood:

$$\hat\lambda_{ML} := \arg\max_{\lambda}\, p(\lambda|y) = \arg\max_{\lambda} \int p(\lambda,\theta|y)\,d\theta \tag{12}$$

The main strength of the marginal likelihood
is that, by integrating the joint posterior over
the unknown parameters $\theta$, it automatically
accounts for the residual uncertainty in $\theta$ for
fixed $\lambda$. When both $J_F$ and $J_R$ are quadratic
costs, which corresponds to assuming that $e$ and
$\theta$ are independent and Gaussian, the marginal
likelihood in (12) can be computed in closed form
so that

$$\hat\lambda_{ML} := \arg\min_{\lambda}\ \log(\det(\Sigma(\lambda))) + y^\top \Sigma^{-1}(\lambda)\,y, \qquad \Sigma(\lambda) := \Phi K(\lambda)\Phi^\top + \sigma^2 I \tag{13}$$

It is here interesting to observe that $\hat\lambda_{ML}$ which
solves (12) under certain conditions leads to
$K(\hat\lambda_{ML}) = 0$ (see Example 10), so that the
estimator of $\theta$ in (11) satisfies $\hat\theta(\hat\lambda_{ML}) = 0$.
This simple observation is the basis of so-called
sparse Bayesian learning (SBL); we shall return
to this issue in the next section when discussing
regularization for sparsity and selection.

Unfortunately the optimization problem (12)
(or (13)) is not convex and thus subject to the
issue of local minima. However, both experimental
evidence and some theoretical results support
the use of marginal likelihood maximization for
estimating regularization parameters; see, e.g.,
Rasmussen and Williams (2006) and Aravkin
et al. (2014).
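
The closed-form estimator (11) with the exponentially decaying kernel (10), together with hyperparameter selection by the marginal likelihood criterion (13), can be sketched as follows. The synthetic data, the noise variance (assumed known), and the coarse grid over (γ, ρ) are assumptions for illustration; the grid search is a simple stand-in for a numerical optimizer and is used here only because (13) is non-convex.

```python
import numpy as np

def decaying_kernel(gamma, rho, T):
    """K(lambda) = gamma * diag(rho, ..., rho^T), as in Eq. (10)."""
    return gamma * np.diag(rho ** np.arange(1, T + 1))

def regularized_estimate(Phi, y, K, sigma2):
    """Closed-form estimator of Eq. (11)."""
    Sigma = Phi @ K @ Phi.T + sigma2 * np.eye(len(y))
    return K @ Phi.T @ np.linalg.solve(Sigma, y)

def marginal_likelihood_cost(Phi, y, K, sigma2):
    """Cost of Eq. (13): log det Sigma + y' Sigma^{-1} y."""
    Sigma = Phi @ K @ Phi.T + sigma2 * np.eye(len(y))
    _, logdet = np.linalg.slogdet(Sigma)
    return logdet + y @ np.linalg.solve(Sigma, y)

# Synthetic data (all numerical values are illustrative assumptions).
rng = np.random.default_rng(1)
T, N, sigma2 = 40, 60, 0.01
g_true = 0.7 ** np.arange(1, T + 1)
u = rng.standard_normal(N + T)
Phi = np.array([u[t:t + T][::-1] for t in range(N)])
y = Phi @ g_true + np.sqrt(sigma2) * rng.standard_normal(N)

# Coarse grid over lambda = (gamma, rho); pick the pair minimizing (13).
grid = [(g, r) for g in (0.01, 0.1, 1.0, 10.0) for r in (0.3, 0.5, 0.7, 0.9)]
gamma_ml, rho_ml = min(
    grid, key=lambda lam: marginal_likelihood_cost(Phi, y, decaying_kernel(*lam, T), sigma2)
)
# Empirical Bayes estimate: plug the selected hyperparameters into (11).
theta_eb = regularized_estimate(Phi, y, decaying_kernel(gamma_ml, rho_ml, T), sigma2)
print(f"selected gamma = {gamma_ml}, rho = {rho_ml}")
```

Writing the estimator in the form (11) requires only the kernel itself and never its inverse, which matters because the diagonal entries of an exponentially decaying kernel become extremely small for large T.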

Regularization for Sparsity: Variable 
Selection and Order Estimation 

The main purpose of regularization for sparseness
is to provide estimators $\hat\theta$ in which subsets or
functions of the estimated parameters are equal
to zero.

Consider the multi-input, multi-output OE
model

$$y_{t,j} = \sum_{i=1}^{m} \sum_{k=1}^{T} g_{k,ij}\, u_{t-k,i} + e_{t,j}, \qquad j = 1,\ldots,p \tag{14}$$

where $y_{t,j}$ denotes the $j$th component of $y_t \in \mathbb{R}^p$;
let also $\theta \in \mathbb{R}^{Tmp}$ be the vector containing
all the impulse response coefficients $g_{k,ij}$, $j = 1,\ldots,p$,
$i = 1,\ldots,m$, and $k = 1,\ldots,T$. With
reference to Eq. (14), simple examples of sparsity
one may be interested in are:

(i) Single elements of the parameter vector $\theta$,
which corresponds to eliminating specific
lags of some variables from the model (14).

(ii) Groups of parameters such as the impulse
response from the $i$th input to the $j$th output
$g_{k,ij}$, $k = 1,\ldots,T$, thereby eliminating the
$i$th input from the model for the $j$th output.

(iii) The singular values of the Hankel matrix
$\mathcal{H}(\theta)$ formed with the impulse response
coefficients $g_k$; in fact the rank of the
Hankel matrix equals the order (i.e., the
McMillan degree) of the system. (Strictly
speaking any full rank FIR model of length
$T$ has McMillan degree $T \times p$. Yet, we
consider $\{g_k\}_{k=1,\ldots,T}$ to be the truncation of
some “true” impulse response $\{g_k\}_{k=1,\ldots,\infty}$,
and, as such, the finite Hankel matrix
built with the coefficients $g_k$ will have
rank equal to the McMillan degree of
$G(q) = \sum_{k=1}^{\infty} g_k q^{-k}$.)
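
The link between the rank of the Hankel matrix and the system order in item (iii) can be checked numerically. The sketch below builds a finite Hankel matrix from the truncated impulse response of an assumed second-order system (two real poles, chosen for illustration) and inspects its singular values; the helper name `hankel_from_impulse` and the rank threshold are also assumptions.

```python
import numpy as np

def hankel_from_impulse(g, rows):
    """Hankel matrix with entries H[i, j] = g[i + j], built from g_1, g_2, ..."""
    cols = len(g) - rows + 1
    return np.array([[g[i + j] for j in range(cols)] for i in range(rows)])

k = np.arange(1, 41)
g = 0.9 ** k + (-0.5) ** k           # poles at 0.9 and -0.5, i.e., McMillan degree 2
H = hankel_from_impulse(g, rows=20)

sv = np.linalg.svd(H, compute_uv=False)
numerical_rank = int(np.sum(sv > 1e-8 * sv[0]))
nuclear_norm = sv.sum()              # sum of the singular values
print(f"numerical rank = {numerical_rank}, nuclear norm = {nuclear_norm:.3f}")
```

Only two singular values are significantly different from zero, so a penalty that drives small Hankel singular values to zero acts directly on the model order.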

To this purpose one would like to penalize the
number of nonzero terms, be they entries of
$\theta$, groups, singular values, etc. This is measured
by the $\ell_0$ quasinorm or its variations: group $\ell_0$
and $\ell_0$ quasinorm of the Hankel singular values,
i.e., the rank of the Hankel matrix. Unfortunately
if $J_R$ is a function of the $\ell_0$ quasinorm, the
resulting optimization problem is computationally
intractable; as such one usually resorts to
relaxations. Three common ones are described
below.

One possibility is to resort to greedy 
algorithms such as orthogonal matching pursuit; 
generically it is not possible to guarantee 
convergence to a global minimum point. 

A very popular alternative is to replace the
$\ell_0$ quasinorm by its convex envelope, i.e., the $\ell_1$
norm, leading to algorithms known in statistics
as LASSO (Tibshirani 1996) or its group version
Group LASSO (Yuan and Lin 2006):

$$J_R(\theta;\lambda) = \lambda \|\theta\|_1 \tag{15}$$
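
A minimal sketch of how the penalty (15) produces exact zeros is given below. It minimizes the average squared error plus the $\ell_1$ penalty with the proximal gradient method (ISTA), which is one standard solver for this convex problem and is used here only for illustration; the data, the value of λ, and the iteration count are assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink toward zero, clip at zero."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_ista(Phi, y, lam, n_iter=500):
    """Minimize (1/N)*||y - Phi theta||^2 + lam*||theta||_1 by proximal gradient."""
    N, n = Phi.shape
    L = 2.0 * np.linalg.norm(Phi, 2) ** 2 / N   # Lipschitz constant of the fit-term gradient
    theta = np.zeros(n)
    for _ in range(n_iter):
        grad = -2.0 / N * Phi.T @ (y - Phi @ theta)
        theta = soft_threshold(theta - grad / L, lam / L)
    return theta

# Sparse regression problem: only 3 of 20 coefficients are nonzero (assumed data).
rng = np.random.default_rng(3)
N, n = 100, 20
theta_true = np.zeros(n)
theta_true[[2, 7, 11]] = [1.0, -0.8, 0.5]
Phi = rng.standard_normal((N, n))
y = Phi @ theta_true + 0.05 * rng.standard_normal(N)

theta_lasso = lasso_ista(Phi, y, lam=0.1)
support = np.flatnonzero(theta_lasso)
print("indices of nonzero estimated coefficients:", support)
```

The soft-thresholding step is what sets coefficients exactly to zero; it also shrinks the surviving coefficients toward zero by the same amount.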

Similarly the convex relaxation of the rank (i.e.,
the $\ell_0$ quasinorm of the singular values) is the
so-called nuclear norm (aka Ky Fan $n$-norm or
trace norm), which is the sum of the singular
values $\|A\|_* := \mathrm{trace}\{\sqrt{A^\top A}\}$ where $\sqrt{\cdot}$ denotes
the matrix square root which is well defined for
positive semidefinite matrices. In order to control
the order (McMillan degree) of a linear system,
which is equal to the rank of the Hankel matrix
$\mathcal{H}(\theta)$ built with the impulse response described
by the parameter $\theta$, it is then possible to use the
regularization term

$$J_R(\theta;\lambda) = \lambda \|\mathcal{H}(\theta)\|_* \tag{16}$$

thus leading to convex optimization problems
(Fazel et al. 2001). Both (16) and (15) induce
sparse or nearly sparse solutions (in terms of
elements or groups of $\theta$ (15) or in terms of
Hankel singular values (16)), making them attractive
for selection. It is interesting to observe
that both $\ell_1$ and group $\ell_1$ are special cases of
the nuclear norm if one considers matrices with
fixed eigenspaces. Yet, as well documented in
the statistics literature, both (16) and (15) do not
provide a satisfactory trade-off between sparsity
and shrinking, which is controlled by the regularization
parameter $\lambda$. As $\lambda$ varies one obtains
the so-called regularization path. Increasing $\lambda$,
the solution gets sparser but, unfortunately, it
suffers from shrinking of nonzero parameters. To
overcome these problems, several variations of
LASSO have been developed and studied, such
as adaptive LASSO (Zou 2006), SCAD (Fan
and Li 2001), and so on. We shall now discuss
a Bayesian alternative which, to some extent,
provides a better trade-off between sparsity and
shrinking than the $\ell_1$ norm.

This Bayesian procedure goes under the name
of sparse Bayesian learning and can be seen
as an extension of the Bayesian procedure for
regularization described in the previous section.
In order to illustrate the method, we consider its
simplest instance. Consider a MIMO system as
in (14) with $p = 1$ and $m = 2$, i.e.,

$$y_t = \sum_{k=1}^{T} g_{k,1}\, u_{t-k,1} + \sum_{k=1}^{T} g_{k,2}\, u_{t-k,2} + e_t = \phi_{t,1}^\top g_1 + \phi_{t,2}^\top g_2 + e_t \tag{17}$$

where $g_i := [g_{1,i}, \ldots, g_{T,i}]^\top$. Let $\theta := [g_1^\top\ g_2^\top]^\top$
and assume that the $g_i$’s are independent Gaussian
random vectors with zero mean and covariances
$\lambda_i K$. Letting $\Phi_i := [\phi_{1,i}, \ldots, \phi_{N,i}]^\top$
and following the formulation in (7) and (8), it
follows that the marginal likelihood estimator of
$\lambda$ takes the form

$$\hat\lambda_{ML} := \arg\min_{\lambda_i \geq 0}\ \log(\det(\Sigma(\lambda))) + y^\top \Sigma^{-1}(\lambda)\,y, \qquad \Sigma(\lambda) := \lambda_1 \Phi_1 K \Phi_1^\top + \lambda_2 \Phi_2 K \Phi_2^\top + \sigma^2 I \tag{18}$$

After $\hat\lambda_{ML}$ has been found, the estimator of $\theta$ is
found in closed form as per Eq. (11). It can be
shown that under certain conditions on the observation
vector $y$, the estimated hyperparameters
$\hat\lambda_{ML,i}$ lie at the boundary, i.e., are exactly equal
to zero. If $\hat\lambda_{ML,i} = 0$, then, from Eq. (11), also
$\hat g_i = 0$; this reveals that in (17) the $i$th input does
not enter into the model; see also Example 10 for
a simple illustration.

These Bayesian methods for sparsity have 
been studied in a general regression framework 
in Wipf et al. (2011) under the name of “type- 
II” maximum likelihood. Further results can be 
found in Aravkin et al. (2014) which suggest 
that these Bayesian methods provide a better 
trade-off between sparsity and shrinking (i.e., 
are able to provide sparse solution without 
inducing excessive shrinkage on the nonzero 
parameters). 
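
The thresholding behavior of sparse Bayesian learning can be reproduced in a few lines for the scalar setting of Example 10 with constant input equal to one and unit noise variance. In that case the hyperparameter estimate is the sample second moment of the output minus one, clipped at zero, and the coefficient estimate follows from Eq. (11) with a scalar kernel. The function name `sbl_scalar` and the two data records are assumptions made for illustration.

```python
def sbl_scalar(y):
    """Empirical Bayes estimate for y_t = theta * u_{t-1} + e_t with u_t = 1 and
    e_t ~ N(0, 1): returns (lambda_ml, theta_hat)."""
    N = len(y)
    lam_star = sum(v * v for v in y) / N - 1.0   # unconstrained stationary point
    lam_ml = max(0.0, lam_star)                  # hyperparameter must be nonnegative
    # Eq. (11) with K = lam_ml, Phi = vector of ones, sigma^2 = 1.
    theta_hat = lam_ml / (N * lam_ml + 1.0) * sum(y)
    return lam_ml, theta_hat

# Output with small energy: explained by noise alone, so the estimate is exactly zero.
print(sbl_scalar([0.5, -0.5, 0.5, -0.5]))
# Output with large energy: the input is kept and the coefficient is only mildly shrunk.
print(sbl_scalar([2.0, 2.0, 2.0, 2.0]))
```

In the second record the least squares estimate would be 2; the empirical Bayes estimate is 24/13, roughly 1.85, which illustrates selection with only moderate shrinkage of the retained coefficient.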

Remark 3 A more detailed analysis, see, for
instance, Aravkin et al. (2014), shows that
LASSO/GLASSO (i.e., $\ell_1$ penalties) and SBL
using the “empirical Bayes” approach can be
derived under a common Bayesian framework
starting from the joint posterior $p(\lambda,\theta|y)$.
While SBL is derived from the maximization
over $\lambda$ of the marginal posterior, LASSO/GLASSO
corresponds to maximizing the joint posterior
after a suitable change of variables. For reasons
of space, we refer the interested reader to the
literature for details.

Recent work on the use of sparseness for
variable selection and model order estimation
can be found in Wang et al. (2007), Chiuso and
Pillonetto (2012), and references therein.

Example 10 In order to illustrate how sparse
Bayesian learning leads to sparse solutions, we
consider a very simplified scenario in which the
measurement equation is

$$y_t = \theta u_{t-1} + e_t$$

where $e_t$ is zero-mean, unit variance Gaussian
and white and $u_t$ is a deterministic signal. The
purpose is to estimate the coefficient $\theta$, which
could possibly be equal to zero. Thus, the estimator
should reveal whether $u_{t-1}$ influences $y_t$
or not.

Following the SBL framework, we model $\theta$ as
a Gaussian random variable, with zero mean and
variance $\lambda$, independent of $e_t$. Therefore, $y_t$ is
also Gaussian, zero mean, and variance $u_{t-1}^2 \lambda + 1$.
Therefore, assuming $N$ data points are available,
the likelihood function for $\lambda$ is given by

$$L(\lambda) = \prod_{t=1}^{N} \frac{1}{\sqrt{2\pi (u_{t-1}^2 \lambda + 1)}} \exp\left( -\frac{1}{2} \frac{y_t^2}{u_{t-1}^2 \lambda + 1} \right)$$

Defining now

$$\hat\lambda_{ML} := \arg\min_{\lambda \geq 0}\ -2 \log L(\lambda)$$

one obtains that

$$\hat\lambda_{ML} = \max(0, \lambda_*)$$

where $\lambda_*$ is the solution of

$$\sum_{t=1}^{N} u_{t-1}^2\, \frac{u_{t-1}^2 \lambda + (1 - y_t^2)}{(u_{t-1}^2 \lambda + 1)^2} = 0$$

which unfortunately doesn’t have a closed form
solution. If however we assume that the input $u_t$
is constant (without loss of generality say that
$u_t = 1$), we obtain that

$$\lambda_* = \frac{1}{N} \sum_{t=1}^{N} y_t^2 - 1$$

Clearly this is a threshold estimator which sets
$\hat\lambda_{ML}$ to zero when the sample variance of $y_t$
is smaller than the variance of $e_t$, which was
assumed to be equal to 1. Thus, the empirical
Bayes estimator of $\theta$, as per Eq. (11), is given by

$$\hat\theta = \frac{\hat\lambda_{ML}}{\hat\lambda_{ML} \sum_{t=1}^{N} u_{t-1}^2 + 1} \sum_{t=1}^{N} y_t u_{t-1}$$

which is clearly equal to zero when $\hat\lambda_{ML} = 0$.


Extensions: Regularization for Hybrid
Systems Identification and Model
Segmentation

An interesting extension of linear systems is a
class of so-called hybrid models described by a
relation of the form

$$y_t = \hat y_{\theta_k}(t|t-1) + e_t, \qquad \hat y_{\theta_k}(t|t-1) = L_{\theta_k}(y_t^-, u_t^-), \qquad \theta_k \in \mathbb{R}^{n_k},\ k = 1,\ldots,K \tag{19}$$

where the predictor $\hat y_{\theta_k}(t|t-1)$, which is
a linear function $L_{\theta_k}(y_t^-, u_t^-)$ of the “past”
histories $y_t^- := \{y_{t-1}, y_{t-2}, \ldots\}$ and
$u_t^- := \{u_{t-1}, u_{t-2}, \ldots\}$, is parametrized by a parameter
vector $\theta_k \in \mathbb{R}^{n_k}$; there are $K$ different parameter
vectors $\theta_k$, $k = 1,\ldots,K$, whose evolution over
time is determined by a so-called switching
mechanism. The name hybrid hints at the fact
that the model is described by continuous-valued ($y$,
$u$, and $e$) and discrete-valued ($k$) variables.

A well-studied subclass of (19) is composed of the so-called switching ARX models, where the predictor takes the special form

$$ \hat y_{\theta_k}(t|t-1) = \varphi_t^\top \theta_k, \qquad \theta_k \in \mathbb{R}^{n_k} \tag{20} $$

The regressor $\varphi_t$ is a finite vector containing inputs $u_s$ and outputs $y_s$ in a finite past window $s \in [t-T, t-1]$, plus possibly a constant component to model changing “means.” The value of $k \in [1, K]$ is determined by the switching mechanism $p(\varphi_t, t) : \mathbb{R}^{n_k} \times \mathbb{R} \to \{1, \ldots, K\}$.

Two extreme but interesting cases are (i) $p(\varphi_t, t) = p_t$, where $p_t$ is an exogenous and not measurable signal, and (ii) $p(\varphi_t, t) = p(\varphi_t)$, where $p(\cdot)$ is an endogenous unknown measurable function of the regression vector $\varphi_t$. In any case, from the identification point of view, the value of k at time t is not assumed to be known and, as such, the identification algorithm has to operate without knowledge of this switching mechanism.

Identification of systems in the form (20) requires estimating (a) the number of models K and the position of the switches between different models, (b) the “dimension” $n_k$ of each model, (c) the value of the parameters $\theta_k$, and, possibly, (d) the function $p(\varphi_t, t)$ which determines the switching mechanism.

Steps (b) and (c) are essentially as in 
section “System Identification” (see also the 
introductory paper ►System Identification: An 
Overview); however, this is complicated by steps 
(a) and (d), which in particular require that one is 
able to estimate, from data alone, which system 
is “active” at each time t. 

Step (a), which is also related to the problem of model segmentation, has been tackled in the literature (see, e.g., Ozay et al. (2012), Ohlsson and Ljung (2013), and references therein) by applying suitable penalties on the number of different models K and/or on the number of switches. Note that $p(\varphi_t, t) \neq p(\varphi_s, s)$ if and only if $\theta_t \neq \theta_s$, where $\theta_t$ denotes the parameter vector active at time t. Based on this simple observation, one can construct a regularization which either counts the number of switches, i.e.,

$$ J_R(\theta; \gamma) := \gamma \sum_{t=2}^{N} \big\| \, \|\theta_t - \theta_{t-1}\| \, \big\|_0 \tag{21} $$

or attempts to approximate the total number of different models by computing

$$ J_R(\theta; \gamma) := \gamma \sum_{t,s=1}^{N} w(t,s) \, \big\| \, \|\theta_t - \theta_s\| \, \big\|_0 \tag{22} $$

for a suitable weighting w(t, s); see Ohlsson and Ljung (2013).

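As discussed above, these quasinorms lead, in general, to computationally intractable (NP-hard) optimization problems. An exception is the case where one considers bounded noise, i.e., solves a problem of the form

$$ \min_{\theta_t} \sum_{t=2}^{N} \|\theta_t - \theta_{t-1}\|_0 \quad \text{s.t.} \quad \|y_t - \varphi_t^\top \theta_t\|_\infty \le \epsilon \tag{23} $$

which is shown to be a convex problem; see Ozay et al. (2012). In general, relaxations are used, typically based on $\ell_1$/group-$\ell_1$ penalties, thus relaxing (21) and (22) to

$$ J_R(\theta; \lambda) := \lambda \sum_{t=2}^{N} \|\theta_t - \theta_{t-1}\|_1, \qquad J_R(\theta; \lambda) := \lambda \sum_{t,s=1}^{N} w(s,t) \, \|\theta_t - \theta_s\|_1 \tag{24} $$

This leads to the convex optimization problems

$$ \min_{\theta_t} \sum_{t} (y_t - \varphi_t^\top \theta_t)^2 + \lambda \sum_{t=2}^{N} \|\theta_t - \theta_{t-1}\|_1 \tag{25} $$

or

$$ \min_{\theta_t} \sum_{t} (y_t - \varphi_t^\top \theta_t)^2 + \lambda \sum_{t,s=1}^{N} w(s,t) \, \|\theta_t - \theta_s\|_1 \tag{26} $$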
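
The trade-off expressed by criterion (25) can be illustrated with a minimal sketch that only evaluates the cost for given parameter trajectories; it does not solve the optimization problem. The function names, the toy data, and the value of the penalty weight are assumptions made for illustration.

```python
def segmentation_cost(y, phi, theta, lam):
    """Evaluate criterion (25) for a given parameter trajectory.

    y     : list of scalar outputs y_t
    phi   : list of regressor vectors phi_t (lists of floats)
    theta : list of parameter vectors theta_t, one per time instant
    lam   : weight of the l1 penalty on consecutive differences
    """
    # Squared one-step prediction errors (y_t - phi_t^T theta_t)^2.
    fit = sum(
        (y_t - sum(p * th for p, th in zip(phi_t, th_t))) ** 2
        for y_t, phi_t, th_t in zip(y, phi, theta)
    )
    # l1 norm of theta_t - theta_{t-1}, summed over t = 2, ..., N.
    jumps = sum(
        sum(abs(a - b) for a, b in zip(theta[t], theta[t - 1]))
        for t in range(1, len(theta))
    )
    return fit + lam * jumps


def count_switches(theta, tol=1e-9):
    """Count time instants where the parameter vector changes (cf. (21))."""
    return sum(
        1
        for t in range(1, len(theta))
        if any(abs(a - b) > tol for a, b in zip(theta[t], theta[t - 1]))
    )


# Toy data (assumed): a constant regressor and an output that changes level once.
y = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0]
phi = [[1.0]] * 6
theta_piecewise = [[1.0], [1.0], [1.0], [3.0], [3.0], [3.0]]
theta_constant = [[2.0]] * 6

cost_piecewise = segmentation_cost(y, phi, theta_piecewise, lam=0.5)
cost_constant = segmentation_cost(y, phi, theta_constant, lam=0.5)
```

With this penalty weight, the trajectory with a single switch fits the data exactly and pays only for one jump, so it has the lower cost; a much larger weight would instead favor the constant trajectory.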


Summary and Future Directions 

We have presented a bird’s eye overview of regularization methods in system identification. By necessity this overview is incomplete, and we encourage the reader to browse through the recent literature for new developments on this exciting topic; we hope the references we have provided are a good starting point. While regularization is quite an old topic, we believe it is fair to say that the nontrivial interaction between regularization and system theoretic concepts provides a wealth of interesting and challenging problems. Just to mention a few open questions: (i) how and why smoothness priors relate to system order (McMillan degree), (ii) how one can design kernels which, at the same time, are descriptive for dynamical systems and lead to computationally attractive problems suited for online identification, (iii) how kernels for multi-output systems should be designed, and (iv) what the statistical properties are of Bayesian procedures such as SBL and its extensions in the context of system identification. Last but not least, while some results are available, nonlinear system identification still offers significant challenges.


Cross-References 

► Nonlinear System Identification Using Particle 
Filters 

► Nonlinear System Identification: An Overview 
of Common Approaches 

► Subspace Techniques in System Identification 

► System Identification: An Overview 


Recommended Reading 

The use of regularization methods for system identification can be traced back to the 1980s; see Doan et al. (1984) and Kitagawa and Gersch (1984). Yet it is fair to say that the most significant developments are rather recent, and therefore the literature is not established yet. The reader may consult Fazel et al. (2001), Pillonetto et al. (2011), Chen et al. (2012), Chiuso and Pillonetto (2012), and references therein. Clearly all this work has largely benefitted from cross-fertilization with neighboring areas and, as such, very relevant work can be found in the fields of machine learning (Bach et al. 2004; MacKay 1994; Tipping 2001; Rasmussen and Williams 2006), statistics (Hocking 1976; Tibshirani 1996; Fan and Li 2001; Wang et al. 2007; Yuan and Lin 2006; Zou 2006), signal processing (Donoho 2006; Wipf et al. 2011), and econometrics (Banbura et al. 2010).


System Identification: An Overview

Linköping, Sweden

Abstract 

This entry gives an overview of system identification. It outlines the basic concepts in the area and also serves as an umbrella contribution for the nine related articles on system identification in this encyclopedia. The basis is the classical statistical approach of parametric methods using maximum likelihood and prediction error methods. The entry also describes the properties of the estimated models for large data sets.


Keywords 

Asymptotic model properties; Dynamical systems; Estimation; Mathematical models; Maximum likelihood; Parameter estimates; Prediction error method; Regularization


An Introductory Example 

System identification is the theory and art of 
estimating models of dynamical systems, based 
on observed inputs and outputs. Consider as a 
concrete example the Swedish aircraft fighter 
Gripen; see Fig. 1. From one of the earlier test 
flights, some data were recorded as depicted in 
Fig. 2. 


To design the simulation software and the 
autopilot, the aircraft manufacturer, the SAAB 
company, needed a mathematical model for the 
dynamics of the system. The question is how, in this
case, the pitch rate is affected
by the three inputs. A fair amount of knowledge
exists about aircraft dynamics, and in industrial 
practice, “gray-box” models based on Newton’s 
laws of motion and unknown parameters like 
aerodynamical derivatives are employed to esti¬ 
mate the flight dynamics. Here, for the purpose 
of illustrating basic principles, let us just try a 
simple “black-box” difference equation relation. 
Denote the output, the pitch rate, at sample number t by y(t), and the three control inputs at the same time by $u_k(t)$, k = 1, 2, 3. Then assume that we can write

$$
\begin{aligned}
y(t) = {} & -a_1 y(t-1) - a_2 y(t-2) - a_3 y(t-3) \\
& + b_{1,1} u_1(t-1) + b_{1,2} u_1(t-2) \\
& + b_{2,1} u_2(t-1) + b_{2,2} u_2(t-2) \\
& + b_{3,1} u_3(t-1) + b_{3,2} u_3(t-2)
\end{aligned}
\tag{1}
$$

In this simple relationship, we can adjust the 
parameters to fit the observed data as well as 
possible by a common least squares fit. We use 
only the 90 first data points of the observed 
data. That gives certain numerical values of the 
9 parameters above: 

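$$
\begin{aligned}
& a_1 = -1.15, \quad a_2 = 0.50, \quad a_3 = -0.35, \\
& b_{1,1} = -0.54, \quad b_{1,2} = 0.4, \quad b_{2,1} = 0.15, \\
& b_{2,2} = 0.16, \quad b_{3,1} = 0.16, \quad b_{3,2} = 0.07
\end{aligned}
\tag{2}
$$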

We may note that this model is unstable - it has 
a pole at 1.0026, but that is in order, because the 
pitch channel of the real aircraft is unstable at the 
velocity and altitude in question. 
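
The least squares fit described above can be sketched for a much smaller hypothetical case: a first-order, single-input model y(t) = -a y(t-1) + b u(t-1), for which the normal equations are 2 x 2. The function names, the test input, and the values a = -0.5, b = 2.0 are assumptions chosen for illustration; they are not the Gripen data or the model (1).

```python
def simulate_arx1(a, b, u, y0=0.0):
    """Simulate y(t) = -a*y(t-1) + b*u(t-1) without noise."""
    y = [y0]
    for t in range(1, len(u)):
        y.append(-a * y[t - 1] + b * u[t - 1])
    return y


def fit_arx1(y, u):
    """Least squares estimate of (a, b) via the 2x2 normal equations.

    The regressor is phi(t) = [-y(t-1), u(t-1)], so y(t) = phi(t)^T [a, b].
    """
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(1, len(y)):
        p1, p2 = -y[t - 1], u[t - 1]
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        r1 += p1 * y[t]
        r2 += p2 * y[t]
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("data are not informative enough for a unique fit")
    # Solve the normal equations by Cramer's rule.
    a_hat = (s22 * r1 - s12 * r2) / det
    b_hat = (s11 * r2 - s12 * r1) / det
    return a_hat, b_hat


# Assumed test input and "true" parameters for the illustration.
u = [1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0, 1.0]
y = simulate_arx1(a=-0.5, b=2.0, u=u)
a_hat, b_hat = fit_arx1(y, u)
```

Because the simulated data are noise-free, the fit recovers the assumed parameters up to rounding; with measured data the estimates would of course carry uncertainty.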

How can we test if this model is reasonable? 
Since we used only half of the observed data 
for the estimation, we can test the model on the 
whole data record. Since the model is unstable 
it is natural to test it by letting it predict future 
outputs, say five samples ahead, and compare 
with the measured outputs. That is done in Fig. 3. We see that the simple model (2) provides quite reasonable predictions over data it has not seen before. This could conceivably be improved if more elaborate model structures than (1) were tried out. Also, in practice more advanced techniques would be required to validate that the estimated model is sufficiently reliable.

System Identification: An Overview, Fig. 1 The Swedish aircraft Gripen

System Identification: An Overview, Fig. 2 Data from an early test flight of Gripen. These data cover 3 s of flight and are sampled at 60 Hz. (a) The output: pitch rate. (b) Control input 1: elevator angle, (c) Control input 2: leading edge flap, (d) Control input 3: canard angle

System Identification: An Overview, Fig. 3 The measured output (solid line) compared to the 5-step-ahead predicted one (dashed line)


This simple introductory example points to the 
basic flow of system identification and it also 
points to the pertinent issues, which will be listed 
in the section “The State-of-the-Art Identification 
Setup.” 


Models and System Identification 

The Omnipresent Model 

It is clear to everyone in science and engineering that mathematical models are playing increasingly important roles. Today, model-based design and optimization is the dominant engineering paradigm for systematic design and maintenance of engineering systems. It has proven very successful and is widely used in basically all engineering disciplines. Concerning control applications, the aerospace industry is the earliest example of this paradigm on a grand scale. This
industry was very quick to adopt the theory for 
model-based optimal control that emerged in the 
1960s and is spending great efforts and resources 
on developing models. In the process industry, 
model predictive control (MPC) has during the 
last 25 years become the dominant method to 
optimize production on an intermediate level. 
MPC uses dynamical models to predict future 


process behavior and to optimize the manipulated 
variables subject to process constraints. 

Increasing demands on performance, ef¬ 
ficiency, safety, and environmental aspects 
are pushing engineering systems to become 
increasingly complex. Advances in (wireless) 
communications systems and microelectronics 
are key enablers for this rapid development, 
allowing systems to be efficiently interconnected 
in networks, reducing costs and size, and paving 
the way for new sensors and actuators. 

Model-based techniques are also gaining im¬ 
portance outside engineering applications. Let us 
just mention systems biology and health care. In 
the latter case it is expected that personalized 
health systems will become more and more im¬ 
portant in the future. 

Common to the examples given above are the 
requirements of permeating sensing, actuation, 
communication, and computation abilities of the 
engineering systems, in many cases in distributed 
architectures. It is also clear that these systems 
should be able to operate in a reliable way in 
an uncertain and temporally and spatially chang¬ 
ing environment. In many applications, cognitive 
abilities and abilities to adapt will be important. 
With systems being decentralized and typically 
containing many actuators, sensors, states, and 
nonlinearities, but with limited access to sensor 
information, model building that delivers models 
of sufficient fidelity becomes very challenging. 

System Identification: Data-Driven 
Modeling 

Construction of models requires access to 
observed data. It could be that the model is 
developed entirely from information in signals 
from the system (“black-box models”) or it 
could be that physical/engineering insights are 
combined with such information (“gray-box 
models”). In any case, verification (validation) 
of a model must be done in the light of measured 
data. Theories and methodologies for such 
model construction have been developed in 
many different research communities (to some 
extent independently). System identification is the term used in the control community for the area of constructing mathematical models of dynamical systems from measured input-output signals. Other communities use other terms for often very similar techniques. The term machine learning has become very common in recent years; see, e.g., Rasmussen and Williams (2006).

System identification has a history of more 
than 50 years, since the term was coined by 
Lotfi Zadeh (1956). It is a mature research field 
with numerous publications, textbooks, confer¬ 
ence series, and software packages. It is often 
used as an example in the control field of an 
area with good interaction between theory and 
industrial practice. The backbone of the the¬ 
ory relies upon statistical grounds, with maxi¬ 
mum likelihood methods and asymptotic analy¬ 
sis (in the number of observed data). The goal 
of the system identification field is to find a 
model of the plant in question as well as of 
its disturbances and also to find a characteriza¬ 
tion of the uncertainty bounds of the descrip¬ 
tion. 


The State-of-the-Art Identification 
Setup 

To approach a system identification problem, like 
in section “An Introductory Example,” a number 
of questions need to be answered, such as 

• What model type, e.g., (1) should be used? 

• How should the parameters in the model be 
adjusted? 

• What inputs should be applied to obtain a 
good model? 

• How do we assess the quality of the model? 

• How do we gain confidence in an estimated 
model? 

There is a very extensive literature on the sub¬ 
ject, with many textbooks, like Ljung (1999), 
Soderstrom and Stoica (1989), and Pintelon and 
Schoukens (2012). 

System identification is characterized by five basic concepts:

• $\mathcal{X}$: The experimental conditions under which the data is generated

• $\mathcal{D}$: The data

• $\mathcal{M}$: The model structure and its parameters $\theta$

• $\mathcal{I}$: The identification method by which a parameter value $\hat\theta$ in the model structure $\mathcal{M}(\theta)$ is determined based on the data $\mathcal{D}$

• $\mathcal{V}$: The validation process that scrutinizes the identified model

System Identification: An Overview, Fig. 4 The identification work loop

See Fig. 4. It is typically an iterative process 
to navigate to a model that passes through the 
validation test (“is not falsified”), involving re¬ 
visions of the necessary choices. For several of 
the steps in this loop, helpful support tools have 
been developed. It is however not quite possible 
or desirable to fully automate the choices, since 
subjective perspectives related to the intended use 
of the model are very important. 

$\mathcal{M}$: Model Structures

A model structure $\mathcal{M}$ is a parameterized collection of models that describe the relations between the inputs u and outputs y of the system. The parameters are denoted by $\theta$, so $\mathcal{M}(\theta)$ is a particular model. The model set then is

$$ \mathcal{M} = \{\mathcal{M}(\theta) \mid \theta \in D_{\mathcal{M}}\} \tag{3} $$

Many ways exist to collect mathematical expressions that encompass a model; see, e.g., ► Modeling of Dynamic Systems from First Principles, ► Nonlinear System Identification: An Overview of Common Approaches, and ► Nonlinear System Identification Using Particle Filters. The models may be both linear and nonlinear as well as time invariant and time varying, and it is useful to have as a common ground that a model gives a rule to predict (one-step-ahead) the output at time t, i.e., y(t) (a p-dimensional column vector), based on observations of previous input-output data up to time t − 1 (denoted by $Z^{t-1}$):

$$ \hat y(t|\theta) = g(t, \theta, Z^{t-1}) \tag{4} $$

This covers a broad variety of model descriptions, sometimes in a somewhat abstract way. The descriptions become much more explicit when we specialize to linear models.

A note on “inputs” It is important to include 
all measurable disturbances that affect y among 
the inputs u to the system, even if they cannot 
be manipulated as control inputs. In some 
cases the system may entirely lack measurable 
inputs, so the model (4) then just describes 
how future outputs can be predicted from 
past ones. Such models are called time series 
and correspond to systems that are driven by 
unobservable disturbances. Most of the tech¬ 
niques described in this entry apply also to such 
models. 

A note on disturbances A complete model 
involves both a description of the input-output 
relations and a description of how various 
noise sources affect the measurements. The 
noise description is essential to understand both 
the quality of the model predictions and the 
model uncertainty. Proper control design also 
requires a picture of the disturbances in the 
system. 

Linear Models 

For linear time invariant systems, a general model 
structure is given by the transfer function G from 
input u to output y and the transfer function H 
from a white noise source e to output additive 
disturbances (for notational convenience, we spe¬ 
cialize to single-input-single-output systems, but 
all expressions are valid in the multivariable case 
with simple notational changes): 


$$ y(t) = G(q, \theta)u(t) + H(q, \theta)e(t) \tag{5a} $$

$$ E e^2(t) = \sigma^2; \qquad E e(t)e^T(k) = 0 \ \text{ if } k \neq t \tag{5b} $$

(E denotes mathematical expectation.) This model is in discrete time and q denotes the shift operator, qy(t) = y(t + 1). We assume for simplicity that the sampling interval is one time unit. The expansion of $G(q, \theta)$ in the inverse (backwards) shift operator gives the impulse response of the system:

$$ G(q, \theta)u(t) = \sum_{k=1}^{\infty} g_k(\theta) q^{-k} u(t) = \sum_{k=1}^{\infty} g_k(\theta) u(t-k) \tag{6} $$

The discrete time Fourier transform (or the z-transform of the impulse response, evaluated in $z = e^{i\omega}$) gives the frequency response of the system:

$$ G(e^{i\omega}, \theta) = \sum_{k=1}^{\infty} g_k(\theta) e^{-ik\omega} \tag{7} $$

The function G describes how an input sinusoid shifts phase and amplitude when it passes through the system.
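
As a small illustration of (7), the following sketch evaluates the frequency response of a finite impulse response, i.e., a truncated version of the sum. The function name and the two impulse response coefficients are assumptions made for the example.

```python
import cmath
import math


def frequency_response(g, omega):
    """Evaluate G(e^{i omega}) = sum_{k>=1} g_k e^{-i k omega}.

    g is the list [g_1, g_2, ...] of impulse response coefficients;
    note that the sum starts at k = 1, as in (6) and (7).
    """
    return sum(gk * cmath.exp(-1j * k * omega) for k, gk in enumerate(g, start=1))


# Assumed impulse response with two nonzero coefficients.
g = [1.0, 0.5]

G_dc = frequency_response(g, 0.0)        # response at zero frequency
G_ny = frequency_response(g, math.pi)    # response at the Nyquist frequency

# Amplitude gain and phase shift experienced by a sinusoid of frequency omega.
omega = 0.3
gain = abs(frequency_response(g, omega))
phase = cmath.phase(frequency_response(g, omega))
```

At zero frequency the response is simply the sum of the coefficients, while at the Nyquist frequency the terms alternate in sign.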

The additive noise term v = He is quite versatile, and with a suitable choice of H, it can describe a disturbance with arbitrary spectrum. To link with the predictor as a unifying model concept, it is useful to compute the predictor for (5a) (the conditional mean of y(t) given past data), which is

$$ \hat y(t|\theta) = G(q, \theta)u(t) + \left[1 - H^{-1}(q, \theta)\right]\left[y(t) - G(q, \theta)u(t)\right] \tag{8} $$

Note that the expansion of $H^{-1}$ starts with “1,” so the expansion of the bracket $1 - H^{-1}$ starts with a term in $q^{-1}$, which means that there is a delay in y. It is easy to interpret the first term as a simulation using the input u, adjusted with a prediction of the additive disturbance v(t) at time t, based on past values of v. The predictor is thus an easy reformulation of the basic transfer functions G and H. The question now is how to parameterize these.
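
As an illustrative special case of (8), assume first-order transfer functions $G(q,\theta) = b q^{-1}/(1 + a q^{-1})$ and $H(q,\theta) = 1/(1 + a q^{-1})$; the symbols a and b are introduced only for this example. Collecting the terms of (8) that multiply u and y gives a predictor that uses only past data:

```latex
\begin{align*}
H^{-1}(q,\theta) &= 1 + a q^{-1}, \qquad
H^{-1}(q,\theta)\, G(q,\theta) = b q^{-1}, \\
\hat y(t|\theta) &= H^{-1}(q,\theta)\, G(q,\theta)\, u(t)
  + \left[ 1 - H^{-1}(q,\theta) \right] y(t) \\
&= b\, u(t-1) - a\, y(t-1).
\end{align*}
```

The one-sample delay in both terms reflects the fact that the expansion of $1 - H^{-1}$ has no constant term.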


Black-Box Models

A black-box model uses no physical insight or interpretation, but is just a general and flexible parameterization. It is natural to let G and H be rational in the shift operator:

$$ G(q, \theta) = \frac{B(q)}{F(q)}; \qquad H(q, \theta) = \frac{C(q)}{D(q)} \tag{9a} $$

$$ B(q) = b_1 q^{-1} + b_2 q^{-2} + \ldots + b_{nb} q^{-nb} \tag{9b} $$

$$ F(q) = 1 + f_1 q^{-1} + \ldots + f_{nf} q^{-nf} \tag{9c} $$

$$ \theta = [b_1, b_2, \ldots, f_{nf}] \tag{9d} $$

C and D are, like F, monic, i.e., start with a “1.”

A very common case is that F = D = A and C = 1, which gives the ARX model (autoregressive with exogenous input):

$$ y(t) = A^{-1}(q)B(q)u(t) + A^{-1}(q)e(t) \quad \text{or} \tag{10a} $$

$$ A(q)y(t) = B(q)u(t) + e(t) \quad \text{or} \tag{10b} $$

$$ y(t) + a_1 y(t-1) + \ldots + a_{na} y(t - na) \tag{10c} $$

$$ = b_1 u(t-1) + \ldots + b_{nb} u(t - nb) + e(t) \tag{10d} $$

This is the model structure we used in (1) in the introductory example, but for several inputs.

Other common black-box structures of this kind are FIR (finite impulse response model, F = C = D = 1), ARMAX (autoregressive moving average with exogenous input, F = D = A), and BJ (Box-Jenkins, all four polynomials are different).

Gray-Box Models

If some physical facts are known about the system, it is possible to build that into a gray-box model. It could, for example, be that for the airplane in the introduction, the motion equations are known from Newton’s laws, but certain parameters are unknown, like the aerodynamical derivatives. Then it is natural to build a continuous-time state-space model from physical equations:

$$ \dot x(t) = A(\theta)x(t) + B(\theta)u(t), \qquad y(t) = C(\theta)x(t) + D(\theta)u(t) + v(t) \tag{11} $$

Here $\theta$ are simply some entries of the matrices A, B, C, D, corresponding to unknown physical parameters, while the other matrix entries signify known physical behavior. This model can be sampled with well-known sampling formulas (obeying the input inter-sample properties, zero-order hold, or first-order hold) to give

$$ x(t+1) = \mathcal{F}(\theta)x(t) + \mathcal{G}(\theta)u(t), \qquad y(t) = C(\theta)x(t) + D(\theta)u(t) + w(t) \tag{12} $$

The model (12) has the transfer function from u to y

$$ G(q, \theta) = C(\theta)\left[qI - \mathcal{F}(\theta)\right]^{-1}\mathcal{G}(\theta) + D(\theta) \tag{13} $$

so we have achieved a particular parameterization of the general linear model (5a).
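
For a scalar state, the zero-order-hold sampling step that turns a continuous-time model of the form (11) into a discrete-time model of the form (12), and the resulting transfer function (13), can be written in closed form. The sketch below assumes a single state, a sampling interval T, and hypothetical function names; the numerical values are illustrative only.

```python
import math


def sample_zoh_scalar(a, b, T):
    """Zero-order-hold sampling of dx/dt = a*x + b*u with sampling interval T.

    Returns (F, G) such that x(t + T) = F*x(t) + G*u(t) when u is held
    constant between the sampling instants.
    """
    F = math.exp(a * T)
    # G = integral_0^T exp(a*s) ds * b; the limit a -> 0 gives b*T.
    G = b * T if a == 0.0 else (F - 1.0) * b / a
    return F, G


def transfer_function(F, G, C, D, q):
    """Scalar version of (13): C * (q - F)^{-1} * G + D, evaluated at q."""
    return C * G / (q - F) + D


# Assumed continuous-time parameters: a stable first-order system.
a, b, C, D, T = -1.0, 1.0, 1.0, 0.0, 0.1
F, G = sample_zoh_scalar(a, b, T)

# Static gain of the sampled model (q = 1) and of the continuous-time model.
static_gain_sampled = transfer_function(F, G, C, D, q=1.0)
static_gain_continuous = -C * b / a + D
```

Zero-order-hold sampling preserves the static gain, which gives a quick sanity check on the sampled parameters.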


Continuous-Time Models 

The general model description (4) describes how 
the predictions evolve in discrete time. But in 
many cases, we are interested in continuous¬ 
time (CT) models, like models for physical in¬ 
terpretation and simulation (e.g., electrical cir¬ 
cuit simulators like ADS, Spice, Spectre, and 
Microwave Office use continuous-time models). 
But CT model estimation is contained in the 
described framework, as the linear state-space 
model (11) illustrates. More comments on direct 
estimation of CT models are given in section “Es¬ 
timating Continuous Time Models.” 

Nonlinear Models 

A nonlinear model is a relation (4), where the function g is nonlinear in the input-output data Z. There is a rich variation in how to specify the function g more explicitly. A quite general way is the nonlinear state-space equation, which is a counterpart to (12):

$$ x(t+1) = f(x(t), v(t), \theta), \qquad y(t) = h(x(t), e(t), \theta) \tag{14} $$

where v and e are white noises. This is further discussed in ► Nonlinear System Identification: An Overview of Common Approaches, where x is described as a Markov process with v defining the transitions, and in ► Nonlinear System Identification: An Overview of Common Approaches, where (14) (v = 0) is related to a continuous-time gray-box model. The latter article also discusses several other nonlinear model structures that can be seen as extensions and modifications of linear models: nonlinear mappings of past input-output data corresponding to (10), mixing static nonlinearities with linear dynamical models, etc.


$\mathcal{I}$: Identification Methods: Criteria

The goal of identification is to match the model 
to the data. Here the basic techniques for such 
matching will be discussed. 

Time Domain Data 

Suppose now we have collected a data record in the time domain

$$ Z^N = \{u(1), y(1), \ldots, u(N), y(N)\} \tag{15} $$

Since the model is in essence a predictor, it is quite natural to evaluate it by how well it predicts the measured output. So, form the prediction errors for (4):

$$ \varepsilon(t, \theta) = y(t) - \hat y(t|\theta) \tag{16} $$

The “size” of this error can be measured by some scalar norm:

$$ \ell(\varepsilon(t, \theta)) \tag{17} $$

and the performance of the predictor over the whole data record $Z^N$ becomes

$$ V_N(\theta) = \sum_{t=1}^{N} \ell(\varepsilon(t, \theta)) \tag{18} $$

A natural parameter estimate is then

$$ \hat\theta_N = \arg\min_{\theta \in D_{\mathcal{M}}} V_N(\theta) \tag{19} $$

This is the prediction error method (PEM) and is applicable to general model structures. See, e.g., Ljung (1999) or Ljung (2002) for more details. See also ► Nonlinear System Identification: An Overview of Common Approaches.

The PEM approach can be embedded in a 
statistical setting to guarantee optimal statistical 
properties. The ML methodology below offers a 
systematic framework to do so: 

A Maximum Likelihood View

If the system innovations e have a probability density function (pdf) f(x), then the criterion function (18) with $\ell(x) = -\log f(x)$ will be the negative logarithm of the likelihood function. See Lemma 5.1 in Ljung (1999). More specifically, assume that the system has p outputs and that the innovations are Gaussian with zero mean and covariance matrix $\Lambda$, so that

$$ y(t) = \hat y(t|\theta) + e(t), \qquad e(t) \in N(0, \Lambda) \tag{20} $$

for the $\theta$ that generated the data. Then it follows that the negative logarithm of the likelihood function for estimating $\theta$ from y is

$$ L_N(\theta) = \frac{1}{2}\left[V_N(\theta) + N \log\det\Lambda + Np \log 2\pi\right] \tag{21} $$

where $V_N(\theta)$ is defined by (18), with

$$ \ell(\varepsilon(t, \theta)) = \varepsilon^T(t, \theta)\Lambda^{-1}\varepsilon(t, \theta) \tag{22} $$

So the maximum likelihood model estimate (MLE) for known $\Lambda$ is obtained by minimizing $V_N(\theta)$. If $\Lambda$ is not known, it can be included among the parameters and estimated (Ljung 1999, page 218), which results in a criterion

$$ D_N(\theta) = \det \sum_{t=1}^{N} \varepsilon(t, \theta)\varepsilon^T(t, \theta) \tag{23} $$

to be minimized.


The EM Algorithm

The EM algorithm (Dempster et al. 1977) is closely related to the ML technique. It is a method that is especially useful when the ML criterion is difficult to evaluate from the observed data but would be easier to find if certain unobserved latent variables were known. The algorithm alternates between an expectation step estimating the log likelihood and a maximization step bringing the parameter estimate closer in each step to the MLE. Its application to the nonlinear state-space model (14) is described in ► Nonlinear System Identification: An Overview of Common Approaches.


Regularization

Solving for the estimate in (19) is a so-called inverse problem, which means that the solution may be ill conditioned. To deal with that, we could add a quadratic norm to (18):

$$ W_N(\theta) = V_N(\theta) + \lambda(\theta - \theta^*)^T R (\theta - \theta^*) \tag{24} $$

($\lambda$ is a scaling, R is a positive semidefinite (psd) matrix, and $\theta^*$ is a nominal value of the parameters). The estimate is then found by minimizing $W_N(\theta)$. The criterion (24) makes sense in a classical estimation framework as an ad hoc modification of the MLE to deal with possible ill-conditioned minimization problems. The added quadratic term then serves as proper (Tikhonov) regularization of an ill-conditioned inverse problem; see, for example, Tikhonov and Arsenin (1977). This criterion is a clear-cut balance between model fit and a penalty on the model parameter size. The amount of penalty is governed by $\lambda$ and R.

Other useful regularization penalties could be to add an $\ell_1$ norm of the parameter. Such techniques are further discussed in ► System Identification Techniques: Convexification, Regularization, and Relaxation.


Bayesian View

For a broader perspective it is useful to invoke a Bayesian view. Then the sought parameter vector $\theta$ is itself a random vector with a certain pdf. This random vector will of course be correlated with the observations y. If we assume that the prior distribution of $\theta$ (before y has been observed) is Gaussian with mean $\theta^*$ and covariance matrix $\Pi$,

$$ \theta \in N(\theta^*, \Pi) \tag{25} $$

its prior pdf is

$$ p(\theta) = \frac{1}{\sqrt{(2\pi)^d \det(\Pi)}} \, e^{-(\theta - \theta^*)^T \Pi^{-1} (\theta - \theta^*)/2} \tag{26} $$

where $d = \dim\theta$. The posterior (after y has been measured) pdf then is by Bayes rule (Y denoting all measured y signals)

$$ P(\theta|Y) = \frac{P(\theta, Y)}{P(Y)} = \frac{P(Y|\theta)P(\theta)}{P(Y)} \tag{27} $$

In the last step $P(Y|\theta)$ is the likelihood function (cf. the negative log likelihood function $L_N(\theta)$ in (21)), $P(\theta)$ is the prior pdf (26), and P(Y) is a $\theta$-independent normalization. Apart from this normalization, and other $\theta$-independent terms, twice the negative logarithm of (27) equals $W_N(\theta)$ in (24) with

$$ \lambda R = \Pi^{-1} \tag{28} $$

That means that with (28), the regularized estimate from (24) is the maximum a posteriori (MAP) estimate. As more and more data become available, the role of the prior will tend to zero, so as $N \to \infty$ the MAP estimate tends to the MLE.

This Bayesian interpretation of the regularized estimate also gives a clue to select the regularization quantities $\lambda$, R, $\theta^*$.
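
For a scalar parameter and a quadratic fit term, the minimizer of the regularized criterion (24) has a closed form, which makes the shrinking role of the penalty (or, equivalently, of the prior) easy to see. The sketch below assumes the hypothetical predictor $\hat y(t|\theta) = \theta\varphi(t)$ with unit noise variance; the function name and the data are illustrative assumptions.

```python
def regularized_estimate(phi, y, lam, r, theta_star):
    """Minimize sum_t (y(t) - theta*phi(t))^2 + lam*r*(theta - theta_star)^2.

    Setting the derivative with respect to theta to zero gives
    theta = (sum phi*y + lam*r*theta_star) / (sum phi^2 + lam*r).
    With lam*r = 1/Pi this is the MAP estimate for the prior N(theta_star, Pi).
    """
    num = sum(p * v for p, v in zip(phi, y)) + lam * r * theta_star
    den = sum(p * p for p in phi) + lam * r
    return num / den


# Assumed noise-free data generated by theta = 2 with a constant regressor.
theta_short = regularized_estimate([1.0] * 2, [2.0] * 2, lam=2.0, r=1.0, theta_star=0.0)
theta_unreg = regularized_estimate([1.0] * 2, [2.0] * 2, lam=0.0, r=1.0, theta_star=0.0)
theta_long = regularized_estimate([1.0] * 200, [2.0] * 200, lam=2.0, r=1.0, theta_star=0.0)
```

With two data points the estimate is pulled halfway toward the nominal value, whereas with 200 data points the same penalty has almost no effect, in line with the remark that the role of the prior vanishes as N grows.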
For black-box models, a reasonable prior ($\Pi$, $\theta^*$) may not be available. Then it is possible to parameterize them with hyperparameters $\alpha$ and then estimate these through the marginal likelihood:

$$ \hat\alpha = \arg\max_{\alpha} P(Y|\alpha) \tag{29} $$

A survey of how such techniques may improve system identification techniques is given in Pillonetto et al. (2014).

More aspects of the Bayesian view of system identification are given in ► System Identification Techniques: Convexification, Regularization, and Relaxation and in ► Nonlinear System Identification Using Particle Filters.

Frequency Domain Data

Frequency domain data are obtained either from frequency analysers or by applying the Fourier transform to measured time domain data. The data could be in the input-output form

$$ Y_N(e^{i\omega_k}), \ U_N(e^{i\omega_k}), \qquad k = 1, 2, \ldots, M \tag{30} $$

$$ Y_N(z) = \frac{1}{\sqrt{N}} \sum_{k=1}^{N} y(k) z^{-k} \tag{31} $$

or being observed samples from the frequency function

$$ \hat G_N(e^{i\omega_k}), \qquad k = 1, 2, \ldots, M \tag{32} $$

$$ \text{e.g., } \hat G_N(e^{i\omega}) = \frac{Y_N(e^{i\omega})}{U_N(e^{i\omega})} \tag{33} $$

((33) is the empirical transfer function estimate, ETFE).

Linear Parametric Models

By taking the Fourier transform of (5a), we see that

$$ Y(e^{i\omega}) = G(e^{i\omega}, \theta)U(e^{i\omega}) \tag{34} $$

plus a noise term that has variance

$$ \sigma^2 |H(e^{i\omega}, \theta)|^2 \tag{35} $$

Simple least squares (LS) curve fitting of (34) says that we should fit observations with weights that are inversely proportional to the measurement variance. That gives the weighted LS criterion

$$ V_N(\theta) = \sum_{k=1}^{M} \frac{\left|Y_N(e^{i\omega_k}) - G(e^{i\omega_k}, \theta)U_N(e^{i\omega_k})\right|^2}{\left|H(e^{i\omega_k}, \theta)\right|^2} \tag{36} $$

(the constant $\sigma^2$ does not affect the minimization of $V_N$).

It can readily be verified that (36) coincides with (18) (with $\ell(\varepsilon) = |\varepsilon|^2$) by Parseval’s identity in case M = N and the frequencies $\omega_k$ are selected as the DFT grid.

Notice that (36) can be written as

$$ V_N(\theta) = \sum_{k=1}^{M} \left| \frac{Y_N(e^{i\omega_k})}{U_N(e^{i\omega_k})} - G(e^{i\omega_k}, \theta) \right|^2 \cdot \left| \frac{U_N(e^{i\omega_k})}{H(e^{i\omega_k}, \theta)} \right|^2 \tag{37} $$

We can see that as a properly weighted curve fitting of the frequency function to the ETFE (33).

See ► Frequency Domain System Identification for more details of using frequency domain data for estimating dynamical systems.


Nonparametric Methods

From frequency domain data, the frequency response functions $G(e^{i\omega})$, $H(e^{i\omega})$ can also be estimated directly as functions without any parametric model. See ► Nonparametric Techniques in System Identification for a detailed account of this.


IV and Subspace Methods


Instrumental Variables 

The family of identification methods that can be described as minimizing a specific criterion function, like (19), covers many theoretically and practically important techniques. Still, several methods do not belong to this family. A useful technique is to characterize a good model as one that gives prediction errors that are uncorrelated with available information:

$$\hat\theta_N = \operatorname{sol}_{\theta \in D_{\mathcal M}} \left[ \sum_{t=1}^{N} \varepsilon(t, \theta)\, \zeta(t, \theta) = 0 \right] \qquad (38)$$

Here, $\varepsilon(t, \theta)$ is the prediction error (16), and sol means “solution to.” The sequence $\{\zeta(t),\ t = 1, \ldots, N\}$ is constructed from the observed data, possibly also dependent on some design variables that are included in $\theta$. Typically $\zeta(t)$ is constructed from past inputs, so a good model should not have prediction errors that are correlated with past observations. The variables $\zeta$ are called instrumental variables, and there is an extensive literature about how to select these. See, e.g., Ljung (1999), Section 7.5, Söderström and Stoica (1983), and Young (2011).
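As a rough illustration of (38), the sketch below solves the instrumental-variable equations for a first-order linear regression model. The test system, its coefficients, and all function names are assumptions made for this example only; the instruments are simply two delayed inputs, which is one common choice rather than a recommended one.

```python
import numpy as np

def simulate_arx(n, a=-0.8, b=1.0, c=0.9, seed=0):
    """Hypothetical test system: y(t) + a*y(t-1) = b*u(t-1) + e(t) + c*e(t-1)."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(n)
    e = rng.standard_normal(n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = -a * y[t - 1] + b * u[t - 1] + e[t] + c * e[t - 1]
    return y, u

def iv_estimate(y, u):
    """Solve sum_t zeta(t) * (y(t) - phi(t)^T theta) = 0 for theta = [a, b]."""
    # Regressor phi(t) = [-y(t-1), u(t-1)] for t = 2, ..., n-1.
    phi = np.column_stack([-y[1:-1], u[1:-1]])
    # Instruments built from past inputs only: zeta(t) = [u(t-2), u(t-1)].
    zeta = np.column_stack([u[:-2], u[1:-1]])
    return np.linalg.solve(zeta.T @ phi, zeta.T @ y[2:])

def ls_estimate(y, u):
    """Ordinary least squares on the same regressors, for comparison."""
    phi = np.column_stack([-y[1:-1], u[1:-1]])
    return np.linalg.lstsq(phi, y[2:], rcond=None)[0]

y, u = simulate_arx(200_000)
print("IV estimate of [a, b]:", iv_estimate(y, u))
print("LS estimate of [a, b]:", ls_estimate(y, u))
```

Because the equation error in this assumed system is colored, the least-squares regressor $-y(t-1)$ is correlated with the noise, whereas the input-based instruments are not; this is the situation in which solving (38) is expected to help.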

Subspace Methods 

A related technique is to estimate black-box state-space models like (12) (without any internal parametric structure) by realizing the states from data and then estimating the matrices by the least-squares method. This gives a powerful family of methods for state-space model estimation. They are described in detail in ► Subspace Techniques in System Identification. The major advantage of subspace methods is that they easily apply to multiple-input-multiple-output systems and are non-iterative. A drawback is that the model properties and their dependence on certain design variables are not fully known.

Errors-in-Variables (EIV) Techniques 

The estimation techniques described so far assume that the input has been measured without errors. In some cases, it is natural to assume that both inputs and outputs have measurement errors. The estimation problem then becomes more difficult, and some kind of knowledge about the measurement errors is typically required. In Pintelon and Schoukens (2012), Section 8.2, it is described how criteria of the type (36) are modified in the presence of input noise, and Söderström (2007) can be consulted for a summarizing treatise on EIV techniques. See also the section “Errors-in-Variables Framework” in ► Frequency Domain System Identification.


Asymptotic Properties of 
the Estimated Models 

An estimated model is useless, unless something 
is known about its reliability and error bounds. 
Therefore, it is important to analyze the model 
properties. 


Bias and Variance 

The observations, certainly of the output from the system, are affected by noise and disturbances, which of course also will influence the estimated model parameters (19). The disturbances are typically described as stochastic processes, which makes the estimate $\hat\theta_N$ a random variable. This has a certain pdf and often the analysis is restricted to its mean and variance only. The difference between the mean and a true description of the system measures the bias of the model. If the mean coincides with the true system, the estimate is said to be unbiased. The total error in a model thus has two contributions: the bias and the variance.
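For a scalar parameter the two contributions can be written out explicitly. In the identity below, $\theta_0$ denotes a true parameter value; this symbol is introduced here only for illustration and presumes that the true system can be described within the model structure.

```latex
E\,(\hat\theta_N - \theta_0)^2
  = \underbrace{(E\,\hat\theta_N - \theta_0)^2}_{\text{squared bias}}
  + \underbrace{E\,(\hat\theta_N - E\,\hat\theta_N)^2}_{\text{variance}}
```

The cross term vanishes because $E(\hat\theta_N - E\hat\theta_N) = 0$, so the mean-square error is exactly the sum of the squared bias and the variance.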


Properties of the PEM Estimate (19) as $N \to \infty$

Except in simple special cases, it is quite difficult to compute the pdf of the estimate $\hat\theta_N$. However, its asymptotic properties as $N \to \infty$ are easier to establish. The basic results can be summarized as follows ($E$ denotes mathematical expectation; see Ljung (1999), chapters 8 and 9, for a more complete treatment):

• Limit Model:

$$\hat\theta_N \to \theta^* = \arg\min_{\theta} \lim_{N \to \infty} \frac{1}{N} V_N(\theta) = \arg\min_{\theta} E\,\ell(\varepsilon(t, \theta)) \qquad (39)$$

So the estimate will converge to the best possible model, in the sense that it gives the smallest average prediction error.

• Asymptotic Covariance Matrix for Scalar Output Models:

In case the prediction errors $e(t) = \varepsilon(t, \theta^*)$ for the limit model are approximately white, the covariance matrix of the parameters is asymptotically given by:

$$\operatorname{Cov} \hat\theta_N \sim \frac{\kappa(\ell)}{N} \left[ \operatorname{Cov} \frac{d}{d\theta} \hat y(t|\theta) \right]^{-1} \qquad (40)$$

So the covariance matrix of the parameter estimate is given by the inverse covariance matrix of the gradient of the predictor wrt the parameters. Here (prime denoting derivatives)

$$\kappa(\ell) = \frac{E\left[\ell'(e(t))\right]^2}{\left[E\,\ell''(e(t))\right]^2} \qquad (41)$$

Note that

$$\kappa(\ell) = \sigma^2 = E e^2(t) \quad \text{if } \ell(e) = e^2/2$$

If the model structure contains the true system, it can be shown that this covariance matrix is the smallest that can be achieved by any unbiased estimate, in case the norm $\ell$ is chosen as the negative logarithm of the pdf of $e$. That is, it fulfills the Cramér-Rao inequality (Cramér 1946).

These results are valid for quite general model structures. Now, specialize to linear models (5a) and assume that the true system is described by

$$y(t) = G_0(q) u(t) + H_0(q) e(t) \qquad (42)$$

which could be general transfer functions, possibly much more complicated than the model. Then

$$\theta^* = \arg\min_{\theta} \int_{-\pi}^{\pi} \left| G(e^{i\omega}, \theta) - G_0(e^{i\omega}) \right|^2 \frac{\Phi_u(\omega)}{|H(e^{i\omega}, \theta)|^2}\, d\omega \qquad (43)$$

That is, the frequency function of the limiting model will approximate the true frequency function as well as possible in a frequency norm given by the input spectrum and the noise model.

For a linear black-box model

$$\operatorname{Cov} G(e^{i\omega}, \hat\theta_N) \sim \frac{n}{N} \frac{\Phi_v(\omega)}{\Phi_u(\omega)} \quad \text{as } n, N \to \infty \qquad (44)$$

where $n$ is the model order and $\Phi_v$ is the noise spectrum $\sigma^2 |H_0(e^{i\omega})|^2$. The variance of the estimated frequency function at a given frequency is thus, for a high-order model, proportional to the noise-to-signal ratio at that frequency. That is a natural and intuitive result.


Trade-Off Between Bias and Variance

Generally speaking the quality of the model depends on the quality of the measured data and the flexibility of the chosen model structure (3). A more flexible model structure typically has smaller bias, since it is easier to come closer to the true system. At the same time, it will have a higher variance: With higher flexibility it is easier to be fooled by disturbances. So the trade-off between bias and variance to reach a small total error is a choice of balanced flexibility of the model structure.

As the model gets more flexible, the fit to the estimation data in (19), $V_N(\hat\theta_N)$, will always improve. To account for the variance contribution, it is thus necessary to modify this fit to assess the total quality of the model. A much used technique for this is Akaike’s criterion (AIC) (Akaike 1974):

$$\hat\theta_N = \arg\min_{\mathcal M,\, \theta \in D_{\mathcal M}} \left[ 2 L_N(\theta) + 2 \dim\theta \right] \qquad (45)$$

where $L_N$ is the negative log likelihood function. The minimization also takes place over a family of model structures with different numbers of parameters ($\dim\theta$).

For Gaussian innovations $e$ with unknown and estimated variance, AIC takes the form

$$\hat\theta_N = \arg\min_{\mathcal M,\, \theta \in D_{\mathcal M}} \left[ \log\det\left( \frac{1}{N} \sum_{t=1}^{N} \varepsilon(t, \theta)\, \varepsilon^T(t, \theta) \right) + 2\, \frac{\dim\theta}{N} \right] \qquad (46)$$

after normalization and omission of model-independent quantities.

A variant of AIC is to put a higher penalty on the model complexity:

$$\hat\theta_N = \arg\min \left[ 2 L_N(\theta) + \dim\theta\, \log N \right] \qquad (47)$$

This is known as the Bayesian information criterion (BIC) or Rissanen’s minimum description length (MDL) criterion (Rissanen 1978).
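The following sketch shows how criteria of the form (46) and (47) might be evaluated for scalar-output ARX models of increasing order. It assumes Gaussian innovations, uses the normalized form $\log \hat\sigma^2 + \dim\theta \cdot \log N / N$ for BIC by analogy with (46), and relies on a made-up second-order test system; the function names are illustrative only.

```python
import numpy as np

def arx_fit_loss(y, u, n, start):
    """Least-squares ARX fit of order n; returns the mean squared prediction error."""
    # Row for time t holds [-y(t-1), ..., -y(t-n), u(t-1), ..., u(t-n)].
    phi = np.array([np.concatenate([-y[t - n:t][::-1], u[t - n:t][::-1]])
                    for t in range(start, len(y))])
    target = y[start:]
    theta = np.linalg.lstsq(phi, target, rcond=None)[0]
    eps = target - phi @ theta
    return float(np.mean(eps ** 2))

def aic_bic(y, u, orders):
    """Return {order: (loss, AIC, BIC)} with all orders fitted on the same samples."""
    start = max(orders)
    n_samples = len(y) - start
    table = {}
    for n in orders:
        loss = arx_fit_loss(y, u, n, start)
        dim = 2 * n  # n output lags plus n input lags
        aic = np.log(loss) + 2 * dim / n_samples
        bic = np.log(loss) + dim * np.log(n_samples) / n_samples
        table[n] = (loss, aic, bic)
    return table

# Hypothetical second-order system used only to exercise the functions.
rng = np.random.default_rng(0)
N = 2000
u = rng.standard_normal(N)
e = 0.5 * rng.standard_normal(N)
y = np.zeros(N)
for t in range(2, N):
    y[t] = 1.5 * y[t - 1] - 0.7 * y[t - 2] + u[t - 1] + 0.5 * u[t - 2] + e[t]

table = aic_bic(y, u, orders=[1, 2, 3, 4, 5])
best_aic = min(table, key=lambda n: table[n][1])
best_bic = min(table, key=lambda n: table[n][2])
print("order selected by AIC:", best_aic)
print("order selected by BIC:", best_bic)
```

The raw loss can only decrease as the order grows, since the model structures are nested and fitted on the same samples; the penalty terms are what make a finite order preferable.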

Section “V: Model Validation” contains further aspects on the choice of model structure.


X: Experiment Design 

Experiment design is the question of choosing 
which signal to measure, the sampling rate, and 
designing the input. 

The theory of experiment design primarily relies upon analysis of how the asymptotic parameter covariance matrix (40) depends on the design variables, so the essence of experiment design can be symbolized as

$$\min_{\mathcal X} \operatorname{trace}\left\{ C \left[ E\, \psi(t) \psi^T(t) \right]^{-1} \right\}$$

where $\psi$ is the gradient of the prediction wrt the parameters and the matrix $C$ is used to weight variables reflecting the intended use of the model.

For linear systems the input design is often expressed as selecting the spectrum (frequency contents) of $u$.

This leads to the following recipe: Let the input’s power be concentrated to frequency regions where a good model fit is essential and where disturbances are dominating.

Issues of experiment design are treated in 
much more detail in ► Experiment Design and 
Identification for Control. 


The measurement setup, such as whether band-limited inputs are used to estimate continuous-time models and how the experiment equipment is instrumented with band-pass filters (see, e.g., Pintelon and Schoukens 2012, Sections 13.2-3), also belongs to the important experiment design questions.

V: Model Validation 

Model validation is about examining and 
scrutinizing an estimated model to check if 
it can be used for its purpose. These methods 
unavoidably are problem dependent and contain 
several subjective elements, and no conclusive 
procedure for validation can be given. A 
few useful techniques will be listed in this 
section. Basically it is a matter of trying to 
falsify a model under the conditions it will 
be used for and also to gain confidence in 
its ability to reproduce new data from the 
system. 

Falsifying Models: Residual Analysis 

An estimated model is never a correct description of a true system. In that sense, a model cannot be “validated.” Instead it is instructive to try and falsify it, i.e., confront it with facts that may contradict its correctness. A good principle is to look for the simplest unfalsified model; see, e.g., Popper (1934).

Residual analysis is the leading technique for falsifying models: The residuals, or one-step-ahead prediction errors $\varepsilon(t) = \varepsilon(t, \hat\theta_N) = y(t) - \hat y(t|\hat\theta_N)$, should ideally not contain any traces of past inputs or past residuals. If they did, it means that the predictions are not ideal. So, it is natural to test the correlation functions

$$\hat r_{\varepsilon,u}(k) = \frac{1}{N} \sum_{t=1}^{N} \varepsilon(t + k)\, u(t) \qquad (48)$$

$$\hat r_{\varepsilon}(k) = \frac{1}{N} \sum_{t=1}^{N} \varepsilon(t + k)\, \varepsilon(t) \qquad (49)$$

and check that they are not larger than certain thresholds. Here $N$ is the length of the data record and $k$ typically ranges over a fraction of the interval $[-N, N]$. See, e.g., Ljung (1999), Section 16.6 for more details.
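A minimal sketch of such a test is given below. It evaluates (48) and (49) for nonnegative lags and compares the normalized correlations with a threshold of the form $z/\sqrt{N}$; this threshold, the normalization, the zero-mean assumption on the signals, and the function names are all assumptions of this example rather than a prescription from the text.

```python
import numpy as np

def correlation(eps, w, k):
    """(1/N) * sum_t eps(t+k) * w(t) over the overlapping samples, for k >= 0."""
    n = len(eps)
    return np.dot(eps[k:], w[:n - k]) / n

def residual_test(eps, u, max_lag=20, z=2.58):
    """Return (whiteness_ok, independence_ok) for residuals eps and input u."""
    n = len(eps)
    bound = z / np.sqrt(n)
    r_e0 = correlation(eps, eps, 0)
    r_u0 = correlation(u, u, 0)
    # Normalized autocorrelation of the residuals at lags 1..max_lag, cf. (49).
    auto = np.array([correlation(eps, eps, k) for k in range(1, max_lag + 1)]) / r_e0
    # Normalized cross-correlation with past inputs at lags 0..max_lag, cf. (48).
    cross = np.array([correlation(eps, u, k) for k in range(max_lag + 1)])
    cross = cross / np.sqrt(r_e0 * r_u0)
    return bool(np.abs(auto).max() < bound), bool(np.abs(cross).max() < bound)
```

A call such as `residual_test(eps, u)` returns two flags; a `False` in either position means that the corresponding correlation exceeded the assumed threshold at some lag, which would count as evidence against the model.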

Comparing Different Models 

When several models have been estimated, the question is which one to choose as the “best one.” Models that employ more parameters naturally show a better fit to the data, and it is necessary to compensate for that. The model selection criteria AIC (46) and BIC (47) are examples of how such decisions can be taken. They can be extended to regular hypothesis tests where more complex models are accepted or rejected at various test levels (Ljung 1999, Sect. 16.4).

Making comparisons in the frequency domain is a very useful complement for domain experts who are used to thinking in terms of natural frequencies, natural damping, etc.

Cross Validation 

Cross validation is an important statistical concept that loosely means that the model performance is tested on a data set (validation data) other than the estimation data. There is an extensive literature on cross validation, e.g., Stone (1977), and many ways to split up available data into estimation and validation parts have been suggested. A simple way, often used in system identification, is to use one-half of the data to estimate the model and the other half to evaluate simulation or prediction fit. Trying out different model structures (or other decision variables, like regularization parameters), one then picks the choice that gives the best performance on validation data.
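The half-and-half split described above can be sketched as follows for ARX models of different orders, using one-step-ahead prediction error on the validation half as the performance measure. The test system and all names are assumptions made for this illustration.

```python
import numpy as np

def arx_regression(y, u, n):
    """Regressors [-y(t-1..t-n), u(t-1..t-n)] and targets y(t) for t = n..len(y)-1."""
    phi = np.array([np.concatenate([-y[t - n:t][::-1], u[t - n:t][::-1]])
                    for t in range(n, len(y))])
    return phi, y[n:]

def validation_mse(y, u, n):
    """Estimate an order-n ARX model on the first half, score it on the second half."""
    half = len(y) // 2
    phi_est, y_est = arx_regression(y[:half], u[:half], n)
    theta = np.linalg.lstsq(phi_est, y_est, rcond=None)[0]
    phi_val, y_val = arx_regression(y[half:], u[half:], n)
    return float(np.mean((y_val - phi_val @ theta) ** 2))

def select_order(y, u, orders):
    """Pick the order with the smallest validation error; also return all scores."""
    scores = {n: validation_mse(y, u, n) for n in orders}
    return min(scores, key=scores.get), scores

# Hypothetical second-order system used only to exercise the functions.
rng = np.random.default_rng(3)
N = 4000
u = rng.standard_normal(N)
e = 0.5 * rng.standard_normal(N)
y = np.zeros(N)
for t in range(2, N):
    y[t] = 1.5 * y[t - 1] - 0.7 * y[t - 2] + u[t - 1] + 0.5 * u[t - 2] + e[t]

best_order, scores = select_order(y, u, orders=[1, 2, 3, 4])
print("selected order:", best_order)
```

Unlike the fit on estimation data, the validation error is not guaranteed to decrease with model order, which is why it can be used directly for the comparison.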


Other Topics 

Numerical Algorithms and Software 
Support 

The central numerical task to estimate the model lies in the innocent-looking “arg min” in (19). Since the criterion often is non-convex, this global minimization can be nontrivial. Typically some iterative numerical optimization method, like Gauss-Newton, Levenberg-Marquardt, or trust regions (see, e.g., Nocedal and Wright 2012), is employed. The iterations are initiated at a carefully selected point, for black-box linear systems often based on ARX or subspace estimates.

The practical use of system identification 
relies upon efficient software support. Many 
such packages exist. They are further treated 
along with numerical and computational aspects 
in ► System Identification Software. 

Estimating Continuous-Time Models 

Most of the techniques described here formally seem to deal with estimating discrete-time models. However, continuous-time (CT) models are to be preferred in many contexts, and most of the modeling of physical systems really concerns CT models. A natural approach is to do physical modeling in continuous time as in (11) and then do estimation of the CT matrices via the sampled model (12). All the described algorithms and results apply to this approach to CT model estimation. Another approach is to use band-limited inputs and compute the CT Fourier transforms of data (that coincide with the discrete-time transforms for band-limited data) and apply ► Frequency Domain System Identification.

Yet another approach is to directly fit CT model parameters to discrete-time data, using specially designed filters; see, e.g., Garnier and Wang (2008).

Recursive Estimation 

For certain adaptive and in-line applications, it may be necessary to continuously compute the models by recursively updating the estimates. The techniques for that resemble state-estimation algorithms and are dealt with in a general setting in ► Nonlinear System Identification Using Particle Filters. See also Ch 11 in Ljung (1999).

Data Management 

The collected data often requires particular attention before it can be used for estimation. Issues like missing observations, obviously erroneous values (outliers), slowly varying disturbances, trends, etc., need attention. In industrial applications, a practical question is often to select portions of the data records that contain relevant information for the model building. Such questions are application dependent and related to experiment design and also to database management. Some techniques for preparing data for identification are mentioned in Ch 14 of Ljung (1999).

Summary and Future Directions 

System identification is a mature and well- 
established area in automatic control. The 
methods are successfully and routinely applied 
in industrial practice, and the understanding 
of theoretical issues is mostly excellent. The 
standard theory relies very much on basic 
statistical concepts and methods. 

What is exciting about future development is 
what increased computation power may mean 
for the area: Can nonlinear models be efficiently 
estimated by massive computational efforts? Will 
tools inspired by machine learning turn out to 
be superior to the conventional approaches? 
Can reliable uncertainty regions be computed 
for arbitrary noises and without the asymptotic 
formulas? 

Several of these questions are illuminated in 
the articles listed under Cross-References. 

Cross-References 

There are several articles in this encyclopedia that deal with aspects of system identification. They have been coordinated with this overview and the text has listed how they complement the issues treated here. For easy reference, here is a complete list of associated articles:

► Experiment Design and Identification for Control
► Frequency Domain System Identification
► Modeling of Dynamic Systems from First Principles
► Nonlinear System Identification: An Overview of Common Approaches
► Nonlinear System Identification Using Particle Filters
► Nonparametric Techniques in System Identification
► Subspace Techniques in System Identification
► System Identification Software
► System Identification Techniques: Convexification, Regularization, and Relaxation


Recommended Reading 

A textbook that covers and extends the material in this contribution is Ljung (1999). Another textbook in the same spirit is Söderström and Stoica (1989), while Pintelon and Schoukens (2012) gives a comprehensive treatment of frequency domain methods. Recursive methods are treated in Young (2011).
Tactical Missile Autopilots

Abstract

Tactical missile autopilots are part of the wider guidance, navigation, and control missile system whose goal is to achieve a successful intercept. The missile autopilot task is to turn guidance commands into fin deflection and is generally divided into two lateral direction (pitch and yaw) controllers and the roll orientation or roll rate controller. These three “channel control” outputs are then mixed to produce fin commands. The controllers can be composed of different architectures, but most lateral autopilots use a three-loop structure with acceleration and angular rate feedback. The roll controller is usually either a proportional integral (PI) or proportional integral derivative (PID) controller. The controllers are designed using gain scheduling for large flight envelope applications and have nonlinear elements to shape the time response. Integrator reset logic, to deal with control surface saturation, is also an integral part of tactical missile autopilots.


Keywords 

Classical control; Control surfaces; Pitch; 
Proportional and integral control; Roll channel; 
Tactical missile; Yaw channels 

Introduction 

The purpose of a tactical missile is to intercept targets, and since tactical missile autopilots are part of the larger tactical missile system, they must contribute to that goal. A missile executes an intercept by first sensing the target. The target information is then used to generate guidance commands. The guidance commands are determined such that, if followed with precision, the missile will intercept the target. The problem is to follow with precision. This is where the autopilot comes in. The missile autopilot receives guidance commands and produces control deflections to move the missile in a manner consistent with completing the intercept. There are many control challenges unique to tactical missiles; namely, closing velocities can be very high, and targets can be very small and very maneuverable. Usually the guidance commands are acceleration commands, though other quantities are sometimes used. For this discussion, acceleration commands will be the autopilot commands. Once the acceleration commands or demands (as some in the guidance community call the autopilot inputs) are presented to the autopilot, the autopilot’s only concern is to produce the desired command as fast as possible with some level of robustness. The key performance metric is the time of response. The response time is a key factor that drives the miss distance, and thus the probability of a successful intercept. Another metric, though less important than the response time, is the available maneuverability. As mentioned, the autopilot achieves the desired acceleration through moving the control surfaces. Usually, the control surfaces are aerodynamic and either positioned in front of (canard control) or in back of (tail control) the center of gravity. Both tail and canard control surfaces will be called fins for the purposes of this analysis. Some recent missile designs have significantly altered the autopilot design problem by using both canards and tails or other effectors like reaction jets.

J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9,
© Springer-Verlag London 2015

The tactical missile autopilot control problem is therefore to produce accelerations by moving the fins in a controlled manner such that the response is as fast as possible while remaining under control under various flight conditions and in the presence of uncertainties (being robust). Tactical missile autopilots are a classic control challenge in that there is a direct trade between performance and robustness. The tactical autopilot tends to lean toward performance instead of robustness because, as the continuing argument goes, “what good is the missile being stable if you miss the target” versus “if the missile is unstable it may never get to the target.” So far, the discussion has mentioned the controller but mostly ignored robustness. Achieving robustness is done through the use of feedback, and in tactical missiles, inertial sensing devices are used to provide this critical information. These tactical sensors are currently packaged as a complete inertial sensor suite. This suite usually consists of three orthogonal linear accelerometers and three angular rate gyros. One reason guidance commands are the linear acceleration is that the sensing device directly measures this desired quantity.

There are two noticeable differences between tactical missile control and other aerodynamic control applications. The first is that the dynamics and controls are divided into three distinct channels with each channel nearly independent of the other two. These are the lateral (pitch and yaw) channels and the axial (roll) channel. The pitch and yaw designs are usually very similar, if not identical, and the roll channel is separate. The second is that the controllers (fins) are intertwined. That is, there are no predominantly pitch, yaw, or roll controllers, such as there are on airplanes. At least two and sometimes four fins are used in a single channel. This mixing of controls is through what is called a fin mix. This fin mixing occurs in the software (formerly in hardware, in analog controllers) after the autopilot and prior to the signals being sent to the individual fins.

Historically, tactical missile autopilot development has consisted of both a design phase and an analysis phase. This distinction is due to the controller being designed on a subset of the operating envelope. That design is then evaluated at many more conditions to determine if the design works well enough everywhere to be deployed. In both the design and analysis phases, models are used to establish performance and robustness. Linear planar, linear coupled, and nonlinear models are used. The linear models are usually restricted to the early design phases and the frequency response determination of the system. The nonlinear models are used for time domain analysis.

The remainder of the chapter is organized by examining the linear planar pitch and yaw autopilots, followed by roll control. The concept of combining controllers is then presented. This is followed by a short section on other considerations, such as coupled designs and nonlinear elements.


Pitch and Yaw Control 

For tactical missile autopilot development, the equations of motion are usually derived in a body-fixed system with the two lateral velocities replaced by the local angle of attack ($\alpha$) and sideslip angle ($\beta$). It should be noted that the sideslip is not defined as the aircraft sideslip but instead as the equivalent of the angle of attack in the horizontal plane. This is because of the symmetry that is found in missiles that does not exist in aircraft. The nonlinear equations of motion can be found in Blakelock (1991). For tactical missiles there is usually no axial acceleration control, and thus the total velocity equation is uncontrollable and removed from both the design and analysis. For the coupled equations of motion of the system, there are five equations of motion and three control inputs. For a planar view of the problem, the pitch and yaw channels in a tactical missile autopilot are usually separated, and with the appropriate sign changes in the feedback signals can use the same gains. These channels use an inertial measuring device for feedback. Usually these sensors come in a package with three accelerometers and three gyros. The outputs of these devices used in the pitch channel are the linear acceleration perpendicular to the axial direction and the angular rate of the missile about the other perpendicular axis. That is, the $z$ linear acceleration ($A_{zm}$) and the $y$ angular rate ($q_m$). Using the other four sensors would cause coupling between pitch and yaw and roll, and thus this sensor information is usually ignored in the pitch channel. They are available and used in select cases where there is strong aerodynamic coupling, in which case these cross channels can be used to decouple the system. Without getting into the actual definitions of all the variables and the numerical values (see Mracek and Ridgely 2005a for full details), the state space linearized equations of motion for the pitch plane are:


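$$A = \begin{bmatrix} Z_{\alpha 0}/(m V_{m0}) & 1 \\ M_{\alpha 0}/I_{YY} & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} Z_{\delta p 0}/(m V_{m0}) \\ M_{\delta p 0}/I_{YY} \end{bmatrix}$$

$$C = \begin{bmatrix} Z_{\alpha 0}/(m g) - M_{\alpha 0}\, \bar{x}/(g I_{YY}) & 0 \\ 0 & 1 \end{bmatrix}, \qquad D = \begin{bmatrix} Z_{\delta p 0}/(m g) - M_{\delta p 0}\, \bar{x}/(g I_{YY}) \\ 0 \end{bmatrix}$$

where

$$\dot{x} = A x + B u, \qquad y = C x + D u, \qquad x = \begin{bmatrix} \alpha \\ q \end{bmatrix}, \qquad u = \delta_p, \qquad y = \begin{bmatrix} A_{zm} \\ q_m \end{bmatrix}$$
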

Thus in the most reduced form, the tactical missile autopilot equations of motion reduce to two equations with two variables and one control. This is a very simple control problem. Since full state feedback can be used to provide an “optimal” control solution, only two feedback signals are needed for the above state space problem. For a tail controlled missile, the two state control leads ultimately to increasing missile acceleration in the wrong direction, as faster and faster designs are realized. This is because the system is, in controls language, “non-minimum phase.” That is, tail controlled missiles move in the wrong direction before they move in the commanded direction. Canard controlled missiles do not suffer this problem. See Mracek (2005) and Gutman (2003) on the relative merits of canards and tails. If the control rate is used as the input instead of the control position, there would be three states in the basic plant used in the analysis, and three signals would need to be included. Now if we consider the fin position as a variable for feedback with the accelerometer and gyro feedbacks, there are a number of different combinations of sensor feedback signals that can be used to solve the three state problem. There are, in fact, nine possible topologies, two of which are consistently robust, with one topology showing excellent robustness characteristics. For a complete comparison see Mracek and Ridgely (2005b). This topology is shown in Fig. 1. Notice that there is an integral in the formulation. This limits the actual command rate from being infinite when a step command is input to the system. Without the command going through the integrator the controller would see the step, and, since the force instantly produces an acceleration, the feedback would jump (given no actuation delay). A typical acceleration response to a step acceleration command and the control deflection rate needed to produce the response are presented in Figs. 2 and 3, respectively.
Tactical Missile Autopilots, Fig. 1 Three loop pitch topology

Tactical Missile Autopilots, Fig. 2 Acceleration response to a step input

Tactical Missile Autopilots, Fig. 3 Control rate usage

The feedback control law is:

$$\delta_p = K_{IAz} K_{ss} \int A_{zc}\, dt - K_{IAz} \int A_{zm}\, dt + K_{\theta} \int q_m\, dt + K_q q_m$$

Clearly, other components within the autopilot loop have to be considered. The control actuation system (CAS) and inertial measurement device characteristics need to be included in the design and synthesis of the autopilot. To this end, the gains are usually selected to provide the best performance (in the time domain) based on constraints. The above optimal control solution provides guaranteed margins, but when the additional components are included in the analysis the margins are an important constraint in the ultimate performance that can be achieved. Like most control problems, the constraints are both time and frequency dependent. Because of the emphasis on performance, some of the margin constraints must be examined closely. For a more detailed treatment of the three loop autopilot see Zarchan (2002).

Roll Control 

Thus far we have discussed the two lateral channels of the missile. That is because those two are the channels that directly affect the miss distance. The third channel does not directly influence the miss but it still is usually controlled. The roll channel is usually the fastest of the channels for a tactical missile. Historically, the three channels were decoupled by moving the roll “out of the way” of the other channels by designing to a higher bandwidth than the pitch and yaw channels. Because of the need for squeezing performance this practice is not always employed. The cost of not increasing the bandwidth of the roll beyond the pitch and yaw is that the interdependence of the channels needs more scrutiny. The roll channel has only one sensor element, the roll rate sensor. This measures the angular rate of the body about its central axis relative to the inertial frame. The objective, and thus the autopilot, can differ depending on the missile application. Mostly the objective would be one of the following: maintain zero roll rate, zero integral of roll rate, or some preferred Euler angle orientation. The last two can be accomplished with the same autopilot architecture, with exception handling for the Euler roll control based on the singularity in the Euler roll angle at ±90° pitch orientations.

Since there is only one sensor and one control, this channel is a classical SISO system and can be controlled with a proportional derivative (PD) or proportional integral derivative (PID) controller. The three loop topology with integral roll rate reference is presented in Fig. 4.


Gain Scheduling 

Early generation missiles had analog autopilots and some were marvels of ingenuity. Now digital control is used almost exclusively. As can be readily seen from the above discussion, the autopilot’s performance is largely dictated by gains within a given topology. Unlike with early autopilots, with digital control the gains can be set precisely and can vary greatly as needed over a wide range of flight conditions. Rarely can a single set of gains be found that provides adequate performance under all conditions. Thus, the autopilot design process is to design for a large number of flight conditions and then join the individual designs into a coherent whole. Typically, the conditions for the individual designs would be something like Mach number, altitude, and center of mass location. Once the individual gains are designed, they are joined together through an algorithm. Most likely they are “looked up” as a continuous function of the independent variables through some sort of interpolation. This gain changing philosophy is called gain scheduling. There have been some successful attempts at full envelope autopilot design. Dynamic inversion or model-based approaches have also been developed, most notably JDAM (Wise et al. 2005) where the autopilot was borrowed and adjusted on the fly from a sister design. The argument for the validity of this approach is that the flight conditions are not changing rapidly, so their variation can be ignored. Of course the synthesis of the design needs to include examination of “off break point” conditions (flight conditions within the flight envelope that were not considered in the design process) to ensure compliance with stability requirements. History has shown that tactical missile autopilot gains tend to be approximately power functions of dynamic pressure based on the design constraints.

Tactical Missile Autopilots, Fig. 4 Three loop roll topology
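The gain “look up” through interpolation can be sketched as below for a single gain scheduled on Mach number and altitude. The breakpoints, the table values, and the choice of bilinear interpolation are purely hypothetical and serve only to illustrate the mechanism; they do not come from any actual autopilot design.

```python
import numpy as np

# Hypothetical design breakpoints and gain table (rows: Mach, columns: altitude).
MACH_POINTS = np.array([1.0, 2.0, 3.0])
ALT_POINTS = np.array([0.0, 5000.0, 10000.0])  # meters
KQ_TABLE = np.array([
    [0.10, 0.14, 0.20],
    [0.06, 0.08, 0.12],
    [0.04, 0.05, 0.07],
])

def scheduled_gain(table, mach, alt):
    """Bilinear interpolation of a gain table at the current flight condition."""
    # Interpolate along altitude for every Mach breakpoint ...
    along_alt = np.array([np.interp(alt, ALT_POINTS, row) for row in table])
    # ... then along Mach. np.interp holds the end values outside the table.
    return float(np.interp(mach, MACH_POINTS, along_alt))

print(scheduled_gain(KQ_TABLE, mach=1.5, alt=2500.0))
```

Conditions between breakpoints correspond to the “off break point” cases mentioned above: the interpolated gains there were never designed directly, so their stability properties have to be checked separately.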


Other Considerations 

The selection of gains using planar linear models and then scheduling them is not the complete autopilot design exercise. There are other challenges that must be considered. First, the plant equations are coupled through both the kinematic equations and the aerodynamics of the problem. There are two predominant ways to attack this problem in autopilot design. The first, as discussed earlier, is to make the system as decoupled as possible, create gains for the decoupled system, and analyze them in the coupled system. The second is to use feedback to create a more integrated design through cross coupling terms.

Besides the coupling there can be other
problems. The design problem is hard enough as
described above, but we have learned over the years
that the models developed earlier have neglected
certain aspects of the missile that can lead to
problems. One aspect is that missiles can be very
flexible, and since an inertial sensor is being used
for feedback, the flexible characteristics can drive
the missile unstable. The flexible characteristics
were examined by Nesline and Nesline (1985).
In that paper, the flexible model is presented and a
technique for ignoring the first mode is discussed.
(It should be noted that the model presented in the
appendix has some “typos” and should be used
with caution.)

Another aspect is the consideration of
nonlinear elements of the autopilot. These
could include integrator reset logic, command
error limits, and acceleration limits. The
three loop autopilot has an integrator, and
integrators “wind up” when the output is
saturated. For tactical missiles this
saturation could be caused by position or
rate limits. The integrator should be reset
to account for these conditions so that the
missile responds more quickly once the system
is no longer in a saturated condition. The
command error limits can be used to modify
the response characteristics to achieve a
more consistent response. Finally, acceleration
limits are used to limit the input into the
system such that the guidance commands
do not put the missile into a condition from
which it cannot maintain controlled flight.
Generating acceleration limits is a complex topic
in itself.
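The integrator wind-up issue mentioned above can be illustrated with a small sketch. It uses conditional integration (freezing the integrator while the command is saturated and the error would push it further into saturation), which is one common anti-windup technique and not necessarily the reset logic used in any particular missile autopilot. The class name `ClampedIntegrator`, the gain, and the limit are hypothetical.

```python
def saturate(value, limit):
    """Clamp value to the symmetric range [-limit, +limit]."""
    return max(-limit, min(limit, value))


class ClampedIntegrator:
    """Integrator that stops accumulating while the command is saturated."""

    def __init__(self, gain, limit):
        self.gain = gain      # hypothetical integral gain
        self.limit = limit    # hypothetical actuator position limit
        self.state = 0.0

    def step(self, error, proportional_term, dt):
        candidate = self.state + self.gain * error * dt
        command = proportional_term + candidate
        saturated = abs(command) > self.limit
        pushing_further = command * self.gain * error > 0
        # Freeze the integrator only when the update would drive the
        # command deeper into saturation; otherwise accept the update.
        if not (saturated and pushing_further):
            self.state = candidate
        return saturate(proportional_term + self.state, self.limit)
```

With a sustained error the stored state stays near the limit instead of growing without bound, so the command leaves saturation as soon as the error changes sign.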


Summary and Future Directions 

Tactical missile autopilots are generally designed
by separating the problem into two independent
lateral controls (pitch and yaw), with a third
control governing the roll attitude. A good
autopilot design produces a balance between
performance and robustness and incorporates
nonlinear elements and integrator resets. The design
process takes into account robustness throughout
the flight envelope and structural elements.

From a controls standpoint, the future
direction of tactical missile autopilot development is
in nonlinear, adaptive, and fault tolerant control.
Adaptive control is useful not only because it
provides a more predictable flight response but
also because of its potential to reduce, or maybe
even eliminate, development time.


Cross-References 

► Aircraft Flight Control 

► PID Control


Abstract

Large power systems often exhibit slow and 
fast electromechanical oscillations between 
interconnected synchronous machines. The 
slow interarea oscillations involve coherent 
groups of machines swinging together. This 
coherency phenomenon can be attributed to 
the coherent areas of machines being weakly 
coupled, either because of higher impedance 
transmission lines, heavily loaded transmission 
lines, or fewer connections between the coherent 
areas compared to the connections within a 
coherent area. Singular perturbations can be used 
to display the time-scale separation of the slow 
interarea modes and the faster local modes. 


Keywords 

Model reduction; Power system oscillations;
Singular perturbations; Two-time-scale systems


Interarea Mode Oscillation in a Power 
System 

A large power system consists of interconnected
synchronous machines supplying power to
loads via transmission lines. As a dynamical
system, it can be considered as the rotating
inertias of the synchronous machines interacting
electrically through the impedances of the
transmission system. During a disturbance, such
as a lightning strike on a transmission line, the
rotating inertias will oscillate against each other.
The frequency and extent of these oscillations
may vary: the local modes of frequencies
1-2.5 Hz originate from the interactions of
a few close-by machines, and the interarea
modes of frequencies 0.2-0.8 Hz involve groups
of machines swinging against other groups.
Coherency is this phenomenon of groups of
machines swinging together against other groups
of machines during disturbances.

Time-Scale Separation in Power System Swing Dynamics: Singular Perturbations and Coherency, Fig. 1 Two-area, four-machine system example

Time-Scale Separation in Power System Swing Dynamics: Singular Perturbations and Coherency, Fig. 2 Machine speed response of the two-area, four-machine system

Coherency can be illustrated in the simple
power system shown in Fig. 1 (Rogers 2000). The
system consists of two areas: Generators 1 and
2 in Area 1 and Generators 11 and 12 in Area
2. For a disturbance in Area 1, Fig. 2 shows the
response of the machine speeds. The interarea
mode consists of Generators 1 and 2 swinging
coherently against Generators 11 and 12. The
difference between the responses of Generators
1 and 2 is due to the local mode in Area 1, which
is excited by the disturbance.

Coherency Analysis 

Coherency with respect to the slow interarea
modes, also known as slow coherency, is an
inherent property of many power systems.
Traditional power systems consist of operating regions
dictated by physical or administrative constraints
with relatively strong connections within an
operating region. These control regions are also
interconnected with tielines to share base-load
and seasonal power resources as well as to rely on
each other for reserves. Thus a practical
interconnected power system will, by design, necessarily
have strong connections within each operating
region and weaker connections between the
regions. Due to the time-scale separation of the
slow interarea modes and the faster local modes,
the coherency phenomenon can be analyzed
using the singular perturbation method, provided
a suitable small parameter can be identified.

For a simplified coherency analysis, the
linearized second-order model of an $N$-machine
power system

$$ M \frac{d^2 \Delta\delta}{dt^2} = K \Delta\delta \tag{1} $$

can be used. In (1), $\delta$ is the $N$-dimensional
vector of individual machine rotor angles $\delta_i$,
$i = 1, \ldots, N$, $\Delta$ denotes small perturbations,
$M$ is the diagonal matrix of machine rotational
inertias $m_i$, $i = 1, \ldots, N$, and the connection
matrix $K$ consists of the linearized synchronizing
coefficients $K_{ij}$ between machines $i$ and $j$,
denoting the restoring force between the two machines.

An important property of $K$ is

$$ K_{ii} = -\sum_{j=1,\, j \neq i}^{N} K_{ij} \tag{2} $$

that is, the sum of each row of $K$ is zero. Thus
$K$ has a zero eigenvalue, which is known as
the system mode. This mode arises due to the
lack of a reference, as only the relative angles
between the machines are important. It can be
eliminated when one of the machines is chosen
as the reference.

Suppose the $N$-machine system has $r$ areas of
coherent machines, whose internal connections
within the areas are stronger than the external
connections between the areas. The weak
connection strength is denoted by a small parameter
$\varepsilon$, which can be the ratio of the relative
stiffness of the internal transmission lines versus
the external transmission lines, or the ratio of the
smaller number of external connections versus the
larger number of internal connections, or both. Thus
the connection matrix of linearized synchronizing
coefficients can be rewritten as

$$ K = K^I + \varepsilon K^E \tag{3} $$

where $K^I$ is the matrix of internal connections
and $K^E$ is the matrix of external connections
scaled by $\varepsilon$. If the machine angles in each
coherent area are arranged in consecutive order in
the vector $\delta$, then $K^I$ is block diagonal with $r$
zero eigenvalues, that is, one system mode per
area.

Singular Perturbation Analysis

To exhibit the time scales in (1) and (3), a
transformation to obtain the slow variables and the
fast variables is introduced. The slow motion is
obtained by defining, for each area, an
inertia-weighted aggregate variable

$$ y^a = \frac{1}{m^a} \sum_{i=1}^{n_a} m_i^a \Delta\delta_i^a, \qquad m^a = \sum_{i=1}^{n_a} m_i^a, \qquad a = 1, 2, \ldots, r \tag{4} $$

where $n_a$ is the number of machines in area $a$, $m_i^a$
is the inertia of machine $i$ in area $a$, and $m^a$ is the
aggregate inertia of area $a$. For the fast dynamics,
we select in each area a reference machine, say
the first machine, and define the motions of the
other machines in the same area relative to this
reference machine by the local variables

$$ z_{i-1}^a = \Delta\delta_i^a - \Delta\delta_1^a, \qquad i = 2, 3, \ldots, n_a, \qquad a = 1, 2, \ldots, r \tag{5} $$

The transformations (4) and (5) can be combined
to form

$$ \begin{bmatrix} y \\ z \end{bmatrix} = \begin{bmatrix} M_a^{-1} U^T M \\ G \end{bmatrix} \Delta\delta \tag{6} $$

where

$$ U = \mathrm{blockdiag}(u_1, u_2, \ldots, u_r) \tag{7} $$

is the grouping matrix with $n_a \times 1$ column vectors

$$ u_a = [1 \;\; 1 \;\; \cdots \;\; 1]^T, \qquad a = 1, 2, \ldots, r \tag{8} $$

$$ M_a = \mathrm{diag}(m^1, m^2, \ldots, m^r) = U^T M U \tag{9} $$

and

$$ G = \mathrm{blockdiag}(G_1, G_2, \ldots, G_r) \tag{10} $$

with $G_a$ being the $(n_a - 1) \times n_a$ matrix

$$ G_a = \begin{bmatrix} -1 & 1 & 0 & \cdots & 0 \\ -1 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -1 & 0 & 0 & \cdots & 1 \end{bmatrix} \tag{11} $$

The inverse of this transformation is explicitly
known

$$ \Delta\delta = \begin{bmatrix} U & M^{-1} G^T (G M^{-1} G^T)^{-1} \end{bmatrix} \begin{bmatrix} y \\ z \end{bmatrix} \tag{12} $$

Applying the transformation (6) to the model
(1) and (3), the electromechanical model
becomes

$$ M_a \ddot{y} = \varepsilon K_a y + \varepsilon K_{ad} z $$
$$ M_d \ddot{z} = \varepsilon K_{da} y + (K_d + \varepsilon K_{dd}) z \tag{13} $$

where

$$ M_d = (G M^{-1} G^T)^{-1}, \qquad K_a = U^T K^E U, $$
$$ K_{ad} = U^T K^E M^{-1} G^T M_d, \qquad K_{da} = M_d G M^{-1} K^E U, $$
$$ K_d = M_d G M^{-1} K^I M^{-1} G^T M_d, \qquad K_{dd} = M_d G M^{-1} K^E M^{-1} G^T M_d \tag{14} $$

Note that $K_a$, $K_{ad}$, and $K_{da}$ are independent
of the internal connection matrix $K^I$ because
$K^I U = 0$. Furthermore, $K_a$ is negative
semidefinite and $K_d$ is negative definite. System
(13) is in the standard singularly perturbed form
(Kokotovic et al. 1986) showing that $y$ is the
slow variable and $z$ is the fast variable. Thus $\varepsilon$
is both the weak connection parameter and the
singular perturbation parameter, giving rise to
slow coherency.

The dynamics of the singularly perturbed
system (13) are approximated by the interarea
modes $\pm j \sqrt{-\varepsilon \lambda(M_a^{-1} K_a)}$ and the local
modes $\pm j \sqrt{-\lambda(M_d^{-1} K_d)}$, where $\lambda$ denotes
eigenvalues.

Identifying Coherent Areas

Several methods can be used to identify coherent
areas, including the following:

1. Time simulation method (Podmore 1978):
This method simulates the dynamic responses
to a selected set of disturbances and groups
the machines having similar time responses
as coherent areas. For a faster simulation, a
linearized power system model can be used.

2. Eigenvector method (Chow et al. 1982): This
method computes the slow eigenvalues and
eigenvectors of the matrix $M^{-1} K$ and identifies
machines with similar row vectors of the slow
eigenvector matrix as coherent machines.

3. Weak link methods (Nath et al. 1985;
Zaborszky et al. 1982): These methods search
through the transmission line impedances to
find the weak links between the areas.

Applications 

The applications of the coherency concept
include:

1. Dynamic model reduction (deMello et al.
1975): The synchronous machines in a
coherent area can be aggregated into a single
equivalent machine, thus reducing the system
size. Model reduction programs capable
of handling upwards of 30,000 buses are
available (Morison and Wang 2013).

2. Interarea mode analysis and damping
control design (Larsen et al. 1995): Damping
of interarea modes is an operational concern
for systems with heavily loaded long-distance
transmission lines. The slow coherency
concept contributes to the development of
damping controller design.

3. Islanding as a defense mechanism (You et al.
2004): During system disturbances causing
severe power flow interruption, the last resort
may be to separate the systems into viable
islands, avoiding a total system blackout.
Coherent areas tend to be natural choices of
islands.

In addition to power system analysis, the
coherency concept and methods can potentially
be applied to dynamic systems with a system
mode (eigenvalue equal to 0 for a
continuous-time model and eigenvalue equal to 1 for
a discrete-time model). An example is the
PageRank computation in Ishii et al. (2012).

Cross-References 

► Consensus of Complex Multi-agent Systems 

► Lyapunov Methods in Power System Stability 

► Markov Chains and Ranking Problems in Web 
Search 

► Model Order Reduction: Techniques and Tools 

► Small Signal Stability in Electric Power Systems

Recommended Reading 

An early investigation of coherency was reported
in Podmore and Germond (1977). A recent
compilation of power system coherency, model
reduction, and interarea oscillation results can be found
in Chow (2013).


Abstract

Tracking and regulation refer to the ability of
a control system to track/reject a given family
of reference/disturbance signals modelled as
solutions of a differential/difference equation. The
problem can be posed as a stabilization
problem with a constraint on the steady-state
response of the system. For linear, time-invariant
systems, the problem can be solved provided
a system of linear matrix equations admits a
solution. Properties of this system of equations
are discussed, together with a general property of
all controllers achieving tracking and regulation:
the so-called internal model principle.


Keywords 

Internal model principle; Linear systems;
Regulation; Tracking


Introduction 


Consider a linear system affected by disturbances
and such that its output is required to
asymptotically track a certain, prespecified, reference
signal. In what follows, we discuss and solve
this control problem, known as the tracking and
regulation problem.

Consider a linear control system described by
equations of the form

$$ \sigma x = Ax + Bu + Pd, \qquad e = Cx + Qd, \tag{1} $$

with $x(t) \in \mathbb{R}^n$, $u(t) \in \mathbb{R}^m$, $e(t) \in \mathbb{R}^p$,
$d(t) \in \mathbb{R}^r$, and $A$, $B$, $P$, $C$, and $Q$ constant
matrices. In Eq. (1), $\sigma x = \sigma x(t)$ stands for
$\dot{x}(t)$, if the system is continuous-time, and for
$x(t + 1)$, if the system is discrete-time. Since the
system is time-invariant, it is assumed, without
loss of generality, that all signals are defined for
$t \geq 0$, that is, if the system is continuous-time,
then $t \in \mathbb{R}^+$, i.e., the set of nonnegative real
numbers, whereas if the system is discrete-time,
then $t \in \mathbb{Z}^+$, i.e., the set of nonnegative integers.
For ease of notation, the argument “t” is dropped
whenever this does not cause confusion, and we
use the notation $t \geq 0$ to denote either $\mathbb{R}^+$ or $\mathbb{Z}^+$.

The signal $d(t)$, denoted exogenous signal,
is in general composed of two components: the
former models a set of disturbances acting on
the system to be controlled and the latter a set
of reference signals. In what follows we assume
that the exogenous signal is generated by a
linear system, denoted exosystem, described by the
equation

$$ \sigma d = Sd, \tag{2} $$

with $S$ a matrix with constant entries. Note
that, under this assumption, it is possible to
generate, for example, constant or polynomial
references/disturbances and sinusoidal
references/disturbances with any given frequency.

The variable $e(t)$, denoted tracking error, is
a measure of the error between the ideal
behavior of the system and the actual behavior.
Ideally, the variable $e(t)$ should be regulated
to zero, i.e., should converge asymptotically to
zero, despite the presence of the disturbances. If
this happens, we say that the tracking error is
regulated to zero, i.e., converges asymptotically
to zero; hence, the disturbances are not
affecting the asymptotic behavior of the system and
the output $Cx(t)$ is asymptotically tracking the
reference signal $-Qd(t)$. In general the tracking
error does not naturally converge to zero; hence,
it is necessary to determine an input signal $u(t)$
which drives it to zero. The simplest possible way
to construct such an input signal is to assume that
it is generated via static feedback of the state $x(t)$
of the system to be controlled and of the state $d(t)$
of the exosystem, i.e.,

$$ u = Kx + Ld. \tag{3} $$

In practice it is unrealistic to assume that both
$x(t)$ and $d(t)$ are measurable; hence, it may be
more natural to assume that the input signal $u(t)$
is generated via dynamic feedback of the error
signal only, i.e., it is generated by the system

$$ \sigma\chi = F\chi + Ge, \qquad u = H\chi, \tag{4} $$

with $\chi(t) \in \mathbb{R}^\nu$, for some $\nu > 0$, and $F$, $G$, and
$H$ matrices with constant entries.
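To make the role of the exosystem concrete, the following sketch shows a hypothetical continuous-time exosystem $\dot{d} = Sd$ whose state combines a constant signal with a sinusoid. The frequency, the initial condition, and the function name `exosystem_state` are assumptions made for the example. The closed-form transition matrix is written out by hand for this particular block-diagonal $S$.

```python
import numpy as np

omega = 2.0  # hypothetical frequency of the sinusoidal component (rad/s)

# Block-diagonal S: a scalar zero block (constant signal) and a
# 2x2 skew-symmetric block (sinusoid of frequency omega).
S = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, omega],
              [0.0, -omega, 0.0]])


def exosystem_state(d0, t):
    """Closed-form solution d(t) = exp(S t) d0 for the matrix S above."""
    c, s = np.cos(omega * t), np.sin(omega * t)
    Phi = np.array([[1.0, 0.0, 0.0],
                    [0.0, c, s],
                    [0.0, -s, c]])
    return Phi @ d0


d0 = np.array([1.0, 0.5, 0.0])   # hypothetical initial condition
samples = [exosystem_state(d0, t) for t in np.linspace(0.0, 3.0, 7)]
```

The first component of each sample stays constant while the other two oscillate, so a matrix such as $Q$ in Eq. (1) can select or combine them to form the reference and disturbance signals.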

Using the above definitions, it is possible to
formally pose the regulator problem as follows.

Definition 1 (Full information regulator
problem) Consider the system (1), driven by the
exosystem (2) and interconnected with the
controller (3). The full information regulator problem
is the problem of determining the matrices $K$
and $L$ of the controller such that ((S) stands for
stability and (R) for regulation):

(S) The system $\sigma x = (A + BK)x$ is
asymptotically stable.

(R) All trajectories of the system

$$ \sigma d = Sd, \qquad \sigma x = (A + BK)x + (BL + P)d, \qquad e = Cx + Qd, \tag{5} $$

are such that $\lim_{t \to \infty} e(t) = 0$.

Definition 2 (Error feedback regulator
problem) Consider the system (1), driven by the
exosystem (2) and interconnected with the
controller (4). The error feedback regulator problem
is the problem of determining the matrices $F$, $G$,
and $H$ of the controller such that:

(S) The system

$$ \sigma x = Ax + BH\chi, \qquad \sigma\chi = F\chi + GCx, $$

is asymptotically stable.

(R) All trajectories of the system

$$ \sigma d = Sd, \qquad \sigma x = Ax + BH\chi + Pd, \qquad \sigma\chi = F\chi + G(Cx + Qd), \qquad e = Cx + Qd, \tag{6} $$

are such that $\lim_{t \to \infty} e(t) = 0$.

The Full Information Regulator 
Problem 

Consider the full information regulator problem 
and assume the following. 

Assumption 1 The matrix $S$ of the exosystem
has all eigenvalues with nonnegative real part,
in the case of continuous-time systems, or with
modulus not smaller than one, in the case of
discrete-time systems.

Assumption 2 The system (1) with $d = 0$ is
reachable.

Assumption 1 implies that there are no nonzero
initial conditions $d(0)$ such that the signal $d(t)$
converges (asymptotically) to zero. This assumption
is not restrictive. In fact, disturbances converging
to zero do not have any effect on the asymptotic
behavior of the system, and references which
converge to zero can be tracked simply by driving
the state of the system to zero, i.e., by stabilizing
the system. Assumption 2 implies that it is
possible to arbitrarily assign the eigenvalues of the
matrix $A + BK$ by a proper selection of $K$. Note
that, in practice, this assumption can be replaced
by the weaker assumption that the system (1) with
$d = 0$ is stabilizable.

We now present a preliminary result which
is instrumental in deriving a solution to the full
information regulator problem.

Lemma 1 Consider the full information
regulator problem. Suppose Assumption 1 holds.
Suppose, in addition, that there exist matrices $K$ and
$L$ such that condition (S) holds.

Then condition (R) holds if and only if there
exists a matrix $\Pi \in \mathbb{R}^{n \times r}$ such that the equations

$$ \Pi S = (A + BK)\Pi + (P + BL), \qquad 0 = C\Pi + Q, \tag{7} $$

hold.

Proof Consider the system (5) and the
coordinates transformation

$$ \hat{d} = d, \qquad \hat{x} = x - \Pi d, $$

where $\Pi$ is the solution of the equation

$$ \Pi S = (A + BK)\Pi + (P + BL). $$

This equation is a so-called Sylvester equation.
The Sylvester equation is a (matrix) equation of
the form

$$ A_1 X = X A_2 + A_3, $$

in the unknown $X$. This equation has a unique
solution, for any $A_3$, if and only if the matrices $A_1$
and $A_2$ do not have common eigenvalues. Note
that, by condition (S) and Assumption 1, there is
a unique matrix $\Pi$ which solves this equation.
In the new coordinates $\hat{x}$ and $\hat{d}$, the system is
described by the equations

$$ \sigma\hat{d} = S\hat{d}, \qquad \sigma\hat{x} = (A + BK)\hat{x}, \qquad e = C\hat{x} + (C\Pi + Q)\hat{d}. $$

By condition (S), $\lim_{t \to \infty} \hat{x}(t) = 0$; hence
condition (R) holds, by Assumption 1, if and only if
$C\Pi + Q = 0$. In summary, under the stated
assumptions, condition (R) holds if and only if
there exists a matrix $\Pi$ such that Eqs. (7) hold.

We are now ready to state and prove the result
which provides conditions for the solvability of
the full information regulator problem.

Theorem 1 Consider the full information
regulator problem. Suppose Assumptions 1 and 2
hold. There exists a full information control law
described by Eq. (3) which solves the full
information regulator problem if and only if there exist
two matrices $\Pi$ and $\Gamma$ such that the equations

$$ \Pi S = A\Pi + B\Gamma + P, \qquad 0 = C\Pi + Q, \tag{8} $$

hold.

Proof (Necessity) Suppose there exist two
matrices $K$ and $L$ such that conditions (S) and
(R) of the full information regulator problem
hold. Then, by Lemma 1, there exists a matrix
$\Pi$ such that Eqs. (7) hold. As a result, the
matrices $\Pi$ and $\Gamma = K\Pi + L$ are such that
Eqs. (8) hold.

(Sufficiency) The proof of the sufficiency is
constructive. Suppose there are two matrices $\Pi$
and $\Gamma$ such that Eqs. (8) hold. The full
information regulator problem is solved selecting
$K$ and $L$ as follows. The matrix $K$ is any
matrix such that the system $\sigma x = (A + BK)x$
is asymptotically stable. By Assumption 2,
such a matrix $K$ does exist. The matrix $L$ is
selected as $L = \Gamma - K\Pi$. This selection is
such that condition (S) of the full information
regulator problem holds; hence, to complete
the proof, we have only to show that, with $K$
and $L$ as selected above, Eqs. (7) hold. This is
trivially the case. In fact, replacing $L$ in (7)
yields Eqs. (8), which hold by assumption.
As a result, also condition (R) of the full
information regulator problem holds, and this
completes the proof.

The proof of Theorem 1 implies that a
controller which solves the full information regulator
problem is described by the equation

$$ u = Kx + (\Gamma - K\Pi)d, $$

with $K$ such that a stability condition holds,
and $\Pi$ and $\Gamma$ such that Eqs. (8) hold. By
Assumption 2, the stability condition can be always
satisfied. As a result, the solution of the full
information regulator problem relies upon the
existence of a solution of Eqs. (8).


The FBI Equations

Equations (8), known as the Francis-Byrnes-
Isidori (FBI) equations, are linear equations in
the unknowns $\Pi$ and $\Gamma$, for which the following
statement holds.

Lemma 2 Equations (8), in the unknowns $\Pi$ and
$\Gamma$, are solvable for any $P$ and $Q$ if and only if

$$ \operatorname{rank} \begin{bmatrix} sI - A & B \\ C & 0 \end{bmatrix} = n + p, \tag{9} $$

for all $s$ which are eigenvalues of the matrix $S$.

For single-input, single-output systems (i.e.,
$m = p = 1$), the condition expressed by
Lemma 2 has a very simple interpretation. In fact,
the complex numbers $s$ such that

$$ \operatorname{rank} \begin{bmatrix} sI - A & B \\ C & 0 \end{bmatrix} < n + 1 $$

are the zeros of the system

$$ \sigma x = Ax + Bu, \qquad y = Cx, $$

that is the roots of the numerator polynomial of
the transfer function $W(s) = C(sI - A)^{-1}B$, i.e.,
the zeros of $W(s)$. This implies that, for
single-input, single-output systems, the full information
regulator problem is solvable if and only if the
eigenvalues of the exosystem are not zeros of the
transfer function of the system (1) with input $u$,
output $e$, and $d = 0$.
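As a rough illustration, the FBI equations and the rank condition of Lemma 2 can be checked numerically. The sketch below rewrites the two matrix equations as one linear system using Kronecker products, which is one possible approach, and applies it to a hypothetical second-order plant that has to track a unit-frequency sinusoid. The plant data and the function names `solve_fbi` and `rank_condition_holds` are assumptions made for the example.

```python
import numpy as np


def solve_fbi(A, B, C, P, Q, S):
    """Solve Pi S = A Pi + B Gamma + P and 0 = C Pi + Q by vectorization."""
    n, m = B.shape
    p, r = Q.shape
    I_n, I_r = np.eye(n), np.eye(r)
    # Column-major vec: vec(Pi S) = (S^T kron I) vec(Pi), and so on.
    top = np.hstack([np.kron(S.T, I_n) - np.kron(I_r, A), -np.kron(I_r, B)])
    bottom = np.hstack([np.kron(I_r, C), np.zeros((p * r, m * r))])
    rhs = np.concatenate([P.flatten(order="F"), -Q.flatten(order="F")])
    sol, *_ = np.linalg.lstsq(np.vstack([top, bottom]), rhs, rcond=None)
    Pi = sol[: n * r].reshape((n, r), order="F")
    Gamma = sol[n * r:].reshape((m, r), order="F")
    return Pi, Gamma


def rank_condition_holds(A, B, C, S):
    """Check rank [sI - A, B; C, 0] = n + p at every eigenvalue s of S."""
    n, p = A.shape[0], C.shape[0]
    for s in np.linalg.eigvals(S):
        pencil = np.block([[s * np.eye(n) - A, B],
                           [C, np.zeros((p, B.shape[1]))]])
        if np.linalg.matrix_rank(pencil) < n + p:
            return False
    return True


# Hypothetical data: the output x1 must track the first exosystem state.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
P = np.zeros((2, 2))
Q = np.array([[-1.0, 0.0]])
S = np.array([[0.0, 1.0], [-1.0, 0.0]])

Pi, Gamma = solve_fbi(A, B, C, P, Q, S)
residual = Pi @ S - (A @ Pi + B @ Gamma + P)
```

For this data the transfer function $1/(s^2 + 3s + 2)$ has no zeros, so the rank condition holds at $s = \pm j$ and the equations have a unique solution. A least-squares solver is used so that the same function also runs when the stacked system is not square, although in that case the residual has to be inspected to confirm that a solution exists.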


The Error Feedback Regulator 
Problem 

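To provide a solution to the error feedback
regulator problem, we need to introduce a new
assumption.

Assumption 3 The system

$$ \sigma \begin{bmatrix} x \\ d \end{bmatrix} = \begin{bmatrix} A & P \\ 0 & S \end{bmatrix} \begin{bmatrix} x \\ d \end{bmatrix}, \qquad e = \begin{bmatrix} C & Q \end{bmatrix} \begin{bmatrix} x \\ d \end{bmatrix}, \tag{10} $$

is observable.

Note that Assumption 3 implies observability
of the system

$$ \sigma x = Ax, \qquad y = Cx. \tag{11} $$

To show this property, note that observability of
the system (10) implies that

$$ \operatorname{rank} \begin{bmatrix} C & Q \\ CA & \star \\ \vdots & \vdots \\ CA^{n+r-1} & \star \end{bmatrix} = n + r, $$

where $\star$ denotes entries that are not needed
here. This, in turn, implies

$$ \operatorname{rank} \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n+r-1} \end{bmatrix} = n $$

and, by the Cayley-Hamilton Theorem,

$$ \operatorname{rank} \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix} = n, $$

which implies observability of system (11).
Similarly to what was discussed in the case of
Assumption 2, Assumption 3 can be replaced by the
weaker assumption that the system (10) is
detectable. We are now ready to state and prove the
result which provides conditions for the
solvability of the error feedback regulator problem.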

Theorem 2 Consider the error feedback
regulator problem. Suppose Assumptions 1-3 hold.
There exists an error feedback control law
described by Eq. (4) which solves the error feedback
regulator problem if and only if there exist
two matrices $\Pi$ and $\Gamma$ such that the equations

$$ \Pi S = A\Pi + B\Gamma + P, \qquad 0 = C\Pi + Q, \tag{12} $$

hold.

Remark Theorem 2 can be alternatively stated
as follows. Consider the error feedback regulator
problem. Suppose Assumptions 1-3 hold. Then
the error feedback regulator problem is solvable if
and only if the full information regulator problem
is solvable.

Proof (Necessity) The proof of the necessity is
similar to the proof of the necessity of
Theorem 1, hence omitted.

(Sufficiency) The proof of the sufficiency is
constructive. Suppose there are two matrices $\Pi$
and $\Gamma$ such that Eqs. (12) hold. Then, by
Theorem 1, the full information control law
$u = Kx + (\Gamma - K\Pi)d$, with $K$ such that the system
$\sigma x = (A + BK)x$ is asymptotically stable,
solves the full information regulator
problem. This control law is not implementable,
because we only measure $e$. However, by
Assumption 3, it is possible to build asymptotic
estimates $\xi$ and $\delta$ of $x$ and $d$; hence,
implement the control law

$$ u = K\xi + (\Gamma - K\Pi)\delta. \tag{13} $$

To this end, consider an observer described by
the equation

$$ \sigma \begin{bmatrix} \xi \\ \delta \end{bmatrix} = \begin{bmatrix} A & P \\ 0 & S \end{bmatrix} \begin{bmatrix} \xi \\ \delta \end{bmatrix} + \begin{bmatrix} G_1 \\ G_2 \end{bmatrix} \left( \begin{bmatrix} C & Q \end{bmatrix} \begin{bmatrix} \xi \\ \delta \end{bmatrix} - e \right) + \begin{bmatrix} B \\ 0 \end{bmatrix} \begin{bmatrix} K & \Gamma - K\Pi \end{bmatrix} \begin{bmatrix} \xi \\ \delta \end{bmatrix}. $$

The estimation errors $e_x = x - \xi$ and $e_d = d - \delta$
are such that

$$ \sigma \begin{bmatrix} e_x \\ e_d \end{bmatrix} = \left( \begin{bmatrix} A & P \\ 0 & S \end{bmatrix} + \begin{bmatrix} G_1 \\ G_2 \end{bmatrix} \begin{bmatrix} C & Q \end{bmatrix} \right) \begin{bmatrix} e_x \\ e_d \end{bmatrix}; \tag{14} $$

hence, by Assumption 3, there exist $G_1$
and $G_2$ that assign the eigenvalues of this
error system. Note now that the control
law (13) can be rewritten as
$u = Kx + (\Gamma - K\Pi)d - (Ke_x + (\Gamma - K\Pi)e_d)$;
hence, the control law is composed of the
full information control law, which solves
the regulator problem, and of an additive
disturbance, which decays exponentially to
zero. Such a disturbance does not affect
the regulation requirement, provided the
closed-loop system is asymptotically stable.
Therefore, to complete the proof, we need
to show that condition (S) holds. In the
coordinates $x$, $e_x$, and $e_d$, the closed-loop
system, with $d = 0$, is described by the
equations

$$ \sigma \begin{bmatrix} x \\ e_x \\ e_d \end{bmatrix} = \begin{bmatrix} A + BK & -BK & -B(\Gamma - K\Pi) \\ 0 & A + G_1 C & P + G_1 Q \\ 0 & G_2 C & S + G_2 Q \end{bmatrix} \begin{bmatrix} x \\ e_x \\ e_d \end{bmatrix}. \tag{15} $$

Recall that the matrices $G_1$ and $G_2$ have been
selected to render system (14) asymptotically
stable and that $K$ is such that the system
$\sigma x = (A + BK)x$ is asymptotically stable. As a result,
system (15) is asymptotically stable.


The Internal Model Principle

The proof of Theorem 2 implies that a controller
which solves the error feedback regulator
problem is described by equations of the form (4) with
$\chi = \begin{bmatrix} \xi \\ \delta \end{bmatrix}$,

$$ F = \begin{bmatrix} A + G_1 C + BK & P + G_1 Q + B(\Gamma - K\Pi) \\ G_2 C & S + G_2 Q \end{bmatrix}, \qquad G = -\begin{bmatrix} G_1 \\ G_2 \end{bmatrix}, \qquad H = \begin{bmatrix} K & \Gamma - K\Pi \end{bmatrix}, \tag{16} $$

$K$, $G_1$, and $G_2$ such that a stability condition
holds and $\Pi$ and $\Gamma$ such that Eqs. (12) hold.
This controller, and in particular the matrix $F$,
possesses a very interesting property.

Proposition 1 (Internal model property) The
matrix $F$ in Eq. (16) is such that

$$ FE = ES, $$

for some matrix $E$ of rank $r$. In particular, any
eigenvalue of $S$ is also an eigenvalue of $F$.

Proof Let

$$ E = \begin{bmatrix} \Pi \\ I \end{bmatrix} $$

and note that $\operatorname{rank} E = r$, by construction, and
that

$$ FE = \begin{bmatrix} A\Pi + G_1 C\Pi + BK\Pi + P + G_1 Q + B(\Gamma - K\Pi) \\ G_2 C\Pi + S + G_2 Q \end{bmatrix} = \begin{bmatrix} (A\Pi + B\Gamma + P) + G_1(C\Pi + Q) \\ S + G_2(C\Pi + Q) \end{bmatrix} = \begin{bmatrix} \Pi S \\ S \end{bmatrix} = ES, $$

hence the first claim. To prove the second claim,
let $\lambda$ be an eigenvalue of $S$ and $v$ the
corresponding eigenvector. Then $Sv = \lambda v$; hence,

$$ FEv = ESv = \lambda Ev, $$

which shows that $\lambda$ is an eigenvalue of $F$ with
eigenvector $Ev$, and this proves the second claim.

It is possible to prove that the property
highlighted in Proposition 1 is shared by all error
feedback control laws which solve the considered
regulation problem. This property, which is often
referred to as the internal model principle, can be
interpreted as follows. The control law solving
the regulator problem has to contain a copy of
the exosystem, i.e., it has to be able to generate,
when e = 0, a copy of the exogenous signal.
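The internal model property can be checked numerically on a small example. The sketch below assembles an observer-based error feedback controller for a hypothetical second-order plant tracking a unit-frequency sinusoid and verifies that the controller matrix $F$ inherits the eigenvalues of $S$. The plant data, the gain $K$, the observer poles, and the helper `observer_gain` (Ackermann's formula for a single output) are assumptions made for the example. The sign convention takes the observer injection term as $G_i(C\xi + Q\delta - e)$, so that the estimation error dynamics involve $A + G_1 C$.

```python
import numpy as np

# Hypothetical example: the output x1 must track the first exosystem state.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
P = np.zeros((2, 2))
Q = np.array([[-1.0, 0.0]])
S = np.array([[0.0, 1.0], [-1.0, 0.0]])

# Pi and Gamma satisfy Pi S = A Pi + B Gamma + P and C Pi + Q = 0
# for this data (this can be checked by substitution).
Pi = np.eye(2)
Gamma = np.array([[1.0, 3.0]])
K = np.array([[-4.0, -2.0]])          # places eig(A + B K) at -2 and -3


def observer_gain(A_e, C_e, poles):
    """Ackermann's formula (single output): eig(A_e + G C_e) = poles."""
    n = A_e.shape[0]
    obsv = np.vstack([C_e @ np.linalg.matrix_power(A_e, k) for k in range(n)])
    coeffs = np.poly(poles)           # desired characteristic polynomial
    phi = sum(c * np.linalg.matrix_power(A_e, n - k)
              for k, c in enumerate(coeffs))
    e_last = np.zeros((n, 1))
    e_last[-1, 0] = 1.0
    return -phi @ np.linalg.solve(obsv, e_last)


A_e = np.block([[A, P], [np.zeros((2, 2)), S]])
C_e = np.hstack([C, Q])
G_obs = observer_gain(A_e, C_e, [-3.0, -4.0, -5.0, -6.0])
G1, G2 = G_obs[:2], G_obs[2:]

L = Gamma - K @ Pi
F = np.block([[A + G1 @ C + B @ K, P + G1 @ Q + B @ L],
              [G2 @ C, S + G2 @ Q]])
G = -G_obs                            # controller input matrix acting on e
H = np.hstack([K, L])
E = np.vstack([Pi, np.eye(2)])

eig_F = np.linalg.eigvals(F)
eig_S = np.linalg.eigvals(S)
```

The identity $FE = ES$ holds for any choice of $G_1$ and $G_2$ once the regulator equations are satisfied. The observer poles only matter for the stability of the closed loop.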


► Linear Systems: Continuous-Time, Time-Vary¬

Summary and Future Directions

The problem of tracking and regulation for linear
systems in the presence of references and/or
disturbances generated by a linear signal generator
has been solved. It has been shown that the
problem is solvable provided a system of linear
matrix equations admits a solution. The tracking
and regulation problem can be studied and solved
for more general classes of systems, including
nonlinear systems, distributed parameter systems,
and hybrid systems, exploiting the same ideas
presented in this article.

► Linear Systems: Continuous-Time, Time-In¬

Abstract

The main objective of tracking model predictive
control is to steer the tracking error, that is, the
difference between the reference and the output,
to zero while the constraints are satisfied. In order
to predict the expected evolution of the tracking
error, some assumptions on the future values
of the reference must be made. Since the
reference may differ from what is expected, the
tracking problem is inherently uncertain.

The most common case is to assume that the
reference will remain constant along the
prediction horizon. Tracking predictive schemes for
constant references are typically based on a
two-layer control structure in which, given the
value of the reference, first, an appropriate set
point is computed and then a nominal MPC
is designed to steer the system to this target.
Under certain assumptions, closed-loop stability
can be guaranteed if the initial state is inside the
feasibility region of the MPC. However, if the
value of the reference is changed, then there is no
guarantee that feasibility and stability properties
of the resulting control law hold. Specialized
predictive controllers have been designed to deal
with this problem. Particularly interesting is the
so-called MPC for tracking, which ensures
recursive feasibility and asymptotic stability of the set
point when the value of the reference is changed.

The presence of exogenous disturbances or
model mismatches may cause the controlled
system to exhibit an offset error. Offset-free control
in the presence of unmeasured disturbances can
be addressed by using disturbance models and
disturbance estimators together with the tracking
predictive controller.

Keywords 

Loss of feasibility; MPC for tracking; Offset-free 
control; Set-point tracking 

Introduction 

The problem of designing and stabilizing model 
predictive control (MPC) schemes to regulate a 
system to the origin has been widely studied, 
and there are well-known solutions for varied 
cases including linear, nonlinear, and uncertain 
systems, among others (Rawlings and Mayne 
2009). 

The objective of tracking MPC is to ensure that the tracking error, which is the difference between a reference or desired output r and the actual output y, tends to zero.

The most common tracking problem is when the reference r is constant. In this case, the controller is required to steer the state $x$ of the plant and the control input $u$ applied to the plant to a set point $(x_r, u_r)$ where the tracking error $y - r$ is zero and the plant is in equilibrium (at rest); the state $x_r$ is called a target. It is also necessary to ensure that $x_r$ is asymptotically stable for the controlled system, i.e., that the state $x$ converges to $x_r$ and that, near $x_r$, small changes in $x$ cause small changes in the subsequent trajectory. A relatively straightforward solution for this problem exists.

Set-point tracking is a relevant control problem in the process industry, in which the plant is typically designed to operate at an equilibrium point that maximizes the profit of the plant. In this case, the optimal set point is calculated online by a real-time optimizer (RTO) according to an economic criterion. The set points remain constant for a long period of time, until the RTO, which is executed at a very low frequency, calculates a different set point. The steady-state target associated with the given set point must be calculated and provided to the MPC to track this target.

The tracking problem is considerably more difficult when the reference r varies in a way not known a priori because MPC is naturally suited to deterministic control problems. Uncertainty requires the “invention” of special techniques, and a variety of solutions have been proposed in the literature to deal with a varying reference (Bemporad et al. 1997; Chisci and Zappa 2003; Limon et al. 2008; Maeder and Morari 2010; Pannocchia and Rawlings 2003; Rossiter et al. 1996).

Another tracking problem arises when there 
exists a mismatch between the model used for 
prediction in the optimal control problem and the 
real plant. If the reference is constant and the 
model mismatch is sufficiently small not to cause 
loss of asymptotic stability, the state and control 
will converge to values at which the predicted 
tracking error, but not the actual tracking error, 
is zero. The difference between the predicted and 
actual values of the output y is known as the 
offset; offset-free tracking when the reference is 
constant may be achieved by incorporation of a 
suitable observer to estimate the offset. 

Notation 

The set $\mathbb{I}_M$ denotes the set of integers $\{0, 1, \dots, M\}$. $I_n$ denotes the identity matrix in $\mathbb{R}^{n \times n}$. $\mathbf{z}$ denotes a signal (or time sequence) $\mathbf{z} = \{z(0), z(1), \dots\}$, whose cardinality is inferred from the context. A signal that depends on a parameter $\theta$ is denoted as $\mathbf{z}(\theta)$ and $z(i; \theta)$ denotes its $i$th element. A closed polyhedron $X \subset \mathbb{R}^n$ is a set that results from the intersection of a finite number of closed half-spaces as follows: $X = \bigcap_i \{x : F_i x \le f_i\}$, where $F_i \in \mathbb{R}^{1 \times n}$ and $f_i \in \mathbb{R}$.

Problem Statement 

In this article, for the sake of simplicity, we 
consider that the system to be controlled can 
be modeled as a linear time-invariant system 
described by a discrete-time state-space linear 
model: 

$$
\begin{aligned}
x(k+1) &= A x(k) + B u(k) && \text{(1a)} \\
y(k) &= C x(k) && \text{(1b)}
\end{aligned}
$$

where $x(k) \in \mathbb{R}^n$, $u(k) \in \mathbb{R}^m$, and $y(k) \in \mathbb{R}^p$ are the state, the manipulable inputs, and the outputs of the system at time step $k$, respectively. This model will be used to calculate the predictions in the predictive controller.

The evolution of the plant must be such that 
the constraint 

$$(x(k), u(k)) \in Z \tag{2}$$

is satisfied for all $k \ge 0$. The set $Z$ is a closed polyhedron. Without loss of generality, we assume that $(0, 0) \in Z$.
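
As a minimal sketch of model (1) and constraint (2), the following Python fragment propagates the state one step and tests membership in a polyhedron written as $F z \le f$ with $z = (x, u)$. The numerical data are assumptions chosen for illustration (a single-input double integrator sampled with unit period and box constraints), and the helper names `step` and `in_polyhedron` are hypothetical.

```python
def step(A, B, x, u):
    """One step of x(k+1) = A x(k) + B u(k), using plain Python lists."""
    n = len(A)
    return [
        sum(A[i][j] * x[j] for j in range(n))
        + sum(B[i][j] * u[j] for j in range(len(u)))
        for i in range(n)
    ]


def in_polyhedron(F, f, z, tol=1e-12):
    """True if every row inequality F_i z <= f_i holds."""
    return all(
        sum(Fi[j] * z[j] for j in range(len(z))) <= fi + tol
        for Fi, fi in zip(F, f)
    )


# Assumed data: sampled double integrator with a single input.
A = [[1.0, 1.0], [0.0, 1.0]]
B = [[0.5], [1.0]]

# Box |x1| <= 5, |x2| <= 5, |u| <= 0.3 written as F z <= f, z = (x1, x2, u).
F = [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]]
f = [5, 5, 5, 5, 0.3, 0.3]

x = [1.0, 0.0]
u = [0.2]
x_next = step(A, B, x, u)            # successor state under the model
ok_now = in_polyhedron(F, f, x + u)  # (x, u) in Z ?
too_large = in_polyhedron(F, f, [1.0, 0.0, 0.5])  # input bound violated
```

A predictive controller would evaluate both functions along the whole prediction horizon; here they are applied to a single step only.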

The main objective of tracking model predictive control is to steer the system output to the reference, that is, steer the tracking error $y - r$ to zero, while the constraints are satisfied. In order to predict the expected evolution of the tracking error, some assumptions on the future values of the reference must be considered. Since the reference may differ from expected, the tracking problem is inherently uncertain.

Thus, assuming that the reference signal is known a priori, $\mathbf{r} = \{r(0), r(1), \dots\}$, the tracking model predictive control law $\kappa(x(k), \mathbf{r})$ must be designed to ensure that the resulting controlled system

$$x(k+1) = A x(k) + B \kappa(x(k), \mathbf{r}), \qquad y(k) = C x(k)$$

satisfies the constraints, i.e., $(x(k), u(k)) \in Z$ for all $k \ge 0$, is stable, and, if possible, has a controlled output that converges to the reference, that is,

$$\lim_{k \to \infty} \|y(k) - r(k)\| = 0.$$

It is assumed that the system is stabilizable 
and that the outputs are linearly independent. It 
is also considered that the state is measured and 
available at each sample. 

Tracking MPC for a Constant 
Reference 

The simplest tracking problem is to consider that the reference signal remains constant in the future and equal to the current value of the reference, i.e., $r(k) = r$. This control problem is very common in the process industry, for instance, where processes are typically designed to operate at a certain equilibrium point.

Determining the Set Point 

Corresponding to each value $r$ of the reference is a set point $(x_r, u_r)$ that is ideally an equilibrium point of the prediction model, i.e., it satisfies

$$x_r = A x_r + B u_r. \tag{3}$$

The set point $(x_r, u_r)$ is also required to satisfy

$$y_r = C x_r = r \tag{4}$$

and

$$(x_r, u_r) \in Z$$

so that the tracking error $y - r$ is zero and the constraint (2) is satisfied at the set point. Because the set point is an equilibrium point, the tracking error remains zero once the set point is reached.

Conditions for the existence of a set point possessing the above properties are given in Rawlings and Mayne (2009, Lemma 1.14).
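
When the number of outputs equals the number of inputs, conditions (3) and (4) form a square linear system in $(x_r, u_r)$. The sketch below solves it with a small Gauss-Jordan routine. It is only an illustration under stated assumptions: the plant is a single-input double integrator with unit sample time (assumed data), the constraint $(x_r, u_r) \in Z$ is not checked, and `solve_linear` and `set_point` are hypothetical helper names.

```python
def solve_linear(M, b):
    """Gauss-Jordan elimination with partial pivoting for a square system M z = b."""
    n = len(M)
    aug = [list(map(float, row)) + [float(bi)] for row, bi in zip(M, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[piv][col]) < 1e-12:
            raise ValueError("singular system: no unique set point")
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def set_point(A, B, C, r):
    """Solve (A - I) x_r + B u_r = 0 and C x_r = r for (x_r, u_r)."""
    n, m = len(A), len(B[0])
    rows = [
        [A[i][j] - (1.0 if i == j else 0.0) for j in range(n)] + list(B[i])
        for i in range(n)
    ]
    rows += [list(Ci) + [0.0] * m for Ci in C]
    rhs = [0.0] * n + list(r)
    z = solve_linear(rows, rhs)
    return z[:n], z[n:]


# Assumed data: sampled double integrator, single input, position output.
A = [[1.0, 1.0], [0.0, 1.0]]
B = [[0.5], [1.0]]
C = [[1.0, 0.0]]

x_r1, u_r1 = set_point(A, B, C, [-2.0])
x_r2, u_r2 = set_point(A, B, C, [4.5])
```

For this assumed plant every equilibrium has zero velocity and zero input, so the set point is simply the requested position.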

In practice, the condition $(x_r, u_r) \in Z$ is replaced by $(x_r, u_r) \in Z_s \subset \mathrm{interior}\{Z\}$ in order to ensure that the constraint $(x_r, u_r) \in Z$ is not active at the set point, and the tracking error requirement is slightly relaxed so that the set point is determined by solving

$$(x_r, u_r) = \arg\min \ell_t(x_s, u_s, r) \tag{5}$$

where $\ell_t$ is a convex function, typically a quadratic function as follows:

$$\ell_t(x_s, u_s, r) = \|C x_s - r\|^2_{Q_s} + \cdots$$

This problem is referred to as the steady-state target optimization problem (Rao and Rawlings 1999).
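
For a plant whose equilibria form a one-parameter family, problem (5) reduces to a projection. The sketch below assumes a single-input double integrator, whose equilibria are $x_s = (p, 0)$, $u_s = 0$, a tightened bound $|p| \le p_{\max}$ standing in for $Z_s$, and only the output-error term of $\ell_t$ with unit weight. These are illustrative assumptions, and both function names are hypothetical. A coarse exhaustive search is included as a cross-check.

```python
def steady_state_target(r, p_max):
    """Minimise (p - r)^2 over |p| <= p_max; equilibria are x_s = (p, 0), u_s = 0."""
    p = min(max(r, -p_max), p_max)  # projection of r onto the admissible interval
    return [p, 0.0], [0.0]


def brute_force_target(r, p_max, grid=981):
    """Cross-check: exhaustive search over a uniform grid of candidate positions."""
    candidates = [-p_max + 2.0 * p_max * i / (grid - 1) for i in range(grid)]
    return min(candidates, key=lambda p: (p - r) ** 2)


# A reachable reference is returned unchanged ...
x_s_ok, u_s_ok = steady_state_target(4.5, 4.9)
# ... while an unreachable one is replaced by the closest admissible target.
x_s_far, u_s_far = steady_state_target(7.0, 4.9)
p_check = brute_force_target(7.0, 4.9)
```

The second call shows why the relaxed formulation is useful: a reference outside the admissible set still yields a well-defined target.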

Model Predictive Controller Design 

If the reference to be tracked is a constant, i.e., $r(k) = r$ for all $k$, then the control objective is to stabilize the system and steer the initial state $x(0)$ to the set-point state $x_r$. As is usual in model predictive control, a finite horizon optimization problem that depends on the current state $x$ and the constant reference $r$ is solved, yielding a control sequence $\mathbf{u}^o(x, r) = \{u^o(0; x, r), u^o(1; x, r), \dots, u^o(N-1; x, r)\}$ and the associated state trajectory $\mathbf{x}^o(x, r) = \{x^o(0; x, r) = x, x^o(1; x, r), \dots, x^o(N; x, r)\}$, where $N$ is the prediction horizon. The first element of the control sequence, namely, $u^o(0; x, r)$, is applied to the system.

Because the reference is constant, the appropriate optimal control problem $P_N(x, r)$ is a slight variation of that discussed in the article ► Nominal Model-Predictive Control and is defined by

$$
\begin{aligned}
\min_{\mathbf{u}} \quad & \sum_{j=0}^{N-1} \ell(x(j), u(j), r) + V_f(x(N), r) \\
\text{s.t.} \quad & x(0) = x, && \text{(6a)} \\
& x(j+1) = A x(j) + B u(j), \quad j \in \mathbb{I}_{N-1} && \text{(6b)} \\
& (x(j), u(j)) \in Z, \quad j \in \mathbb{I}_{N-1} && \text{(6c)} \\
& x(N) \in X_f(r) && \text{(6d)}
\end{aligned}
$$

The stage cost function $\ell(\cdot)$ is a measure of the predicted tracking error with respect to the set point, that is, $\ell(x_r, u_r, r) = 0$ and $\ell(x, u, r) \ge \alpha_1(\|x - x_r\|)$. The terminal cost function $V_f(\cdot)$ is such that

$$\alpha_2(\|x - x_r\|) \le V_f(x, r) \le \alpha_3(\|x - x_r\|).$$

The functions $\alpha_i$ are $\mathcal{K}_\infty$ functions (see the article ► Nominal Model-Predictive Control). The set of states where this optimization problem is feasible is denoted as $X_N(r)$.

The solution of the optimal control problem $P_N(x, r)$ yields the receding horizon control law

$$\kappa_N(x, r) = u^o(0; x, r)$$

and the system under model predictive control satisfies

$$x(k+1) = A x(k) + B \kappa_N(x(k), r). \tag{7}$$

Because the horizon $N$ is finite, $x_r$ is not necessarily asymptotically stable for this system, but asymptotic stability can be ensured if the terminal cost function $V_f(\cdot)$ and the terminal region $X_f(r)$ are chosen appropriately.

The functions $\ell(\cdot)$, $V_f(\cdot)$ and the set $X_f(r)$ must satisfy the following condition.

Stability conditions for nominal MPC: For all $x \in X_f(r)$, there exists a control input $u$ such that $(x, u) \in Z$ and the successor state $x^+ = Ax + Bu$ is contained in $X_f(r)$ and

$$V_f(x^+, r) - V_f(x, r) \le -\ell(x, u, r).$$

These conditions are trivially satisfied taking $X_f(r) = \{x_r\}$ and $V_f(x, r) = 0$.
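
The receding-horizon scheme with the trivial terminal choice (terminal equality constraint and zero terminal cost) can be sketched in a few lines of Python. This is only an illustration under loudly stated assumptions: the plant is a hypothetical scalar integrator $x^+ = x + u$ with $|u| \le 1$, the stage cost is $(x - x_r)^2 + u^2$, and the optimization is carried out by exhaustive enumeration over a coarse input grid instead of the quadratic program a real implementation would use. The names `solve_ocp` and `closed_loop` are hypothetical.

```python
from itertools import product

U_GRID = [-1.0, -0.5, 0.0, 0.5, 1.0]  # assumed discretised input set, |u| <= 1
N = 3                                  # prediction horizon


def solve_ocp(x, r):
    """Return (optimal cost, input sequence), or None if the problem is infeasible.

    Assumed plant x+ = x + u with set point x_r = r, u_r = 0, stage cost
    (x - x_r)^2 + u^2, terminal constraint x(N) = x_r and V_f = 0.
    """
    best = None
    for useq in product(U_GRID, repeat=N):
        xj, cost = x, 0.0
        for u in useq:
            cost += (xj - r) ** 2 + u ** 2
            xj = xj + u
        if abs(xj - r) > 1e-9:  # terminal equality constraint violated
            continue
        if best is None or cost < best[0]:
            best = (cost, useq)
    return best


def closed_loop(x0, r, steps):
    """Receding horizon: apply only the first input of each optimal sequence."""
    xs, costs = [x0], []
    x = x0
    for _ in range(steps):
        sol = solve_ocp(x, r)
        if sol is None:
            raise RuntimeError("optimal control problem infeasible")
        cost, useq = sol
        costs.append(cost)
        x = x + useq[0]
        xs.append(x)
    return xs, costs


xs, costs = closed_loop(3.0, 0.0, 6)
```

In this run the state reaches the set point and the optimal cost does not increase from one step to the next. A state farther than $N$ input-bounds away from the set point, such as `solve_ocp(4.0, 0.0)`, returns `None` because the terminal constraint cannot be met.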

Under these assumptions, the optimization problem is recursively feasible, i.e., if $P_N(x(0), r)$ is feasible, then all subsequent problems $P_N(x(i), r)$ are also feasible. Moreover, the optimal cost function is a Lyapunov function of the system (7). Then, the set point $(x_r, u_r)$ is an asymptotically stable equilibrium point of the system (7) and the domain of attraction is $X_N(r)$.

Tracking MPC for a Changing
Reference

The previous predictive controller is inherently deterministic, since it is assumed that the reference is known and will remain constant in the future. However, in a realistic scenario, the reference may be changed without a predefined deterministic law or even randomly. In this section, a tracking predictive controller, for the case when the reference is constant or varying but ultimately constant, is presented.


Feasibility and Stability Issues

If the reference $r$ is constant, tracking MPC ensures asymptotic stability of the target state $x_r$ and convergence to zero of the tracking error $y - r$. However, if the reference $r$ varies, recursive feasibility (i.e., feasibility of $P_N(x(k), r(k))$ at each time instant $k$) and asymptotic stability may be compromised. For each value of $r$, the feasibility region $X_N(r)$ is the set of states for which $P_N(x, r)$ has a solution; it is also the domain of attraction for the closed-loop system (7). If $r$ changes value from $r_1$ to $r_2$, the terminal constraint set $X_f(r_2)$ and the terminal cost function $V_f(\cdot, r_2)$ have to be computed. The current state $x$, which lies in $X_N(r_1)$, does not necessarily lie in $X_N(r_2)$, so that $\kappa_N(\cdot, r_2)$ is undefined and the model predictive controller fails.

This phenomenon is illustrated for the double integrator system where

$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 \end{bmatrix}$$

and the set of constraints is given by

$$Z = \{(x, u) : \|x\|_\infty \le 5, \ \|u\|_\infty \le 0.3\}.$$

The initial state is $x(0) = (2.91, -1.83)$ and the initial value of the reference is $r_1 = -2$. The corresponding set point is $(x_{r_1}, u_{r_1})$ where $x_{r_1} = (-2, 0)$ and $u_{r_1} = (0, 0)$. If the reference changes from $r_1$ to $r_2$, the new set point is $(x_{r_2}, u_{r_2})$ where $x_{r_2} = (4.5, 0)$ and $u_{r_2} = (0, 0)$. The horizon is chosen to be $N = 3$ and the domains of attraction for the two values of $r$ are, respectively, $X_3(r_1)$ and $X_3(r_2)$. These two domains, $X_3(r_1)$ and $X_3(r_2)$, are disjoint. While $r = r_1$, the state trajectory commencing at $x(0) \in X_3(r_1)$ remains in $X_3(r_1)$. If $r$ subsequently changes its value to $r_2$ at time $t_1$, the model predictive controller fails since $x(t_1)$ does not lie in $X_3(r_2)$. This is illustrated in Fig. 1.

Tracking Model Predictive Control, Fig. 1 Example of the double integrator: terminal regions ($X_f(r_1)$ and $X_f(r_2)$) and domains of attraction of MPC ($X_3(r_1)$, $X_3(r_2)$, and $X_{10}(r_2)$)

These feasibility and stability issues can be overcome if the predictive controller is redesigned for the new set point. This would require the calculation of a new terminal set and a prediction horizon each time the set point changes. For instance, in the example of Fig. 1, if the terminal constraint is recalculated for $r_2$ and the prediction horizon is chosen as $N = 10$, then the MPC controller steers the system to the reference $r_2$ since $x(0) \in X_{10}(r_2)$. This recalculation can be done off-line if the set-point changes are known a priori (Findeisen et al. 2000; Wan and Kothare 2003). Other methods to avoid this issue are designing a predictive controller that provides a certain degree of robustness to set-point variations (Pannocchia 2004; Pannocchia and Kerrigan 2005), adding a mode to the predictive control law to recover recursive feasibility (Chisci and Zappa 2003; Rossiter et al. 1996), or using specialized predictive control laws (Magni and Scattolini 2005; Magni et al. 2001). Another solution is to use a reference governor together with a predictive controller (Bemporad et al. 1997; Olaru and Dumur 2005).
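
The loss of feasibility after a reference change, and its repair by a longer horizon, can be reproduced on a much simpler plant than the double integrator. The sketch below assumes a hypothetical scalar integrator $x^+ = x + u$ with $|u| \le u_{\max}$, a terminal equality constraint $x(N) = r$, and no state constraints, so that the feasibility region is the interval $|x - r| \le N u_{\max}$. The reference values $-2$ and $4.5$ are borrowed from the example above; the input bound and the function names are assumptions.

```python
def feasible(x, r, N, u_max):
    """x lies in X_N(r) for x+ = x + u, |u| <= u_max, terminal constraint x(N) = r."""
    return abs(x - r) <= N * u_max + 1e-12


def domain(r, N, u_max):
    """End points of the interval X_N(r)."""
    return (r - N * u_max, r + N * u_max)


u_max = 1.0          # assumed input bound
r1, r2 = -2.0, 4.5   # reference before and after the change
x_t1 = r1            # state has settled at the old set point when r changes

dom1 = domain(r1, 3, u_max)
dom2 = domain(r2, 3, u_max)
disjoint = dom1[1] < dom2[0]                  # X_3(r1) and X_3(r2) do not overlap
fails_N3 = not feasible(x_t1, r2, 3, u_max)   # controller designed for N = 3 fails
works_N10 = feasible(x_t1, r2, 10, u_max)     # redesign with N = 10 recovers feasibility
```

The longer horizon enlarges the feasibility region enough to contain the current state, at the price of a larger optimization problem.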


Stabilizing MPC for Tracking 

The idea behind the reference governor is to introduce an artificial reference $r_a$ that is manipulated to ensure that the current state is in the domain of attraction $X_N(r_a)$ while $r_a$ tends to the actual reference $r$ if $r$ remains constant or tends to a constant. In Limon et al. (2008), this idea is used to formulate the MPC for tracking. The artificial reference $r_a$ is an extra decision variable in the optimal control problem to avoid the loss of feasibility issue. In order to enforce convergence to the actual reference $r$, a term $\ell_o(r_a, r)$ that penalizes the deviation between the artificial reference $r_a$ and the actual reference $r$ is added. This function is assumed to be convex in $r_a$. A suitable choice of this term is the cost function of the steady-state target calculator (5), i.e., $\ell_o(r_a, r) = \ell_t(x_{r_a}, u_{r_a}, r)$, where $(x_{r_a}, u_{r_a})$ is the artificial set point associated with the artificial reference $r_a$.


The optimal model predictive control problem $P_N^t(x, r)$ for tracking is given by

$$
\begin{aligned}
\min_{\mathbf{u}, r_a} \quad & \sum_{j=0}^{N-1} \ell(x(j), u(j), r_a) + V_f(x(N), r_a) + \ell_o(r_a, r) \\
\text{s.t.} \quad & x(0) = x, && \text{(8a)} \\
& x(j+1) = A x(j) + B u(j), \quad j \in \mathbb{I}_{N-1} && \text{(8b)} \\
& (x(j), u(j)) \in Z, \quad j \in \mathbb{I}_{N-1} && \text{(8c)} \\
& r_a \in \mathcal{R} && \text{(8d)} \\
& (x(N), r_a) \in T && \text{(8e)}
\end{aligned}
$$

where $\mathcal{R} = \{r : (x_r, u_r) \in Z_s, \ A x_r + B u_r = x_r, \ C x_r = r\}$.

Condition (8e) is an extended terminal constraint on both the terminal state $x(N)$ and the artificial reference $r_a$. The feasibility region of this optimization problem, $X_N^t$, is the set of states that can be steered to any reference of the set $\mathcal{R}$ in $N$ steps, that is,

$$X_N^t = \bigcup_{r_a \in \mathcal{R}} X_N(r_a).$$

The terminal cost function $V_f(\cdot)$ and the terminal constraint set $T$ must satisfy appropriately modified stability conditions in order to ensure recursive feasibility and asymptotic stability of $(x_r, u_r)$. The stability conditions are the following.

Stability conditions for tracking MPC: For all $(x, r_a) \in T$, there exists a $u$ satisfying:

(i) $(x, u) \in Z$

(ii) the successor state $x^+ = Ax + Bu$ is such that $(x^+, r_a) \in T$ and

$$V_f(x^+, r_a) - V_f(x, r_a) \le -\ell(x, u, r_a).$$

As shown in Limon et al. (2008), if the terminal control law is chosen as $u = K(x - x_{r_a}) + u_{r_a}$ with $K$ such that the eigenvalues of $A + BK$ are inside the unit disk, then the terminal set $T$ can be calculated using standard algorithms to compute positively invariant sets for constrained linear systems and it is a polyhedron. A simple choice of the terminal cost and constraint satisfying these assumptions is $V_f(\cdot) = 0$ and $T = \{(x, r_a) : x = x_{r_a}\}$.

Theorem 1 If the stability conditions for tracking MPC hold, then the predictive control law derived from the optimal control problem $P_N^t(x, r)$ is such that:

1. For all feasible initial states, i.e., $x(0) \in X_N^t$, and for all $r \in \mathbb{R}^p$, the optimization problem is recursively feasible, that is, if $P_N^t(x(0), r)$ is feasible, then all the subsequent problems $P_N^t(x(i), r)$ are also feasible.

2. If $r$ is admissible, i.e., $r \in \mathcal{R}$, then the set point $(x_r, u_r)$ is an asymptotically stable equilibrium point of the closed-loop system and the domain of attraction is $X_N^t$.

3. If $r$ is not admissible, that is, $r \notin \mathcal{R}$, then the set point $(x_{r^*}, u_{r^*})$ such that

$$r^* = \arg\min_{r_a \in \mathcal{R}} \ell_o(r_a, r)$$

is asymptotically stable and the domain of attraction is $X_N^t$.

4. The domain of attraction $X_N^t$ is larger than the domain of the nominal MPC for any reference $r \in \mathcal{R}$, that is, $X_N(r) \subset X_N^t$, and contains all the equilibrium points contained in $Z_s$.

5. If the reference $r(k)$ is not constant and converges to a steady value $r$, the optimization problem is recursively feasible and the set point $(x_r, u_r)$ is an asymptotically stable equilibrium point for all $x(0) \in X_N^t$.

In Fig. 2a the aforementioned properties are illustrated for the example of the double integrator. The MPC for tracking has been designed with the same prediction horizon $N = 3$ and the same terminal control law and terminal cost function as in the previous tracking MPC case. The initial state is also the same and the reference signal is $r(k) = r_2$ for $k < 30$ and $r(k) = r_1$ for $k \ge 30$. Notice that the tracking MPC cannot be used to do this without redesign. In Fig. 2a, it can be seen that the domain of attraction of the MPC for tracking, $X_3^t$, is larger than the domain provided by the standard tracking MPC, $X_3(r_1)$ or $X_3(r_2)$. This figure also shows the state portrait of the closed-loop trajectory. In Fig. 2b the trajectories of the reference signal $r$, the controlled output $y$, and the artificial target output $y_{r_a} = C x_{r_a}$ are depicted. Notice the role of the artificial target: $y_{r_a}$ differs from the reference in order to guarantee recursive feasibility and finally converges to the reference $r$ to enforce asymptotic stability.
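
The role of the artificial reference can be reproduced with a small Python sketch of problem (8) using the simple terminal choice $V_f = 0$, $x(N) = x_{r_a}$. Everything numerical here is an assumption made for illustration: a hypothetical scalar integrator $x^+ = x + u$ with $|u| \le 1$ and $|x| \le 5$, admissible references $|r_a| \le 4.9$, offset cost $\ell_o(r_a, r) = 10 (r_a - r)^2$, and exhaustive enumeration over a coarse input grid in place of a quadratic program. Because every state of an integrator is an equilibrium, the terminal equality fixes $r_a = x(N)$. The function names are hypothetical.

```python
from itertools import product

U_GRID = [-1.0, -0.5, 0.0, 0.5, 1.0]  # assumed discretised input set, |u| <= 1
N = 3                                  # prediction horizon
X_MAX = 5.0                            # assumed state bound, |x| <= 5
R_MAX = 4.9                            # assumed admissible references, |r_a| <= 4.9
W = 10.0                               # assumed offset-cost weight


def solve_tracking_ocp(x, r):
    """Return (cost, input sequence, artificial reference) or None if infeasible."""
    best = None
    for useq in product(U_GRID, repeat=N):
        traj = [x]
        for u in useq:
            traj.append(traj[-1] + u)
        r_a = traj[-1]  # terminal equality x(N) = x_ra fixes the artificial reference
        if abs(r_a) > R_MAX or any(abs(xj) > X_MAX for xj in traj):
            continue
        cost = W * (r_a - r) ** 2  # offset cost l_o(r_a, r)
        cost += sum((xj - r_a) ** 2 + u ** 2 for xj, u in zip(traj, useq))
        if best is None or cost < best[0]:
            best = (cost, useq, r_a)
    return best


def closed_loop(x0, r, steps):
    """Apply the first optimal input at each step; record states and r_a."""
    xs, r_as = [x0], []
    x = x0
    for _ in range(steps):
        cost, useq, r_a = solve_tracking_ocp(x, r)
        r_as.append(r_a)
        x = x + useq[0]
        xs.append(x)
    return xs, r_as


# The state sits at the old set point -2 when the reference jumps to 4.5.
xs, r_as = closed_loop(-2.0, 4.5, 10)
```

With a terminal equality on the actual reference and $N = 3$ this initial state would be infeasible. Here the problem stays feasible at every step because the artificial reference starts at a target reachable within the horizon and then moves toward the actual reference.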


Offset-Free Tracking 

In practice there may exist mismatches between the prediction model and the dynamics of the real plant to be controlled, due, for instance, to unmodeled nonlinearities or unmeasured disturbances. This would require designing the predictive controller to be robust to these uncertain effects. Assuming that the predictive controller based on nominal predictions is robustly stable and considering that the controlled system converges to a steady state, there may exist a steady-state error between the set point and the output.

This offset can be canceled by taking into account a prediction model corrected by a disturbance model (Pannocchia and Rawlings 2003). To achieve offset-free control, the disturbance is assumed to be an integrating disturbance as follows:

$$
\begin{aligned}
x(k+1) &= A x(k) + B u(k) + B_d d(k) && \text{(9a)} \\
d(k+1) &= d(k) && \text{(9b)} \\
y(k) &= C x(k) + D_d d(k) && \text{(9c)}
\end{aligned}
$$

Matrices $B_d$ and $D_d$ define the disturbance model and these are chosen to guarantee offset-free control. They are typically chosen as $B_d = 0$ and $D_d = I_p$.

The disturbance signal $d(k)$ is estimated using an observer based on the disturbance model. The disturbance model and the estimator gains can be calculated separately, but this may lead to poor closed-loop performance. A joint design procedure has been proposed in Pannocchia and Bemporad (2007).

Tracking Model Predictive Control, Fig. 2 The double integrator controlled by the MPC for tracking. (a) Comparison of the domains of attraction of the tracking MPC $X_3(r_1)$ and $X_3(r_2)$ vs. the domain of attraction of the MPC for tracking $X_3^t$. (b) Trajectories of the reference, the controlled output, and the artificial reference $r_a$


Once the estimated disturbance $\hat{d}$ is available, the corrected prediction model (9) must be used to calculate the MPC target $(x_r, u_r)$ in the steady-state target optimization problem (5) and to calculate the predictions in the optimization problem $P_N(x, x_r, u_r, \hat{d})$.
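
The effect of the disturbance estimate on the target calculation can be seen in a scalar Python sketch. It assumes a hypothetical prediction model $x^+ = 0.5x + u$, $y = x$, a plant whose output is shifted by an unmeasured constant $d = 0.3$, the typical choice $B_d = 0$, $D_d = 1$, a measured state, and a linear feedback toward the target as a stand-in for the constrained MPC law. The gains and the function name `simulate` are assumptions.

```python
def simulate(use_estimator, steps=60):
    """Return the final tracking error y - r of the closed loop."""
    a, b = 0.5, 1.0   # assumed prediction model x+ = a x + b u, y = x
    d_true = 0.3      # unmeasured constant output disturbance acting on the plant
    r = 1.0           # constant reference
    K, L = -0.3, 0.5  # assumed feedback and observer gains (a + b K = 0.2)
    x, d_hat = 0.0, 0.0
    for _ in range(steps):
        y = x + d_true                    # measured output with B_d = 0, D_d = 1
        if use_estimator:
            d_hat += L * (y - x - d_hat)  # integrating-disturbance observer
        x_s = r - d_hat                   # target: C x_s + D_d d_hat = r
        u_s = (1.0 - a) * x_s / b         # equilibrium input: x_s = a x_s + b u_s
        u = u_s + K * (x - x_s)           # stand-in for the MPC control law
        x = a * x + b * u
    return x + d_true - r


offset_without = simulate(use_estimator=False)
offset_with = simulate(use_estimator=True)
```

Without the estimator the output settles with an offset equal to the disturbance; with the estimator the target is shifted by the estimate and the offset vanishes.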


Future Directions 

Tracking model predictive control is an inherently uncertain control problem due to the unexpected changes in the reference. Constant reference tracking has been widely studied and a number of effective solutions exist.

The case of trajectory tracking is not as mature as the set-point tracking case. If the reference signal is known a priori, this can be used to calculate the predicted cost. This control problem can be solved by using a two-layer structure: a trajectory planner on top of a predictive control law that steers the system to the trajectory target. Asymptotic stability of the trajectory target can be proved using a terminal equality constraint by resorting to the regulation problem. Another interesting line is to assume that the reference is the output of a certain dynamic system. For different families of trajectories, such as ramps or sinusoidal signals, Maeder and Morari (2010) have proposed a reference tracking MPC based on extended disturbance models.

The problem of tracking MPC in case of 
unknown (or changing) reference signals can be 
considered an open problem that deserves more 
research efforts. 

Another interesting control problem is the 
tracking of unreachable (equilibrium point as 
well as trajectory) targets. Recently this problem 
has been posed as an economic model predictive 
control problem (Rawlings and Mayne 2009). 
Therefore, the stabilizing design of economic 
MPC presented in Angeli et al. (2012) can be 
extended to the case of tracking unreachable 
targets. 

Cross-References 

► Economic Model Predictive Control 

► Nominal Model-Predictive Control 

► Regulation and Tracking of Nonlinear Systems 

► Tracking and Regulation in Linear Systems 

Recommended Reading 

The book Camacho and Bordons (2004) covers the classic approach to the tracking MPC. In Rawlings and Mayne (2009), the authors deal with the tracking MPC in a very general and clear way and survey existing results on stability, target calculation, and offset-free control for linear and nonlinear models. In Muske (1997), the reachability of set points is studied and in Rao and Rawlings (1999), the target calculation problem. Disturbance models are widely analyzed in Pannocchia and Rawlings (2003), Pannocchia and Bemporad (2007), Maeder et al. (2009), and Maeder and Morari (2010). Another offset-free MPC based on the internal model principle can be found in Magni and Scattolini (2007). Further results on MPC for tracking are addressed in Ferramosca et al. (2009). A survey on the MPC for tracking can be found in Limon et al. (2012).

Abstract

Automotive transmissions are fundamental components in modern vehicles. They are required to make the engine operate at the most efficient operating point for providing the necessary torque at the wheels while minimizing fuel consumption. Moreover, transmissions should be able to smooth or filter out power source torque oscillations that can appear in the driveline. To achieve such objectives, the automotive industry has looked at different technological solutions. The introduction of electronically controlled transmissions has widened the range of possible solutions, many of which would not have been implementable without the flexibility and the performance of electronic control. Thus, recent technological developments of automotive transmissions have given engineers and control scientists the opportunity to investigate challenging control problems with everyday practical applications.

Keywords 

Actuators; Clutch engagement; Driveline; Dry 
clutch; Electrohydraulic; Electronic control; Gear 
shifting; Multivariable control; Optimal control; 
Powertrain; Sensors; Smart materials; Torque 
converter 


Introduction 

In motor vehicles, the transmission is an important system that transfers the power generated by the internal combustion engine to the wheels, according to the driver’s requests. The transmission, together with the engine, the driveshaft, differential, and driven wheels, constitutes the powertrain (sometimes driveline or drive train is used to denote the powertrain excluding the engine and the transmission). The first fundamental objective of a transmission is to adjust the ratio between the wheel speed and the engine speed in order to achieve the optimal operating point of the engine, independently of the vehicle velocity. Indeed, typical internal combustion engines provide low torques at low engine speeds, and, thus, it is necessary to amplify the torque by making the engine work at higher speeds when the vehicle is at low speeds, e.g., during a launch from standstill. From an equivalent point of view, the transmission allows the engine torque transferred to the wheels to be amplified when higher accelerations are needed. For such reasons, every type of transmission has some devices that allow selection of the ratio between its input shaft angular speed (engine side) and its output shaft angular speed (the side toward the wheels). The transmission’s input shaft is connected to the flywheel of the engine, while the output shaft of the transmission is connected to the final drive (containing the differentials) through the drive shaft. (In British English, the term propeller shaft is also used when dealing with a rear-wheel-driven vehicle.) Even though such subsystems are differently located depending on the vehicle layout (front-wheel, rear-wheel, or even all-wheel driven), they all are parts of the powertrain and determine its behavior.
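
The speed and torque relations described above can be illustrated with a short Python sketch. It assumes an ideal rigid gear train in which the speed ratio is the product of the selected gear ratio and the final drive ratio, and in which losses are lumped into a single efficiency factor. The function names, the gear ratio 3.5, the final drive ratio 4.0, and the operating values are hypothetical numbers chosen only for illustration.

```python
def engine_speed_rpm(wheel_speed_rpm, gear_ratio, final_drive_ratio):
    """Engine speed needed for a given wheel speed (ratios defined as input/output speed)."""
    return wheel_speed_rpm * gear_ratio * final_drive_ratio


def wheel_torque_nm(engine_torque_nm, gear_ratio, final_drive_ratio, efficiency=1.0):
    """Torque at the wheels; the speed reduction multiplies the engine torque."""
    return engine_torque_nm * gear_ratio * final_drive_ratio * efficiency


# Launch from standstill in a low gear: the wheels turn slowly while the
# engine runs at a speed where it can deliver useful torque.
speed_low_gear = engine_speed_rpm(100.0, 3.5, 4.0)
torque_ideal = wheel_torque_nm(100.0, 3.5, 4.0)
torque_lossy = wheel_torque_nm(100.0, 3.5, 4.0, efficiency=0.9)
```

With these assumed numbers a wheel speed of 100 rpm corresponds to an engine speed of 1400 rpm, and the same overall ratio multiplies the engine torque at the wheels.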

Transmissions can also be viewed as systems that allow the transfer of power from the engine to the vehicle in a smooth and efficient way. In order to achieve such basic and fundamental objectives as well as to improve fuel economy, performance, and drivability, many technologies have been introduced into the market of automotive transmissions.


Types of Transmissions 

In manual transmissions (MT), a set of gears provides the different speed conversion ratios, and any gear can be selected by the driver by acting on the shift lever. To interrupt the power flow during the gear selection, a clutch is required that disconnects the transmission from the flywheel of the engine and reconnects it just after the selection of the new gear. All such operations are performed manually by the driver.

Automatic transmissions (AT) are the other well-known type of automotive transmissions. Hydraulic ATs do not have a clutch for connecting the transmission to the flywheel; instead they have a torque converter that provides the fluid coupling between the transmission and the engine, realizing both a damping of the powertrain vibrations and a torque multiplication. Moreover, in ATs, a set of planetary gears allows the selection of different gear ratios. The driver selects only the operation mode, and the selection of the gear is implemented through the electronic control. The main limits of these transmissions are the low efficiency (particularly due to the slip of the torque converter), a larger space requirement, and a higher weight. The new generations of automatic transmissions have reduced such disadvantages, thanks to the use of lightweight materials and more shifting steps, and, above all, the replacement of conventional hydraulic components by electronic and electrohydraulic counterparts.

Indeed, electronic transmission control not only improves fuel economy, performance, and drivability, but it also gives flexibility and new possibilities (e.g., diagnostics, fault detection, integration with other subsystems) that outweigh its intrinsic disadvantages, such as complexity and development cost (Deur et al. 2006). Electronic transmission control also played a fundamental role in the introduction of new technologies that have exploited automatic control techniques. Thus, in recent years, continuously variable transmissions, automated manual transmissions, dual clutch transmissions, and electrically variable transmissions have appeared in the market, which was traditionally dominated by ATs and MTs (Sun and Hebbale 2005).

An automated manual transmission (AMT) system can be viewed as an MT with some controlled actuators as add-ons: it still has a (dry or wet) clutch assembly and a multispeed gearbox, both of which are equipped with electromechanical or electrohydraulic actuators which are commanded by an electronic control unit. In AMTs the gearshift can be decided automatically by the transmission control unit (TCU) or even manually by the driver. In both cases after the gearshift command, the TCU manages all the shifting steps, through suitable signals sent to the engine, the clutch assembly, and the gearbox. This technology has the advantages of lower weight, lower costs, and higher efficiency with respect to ATs.

It is worth highlighting one limitation of AMTs: the reduction in driving comfort caused by lack of traction during gear shift actuation. Indeed, the torque interruption leads to perceived jerks due to the discontinuity in vehicle acceleration, in contrast to the smoother behavior of conventional automatic transmissions with torque converters (Lucente et al. 2007).

Automated manual transmissions have become popular in Europe for their higher performance with respect to MTs and for their lower cost compared to ATs. In North America, instead, their use is limited because of the torque interruption during shifts that causes some discomfort.

An offshoot of the AMT is the dual clutch transmission (DCT), in which the gearbox assembly has two separate and independent clutches, one for odd gears and one for even gears. In a DCT, shifts can be achieved without a noticeable torque gap, by applying the engine torque to one clutch while the engine torque is being disconnected from the other clutch. The result is gentle, jerk-free gear shifts with the same comfortable driving of an automatic transmission combined with the efficiency and the performance of an economical manual transmission. In both DCT and AMT, electronic control (in particular aimed at solving the clutch engagement control problem) is the key to ensuring a smooth torque transfer.

As a further transmission technology, the continuously variable transmission (CVT) enables the engine to operate in a wide range of speed and load conditions independently from the speed and the torque requests of the vehicle. A modern CVT system consists of a steel belt that runs between two variable-width pulleys. The distance between pulley cones can be varied to change the gear ratio between shafts, thus generating an infinite number of “gears.” A CVT is less efficient than a standard discrete AT due to the losses in the belt-pulley system, but it can improve fuel economy by making the engine work in better operating conditions. Related to the CVT is the electrically variable transmission (EVT), which appeared in the market recently with hybrid electric vehicles and uses electric machines, namely motors/generators with planetary gear sets, so as to enable the function of CVTs with flexibility, controllability, and better performance. This type of transmission is usually found in hybrid electric vehicles.

By looking at the different types of automotive transmissions, it is possible to classify electronically controlled transmissions into two groups: discrete ratio and continuously variable transmissions. The first group deals with the problem of automating the shift scheduling (“when-to”) or also controlling the shift execution (“how-to”) (Hrovat and Powers 1988). The latter group deals with control problems that live in a continuous domain (like classical “process control”), and that allows the design of simpler control software. Indeed the discrete ratio transmissions are more complex to control since they determine many large transients of short duration due to gear shifts, and moreover their intrinsic mixed discrete-continuous nature gives rise to dynamic systems in which continuous time dynamics interact with discrete event dynamics. In other words such class of controlled transmissions can be considered a significant application of what control theory calls hybrid systems. This makes transmission control a very interesting and challenging control engineering problem.


Control Problems for Automotive 
Transmissions 

Electronic control applied to automotive transmissions enables improved
efficiency and fuel economy, better shift quality and comfort, and
flexible driving. To achieve these objectives, different approaches can be
used for designing suitable control laws, and a hierarchical approach is
often required to deal with the complexity of the several problems. Thus,
recently produced cars have, alongside the engine control unit, a TCU that
manages all transmission operations and sends command signals to the
actuators in order to obtain the desired behavior. Some dedicated devices
are then available for tackling challenging control problems and for
trying to solve them by exploiting classical feedback and/or modern
model-based control.

Low Level Control

In electronically controlled transmissions, many hydraulic functions of
conventional transmissions have to be replaced by electrohydraulic
systems. It is therefore a fundamental requirement to be able to control
the actuators in a suitable way. Classical PID regulators are usually
employed for this type of low level control, whose aim is to regulate some
variables to reference values computed by a higher level controller. For
instance, in AMTs the concentric slave cylinder is controlled to make the
clutch disk follow a position reference signal computed by the TCU. The
clutch position reference can be obtained from a model of the clutch
transmission characteristic (Vasca et al. 2011), thus realizing a
feedforward/feedback architecture.
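
A feedforward/feedback architecture of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the scheme of Vasca et al. (2011): the torque map values, the first-order actuator model, the PI gains, and all function names are hypothetical.

```python
from bisect import bisect_left

# Hypothetical clutch characteristic: actuator position (mm) -> torque (Nm).
# Illustrative numbers only; a real table would come from calibration.
POSITION_MM = [0.0, 2.0, 4.0, 6.0, 8.0]
TORQUE_NM = [0.0, 10.0, 40.0, 100.0, 200.0]


def interp(x, xs, ys):
    """Piecewise-linear interpolation, clamped at the ends of the table."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, x)
    frac = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + frac * (ys[i] - ys[i - 1])


def position_reference(torque_request):
    """Feedforward part: invert the monotone torque map."""
    return interp(torque_request, TORQUE_NM, POSITION_MM)


def simulate(torque_request, steps=3000, dt=0.001, kp=8.0, ki=40.0, tau=0.05):
    """Feedback part: PI position loop around an assumed first-order actuator."""
    x_ref = position_reference(torque_request)
    x, integral = 0.0, 0.0
    for _ in range(steps):
        error = x_ref - x
        integral += error * dt
        u = kp * error + ki * integral   # PI command
        x += dt * (u - x) / tau          # actuator model: tau * dx/dt = u - x
    return x_ref, x
```

The higher level controller only issues a torque request; the table inversion turns it into a position reference and the low level loop tracks that reference.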

Other examples of low level controls in automotive transmissions are
related to the clutch fill process (Song et al. 2010) in ATs, the line
pressure control, or the CVT belt load control.

Calibration Process 

Most of the industrial control strategies applied to automotive
transmissions are based on feedforward/feedback architectures. Feedforward
control typically relies on detailed models of the transmission that quite
often consist of lookup tables rather than specific physical models.
Lookup tables are also used for implementing adaptive feedback
controllers, so the use of tables with calibrated variables is widespread
in automotive transmission control. With the increasing number of required
functionalities, the calibration process for transmission control
subsystems becomes more and more complex. It therefore becomes important
to investigate automated and systematic approaches to the calibration
process in order to improve reliability and performance and, above all, to
reduce the development time.

Of course, a different approach based on developing model-based control
strategies could be the way to reduce the number of calibration variables
and, thus, the calibration effort and time. The main obstacle is the
uncertain environment, which makes it very challenging to find robust
control solutions.

Gear Shifting 

The gear shift execution is a common problem in all discrete ratio
transmissions. Figures 1 and 2 show the schemes of two transmission
architectures. Although ATs and AMTs are completely different typologies,
the gear shift problem is almost the same from an abstract point of view:
commanding the actuator to obtain a desired torque at the primary shaft of
the transmission so that the old gear can be disengaged, neutral engaged,
and then the new gear engaged, without shuffles and while limiting the
jerk experienced by the driver. In ATs the basic idea is to control the
hydraulic pressures of the torque converter to transfer the power smoothly
from the engine to the driveline while minimizing the torque disturbance
at the output shaft. Analogously, in AMTs with a dry clutch, the actuator
is commanded to position the clutch disk toward the flywheel, exerting a
pressure that is transformed into the transmitted torque.

For instance, a wet clutch of an automatic 
transmission gives the following transmitted 
torque (Deur et al. 2006): 

$$T = n A_p\, p_{app}\, \mu(\omega, p_{app}, \theta)\, r_e\, \operatorname{sgn}(\omega)$$

where $n$ is the number of friction surfaces, $A_p$ is the piston area,
$p_{app}$ is the hydraulic pressure, $\mu$ is the friction coefficient
(depending also on the clutch fluid temperature $\theta$), $r_e$ is the
equivalent radius of the clutch, and $\omega$ is the clutch slip speed,
i.e., the difference between the engine speed and the speed of the input
shaft of the transmission.

For a dry clutch of an automated manual transmission (Vasca et al. 2011),

$$T = n F_{pp}(x_{to})\, \mu(\omega, \theta)\, r_e\, \operatorname{sgn}(\omega)$$

where $F_{pp}$ is the force exerted by the cushion spring, which depends
on the clutch actuator position $x_{to}$, and $\theta$ is the clutch disk
temperature.
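
The two torque expressions share the same structure, which a short sketch makes explicit. The friction coefficient is passed in already evaluated, since its dependence on slip speed, pressure, and temperature is not specified here; the function names and the numbers in the example are hypothetical.

```python
def sign(w):
    """sgn(w): -1, 0, or +1."""
    return (w > 0) - (w < 0)


def wet_clutch_torque(n, area_p, p_app, mu, r_e, slip):
    """T = n * A_p * p_app * mu * r_e * sgn(slip), valid while the clutch slips.

    mu is the friction coefficient evaluated at the current operating point.
    """
    return n * area_p * p_app * mu * r_e * sign(slip)


def dry_clutch_torque(n, f_pp, mu, r_e, slip):
    """T = n * F_pp * mu * r_e * sgn(slip); f_pp is the cushion spring force
    at the current actuator position."""
    return n * f_pp * mu * r_e * sign(slip)


# Illustrative values: 2 friction surfaces, 0.005 m^2 piston, 5 bar,
# mu = 0.1, equivalent radius 0.1 m, positive slip.
torque = wet_clutch_torque(2, 0.005, 5.0e5, 0.1, 0.1, slip=30.0)
```

In both cases the actuator variable (pressure or spring force) scales the torque, while the sign of the slip speed sets its direction.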

In both cases, the actuator allows regulation of the torque transmitted,
respectively, through the torque converter or the dry clutch under a
slipping condition. The main problem is that modern transmissions have no
low-cost torque sensors; because of this, and of model uncertainties and
highly variable operating conditions, it is not possible to regulate the
transmitted torque through a closed-loop scheme. What is usually done is
to control the engine speed and/or the speed of the input shaft of the
transmission. Quite often, their difference (the slip speed) is the
variable to be controlled.

Transmission, Fig. 1 Architecture of an automatic transmission

Transmission, Fig. 2 Architecture of an automated manual transmission

Many different approaches have been proposed in the literature for solving
such control problems, which can be formulated as simply as a regulation
problem for the slip speed or as a more complex multivariable control
problem that considers the engine and clutch torques as control variables
and the slip speed and vehicle speed as controlled variables, possibly
solved with robust control tools. The problem can also be formulated quite
naturally as an optimal control problem that aims to minimize the
engagement time, the driveline oscillations, or the dissipated energy. For
example, by defining the time derivative of the clutch torque as one
control variable, the transmitted torque becomes a state variable, and the
energy dissipated during the engagement phase can be expressed as the
integral of the product of two state variables (Garofalo et al. 2002)

$$E_d = \int_0^{t} \omega(s)\, T(s)\, ds,$$

and the clutch engagement can be expressed as an optimal control problem
with free final time (the engagement time, $t$) and a final state
constraint (i.e., $\omega(t) = 0$).
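
As a rough numerical illustration of the dissipated energy integral, the sketch below evaluates it with the trapezoidal rule for a made-up engagement profile. The profile (slip speed falling linearly to zero under constant torque) and the function name are assumptions chosen so that the result can be checked by hand.

```python
def dissipated_energy(slip, torque, t_end, steps=1000):
    """Trapezoidal approximation of the integral of slip(s) * torque(s)
    over the interval [0, t_end]."""
    h = t_end / steps
    total = 0.0
    for k in range(steps + 1):
        s = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * slip(s) * torque(s)
    return total * h


# Hypothetical engagement: slip speed falls linearly from 100 rad/s to 0
# in 1 s while the clutch transmits a constant 50 Nm.
t_f = 1.0
energy = dissipated_energy(lambda s: 100.0 * (1.0 - s / t_f), lambda s: 50.0, t_f)
```

For this profile the integral equals 100 * 50 * 1 / 2 = 2500 J, and the final state constraint of zero slip speed at the engagement time is met by construction.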

Some authors have also proposed a different solution for the gear shifting
problem that acts through the engine control (Pettersson and Nielsen
2000). The idea is to control the gear shift using the engine directly as
the actuator that modulates the torque transferred to the transmission
(see Figs. 1 and 2). In particular, the engine is controlled so as to
obtain zero transferred torque in the transmission, and then the neutral
gear is engaged. In this way, a virtual clutch is realized.

Driveline Modeling 

When model-based control is used for automotive transmissions, it is
important to have a driveline model that is detailed enough to capture the
main dynamics and, at the same time, simple enough to yield controllers
that are not too complex. Vehicular drivelines have many elastic parts
that give rise to mechanical resonances. Handling such resonances is
important for driveability but also for reducing mechanical stresses. Thus
driveline control is crucial not only during gear shifting but also for a
more general powertrain control that can manage wheel-speed oscillations
induced by sudden accelerations or by road roughness.
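
To give a feel for where such a resonance comes from, one common simplification (an assumption here, not a model taken from this article) lumps the driveline into two inertias connected by a torsional spring. The undamped natural frequency of that free-free system is sqrt(k (J1 + J2) / (J1 J2)). The parameter values below are purely illustrative.

```python
import math


def two_inertia_resonance(j1, j2, stiffness):
    """Undamped natural frequency (rad/s) of two inertias j1, j2 (kg m^2)
    coupled by a torsional spring of the given stiffness (Nm/rad)."""
    return math.sqrt(stiffness * (j1 + j2) / (j1 * j2))


# Hypothetical values: 0.5 and 2.0 kg m^2 joined by a 1000 Nm/rad shaft.
omega_n = two_inertia_resonance(0.5, 2.0, 1000.0)
frequency_hz = omega_n / (2.0 * math.pi)
```

A controller designed on a rigid driveline model would ignore this mode entirely, which is why even a coarse elastic model can matter for driveability.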

Integrated Powertrain Management 

Shift scheduling is an additional interesting problem of electronically
controlled transmissions. The shift point is usually based on some
measurements such as the vehicle speed, the maximum acceleration, or the
throttle angle. In this case the control strategy is open-loop and is
implemented through lookup tables.
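
A lookup-table shift schedule of this open-loop kind might look like the sketch below. The breakpoints, thresholds, hysteresis offset, and function names are all hypothetical calibration values chosen for illustration.

```python
from bisect import bisect_left

# Hypothetical calibration: throttle breakpoints (%) and, for each gear, the
# vehicle speed (km/h) above which an upshift is requested.
THROTTLE_PCT = [0.0, 50.0, 100.0]
UPSHIFT_KMH = {
    1: [15.0, 25.0, 40.0],
    2: [30.0, 45.0, 70.0],
    3: [50.0, 70.0, 110.0],
}
DOWNSHIFT_OFFSET_KMH = 8.0  # hysteresis so the gear does not hunt


def interp(x, xs, ys):
    """Piecewise-linear interpolation, clamped at the ends of the table."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, x)
    frac = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + frac * (ys[i] - ys[i - 1])


def next_gear(gear, speed_kmh, throttle_pct):
    """Return the gear requested by the schedule for the current measurements."""
    if gear in UPSHIFT_KMH:
        if speed_kmh > interp(throttle_pct, THROTTLE_PCT, UPSHIFT_KMH[gear]):
            return gear + 1
    if gear - 1 in UPSHIFT_KMH:
        up = interp(throttle_pct, THROTTLE_PCT, UPSHIFT_KMH[gear - 1])
        if speed_kmh < up - DOWNSHIFT_OFFSET_KMH:
            return gear - 1
    return gear
```

The schedule only answers the “when-to” question; the shift execution is left to the lower level controllers.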

As the number of gears increases, the shift schedule gets more
complicated, and it becomes important to take the actual driving scenario
into account. For example, entering a curve while driving uphill is quite
different from doing so downhill, so information on the steering angle,
the road grade, the vehicle acceleration, etc. can be very useful. More
information, together with the new degrees of freedom available in modern
vehicles (e.g., vehicles with electronic throttle control), gives the
opportunity to realize an integrated powertrain control that coordinates
the engine control and the transmission control, so that gear scheduling
and gear shift execution can be managed more flexibly in order to optimize
fuel consumption and improve driveability (Kim et al. 2007).

Diagnostics 

In all automotive applications, safety is a fundamental issue that becomes
more and more critical as the number of subsystems and their interactions
increase, as happens when electronic control is introduced. Diagnosing
faults of control systems is thus a challenging problem, in particular
when there is limited information. To this end, systems and control theory
can be very useful for designing observers or fault detection algorithms
that can deal with these types of problems.

Summary and Future Directions 

In summary, transmission control is a fertile application area that offers
challenging problems of much interest to both control scientists and
engineers, giving the opportunity to investigate topics like optimal
control (e.g., for gear shifting and integrated powertrain control),
robust control (for driveline modeling and control), estimation
(diagnostics), and adaptive and predictive control.

Technological developments could affect the 
possibilities and the effectiveness of transmission 
control. Of course new sensing devices can 
improve the reliability and the precision of 
the feedback, but they can also open the door 
to new control architectures. For instance, 
the phenomenon of inverse magnetostriction 
that converts material strain into magnetic 
property changes can be exploited to measure 
transmitted torque, and some magnetoelastic 
torque sensors have been investigated by a 
number of researchers (Klimartin 2003; Pietron 
et al. 2013). In this way the gear shifting control 
problem can be attacked by closing the loop on 
the transmitted torque measurement, avoiding 
more or less complex torque observers or, at
least, improving the control performance.

Analogous considerations can be carried out
at the actuation level. For instance, Kim and Choi 
(2011) have proposed a new clutch actuator with 
a self-energizing mechanism so as to amplify 
the normal force applied on the contact surfaces 
for the engagement. That idea allows the clutch 
module to consume less energy for actuating the 
overall system. Smart-material-based actuation 
devices were also developed by a number of 
researchers in recent years (Chaudhuri and 
Wereley 2012), and some specific applications 
to automotive transmissions are currently under 
investigation, like magnetorheological fluid 
dual clutch transmissions that are discussed in 
Chen et al. (2012). 

At the system level, one of the most interesting research directions deals
with the communication and coordination among different control subsystems
such as the engine control, transmission control, and electronic stability
control, with the final aim of combining all such subsystems into an
integrated powertrain management.


Cross-References 

► Engine Control 

► Powertrain Control for Hybrid-Electric and 
Electric Vehicles 


Recommended Reading 

Hrovat et al. (2010) give an overview of automotive transmissions in their
chapter on powertrain control of the CRC Control Handbook. Transmission
control is of course also discussed in classical automotive control books:
one of the first well-known books dedicated to automotive control was
Kiencke and Nielsen (2005), in which a whole chapter on driveline control
deals with driveline modeling and gear shifting for clutch-based
transmissions. A more recent book on automotive control is Ulsoy et al.
(2012), where transmission control for all-wheel drive vehicles is also
presented.


In the scientific literature, many papers deal 
with transmission control: here, in particular, we 
would like to cite the optimal control approach 
for ATs by Haj-Fraj and Pfeiffer (2001), a deep 
discussion on AMT control in Glielmo et al. 
(2006), and, more recently, papers on DCTs 
like Kulkarni et al. (2007) and Senatore (2009); in 
the latter, the author illustrates the wide selection 
of patents on dual clutch. 

Regarding automotive technologies, an overview of automotive sensors can
be found in Fleming (2008), while a discussion of smart materials and the
integration of mechanics, materials, and electronics (the so-called
mechamatronics discipline) is presented by Munhoz et al. (2007).

Abstract

Dynamic vision is a subfield of computer vision dealing explicitly with
problems characterized by image features that evolve in time according to
some underlying dynamics. Examples include sustained target tracking,
activity classification from video sequences, and recovering 3D geometry
from 2D video data. This article discusses the central role that systems
theory can play in developing a robust dynamic vision framework,
ultimately leading to vision-based systems with enhanced autonomy, capable
of operating in stochastic, cluttered environments.

Keywords 

Event detection; Multiframe tracking; Structure 
from motion 

Background 

In this article, we represent linear time invariant (LTI) systems by their
associated transfer matrix $G(z)$. The “size” of $G(z)$, which plays a key
role in assessing the effects of uncertainty, will be measured using the
$\mathcal{H}_\infty$ norm, defined as
$\|G\|_\infty \doteq \sup_{\omega} \bar{\sigma}\left(G(e^{j\omega})\right)$,
where $\bar{\sigma}(\cdot)$ denotes the maximum singular value. For scalar
systems, this reduces to the peak value of the frequency response (i.e.,
the maximum gain of the system). In the matrix case, this definition takes
into account both the worst-case frequency and spatial direction.
Background material on the $\mathcal{H}_\infty$ norm, its computation, and
its significance in the context of robust control theory is given in
Sanchez-Pena and Sznaier (1998). A general coverage of linear systems
theory, including alternative representations of linear systems and their
associated properties, can be found, for instance, in Rugh (1996).
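
For a stable scalar system, the peak of the frequency response can be estimated by evaluating the transfer function on a grid over the unit circle. The sketch below does this for a hypothetical example; gridding only gives a lower bound on the true norm, and dedicated algorithms are preferable in practice. The function name and the example system are assumptions.

```python
import cmath
import math


def hinf_norm_grid(num, den, points=2000):
    """Estimate sup over w of |G(e^{jw})| for G(z) = num(z) / den(z).

    num and den hold real polynomial coefficients in descending powers of z.
    The result is a lower bound obtained on a uniform frequency grid.
    """
    def polyval(coeffs, z):
        acc = 0j
        for c in coeffs:
            acc = acc * z + c      # Horner evaluation
        return acc

    peak = 0.0
    for k in range(points + 1):
        w = math.pi * k / points   # real coefficients: [0, pi] suffices
        z = cmath.exp(1j * w)
        peak = max(peak, abs(polyval(num, z) / polyval(den, z)))
    return peak


# G(z) = 1 / (z - 0.5): the gain peaks at w = 0, where |G| = 1 / 0.5 = 2.
gain = hinf_norm_grid([1.0], [1.0, -0.5])
```

In the matrix case the absolute value would be replaced by the largest singular value of the frequency response matrix at each grid point.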

Multiframe Tracking 

A requirement common to most dynamic vision applications is the ability to
track objects across frames, in order to collect the data required by a
subsequent activity analysis step. Current approaches integrate
correspondences between individual frames over time, using a combination
of some assumed simple target dynamics (e.g., constant velocity) and
empirically learned noise distributions (Isard and Blake 1998; North et
al. 2000). However, while successful in many scenarios, these approaches
are vulnerable to model uncertainty, occlusion, and appearance changes, as
illustrated in Fig. 1.


J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9, 
© Springer-Verlag London 2015

Uncertainty and Robustness in Dynamic Vision, Fig. 1 Unscented particle
filter-based tracking in the presence of occlusion (panels: Frame 150,
Frame 95)


As shown next, the fragility noted above can be avoided by modeling the
motion of the target as the output of a dynamical system, to be identified
directly from the available data, along with bounds on the identification
error. In the sequel, we consider two different cases: (i) the motion of
the target is known to belong to a relatively small set of a priori known
motion modalities; and (ii) no prior knowledge is available.

The case of known motion models: Consider first the case where a set of
models known to span all possible motions of the target is available a
priori, as is often the case with human motion. In this case, the position
$y_k$ of a given target can be modeled as
$y(z) = \mathcal{F}(z)e(z) + \eta(z)$, where $e$ and $\eta_k$ denote a
suitable input and measurement noise, respectively, and where
$\mathcal{F}$ admits an expansion of the form
$\mathcal{F} = \sum_{j=1}^{N_p} p_j \mathcal{F}^j + \mathcal{F}_{np}$.
Here $\mathcal{F}^j$ represent the (known) motion modalities of the target
and $\|\mathcal{F}_{np}\|_\infty \le K$, i.e., a bound on the maximum
admissible approximation error of the expansion
$\mathcal{F}_p \doteq \sum_{j} p_j \mathcal{F}^j$ to $\mathcal{F}$ is
available. In the remainder of this article, we will further assume that a
set membership description $\eta_k \in \mathcal{N}$ is available and,
without loss of generality, that $e(z) = 1$ (i.e., the motion of the
target is modeled as the impulse response of the unknown operator
$\mathcal{F}$).


In this context, the next location of the target feature $y_k$ can be
predicted by first identifying the relevant dynamics $\mathcal{F}$ and
then using it to propagate its past values. In turn, identifying the
dynamics entails finding an operator
$\mathcal{F}(z) \in \mathcal{S} \doteq \{\mathcal{F}(z) : \mathcal{F} = \mathcal{F}_p + \mathcal{F}_{np}\}$
such that $y - \eta = \mathcal{F}$, precisely the class of interpolation
problems addressed in Parrilo et al. (1999). As shown there, finding such
an operator reduces to solving a linear matrix inequality (LMI)
feasibility problem. Once this operator is found, it can be used in
conjunction with a particle (or a Kalman) filter to predict the future
location of the target.
Figure 2 shows the tracking results obtained using this approach. Here, we
used a combination of a priori information: (i) 5 % noise level and (ii)

$$\mathcal{F}_p \in \operatorname{span}\left\{ \frac{1}{z-1},\ \frac{z}{z-a},\ \frac{z}{(z-1)^2},\ \frac{z^2}{(z-1)^2},\ \frac{z^2 - \cos(\omega)\, z}{z^2 - 2\cos(\omega)\, z + 1} \right\},$$

where $a \in \{0.9, 1, 1.2, 1.3, 2\}$ and $\omega \in \{0.2, 0.45\}$. The
experimental information consisted of the position of the target in
$N = 20$ frames, where it was not occluded. Note that, by exploiting the
predictive power of the identified model, the Kalman filter is now able to
track the target past the occlusion, eliminating the need for using a
(more computationally expensive) particle filter.
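
The mechanism by which a motion model carries a Kalman filter through an occlusion can be sketched as follows. This is a minimal illustration only: it uses a constant-velocity model as a stand-in for the identified operator, and the noise levels `q` and `r`, the use of `None` for occluded frames, and the function name are assumptions.

```python
def kalman_track(measurements, q=1e-4, r=1e-2):
    """Constant-velocity Kalman filter on scalar positions.

    A measurement of None marks an occluded frame: the filter then only
    predicts, propagating the state through the motion model.
    """
    p, v = 0.0, 0.0                       # state estimate: position, velocity
    P = [[100.0, 0.0], [0.0, 100.0]]      # large initial uncertainty
    estimates = []
    for y in measurements:
        # Predict: x_{k+1} = F x_k with F = [[1, 1], [0, 1]], P = F P F' + qI
        p = p + v
        P = [
            [P[0][0] + P[0][1] + P[1][0] + P[1][1] + q, P[0][1] + P[1][1]],
            [P[1][0] + P[1][1], P[1][1] + q],
        ]
        if y is not None:
            # Update with a position measurement (H = [1, 0])
            s = P[0][0] + r
            k0, k1 = P[0][0] / s, P[1][0] / s
            innovation = y - p
            p, v = p + k0 * innovation, v + k1 * innovation
            P = [
                [(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
                [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]],
            ]
        estimates.append(p)
    return estimates


# Target moving at constant speed; the last 10 of 30 frames are occluded.
truth = [2.0 * k + 1.0 for k in range(30)]
observed = truth[:20] + [None] * 10
track = kalman_track(observed)
```

When the assumed model matches the true motion, the predictions stay on the trajectory during the occlusion; with a poor model they drift, which is the fragility discussed above.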


Uncertainty and Robustness in Dynamic Vision, Fig. 2 Using the identified
model in combination with a Kalman filter allows for robust tracking in
the presence of occlusion

Unknown motion models: This case could be addressed in principle by
performing a purely nonparametric worst-case identification (Parrilo et
al. 1999) and then proceeding as above. However, a potential difficulty
here stems from the high order of the resulting model (recall that the
order of the central interpolant is the number of experimental data
points). If a bound $n$ on the order of the underlying models is
available, this difficulty can be avoided by recasting the prediction
problem into a rank minimization form, which in turn can be relaxed to a
semidefinite optimization. To this effect, recall that (Ding et al. 2008),
in the absence of noise, given $2n$ values $\{y_k\}_{k=t-2n+1}^{t}$, the
next value $y_{t+1}$ is the unique solution to the following rank
minimization problem:

$$y_{t+1} = \arg\min_{y} \operatorname{rank}\left[H_{n+1}(y)\right], \quad \text{where } H_{n+1}(y) \doteq \begin{bmatrix} y_{t-2n+1} & y_{t-2n+2} & \cdots & y_{t-n+1} \\ y_{t-2n+2} & y_{t-2n+3} & \cdots & y_{t-n+2} \\ \vdots & \vdots & \ddots & \vdots \\ y_{t-n+1} & y_{t-n+2} & \cdots & y \end{bmatrix} \tag{1}$$

Uncertainty and Robustness in Dynamic Vision, Fig. 3 Trajectory
prediction. Rank minimization versus Kalman filtering

Clearly, the same result holds if multiple elements of the sequence $y$
are missing, at the price of considering longer sequences (the total
number of data points should exceed $2n$). This result allows for handling
both noisy and missing data (due, for instance, to occlusion), by simply
solving

$$\min_{v,\,x} \operatorname{rank}\left[H(\xi)\right] \quad \text{subject to } v \in \mathcal{N}_v, \quad \text{where } \xi_i = \begin{cases} y_i - v_i & \text{if } i \in \mathcal{I}_a \\ x_i & \text{if } i \in \mathcal{I}_m, \end{cases}$$

$\mathcal{I}_a$ and $\mathcal{I}_m$ denote the sets of available (but
noisy) and missing measurements, respectively, and $\mathcal{N}_v$ is a
set membership description of the noise $v$. In the case where
$\mathcal{N}_v$ admits a convex description, using the nuclear norm as a
surrogate for rank (Fazel et al. 2003) allows for reducing this problem to
a convex semidefinite program. Examples of these descriptions are balls in
$\ell_\infty$, e.g., $\mathcal{N}_v = \{v : |v_k| \le \epsilon\}$, or
constraints on the norm of $H_v$, the Hankel matrix of the noise sequence,
which under mild ergodicity assumptions are equivalent to constraints on
the magnitude of the noise covariance. Figure 3 illustrates the
effectiveness of this approach. As shown there, the rank
minimization-based filter successfully predicts the location of the
target, while a Kalman filter-based tracker fails due to the substantial
occlusion.
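
In the noise-free case, the rank-minimizing prediction has a closed form: the unknown corner entry of the Hankel matrix must make the matrix singular, which by a Schur complement argument gives the next sample as a linear combination of the known ones. The sketch below implements only this idealized step, not the nuclear norm relaxation; it assumes that the leading n x n Hankel block is invertible, and the function names and test sequence are hypothetical.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda idx: abs(m[idx][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            f = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def predict_next(history, n):
    """Next sample of a noise-free order-n sequence from its last 2n samples.

    The returned value makes the (n+1) x (n+1) Hankel matrix singular,
    so its rank stays at n.
    """
    w = history[-2 * n:]
    h11 = [[w[i + j] for j in range(n)] for i in range(n)]  # top-left block
    last = [w[n + i] for i in range(n)]       # last column, equal to last row
    coeffs = solve(h11, last)
    return sum(c * v for c, v in zip(coeffs, last))


# Damped oscillation, generated by a second-order system.
sequence = [0.9 ** k * math.cos(0.3 * k) for k in range(5)]
prediction = predict_next(sequence[:4], n=2)
```

With noisy or missing samples this closed form no longer applies, and the constrained rank minimization above (or its convex relaxation) takes its place.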


Event Detection and Activity 
Classification 

Using the trajectories generated by the tracking step for activity
recognition entails (i) segmenting the data into homogeneous segments,
each corresponding to a single activity, and (ii) classifying these
activities, typically based on exemplars from a database of known
activities. As shown in the sequel, both steps can be efficiently
accomplished by exploiting the properties of the underlying system. The
starting point is to model these activities as the output of a switched
piecewise linear system. In this context, under suitable dwell time
constraints, each switch (indicating a change in the underlying activity)
can be identified by simply searching for points associated with
discontinuities in the rank of the associated Hankel matrix, as
illustrated in Fig. 4. Further, in this framework, the problem of
classifying each subactivity can be recast into the behavioral model
(in)validation setup shown in Fig. 5. Here $y_i(\cdot)$ represents the
impulse response of the (unknown) LTI system $G$, affected by measurement
noise $\eta_i \in \mathcal{N}$ and uncertainty $\Delta_i \in \mathcal{D}$
that accounts for the variability intrinsic to two different realizations
of the same activity. Two different time series are considered to be
realizations of the same activity if there exist at least one pair
$(\eta_1, \eta_2) \in \mathcal{N}^2$, one pair
$(\Delta_1, \Delta_2) \in \mathcal{D}^2$, an LTI system $G$ with McMillan
degree at most $n_G$, and suitable initial conditions $x_1, x_2$ resulting
in the observed data. Remarkably, this model (in)validation problem can be
reduced to a rank minimization form. In the simpler case where
$\Delta_i = 0$, the problem can be solved using Algorithm 1 (Sznaier and
Camps 2011).
Next, consider the more realistic case where the trajectories are also
affected by bounded model uncertainty $\Delta$,
$\|\Delta\|_\infty \le \gamma$, where $\gamma$ is given as part of the a
priori information. In this scenario, the internal signal $z$ is given by
$z(t) = \xi(t) - \eta(t)$, $\eta \in \mathcal{N}$, where the signal $\xi$
satisfies

$$y = (1 + \Delta) * \xi, \quad \text{for some } \Delta \in \mathcal{D} \tag{2}$$

where $*$ denotes convolution. Exploiting Theorem 2.3.6 in Chen and Gu
(2000) leads to an LMI condition in the variables $z$, $\eta$ for
feasibility of (2). Thus, the only modification to Algorithm 1 required to
handle model uncertainty is to incorporate this additional (convex)
constraint into the rank minimization problems. Table 1 shows the results
of applying this approach to 2 video sequences, walking and running, from
the KTH database (Laptev et al. 2008). Sample frames from these sequences
are shown in Fig. 6. In order to reduce the dimensionality of the data,
the frames were first projected into a three-dimensional space using
principal component analysis (PCA), and the resulting time series were
used as the input to Algorithm 1, assuming 10% noise and 10% model
uncertainty. As shown in Table 1, the algorithm correctly identifies the
subsequences (a)-(c) as being generated by the same underlying activity
(walking).

Uncertainty and Robustness in Dynamic Vision, Fig. 4 The jump in the rank
of the Hankel matrix corresponds to the time instant where the subjects
meet and exchange a bag

Uncertainty and Robustness in Dynamic Vision, Fig. 5 Model (in)validation
setup


Algorithm 1 Behavioral model (in)validation
Data: Noisy measurements $y_1, y_2$.
A priori information: noise description $\eta_i \in \mathcal{N}$

1. Solve the following rank-minimization problems:
   $r_1^{\min} = \min_{\eta_1} \operatorname{rank}(H_{y_1} - H_{\eta_1})$
   subject to: $\eta_1 \in \mathcal{N}$.
   $r_2^{\min} = \min_{\eta_2} \operatorname{rank}(H_{y_2} - H_{\eta_2})$
   subject to: $\eta_2 \in \mathcal{N}$.
   $r_{12}^{\min} = \min_{\eta_1, \eta_2} \operatorname{rank}\left(\left[H_{y_1} - H_{\eta_1} \;\; H_{y_2} - H_{\eta_2}\right]\right)$
   subject to: $\eta_1, \eta_2 \in \mathcal{N}$.

2. The given trajectories were generated by the same LTI system with
   McMillan degree $\le n_G$ iff:
   $r_1^{\min} = r_2^{\min} = r_{12}^{\min} \le n_G$
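
When the data are noise-free, the rank minimizations in Algorithm 1 reduce to plain rank computations, and the test in step 2 can be sketched directly. The code below covers only this idealized case: the periodic integer sequences are made-up stand-ins for activity trajectories, and the helper names and the elimination-based rank routine are assumptions (an SVD-based rank would be the more robust choice on real data).

```python
def hankel(seq, rows):
    """Hankel matrix with the given number of rows built from a sequence."""
    cols = len(seq) - rows + 1
    return [[seq[i + j] for j in range(cols)] for i in range(rows)]


def rank(matrix, tol=1e-9):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    m = [[float(v) for v in row] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    found = 0
    for col in range(n_cols):
        if found == n_rows:
            break
        pivot = max(range(found, n_rows), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) <= tol:
            continue                      # no pivot in this column
        m[found], m[pivot] = m[pivot], m[found]
        for i in range(found + 1, n_rows):
            f = m[i][col] / m[found][col]
            for j in range(col, n_cols):
                m[i][j] -= f * m[found][j]
        found += 1
    return found


def same_system(y1, y2, rows, max_order):
    """Noise-free version of step 2: equal ranks for H1, H2 and [H1 H2]."""
    h1, h2 = hankel(y1, rows), hankel(y2, rows)
    joint = [a + b for a, b in zip(h1, h2)]   # [H1 H2], side by side
    r1, r2, r12 = rank(h1), rank(h2), rank(joint)
    return r1 == r2 == r12 <= max_order


# Two outputs of y[k+2] = y[k+1] - y[k] (different initial conditions) and
# one output of y[k+2] = -y[k]; both systems have order 2.
walk_a = [1, 1, 0, -1, -1, 0, 1, 1, 0]
walk_b = [0, 1, 1, 0, -1, -1, 0, 1, 1]
other = [1, 0, -1, 0, 1, 0, -1, 0, 1]
```

Here `same_system(walk_a, walk_b, 5, 2)` holds, while pairing `walk_a` with `other` raises the joint rank to 4, mirroring the pattern of Table 1.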


Uncertainty and Robustness in Dynamic Vision, Table 1 Activity
classification results. Sequences (a)-(c) correspond to walking and (d) to
running

Activity pair   Rank(H1)   Rank(H2)   Rank([H1 H2])
(a, b)          4          4          4
(a, c)          4          4          4
(a, d)          4          8          8


Summary and Future Directions 

Vision-based systems are uniquely positioned to address the needs of a
growing segment of the population. Aware sensors endowed with scene
analysis capabilities can prevent crime, reduce response time to emergency
scenes, and render viable the concept of ultra-sustainable buildings.
Moreover, the investment required to accomplish these goals is relatively
modest since a large number of cameras are already deployed and networked.
Arguably, at this point, one of the critical factors limiting widespread
use of these systems is their potential fragility when operating in
unstructured scenarios. This article illustrates the key role that control
theory can play in developing a comprehensive, provably robust dynamic
vision framework. In turn, computer vision provides a rich environment
both to draw inspiration from and to test new developments in systems
theory.

Uncertainty and Robustness in Dynamic Vision, Fig. 6 Sample frames from
KTH activity video database. (a) Walking, (b) Running


Cross-References 

► Particle Filters 

► Estimation, Survey on 


Recommended Reading 

Details on how to select good features to track can be found in Szeliski
(2010). Using dynamics to recover 3D structure from 2D data is covered in
Ayazoglu et al. (2010). Finally, further details on the connection between
identification and the problem of extracting actionable information from
large data streams can be found, for instance, in Sznaier (2012).

Abstract

For underactuated marine vessels, the dimension of the configuration space
exceeds that of the control input space. This article describes
underactuated marine vessels and the control challenges they pose. In
particular, there are two main approaches to designing control systems for
underactuated marine vessels. The first approach reduces the number of
degrees of freedom (DOF) that it seeks to control such that the number of
DOF equals the number of independent control inputs. The control problem
is then a fully actuated control problem, which simplifies the control
design significantly, but special attention has to be given to the
inherent internal dynamics, which have to be carefully analyzed. The other
approach seeks to control all DOF using only the limited number of control
inputs available. The control problem is then an underactuated control
problem and is quite challenging to solve. In this article, it is shown
how line-of-sight methods can solve the underactuated control problems
that arise from path following and maneuvering control of underactuated
marine vessels.


Keywords 

Marine vessels; Underactuated marine control 
problems; Underactuated marine vessels; Underactuation


Introduction 

Marine systems are often equipped with fewer 
independent actuators than degrees of freedom. 
Examples include conventional ships/surface 
vessels that are typically equipped with a main 
thruster and a rudder or with two independent 
main thrusters, but without a side thruster. As a 
result, we have no control force in the sideways 
direction. This means that the forward motion 
(the surge motion) and the orientation (the yaw 
motion) can be controlled directly, while there is 
no direct way to influence the sideways motion 
of the surface vessel (the sway motion). The 
vessel is then said to be underactuated in sway. 
It is an underactuated system since it has only 
two independent control inputs, giving force and 
torque in surge and yaw, while the system has 
three degrees of freedom: surge, sway, and yaw. 
This underactuation leads to challenges when it
comes to designing the control system.

Definition: Underactuated
Marine Vessels

In order to properly define what we mean by underactuated marine vessels,
we need the mathematical model (► Mathematical Models of Ships and
Underwater Vehicles; Fossen 2011):

$$M\dot{\nu} + C(\nu)\nu + D(\nu)\nu + g(\eta) = \begin{bmatrix} \tau \\ 0 \end{bmatrix}$$

$$\dot{\eta} = J(\eta)\nu$$

where the configuration vector $\eta \in \mathbb{R}^n$, the velocity
vector $\nu \in \mathbb{R}^n$, while the vector of independent control
inputs $\tau \in \mathbb{R}^m$. The vessel is underactuated because
$n > m$, i.e., the dimension of the configuration space exceeds that of
the control input space (Oriolo and Nakamura 1991; Pettersen and Egeland
1996).

The underactuation leads to a second-order nonholonomic constraint

$$M_u \dot{\nu} + C_u(\nu)\nu + D_u(\nu)\nu + g_u(\eta) = 0$$

where $M_u$ denotes the last $n - m$ rows of the matrix $M$ and
$C_u(\nu)$, $D_u(\nu)$, and $g_u(\eta)$ are defined similarly.
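
To make the origin of this constraint explicit, the model can be split row-wise. The sketch below assumes, as the definition of $M_u$ suggests, that the states are ordered so that the control input enters only the first $m$ rows; the subscript $a$ for the actuated rows is notation introduced here for illustration only.

```latex
\begin{aligned}
M_a \dot{\nu} + C_a(\nu)\nu + D_a(\nu)\nu + g_a(\eta) &= \tau
  && \text{(first $m$ rows, actuated)} \\
M_u \dot{\nu} + C_u(\nu)\nu + D_u(\nu)\nu + g_u(\eta) &= 0
  && \text{(last $n-m$ rows, unactuated)}
\end{aligned}
```

The second block contains no control input, so it must hold along every trajectory regardless of the control law; this is what makes it a constraint rather than a design equation.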

Definitions of nonholonomic and holonomic constraints can be found in
Goldstein (1980). More facts about these kinds of constraints, and
conditions for when this second-order nonholonomic constraint can be
integrated to either a first-order nonholonomic or a holonomic constraint,
can be found in Tarn et al. (2003).


Control of Underactuated 
Marine Vessels 

As we have seen above, the underactuation leads to a constraint, and this
gives challenges when it comes to designing the control system. In
particular, it can be shown that if $g_u(\eta)$ has a zero element, then
there exists no continuous or discontinuous state feedback law that can
asymptotically stabilize the equilibrium point $(\eta, \nu) = (0, 0)$
(Pettersen and Egeland 1996). This means that in order to stabilize an
equilibrium point, control methods from linear or classical nonlinear
control theory cannot be applied.

There are two main classes of approaches to 
control underactuated marine vessels. The first 
class approaches the control problem by reducing 
the number of degrees of freedom that are to be 
controlled, while the other class seeks to control 
all degrees of freedom using the limited number 
of control inputs available. 

If we reduce the number of degrees of freedom
(DOF) that we seek to control, such that
the number of DOF agrees with the number of
independent control inputs, then we have a fully
actuated control problem although the vessel is
underactuated. This may at first sight look like
a very simple way to design a control system
for underactuated marine vessels. Note, however,
that there will then inherently be internal dynamics
that need to be examined carefully (Isidori
1995; Nijmeijer and van der Schaft 1990). Say,
for instance, that we only care about controlling
the position of the ship, and we choose not to care
very much about the orientation of the ship. We
want, for instance, the ship to follow a straight
line trajectory $(x_r(t), y_r(t))$, where $x$ and $y$ give
the ship's position in an earth-fixed coordinate
system, and the angle giving the ship orientation,
$\psi$, is not so important to us. It is quite straightforward
to use, for instance, output feedback
linearization to this end (Isidori 1995; Nijmeijer
and van der Schaft 1990). The resulting dynamics
of the subsystem $(x, y)$ is then called the external
dynamics. We have full control over this using the
two independent control inputs and can make it
track any smooth trajectory $(x_r(t), y_r(t))$. Everything
looks simple when considering the external
dynamics only, but the internal dynamics can
frequently be hard to predict. The orientation of
the ship, given by the yaw angle $\psi$, also needs
to be analyzed. The controlled motion will not
necessarily have the ship aligned with the tangent
of the trajectory, for instance. Firstly, a ship
control system that only focuses on the position
variables $(x, y)$ may equally well result in the
ship moving backward along the line, a behavior
that is not really desirable with respect to energy
efficiency or passenger comfort. Secondly,
there will always be environmental disturbances
(currents, wind, and waves), and we need to make
a thorough stability analysis of the internal dynamics
in order to guarantee sufficient robustness
properties for these. So to conclude, if you reduce
the number of DOF that you seek to control, in
order to achieve a fully actuated control problem,
then you need to consider the internal dynamics
very carefully when dealing with underactuated
marine vessels.

If we follow the other approach to controlling 
underactuated marine vessels, where we seek to 
control more degrees of freedom than we have 
independent control inputs, then we not only have 
an underactuated marine vessel at hand, but we 
also have an underactuated control problem. This 
is a challenging control problem, and we will now 
see how this can be solved for path following and 
maneuvering control. 

Path Following and Maneuvering 
Control of Underactuated Marine 
Vessels 

For path following control systems, the control
objective is to make the vessel follow a given path
$\mathcal{Y}_d$, often defined as a parametrized path

$$\mathcal{Y}_d := \{ y \in \mathbb{R}^m : \exists\, \theta \in \mathbb{R} \text{ such that } y = y_d(\theta) \}$$

where $m < n$ and $y_d$ is continuously
parametrized by the path variable $\theta$. The control
objective is thus to force the output $y$ to converge
to the desired path $y_d(\theta)$:

$$\lim_{t \to \infty} |y(t) - y_d(\theta(t))| = 0.$$

This constitutes a geometric task.
When there is also a dynamic task, for instance,
a speed assignment like forcing the path speed $\dot{\theta}$
to converge to a desired speed $v_s(\theta(t), t)$

$$\lim_{t \to \infty} |\dot{\theta}(t) - v_s(\theta(t), t)| = 0$$

then the control problem is an output maneuvering
problem (Skjetne et al. 2004).

Line-of-sight (LOS) guidance control has
proven to be a powerful tool for path following
and maneuvering control of underactuated
vessels. LOS guidance is much used in practice
for manual control of ships, where the helmsman
typically will steer the vessel toward a point
lying a constant distance, called the look-ahead
distance, ahead of the vessel along the desired
path. LOS guidance is simple, intuitive, and easy
to tune, and it can be shown that it provides
nice path convergence properties (Breivik and
Fossen 2004; Børhaug et al. 2008; Caharija et al.
2012; Fredriksen and Pettersen 2006; Lefeber
et al. 2003). For the simplified case without any
environmental disturbances and when the desired
path is a straight line, the LOS guidance law for
an underactuated surface vessel is given by

$$\psi_d = \psi_{LOS} = -\tan^{-1}\left(\frac{y}{\Delta}\right), \quad \Delta > 0$$

where $y$ is the cross-track error and $\psi_d$ is the
commanded heading. The angle $\psi_{LOS}$
is called the line-of-sight (LOS) angle, and geometrically,
it corresponds to the orientation of the
vessel when headed toward the point that lies a
distance $\Delta > 0$ ahead of the vessel along the path
$y = 0$, cf. Fig. 1. The look-ahead distance $\Delta$ is a
control design parameter.
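As a rough illustration of the straight-line LOS law, the sketch below evaluates the LOS angle and integrates a deliberately idealized kinematic model. The model is an assumption made for this example only: the heading is taken to equal the LOS angle at every instant, the speed is constant, and sway and disturbances are ignored. The function names and numerical values are hypothetical.

```python
import math


def los_heading(cross_track_error: float, lookahead: float) -> float:
    """LOS angle psi_LOS = -atan(y / Delta) for the straight path y = 0."""
    if lookahead <= 0.0:
        raise ValueError("look-ahead distance must be positive")
    return -math.atan(cross_track_error / lookahead)


def simulate_los(y0: float, speed: float = 1.0, lookahead: float = 5.0,
                 dt: float = 0.01, t_end: float = 60.0) -> tuple:
    """Euler-integrate the idealized kinematics with heading = psi_LOS."""
    x, y = 0.0, y0
    for _ in range(int(round(t_end / dt))):
        psi = los_heading(y, lookahead)
        x += speed * math.cos(psi) * dt  # along-track motion
        y += speed * math.sin(psi) * dt  # cross-track motion
    return x, y


final_x, final_y = simulate_los(y0=10.0)
print(f"cross-track error after 60 s: {final_y:.2e} m")
```

In this idealized setting a smaller look-ahead distance gives a more aggressive approach to the path, while a larger one gives a gentler approach, which is one way to read the tuning role of $\Delta$.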

In order to handle ocean currents and other
environmental disturbances such as wind and
waves, the LOS guidance law can be extended
with integral action

$$\psi_{LOS} = -\tan^{-1}\left(\frac{y + \sigma y_{int}}{\Delta}\right), \quad \Delta > 0$$

$$\dot{y}_{int} = \frac{\Delta y}{(y + \sigma y_{int})^2 + \Delta^2}$$

where $\sigma > 0$ is a design parameter, an integral
gain, and $\Delta > 0$ has the same interpretation as
above. The integral effect will generate a sideslip
angle that allows the vessel to stay on the desired
path even though affected by environmental disturbances
with components normal to the path,
even though the vessel has no control forces to
act in the sideways direction.

Underactuated Marine Control Systems, Fig. 1
Illustration of LOS guidance
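The effect of the integral term can be seen in a small numerical experiment. The sketch below is a simplified kinematic illustration, not a vessel model: it assumes the heading tracks the LOS angle perfectly, the speed through the water is constant, and a constant current component `current_y` pushes the vessel normal to the path. All names and values are hypothetical.

```python
import math


def ilos_step(y, y_int, speed, current_y, lookahead, sigma, dt):
    """One Euler step of cross-track error y and integral state y_int."""
    s = y + sigma * y_int
    psi_los = -math.atan(s / lookahead)
    # Assumed kinematics: own speed along the heading plus a constant drift.
    y_dot = speed * math.sin(psi_los) + current_y
    y_int_dot = lookahead * y / (s * s + lookahead * lookahead)
    return y + y_dot * dt, y_int + y_int_dot * dt


def simulate_ilos(y0, sigma, speed=1.0, current_y=0.3, lookahead=5.0,
                  dt=0.01, t_end=200.0):
    """Return the cross-track error at t_end."""
    y, y_int = y0, 0.0
    for _ in range(int(round(t_end / dt))):
        y, y_int = ilos_step(y, y_int, speed, current_y, lookahead, sigma, dt)
    return y


y_with_integral = simulate_ilos(10.0, sigma=1.0)
y_without_integral = simulate_ilos(10.0, sigma=0.0)  # plain LOS law
print(f"with integral action:    {y_with_integral:.4f} m")
print(f"without integral action: {y_without_integral:.4f} m")
```

Under these assumptions the plain LOS law settles at a constant offset from the path, whereas the integral state builds up the heading correction (the sideslip) needed to cancel the drift.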

Various standard control techniques can readily
be used to track the above guidance commands.
LOS guidance can also be extended to the
3D case for path following/maneuvering control
of underactuated autonomous underwater vehicles
(AUV), cf. the references given above.


Summary and Future Directions 

Underactuated marine vessels are vessels for 
which the dimension of the configuration space 
exceeds that of the control input space. There 
are two main approaches to design control 
systems for underactuated marine vessels. The 
first approach reduces the number of degrees of 
freedom (DOF) that it seeks to control, such 
that the number of DOF equals the number 
of independent control inputs. The control 
problem is then a fully actuated control problem, 
something that simplifies the control design 
problem significantly, but special attention 
then has to be given to the inherent internal 
dynamics that has to be carefully analyzed. The 
other approach to design control systems for 
underactuated marine vessels seeks to control all 
DOF using only the limited number of control 
inputs available. The control problem is then 
an underactuated control problem, and this is a 
quite challenging control problem. In this entry, 
it is shown how line-of-sight methods can solve 
the underactuated control problems that arise 
from path following and maneuvering control of 
underactuated marine vessels. 

Future developments of underactuated marine
control systems will include solving more underactuated
control problems of marine vessels
taking into account both the complete mathematical
model of the vessels and also advanced
mathematical models of all the environmental
disturbances in both 2D and 3D.


Cross-References 
► Underactuated Robots


Bibliography 

Aguiar AP, Pascoal AM (2007) Dynamic positioning and
way-point tracking of underactuated AUVs in the presence
of ocean currents. Int J Control 80:1092-1108

Breivik M, Fossen TI (2004) Path following of straight
lines and circles for marine surface vessels. In: Proceedings
of 6th IFAC conference on control applications
in marine systems, Ancona, pp 65-70

Børhaug E, Pavlov A, Pettersen KY (2008) Integral LOS
control for path following of underactuated marine surface
vessels in the presence of constant ocean currents.
In: Proceedings of 47th IEEE conference on decision
and control, Cancun, 9-11 Dec 2008, pp 4984-4991

Caharija W, Pettersen KY, Gravdahl JT, Børhaug E (2012)
Path following of underactuated autonomous underwater
vehicles in the presence of ocean currents. In:
Proceedings of 51st IEEE conference on decision and
control, Maui, Dec 2012, pp 528-535

Encarnação P, Pascoal AM, Arcak M (2000) Path following
for marine vehicles in the presence of unknown
currents. In: Proceedings of 6th IFAC international
symposium on robot control, Vienna, 21-23 Sept
2000, pp 469-474

Fossen TI (2011) Handbook of marine craft hydrodynamics
and motion control. Wiley, Chichester/West Sussex

Fredriksen E, Pettersen KY (2006) Global K-exponential
way-point maneuvering of ships: theory and experiments.
Automatica 42:677-687

Goldstein H (1980) Classical mechanics, 2nd edn.
Addison-Wesley, Reading

Healey AJ, Lienard D (1993) Multivariable sliding mode
control for autonomous diving and steering of unmanned
underwater vehicles. IEEE J Ocean Eng
18:327-339

Indiveri G, Zizzari AA (2008) Kinematics motion control
of an underactuated vehicle: a 3D solution with
bounded control effort. In: Proceedings of 2nd IFAC
workshop on navigation, guidance and control of underwater
vehicles, Killaloe, Ireland

Isidori A (1995) Nonlinear control systems, 3rd edn.
Springer, London

Lapierre L, Soetanto D, Pascoal AM (2003) Nonlinear
path following with applications to the control of
autonomous underwater vehicles. In: Proceedings of
42nd IEEE conference on decision and control, Maui,
Dec 2003, pp 1256-1261

Lefeber AAJ, Pettersen KY, Nijmeijer H (2003) Tracking
control of an under-actuated ship. IEEE Trans Control
Syst Technol 11:52-61

Nijmeijer H, van der Schaft AJ (1990) Nonlinear dynamical
control systems. Springer, New York

Oriolo G, Nakamura Y (1991) Control of mechanical
systems with second-order nonholonomic constraints:
underactuated manipulators. In: Proceedings of 30th
IEEE conference on decision and control, Brighton,
Dec 1991, pp 2398-2403

Pettersen KY, Egeland O (1996) Exponential stabilization
of an underactuated surface vessel. In: Proceedings of
35th IEEE conference on decision and control, Kobe,
Dec 1996, pp 967-972

Pettersen KY, Egeland O (1999) Time-varying exponential
stabilization of the position and attitude of an
underactuated autonomous underwater vehicle. IEEE
Trans Autom Control 44:112-115

Skjetne R, Fossen TI, Kokotović PV (2004) Robust output
maneuvering for a class of nonlinear systems.
Automatica 40:373-383

Tarn T-J, Zhang M, Serrani A (2003) New integrability
conditions for classifying holonomic and nonholonomic
systems. In: Rantzer A, Byrnes CI (eds) Directions
in mathematical systems theory and optimization.
Springer, Berlin/Heidelberg, pp 317-331


Underactuated Robots 

Kevin M. Lynch 

Mechanical Engineering Department, 
Northwestern University, Evanston, IL, USA 

Abstract 

Underactuated robots, robots with fewer actuators
than degrees of freedom, are found in many
robot applications. This entry classifies underactuated
robots according to their dynamics and
constraints and provides an overview of controllability,
stabilization, and motion planning.

Introduction

An underactuated robot is a robot with fewer
actuators (control inputs) than the number of
variables describing its configuration (degrees of
freedom). Some robots have this property unavoidably,
while others are specifically designed
this way, perhaps to save the cost of actuators.
Examples include:

• A cart and pendulum (inverted pendulum). 

This system has two degrees of freedom, the 
linear position of the cart and the angle of 
the pendulum, but only one control input, the 
acceleration of the cart. 

• A car. A car has only two control inputs
(steering and forward/backward speed) but at
least three degrees of freedom: the position
$(x, y)$ and orientation $\theta$ of the chassis. If the
steering and/or rolling angles of the wheels
are included in the representation of the configuration,
the car has even more degrees of
freedom.

• A walking robot. When a biped steps with 
one foot in the air and the toes of the other 
foot on the ground, there is no actuator at the 
toes to directly control the angle between the 
foot and the ground. 

• A quadrotor flying robot. A quadrotor has
four control currents driving the four propellers,
but its configuration is described by
six variables: $(x, y, z)$ position and roll, pitch,
and yaw.

• An underactuated robot hand. Robot hands 
generally have many joints, up to four per 
finger for anthropomorphic hands. To reduce 
cost, a small number of motors (as few as one) 
may be used to open and close the fingers, 
with joint motions coupled by springs. 

• Robot manipulation. When a robot arm and
hand manipulate a rigid object, the entire
system, taken together, has at least six more
degrees of freedom than actuators - the six
degrees of freedom of the object.

In all underactuated robot systems of interest, 
the fewer control inputs are somehow coupled 
to all of the degrees of freedom. This entry 
focuses on coupling through the inertia matrix 
and kinematic constraints. In addition, this entry 
focuses on control of the full configuration, or
more generally the state, of the robot system. 
Other goals, such as successfully grasping an 
object with a compliant underactuated hand, are 
outside the scope of this entry. 


Classification of Underactuated 
Robots 

The robot has $n$ degrees of freedom, and its
configuration is written in local coordinates as a
column vector $q \in \mathbb{R}^n$. If the robot is described
as a kinematic system, then its state $x$ is simply $q$
and the control inputs are velocities. If the robot
is a mechanical system, then its state is
$x = [q^T, \dot{q}^T]^T$ and the control inputs are accelerations
(forces). Let $p$ denote the dimension of the state
space, where $p = n$ for a kinematic system and
$p = 2n$ for a mechanical system.

The equations of motion of an underactuated
robot can be written in the control-affine form

$$\dot{x} = f(x) + \sum_{i=1}^{m} u_i g_i(x), \quad \text{where } m < n. \qquad (1)$$

The vector field $f(x)$ is a drift vector field
describing the unforced motion of the robot,
the $g_i(x)$ are linearly independent control vector
fields describing how the controls act on the
robot, and $u = [u_1, \ldots, u_m]^T$ is the control.
Kinematic systems are commonly drift-free
($f(x) = 0$). For a mechanical system, the
drift field $f(x)$ typically includes velocities
acting on positions and gravity acting on
velocities.

The fact that the number of controls $m$ is less
than the number of degrees of freedom $n$ can be
viewed as $n - m$ constraints on the motion. For a
kinematic system, these are velocity constraints.
For a mechanical system, these are acceleration
constraints. In addition, a mechanical system may
be subject to a separate set of $k$ velocity constraints,
often called Pfaffian constraints, of the
form

$$A(q)\dot{q} = 0, \qquad (2)$$

where $A(q) \in \mathbb{R}^{k \times n}$. Such constraints arise from
conservation laws and rolling without slip, for
example.

Understanding the integrability of these constraints
is key to understanding the controllability
of underactuated robots (section “Determining
Controllability”). For example, if acceleration
constraints can be integrated to yield equivalent
velocity constraints, then the dimension of the
space of reachable velocities of the mechanical
system is reduced. If velocity constraints can
be integrated to yield equivalent configuration
constraints, then the dimension of the reachable
configuration space is reduced. If some velocity
constraints are integrable to configuration constraints,
we simply eliminate those configuration
variables from the description of the system so we
can focus on the controllable degrees of freedom.
Velocity constraints that cannot be integrated are
called nonholonomic, while configuration constraints
are called holonomic.

Based on the type of constraints, we can classify
underactuated robots into three categories -
pure kinematic, pure mechanical, and mixed
kinematic and mechanical - as described below.

Pure Kinematic

This category consists of systems with velocities
as inputs, as well as mechanical systems that can
be modeled by a kinematic reduction that has
time-differentiable velocities as controls (Bullo
and Lewis 2004; Bullo et al. 2002). (The actual
acceleration controls of the original system are
the time derivatives of these velocities.) Examples
of mechanical systems that can be reduced
to kinematic systems include systems with actuators
for every degree of freedom (fully actuated
systems, of little interest here) and mechanical
systems whose acceleration constraints can be
completely integrated to equivalent velocity constraints.

Example 1 (Upright rolling wheel) Consider a
wheel of radius $R$ rolling upright on a horizontal
plane (Fig. 1a). The center of the wheel
is $(p_x, p_y, p_z)$, and the orientation is described
by its “leaning” angle $\theta$, rolling angle $\psi$, and
heading angle $\phi$. The constraints that the wheel
remain upright and touching the plane can be
written differentially as $\dot{p}_z = 0$ and $\dot{\theta} = 0$,
but these constraints can be integrated to the
equivalent configuration constraints $p_z = R$ and
$\theta = 0$, so we eliminate these variables from the
description of the configuration and focus on the
remaining four coordinates.

Underactuated Robots, Fig. 1 (a) A wheel in
space, then confined to be upright on a horizontal
plane with coordinates $(p_x, p_y, \psi, \phi)$. (b) A top
view of a robotic snakeboard. The configuration is
given by $(p_x, p_y, \theta)$ for the board, the steering
angle $\phi$ of the wheels, and the angle $\psi$ of
the reaction wheel

Writing the configuration vector as
$q = [p_x, p_y, \psi, \phi]^T$ and the two control inputs as
the rolling velocity $u_1 = \dot{\psi}$ and the heading rate
of change $u_2 = \dot{\phi}$, the control system is

$$\dot{q} = u_1 g_1(q) + u_2 g_2(q),$$

where $g_1(q) = [R\cos\phi, R\sin\phi, 1, 0]^T$ and
$g_2(q) = [0, 0, 0, 1]^T$. Implicit in these equations
of motion are the two rolling constraints
$A(q)\dot{q} = 0$, where

$$A(q) = \begin{bmatrix} 1 & 0 & -R\cos\phi & 0 \\ 0 & 1 & -R\sin\phi & 0 \end{bmatrix}.$$

These velocity constraints cannot be integrated to
equivalent configuration constraints.

Example 2 (Reaction-wheel satellite) The three-dimensional
orientation of a satellite can be controlled
by spinning internal reaction wheels. The
controls to the reaction wheels are torques. By
conservation of angular momentum, the total angular
momentum $P$ of the satellite is subject to
the constraints

$$P = J\omega + \sum_i J_i \omega_i = \text{constant},$$

where $J$ is the inertia of the satellite body, $\omega$ is its
angular velocity, $J_i$ is the inertia of momentum
wheel $i$, and $\omega_i$ is its angular velocity. These
constraints are velocity constraints - given the
angular velocity of the momentum wheels, the
angular velocity of the satellite is known. Thus,
we can treat the original mechanical system as
a kinematic system with (differentiable) angular
velocities of the momentum wheels as inputs.
If the system satisfies $P = 0$, the kinematic
reduction is drift-free.

While satellite orientation is commonly 
controlled using three orthogonal reaction 
wheels (a fully actuated system), two reaction 
wheels suffice to control the orientation of 
the kinematic reduction in the case P = 0. 
This is apparent from the fact that successive 
rotations about two orthogonal body-fixed axes 
(e.g., body-referenced ZYZ Euler angles) are 
sufficient to arbitrarily orient a rigid body in 
space. 

Pure Mechanical 

This category consists of mechanical systems 
without any velocity constraints. 

Example 3 (3R robot arm with a passive joint)
The dynamics of a robot arm are determined by
its inertia matrix $M(q)$, from which the kinetic
energy $K = \frac{1}{2}\dot{q}^T M(q)\dot{q}$ is derived, and its
potential energy $V(q)$. If one of the joints of the
arm rotates freely without an actuator, the arm
is underactuated. One such robot is a planar arm
with two actuated joints and one passive (Bullo
and Lynch 2001; Lynch et al. 2000). For this
robot, the acceleration constraint arising from the
lack of an actuator cannot be integrated to an
equivalent velocity constraint.

Mixed Kinematic and Mechanical 

This category consists of mechanical systems
with both (1) velocity constraints and (2) acceleration
constraints that cannot be integrated.

Example 4 (Snakeboard) The snakeboard is a
skateboard with steerable wheels. The rider
can locomote without touching the ground by
twisting his or her body while steering the
wheels. The configuration of a robotic model of
the snakeboard and rider (Ostrowski et al. 1994)
consists of the position $(x, y)$ and orientation
$\theta$ of the board, the steering angle $\phi$ of the wheels
(assumed to be coupled to be equal and opposite),
and the angle $\psi$ of a reaction wheel representing
the rider (Fig. 1b). The controls are the steering
torque to the wheels and the driving torque of the
reaction wheel. This system is mixed because of
the presence of the no-slipping constraint at the
wheels.

While in some cases it is obvious whether
velocity or acceleration constraints can be integrated
to equivalent constraints on configuration
or velocity, respectively, in general this is not
trivial. Instead of attempting to determine the
integrability of constraints, we typically study
the reachable sets of the system (1). This is
the topic of controllability of nonlinear systems,
section “Determining Controllability”.

Underactuated robots can also be classified according
to the set of available controls $U \subset \mathbb{R}^m$.
For example, the control set could be a discrete
set of points in $\mathbb{R}^m$, or only nonnegative values,
or a bounded set of $\mathbb{R}^m$ containing the origin in
the interior. For simplicity, assume $u \in U = \mathbb{R}^m$.


Control Challenges

Determining Controllability

For linear systems of the form $\dot{x} = Ax + Bu$,
$x \in \mathbb{R}^p$, $u \in \mathbb{R}^m$, there is one notion of controllability,
determined by the Kalman rank condition (KRC).
If the rank of the matrix

$$[B \;\; AB \;\; A^2B \;\; \cdots \;\; A^{p-1}B]$$

is $p$, then it is possible to transfer the system from
any state to any other state in finite time.

Most underactuated systems of the form (1),
such as all of the examples given above, are nonlinear
systems, however. For nonlinear systems,
there are many possible notions of controllability
(see Bullo and Lewis 2004; Lynch et al. 2011;
Nijmeijer and van der Schaft 1990; Sussmann
1983). Some examples include:

• Small-time local accessibility (STLA) at x: For
any time $T > 0$, the reachable set starting
from $x$ at times $t < T$ contains a full-dimensional
subset of the state space.

• Small-time local controllability (STLC) at x:
For any time $T > 0$, the reachable set starting
from $x$ at times $t < T$ contains a neighborhood
of $x$.

• Global controllability: The robot can reach
any state from any other state.

STLC is strictly stronger than STLA. Neither implies
global controllability nor does global controllability
imply either of the local properties.
STLA and STLC are illustrated in Fig. 2.

Underactuated Robots, Fig. 2 Example reachable sets
in small time for systems that are STLA and STLC at x

STLA can be tested by a Taylor expansion of
flows along vector fields. A key object in this
study is the Lie bracket of two vector fields
$V_1(x)$ and $V_2(x)$, defined as the new vector
field

$$[V_1, V_2] = \frac{\partial V_2}{\partial x} V_1 - \frac{\partial V_1}{\partial x} V_2.$$

If the system were to start from $x$ and flow along
$V_1$ for a short time $\epsilon$, then $V_2$ for $\epsilon$, then $-V_1$ for $\epsilon$,
then $-V_2$ for $\epsilon$, a Taylor expansion shows that the
net motion of the system would be $\epsilon^2 [V_1, V_2](x)$
(plus terms of order $\epsilon^3$ and higher). If this direction
is neither zero nor a linear combination of $V_1$
and $V_2$, then effectively a new motion direction
has been created.

For the upright rolling wheel, the Lie bracket
of $g_1$ (forward-backward rolling) and $g_2$ (turning)
is

$$[g_1, g_2] = [R\sin\phi, -R\cos\phi, 0, 0]^T,$$

a sideways “parallel parking” motion. This new
direction increases the dimension of the locally
reachable set beyond what could be reached by a
local linearization of the nonlinear system.
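The relation between the four-segment maneuver and the Lie bracket can be checked numerically for the rolling wheel. The following sketch assumes a wheel radius of 1 and the coordinate ordering $(p_x, p_y, \psi, \phi)$; the helper names, the finite-difference Jacobian, and the RK4 integrator are choices made for this illustration.

```python
import math

R = 1.0  # wheel radius (assumed value)


def g1(q):
    """Rolling vector field for q = (p_x, p_y, psi, phi)."""
    phi = q[3]
    return [R * math.cos(phi), R * math.sin(phi), 1.0, 0.0]


def g2(q):
    """Turning vector field."""
    return [0.0, 0.0, 0.0, 1.0]


def flow(field, q, duration, sign=1.0, steps=100):
    """RK4 integration of q' = sign * field(q) over the given duration."""
    h = duration / steps
    q = list(q)
    for _ in range(steps):
        k1 = [sign * v for v in field(q)]
        k2 = [sign * v for v in field([a + 0.5 * h * b for a, b in zip(q, k1)])]
        k3 = [sign * v for v in field([a + 0.5 * h * b for a, b in zip(q, k2)])]
        k4 = [sign * v for v in field([a + h * b for a, b in zip(q, k3)])]
        q = [a + h / 6.0 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(q, k1, k2, k3, k4)]
    return q


def directional_derivative(field, q, v, h=1e-6):
    """(d field / dq) v by central differences."""
    plus = field([a + h * b for a, b in zip(q, v)])
    minus = field([a - h * b for a, b in zip(q, v)])
    return [(a - b) / (2.0 * h) for a, b in zip(plus, minus)]


def lie_bracket(v1, v2, q):
    """[V1, V2] = (dV2/dq) V1 - (dV1/dq) V2 evaluated at q."""
    first = directional_derivative(v2, q, v1(q))
    second = directional_derivative(v1, q, v2(q))
    return [a - b for a, b in zip(first, second)]


eps = 0.01
q0 = [0.0, 0.0, 0.0, 0.3]
q = flow(g1, q0, eps)             # roll forward
q = flow(g2, q, eps)              # turn
q = flow(g1, q, eps, sign=-1.0)   # roll backward
q = flow(g2, q, eps, sign=-1.0)   # turn back
net = [a - b for a, b in zip(q, q0)]
predicted = [eps ** 2 * c for c in lie_bracket(g1, g2, q0)]
print("net motion:", net)
print("eps^2 * bracket:", predicted)
```

The two printed vectors agree up to terms of order $\epsilon^3$, and the rolling and heading angles return to their initial values while the contact point has shifted sideways.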

Roughly speaking, the Lie algebra of a set
of vector fields $\mathcal{V}$ is the set of vector fields $\mathcal{V}$,
all iterated Lie brackets of these vector fields,
and their linear combinations. For example, the
Lie algebra of $\mathcal{V} = \{g_1, g_2\}$ includes $[g_1, g_2]$,
$[g_1, [g_1, g_2]]$, $[g_2, [g_1, [g_1, g_2]]]$, etc., as well as
their linear combinations. Deeper Lie brackets
correspond to higher-order terms in the Taylor
expansion of flows.

With these concepts, a theorem due to Chow
(1939) says that a system (1) satisfies STLA
at $x$ if the dimension of the Lie algebra of
$\{f, g_1, \ldots, g_m\}$ at $x$ is $p$, the dimension of the
state space. This is known as the Lie algebra rank
condition (LARC). Most underactuated systems
of interest satisfy the LARC but not the KRC.
For the upright rolling wheel, the linearization
at any $q$ fails the KRC, but the four-dimensional
configuration space is spanned by $g_1$, $g_2$, $[g_1, g_2]$,
and $[g_2, [g_1, g_2]]$ at all $q$, satisfying the LARC.
Therefore the system is STLA at all points.
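The rank claims for the rolling wheel can be verified with a few lines of code. In the sketch below, the second-order bracket $[g_2, [g_1, g_2]] = [R\cos\phi, R\sin\phi, 0, 0]^T$ is worked out by hand from the bracket definition (it is not stated in the text above), the wheel radius is assumed to be 1, and the function names are hypothetical. Because the system is drift-free, its linearization at zero control has $A = 0$, so the Kalman matrix has the same rank as $B = [g_1 \; g_2]$.

```python
import math


def rank(rows, tol=1e-9):
    """Rank of a list of vectors by Gaussian elimination with pivoting."""
    m = [list(r) for r in rows]
    rk = 0
    for col in range(len(m[0])):
        if rk == len(m):
            break
        pivot = max(range(rk, len(m)), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


def wheel_fields(phi, radius=1.0):
    """Control fields and brackets for q = (p_x, p_y, psi, phi)."""
    c, s = math.cos(phi), math.sin(phi)
    g1 = [radius * c, radius * s, 1.0, 0.0]
    g2 = [0.0, 0.0, 0.0, 1.0]
    b12 = [radius * s, -radius * c, 0.0, 0.0]   # [g1, g2]
    b212 = [radius * c, radius * s, 0.0, 0.0]   # [g2, [g1, g2]], hand-derived
    return g1, g2, b12, b212


for phi in (0.0, 0.7, math.pi / 2, 2.5):
    g1, g2, b12, b212 = wheel_fields(phi)
    krc_rank = rank([g1, g2])               # linearization: A = 0, B = [g1 g2]
    larc_rank = rank([g1, g2, b12, b212])   # fields plus brackets
    print(f"phi={phi:.2f}: KRC rank {krc_rank}, LARC rank {larc_rank}")
```

For every heading tested, the linearization spans only two directions while the fields together with the two brackets span all four.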

The STLA property can be strengthened to
STLC if the system additionally satisfies certain
symmetry properties, allowing it to proceed both
forward and backward along Lie bracket directions.
For example, if $f(x) = 0$ and the control
set $U$ contains the origin in the interior, the LARC
implies STLC. This is the case for the upright
rolling wheel. More general notions of symmetry
have also been derived (e.g., Sussmann 1987).

For mechanical systems, STLC can only hold
at zero-velocity states where $f(x) = 0$. In
addition, velocity constraints may prevent the
system from reaching a $2n$-dimensional set in
state space. A more relevant question may be
whether the configuration alone can be locally
controlled at zero-velocity states. Specialized
Lie-algebraic controllability tests have been
developed for configuration controllability of
mechanical and mixed systems (Bullo and Lewis
2004; Bullo and Lynch 2001; Bullo et al. 2002;
Lewis 2000).

Global controllability results often derive from 
STLC at all states for drift-free systems or from 
STLA and global properties of the vector fields 
or the topology of the state space (Choset et al. 
2005). 

Feedback Stabilization 

For some underactuated robots, the linearization 
at a state x may satisfy the KRC. An example 
is an inverted pendulum linearized at a balanced 
equilibrium state. In this case, it is possible to 
derive a linear feedback controller, based on the 
linearization, to stabilize the balanced state. 
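As a concrete check of the KRC, the sketch below uses a commonly used simplified cart-pendulum model, which is an assumption of this example rather than a model given in this entry: a point-mass pendulum of length `l`, no friction, cart acceleration as the input, and the pendulum angle measured from upright. With state $[s, \theta, \dot{s}, \dot{\theta}]$, the linearization at the balanced state is $\ddot{s} = u$, $\ddot{\theta} = (g/l)\theta - u/l$.

```python
def mat_vec(a, v):
    """Matrix-vector product for lists of lists."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def rank(rows, tol=1e-9):
    """Rank of a list of vectors by Gaussian elimination with pivoting."""
    m = [list(r) for r in rows]
    rk = 0
    for col in range(len(m[0])):
        if rk == len(m):
            break
        pivot = max(range(rk, len(m)), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


def kalman_vectors(a, b):
    """Return the vectors B, AB, ..., A^(p-1) B for a single input."""
    vectors = [list(b)]
    for _ in range(len(a) - 1):
        vectors.append(mat_vec(a, vectors[-1]))
    return vectors


g, l = 9.81, 1.0  # gravity and pendulum length (assumed values)
A = [[0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0, 0.0],
     [0.0, g / l, 0.0, 0.0]]
B = [0.0, 0.0, 1.0, -1.0 / l]

krc_rank = rank(kalman_vectors(A, B))
print("rank of [B AB A^2B A^3B]:", krc_rank)
```

A rank of 4 equals the state dimension, so under these modeling assumptions the linearization is controllable and standard linear designs such as pole placement or LQR apply near the balanced state.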

For many underactuated systems of interest, 
however, the linearization at a desired state is 
not controllable. For such systems, a famous 
theorem due to Brockett (1983), plus subsequent 
strengthening, implies the following: 

Theorem 1 For any drift-free underactuated 
kinematic system of the form (1), there exists no 
time-invariant continuous state feedback law that 
stabilizes the origin. 

For example, there exists no continuous state- 
feedback control law that can stabilize the upright 
rolling wheel to a desired configuration. 

This obstruction to stabilizability has resulted
in a number of different approaches to feedback
control of underactuated systems, including (1)
time-varying feedback control laws, (2) feedback
control laws that are discontinuous in the
state, and (3) two-degree-of-freedom controllers
consisting of a motion planner plus a feedback
controller for the easier problem of stabilizing
the nominal trajectory. Strategies for planning
nominal motions for two-degree-of-freedom controllers
are discussed next.

Motion Planning 

Given an initial state $x(0) = x_{start}$ and a goal
state $x_{goal}$, the motion planning problem for a
system $\dot{x} = h(x, u)$ is to find a control history
$u : [0, T] \to U$ such that

$$x_{goal} = x(0) + \int_0^T h(x(s), u(s))\,ds$$

while avoiding any obstacles that may be present
in the environment. It may also be desired to
minimize some notion of cost,

$$J = \int_0^T L(x(s), u(s))\,ds.$$

One choice of $L$ is $u^T(s)u(s)$, the square of the
control effort.
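To make the problem statement concrete, the sketch below evaluates a candidate control history for the upright rolling wheel: it integrates the state forward and accumulates the control-effort cost. The piecewise-constant discretization, the RK4 integrator, the unit wheel radius, and the function names are assumptions made for this illustration; a planner would search over `controls` so that the final state matches the goal.

```python
import math

R = 1.0  # wheel radius (assumed value)


def wheel_dynamics(q, u):
    """h(q, u) = u1 * g1(q) + u2 * g2(q) for q = (p_x, p_y, psi, phi)."""
    phi = q[3]
    u1, u2 = u
    return [u1 * R * math.cos(phi), u1 * R * math.sin(phi), u1, u2]


def rollout(h, x0, controls, dt):
    """Integrate x' = h(x, u) with piecewise-constant controls (RK4).

    Returns the final state and the cost, the integral of u^T u.
    """
    x = list(x0)
    cost = 0.0
    for u in controls:
        k1 = h(x, u)
        k2 = h([a + 0.5 * dt * b for a, b in zip(x, k1)], u)
        k3 = h([a + 0.5 * dt * b for a, b in zip(x, k2)], u)
        k4 = h([a + dt * b for a, b in zip(x, k3)], u)
        x = [a + dt / 6.0 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(x, k1, k2, k3, k4)]
        cost += sum(ui * ui for ui in u) * dt
    return x, cost


# Candidate control history: roll at unit rate while turning a quarter turn.
controls = [(1.0, math.pi / 2)] * 100
x_final, cost = rollout(wheel_dynamics, [0.0, 0.0, 0.0, 0.0], controls, 0.01)
print("final state:", x_final)
print("control effort:", cost)
```

For this particular history the contact point traces a quarter circle, so the final position can be checked against the closed-form value $2R/\pi$ in both coordinates.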

Ideally the motion planning method would 
be complete (guaranteed to find a solution in 
finite time if one exists) or probabilistically 
complete (if a solution exists, the probability 
of finding a solution goes to one as time goes to 
infinity). 

A variety of approaches to motion planning 
have been proposed in the robotics literature. 
Approaches that apply to underactuated systems 
include: 

• Search-based methods. A popular class of
search-based methods are rapidly exploring
random trees (RRTs) and variants (LaValle
and Kuffner 2001). These approaches offer
probabilistic completeness for many systems,
including systems with obstacles, but naive
implementations may be slow to find solutions,
and the solutions generally do not satisfy
optimality criteria.

• Numerical optimization. The control history
can be converted to a finite parameterization
using representations such as polynomials, cubic
B-splines, wavelets, and truncated Fourier
series. Numerical optimization methods can
then be applied to solve the two-point boundary
value problem while minimizing a cost
function. Gradient-based numerical optimization
methods may yield locally optimal solutions,
but they may suffer from numerical
convergence problems, and they may get stuck
in local minima depending on an initial guess.
Optimization methods that do not use gradient
information potentially offer globally optimal
solutions, but typically at the expense of significantly
longer computation times.

• Fictitious input methods. These methods 
assume that there is a direct control input 
available for each Lie bracket motion 
direction. These fictitious inputs are then 
converted to a sequence of feasible inputs 
utilizing the Campbell-Baker-Hausdorff- 
Dynkin expansion of flows (Lafferriere and 
Sussmann 1991). In general, these methods 
require iterative application to account for 
errors in the approximate conversion. 

• Trajectory transformation methods. One way 
to deal with obstacles is to first use a global 
motion planner that is complete under the 
assumption that the robot has no motion 
constraints. Then the template unconstrained 
solution is iteratively subdivided into smaller 
pieces, with each piece replaced by an 
obstacle-free feasible trajectory generated 
by a local planner. If the system is drift- 
free and STLC at all configurations, then it 
is possible to develop a local planner that 
guarantees success of the transformation 
from an unconstrained trajectory to a feasible 
trajectory as the subdivisions get small enough 
(Laumond et al. 1994). 

Often it is possible to exploit structure of the 
equations of motion beyond the general form (1). 
Making use of extra structure can reduce the 
computational complexity of motion planning. 

• Chained form, sinusoidal controls, and averaging.
Certain drift-free kinematic systems
can be transformed to a canonical chained
form. For systems in such a form, sinusoidal
controls of integrally related frequencies can
be chosen to drive one of the configuration
variables to its desired value while having zero
net effect on configuration variables already at
their desired value. In this way, configuration
variables can be driven sequentially to their
desired values (Murray and Sastry 1993).

For many underactuated systems, sinusoidal
controls can be used to achieve
approximate motion in each Lie bracket
direction needed to complete the LARC. The
resulting periodic motions are sometimes
called gaits, and motion planning can be
achieved using a finite set of gaits (Bullo
and Lewis 2004; Ostrowski et al. 1994).

• Differentially flat systems. For certain underactuated
systems with $u \in \mathbb{R}^m$, there exists a
set of $m$ functions $w_i$ of the state, the control,
and its derivatives,

$$w_i(x, u, \dot{u}, \ddot{u}, \ldots, u^{(k)}), \quad i = 1, \ldots, m,$$

such that the states and control inputs can
be expressed as functions of $w$ and its time
derivatives. The $w_i$ are called flat outputs. The
motion planning problem is to find $w(t)$, $t \in
[0, T]$, such that $w(0), \dot{w}(0), \ddot{w}(0), \ldots$ and
$w(T), \dot{w}(T), \ddot{w}(T), \ldots$ satisfy the constraints
specified by $x_{\text{start}}$ and $x_{\text{goal}}$. The problem
changes from constrained motion planning
in the $n$-dimensional state space to finding
a curve satisfying start and end constraints
on $w$ and its derivatives (Fliess et al. 1995;
Sira-Ramirez and Agrawal 2004).

• Kinematic reductions. Motion planning in 
configuration space is a lower-dimensional 
problem than motion planning in 
configuration-velocity space. Therefore, when 
a mechanical system can be reduced to 
a kinematic equivalent, motion planning 
can be more efficient. Examples include 
mechanical systems that can be fully reduced 
to a kinematic system (like the reaction- 
wheel satellite) and mechanical systems 
that admit rank-1 kinematic reductions:
vector fields on configuration space that
can be followed at any speed, despite the 
underactuation constraints. These vector fields 
become primitives for motion planning on 
configuration space (Bullo and Lewis 2004; 
Bullo and Lynch 2001; Bullo et al. 2002; 
Choset et al. 2005). 
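
The differential flatness idea described above can be illustrated with a minimal sketch. It assumes a kinematic unicycle (a standard textbook example, not a system treated in this entry) whose flat outputs are the planar position (x, y); the heading and both inputs are recovered from the flat outputs and their first two derivatives. The cubic-polynomial parameterization, the boundary speeds, and all function names are illustrative assumptions.

```python
import math

def cubic_coeffs(p0, v0, pT, vT, T):
    """Coefficients of p(t) = a0 + a1*t + a2*t^2 + a3*t^3 that match
    position and velocity at t = 0 and t = T."""
    a2 = (3.0 * (pT - p0) - (2.0 * v0 + vT) * T) / T**2
    a3 = (-2.0 * (pT - p0) + (v0 + vT) * T) / T**3
    return p0, v0, a2, a3

def flat_output(c, t):
    """Return (w, w_dot, w_ddot) of the cubic with coefficients c at time t."""
    a0, a1, a2, a3 = c
    return (a0 + a1 * t + a2 * t**2 + a3 * t**3,
            a1 + 2.0 * a2 * t + 3.0 * a3 * t**2,
            2.0 * a2 + 6.0 * a3 * t)

def unicycle_from_flat(cx, cy, t):
    """Recover the state (x, y, theta) and inputs (v, omega) from the flat
    outputs. Assumes the planned speed never reaches zero."""
    x, xd, xdd = flat_output(cx, t)
    y, yd, ydd = flat_output(cy, t)
    v = math.hypot(xd, yd)
    theta = math.atan2(yd, xd)
    omega = (xd * ydd - yd * xdd) / (xd**2 + yd**2)
    return (x, y, theta), (v, omega)

def plan(start, goal, T, v_start=1.0, v_goal=1.0):
    """Fit one cubic per flat output so that the start and goal poses
    (x, y, theta) are met with the assumed boundary speeds."""
    (x0, y0, th0), (x1, y1, th1) = start, goal
    cx = cubic_coeffs(x0, v_start * math.cos(th0), x1, v_goal * math.cos(th1), T)
    cy = cubic_coeffs(y0, v_start * math.sin(th0), y1, v_goal * math.sin(th1), T)
    return cx, cy

def rollout(cx, cy, start, T, steps=2000):
    """Integrate the unicycle with the recovered inputs (midpoint rule)
    to check that the planned inputs actually reach the goal."""
    x, y, th = start
    dt = T / steps
    for k in range(steps):
        _, (v, om) = unicycle_from_flat(cx, cy, (k + 0.5) * dt)
        th_mid = th + 0.5 * om * dt
        x += v * math.cos(th_mid) * dt
        y += v * math.sin(th_mid) * dt
        th += om * dt
    return x, y, th
```

For example, `plan((0, 0, 0), (2, 1, 0), 2.0)` followed by `rollout` should end close to the goal pose. No constrained search over the state space is needed; the only work is fitting curves to boundary conditions on the flat outputs.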

Summary and Future Directions 

Underactuated systems arise in all areas of 
robotics, including robot manipulation and 
aerial, ground, and underwater locomotion. 
Underactuation raises a number of challenging 
issues in robot motion planning and control.
While significant progress has been made, further
research is needed on computationally efficient 
motion planning and robust stabilization of 
nominal trajectories. In addition, although this 
entry focuses on systems that can be described 
by a single set of dynamics, many interesting 
underactuated systems are hybrid systems 
that experience changing contact constraints. 
Examples include biped robots striding from 
one foot to the next and robot manipulators that 
manipulate objects with changing contact modes 
(grasping, rolling, pushing, etc.). Further work is 
needed to incorporate contact models, beyond 
simple kinematic constraints, and changing 
equations of motion in motion planning and 
control of hybrid underactuated systems. 

Cross-References 

► Controllability and Observability 

► Differential Geometric Methods in Nonlinear 
Control 

► Feedback Linearization of Nonlinear Systems 

► Feedback Stabilization of Nonlinear Systems 

► Hybrid Dynamical Systems, Feedback Control of

► Lie Algebraic Methods in Nonlinear Control 

► Nonlinear Zero Dynamics 

► Underactuated Marine Control Systems 

► Walking Robots 

► Wheeled Robots 


Recommended Reading 

Introductions to underactuated robot systems can 
be found in Choset et al. (2005), Lynch et al. 
(2011), and Murray et al. (1994). 

While this entry focuses on configuration
spaces modeled locally as $\mathbb{R}^n$, most robotic
systems consist of rigid bodies whose positions
and orientations can be described globally as
elements of the Lie group SE(3) or one of its
subgroups: SE(2), SO(3), or SO(2). Geometric
methods for control of underactuated systems
make use of the extra structure of Lie groups and
their Lie algebras, symmetries, and concepts
from geometric mechanics such as tangent
and cotangent bundles, Riemannian metrics on
manifolds, symplectic manifolds, connections,
fiber bundles, covariant derivatives, etc. Excellent
treatments can be found in Bloch et al. (2003),
Bullo and Lewis (2004), and Murray et al. (1994).

Abstract

Validation and verification (V&V) of advanced
control systems is required for their use in fielded
systems. A comprehensive V&V process involving
analysis, simulation, and experimental testing
should be used to assess closed-loop system
performance and identify system limitations. This
entry discusses current V&V methods and tools
as well as future research directions for
safety-critical control applications.

Introduction

Control system validation and verification (V&V)
is an assurance that the closed-loop system (i.e.,
the control system acting on the plant being
controlled) remains stable and performs within
acceptable performance metrics across the
operational region of application. Basic definitions of
validation and verification are given below (IEEE
2011).

Validation: The assurance that a product, service,
or system meets the needs of the customer
and other identified stakeholders. It often involves
acceptance and suitability with external
customers.

Verification: The evaluation of whether or not
a product, service, or system complies with a regulation,
requirement, specification, or imposed
condition. It is often an internal process.

Control system validation can therefore be
thought of as a confirmation that the algorithms
are performing their intended functions and an
affirmation of their effectiveness in performing
these functions. Validation is a rigorous evaluation
process that should involve clearly identifying
system limitations, including regions of
operation within which stability or acceptable
levels of performance are not guaranteed.
Verification can be thought of as a confirmation that the
system implementation in software and hardware
is correctly executing the algorithms as designed
(and validated). This includes a rigorous evaluation
of system requirements and specifications
and a clear determination of whether or not they
have been met. V&V methods include analysis,
simulation, and experimental testing, which
are (ideally) applied in an integrated or iterative
manner to corroborate results across methods of
evaluation. In the case of aircraft flight control
and other safety-critical control applications, the
V&V process must ultimately lead to system
certification. The following subsections summarize
control system V&V analytical, simulation, and
experimental test methods in terms of the current
(or recommended) state of practice. A summary
and some future research directions for
safety-critical control applications are also provided.

J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9,
© Springer-Verlag London 2015

Control System Validation 

Validation methods, involving analysis, simulation,
and experimental testing, are utilized to
ensure against errors and significant deficiencies in
the underlying control system algorithms under
realistic operational conditions for the intended
application. System weaknesses and limitations
are also identified through the validation
process. Control system validation begins with an
analysis of closed-loop system stability,
performance, and robustness. Linear systems theory
forms the basis for analytical stability and
robustness methods and the associated software
tools that are currently available for closed-loop
system validation. In current practice, stability
of nonlinear systems is determined by evaluating
the stability of the linearized closed-loop
system at numerous equilibrium points across
the operating range (or envelope) of the system.
Closed-loop stability is assessed by computing
the eigenvalues of the linearized closed-loop
system at a number of equilibrium points in the
region of operation. It should be noted, however,
that stability is not guaranteed between
the operating points analyzed. Moreover, if the
control system utilizes gain scheduling across the
operating envelope, stability cannot be guaranteed
for interpolated gains between the design
points. Relative stability is determined by gain
and phase margins for single-input, single-output
(SISO) systems and by the multivariable stability
margin (see, e.g., Lavretsky and Wise 2013) for
multiple-input, multiple-output (MIMO, or
multivariable) systems. Time-delay margins, defined
as the minimum time delay required to destabilize
the closed-loop system, can be determined from
phase margin and verified in nonlinear simulation.
Stability robustness is assessed in terms of
uncertainties, including parametric uncertainties
and unmodeled dynamics in the mathematical
model of the plant. Advanced robustness analysis
methods based on the structured singular value
(Zhou et al. 1996) require the uncertainties to
be separated from the nominal plant into what is
termed a linear fractional transformation (LFT).
Advanced robustness methods can be used to
assess stability and performance robustness, as
well as worst-case combinations of uncertainties
that result in destabilization, loss of performance,
or the lowest robustness margins. LFT
models can be formulated for the analysis of
nonlinearities, expressed as multivariate polynomials,
around a trim condition or over subregions
within the operational envelope. Stability over
a region or subregion of the operating envelope
can be guaranteed using linear parameter-varying
(LPV) methods (see Apkarian and Gahinet 1995;
Packard 1994; Rugh and Shamma 2000; Wu
et al. 1996). Nonlinear stability and control are
addressed more fully in Slotine and Li (1991)
and Khalil (2002), as well as in numerous other
texts. Analytical methods and software tools are
available in Matlab®, using the Control System
Toolbox™ and Robust Control Toolbox™.
LFR/LFT modeling methods and software tools
are described in Magni (2004, 2006), Hecker
et al. (2005), Varga et al. (1998), and Belcastro
et al. (2005) and provided in the Robust Control
Toolbox®. An LPV Toolbox™ is also under
development and will become available soon (see
Balas et al. 2013b).
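
Two of the analysis steps described above, the eigenvalue check at a grid of operating points and the time-delay margin obtained from the phase margin, can be sketched in a few lines. This is only an illustration under stated assumptions: the second-order plant with scheduling parameter `p`, the PD gains, and the loop transfer function L(s) = 1/(s(s + 1)) are all hypothetical, and the code uses the Python standard library rather than the Matlab toolboxes named in the text.

```python
import cmath
import math

def eig2(a11, a12, a21, a22):
    """Eigenvalues of a 2x2 real matrix from its characteristic polynomial."""
    tr, det = a11 + a22, a11 * a22 - a12 * a21
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0

def closed_loop_A(p, kp, kd):
    """Hypothetical linearization at operating point p of the plant
    q_ddot = p*q - 0.5*q_dot + u under PD feedback u = -kp*q - kd*q_dot.
    Returned row by row as (a11, a12, a21, a22)."""
    return (0.0, 1.0, p - kp, -0.5 - kd)

def unstable_points(points, kp, kd):
    """Operating points whose linearized closed loop has an eigenvalue
    with nonnegative real part."""
    bad = []
    for p in points:
        if max(ev.real for ev in eig2(*closed_loop_A(p, kp, kd))) >= 0.0:
            bad.append(p)
    return bad

def loop_gain(w):
    """Hypothetical loop transfer function L(s) = 1 / (s (s + 1)) at s = jw."""
    s = 1j * w
    return 1.0 / (s * (s + 1.0))

def delay_margin(L, w_lo=1e-3, w_hi=1e3):
    """Return (delay margin, gain-crossover frequency). The delay margin is
    the phase margin in radians divided by the crossover frequency in rad/s.
    Assumes |L(jw)| falls monotonically through 1 on [w_lo, w_hi] and that
    the phase at crossover lies between -pi and 0."""
    for _ in range(200):                 # bisection on a logarithmic scale
        w_mid = math.sqrt(w_lo * w_hi)
        if abs(L(w_mid)) > 1.0:
            w_lo = w_mid
        else:
            w_hi = w_mid
    w_gc = math.sqrt(w_lo * w_hi)
    phase_margin = math.pi + cmath.phase(L(w_gc))
    return phase_margin / w_gc, w_gc
```

With gains `kp = 4` and `kd = 1`, `unstable_points(range(7), 4.0, 1.0)` flags the grid points with `p >= 4`. As the text cautions, a clean result on the grid says nothing about stability between grid points.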

Performance is usually assessed using a high-fidelity
simulation of the plant under expected
operational conditions. The simulation should
include nonlinear plant dynamics, noise, disturbances,
and any other phenomena anticipated under
operation. Nominal performance is assessed
in terms of the control system design objectives,
which typically include (at a minimum)
closed-loop steady-state tracking and transient response
characteristics. Transient response characteristics
typically include delay time, rise time, peak time,
maximum overshoot, and settling time (see, e.g.,
Ogata 1970). Steady-state tracking error is also
typically assessed. Performance robustness can
be evaluated using Monte Carlo simulation techniques
(see, e.g., Kroese et al. 2011), in which
parameters and operational conditions are varied
over numerous simulation runs in order to
statistically evaluate an extensive set of uncertainties
and operational variations. Stability robustness
can also be assessed using Monte Carlo
simulation techniques and by utilizing worst-case
uncertainties and time-delay margins obtained
during analysis. If the plant is a vehicle or
robotic manipulator to be operated by a human,
the handling qualities must also be evaluated
to assess human-system interfaces and interactions.
A real-time high-fidelity simulation with
a human interface representative of the operational
environment is required for this evaluation.
For aircraft, piloted simulation evaluations are
conducted using a cockpit mock-up, and handling
qualities are assessed under various scenarios
using the Cooper-Harper Scale (Cooper and
Harper 1969). Susceptibility to operator-induced
oscillations, for example, resulting from time delays
in the controlled response, may typically be
uncovered using operator-in-the-loop simulation
evaluations.
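
The Monte Carlo evaluation of performance robustness described above can be sketched as follows. The example is deliberately small and rests on assumptions that are not part of this entry: the closed loop is reduced to a standard second-order model, the uncertain parameters are the damping ratio and natural frequency with hypothetical uniform ranges, and the only metric is maximum overshoot with a hypothetical 25 % limit.

```python
import random

def step_overshoot(zeta, wn, t_end=10.0, dt=1e-3):
    """Percent overshoot of the unit-step response of
    y'' + 2*zeta*wn*y' + wn^2*y = wn^2, integrated with semi-implicit Euler."""
    y, yd, peak = 0.0, 0.0, 0.0
    for _ in range(int(t_end / dt)):
        ydd = wn * wn * (1.0 - y) - 2.0 * zeta * wn * yd
        yd += ydd * dt
        y += yd * dt
        peak = max(peak, y)
    return max(0.0, (peak - 1.0) * 100.0)

def monte_carlo_overshoot(n_runs, seed=0, limit=25.0):
    """Sample the uncertain parameters, simulate each run, and return the
    mean overshoot, the worst overshoot, and the fraction of runs whose
    overshoot stays within `limit` percent."""
    rng = random.Random(seed)            # fixed seed for repeatable batches
    results = []
    for _ in range(n_runs):
        zeta = rng.uniform(0.4, 0.7)     # hypothetical damping-ratio range
        wn = rng.uniform(1.6, 2.4)       # hypothetical range around 2 rad/s
        results.append(step_overshoot(zeta, wn))
    mean = sum(results) / n_runs
    worst = max(results)
    frac_ok = sum(r <= limit for r in results) / n_runs
    return mean, worst, frac_ok
```

In practice each run would call the high-fidelity nonlinear simulation rather than a second-order surrogate, and the sampled quantities would include operating conditions, noise, and disturbances as well as plant parameters.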

Experimental testing should be conducted 
under realistic conditions that cover the entire 
(potential) operational space of the plant 
being controlled in order to assess realistic 
operational performance. For aircraft, this 
includes flight testing using full-scale and/or 
subscale test vehicles under nominal and 
off-nominal conditions. If the analytical and
simulation evaluations provide a good match
to the experimental evaluations, the test matrix
can consist of key high-risk conditions to
confirm desired behavior.

Validation methods should be applied in 
an iterative manner comparing results from 
the analysis, simulation, and experimental 
tests and going back to reevaluate in one 
domain based on results from another. For 
example, analysis results should provide good 
predictions of results seen in simulation and 
experimental testing. If a good match is
not obtained, the analysis model may have
to be improved or another analysis method
utilized to reduce conservatism in the result.
Similarly, simulation results should provide a
good prediction of experimental test results.

Control System Verification 

System verification is ideally performed by or in 
collaboration with a computer science specialist 
to ensure against errors in the software/hardware 
implementation of the control algorithms. 
Control system verification begins with an 
analysis (Rushby 1995, 2009) of the software 
and hardware implementation requirements to 
ensure completeness and accuracy in the system 
specification. Several refinement steps are taken 
to transform the control system requirements 
to those implementable on the actual avionics 
hardware to be fielded. Verification, consisting
of tests and analyses, is required to confirm
requirements traceability and compliance from
one refinement to another, to confirm accuracy of
the algorithms, and to assure compliance and
robustness of the final code with respect to
the original control algorithms, so that
no errors are introduced by the refinement
itself or by the target computing platform.
Formal analysis methods can be used to evaluate 
software logic and other software mechanisms for 
correctness under all operational conditions and 
to provide correctness proofs. Formal methods 
can also be utilized for model checking to verify 
system properties through an exhaustive search 
of all possible states that can be entered during 
execution (Berard et al. 1998). One software
verification tool is PVS (Owre et al. 1992),
which was developed by SRI International and is
available on the Internet as open-source software.
Other methods and tools are also available from 
SRI International, as well as other sources. 
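
The idea of model checking by exhaustive state search can be illustrated with a toy explicit-state checker. This is a sketch only and is unrelated to PVS or to any tool cited above; the three-mode engagement logic, the sensor flag, and the safety property ("never engaged while the sensor has failed") are hypothetical.

```python
from collections import deque

def check_invariant(initial, successors, invariant):
    """Breadth-first search over every reachable state. Returns (True, None)
    if `invariant` holds in all of them, otherwise (False, path), where path
    is a shortest sequence of states leading to a violation."""
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return False, path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return True, None

def make_successors(disengage_on_fault):
    """Transition relation of a hypothetical mode logic. A state is
    (mode, sensor_ok) with mode in {"off", "armed", "engaged"}."""
    def successors(state):
        mode, sensor_ok = state
        nxt = []
        if mode == "off" and sensor_ok:
            nxt.append(("armed", sensor_ok))
        if mode == "armed" and sensor_ok:
            nxt.append(("engaged", sensor_ok))
        if mode != "off":
            nxt.append(("off", sensor_ok))       # operator disengages
        if sensor_ok:                            # a sensor fault occurs
            drop = disengage_on_fault and mode == "engaged"
            nxt.append(("off" if drop else mode, False))
        else:
            nxt.append((mode, True))             # the sensor recovers
        return nxt
    return successors

def safe(state):
    """Safety property: the system is never engaged with a failed sensor."""
    mode, sensor_ok = state
    return not (mode == "engaged" and not sensor_ok)
```

Running `check_invariant(("off", True), make_successors(False), safe)` returns a counterexample path ending in `("engaged", False)`, while the variant built with `make_successors(True)` passes. Practical model checkers apply the same principle to far larger state spaces using symbolic representations and abstraction.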

Simulation techniques are used to evaluate
software code, modules, subsystems, and the full
system. Once the software has been verified, it
is implemented on actual or representative hardware
and evaluated using hardware-in-the-loop
simulation and experimental testing. Experimental
testing should include laboratory evaluations
under all possible operational conditions and in
a relevant application environment. For aircraft,
this would include flight testing of the control
system on the actual avionics hardware to be
fielded.


Summary and Future Directions 

This entry has summarized current and 
recommended practices for control system V&V, 
including methods and tools for analysis, 
simulation, and experimental testing. The V&V 
process ensures against errors and deficiencies 
in the underlying system algorithms (validation) 
and in its software/hardware implementation 
(verification). Analysis, simulation, and experimental
testing are performed iteratively to
utilize and confirm results between evaluation 
techniques. 

A comprehensive validation process is performed
to assure control system effectiveness
across the operational envelope of the plant and
to identify system limitations and weaknesses.
Current analysis methods for control system validation
are typically based on linear systems
theory and focus on nominal operations under
model uncertainties and anticipated disturbances
(e.g., noisy measurement signals). This analysis
includes stability, performance (e.g., tracking
accuracy), and robustness. Advanced robust
control analysis methods have been applied to
safety-critical applications, such as aircraft (see
Fielding et al. 2002; Varga et al. 2012), to assess
robust stability and performance. These two
references provide a global optimization-based
worst-case approach as a “necessary condition”
technique for flight control system validation or
as a “sufficient condition” technique for invalidation.
High-fidelity nonlinear simulation evaluations
are performed in batch and real time to
assess robustness under system and operational
uncertainties and to assess interface effectiveness
for human-in-the-loop operations (if applicable).
Experimental testing under realistic operationally
relevant conditions is performed across key operating
conditions to confirm analytical and simulation
predictions.


System verification is performed to ensure
correctness of the hardware/software implementation.
Various analysis and testing methods, including
advanced formal analysis methods, are
used to assess completeness of the system requirements
and specification and software elements
(e.g., logic). Model checking techniques
are used to verify system properties. Code is
tested in simulation and on representative or actual
hardware under realistic operationally relevant
conditions.

Future research directions will enable the 
V&V of nonlinear and adaptive control systems 
(see, e.g., Hovakimyan and Cao 2010; Tallant 
et al. 2004) that improve performance under 
highly uncertain conditions, as well as the 
V&V of complex integrated safety-critical 
systems for operation under off-nominal and 
hazardous conditions (see Belcastro 2010, 
2012). These systems will include diagnostic 
and prognostic algorithms for integrated vehicle 
health management, resilient control systems that 
enable the detection and mitigation of multiple 
hazards, supervisory systems that provide safety 
assurance (for safety-critical operations), and 
intelligent interface and decision-based systems 
that enable human-optional and fully autonomous 
operations. These systems will inherently involve 
stochastic decision-making and nonlinear and 
adaptive control algorithms. V&V of these future 
systems poses significant technical challenges 
and is the subject of current research. Some 
of these challenges include the following: (1) 
development and validation of multidisciplinary 
simulation models for characterizing hazardous 
condition effects; (2) validation of adaptive, 
diagnostic/prognostic, and reasoning algorithms 
under numerous off-nominal and hazardous 
conditions; (3) verification of software-intensive 
highly complex systems; and (4) determining 
a level of confidence in V&V results for 
hazardous application domains that cannot be 
fully replicated during the evaluations. 

Research on modeling and simulation methods
is being performed to characterize multidisciplinary
effects of off-nominal and hazardous
conditions, and validation of these models can
be difficult. For aircraft applications, hazardous
conditions relate to aircraft loss of control (LOC)
and precursor conditions. LOC is a complex and
highly nonlinear phenomenon for which there is
little available data. Hazardous conditions considered
in this research include vehicle upset
conditions (e.g., stall/departure), vehicle impairment
conditions (e.g., failure, icing, and damage),
and external disturbances (e.g., inclement
weather and wake vortices). Multidisciplinary
models under development include aerodynamic,
propulsion, and airframe structure effects. For
example, simulation models for characterizing
aircraft flight dynamics and control effects under
upset conditions are currently being developed
(see, e.g., Foster et al. 2005; Groen et al. 2012),
as well as propulsion effects resulting from the
associated reduced flow conditions (Liu et al.
2013). Model validation is being performed using
available flight and accident data, as well as experimental
testing in the laboratory and through
subscale and full-scale flight testing. The enhanced
high-fidelity simulation models resulting
from this research will be used in the development
and validation of onboard control systems
designed to detect and mitigate these hazards.

Current research efforts for validating
the above future systems include nonlinear
robustness analysis methods and software tools
(see Chakraborty et al. 2011a, b; Balas
et al. 2013a; Packard et al. 2010; Summers
et al. 2013), nonlinear analysis methods
for controlled systems (Gill et al. 2012;
Kwatny et al. 2013), uncertainty quantification
and robustness analysis methods for mixed
uncertainties and multiple objectives (Kenny
et al. 2012), and the analysis of stochastic
filters (see, e.g., Reif et al. 1999; Rhudy et al.
2013a, b, c). The term “mixed uncertainty”
refers to aleatory and epistemic uncertainties.
Aleatory uncertainties are typically stochastic
(or statistical) and represent operational or
environmental uncertainties (e.g., turbulence)
that cannot be altered or controlled during
experiments or fielded applications. Epistemic
uncertainties are typically deterministic and arise
from lack of knowledge about the plant resulting
from modeling assumptions, neglected effects
(e.g., unmodeled dynamics), and parametric
uncertainties resulting from inaccurate measurements
or operational variability. These analysis
methods and tools will be used iteratively with
simulation evaluations and experimental testing
methods, as described herein, to comprehensively
assess nonlinear and adaptive control systems
that enable resilience under multiple hazards.
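
The distinction between aleatory and epistemic uncertainty can be made concrete with a double-loop sampling sketch. This is one common way to treat mixed uncertainty and is not claimed to be the method of the references cited above. All specifics are hypothetical: the epistemic quantity is a loop gain `k` known only to lie in an interval, the aleatory quantity is a zero-mean Gaussian disturbance, and the requirement is a bound `e_max` on the steady-state error `d / k`.

```python
import random

def failure_probability(k, sigma, e_max, n, rng):
    """Inner (aleatory) loop: Monte Carlo estimate of the probability that
    the steady-state error |d / k| exceeds e_max for a fixed gain k."""
    fails = 0
    for _ in range(n):
        d = rng.gauss(0.0, sigma)        # random disturbance sample
        if abs(d / k) > e_max:
            fails += 1
    return fails / n

def probability_bounds(k_lo, k_hi, sigma, e_max,
                       n_epistemic=5, n_aleatory=20000, seed=1):
    """Outer (epistemic) loop: sweep the gain over its interval and return
    the smallest and largest failure probability found. The result is an
    interval of probabilities rather than a single number, because no
    distribution is assumed for k."""
    rng = random.Random(seed)
    probs = []
    for i in range(n_epistemic):
        k = k_lo + (k_hi - k_lo) * i / (n_epistemic - 1)
        probs.append(failure_probability(k, sigma, e_max, n_aleatory, rng))
    return min(probs), max(probs)
```

For a gain interval of [2, 4], unit disturbance variance, and `e_max = 1`, the upper bound is close to the two-sided Gaussian tail probability at two standard deviations, and the lower bound is close to zero.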

Current research efforts on software verification
focus on argument-based safety
assurance for highly complex integrated systems 
of systems; assessment tools for evaluating 
the safety and coordination of authority and 
autonomy assignments; methods for ensuring 
safety-critical properties of distributed systems; 
and the development of tools and techniques for 
assessing software-intensive systems in meeting 
performance safety objectives. Research on
software-intensive systems includes the development
of methods and tools to detect, diagnose,
and predict adverse events due to a software fault 
or failure once the software has been verified and 
is in operation. Some recent references on this 
work include Holloway (2012), Xu et al. (2013), 
Driscol et al. (2012), Person et al. (2011), and 
Latorella and Feary (2011). 

Research has been initiated on developing 
methodologies for determining (i.e., quantifying) 
the predictive capability of the validation 
process for systems designed to operate under 
conditions that cannot be fully replicated during 
evaluations. Predictive capability assessment 
is an evaluation of the validity and level of 
confidence that can be placed in the validation 
process and results under nominal and hazardous 
conditions (and their associated boundaries). 
The need for this evaluation arises from the 
inability to fully evaluate these technologies 
under actual hazards to be encountered by the 
fielded system. A detailed disclosure is required 
of model, simulation, and emulation validity 
for the off-nominal conditions being considered 
in the validation, interactions that have been 
neglected, assumptions that have been made, and 
uncertainties associated with the models and data. 
Cross-correlations should be utilized between 
analytical, simulation, and ground test and flight 
test results in order to corroborate the results 
and promote efficiency in covering the very
large space of operational and off-nominal and
hazardous conditions being evaluated. The level 
of confidence in the validation process and results 
must be established for subsystem technologies 
as well as the fully integrated system. This 
includes an evaluation of error propagation 
effects across subsystems and an evaluation of 
integrated system effectiveness in mitigating 
hazardous conditions and preventing cascading 
errors, faults, and failures across subsystems. 
Metrics for performing this evaluation are also 
needed. 


Cross-References 

► Computer-Aided Control Systems Design: Introduction
and Historical Overview

► Interactive Environments and Software Tools
for CACSD

► Robust Synthesis and Robustness Analysis
Techniques and Tools

Abstract

Current prevailing control technology enables
vehicle dynamic control through powertrain
torque manipulation and individual wheel
braking. Longitudinal control can maintain
vehicle acceleration/braking capability within
the physical limits that the road condition can
support, while vehicle lateral control can preserve
vehicle steering/handling capability up to the
maximum capacity offered by the road/tire
interaction. Since most of these controllers are
driver-assist systems, their objective is to retain
the vehicle dynamic state in operating regions
familiar to drivers. In general, this implies that
the controller will keep the tire in its linear region
and avoid excessive slipping, skidding, or sliding.

Keywords 

Active yaw control; Electronic stability control; 
Evasive maneuvers; Lateral dynamics; Traction 
assist; Traction control; Vehicle stability assist 

Introduction 

Vehicle dynamics control generally refers to the
active modification of longitudinal and lateral tire
forces and the corresponding dynamics of ground
vehicles using sensors and actuators. While it
may also include vehicle active or semi-active
suspension control (Hrovat 1997), vehicle dynamics
control in this entry will focus on traction
control (vehicle longitudinal control) and
electronic stability control (combined vehicle
longitudinal and lateral control).

Simply speaking, tire force is generated when 
there exists a velocity difference between tire 
tread and the ground, also known as tire slip. 
As illustrated in Fig. 1, the longitudinal tire force
first grows proportionally with the tire slip, in
a so-called linear region, and then saturates as 
tire slip passes beyond a certain threshold. The 
figure also shows the coupling effect between 
longitudinal and lateral tire forces. That is, the 
available lateral force (as a function of tire slip 
angle) decreases when the longitudinal tire slip 
increases, and the available longitudinal force (as 
a function of tire slip) decreases as the lateral 
tire slip angle increases. This coupling effect is 
essential for understanding vehicle dynamics and 
leads to numerous control applications. 


Traction Control 

Since vehicle motion relies on the tire/ground
interaction, it is important for the purpose of
vehicle controllability to maintain tire/road interaction
in a linear and predictable way. Anti-lock
braking systems (ABS), and traction control (TC)
in particular, monitor and control the tire slip so
that the longitudinal tire force can best support
and balance the corresponding brake torque (during
ABS intervention) or driveline torque (during
TC intervention) delivered to the wheels. Without
wheel/tire slip control, tire force may saturate,
reducing both longitudinal and lateral tire force
capacity, with a corresponding reduction in
decelerating/accelerating capability or loss of
road grip (lateral tire force capacity).


Vehicle Dynamics Control, Fig. 1 Longitudinal and lateral tire force as a function of tire slip and slip angle

Control Design

The objective of a TC system is to ensure longitudinal
tire force capacity while maintaining
a good margin on available lateral force road
grip (see Fig. 1). Based on the tire force/slip
characteristics, this can be achieved by regulating
the longitudinal tire slip, roughly defined as the
relative velocity between the contact patch of the
tire and the road surface. This can be expressed
as the difference between the vehicle traveling
speed and tire rotational speed, as defined in
Eq. 1, according to the Society of Automotive
Engineers (SAE), where $V$ is the vehicle speed,
$\omega$ is the angular speed of the tire, and $R$ is the
effective tire rolling radius. The effective rolling
radius is defined so that vehicle speed equals the
product of $R$ and $\omega$ (i.e., $V = R\omega$) when there is
no torque applied to the wheel and the tire is free
rolling.


At low slip, the longitudinal tire force grows 
as the slip increases (Carlson and Gerdes 2003), 
while at high slip it passes its peak and begins 
to decrease (Deur et al. 2004). High slips occur
when wheels are locked during braking or
are over-spinning during acceleration. A traction
control system uses feedback control to regulate 
wheel/tire slip. 

Sensors and Actuators 

To regulate driven wheel slips effectively with 
closed loop control, wheel speed sensors at the 
non-driven wheels are utilized for vehicle speed 
estimation (V in Eq. 1). In the case of all-wheel 
drive or four-wheel drive systems, a longitudinal 
accelerometer is typically added for the speed 
estimate. Because the desired amount of wheel
slip may vary with the maneuver, accelerator
pedal, steering angle, and yaw rate signals may
be used as well. Some systems deploy steering
wheel angle and yaw rate sensors for direct signal
assessment and signal sharing competency, while
others estimate these signals based on the speed
difference between left and right wheels, for
subsystem modularity across various vehicle configurations
and platforms as well as calibration
independency.

Powertrain and brake torque modulation are 
typically used for actuation to regulate driven 
wheel slips. 


Control System Behavior 

Wheel/tire slip targets are typically adjusted 
based on vehicle driveline configurations as 
well as vehicle maneuvers. When a vehicle is 
cornering, a low slip target is generated to assure 
sufficient margin in lateral tire force capacity. 
Similarly, rear wheel drive vehicles may warrant 
a lower slip target than front or all-wheel drive 
vehicles. When a driver presses hard on the 
accelerator pedal, the slip target can be raised 
to accommodate higher acceleration. In addition, 
the target can be adjusted based on vehicle speed 
and estimated road available friction, all in an 
attempt to optimize the longitudinal traction 
force while keeping sufficient margin on lateral 
grip (Fodor et al. 1998; Hrovat et al. 2000). 

Uniform Friction Surface: (Uniform mu) 

Unless equipped with an advanced driveline
mechanism such as an active limited slip differential or
a torque vectoring differential (Deur 2010), a vehicle
is typically equipped with an open differential,
which transmits the powertrain torque evenly to
both left and right driven wheels. Since there is
no difference between the left and right wheel
torque that can be supported by the uniform
driving surface, wheel slip regulation can be achieved
quite effectively by modulating only the powertrain
torque. One successful example of this is
Ford’s engine-only traction control system,
introduced in 2006 on its Fusion and F150 models,
which was well received by media experts and
customers (Healey 2005).

Nonuniform Friction Surface: (Split mu) 

For driving surfaces offering different tire/road 
characteristics, different wheel torque can be 
supported on different sides. In this case, a 
vehicle equipped with an open differential would
transmit only the minimum torque (set by the
low friction side) to the road. Any additional
driveline torque that cannot be supported by
the road surface results in spinning the wheel
on the low friction side. Since an open differential
transmits an equal amount of torque left and right,
the additional tire force available on the higher
friction side would not be fully utilized with
powertrain-only actuation. In this case, by
applying additional brake torque at the low
road friction side, the driveline torque can
be balanced at the higher level offered by 
the high friction side. Care must be taken to 
avoid aggressive brake application which can 
cause driveline and/or half shaft oscillations 
(Fodor et al. 1998; Hrovat et al. 2000). 
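
The torque balance behind the split-mu argument can be worked through numerically. The sketch below assumes a quasi-static open differential (equal torque to both sides, no wheel inertia or driveline compliance); the function name and the torque values are hypothetical.

```python
def open_diff_drive_torque(t_low_limit, t_high_limit, brake_torque_low=0.0):
    """Return (per-wheel driveline torque, total torque reaching the road).

    t_low_limit, t_high_limit: torque the road can support on the low and
    high friction sides. brake_torque_low: brake torque applied at the
    low friction wheel. All values in N*m.
    """
    # An open differential sends equal torque to both wheels, so the side
    # that saturates first sets the level. Braking the low friction wheel
    # raises the torque that side can react.
    per_wheel = min(t_low_limit + brake_torque_low, t_high_limit)
    # On the low friction side, part of the driveline torque is absorbed
    # by the brake; the rest reaches the road.
    road_low = per_wheel - brake_torque_low
    road_high = per_wheel
    return per_wheel, road_low + road_high


print(open_diff_drive_torque(200.0, 800.0))         # powertrain-only actuation
print(open_diff_drive_torque(200.0, 800.0, 600.0))  # with brake on the low side
```

With these illustrative limits, the brake torque lets the high friction wheel use its full capacity, at the cost of energy dissipated in the brake.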


Control Challenges 

Given that road/tire interaction varies with 
multiple environmental factors, the tire force/slip 
relationship depicted in Fig. 1 is only a qualitative 
characterization, and the actual optimal slip 
for a desired traction force is difficult to 
accurately establish. While the peak traction 
force and the corresponding road friction 
potential (i.e., mu) can be detected once 
the wheel starts to spin, it is difficult to do 
so prior to a wheel spin event (Gustafsson 
1996). 

Since the powertrain actuation is less perceptible
yet occasionally sluggish, and the brake
application can be fast but intrusive at times, it can
be a control challenge to optimize the actuation
combination and bandwidth.

If a priori knowledge of friction potential
and optimal slip can be learned, detailed
powertrain/driveline actuation delay and
dynamics can be modeled, and optimal actuation
combination and bandwidth can be incorporated,
it is conceivable that further improvement in
wheel slip and traction control can be achieved
using advanced control approaches such as model
predictive control (Borrelli et al. 2006), for
example.


Electronic Stability Control 

According to the Society of Automotive 
Engineers (SAE), an electronic stability control 
system (ESC) is a computer-controlled system 
that augments vehicle directional stability by 
applying and adjusting individual wheel braking. 
It is operational over the full speed range of 
the vehicle and is capable of monitoring both 
driver steering input and vehicle yaw rate to 
limit vehicle understeering and oversteering, as 
appropriate. 

The wide proliferation of ESC in recent 
years (Van Zanten 2000) across the vehicle 
fleet has allowed various evaluation studies of 
its effectiveness in real-world environments. 
Among them, the United States NHTSA 
(National Highway Traffic Safety Adminis¬ 
tration) study (Dang 2004) concluded that 
ESC reduces fatal single vehicle crashes by 
35 %, while single vehicle crashes involving 
sport utility vehicles (SUVs) are reduced 
by 67 %. Similar conclusions were reached
in other subsequent studies, including the
statement that “Electronic stability control 
could prevent nearly one-third of all fatal 
crashes ...” from the Insurance Institute for 
Highway Safety organization (IIHS 2006). 
Many of these effectiveness studies are 
summarized in a literature review by Ferguson 
(2007). 

Control Design 

The objective of an ESC system is to provide 
vehicle controllability and predictability to assist 
the driver. This can be achieved by preventing 
excessive deviations between the intended and 
actual lateral response of the vehicle, especially 
during critical maneuvers such as a sudden
encounter with a slippery/icy road.

During driving, a driver relies on a mental
model of the vehicle’s response to his/her steering
input developed from previous driving
experience. A vehicle model, as described in Eq. 2 and
Fig. 2, is often used to describe nominal lateral
vehicle behaviors.
Vehicle Dynamics Control, Fig. 2 Vehicle cornering model

$$m\dot{V}_y = -mV\dot{\psi} + F_{yf}(V_y, V, \delta, F_{xf}) + F_{yr}(V_y, V, \delta, F_{xr})$$

$$I\ddot{\psi} = 2aF_{yf} - 2bF_{yr} \qquad (2)$$

where $m$ is the vehicle mass; $V$ and $V_y$ are
vehicle longitudinal and lateral velocity,
respectively; $\dot{\psi}$ is vehicle yaw rate; and $\delta$ is the
steering angle. Parameters $a$ and $b$ are the
distances from the vehicle center of gravity to the front
and rear axles; $c$ is the half track width. $F_x$ and
$F_y$ denote the longitudinal and lateral/cornering
tire force, with subscripts indicating longitudinal
(x) or lateral (y) direction, as well as the specific
corner of the vehicle (front, rear, left, and right).

Note that the corresponding longitudinal
dynamics can be described as

$$m\dot{V} = mV_y\dot{\psi} + F_{xfl} + F_{xrl} + F_{xfr} + F_{xrr} \qquad (3)$$

As the available road friction is not always
known, a nominal vehicle lateral response
derived from a high-mu surface may not be feasible
and may not best represent a driver’s intent. To
modify the feedback to best adapt to the road
condition, ESC would do one or more of the
following: adjust the driver intended yaw rate
according to detected lateral acceleration
capability (Tseng et al. 1999), balance between yaw
rate and lateral acceleration error when both are
compared to a nominal vehicle model (Manning
and Crolla 2007), or balance between yaw rate
error and detected excessive sideslip angle (Di
Cairano et al. 2013).


Sensors and Actuators 

In order to effectively provide vehicle
controllability through an embedded controller, ESC
systems are equipped with a steering angle
sensor, wheel speed sensors, a yaw rate sensor, and a
lateral accelerometer. Additional sensors such as
a longitudinal accelerometer and a roll rate sensor 
may be installed to better observe the vehicle 
dynamic states and provide improved fidelity for 
estimated vehicle behaviors. The actuators of an 
ESC system are the individual wheel brakes and 
powertrain torque. 


Control System Behavior 

A vehicle can exhibit understeering and/or 
oversteering behaviors during aggressive 
lateral maneuvers. Figure 3 illustrates how a 
vehicle equipped with ESC may provide better 
controllability. 

Vehicle Dynamics Control, Fig. 3 Vehicle going through a hairpin turn

Understeering - When a vehicle does not
turn in as much as desired by the driver (see
upper Fig. 3). In this case, the vehicle yaw rate,
an ESC measured/monitored signal, would be 
less than the driver desired value. For example, 
a vehicle on ice may experience extreme 
understeering that keeps the vehicle moving 
straight even when the steering wheel is turned. 
In this case, ESC applies corrective yaw moment 
to increase the yaw rate through individual wheel 
braking. Most of the longitudinal braking force is 
applied on the rear axle inside wheel in an attempt 
to increase the lateral force capacity on the front 
axle while decreasing the lateral force capacity 
on the rear axle. As such, the vehicle experiences 
not only the yaw moment correction but also the 
reduction of understeering tendency with ESC 
brake application. 

Oversteering - When a vehicle turns too much, 
i.e., yaws with a smaller turning radius than 
the one needed to negotiate the road (see lower 
Fig. 3). In this case, the vehicle yaw rate would be
larger than the driver desires. The vehicle tends
to build up a large sideslip angle, resulting in a
spinout due to the saturation of rear tire force. In
this case, ESC applies corrective yaw moment to
decrease the magnitude of the yaw rate through individual
wheel braking. The longitudinal braking force is
applied mostly on the front axle outside wheel 
to preserve the lateral force capacity on the rear 
axle and decrease the lateral force capacity on 
the front axle. As such, the vehicle experiences 
not only the yaw moment correction but also 
the reduction of oversteering tendency with ESC 
brake application. 
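
The understeer and oversteer interventions described above amount to a decision on the yaw rate error. The sketch below is a simplified illustration only: the sign convention (positive yaw rate for a left turn), the deadband value, and the function name are assumptions, and a real ESC blends brake pressure across wheels rather than selecting a single one.

```python
def esc_brake_request(yaw_rate_desired, yaw_rate_measured, deadband=0.05):
    """Classify the condition and pick the wheel that receives most braking.

    Yaw rates are in rad/s, positive for a left turn (assumed convention).
    Returns (condition, wheel); wheel is None inside the deadband.
    """
    error = yaw_rate_desired - yaw_rate_measured
    if abs(error) <= deadband:
        return ("neutral", None)
    if abs(yaw_rate_measured) < abs(yaw_rate_desired):
        # Understeer: brake the rear inside wheel to add yaw moment
        # in the direction the driver intends.
        inside = "left" if yaw_rate_desired > 0.0 else "right"
        return ("understeer", "rear_" + inside)
    # Oversteer: brake the front outside wheel to oppose the actual yaw motion.
    outside = "right" if yaw_rate_measured > 0.0 else "left"
    return ("oversteer", "front_" + outside)


print(esc_brake_request(0.3, 0.1))  # left turn, vehicle not turning enough
print(esc_brake_request(0.3, 0.6))  # left turn, vehicle turning too much
```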

Evasive Maneuver - During an evasive 
maneuver, such as an aggressive double lane 
change, the vehicle may first turn in one 
direction, followed by an oversteer in the 
other direction. Due to delay and lag of 
actuator response in practice, feedforward 
control is typically used to ensure brake 
application would generate corrective yaw 
moment in the appropriate direction. Further 
improvement could be possible by using 
road/traffic preview along with (semi)autonomous 
intervention based on advanced optimal control 
such as model predictive control (Falcone et al. 
2008), for example. 

Skidding and Oversteering - When both 
front and rear tires experience large tire slip 
angle, both tire forces are saturated. In this case, 
the vehicle is operating in a region where the 
rear tire slip angle can grow rapidly. Unless 
the front steering is delicately and quickly 
balanced, the excessive rear tire slip angle 
could cause the vehicle to spin out. In this 
case, in addition to applying corrective yaw 
moment similar to the above oversteering case, 
ESC may command light braking on all four 
wheels in an attempt to further slow down the 
vehicle. 

Rollover Mitigation - An ESC system may
be extended to provide more controllable
vehicle behavior and mitigate rollover risks
in evasive maneuvers that demand a large and
sudden lateral force. For example, the Roll Stability
Control™ system introduced at Ford in 2003
monitors vehicle roll behavior in addition to
vehicle yaw behavior to assist the driver (Lu et al.
2007).


Control Challenges 

In order to best assist the driver,
it is important to assess the vehicle dynamic
state and driver intention with high fidelity.
This can be challenging in the presence of various
factors that directly influence the vehicle
behavior or sensor readings but are not or cannot be
directly measured. For example, the road bank
angle information is typically unavailable, but it
has a direct influence on the lateral accelerometer
measurement and could be misinterpreted as a
discrepancy between vehicle yaw rate and lateral
force (Tseng 2001; Tseng et al. 2007). And despite
their criticality in vehicle dynamics control,
the available road surface friction capacity and
the vehicle sideslip angle typically cannot be
measured (Tseng 2002; Ryu 2002; Ahn et al.
2013). The driver’s intent is prescribed by a
mental model in the computer, but we cannot
directly read the driver’s mind. In addition, the
controller should detect when a sensor is
misbehaving and giving out false or biased readings
(Xu and Tseng 2007). While advanced observers
have been developed to address these challenges, 
it is foreseeable that optimization in these areas 
could further improve the observer fidelity and 
overall ESC performance. 


Cross-References 

► Lane Keeping 

► Motorcycle Dynamics and Control 
Abstract 

Ever since the pioneering work of Levine 
and Athans and Melzer and Kuo, control 
of vehicular formations has been a topic 
of active research. In spite of its apparent 
simplicity, this problem poses significant 
engineering challenges, and it has often inspired 
theoretical developments. In this article, we view 
vehicular formations as a particular instance of 
dynamical systems over networks and summarize 
fundamental performance limitations arising 
from the use of local feedback in formations 
subject to stochastic disturbances. In the topology
of regular lattices, it is impossible to have
coherent large formations, which behave like
rigid lattices, in one and two spatial dimensions;
yet this is achievable in 3D. This is a consequence
of the fact that, in 1D and 2D, local feedback
laws with relative position measurements are 
ineffective in guarding against disturbances 
with slow temporal variations and large spatial 
wavelength. 


Keywords 

Fundamental performance limitations; Localized 
control; Optimal control; Relative information 
exchange; Spatially invariant systems; Toeplitz 
and circulant matrices; Vehicular formations 


Introduction 

Control of vehicular strings has been an active 
area of research for almost five decades (Levine 
and Athans 1966; Lin et al. 2012; Melzer and 
Kuo 1971a,b; Middleton and Braslavsky 2010; 
Seiler et al. 2004; Swaroop and Hedrick 1996,
1999; Varaiya 1993). This problem represents a
special instance of more general vehicular
formation problems which are encountered in the
control of unmanned aerial vehicles, satellite
formations, and groups of autonomous robots (Bullo
et al. 2009; Mesbahi and Egerstedt 2010). Even
for the simplest control objective, in which it is
desired to maintain a constant cruising velocity
and a constant distance between the neighboring
vehicles, it has long been recognized that
limited information exchange between the vehicles
imposes fundamental performance limitations for
control design. In particular, look-ahead
strategies that rely only on relative spacing information
with respect to the preceding vehicle suffer from
string instability. This phenomenon is
characterized by unfavorable amplification of disturbances
downstream along the vehicular string (Middleton and
Braslavsky 2010; Seiler et al. 2004; Swaroop and
Hedrick 1996, 1999). In order to avoid this
unfavorable spatial amplification, it is typically required
to broadcast the state of the leader to the rest of
the formation.

While a precise characterization of fundamental
performance limitations in the control of
vehicular formations is still an open question, in
this article we review recent progress in this area.
We begin by highlighting performance limits
that arise even in optimally controlled vehicular
strings. The LQR problem for vehicular strings
was originally formulated in pioneering papers by
Levine and Athans (1966) and Melzer and Kuo
(1971a,b). These formulations were revisited
in Jovanovic and Bamieh (2005) where it was
shown that the time constant of the optimally
controlled closed-loop system increases linearly
with the number of vehicles. This reference also
employed the theory of spatially invariant systems (Bamieh
et al. 2002) to demonstrate the lack of exponential
stability in the limit of an infinite number of
vehicles and to explain the arbitrarily slowing
rate of convergence observed in numerical
studies of finite strings of increasing sizes.
We then summarize a recent result that viewed
vehicular strings as the 1D version of vehicular
formations on regular lattices in arbitrary
spatial dimensions and established fundamental
performance limitations of spatially invariant
localized feedback strategies with relative
position measurements (Bamieh et al. 2012).
It was shown that it is impossible to achieve
robustness to stochastic disturbances with only
localized feedback in 1D and 2D; yet this can
be achieved in 3D. This is a consequence of
the fact that, in 1D and 2D, local feedback laws
are ineffective in guarding against disturbances 
with slow temporal variations and large spatial 
wavelength. An “accordion” type of motion 
experienced by these spatiotemporal modes 
compromises formation throughput, and it may 
occur even in formations that are string stable. 
Since the phenomenon that we describe also 
occurs in distributed averaging algorithms, 
global mean first passage time of random walks, 
effective resistance in electrical networks, and 
statistical mechanics of harmonic solids, it is 
relevant for a broad class of networked dynamical 
systems. 


Optimal Control of Vehicular Strings 

We next summarize a linear quadratic regulator 
problem for vehicular strings (Levine and 
Athans 1966; Melzer and Kuo 1971a, b) and 
demonstrate that strategies that penalize only 
relative position errors between neighboring 
vehicles yield nonuniform rates of convergence 
towards the desired formation (Jovanovic 
and Bamieh 2005). In particular, the time 
constant of the optimally controlled closed-loop 
system increases linearly with the number of 
vehicles, and the formation loses exponential 
stability in the limit of infinite vehicular 
strings. 

Optimal Control of Finite Strings 

A string consisting of $M$ identical unit mass
vehicles is shown in Fig. 1a. Each vehicle is modeled
as a point mass that obeys the double-integrator
dynamics:

$$\ddot{x}_n = u_n, \quad n \in \{1, \ldots, M\} \qquad (1)$$

Vehicular Chains, Fig. 1 Finite and infinite strings of vehicles

where $x_n$ is the position of the nth vehicle and $u_n$
is the control applied on the nth vehicle. A control
objective is to provide the desired constant
cruising velocity $\bar{v}$ and to keep the constant distance $\delta$
between the neighboring vehicles. By introducing
the absolute position and velocity error variables

$$p_n(t) := x_n(t) - \bar{v}t + n\delta$$

$$v_n(t) := \dot{x}_n(t) - \bar{v}, \quad n \in \{1, \ldots, M\}$$

system (1) can be brought into the state-space
form (Melzer and Kuo 1971a, b):

$$\begin{bmatrix} \dot{p} \\ \dot{v} \end{bmatrix} = \begin{bmatrix} 0 & I \\ 0 & 0 \end{bmatrix} \begin{bmatrix} p \\ v \end{bmatrix} + \begin{bmatrix} 0 \\ I \end{bmatrix} u =: A\psi + Bu \qquad (2)$$

where $p := [p_1 \cdots p_M]^T$, $v := [v_1 \cdots v_M]^T$, and
$u := [u_1 \cdots u_M]^T$.

Following Melzer and Kuo (1971a, b),
fictitious lead and follow vehicles, respectively,
indexed by 0 and $M + 1$, are added to the
formation; see Fig. 2. These two vehicles are
constrained to move at the desired velocity
$\bar{v}$, and the relative distance between them is
assumed to be equal to $(M + 1)\delta$ for all times.

Vehicular Chains, Fig. 2 Finite string with fictitious lead
and follow vehicles

A quadratic performance index that penalizes
control effort, relative position, and absolute
velocity error variables is associated with
system (2):

$$J = \frac{1}{2} \int_0^\infty \left( \psi^*(t) Q \psi(t) + u^*(t) R u(t) \right) dt. \qquad (3)$$

The control problem (2) and (3) is in the standard
LQR form with the state and control weights:

$$Q := \begin{bmatrix} Q_p & 0 \\ 0 & q_v I \end{bmatrix}, \quad Q_p := q_p T_M, \quad R := rI.$$

Here, $T_M$ is an $M \times M$ symmetric Toeplitz matrix
with the first row given by $[2 \;\; -1 \;\; 0 \; \cdots \; 0] \in \mathbb{R}^M$.

We next briefly summarize the explicit
solution to the LQR problem (2) and (3) and refer
the reader to Jovanovic and Bamieh (2005) for
additional details. By performing a spectral
decomposition of the Toeplitz matrix $T_M$,

$$T_M = U \Lambda_T U^*, \quad UU^* = U^*U = I$$

$$\Lambda_T = \mathrm{diag}\{\lambda_1(T_M), \ldots, \lambda_M(T_M)\}$$

$$\lambda_n(T_M) = 2\left(1 - \cos\frac{n\pi}{M+1}\right), \quad n \in \{1, \ldots, M\} \qquad (4)$$

the solution to the LQR algebraic Riccati
equation can be represented as

$$P := \begin{bmatrix} P_1 & P_0 \\ P_0^* & P_2 \end{bmatrix}, \quad P_1 = U \Lambda_1 U^*, \quad P_0 = U \Lambda_0 U^*, \quad P_2 = U \Lambda_2 U^*. \qquad (5)$$

Here,

$$\Lambda_0 = \sqrt{r q_p}\, \Lambda_T^{1/2}$$

$$\Lambda_2 = \sqrt{r} \left( 2\sqrt{r q_p}\, \Lambda_T^{1/2} + q_v I \right)^{1/2}$$

$$\Lambda_1 = \sqrt{q_p}\, \Lambda_T^{1/2} \left( 2\sqrt{r q_p}\, \Lambda_T^{1/2} + q_v I \right)^{1/2}$$
and the eigenvalues of the closed-loop A-matrix
are determined by the solutions to the following
system of uncoupled quadratic equations:

$$s_n^2 + b_n s_n + c_n = 0, \quad n \in \{1, \ldots, M\}$$

$$c_n := \left( \lambda_n(T_M)\, q_p / r \right)^{1/2} \qquad (6)$$

$$b_n := \left( 2c_n + q_v / r \right)^{1/2}.$$

From the above expression, it can be shown 
that in large-scale formations, the least-stable 
eigenvalue of the closed-loop system approaches 
the imaginary axis at the rate that is inversely 
proportional to the number of vehicles. As can 
be seen from the PBH detectability test, this is 
because the pair (Q, A) gets closer to losing its 
detectability as the number of vehicles increases. 
This clearly indicates that the resulting optimal 
control strategy leads to closed-loop systems with 
arbitrarily slow decay rates as the number of vehicles
increases. As summarized in section “Optimal
Control of Infinite Strings,” the absence
of a uniform rate of convergence for a finite 
number of vehicles manifests itself as the absence 
of exponential stability in the limit of infinite 
vehicular strings. 
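
The scaling claim can be checked numerically from (4) and (6). The following sketch evaluates the closed-loop eigenvalues for strings of increasing length; the function names are hypothetical, and the weights $q_p = q_v = r = 1$ are chosen only for illustration.

```python
import cmath
import math


def closed_loop_eigs(M, q_p=1.0, q_v=1.0, r=1.0):
    """Roots of s^2 + b_n s + c_n = 0 for n = 1..M, following (4) and (6)."""
    eigs = []
    for n in range(1, M + 1):
        lam = 2.0 * (1.0 - math.cos(n * math.pi / (M + 1)))  # eigenvalue of T_M
        c = math.sqrt(lam * q_p / r)
        b = math.sqrt(2.0 * c + q_v / r)
        disc = cmath.sqrt(b * b - 4.0 * c)  # complex sqrt handles oscillatory modes
        eigs.extend([(-b + disc) / 2.0, (-b - disc) / 2.0])
    return eigs


def slowest_decay_rate(M, **weights):
    """Distance of the least-stable eigenvalue from the imaginary axis."""
    return min(-s.real for s in closed_loop_eigs(M, **weights))


for M in (10, 100, 1000):
    rate = slowest_decay_rate(M)
    # If the rate decays like 1/M, the product M * rate stays roughly constant.
    print(M, rate, M * rate)
```

In this sketch the product of $M$ and the slowest decay rate settles near a constant as $M$ grows, which is the inverse proportionality stated above.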


Optimal Control of Infinite Strings 

The LQR problem for a system of identical 
unit mass vehicles in an infinite string (see 
Fig. 1b) was originally studied in Melzer and 
Kuo (1971a). As summarized below, using the 
theory for spatially invariant linear systems 
(Bamieh et al. 2002), it was shown in Jovanovic 
and Bamieh (2005) that the resulting LQR 
controller does not provide exponential stability 
of the closed-loop system due to the lack of 
detectability of the pair (Q, A). 

The infinite dimensional equivalent of (2) is 
given by 


$$\begin{bmatrix} \dot{p}_n \\ \dot{v}_n \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} p_n \\ v_n \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u_n =: A_n \psi_n + B_n u_n, \quad n \in \mathbb{Z} \qquad (7)$$

and the associated quadratic performance index is

$$J = \frac{1}{2} \int_0^\infty \sum_{n \in \mathbb{Z}} \left( q_p \left( p_n(t) - p_{n-1}(t) \right)^2 + q_v v_n^2(t) + r u_n^2(t) \right) dt \qquad (8)$$

with $q_p$, $q_v$, and $r$ being positive design
parameters. Spatial invariance over a discrete
spatial lattice $\mathbb{Z}$ can be used to establish that the
solution to the LQR problem does not provide
an exponentially stabilizing feedback for system
(7). In particular, the spectrum of the closed-loop
generator in an LQR-controlled spatially
invariant string of vehicles (7) with performance
index (8) is given by the solutions to the following
$\theta$-parameterized quadratic equation:

$$s_\theta^2 + b_\theta s_\theta + c_\theta = 0,$$

$$c_\theta := \left( 2 (q_p / r)(1 - \cos\theta) \right)^{1/2} \qquad (9)$$

$$b_\theta := \left( 2c_\theta + q_v / r \right)^{1/2}$$

where $\theta \in [0, 2\pi)$ denotes the spatial wave
number. By comparing (4), (6) and (9), we see that
the closed-loop eigenvalues of the finite string are
points in the spectrum of the closed-loop infinite
string. Furthermore, from these equations it
follows that as the size of the finite string increases,
this set of points becomes dense in the spectrum
of the infinite string closed-loop A-operator. The
spectrum of the closed-loop generator, shown in
Fig. 3 for $q_p = q_v = r = 1$, illustrates the
absence of exponential stability.

Vehicular Chains, Fig. 3 The spectra of the closed-loop
generators in LQR-controlled finite (symbols) and infinite
(solid line) strings of vehicles with $M = 50$ and $q_p =
q_v = r = 1$. The closed-loop eigenvalues of the finite
string are points in the spectrum of the closed-loop infinite
string. As the number of vehicles increases, the number of
eigenvalues that accumulate in the vicinity of the stability
boundary gets larger and larger
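
A short numerical sketch of (9) makes the loss of exponential stability concrete: as the wave number approaches zero, one root of the quadratic approaches the origin. The function name is hypothetical, and unit weights are assumed as in Fig. 3.

```python
import cmath
import math


def spectrum_point(theta, q_p=1.0, q_v=1.0, r=1.0):
    """Both roots of s^2 + b s + c = 0 from (9) at wave number theta."""
    c = math.sqrt(2.0 * (q_p / r) * (1.0 - math.cos(theta)))
    b = math.sqrt(2.0 * c + q_v / r)
    disc = cmath.sqrt(b * b - 4.0 * c)
    return (-b + disc) / 2.0, (-b - disc) / 2.0


# Comparing (4), (6), and (9): the finite-string eigenvalues correspond to
# the sampled wave numbers theta_n = n*pi/(M+1).
M = 50
finite_slow = [spectrum_point(n * math.pi / (M + 1))[0] for n in range(1, M + 1)]

# The slow root moves toward the origin as theta -> 0, so the spectrum of
# the infinite string is not bounded away from the imaginary axis.
for theta in (1e-1, 1e-2, 1e-3):
    print(theta, spectrum_point(theta)[0].real)
```

Every finite string keeps its eigenvalues strictly in the left half-plane, but the sampled points crowd toward the origin as $M$ increases.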
Coherence in Large-Scale Formations 

Fundamental performance limitations arising
from the use of local feedback in networks
subject to stochastic disturbances were recently
examined in Bamieh et al. (2012). For consensus
and vehicular formation control problems in
topology of regular lattices, it was shown that
it is impossible to guarantee robustness to
stochastic exogenous disturbances in one and
two spatial dimensions. Yet it was proved that
this is achievable in 3D. This phenomenon is a
consequence of the fact that, in 1D and 2D, local
feedback laws are ineffective in guarding against
disturbances with large spatial wavelength,
and it has also been observed in global mean
first passage time of random walks, effective
resistance in electrical networks, and statistical
mechanics of harmonic solids. We next briefly
summarize the implications of these results for
the control of vehicular formations and refer the
reader to Bamieh et al. (2012) for details.

Stochastically Forced Vehicular
Formations with Local Feedback

Let us consider $M := N^d$ identical vehicles
arranged in a $d$-dimensional torus, $\mathbb{Z}_N^d$, with the
double integrator dynamics:

$$\ddot{x}_n = u_n + w_n \qquad (10)$$

where $n := (n_1, \ldots, n_d)$ is a multi-index with
each $n_i \in \mathbb{Z}_N := \{0, \ldots, N - 1\}$, $u$ is the
control input, and $w$ is a mutually uncorrelated
white stochastic forcing. Each position vector
$x_n$ is a $d$-dimensional vector with components
$x_n := [x_n^1 \cdots x_n^d]^T$. The control objective is to
have the nth vehicle follow the absolute desired
trajectory $\bar{x}_n$:

$$\bar{x}_n := \bar{v}t + n\delta \quad \Longleftrightarrow \quad \begin{bmatrix} \bar{x}_n^1 \\ \vdots \\ \bar{x}_n^d \end{bmatrix} := \begin{bmatrix} \bar{v}^1 \\ \vdots \\ \bar{v}^d \end{bmatrix} t + \begin{bmatrix} n_1 \\ \vdots \\ n_d \end{bmatrix} \delta$$

In other words, it is desired that all vehicles
move with constant heading velocity $\bar{v}$ while
maintaining their respective position in a $\mathbb{Z}_N^d$ grid
with spacing of $\delta$ in each dimension.

By introducing the position and velocity
deviations from the desired trajectory,

$$p_n := x_n - \bar{x}_n, \quad v_n := \dot{x}_n - \bar{v}$$

and by confining our attention to static-feedback
policies,

$$u(t) = -\begin{bmatrix} K_p & K_v \end{bmatrix} \begin{bmatrix} p(t) \\ v(t) \end{bmatrix} \qquad (11)$$

equations of motion for the controlled system
(10) can be brought into the state-space form

$$\begin{bmatrix} \dot{p} \\ \dot{v} \end{bmatrix} = \begin{bmatrix} 0 & I \\ -K_p & -K_v \end{bmatrix} \begin{bmatrix} p \\ v \end{bmatrix} + \begin{bmatrix} 0 \\ I \end{bmatrix} w =: A\psi + Bw \qquad (12)$$

$$z = C\psi.$$

Here, $p$ and $v$ are the position and velocity
vectors of all vehicles, $z$ is the performance
output, and $w$ is the forcing vector.

An Example

In one-dimensional formations with nearest
neighbor relative position and velocity
measurements, the control acting on the nth vehicle is
given by

$$u_n(t) = -k_p^-\left(p_n(t) - p_{n-1}(t)\right) - k_p^+\left(p_n(t) - p_{n+1}(t)\right) - k_v^-\left(v_n(t) - v_{n-1}(t)\right) - k_v^+\left(v_n(t) - v_{n+1}(t)\right) \qquad (13)$$

where $k_p^\pm$ and $k_v^\pm$ are positive design parameters.
For a system that evolves over a 1D lattice, the
feedback gain matrices $K_p$ and $K_v$ are
tridiagonal Toeplitz matrices, implying that the closed-loop
system has been effectively converted into
the mass-spring-damper system shown in Fig. 4.
Figure 5 shows the results of a stochastic
simulation for the closed-loop system (12) and (13) with
100 vehicles with desired inter-vehicular spacing
$\delta = 20$ and $k_p^\pm = k_v^\pm = 1$. These plots
indicate the lack of formation coherence. This is
only discernible when one “zooms out” to view
the entire formation. The length of the formation
fluctuates stochastically, but with a distinct slow
temporal and long spatial wavelength signature.
In contrast, the zoomed-in view in Fig. 5 shows a
relatively well-regulated vehicle-to-vehicle
spacing. In general, small-scale (both temporally and
spatially) disturbances are well regulated, while
large-scale disturbances are not. This indicates
that a local feedback strategy (13) cannot regulate
against large-scale disturbances.

Vehicular Chains, Fig. 4 Finite string of vehicles with a nearest neighbor relative position and velocity feedback

Vehicular Chains, Fig. 5 Position trajectories of a
stochastically forced formation with 100 vehicles
controlled with nearest neighbor strategy (13). Left plot
demonstrates accordion-like motion of the entire
formation; right plot shows that vehicle-to-vehicle distances are
relatively well regulated


Structural Assumptions

We now list the assumptions on the operators
$K_p$, $K_v$, and $C$ in (12) under which asymptotic
scaling trends summarized in section “Scaling
of Variance per Vehicle with System Size” are
obtained.

(A1) Spatial invariance. Operators $K_p$, $K_v$,
and $C$ in (12) are spatially invariant with
respect to $\mathbb{Z}_N^d$.

(A2) Spatial localization. The feedback (11)
uses only local information from a
neighborhood of width $2q$, where $q$ is
independent of $N$.

(A3) Reflection symmetry. The interactions
between vehicles exhibit mirror symmetry.

(A4) Coordinate decoupling. For $d \geq 2$,
control in each coordinate direction depends
only on measurements of position and
velocity error vector components in that
coordinate.

While assumptions (A3) and (A4) were made
to simplify calculations, assumptions (A1) and
(A2) were essential for the developments in
Bamieh et al. (2012).
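
The behavior reported for the nearest neighbor strategy (13) can be reproduced qualitatively with a small simulation. The sketch below is an illustration under assumptions that differ from the published figure: it uses a ring of 50 vehicles (a 1D torus) rather than the 100-vehicle string of Fig. 5, symmetric unit gains, a simple Euler-Maruyama discretization, and a hypothetical function name and step size.

```python
import math
import random


def simulate_ring(M=50, k_p=1.0, k_v=1.0, dt=0.01, steps=5000, seed=0):
    """Euler-Maruyama simulation of feedback (13) on a ring of M vehicles.

    Returns (micro, macro): the mean squared neighbor spacing error and the
    mean squared deviation from the formation average at the final time.
    """
    rng = random.Random(seed)
    p = [0.0] * M  # position errors
    v = [0.0] * M  # velocity errors
    sqrt_dt = math.sqrt(dt)
    for _ in range(steps):
        # Relative position and velocity feedback with both neighbors;
        # index -1 and the modulo wrap the ring around.
        u = [
            -k_p * (2.0 * p[n] - p[n - 1] - p[(n + 1) % M])
            - k_v * (2.0 * v[n] - v[n - 1] - v[(n + 1) % M])
            for n in range(M)
        ]
        for n in range(M):
            p[n] += v[n] * dt
            v[n] += u[n] * dt + sqrt_dt * rng.gauss(0.0, 1.0)
    mean_p = sum(p) / M
    micro = sum((p[n] - p[n - 1]) ** 2 for n in range(M)) / M
    macro = sum((x - mean_p) ** 2 for x in p) / M
    return micro, macro


micro, macro = simulate_ring()
print("neighbor spacing error:", micro)
print("deviation from average:", macro)
```

In a single run of this kind, the spacing between neighbors stays small compared with the deviation of individual vehicles from the formation average, which mirrors the zoomed-in and zoomed-out views discussed above.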

Performance Measures 

We next examine the dependence of the steady- 
state variance of stochastically forced system (12) 
on the number of vehicles. In the presence of 
relative position or velocity measurements, the 
matrix A in (12) is not necessarily Hurwitz, 
and the state $\psi$ may not have finite steady-state 
variance. However, for connected networks, the 
performance output z that does not penalize the 
motion of the mean will have finite steady-state 
variance; this is because the modes of A at the 
origin will be unobservable from z. The steady- 
state variance of z, 
$$V := \sum_{n \in \mathbb{Z}_N^d} \lim_{t \to \infty} \mathcal{E}\left( z_n^T(t) z_n(t) \right) \qquad (14)$$

is quantified by the square of the $H_2$ norm of
the system (12) from w to z, and it can be 
determined from the solution of the algebraic 
Lyapunov equation. 

We next summarize two different performance 
measures for stochastically forced vehicular for¬ 
mations. 

(P1) Local error. This is a measure of the
difference of neighboring vehicles' positions from
the desired spacing. In 1D, the performance
output of the nth vehicle is given by

$$z_n := p_n - p_{n-1}.$$

In $d$ dimensions, the performance output
vector contains as its components the local
error in each respective dimension. Since this
output involves quantities local to any vehicle
within a formation, the corresponding
steady-state variance is referred to as a microscopic
performance measure, $V_{micro}$.

(P2) Deviation from average. This is a measure
of the deviation of each vehicle’s position
error from the average of the overall position
error,

$$z_n := p_n - \frac{1}{M} \sum_{j \in \mathbb{Z}_N^d} p_j. \qquad (15)$$

Since this output determines deviation from
average, and thereby quantities that are far apart in
the network, the corresponding steady-state
variance is referred to as a macroscopic performance
measure, $V_{macro}$.

Scaling of Variance per Vehicle with 
System Size 

We next summarize asymptotic bounds for both
microscopic and macroscopic performance
measures derived in Bamieh et al. (2012). The upper
bounds result from simple feedback laws similar
to the one given in (13). In the situations where
either absolute position or velocity measurements
are available, additional terms proportional to $p_n$
and $v_n$ will appear in (13). The lower bounds
have been obtained for any linear static feedback
control policy satisfying the structural
assumptions (A1)-(A4) and the following constraint on
control variance at each vehicle:

$$\mathcal{E}\left( u_n^T u_n \right) \leq U_{max}. \qquad (16)$$

Under this constraint, the equivalence between
scaling trends of lower and upper bounds can be
established. As illustrated in Table 1, the
dependence of the asymptotic bounds on the number of
vehicles is strongly influenced by the underlying
spatial dimension $d$.

Vehicular Chains, Table 1 Asymptotic scalings of microscopic and
macroscopic performance measures in terms of the total number of
vehicles $M = N^d$, the spatial dimension d, and the control effort per
vehicle $U_{\max}$. Quantities listed are up to a multiplicative factor
that is independent of M or $U_{\max}$.
[Table body not recoverable; surviving header fragments: microscopic
measure divided by M; absolute position]

Since the macroscopic performance measure captures how well the
formation regulates against large-scale disturbances, the scaling
results presented in Table 1 demonstrate that local feedback with
relative position measurements is unable to regulate against these
large-scale disturbances in 1D. In contrast, in higher spatial
dimensions, local feedback can regulate against large-scale
disturbances and provide formation coherence. As shown in Table 1,
the “critical dimension” needed to achieve network coherence depends
on the type of feedback strategy: dimension 3 for relative position
and absolute velocity feedback and dimension 5 for relative position
and relative velocity feedback.
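The dependence on system size can be explored numerically with a minimal sketch. It rests on illustrative assumptions that are not taken from Bamieh et al. (2012): double-integrator vehicles on the torus $\mathbb{Z}_N^d$, unit white-noise forcing, and nearest-neighbor relative position and relative velocity feedback with unit position gain. Under these assumptions each spatial Fourier mode with Laplacian eigenvalue $\lambda$ obeys $\ddot{x} + \beta\lambda\dot{x} + \lambda x = w$, whose steady-state variance is $1/(2\beta\lambda^2)$. The function name and the gain `beta` are hypothetical.

```python
import itertools
import math


def macro_variance_per_vehicle(N, d, beta=1.0):
    """Per-vehicle steady-state variance of the deviation from average.

    Illustrative assumptions: double integrators on the torus Z_N^d,
    unit-gain nearest-neighbor relative position feedback, relative
    velocity feedback with gain beta, unit white-noise forcing.
    """
    # Eigenvalues of the nearest-neighbor Laplacian on a ring of N nodes.
    one_d = [2.0 * (1.0 - math.cos(2.0 * math.pi * k / N)) for k in range(N)]
    total = 0.0
    for idx in itertools.product(range(N), repeat=d):
        if not any(idx):
            continue  # zero mode = network average, removed by the output
        lam = sum(one_d[k] for k in idx)
        # Mode dynamics x'' + beta*lam*x' + lam*x = w give variance 1/(2*beta*lam^2).
        total += 1.0 / (2.0 * beta * lam * lam)
    return total / N ** d


if __name__ == "__main__":
    for N in (10, 20, 40):
        print(N, macro_variance_per_vehicle(N, 1))
```

In this toy model, doubling N in 1D multiplies the per-vehicle variance by roughly eight, whereas the growth in 3D is much slower, which matches the qualitative trend described above.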


Summary and Future Directions 

For stochastically forced vehicular formations with regular lattice
topologies, we have summarized fundamental performance limitations
resulting from the use of local feedback. Even for formations that
are string stable, local feedback cannot guard against slowly
varying disturbances with long spatial wavelength in 1D and 2D. The
observed phenomenon also arises in distributed averaging and
estimation algorithms, global mean first passage time of random
walks, effective resistance in electrical networks, and statistical
mechanics of harmonic solids. Since the performance measures that we
used to quantify robustness to disturbances are easily extensible to
networks with arbitrary topology and more complex node dynamics,
they can be used to evaluate the performance of a broad class of
networked dynamical systems in future studies.


Cross-References 

► Averaging Algorithms and Consensus 

► Flocking in Networked Systems 

► Networked Systems 

► Oscillator Synchronization

This entry reviews vibration control system design of buildings in
terms of energy dissipation and seismic isolation, including full
active control devices and semi-active or passive devices. Vibration
control of buildings subjected to dynamic loadings such as large
earthquakes, strong winds, or heavy traffic is one of the most
important factors to consider for the safety of building occupants.
Since energy dissipation is the key technology in vibration control,
many kinds of devices have been developed for structural mitigation.
Seismic retrofit of buildings is very important because long-period
earthquake motions also affect sites at considerable distances from
the seismic center. Here, we introduce the application of specific
devices to the vibration control system design of real buildings,
especially in Japan, where earthquakes are frequent.

Keywords 

Active control; Base isolation; Energy dissipation; Seismic response
control; Seismic retrofit; Semi-active control; Vibration control

Introduction 

Vibration control of buildings subjected to dynamic loadings such as
large earthquakes, strong winds, or heavy traffic is one of the most
important factors to consider for the safety of building occupants.
Energy dissipation is the key technology in vibration control, and
many kinds of devices have been developed for structural mitigation
(Soong and Spencer 2002; Spencer and Nagarajaiah 2003). In Japan, the
2011 earthquake off the Pacific coast of Tohoku produced prolonged
shaking that reached the Tokyo area, 400 km away from the seismic
center, and caused severe damage to buildings in the surrounding
areas.

Therefore, seismic retrofitting of buildings is very important in
Japan because long-period earthquake motions also affect sites far
away from the seismic center. In particular, old super-high-rise
buildings are concentrated in Shinjuku, a central ward of Tokyo, and
they were built on the basis of the theory of flexible structures.
During a long-period earthquake, super-high-rise buildings undergo
very large displacements (about 0.5 m) because of resonance and may
need a few minutes to dissipate the structural vibrations. These
buildings need to be retrofitted by adding energy dissipating
devices, such as active mass dampers (AMDs), tuned mass dampers,
rotating inertial mass dampers, and passive/semi-active base
isolation devices.

This entry reviews vibration control system design of buildings in
terms of energy dissipation and seismic isolation, including full
active control devices and semi-active or passive devices. We
introduce the application of specific devices to the vibration
control system design of real buildings.

Active Mass Damper 

Vibration Control System Design for Buildings, Fig. 1 Active,
semi-active, or passive mass damper system installed in an n-storied
building subjected to wind force and ground motion

Active, semi-active, or passive mass damper systems have been
installed in a large number of high-rise buildings as shown in Fig. 1
(Soong and Spencer 2002; Spencer and Nagarajaiah 2003). Although
active mass dampers have historically used ball-screw-type
actuators, the IHI Corporation has now developed AMDs driven by a
linear motor, which makes a long-stroke type easier to produce than
with a ball-screw-type actuator (Koike et al. 2011). Other advantages
of a linear actuator are lower noise and vibration, light weight,
and compactness. Thanks to these advantages, linear motor type AMDs
are expected to be installed in existing buildings as seismic
retrofitting devices. To avoid reaching the stroke limit of the
actuator during a large earthquake, displacement control of the mass
is applied. Phase-lead compensation of the displacement signal is
used to anticipate the mass stroke.
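The idea behind phase-lead compensation can be illustrated with a minimal sketch. It assumes a standard first-order lead element $C(s) = (1 + aTs)/(1 + Ts)$ with $a > 1$; the compensator actually used in the cited AMD is not specified in this entry, and the names `T` and `a` are hypothetical. The element advances the phase of the displacement signal, with a maximum lead of $\arcsin((a-1)/(a+1))$ at $\omega_m = 1/(T\sqrt{a})$.

```python
import math


def lead_phase(omega, T, a):
    """Phase (rad) added by C(s) = (1 + a*T*s) / (1 + T*s) at frequency omega."""
    return math.atan(a * T * omega) - math.atan(T * omega)


def max_phase_lead(T, a):
    """Return (omega_m, phi_max): frequency of maximum lead and its value."""
    omega_m = 1.0 / (T * math.sqrt(a))
    phi_max = math.asin((a - 1.0) / (a + 1.0))
    return omega_m, phi_max


if __name__ == "__main__":
    # Hypothetical tuning: time constant 0.5 s, lead ratio 3.
    omega_m, phi_max = max_phase_lead(T=0.5, a=3.0)
    print(omega_m, math.degrees(phi_max))
```

A larger lead ratio gives more anticipation of the stroke but also amplifies high-frequency measurement noise, so the ratio would be chosen conservatively in practice.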

Two AMDs have been installed in the same direction in the Docomo
Tohoku building in Japan to increase the damping ratio of the first
translational vibration mode by 9.5 %. The mass weighs 20,000 kg,
and the total weight of the device is about 25,000 kg. In a control
experiment in which the building was excited by the AMDs, a damping
ratio of 11 % was obtained with the vibration control activated.


Seismic Retrofitting 

Shimizu Corporation retrofitted a super-high-rise building (height =
100 m) in the Shibaura ward of Tokyo, Japan, by installing rotational
inertia mass dampers. The rotational inertia mass damper consists of
a ball screw and a rotational inertia mass. It converts the relative
translational displacement between stories into rotational motion of
the damper, which efficiently increases the dissipation of kinetic
energy.
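Why a ball screw is effective here can be seen from a short worked sketch. A screw with lead $L_d$ turns a relative displacement $x$ into a rotation $\theta = 2\pi x / L_d$, so a rotating mass with moment of inertia $J$ has kinetic energy $\tfrac{1}{2} J (2\pi/L_d)^2 \dot{x}^2$ and acts like a translational mass $J (2\pi/L_d)^2$. The flywheel dimensions and screw lead below are hypothetical and are not data for the Shimizu device.

```python
import math


def flywheel_inertia(mass, radius):
    """Moment of inertia of a solid disc flywheel (kg m^2)."""
    return 0.5 * mass * radius ** 2


def equivalent_inertial_mass(J, lead):
    """Translational mass (kg) equivalent to inertia J behind a ball screw.

    lead is the screw lead in metres of travel per revolution.
    """
    return J * (2.0 * math.pi / lead) ** 2


if __name__ == "__main__":
    # Hypothetical device: 100 kg disc of radius 0.2 m, 20 mm screw lead.
    J = flywheel_inertia(100.0, 0.2)
    print(equivalent_inertial_mass(J, 0.02))
```

With these illustrative numbers a 100 kg flywheel behaves like an inertial mass of roughly two hundred tonnes, which is the amplification that makes such dampers attractive for retrofits.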

Although in previous seismic retrofits many dampers were distributed
over every floor as shown in Fig. 2a, Shimizu Corporation
concentrated the rotational inertia mass dampers on the lower floors
of the building (e.g., floors 1-7) as shown in Fig. 2b. The seismic
response to the 2011 Tohoku earthquake would now be reduced by about
35 %, not only for the maximum displacement but also for the maximum
acceleration of the top floor. Moreover, the duration of the
vibration would become 220 s instead of 400 s. This method of
retrofitting super-high-rise buildings is distinctive because the
lower floors behave like the isolation layer of a base isolation
system. Although the displacement of the lower floors becomes
slightly larger than that before the retrofit, the whole building
has a good seismic response performance.

Vibration Control System Design for Buildings, Fig. 2 Seismic
retrofitting: (a) distributed energy dissipation device, (b) energy
dissipation device concentrated in lower floors


Semi-active Base Isolation 

The semi-active base isolation system shown in Fig. 3 was first
installed in a building in 2000. The building is located on the
Yagami campus of Keio University in Yokohama, Japan, and the base
isolation system in two directions consists of eight semi-active
hydraulic dampers that can change the damping coefficient in four
steps using a controllable orifice. The maximum damping force is
640 kN, and the switching law of the damping coefficients is based on
optimal bilinear control theory. The damper is modeled as a Maxwell
element, in which a spring and a damper are connected in series, and
the design adopts an objective function based on the kinetic energy
of the building and a constraint function on the squared damping
force (Yoshida and Fujio 2000).
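The Maxwell model mentioned above can be written down in a few lines. For a spring of stiffness $k$ in series with a dashpot of coefficient $c$, the relative velocity splits as $v = \dot{f}/k + f/c$, so the force obeys $\dot{f} = k\,(v - f/c)$. The sketch below integrates this with an explicit Euler step; the stiffness, damping coefficient, and time step are hypothetical values chosen only for illustration.

```python
def maxwell_force_history(velocities, k, c, dt, f0=0.0):
    """Force history of a Maxwell element (spring k in series with dashpot c).

    Integrates df/dt = k * (v - f / c) with explicit Euler, which is
    stable only for dt < 2 * c / k.
    """
    if dt >= 2.0 * c / k:
        raise ValueError("time step too large for explicit Euler")
    forces = []
    f = f0
    for v in velocities:
        f += dt * k * (v - f / c)
        forces.append(f)
    return forces


if __name__ == "__main__":
    # Hypothetical element: k = 1e8 N/m, c = 1e6 N s/m, constant 0.1 m/s.
    history = maxwell_force_history([0.1] * 1000, k=1.0e8, c=1.0e6, dt=1.0e-3)
    print(history[0], history[-1])
```

Under a constant velocity the force rises with time constant $c/k$ toward the pure dashpot value $c\,v$, which is why the series spring matters mainly for fast switching and high-frequency motion.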



Vibration Control System Design for Buildings, Fig. 3 Semi-active
base isolation system


In 2008, another type of semi-active base isolation system was
installed in the Collaboration Complex on the Hiyoshi campus of Keio
University in Yokohama, Japan. The system consists of eight
semi-active dampers along with eight conventional hydraulic fluid
dampers in each direction of the X-Y axes. The maximum force of both
the semi-active damper and the conventional hydraulic fluid damper
is about 1,000 kN, and the semi-active damper can change its damping
coefficient in two steps: a high side of 3.68 MN s/m and a low side
of 1.23 MN s/m. When an earthquake occurs, the high damping
coefficient used in normal status is switched to the low side. This
switch suppresses the acceleration response of the building in the
early stage of the earthquake. After the early stage, depending on
the filtered acceleration response of the isolation layer, the
damping coefficient is switched back to the high side to avoid a
collision of the building with the foundations.
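The two-step switching sequence can be sketched as a small state machine. Only the two damping coefficients come from the text; the trigger threshold, the return threshold, and the first-order low-pass filter are hypothetical stand-ins, since the entry does not give the actual switching criteria.

```python
C_HIGH = 3.68e6  # N s/m, high side (from the text)
C_LOW = 1.23e6   # N s/m, low side (from the text)


class TwoStepDamperSwitch:
    """Normal (high) -> early stage (low) -> late stage (high)."""

    def __init__(self, trigger_acc, return_acc, filter_alpha):
        self.trigger_acc = trigger_acc    # hypothetical earthquake trigger
        self.return_acc = return_acc      # hypothetical return threshold
        self.filter_alpha = filter_alpha  # hypothetical low-pass gain in (0, 1]
        self.filtered = 0.0
        self.state = "normal"

    def update(self, ground_acc, layer_acc):
        """Advance one sample and return the damping coefficient to apply."""
        # First-order low-pass filter of the isolation-layer acceleration.
        self.filtered += self.filter_alpha * (abs(layer_acc) - self.filtered)
        if self.state == "normal" and abs(ground_acc) >= self.trigger_acc:
            self.state = "early"
        elif self.state == "early" and self.filtered >= self.return_acc:
            self.state = "late"
        return C_LOW if self.state == "early" else C_HIGH


if __name__ == "__main__":
    switch = TwoStepDamperSwitch(trigger_acc=0.05, return_acc=0.5, filter_alpha=0.5)
    for ground, layer in [(0.0, 0.0), (0.1, 0.1), (0.1, 2.0), (0.0, 0.1)]:
        print(switch.update(ground, layer))
```

Once the late stage is reached the sketch stays on the high side; resetting to the normal state after the event is left out for brevity.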

Magneto-rheological (MR) fluid dampers have been studied by many
researchers, and in 2001, two 300 kN MR fluid dampers were installed
in Nihon-Kagaku-Miraikan, the Tokyo National Museum of Emerging
Science and Innovation. Similarly, in 2003, 400 kN MR fluid dampers
were installed in a residential building in Japan (Fujitani et al.
2003).

Although MR fluid dampers have been controlled by various laws
(Jansen and Dyke 2000), a gain-scheduled control method was
introduced to control the electric current supplied to the
electromagnet of the MR damper (Nishimura et al. 2002). A system
controlled by a damping force is a bilinear control system, because
the control force depends not only on the relative velocity but also
on the damping coefficient. A virtual semi-active damper model was
proposed that changes the damping coefficient through the valve
opening ratio, which is assumed to follow second-order dynamics
driven by the input. In this model, the optimized variable damping
coefficient is determined by the input. Moreover, the valve opening
ratio is limited to certain values to constrain the damping force to
its maximum value. However, the controllability of the bilinear
control system using the variable damping force depends on the
relative velocity of the damper. If the relative velocity equals
zero, the system is uncontrollable. Thus, the system is treated
separately for positive and negative relative velocity. Furthermore,
the current-force relationship of the MR damper is taken into
account.

The control method using an MR damper was verified on a 9 m high
building-like structure. The structure had four degrees of freedom
and a total weight of about 33,000 kg. The MR damper has a maximum
force of about 40 kN, and its current-force relationship is
nonlinear (Watakabe et al. 2008). The experimental results
demonstrated good seismic isolation performance in comparison with
skyhook control. The proposed gain-scheduled control varied the
damping force smoothly according to the input current.
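The skyhook baseline and the loss of controllability at zero relative velocity can both be seen in a minimal sketch of a clipped skyhook law. This is a simpler, commonly used technique and not the gain-scheduled controller of the cited work. It assumes the damper force on the building is $-c\,v_{rel}$ with $c$ adjustable between `c_min` and `c_max`, and that the desired force is $-c_{sky}\,v_{abs}$; all names and values are hypothetical.

```python
def clipped_skyhook_coefficient(v_abs, v_rel, c_sky, c_min, c_max):
    """Damping coefficient that best matches the skyhook force -c_sky * v_abs.

    v_abs: absolute velocity of the building.
    v_rel: relative velocity across the damper (building minus base).
    """
    if v_rel == 0.0:
        # With zero relative velocity no coefficient can produce any force.
        return c_min
    c_desired = c_sky * v_abs / v_rel
    # A semi-active damper can only dissipate, so clip to the feasible range.
    return min(max(c_desired, c_min), c_max)


def damper_force(c, v_rel):
    """Force applied to the building by the damper."""
    return -c * v_rel


if __name__ == "__main__":
    for v_abs, v_rel in [(1.0, 0.5), (1.0, -0.5), (1.0, 0.0)]:
        c = clipped_skyhook_coefficient(v_abs, v_rel, 100.0, 10.0, 500.0)
        print(c, damper_force(c, v_rel))
```

When the absolute and relative velocities have opposite signs the desired force would require energy input, so the law falls back to the lowest coefficient; the abrupt changes this produces are one motivation for smoother schemes such as the gain-scheduled control described above.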

Full Active Base Isolation 

Full active base isolation systems have been studied by many
researchers (Nishimura and Kojima 1999), who showed that such systems
are affected by saturation of the actuator force during a large
earthquake. The seismic isolation performance should be maintained
even when force saturation occurs. To this end, it was proposed to
represent the saturation by a hyperbolic function that smooths the
input force (Itagaki and Nishimura 2005).
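A smooth saturation of this kind can be sketched with the hyperbolic tangent. The specific function used by Itagaki and Nishimura (2005) is not given in this entry, so `tanh` is an assumption made here for illustration, as are the function names and the force limit.

```python
import math


def hard_saturation(u, u_max):
    """Ideal actuator clipping, shown for comparison."""
    return max(-u_max, min(u_max, u))


def smooth_saturation(u, u_max):
    """Differentiable approximation of the clipping using tanh."""
    return u_max * math.tanh(u / u_max)


if __name__ == "__main__":
    u_max = 1000.0  # hypothetical force limit in kN
    for u in (10.0, 500.0, 1000.0, 5000.0):
        print(u, hard_saturation(u, u_max), smooth_saturation(u, u_max))
```

The smooth version follows the commanded force closely for small inputs, never exceeds the limit, and has a continuous derivative, which is convenient when the saturation is included in the controller design model.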

In 2010, the Obayashi Corporation implemented an active base
isolation system in real buildings (Endo et al. 2011). Two hydraulic
actuators are connected to the building through a spring in each
direction of the X-Y axes to avoid the transmission of
high-frequency vibration from the actuator to the building. The
control system is based on displacement control of the hydraulic
actuator and achieves absolute seismic control. The control force is
needed to cancel the spring and damper forces in the isolation
layer, and a skyhook damper force is added to the control force to
stabilize the whole system.
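The structure of this control force can be illustrated with a single-mass sketch. It assumes a rigid building of mass m on an isolation layer with stiffness k and damping c, so that $m\ddot{x} = -k\,x_{rel} - c\,v_{rel} + u$, and it ignores the connecting spring and the actuator displacement loop of the real system. All names are hypothetical.

```python
def active_isolation_force(x_rel, v_rel, v_abs, k, c, c_sky):
    """Control force: cancel the isolation-layer forces, add skyhook damping."""
    return k * x_rel + c * v_rel - c_sky * v_abs


def net_force_on_building(x_rel, v_rel, v_abs, k, c, c_sky):
    """Total force on the building mass with the control applied."""
    u = active_isolation_force(x_rel, v_rel, v_abs, k, c, c_sky)
    return -k * x_rel - c * v_rel + u


if __name__ == "__main__":
    # Hypothetical state and parameters.
    print(net_force_on_building(0.05, 0.2, 0.1, k=2.0e6, c=1.0e5, c_sky=5.0e5))
```

With this force the ground motion no longer enters the equation of the building, which reduces to $m\ddot{x} = -c_{sky}\dot{x}$; this is the sense in which the layer forces are cancelled and the skyhook term alone stabilizes the motion.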

A trigger mechanism using a friction damper is connected in series
with the hydraulic actuators and prevents the transmission of excess
input force from the actuator to the building. If a failed actuator
generates excess input force, the friction damper can absorb a force
of about 1,000 kN so that neither the building nor the actuator
itself is damaged. The maximum force of the hydraulic actuator is
1,100 kN, the maximum displacement of the hydraulic actuator is
200 mm, the maximum displacement of the lead-rubber bearing is
500 mm, the maximum displacement of the trigger mechanism with the
friction damper is 750 mm, the spring constant of the connected
spring is 16,300 kN/mm, and the maximum stroke is 58 mm. Simulations
demonstrated that the active base isolation system performed well
compared with passive isolation, especially during earthquakes with
maximum acceleration less than 200 cm/s².

Summary and Future Directions 

Seismic retrofitting may become increasingly 
important for protecting buildings from large and 
long-period earthquakes. Future studies should aim to optimize
structural mitigation as a whole system in order to achieve
effective energy dissipation and seismic isolation of buildings.
Energy harvesting from vibration control and three-dimensional
isolation devices are expected to draw attention in the near future.

Cross-References 

► H-Infinity Control 

► Linear Quadratic Optimal Control 

► LMI Approach to Robust Control 

► Modeling of Dynamic Systems from First Principles

► Stochastic Linear-Quadratic Control 

Recommended Reading 

Vibration control system design for buildings has been summarized in
many journal papers over the last several years. Spencer and Sain
(1997), Soong and Spencer (2002), and Spencer and Nagarajaiah (2003)
discuss applications of vibration control systems to buildings or
bridges to support infrastructures. Rossetto and Duffour (2012) and
Saatcioglu (2012) discuss earthquake-resistant design and structural
mitigation of earthquakes with structural control, including passive
devices.

Abstract

This article presents an overview of mobile 
“walking” robots that use their legs to move from 
one place to another. Walking robots represent 
a fascinating class of machines which holds 
the potential for breakthrough applications and 
inspires multidisciplinary research with rich 
scientific content. The key feature that separates 
walking robots from all other classes of mobile 
robots is their ability to explore unprepared 
surfaces using discrete footholds. In this respect, 
these robots are truly the machine counterparts of 
biological land animals. 


Keywords 

Balance; Fall; Gait; Humanoid robots 


Introduction 

The adventure of modern robotics is generally considered to have
started in the middle of the twentieth century (International
Federation of Robotics 2011). During the first few decades of this
new journey, robots were not mobile. Somewhat similar to trees, these
so-called “arm” manipulator robots were securely rooted to the
ground. The free end of these robots typically consisted of an
end-effector “hand” with which a number of mostly
manufacturing-related tasks, such as welding, spray-painting, and
pick-and-place operations, were performed. Life was simple, if a bit
boring. However, from the end of the 1960s, this started to change.

Fiction writers had earlier imagined a variety of mobile robots such
as those in “I, Robot” (Asimov 1950), Otho (Hamilton 1940), and Maria
(Malone 2004). Scientists and engineers also ventured to build a
number of quite sophisticated machines such as the General Electric
experimental “walking truck” quadruped robot by Mosher shown in
Fig. 1 and the Sparko and Elektro by Westinghouse
(http://en.wikipedia.org/wiki/Elektro). However, they were not
considered truly autonomous in the sense in which we describe modern
robots. Some of the major personalities who are primarily
responsible for forever transforming the stationary existence of
robots and giving them intelligent mobility are Profs. I. Kato,
M. Vukobratovic, and R. McGhee, followed by Prof. M. Raibert.

Because walking robots used legs for locomotion, they immediately
became the mechatronic cousins of the entire range of biological
legged creatures, from tiny creatures to large animals. Indeed,
today we have robotic versions of spiders and cockroaches, geckoes
and lizards, dogs and cheetahs, and even humanoids. We have seen very
large robots such as the ASV (Waldron and McGhee 1986) shown in
Fig. 2 and the Dante (Bares and Wettergreen 1999) shown in Fig. 6. We
have also seen single-legged robots, which even Mother Nature has
not considered creating so far.

Walking Robots, Fig. 1 GE “walking truck” developed by Mosher

Walking Robots, Fig. 2 Adaptive suspension vehicle (ASV), Ohio State
University


Early History 

The early researchers whom we mentioned 
above started paving the way for walking robots. 
These robots walked with their legs, explored 
their own environments, and sometimes even 
ventured outside. Once these walking robots 
started appearing on the scene, life was never the 
same. 

Prof. Kato pioneered walking robot research at Waseda University
(Japan) through a series of remarkable biped humanoid robots, of
which WL-5 is credited with genuine bipedal walking and WL-6 with
displaying the first dynamic gait. At the same time,
Prof. Vukobratovic was conducting research activities in exoskeleton
and other areas at the Mihailo Pupin Institute (former Yugoslavia).
He was instrumental in formalizing the concept of dynamic balance
using the zero moment point (ZMP) concept (Sardain and Bessonnet
2004; Vukobratovic and Juricic 1969), which is used to this day. In
the USA, Prof. McGhee conducted path-breaking research on
computer-controlled machines at the Ohio State University. He
created the Ohio hexapod and later, with colleague Prof. Ken Waldron,
developed the truly spectacular Adaptive Suspension Vehicle (ASV)
hexapod.

Prof. Raibert started building robots in the USA, first at Carnegie
Mellon University and then at Massachusetts Institute of Technology
(Raibert 1989). With his colleagues, he created a series of robots,
which, unlike their stationary predecessors, were characteristically
full of energy. Situation permitting, they would occasionally
deviate from conventional walking and running and would burst into
aerial somersaults and other acrobatic motions. Prof. Raibert
continues to actively shape the field of walking robots to the
present day; his company Boston Dynamics (recently acquired by
Google Inc.) has introduced a number of high-performance robots,
such as LittleDog, BigDog, RHex, Petman, and Atlas.

The hardware, sensing, and control aspects of walking robots were
steadily gaining sophistication during the 1990s. However, except
for the new appreciation of walking dynamics in the study of passive
bipedal gait (McGeer 1990), there was no unexpected leap in the world
of walking robots. This changed in 1996 when Honda publicly
announced the humanoid robot P2, the result of a robotics project
that was until then unknown to the outside world. P2 was superseded
by the P3 robot and then by the ASIMO humanoid robot project in 2000,
which became another important event in the history of humanoid
robots.


Characteristics of Walking Robots 

Compared to other forms of land locomotion, 
legged walking possesses the distinct capability 
of locomotion using discrete footholds (Raibert 
1989). Unlike wheeled mobile robots or cars, 
walking robots do not need a continuous prepared 
surface such as paved road, trail, or track in order 
to travel. By virtue of this single feature, a vast 
extent of land surface, which is not accessible 
to wheeled robots, opens up to walking robots. 
Indeed, at least in principle, walking robots are 
able to reach almost any location, on earth and on 
other planets, wherever humans and other legged
creatures can go.

Legged locomotion is natural to terrains where the only means of
locomotion must be through the use of unstructured footholds, which
can be irregularly spaced both horizontally and vertically. Due to
the unique design of the leg, legged creatures can largely isolate
the “payload” or the upper body from the geometric details of the
terrain profile during locomotion. Both for biological creatures and
for walking robots, this brings benefit in the form of significant
energy savings. For walking robots this also reduces mechanical
stress, vibration, and wear on the system hardware, which makes them
suitable for locomotion in rough natural terrain.

In contrast, wheeled robots are typically faster, 
mechanically less complex, and energetically 
more efficient. However, these benefits depend on very expensive
supporting infrastructure such as roads and tracks. In many places
such expenditure is not practical or even desirable.


Classification of Walking Robots 

Walking robots have been built in different sizes and morphologies.
They range from small hexapods (Lewinger et al. 2005) and
medium-sized robots (Fig. 4) to relatively large robots such as the
BigDog (Raibert et al. 2008) from Boston Dynamics and the Toyota
iWalk (Fig. 7), and a few giant robots such as Dante (Bares and
Wettergreen 1999) and Ambler (Fig. 6) from CMU and the ASV (Waldron
and McGhee 1986) from OSU. With further miniaturization, it is
conceivable that we will see even smaller walking robots in the
future with unanticipated and surprising application domains. One
can also imagine gigantic walking robots for large construction
sites such as those for bridges, buildings, or ships, but we have
not started seeing them just yet.

In terms of the number of legs, we have already seen monopods,
Figs. 3b and 4a; bipeds, Fig. 8a-c; a tripod, Fig. 4b; quadrupeds,
Fig. 5a, b; hexapods, Figs. 4c, d and 2; an octopod, Fig. 4e; and
“centipede” robots with many legs, Fig. 4f.

Other than monopods and a tripod, robots with an odd number of legs
are curiously absent from this list. Creatures with an odd number of
legs are also not found in nature. It is not clear whether there is
an engineering rationale behind this trend or whether the biological
inspiration is simply missing for the creators of legged robots.

In addition to size and morphology, walking robots can be classified
in terms of the number and types of leg joints, type of gait (e.g.,
walking or running), or the domain of movement. The next section is
devoted to humanoid robots, which are perhaps the most popular class
of walking robots.


Humanoid Robots 

Humanoid robots belong to a unique class of two-legged walking robots
that has a special place in the popular psyche. These robots are the
subject of special affection and fascination due to their similarity
with human beings. In fact, humanoid robots might be the original
inspiration behind the entire field of robotics and perhaps also its
ultimate goal. Perpetually inspired by movies and novels, humans
have long dreamed of creating a mechatronic replica of themselves
that is fully general-purpose and endowed with all human
functionalities except perhaps full independence of thought and
action.

Walking Robots, Fig. 3 Early walking robots: (a) Waseda WL-10 (Image
courtesy of Atsuo Takanishi) and (b) one-legged robot (Image courtesy
of Boston Dynamics)

Humanoid robots exist in different sizes, including smaller robots
such as NAO (Gouaillier et al. 2009), HOAP, and QRIO (Ishida et al.
2004) and life-sized robots such as HRP, HUBO, and ASIMO. Despite
their differences, these robots bear a close resemblance to the
kinematic design and proportions of a human being and share a common
human-mimicking morphology. Indeed, the perceived similarity between
humanoid robots and humans is so close that we routinely describe
aspects of such robots using anthropomorphic terms. Terms like head,
arm, hand, leg, thigh, shank, ankle, spine, gait, stumble, fall,
facial expression, and even emotion are hardly ever used to describe
any other man-made device. Some popular humanoid robots are shown in
Fig. 9.


At the current technical level, humanoid robots cannot compete in
actual utility with robots such as Roomba the vacuum cleaner, the
bomb-sniffing robot, or the huge population of fully active and
cost-effective welding and spray-painting robots. Yet, our
fascination with humanoids remains as strong as ever, and novel
applications of such robots are continuously being explored
(Fig. 7). Humanoid robots are currently considered in the roles of
educators (Falconer 2013; Yamasaki and Nakagawa 2006), dance partners
(Kosuge 2010), waiters, babysitters, companions for autistic
children or for seniors (Robins et al. 2012), security personnel, or
emergency responders. Curiously, the functionality of walking is not
relevant or central to many of these roles.

Dynamic Equations of Walking Robots

Walking Robots, Fig. 4 Walking robots with different number of legs:
(a) monopod, Toyota hopping robot; (b) tripod, STriDER, RoMeLa (Image
courtesy of Dennis Hong); (c) large hexapod, McGhee, OSU; (d) RHex
(RHex robot image courtesy of Boston Dynamics); (e) octopod, Spider,
RoMeLa (Image courtesy of Dennis Hong); and (f) many legs, centipede,
Harvard

Walking Robots, Fig. 5 Two quadruped robots: (a) Sony Aibo (Image
courtesy of Sony) and (b) BigDog robot (Image courtesy of Boston
Dynamics)

Walking Robots, Fig. 6 Large walking robots: (a) Dante II, CMU;
(b) Ambler, CMU; and (c) John Deere Walking Tractor

Walking Robots, Fig. 7 Novel application of walking robots:
human-carrying “chair” robots, (a) iWalk of Toyota and (b) WL-16RV
multi-purpose biped locomotor from Waseda University (Image courtesy
of Atsuo Takanishi)

The dynamic equations of a walking robot can be expressed in the
following form:

$$H(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + \tau_g(q) = \tau + \tau_c + \tau_{ext}, \qquad (1)$$

where q is the vector of the robot’s generalized coordinates, which
contains the world frame transformation matrix of its base link and
all its joint angles. The generalized velocity vector is expressed
as $\dot{q} = [v_b\ \ \dot{\theta}]^T$, where $v_b$ is the base velocity and
$\dot{\theta}$ is the vector of joint velocities. Additionally, H is the
joint-space inertia matrix; C is the matrix of Coriolis,
centrifugal, and gyroscopic terms; and $\tau_g$ is the vector of gravity
terms. Finally, $\tau = [0\ \ \tau_j]^T$ is the joint torque vector, in which
the zero block corresponds to the base coordinates and $\tau_j$ collects
the actuated joint torques; $\tau_c = J_c^T f_c$ is the joint torque
resulting from the contact forces $f_c$ such as those from the ground;
and $\tau_{ext} = J_e^T f_e$ is the joint torque due to external interaction
forces $f_e$.
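How Eq. 1 is used can be illustrated with a minimal sketch that solves it for the generalized accelerations. The sketch treats the generalized coordinates as a plain list of numbers, takes all matrices and torque vectors as given, and uses a small Gaussian elimination routine in place of a numerical library; the function names, the argument `n_base`, and the numbers in the example are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= factor * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def forward_acceleration(H, C, tau_g, qdot, tau_j, n_base, tau_c, tau_ext):
    """Solve H*qdd + C*qdot + tau_g = tau + tau_c + tau_ext for qdd."""
    n = len(qdot)
    # Underactuation: the first n_base (base) coordinates receive no torque.
    tau = [0.0] * n_base + list(tau_j)
    if len(tau) != n:
        raise ValueError("n_base + len(tau_j) must equal len(qdot)")
    coriolis = [sum(C[i][j] * qdot[j] for j in range(n)) for i in range(n)]
    rhs = [tau[i] + tau_c[i] + tau_ext[i] - coriolis[i] - tau_g[i]
           for i in range(n)]
    return solve_linear(H, rhs)


if __name__ == "__main__":
    # Toy system: one unactuated base coordinate and one actuated joint.
    H = [[2.0, 1.0], [1.0, 3.0]]
    C = [[0.0, 0.0], [0.0, 0.0]]
    print(forward_acceleration(H, C, [4.0, 1.0], [0.0, 0.0],
                               [6.0], 1, [4.0, 0.0], [0.0, 0.0]))
```

In the toy example the base coordinate is moved only by the contact torque and by inertial coupling with the joint, since no actuator acts on it directly.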

The contact conditions which the robot must satisfy can be written
in the form of Eq. 2. The physical constraints due to ground
friction, center of pressure (CoP) condition (explained
subsequently), torque limits, etc., can be expressed as in Eq. 3:

$$J_c(q)\,\ddot{q} = b(q,\dot{q}), \qquad (2)$$

$$A\,[\ddot{q}\ \ \tau\ \ f_c]^T \le b(q,\dot{q}). \qquad (3)$$

The friction condition ensures that the robot feet do not slide on
the ground, and the CoP condition corresponds to maintaining the
resultant of the ground reaction force (GRF) within the perimeter of
the support polygon (Sardain and Bessonnet 2004) so that toppling is
prevented.

Some of the generalized coordinates of the robot, specifically those
which describe the pose of the base link relative to the world
frame, are not powered, as is apparent from the zero block in the
joint torque vector $\tau = [0\ \ \tau_j]^T$ in Eq. 1, where $\tau_j$ denotes the
actuated joint torques. In other words, the robot is underactuated.
In fact, all walking robots are underactuated, and this is one of
the central characteristics that sets these robots apart from other
robots. Underactuation plays a very important role in the dynamics,
motion planning, and control of walking robots (Chevallereau et al.
2005).


Balance and Stability 

Even after several decades of research, balance maintenance has
remained one of the most important issues of walking robots and
especially of humanoid robots. Although the basic dynamics of
balance are currently understood (Sardain and Bessonnet 2004;
Vukobratovic and Juricic 1969), robust and general controllers that
can deal with discrete and nonlevel foot support as well as large,
unexpected, and unknown external disturbances such as from a moving
support, a slip, and a trip have not yet emerged. In comparison with
the elegance and versatility of human balance, present-day humanoid
robots appear quite deficient.

Walking Robots, Fig. 8 Three well-known human-sized humanoid robots:
(a) ASIMO, Honda; (b) HRP-2, AIST (Image courtesy of AIST); and
(c) HUBO, Korea

Balance generally refers to the ability of a walking robot to
maintain a sustained gait with a reasonably upright posture without
falling (Kajita and Espiau 2008). Robot gait can be static or
dynamic. A robot with a static gait would continue to stay upright
even if its joints were suddenly frozen. Static gait and movement
under static balance are safe but are slow and lack elegance. A
dynamic gait is fluid and natural looking as it harnesses and
exploits the inertial characteristics of the physical robot.
However, the robot must be in motion to sustain an upright posture.
Suddenly locking the joints may cause a fall.

The location and the nature of the resultant GRF on the support
polygon of the robot have been traditionally used to interpret the
dynamic state of the robot’s movement. The point where the resultant
GRF acts on the robot is called its zero moment point (ZMP), and it
is equivalent to the CoP for planar support. Figure 11 explains the
concept of CoP.

Walking Robots, Fig. 9 Three popular humanoid robots: (a) AIST HRP-4
(Image courtesy of AIST), (b) Toyota Partner Robot, and (c) Waseda
University Wabian (Image courtesy of Atsuo Takanishi)

Walking Robots, Fig. 10 Three small humanoid robots: Aldebaran NAO
(Image courtesy of Aldebaran), Fujitsu HOAP-2, and Sony QRIO (Image
courtesy of Sony)

As shown in Fig. 11, two types of interaction
forces act on the foot at the foot/ground interface.
They are the normal forces $\mathbf{f}_{ni}$, always directed
upward (Fig. 11, left), and the frictional tangential
forces $\mathbf{f}_{ti}$ (Fig. 11, middle). CoP, denoted
by $P$, is the point where the resultant
$\mathbf{R}_n = \sum_i \mathbf{f}_{ni}$ acts. With respect to a coordinate origin $O$,

$$ \overrightarrow{OP} = \frac{\sum_i \mathbf{r}_i f_{ni}}{\sum_i f_{ni}}, $$

where $\mathbf{r}_i$ is the vector to the point of action of
force $\mathbf{f}_{ni}$ and $f_{ni}$ is the magnitude of $\mathbf{f}_{ni}$.

Walking Robots, Fig. 11 Definition of center of pressure (CoP), shown for one foot of a humanoid robot. The idea can be extended to any walking robot, and in a general setting, a single footprint is replaced by the support polygon, which is the convex hull of all ground contacts of the robot

The foot/ground constraint is unilateral,
$f_{ni} \geq 0$, which implies that $P$ must
lie within the support polygon. The resultant of
the tangential forces may be represented at $P$
by a force $\mathbf{R}_t = \sum_i \mathbf{f}_{ti}$ and a moment
$\mathbf{M} = \sum_i \mathbf{r}_i \times \mathbf{f}_{ti}$, where $\mathbf{r}_i$ is here the vector from $P$ to
the point of application of $\mathbf{f}_{ti}$. A basic control
objective for walking robots is to maintain the
CoP within the perimeter of the support polygon.
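
As an illustration of the CoP definition and of the control objective just stated, the following minimal Python sketch computes the CoP as the force-weighted average of the contact points and tests whether it lies inside a convex support polygon. The function names, the planar (x, y) representation of the contact points, and the counter-clockwise vertex ordering are assumptions made for this example; they are not taken from the article.

```python
def center_of_pressure(points, normal_forces):
    """Return the CoP (x, y) of planar contact points.

    points        -- list of (x, y) contact locations (hypothetical input format)
    normal_forces -- list of normal force magnitudes f_ni, each >= 0
    """
    if any(f < 0.0 for f in normal_forces):
        raise ValueError("unilateral contact: normal forces must be non-negative")
    total = sum(normal_forces)
    if total <= 0.0:
        raise ValueError("no ground contact: total normal force is zero")
    px = sum(f * x for (x, _), f in zip(points, normal_forces)) / total
    py = sum(f * y for (_, y), f in zip(points, normal_forces)) / total
    return px, py


def inside_convex_polygon(p, vertices):
    """True if point p lies in the convex polygon (vertices listed counter-clockwise)."""
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # z-component of (edge) x (p - vertex); negative means p is right of the edge
        cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
        if cross < 0.0:
            return False
    return True


if __name__ == "__main__":
    foot = [(0.0, 0.0), (0.2, 0.0), (0.2, 0.1), (0.0, 0.1)]  # rectangular sole
    cop = center_of_pressure(foot, [10.0, 10.0, 10.0, 10.0])
    print(cop, inside_convex_polygon(cop, foot))
```

Because the CoP is a convex combination of the contact points, it can never leave their convex hull; a balance controller would in practice monitor how close the CoP comes to the polygon boundary.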


Safety 

Safety is a paramount concern in any application
where robots are likely to coexist with humans
in interactive environments. The mobility of
walking robots adds to this concern.

Among the many situations in which safety is
an issue, loss of balance followed by a fall is
particularly worrisome for walking robots. All
walking robots, and in fact all mobile robots,
are subject to this unique “failure” mode. A fall
may be caused by unexpected or excessive external
forces, by unusual or unknown slipperiness, or by
the slope or profile of the ground, any of which
can cause the robot to slip, trip, or topple. A fall
can also result when the balance controller is
partially or fully incapacitated by an internal
failure of the robot involving its sensors or actuators.

A fall can be costly in terms of damage to
the robot and, depending on the shape and
size of the robot, can also result in external damage
and injury to humans.

For humanoid robots, a fall is a particularly
serious issue (Fujiwara et al. 2002). Humanoid
robots, like humans, have a larger ratio of
CoM height to support area size, which makes
them more susceptible to falling in case of a failure.
At the same time, because of their higher CoM, a fall
of such a robot generally involves higher kinetic
energy, which can cause greater damage and
injury.
Summary

Walking robots represent an important class of 
autonomous machines which can find application 
in the general area of service robotics. Their
mobility makes these robots uniquely capable
of serving in niche areas such as plant maintenance
and security, disaster response, personal
companionship, and so on. Humanoid walking robots
have attracted strong popular fascination, and this 
has fueled their rapid development. At present it 
appears that defense-related applications are the 
most likely to experience practical use of walking 
robots. 

Walking robots possess interesting and complex
kinematics and dynamics. Control of such
machines, especially with regard to balancing, 
motion planning, and reactive behavior, is a rich 
research area that is challenging and demands 
special skill-sets. 


Cross-References 

► Disaster Response Robot 

► Redundant Robots 

► Robot Motion Control 

► Robot Teleoperation 

► Underactuated Robots 


Recommended Reading 

Out of the references listed below, Vukobratovic 
and Juricic (1969) is the earliest paper dealing 
with bipedal robot balance, and it introduces the 
concept of ZMP. A very good recent overview of
legged robots can be found in Kajita and Espiau 
(2008). Also of interest is the foundational paper 
on passive bipedal gait by McGeer (1990). 
Wheeled Robots

Abstract

The use of mobile robots in service applications 
is steadily increasing. Most of these systems 
achieve locomotion using wheels. As a consequence,
they are subject to differential constraints
that are nonholonomic, i.e., non-integrable. This
article reviews the kinematic models of wheeled
robots arising from these constraints and discusses
their fundamental properties and limitations
from a control viewpoint. An overview of
the main approaches for trajectory planning and 
feedback motion control is provided. 


Keywords 

Differential flatness; Nonholonomic constraints; 
Nonlinear controllability; Smooth stabilizability 


Introduction 

Although all robots are, by definition, capable 
of movement, the expression mobile robots 
is mainly used to indicate robots that can 
displace their own base by means of some 
locomotion mechanism. Most often, this consists 
of a set of wheels. The main advantage of 
mobile robots over fixed-base manipulators
is their virtually unlimited workspace. As a
consequence, such robots are fundamental in 
service applications, which require increased 
capabilities of autonomous motion. 

More precisely, from a mechanical viewpoint, 
a wheeled robot essentially consists of a rigid 
body (base) equipped with a system of wheels. 
This basic arrangement may be complicated, for 
example, by attaching to the base one or more 
trailers, or by mounting a manipulator on the base 
(mobile manipulator). 

Any wheeled vehicle is subject to kinematic
constraints that in general reduce its local mobility
while leaving intact the possibility of reaching
arbitrary configurations by appropriate maneuvers.
For example, any driver knows by experience
that, while it is impossible to move a car
instantaneously in the direction orthogonal to its
heading, it is still possible to park it anywhere,
at least in the absence of obstacles. This peculiar
feature makes wheeled mobile robots very challenging
from the control viewpoint, and in fact,
some recent developments in nonlinear control
were triggered by the study of these systems.

Here, we will consider only mobile robots that 
are equipped with conventional wheels, either 
orientable or fixed (as the front or rear wheels of a 
car, respectively). Omnidirectional mobile robots 
realized using, e.g., Mecanum wheels, are not 
covered in this article. Indeed, the local mobility 
of these vehicles is unrestricted, and therefore no 
special control treatment is necessary. 

The most popular wheel arrangement for mobile
robots is the differential drive, in which two
fixed wheels whose axes of rotation coincide are
controlled by separate actuators (see Fig. 1). One
or more passive (caster) wheels are usually added
for static balance. This wheeled robot is the
most agile, in that it can rotate on the spot by
applying equal and opposite angular speeds to the
wheels. A kinematically equivalent arrangement
is the synchro drive, in which three orientable
wheels are synchronously driven by two motors
through mechanical coupling; the first motor provides
traction, whereas the second controls the
common orientation of the wheels.

Other possible wheel arrangements are those
of a tricycle (one steering and two fixed wheels)
and of a car (two steering and two fixed wheels).
Vehicles of this type are, however, less common
in robotics, due partly to their reduced maneuverability
(they have a nonzero turning radius) and
partly to their increased mechanical complexity.
For example, both these vehicles require a specific
device (differential) for distributing traction
torque to the driving wheels.

Wheeled Robots, Fig. 1 The Pioneer by Adept is a popular differential-drive platform

Modeling 

The starting point for modeling wheeled mobile 
robots is the single wheel. This may be repre¬ 
sented as an upright disk rolling on the ground. 
Its configuration is described by three generalized
coordinates: the Cartesian coordinates $(x, y)$ of
the contact point with the ground, measured in
a fixed reference frame, and the orientation $\theta$
of the disk plane with respect to the $x$ axis
(see Fig. 2). The configuration vector is therefore
$q = (x \; y \; \theta)^T$. The pure rolling constraint is
expressed as

$$ \dot{x} \sin\theta - \dot{y} \cos\theta = \begin{pmatrix} \sin\theta & -\cos\theta & 0 \end{pmatrix} \dot{q} = 0 \qquad (1) $$

and entails that, in the absence of slipping, 
the velocity of the contact point has a zero 
component in the direction orthogonal to the 
wheel plane. The angular speed of the wheel 
around the vertical axis is instead unconstrained. 


Wheeled Robots, Fig. 2 Generalized coordinates for a single wheel

The kinematic constraint (1) is nonholonomic , 
i.e., it cannot be integrated to a geometric 
constraint; this may be easily shown using
the Frobenius theorem, a well-known differential
geometry result on integrability of differential 
forms. An important consequence of this fact is 
that constraint (1) implies no loss of accessibility 
in the configuration space of the wheel. 

In a single-body vehicle equipped with multiple
wheels, the $n$-dimensional configuration vector
$q$ consists of the Cartesian coordinates of a
representative point on the robot, the orientation
of all independently orientable wheels, plus the
orientation of the body if there are fixed wheels.
By writing one pure rolling constraint like (1)
for each independent wheel, orientable or fixed,
and expressing it in the chosen generalized coordinates,
one obtains a set of $k$ constraints in the
form

$$ A^T(q)\,\dot{q} = 0. \qquad (2) $$

Kinematic constraints of this form (i.e., linear in
the generalized velocities) are called Pfaffian. In
wheeled mobile robots, Pfaffian constraints are in
general completely nonholonomic.

The $k$ Pfaffian constraints (2) reduce the number
of degrees of freedom (i.e., independent instantaneous
motions) of the robot to $m = n - k$.
In particular, at each configuration $q$, the generalized
velocities must belong to the $m$-dimensional
null space of matrix $A^T(q)$:

$$ \dot{q} = \sum_{j=1}^{m} g_j(q)\,u_j = G(q)\,u, \qquad (3) $$

where vectors $g_1(q), \dots, g_m(q)$ are a basis of
$\mathcal{N}(A^T(q))$ and $u = (u_1 \dots u_m)^T$ is a coefficient
vector. Kinematically admissible trajectories are
the solutions of (3), which is called the kinematic
model of the wheeled mobile robot. This model
can be seen as a nonlinear dynamic system, with
$q$ as state and $u$ as input. In particular, system (3)
is driftless and has more state variables than
control inputs.

For example, consider the unicycle, a rather
ideal mobile robot equipped with a single, orientable
wheel. The generalized coordinates for
this robot are $q = (x \; y \; \theta)^T$, the same as the single
wheel, and the vehicle is subject to the rolling
constraint (1). One possible kinematic model for
the unicycle is then

$$ \dot{q} = \begin{pmatrix} \cos\theta \\ \sin\theta \\ 0 \end{pmatrix} v + \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \omega, \qquad (4) $$

where $v = \sqrt{\dot{x}^2 + \dot{y}^2}$ and $\omega = \dot{\theta}$ represent,
respectively, the driving and steering velocity of
the wheel. Both the differential-drive and the
synchro-drive robots are kinematically equivalent
to the unicycle, i.e., their kinematic model can be
put in the form (3) by properly defining $q$ and $u$.
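
The unicycle model (4) and its relation to a differential-drive robot can be sketched in a few lines of Python. The explicit-Euler integration, the function names, and the parameters `wheel_radius` and `axle_length` (distance between the two wheel centers) are assumptions introduced for this example; the wheel-speed mapping is the standard one for two coaxial fixed wheels.

```python
import math


def unicycle_step(q, v, omega, dt):
    """One explicit-Euler step of the unicycle kinematic model (4)."""
    x, y, theta = q
    return (x + dt * v * math.cos(theta),
            y + dt * v * math.sin(theta),
            theta + dt * omega)


def diff_drive_to_unicycle(omega_right, omega_left, wheel_radius, axle_length):
    """Map wheel angular speeds to the equivalent unicycle inputs (v, omega)."""
    v = wheel_radius * (omega_right + omega_left) / 2.0
    omega = wheel_radius * (omega_right - omega_left) / axle_length
    return v, omega


def rolling_residual(q, q_next, dt):
    """Left-hand side of the rolling constraint (1), from a finite difference."""
    x_dot = (q_next[0] - q[0]) / dt
    y_dot = (q_next[1] - q[1]) / dt
    return x_dot * math.sin(q[2]) - y_dot * math.cos(q[2])


if __name__ == "__main__":
    # Equal and opposite wheel speeds: rotation on the spot (v = 0).
    print(diff_drive_to_unicycle(1.0, -1.0, wheel_radius=0.1, axle_length=0.4))
    q0 = (0.0, 0.0, 0.3)
    q1 = unicycle_step(q0, v=1.0, omega=0.5, dt=0.01)
    print(rolling_residual(q0, q1, 0.01))
```

Any velocity generated by (4) satisfies constraint (1) by construction, which is what the residual check illustrates.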

Similar to what is done for robot manipulators, 
the dynamic models of wheeled mobile robots 
may be derived following the Euler-Lagrange 
method. The main difference is the presence of 
the nonholonomic Pfaffian constraints, which 
give rise to reaction forces expressed via 
Lagrange multipliers (Neimark and Fufaev 
1972). 


Structural Properties 

The nonholonomic nature of wheeled mobile 
robots has precise consequences in terms of structural
properties of the kinematic model (3).

The first, and most important, is that in spite
of the reduced number of degrees of freedom, a
wheeled robot is controllable in its configuration
space; i.e., given two arbitrary configurations,
there always exists a kinematically admissible
trajectory (with the associated velocity inputs)
that transfers the robot from one to the other
(Fig. 3).

Wheeled Robots, Fig. 3 In spite of its restricted local mobility, a nonholonomic wheeled robot can reach any point in its configuration space

Since the kinematic model (3) is driftless,
a well-known result (Chow theorem) implies
that it is controllable if and only if the accessibility
rank condition holds:

$$ \dim \bar{\Delta} = n, \qquad (5) $$

where $\bar{\Delta}$ denotes the involutive closure of distribution
$\Delta = \{g_1, \dots, g_m\}$ under the Lie bracket
operation. In turn, this is guaranteed to be true in
view of the nonholonomy of constraints (2). For
example, since the Lie bracket of the two input
vector fields in (4) is always linearly independent
from them, the kinematic model of the unicycle
is controllable.
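
The unicycle claim above can be checked numerically. The sketch below approximates the Lie bracket $[g_1, g_2] = \frac{\partial g_2}{\partial q} g_1 - \frac{\partial g_1}{\partial q} g_2$ of the two input vector fields of (4) with central finite differences and evaluates the determinant of $(g_1 \; g_2 \; [g_1, g_2])$. The helper names and the finite-difference step are assumptions of this example.

```python
import math


def g1(q):
    """Driving vector field of the unicycle (4)."""
    return (math.cos(q[2]), math.sin(q[2]), 0.0)


def g2(q):
    """Steering vector field of the unicycle (4)."""
    return (0.0, 0.0, 1.0)


def jacobian(f, q, h=1e-6):
    """Central-difference Jacobian J[i][j] = d f_i / d q_j."""
    n = len(q)
    columns = []
    for j in range(n):
        q_plus, q_minus = list(q), list(q)
        q_plus[j] += h
        q_minus[j] -= h
        f_plus, f_minus = f(q_plus), f(q_minus)
        columns.append([(a - b) / (2.0 * h) for a, b in zip(f_plus, f_minus)])
    return [[columns[j][i] for j in range(n)] for i in range(n)]


def lie_bracket(f, g, q):
    """[f, g](q) = (dg/dq) f - (df/dq) g."""
    jf, jg = jacobian(f, q), jacobian(g, q)
    fq, gq = f(q), g(q)
    n = len(q)
    return [sum(jg[i][j] * fq[j] - jf[i][j] * gq[j] for j in range(n))
            for i in range(n)]


def det3(a, b, c):
    """Determinant of the 3x3 matrix with columns a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


if __name__ == "__main__":
    q = (0.3, -1.2, 0.7)
    bracket = lie_bracket(g1, g2, q)
    print(bracket, det3(g1(q), g2(q), bracket))
```

Analytically the bracket is $(\sin\theta, -\cos\theta, 0)^T$, i.e., the sideways direction forbidden by (1), and the determinant equals 1 at every configuration, so the three vectors span the whole tangent space and condition (5) holds with $n = 3$.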

However, the controllability of wheeled mobile
robots is intrinsically nonlinear. In fact, the
linear approximation of (3) at any configuration
is clearly uncontrollable due to the
reduced number of inputs. In practice, this means
that no linear feedback can stabilize the system
at a given configuration. The situation is actually
worse: for nonholonomic robots, there exists
no continuous time-invariant feedback law that
provides point stabilization. This negative result
can be established on the basis of a celebrated
result on smooth stabilizability of control systems
due to Brockett (1983). Note that the result does
not apply to time-varying stabilizing controllers,
which may thus be continuous in $q$.
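
The loss of controllability in the linear approximation can be made explicit with a short derivation. The equilibrium $q_e$ and the error variables $\tilde{q}$, $\tilde{u}$ are notation introduced here for illustration only.

```latex
% Linear approximation of (3) at q = q_e, u = 0 (the system is driftless,
% so every configuration q_e is an equilibrium for zero input):
\dot{\tilde{q}} = G(q_e)\,\tilde{u}, \qquad \tilde{q} = q - q_e,\quad \tilde{u} = u .
% This is a linear system with A = 0 and B = G(q_e); its controllability matrix is
\begin{pmatrix} B & AB & \cdots & A^{n-1}B \end{pmatrix}
  = \begin{pmatrix} G(q_e) & 0 & \cdots & 0 \end{pmatrix},
\qquad \operatorname{rank} = m < n .
% Unicycle (4): n = 3, m = 2, so the rank is 2; the direction
% (\sin\theta_e,\ -\cos\theta_e,\ 0)^T cannot be reached by the linear approximation.
```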

Another related drawback of wheeled mobile
robots is that in general, they do not admit
universal controllers, i.e., feedback control laws 
that can asymptotically stabilize arbitrary state 
trajectories, either persistent or not (Lizarraga 
2004). This means that, in principle, tracking and 
regulation problems in wheeled robots should be 
addressed using separate approaches. 

All the above limitations of nonholonomic 
systems are established with reference to the 
kinematic model, but of course, they are passed 
on to dynamic models. Altogether, they contribute
to making the control problem for wheeled
mobile robots much more difficult than, for 
example, for robotic manipulators, which are 
linearly controllable, smoothly stabilizable and 
admit universal controllers. 

Trajectory Planning 

Trajectory planning for wheeled robots is a 
nontrivial problem, because not all trajectories 
are feasible - once again, a consequence of 
nonholonomy. This leads to the necessity of 
maneuvering, i.e., performing certain specific 
movements, in order to execute transfer 
motions. 

Most kinematic models of wheeled mobile 
robots exhibit a property known as differential 
flatness (Fliess et al. 1995): namely, there exists a 
set of outputs z, called flat outputs, such that the 
state q and the control inputs u can be expressed 
algebraically as a function of z and its time 
derivatives up to a certain order $\sigma$:

$$ q = \varphi(z, \dot{z}, \ddot{z}, \dots, z^{(\sigma)}) \qquad (6) $$

$$ u = \gamma(z, \dot{z}, \ddot{z}, \dots, z^{(\sigma)}) \qquad (7) $$


As a consequence, once an output trajectory $z(t)$
is specified, the associated state trajectory $q(t)$
and control history $u(t)$ are uniquely determined.
For example, the unicycle admits $z = (x \; y)^T$ as
flat outputs. In fact, once a Cartesian trajectory is
assigned for the contact point with the ground,
the wheel orientation $\theta(t)$ is constrained to be
tangent to the trajectory; the associated control
inputs $v$ and $\omega$ are then uniquely and algebraically
computable from $q(t)$.
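
For the unicycle, the algebraic reconstruction of state and inputs from the flat outputs can be written out explicitly. The formulas below are a sketch for forward motion and follow directly from model (4); they are not quoted from the article.

```latex
% Unicycle (4) with flat outputs z = (x, y): state and inputs from z, \dot{z}, \ddot{z}
\theta = \operatorname{atan2}(\dot{y}, \dot{x}), \qquad
v = \sqrt{\dot{x}^2 + \dot{y}^2}, \qquad
\omega = \dot{\theta} = \frac{\ddot{y}\,\dot{x} - \ddot{x}\,\dot{y}}{\dot{x}^2 + \dot{y}^2} .
% The expressions are defined only where \dot{x}^2 + \dot{y}^2 \neq 0.
% For backward motion, \theta is shifted by \pi and v takes a negative sign.
```

Derivatives of the flat outputs up to second order are needed here, so the order of differentiation is 2 for this example.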

Differential flatness is particularly useful for
planning. For example, assume that we want to
transfer a wheeled mobile robot from an initial
configuration $q_i$ to a final configuration $q_f$. One
then computes the corresponding values $z_i$ and $z_f$
of the flat outputs, plus the appropriate boundary
conditions, and uses any interpolation scheme
(e.g., polynomial interpolation) to plan the trajectory
of $z$. The evolution of the generalized coordinates
$q$, together with the associated control
inputs $u$, can then be computed algebraically from
(6-7). The resulting configuration space trajectory
will automatically satisfy the nonholonomic
constraints (2).
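
A minimal sketch of this planning procedure for the unicycle is given below. It interpolates the flat outputs $(x, y)$ with cubic polynomials in a path parameter $s \in [0, 1]$ and recovers the orientation and the geometric inputs algebraically. The boundary conditions on the path tangent, the free parameter `k` (tangent magnitude at both ends), and the function name are assumptions of this example; a timing law $s(t)$ would still have to be chosen to obtain actual velocities $v = \tilde{v}\,\dot{s}$ and $\omega = \tilde{\omega}\,\dot{s}$.

```python
import math


def plan_unicycle(q_i, q_f, k=1.0, samples=11):
    """Cubic flat-output path from q_i to q_f, both given as (x, y, theta).

    Returns a list of tuples (s, x, y, theta, v_geo, omega_geo), where v_geo and
    omega_geo are geometric inputs (derivatives taken with respect to s).
    """
    (xi, yi, thi), (xf, yf, thf) = q_i, q_f
    # Coefficients chosen so that the path tangent at s = 0 and s = 1 is aligned
    # with the initial and final headings, with magnitude k.
    ax, ay = k * math.cos(thf) - 3.0 * xf, k * math.sin(thf) - 3.0 * yf
    bx, by = k * math.cos(thi) + 3.0 * xi, k * math.sin(thi) + 3.0 * yi

    def cubic(s, p_i, p_f, a, b):
        p = s**3 * p_f - (s - 1.0)**3 * p_i + a * s**2 * (s - 1.0) + b * s * (s - 1.0)**2
        dp = (3.0 * s**2 * p_f - 3.0 * (s - 1.0)**2 * p_i
              + a * (3.0 * s**2 - 2.0 * s) + b * (3.0 * s**2 - 4.0 * s + 1.0))
        ddp = (6.0 * s * p_f - 6.0 * (s - 1.0) * p_i
               + a * (6.0 * s - 2.0) + b * (6.0 * s - 4.0))
        return p, dp, ddp

    path = []
    for n in range(samples):
        s = n / (samples - 1)
        x, dx, ddx = cubic(s, xi, xf, ax, bx)
        y, dy, ddy = cubic(s, yi, yf, ay, by)
        speed_sq = dx * dx + dy * dy          # must stay nonzero along the path
        theta = math.atan2(dy, dx)            # heading tangent to the path
        v_geo = math.sqrt(speed_sq)
        omega_geo = (ddy * dx - ddx * dy) / speed_sq
        path.append((s, x, y, theta, v_geo, omega_geo))
    return path


if __name__ == "__main__":
    for sample in plan_unicycle((0.0, 0.0, 0.0), (1.0, 1.0, math.pi / 2), samples=5):
        print(sample)
```

Since the heading is computed as the direction of the path tangent, the rolling constraint (1) is satisfied at every sample without being imposed explicitly, which is the practical benefit of planning in the flat outputs.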

Another approach to nonholonomic trajectory
planning is based on the possibility of putting the
equations of most wheeled robots into a canonical
format known as a 2-input chained form

$$ \begin{aligned} \dot{z}_1 &= w_1 \\ \dot{z}_2 &= w_2 \\ \dot{z}_3 &= z_2 w_1 \\ &\;\;\vdots \\ \dot{z}_n &= z_{n-1} w_1 \end{aligned} \qquad (8) $$

by means of a feedback transformation, i.e., a
change of coordinates $z = \alpha(q)$ coupled with an
input transformation $w = \beta(q)u$. In particular,
this is always possible with kinematic models (3)
for which $n \leq 4$ and $m = 2$ (e.g., unicycle
or car-like robots). Once the system is cast
in the form (8), one may use sinusoidal open-loop
controls at integrally related frequencies
to drive all variables sequentially to their final
values (Murray and Sastry 1993). This approach
is particularly interesting from a theoretical viewpoint
because such control maneuvers achieve
motion in the direction of the Lie brackets of the
input vector fields.
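
For the unicycle (4), one possible feedback transformation into chained form can be verified by direct differentiation. The particular coordinates below are a commonly used choice, stated here as an illustration rather than taken from this article.

```latex
% Change of coordinates z = \alpha(q) and input transformation w = \beta(q) u
z_1 = \theta, \qquad
z_2 = x\cos\theta + y\sin\theta, \qquad
z_3 = x\sin\theta - y\cos\theta, \qquad
w_1 = \omega, \qquad
w_2 = v - z_3\,\omega .
% Check, using \dot{x} = v\cos\theta, \dot{y} = v\sin\theta, \dot{\theta} = \omega:
\dot{z}_1 = \omega = w_1, \qquad
\dot{z}_2 = v + (-x\sin\theta + y\cos\theta)\,\omega = v - z_3\,\omega = w_2, \qquad
\dot{z}_3 = (x\cos\theta + y\sin\theta)\,\omega = z_2\,w_1 .
```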

Note that differential flatness and chained- 
form transformability are equivalent properties 
for 2-input nonholonomic mobile robots. 
Feedback Control

The motion control problem for wheeled mobile
robots is generally formulated with reference to
the kinematic model (3). For example, in the
case of the unicycle (4), this means that the
control inputs are directly $v$ and $\omega$, the driving
and steering velocities. There are essentially two
reasons for making this simplifying assumption.

First, the kinematic model (3) fully captures 
the essential nonlinearity of single-body wheeled 
robots, which stems from their nonholonomic 
nature. This is another fundamental difference 
with respect to the case of robotic manipulators,
in which the main source of nonlinearity
is the inertial coupling among multiple bodies.
Second, in mobile robots it is typically not possible
to command directly the wheel torques,
because there are low-level wheel control loops
integrated in the hardware or software architecture.
Any such loop accepts as input a reference
value for the wheel angular speed, which is then
reproduced as accurately as possible by standard
regulation actions (e.g., PID controllers).
In this situation, the actual inputs available for 
high-level control are precisely these reference 
velocities. 

Two basic control problems can be considered: 

• Trajectory tracking: the robot must asymptotically
track a desired Cartesian trajectory
$(x_d(t), y_d(t))$.

• Point stabilization: the robot must asymptotically
reach a desired configuration $q_d$.

From a practical point of view, the most relevant
of these problems is certainly the first. This
is because mobile robots must be able to operate
in unstructured workspaces that invariably contain
obstacles. Clearly, forcing the robot to move
along (or close to) a trajectory planned in advance
reduces considerably the risk of collisions. The
point stabilization problem, however, is more difficult
and therefore particularly interesting from a
scientific perspective. In a certain sense, the relative
difficulty of the two problems is reminiscent
of human car driving: learning to drive a car along
a road is relatively easy, whereas parking poses a
greater challenge.


Trajectory Tracking

Several methods are available to drive a wheeled
mobile robot in feedback along a desired trajectory.
A straightforward possibility is to compute
first the linear approximation of the system along
the desired trajectory (which, unlike the approximation
at a configuration, is controllable)
and then stabilize it using linear feedback.
Only local convergence, however, can be guaranteed
with this approach. For the kinematic
model of the unicycle, global asymptotic stability
may be achieved by suitably morphing the linear
control law into a nonlinear one (Canudas de Wit
et al. 1993).

In robotics, a popular approach for trajectory 
tracking is input-output linearization via static 
feedback. In the case of a unicycle, consider as 
output the Cartesian coordinates of a point B 
located ahead of the wheel, at a distance b from 
the contact point with the ground. The linear 
mapping between the time derivatives of these 
coordinates and the velocity control inputs turns 
out to be invertible provided that b is nonzero; 
under this assumption, it is therefore possible to
perform an input transformation via feedback that
converts the unicycle into two decoupled simple
integrators, which can be globally stabilized with
a simple proportional controller (plus feedforward).
This simple approach works reasonably
well. However, if one tries to improve tracking 
accuracy by reducing b (so as to bring B close 
to the ground contact point), the control effort 
quickly increases. 
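
The static input-output linearization described above can be sketched as follows. For the point $B = (x + b\cos\theta,\; y + b\sin\theta)$, the velocity of $B$ depends linearly on $(v, \omega)$ through a matrix with determinant $b$, which is inverted in the code. The circular reference trajectory, the gain `k`, the explicit-Euler simulation, and all function names are assumptions made for this example.

```python
import math


def static_fl_control(q, b, p_des, p_des_dot, k):
    """Unicycle inputs (v, omega) that impose first-order error dynamics on point B."""
    x, y, theta = q
    xb = x + b * math.cos(theta)
    yb = y + b * math.sin(theta)
    # Desired velocity of B: feedforward plus proportional feedback
    u1 = p_des_dot[0] + k * (p_des[0] - xb)
    u2 = p_des_dot[1] + k * (p_des[1] - yb)
    # Inverse of [[cos, -b sin], [sin, b cos]] (determinant b, so b must be nonzero)
    v = math.cos(theta) * u1 + math.sin(theta) * u2
    omega = (-math.sin(theta) * u1 + math.cos(theta) * u2) / b
    return v, omega


def track_circle(b=0.2, k=2.0, dt=0.01, t_end=10.0):
    """Track a unit circle with point B; return the final tracking error of B."""
    x, y, theta, t = 0.0, 0.0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        p_des = (math.cos(t), math.sin(t))
        p_des_dot = (-math.sin(t), math.cos(t))
        v, omega = static_fl_control((x, y, theta), b, p_des, p_des_dot, k)
        x += dt * v * math.cos(theta)
        y += dt * v * math.sin(theta)
        theta += dt * omega
        t += dt
    xb = x + b * math.cos(theta)
    yb = y + b * math.sin(theta)
    return math.hypot(math.cos(t) - xb, math.sin(t) - yb)


if __name__ == "__main__":
    print(track_circle())
```

The division by `b` in the steering command makes the trade-off visible: the smaller the offset, the larger the steering velocity required for the same commanded velocity of $B$.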

Trajectory tracking with $b = 0$ (i.e.,
for the actual contact point on the ground)
can be achieved using dynamic feedback
linearization (Oriolo et al. 2002). In particular,
this method provides a one-dimensional dynamic
compensator that transforms the unicycle into
two decoupled double integrators, which are then
globally stabilized with a proportional-derivative
controller (plus feedforward). In contrast to static
feedback linearization, no residual zero dynamics
is present in the transformed system. However,
the dynamic compensator has a singularity
when the unicycle driving velocity is zero.
This is expected, because otherwise the tracking
controller would represent a universal controller.
Note that dynamic feedback linearizability using
the $x, y$ outputs is related to them being flat - the
two properties are equivalent.
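
The structure of the dynamic compensator, and the origin of its singularity, can be seen from a short derivation on model (4). The compensator state $\xi$ and the new inputs $a_1, a_2$ are notation introduced here for illustration.

```latex
% Dynamic extension: add an integrator on the driving input, v = \xi.
% Differentiating \dot{x} = \xi\cos\theta and \dot{y} = \xi\sin\theta once more:
\begin{pmatrix} \ddot{x} \\ \ddot{y} \end{pmatrix}
  = \begin{pmatrix} \cos\theta & -\xi\sin\theta \\ \sin\theta & \xi\cos\theta \end{pmatrix}
    \begin{pmatrix} \dot{\xi} \\ \omega \end{pmatrix},
\qquad \det = \xi .
% For \xi \neq 0 the matrix is invertible; with new inputs (a_1, a_2):
\dot{\xi} = a_1\cos\theta + a_2\sin\theta, \qquad
v = \xi, \qquad
\omega = \frac{a_2\cos\theta - a_1\sin\theta}{\xi}
\quad\Longrightarrow\quad
\ddot{x} = a_1, \quad \ddot{y} = a_2 .
```

The determinant equals the driving velocity, so the transformation breaks down exactly when the unicycle stops.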


Point Stabilization 

The impossibility of stabilizing a nonholonomic
mobile robot using continuous pure-state feedback
has generated two main directions of research
to solve the problem:

• Discontinuous feedback, i.e., time-invariant
control laws $u = \gamma(q)$, where $\gamma$ is discontinuous
precisely at the configuration that one
seeks to stabilize.

• Time-varying feedback, in the form $u =
\gamma(q, t)$, where $\gamma$ may or may not be continuous
at the desired configuration.

For the unicycle, a well-known stabilizing 
controller belonging to the first category was 
designed by Aicardi et al. (1995) by formulating 
the problem in polar coordinates centered at the 
goal and then using a Lyapunov-like analysis to 
establish asymptotic convergence. The controller, 
once rewritten in original coordinates, turns out 
to be discontinuous at the goal (not surprisingly). 
Although this rules out proper stability in the 
sense of Lyapunov, this controller is effective in 
that it produces rather natural approach trajectories
to the goal.
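
A polar-coordinate stabilizer of this kind can be sketched as follows. The specific control law, the gains `k1`, `k2`, `k3`, and the Lyapunov-like function $V = \frac{1}{2}(\rho^2 + \gamma^2 + k_3\delta^2)$ follow a form commonly presented in robotics textbooks; they are assumptions of this example and should not be read as the exact controller of the cited paper. In the code, `gamma` denotes the angle between the heading and the direction to the goal, not a feedback law.

```python
import math


def wrap(angle):
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def polar_controller(q, k1=1.0, k2=2.5, k3=3.0):
    """Velocity inputs (v, omega) driving the unicycle to the origin with zero heading."""
    x, y, theta = q
    rho = math.hypot(x, y)                               # distance to the goal
    gamma = wrap(math.atan2(y, x) - theta + math.pi)     # goal direction relative to heading
    delta = wrap(gamma + theta)                          # goal direction in the fixed frame
    # sin(gamma) cos(gamma) / gamma tends to 1 as gamma tends to 0
    sc = 1.0 if abs(gamma) < 1e-9 else math.sin(gamma) * math.cos(gamma) / gamma
    v = k1 * rho * math.cos(gamma)
    omega = k2 * gamma + k1 * sc * (gamma + k3 * delta)
    return v, omega


def simulate(q0, dt=0.01, t_end=20.0):
    """Closed-loop simulation of the unicycle (4) with explicit Euler integration."""
    x, y, theta = q0
    for _ in range(int(round(t_end / dt))):
        v, omega = polar_controller((x, y, theta))
        x += dt * v * math.cos(theta)
        y += dt * v * math.sin(theta)
        theta += dt * omega
    return x, y, theta


if __name__ == "__main__":
    print(simulate((-1.0, 0.5, 0.0)))
```

With this law, the derivative of $V$ along the closed-loop trajectories is $-k_1\rho^2\cos^2\gamma - k_2\gamma^2 \leq 0$. The polar angles are undefined at $\rho = 0$, which is where the discontinuity in the original coordinates comes from.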

Continuous time-varying stabilizers in the 
sense of Lyapunov exist (Samson 1993) but have 
mainly theoretical interest due to their provably 
slow (polynomial) rate of convergence; this is a 
direct consequence of the fact that the linear approximation
of the system is not controllable. A
more effective approach is to give up (Lipschitz-)
continuity at the desired configuration. As shown
by M’Closkey and Murray (1997) and Morin and
Samson (2000), this allows the design of control laws
that guarantee a modified form of exponential 
convergence to the goal. 

Most of the aforementioned control designs -
both for trajectory tracking and point stabilization
- were first developed with reference to the unicycle
robot but can be carried out on chained forms,
thereby providing an effective extension to other
kinematic models, e.g., the car-like robot.

Summary and Future Directions 

Wheeled mobile robots are increasingly present 
in applications. Over the last two decades, significant
results have been reached in terms of
modeling, planning and control of these systems,
and the field is now considered to be well established,
at least from an application point of view.
Nevertheless, a number of research directions are 
still open, including the following: 

• Planning and control for non-flat systems:
Even relatively simple wheeled robots (such as
a unicycle towing more than one off-hooked
trailer) are not flat.

• Robustness : The performance of controllers in 
the presence of disturbances and model perturbations
has not received sufficient attention so
far. 

• Localization : Feedback control requires 
accurate measurements of the configuration 
variables, which in mobile robots cannot be 
reliably reconstructed from onboard sensors 
(odometric data). Integration of exteroceptive 
sensing is essential to this end. 

• Vision-based control : As an alternative to 
localization-based methods, the feedback 
loop may be closed directly in the image 
plane, with significant advantages in terms of 
simplicity and robustness. 

• Multi-robot systems : The problem is to control 
the motion of multiple mobile robots in order 
to perform a cooperative motion task, e.g., 
formation control. 

Cross-References 

► Differential Geometric Methods in Nonlinear 
Control 

► Feedback Linearization of Nonlinear Systems 

► Feedback Stabilization of Nonlinear Systems 

► Lie Algebraic Methods in Nonlinear Control 

► Vehicle Dynamics Control 
Recommended Reading

For background material on nonlinear controllability,
including the necessary concepts of
differential geometry, see Sastry (2005). General 
introductions to mobile robots can be found 
in Siegwart and Nourbakhsh (2004), Choset et al. 
(2005), Morin and Samson (2008), and Siciliano 
et al. (2009). A classification of wheeled mobile 
robots based on the number, placement, and type 
of wheels was proposed by Bastin et al. (1996). 
A detailed extension of some of the planning and 
control techniques reviewed in this article to the 
case of car-like kinematics is given in De Luca 
et al. (1998). A framework for the stabilization 
of non-flat nonholonomic robots was presented 
by Oriolo and Vendittelli (2005). Recent work 
aimed at designing practical universal controllers 
was carried out by Morin and Samson (2009). 